Inference
Optimize and deploy with Optimum-Intel and OpenVINO GenAI
Intel has released Optimum-Intel, an extension of the Hugging Face Optimum library that integrates with OpenVINO to optimize and deploy generative AI models. The toolkit supports quantization, pruning, and deployment on Intel hardware, with the aim of better performance on CPUs and VPUs. Practitioners can use these optimizations to speed up inference and reduce resource consumption in production.
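As a rough illustration of the optimize-then-deploy workflow described above, the sketch below exports a model with the `optimum-cli export openvino` command and then runs it through `openvino_genai.LLMPipeline`. The helper names (`build_export_command`, `export_model`, `generate`), the `int4` default, and the token limit are assumptions chosen for illustration and are not taken from the article. It also assumes `optimum[openvino]` and `openvino-genai` are installed.

```python
import subprocess
from typing import List


def build_export_command(model_id: str, output_dir: str,
                         weight_format: str = "int4") -> List[str]:
    """Build the optimum-cli call that exports a Hub model to OpenVINO format.

    The weight format controls weight compression during export;
    "int4" is an illustrative default, not a recommendation.
    """
    return [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        "--weight-format", weight_format,
        output_dir,
    ]


def export_model(model_id: str, output_dir: str,
                 weight_format: str = "int4") -> None:
    """Run the export; raises CalledProcessError if the CLI fails."""
    subprocess.run(
        build_export_command(model_id, output_dir, weight_format),
        check=True,
    )


def generate(model_dir: str, prompt: str, device: str = "CPU",
             max_new_tokens: int = 64) -> str:
    """Generate text from an exported model directory on the given device."""
    # Imported lazily so the helpers above work without openvino-genai installed.
    import openvino_genai

    pipe = openvino_genai.LLMPipeline(model_dir, device)
    return str(pipe.generate(prompt, max_new_tokens=max_new_tokens))
```

A typical run would call `export_model("<model id>", "ov_model")` once, then `generate("ov_model", "What is OpenVINO?")` for each prompt. The model id is left as a placeholder because the article does not name one.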
optimum openvino genai