2025.0 is a major release that adds support for native Windows deployments and brings improvements to the generative use cases.
New feature - Windows native server deployment
- This release enables model server deployment on Windows operating systems as a binary application
- Full support for generative endpoints: text generation and embeddings based on the OpenAI API, and reranking based on the Cohere API
- Functional parity with the Linux version, with several minor differences: cloud storage, C API interface, and DAG pipelines - read more
- Targeted at client machines running Windows 11 and data center environments running Windows Server 2022
- Demos have been updated to work on both Linux and Windows. Check the installation guide
Other Changes and Improvements
- Added official support for Battlemage GPU, Arrow Lake CPU, iGPU, and NPU, and Lunar Lake CPU, iGPU, and NPU
- Updated base Docker images: added Ubuntu 24 and RedHat UBI 9; dropped Ubuntu 20 and RedHat UBI 8
- Extended the chat/completions API to support the `max_completion_tokens` parameter and message content passed as an array. These changes keep the API compatible with the OpenAI API.
- Truncate option in the embeddings endpoint: it is now possible to export the embeddings model with an option to automatically truncate the input to match the embeddings context length. By default, an error is raised when the input is too long.
- Added a speculative decoding algorithm to text generation. Check the demo.
- Added direct support for models without named outputs: when a model has no named outputs, generic names following the pattern `out_<index>` are assigned during model initialization
- Added a histogram metric for tracking MediaPipe graph processing duration
- Performance improvements
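The extended chat/completions request shape described above can be sketched as follows; the model name is an illustrative placeholder, and the request body follows the OpenAI API conventions:

```python
import json

# Example chat/completions request body using the newly supported
# `max_completion_tokens` parameter and message content given as an
# array of content parts. The model name is a placeholder.
payload = {
    "model": "my-chat-model",
    "messages": [
        {
            "role": "user",
            # Content may be an array of parts instead of a plain string
            "content": [{"type": "text", "text": "What is OpenVINO?"}],
        }
    ],
    "max_completion_tokens": 128,
}

print(json.dumps(payload, indent=2))
```

Such a body would be POSTed to the server's `/v3/chat/completions` endpoint by any OpenAI-compatible client.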
Breaking changes
- Discontinued support for NVIDIA plugin
Bug fixes
- Corrected behavior of cancelling text generation for disconnected clients
- Fixed detection of the model context length for the embeddings endpoint
- Security and stability improvements
You can pull the OpenVINO Model Server public Docker images, based on Ubuntu, with the following commands:
- `docker pull openvino/model_server:2025.0` - CPU device support
- `docker pull openvino/model_server:2025.0-gpu` - GPU, NPU and CPU device support

Alternatively, use the provided binary packages. The prebuilt image is also available in the RedHat Ecosystem Catalog.
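A minimal sketch of pulling the CPU image and starting the server, assuming a model lives in a local `./models` directory; the model name, path, and port are illustrative placeholders:

```shell
# Pull the CPU image and serve a single model from a local directory.
# "my_model", ./models, and port 9000 are illustrative placeholders.
docker pull openvino/model_server:2025.0
docker run -d --rm -p 9000:9000 \
  -v "$(pwd)/models:/models" \
  openvino/model_server:2025.0 \
  --model_name my_model --model_path /models/my_model --port 9000
```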