Config options to speed up inference for ce-esci-MiniLM-L12-v2 #1448
megh-khaire asked this question in Q&A (unanswered)
Replies: 0
Hello, we've deployed Metarank with ce-esci-MiniLM-L12-v2 for reranking. We wanted to know if there are any configuration options available that would speed up inference.
Our use case requires the reranking to be as fast as possible...
Here is my current config:
The batch size is fixed at 20, and documents are kept small (restricted to 255 characters).
We are deploying this on CPU with 16 GB of RAM.
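As a rough illustration of why the 255-character cap helps: self-attention cost grows quadratically with sequence length, so shorter inputs cut per-document compute substantially. The sketch below is a back-of-the-envelope estimate, not a Metarank configuration; the token counts are assumptions for illustration (the 384 hidden size matches MiniLM-L12 variants):

```python
def attention_cost(seq_len: int, hidden: int = 384) -> int:
    """Rough FLOP proxy for one self-attention layer:
    the QK^T and AV matmuls each cost ~seq_len^2 * hidden multiply-adds."""
    return 2 * seq_len * seq_len * hidden

# Illustrative token counts (assumed, not measured): a 255-char
# document is roughly 64 tokens; an untruncated one might be 256.
short_tokens, long_tokens = 64, 256
ratio = attention_cost(long_tokens) / attention_cost(short_tokens)
print(f"{ratio:.0f}x")  # prints "16x": (256/64)^2 per attention layer
```

The quadratic ratio only covers the attention matmuls (the feed-forward layers scale linearly with length), but it shows why input truncation is one of the cheapest latency levers on CPU.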