Config options to speed up inference for ce-esci-MiniLM-L12-v2 #1448
megh-khaire asked this question in Q&A (unanswered)
Replies: 0
Hello, we've deployed Metarank with ce-esci-MiniLM-L12-v2 for reranking. We wanted to know if there are any configuration options available that would speed up inference.
Our use case requires the reranking to be as fast as possible...
Here is my current config:
The batch size is fixed at 20, and documents are kept small (restricted to 255 characters).
We are deploying this on CPU with 16 GB of RAM.
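As a rough illustration of why the 255-character cap helps: self-attention cost grows quadratically with sequence length, so shorter inputs cut per-document compute substantially. The sketch below is a back-of-the-envelope estimate, not a Metarank configuration; the token counts are assumptions for illustration (the 384 hidden size matches MiniLM-L12 variants):

```python
def attention_cost(seq_len: int, hidden: int = 384) -> int:
    """Rough FLOP proxy for one self-attention layer:
    the QK^T and AV matmuls each cost ~seq_len^2 * hidden multiply-adds."""
    return 2 * seq_len * seq_len * hidden

# Illustrative token counts (assumed, not measured): a 255-char
# document is roughly 64 tokens; an untruncated one might be 256.
short_tokens, long_tokens = 64, 256
ratio = attention_cost(long_tokens) / attention_cost(short_tokens)
print(f"{ratio:.0f}x")  # prints "16x": (256/64)^2 per attention layer
```

The quadratic ratio only covers the attention matmuls (the feed-forward layers scale linearly with length), but it shows why input truncation is one of the cheapest latency levers on CPU.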