[BUG]: llama3.1 8B Context Size Max Tokens Ignored in Both Performance Modes #2442
Labels
needs info / can't replicate
Issues that require additional information and/or cannot currently be replicated, but possible bug
possible bug
Bug was reported but is not confirmed or is unable to be replicated.
How are you running AnythingLLM?
AnythingLLM desktop app
What happened?
When using "Base" as the "Performance Mode", the Max Tokens setting is ignored and Llama 3.1 is invoked with an 8K context size. When setting Performance Mode to "Maximum", the Max Tokens setting is again ignored and Llama 3.1 is invoked with a 128K context size. I also created a modelfile to enforce a 32K context size, but the result was still 128K. The workspace was set to use the system-defined LLM settings.
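For reference, a modelfile along these lines should pin the context size when the model is built with Ollama (the base tag `llama3.1:8b` and the derived name are assumptions; the report does not include the actual modelfile used):

```
# Modelfile — derive a variant pinned to a 32K context window
FROM llama3.1:8b
PARAMETER num_ctx 32768
```

Built and run with something like `ollama create llama3.1-32k -f Modelfile`, this variant would be expected to load with a 32K context; in the behavior described above, AnythingLLM's Performance Mode appears to override it.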
Are there known steps to reproduce?
See above