Max Tokens Parameter Not Functioning Correctly #9900
MitraSafarinejad
started this conversation in
General
Replies: 1 comment 1 reply
-
It seems there might be some confusion between the max_tokens parameter and the model's context size. The max_tokens parameter limits the number of tokens generated in a response, while the context size defines the maximum number of tokens the model can process, including both input and output. For instance, Claude 3.5 Sonnet has a context size of up to 200,000 tokens, so a token-limit error will not occur until the combined input and output reach that threshold.
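The distinction can be sketched as a simple budget check (a hypothetical helper with illustrative numbers, not the actual SDK or API behavior):

```python
# The context window bounds input + output tokens together,
# while max_tokens caps only the generated output.

CONTEXT_WINDOW = 200_000   # e.g. Claude 3.5 Sonnet's context size
MAX_TOKENS = 8_192         # cap on *output* tokens only

def fits_in_context(input_tokens: int,
                    max_tokens: int = MAX_TOKENS,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the output budget fit the window."""
    return input_tokens + max_tokens <= context_window

# A 10,000-token prompt is well within a 200k window, so no error occurs:
print(fits_in_context(10_000))    # True
# Only near the full window does the request fail:
print(fits_in_context(195_000))   # False
```

This is why a 10,000-token input goes through without a warning: it violates neither the output cap nor the context window.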
-
hello,
While the max_tokens parameter is supposed to limit the number of tokens processed, it appears not to be functioning correctly. For instance, for the Claude 3.5 Sonnet model the official maximum token limit is indicated as 8,192 tokens. However, I was able to input around 10,000 tokens without encountering any errors or warnings, and the model responded without enforcing the token limit. Additionally, could you please clarify what the maximum token limit is and how I can change it?