-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Bedrock token count and IDs for Anthropic models #341
base: main
Are you sure you want to change the base?
Changes from 4 commits
4b49834
801f637
698aa9c
7f3807a
11e3afe
783ab1a
49f9b9a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,21 +1,51 @@ | ||
import re | ||
import sys | ||
from typing import Any, List | ||
|
||
from packaging import version | ||
|
||
|
||
def enforce_stop_tokens(text: str, stop: List[str]) -> str: | ||
"""Cut off the text as soon as any stop words occur.""" | ||
return re.split("|".join(stop), text, maxsplit=1)[0] | ||
|
||
|
||
def _get_anthropic_client() -> Any: | ||
def check_anthropic_tokens_dependencies() -> List[str]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: Would anthropic_tokens_supported be a better name here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated this 👍 |
||
"""Check if we have all requirements for Anthropic count_tokens() and get_tokenizer().""" | ||
bad_deps = [] | ||
|
||
python_version = sys.version_info | ||
if python_version > (3, 12): | ||
bad_deps.append(f"Python 3.12 or earlier required, found {'.'.join(map(str, python_version[:3]))})") | ||
|
||
bad_anthropic = None | ||
try: | ||
import anthropic | ||
anthropic_version = version.parse(anthropic.__version__) | ||
if anthropic_version > version.parse("0.38.0"): | ||
bad_anthropic = anthropic_version | ||
except ImportError: | ||
bad_anthropic = "none installed" | ||
|
||
bad_httpx = None | ||
try: | ||
import httpx | ||
httpx_version = version.parse(httpx.__version__) | ||
if httpx_version > version.parse("0.27.2"): | ||
bad_httpx = httpx_version | ||
except ImportError: | ||
raise ImportError( | ||
"Could not import anthropic python package. " | ||
"This is needed in order to accurately tokenize the text " | ||
"for anthropic models. Please install it with `pip install anthropic`." | ||
) | ||
bad_httpx = "none installed" | ||
|
||
if bad_anthropic: | ||
bad_deps.append(f"anthropic<=0.38.0 required, found {bad_anthropic}.") | ||
if bad_httpx: | ||
bad_deps.append(f"httpx<=0.27.2 required, found {bad_httpx}.") | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Curious about these other checks here, would only checking the anthropic version not suffice? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The
However, this is incompatible with
So, users must install There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
But |
||
return bad_deps | ||
|
||
|
||
def _get_anthropic_client() -> Any: | ||
import anthropic | ||
return anthropic.Anthropic() | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this cause duplicate logger messages, as
super().get_num_tokens(text)
ends up calling theget_token_ids
method. Also, adding a logger message would just fall under the radar for most users, unless they have the debug logging switched on, should we rather use a warning?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch - removed the warning from
get_num_tokens
, and updated the message as suggested.