What is the best way to get token usage via api response? #497

teeppp · 2024-01-31T02:49:34Z

teeppp
Jan 31, 2024

I want to get the token usage to display token count in frontend application.
I found it has already included in callback_events when include_callback_event=True.
But, there are too much information.
is there how to exclude unnessesary infomartions from callback_events, or other way？

Answered by eyurtsev

Feb 5, 2024

One option is to do the calculation on the server side and return it as part of the response:

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableMap, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

prompt = PromptTemplate.from_template("{input}")

def calculate_tokens(text: str) -> int:
    return len(text) # Replace with appropriate tokenization logic

model = ChatOpenAI()

chain = prompt | RunnableMap({
    "output": model,
    "input_tokens": lambda prompt: calculate_tokens(prompt.text)
}) | RunnablePassthrough.assign(output_tokens=lambda x: calculate_tokens(x['…

View full answer

eyurtsev · 2024-02-01T21:47:39Z

eyurtsev
Feb 1, 2024
Maintainer

If you're using the remote runnable client with invoke use:

https://python.langchain.com/docs/modules/callbacks/token_counting

If you're doing it with your own custom code, then you'll need to filter until you encounter the appropriate callback event that corresponds to on llm end.

Keep in mind that:

callbacks are not considered stable yet, they're part of the API, but I haven't tested extensively the callbacks so not sure how well they work
callbacks are not included with streaming/stream_log/stream_events at the moment
If you're using open AI: open AI does not include token counts with streamin (if I'm not mistaken), so callbacks wouldn't help (at least not unless the token counting code above is modified to work with streaming response from open ai).
If you're using other model providers, you will need to check if there's a way to count tokens with them.

One simple thing you could do:

Get the response on the client side
Use a tokenizer to estimate the number of tokens

0 replies

teeppp · 2024-02-02T13:04:42Z

teeppp
Feb 2, 2024
Author

@eyurtsev
Thank you for your reply. I understood it.
as you point it out, my application is not use the runnable client.
And, the your simple solution to use tokenizer on clientside is not good way for me because we have some intermediate steps depends on RAG. there are some internal prompts only in the backend application. this case, we cannot estimate it on client side accurately.

But, fortunately, my current application not use streaming for now. i will use callback as a short term solution.
As a future request, i hope it to implement to easily to some additional information to api response if possible.

0 replies

eyurtsev · 2024-02-05T17:23:59Z

eyurtsev
Feb 5, 2024
Maintainer

One option is to do the calculation on the server side and return it as part of the response:

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableMap, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

prompt = PromptTemplate.from_template("{input}")

def calculate_tokens(text: str) -> int:
    return len(text) # Replace with appropriate tokenization logic

model = ChatOpenAI()

chain = prompt | RunnableMap({
    "output": model,
    "input_tokens": lambda prompt: calculate_tokens(prompt.text)
}) | RunnablePassthrough.assign(output_tokens=lambda x: calculate_tokens(x['output'].content))

chain.invoke({'input': 'hello'})

0 replies

teeppp · 2024-02-14T08:16:36Z

teeppp
Feb 14, 2024
Author

thanks. it seems the option is better way than using callback infos.
i will try it.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What is the best way to get token usage via api response? #497

{{title}}

Replies: 4 comments

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

What is the best way to get token usage via api response? #497

teeppp Jan 31, 2024

Replies: 4 comments

eyurtsev Feb 1, 2024 Maintainer

teeppp Feb 2, 2024 Author

eyurtsev Feb 5, 2024 Maintainer

teeppp Feb 14, 2024 Author

teeppp
Jan 31, 2024

eyurtsev
Feb 1, 2024
Maintainer

teeppp
Feb 2, 2024
Author

eyurtsev
Feb 5, 2024
Maintainer

teeppp
Feb 14, 2024
Author