-
I want to get the token usage so I can display the token count in my frontend application.
Replies: 4 comments
-
If you're using the remote runnable client with `invoke`, use the token counting callback: https://python.langchain.com/docs/modules/callbacks/token_counting

If you're doing it with your own custom code, then you'll need to filter the callback events until you encounter the one that corresponds to `on_llm_end`. Keep in mind that token usage is generally not reported when streaming, so this works best with non-streaming calls.

One simple thing you could do:
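A minimal sketch of that approach, assuming an OpenAI chat model and the `get_openai_callback` context manager from the linked docs page:

```python
from langchain_community.callbacks import get_openai_callback
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

# The context manager aggregates token usage reported by OpenAI calls made inside it.
with get_openai_callback() as cb:
    llm.invoke("hello")

# These counts can be returned to the frontend alongside the response.
print(cb.prompt_tokens, cb.completion_tokens, cb.total_tokens)
```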
-
@eyurtsev But, fortunately, my current application does not use streaming for now. I will use callbacks as a short-term solution.
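For the custom-code route mentioned above, a sketch of a handler that captures usage on `on_llm_end` might look like this (assuming a provider, such as OpenAI, that reports `token_usage` in `llm_output`):

```python
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult
from langchain_openai import ChatOpenAI

class TokenUsageHandler(BaseCallbackHandler):
    """Captures the token usage reported when the LLM call finishes."""

    def __init__(self) -> None:
        self.token_usage: dict = {}

    def on_llm_end(self, response: LLMResult, **kwargs) -> None:
        # OpenAI reports usage in llm_output; other providers may differ.
        self.token_usage = (response.llm_output or {}).get("token_usage", {})

handler = TokenUsageHandler()
model = ChatOpenAI(callbacks=[handler])
model.invoke("hello")
print(handler.token_usage)  # e.g. {'prompt_tokens': ..., 'completion_tokens': ..., 'total_tokens': ...}
```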
-
One option is to do the calculation on the server side and return it as part of the response:

```python
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableMap, RunnablePassthrough

prompt = PromptTemplate.from_template("{input}")

def calculate_tokens(text: str) -> int:
    return len(text)  # Placeholder: replace with appropriate tokenization logic

model = ChatOpenAI()

# RunnableMap invokes the model and counts the prompt tokens on the same input;
# the passthrough step then adds the output token count to the result dict.
chain = prompt | RunnableMap({
    "output": model,
    "input_tokens": lambda prompt: calculate_tokens(prompt.text),
}) | RunnablePassthrough.assign(output_tokens=lambda x: calculate_tokens(x["output"].content))

chain.invoke({"input": "hello"})
```
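If character counts aren't precise enough, the `calculate_tokens` placeholder could be swapped for a real tokenizer such as `tiktoken` (my suggestion, not part of the original reply; the model name must match the one the chain actually uses):

```python
import tiktoken

def calculate_tokens(text: str, model_name: str = "gpt-3.5-turbo") -> int:
    # Encode with the model's tokenizer and count the resulting tokens.
    encoding = tiktoken.encoding_for_model(model_name)
    return len(encoding.encode(text))
```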
-
Thanks. It seems that option is a better way than using callback info.