Adding TPOT and ITL metrics #4

Open
wants to merge 1 commit into
base: main

Edwinhr716
Contributor

Adds Time Per Output Token (TPOT) and Inter-Token Latency (ITL) metrics. Both metrics are only reported when the benchmark is run with the --stream-request flag.
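For reviewers less familiar with these metrics, here is a minimal sketch of how they are commonly derived from the arrival times of streamed tokens. It is not the code from this PR: the function name `streaming_latency_metrics` and its parameters are hypothetical, and the formulas are the usual definitions (ITL as the mean gap between consecutive tokens, TPOT as decode time divided by the number of tokens after the first), which may differ from what the commit implements.

```python
from statistics import mean


def streaming_latency_metrics(request_start, token_times):
    """Return (ttft, itl, tpot) in seconds for one streamed request.

    request_start: timestamp at which the request was sent.
    token_times:   arrival timestamps of the output tokens, in order.
    Hypothetical helper for illustration; not taken from the PR.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")

    # Time to First Token: delay until the first token arrives.
    ttft = token_times[0] - request_start

    # Inter-Token Latency: mean gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0

    # Time Per Output Token: time spent after the first token,
    # divided by the number of tokens generated after the first.
    total_latency = token_times[-1] - request_start
    n_tokens = len(token_times)
    tpot = (total_latency - ttft) / (n_tokens - 1) if n_tokens > 1 else 0.0

    return ttft, itl, tpot


ttft, itl, tpot = streaming_latency_metrics(0.0, [0.5, 0.6, 0.7, 0.8])
print(f"TTFT={ttft:.2f}s ITL={itl:.2f}s TPOT={tpot:.2f}s")
```

Under these assumed definitions, ITL and TPOT coincide whenever each streamed chunk carries exactly one token. They diverge once a chunk carries several tokens, because ITL is then measured per chunk while TPOT is still divided by the token count.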

Sample output:

====Result for Model: deepseek-ai/DeepSeek-R1====
Errors: {'ClientConnectorError': 0, 'TimeoutError': 0, 'ContentTypeError': 0, 'ClientOSError': 0, 'ServerDisconnectedError': 0, 'unknown_error': 0}
Total time: 115.56 s
Successful/total requests: 120/120
Requests/min: 62.30
Output_tokens/min: 13539.97
Input_tokens/min: 5607.26
Tokens/min: 19147.23
Average Time to First Token (s): 0.51
Average Inter-Token Latency (s): 0.08
Average Time Per Output Token (s): 0.08
Average seconds/token (includes waiting time on server): 0.06
Average milliseconds/request (includes waiting time on server): 18217.93
Average milliseconds/output_token (includes waiting time on server): 83.83
Average input length: 90.00
Average output length: 217.32
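As a quick sanity check, the throughput lines in the sample output can be reproduced from the totals and averages it reports. The relationships below are inferred from the printed numbers, not read from the benchmark source, so treat them as an assumption about how the tool aggregates. The variable names are illustrative only.

```python
# Values copied from the sample output above.
total_time_s = 115.56
successful_requests = 120
avg_input_len = 90.00
avg_output_len = 217.32
avg_ms_per_request = 18217.93

# Assumed relationships between the reported figures.
requests_per_min = successful_requests / total_time_s * 60
output_tokens_per_min = requests_per_min * avg_output_len
input_tokens_per_min = requests_per_min * avg_input_len
ms_per_output_token = avg_ms_per_request / avg_output_len

print(f"Requests/min:      {requests_per_min:.2f}")
print(f"Output tokens/min: {output_tokens_per_min:.2f}")
print(f"Input tokens/min:  {input_tokens_per_min:.2f}")
print(f"ms/output token:   {ms_per_output_token:.2f}")
```

The recomputed values agree with the reported ones to within the rounding of the printed inputs. Note that the milliseconds/output_token figure includes the time to first token and server waiting time, so it is an end-to-end number rather than a pure decode-phase TPOT.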
