forked from vllm-project/vllm
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[Core] Create metrics for external cache services (#3)
* Enable vineyard llm kv cache in vLLM

  Based on another version of vllm: sighingnow@d347dab

  Cherry-pick from commit d347dab

  Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
  (cherry picked from commit 1545f6bf7edcd667e305d3fbcadd913066f04747)

  resolving vllm update diff

  temporarily comment out torch.distributed for single node env

  add VineyardCacheConfig with https://github.com/v6d-io/v6d/blob/ebe8f077e3d3780a27d49238c501854b6b8e29df/modules/llm-cache/ds/kv_cache_block.cc#L163 commented out; cache_ops fix

  remove CacheConfig from argument (configure through ENV)

  v6d: fix integration w/ v1 APIs

  Signed-off-by: Haiyang Shi <haiyang.shi@bytedance.com>

  Change model_runner to latest version

  cherry pick model_runner from d347dab
  source sighingnow@d347dab

  fix reshape_and_cache_flash argument

  add cache prefetch/update to work_base

  clean up

  Fix after rebase to 029c71d

  remove tensor copy from cache managed address to pin memory

  clean up

  Add fixes to address comments

  adding cache service metrics initial

  adding cache service metrics initial

  update ttft metrics

  update prefix caching with max num seqs argument

* fix token_len stat; Add median metrics

* FIX: avg metrics collection; using cuda_event to collect metrics

* add reshape time

* Address comments

---------

Co-authored-by: Tao He <linzhu.ht@alibaba-inc.com>
1 parent 7cdac48 commit a8ae12c
Showing 10 changed files with 149 additions and 28 deletions.
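The commit message mentions average and median metrics for the external cache service, but the diff itself is not shown here. As a rough illustration only, the sketch below shows one way such statistics could be accumulated per cache query. The class name `CacheServiceMetrics`, its fields, and the `record_query` method are hypothetical and are not taken from the commit. Latencies are passed in by the caller, so the timing source (for example the CUDA events the commit message refers to) is outside the sketch.

```python
import statistics
from dataclasses import dataclass, field


@dataclass
class CacheServiceMetrics:
    """Hypothetical accumulator for external KV cache statistics.

    Not the class used in the commit; names and fields are assumptions.
    """
    hit_tokens: int = 0
    total_tokens: int = 0
    query_latencies_s: list = field(default_factory=list)

    def record_query(self, hit_tokens: int, total_tokens: int,
                     latency_s: float) -> None:
        # Accumulate token counts and keep every latency sample so that
        # both the mean and the median can be computed later.
        self.hit_tokens += hit_tokens
        self.total_tokens += total_tokens
        self.query_latencies_s.append(latency_s)

    def summary(self) -> dict:
        lat = self.query_latencies_s
        return {
            # Token-level hit rate; 0.0 when nothing has been recorded.
            "hit_rate": (self.hit_tokens / self.total_tokens
                         if self.total_tokens else 0.0),
            "avg_latency_s": statistics.fmean(lat) if lat else 0.0,
            "median_latency_s": statistics.median(lat) if lat else 0.0,
        }


metrics = CacheServiceMetrics()
metrics.record_query(hit_tokens=96, total_tokens=128, latency_s=0.004)
metrics.record_query(hit_tokens=0, total_tokens=128, latency_s=0.010)
metrics.record_query(hit_tokens=64, total_tokens=128, latency_s=0.006)
summary = metrics.summary()
```

Keeping the raw samples makes the median straightforward to compute, at the cost of memory that grows with the number of queries; a long-running server would likely bound the sample window.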