Pull requests: vllm-project/vllm

Pull requests list

[Bugfix] Allow fallback to AWQ from AWQMarlin at per-layer granularity (labels: quantization, ready)
#13119 opened Feb 11, 2025 by mgoin
[ci] Consolidate Qwen models (labels: ready, structured-output)
#13118 opened Feb 11, 2025 by khluu
[Misc] Log time consumption of sleep and wake-up
#13115 opened Feb 11, 2025 by waltforme
Support AWQMarlin with MLA
#13109 opened Feb 11, 2025 by mgoin Draft
[Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request (labels: ready, v1)
#13108 opened Feb 11, 2025 by bnellnm
Further reduce the HTTP calls to huggingface.co (labels: ready)
#13107 opened Feb 11, 2025 by maxdebayser
[Quant] Add SupportsQuant to phi3 and clip
#13104 opened Feb 11, 2025 by kylesayrs
[Bugfix] Do not crash V0 engine on input errors
#13101 opened Feb 11, 2025 by joerunde
[Bugfix] deepseek_r1_reasoning_parser put reason content in wrong field in certain edge case (labels: frontend, ready)
#13097 opened Feb 11, 2025 by LikeSundayLikeRain
Consolidate Llama model usage in tests (labels: ready, speculative-decoding, v1)
#13094 opened Feb 11, 2025 by hmellor
[V1][Minor] Restore V1 compatibility with LLMEngine class
#13090 opened Feb 11, 2025 by Ryp
[Misc] Add model list API in disagg proxy
#13083 opened Feb 11, 2025 by ggaaooppeenngg
[V1] Allow sliding window + prefix caching
#13069 opened Feb 11, 2025 by WoosukKwon