vllm-project / vllm Public

Fork 5.6k
Star 37.4k

Pull requests: vllm-project/vllm

507 Open 5,719 Closed

Pull requests list

[ci] Add more source file dependencies for some tests ci/build

#13123 opened Feb 12, 2025 by khluu

[Bugfix] Allow fallback to AWQ from AWQMarlin at per-layer granularity quantization ready

ONLY add when PR is ready to merge/full CI is needed

#13119 opened Feb 11, 2025 by mgoin

[ci] Consolidate Qwen models ready

structured-output

#13118 opened Feb 11, 2025 by khluu

[Misc] Log time consumption of sleep and wake-up

#13115 opened Feb 11, 2025 by waltforme

[Frontend] Pass pre-created socket to uvicorn frontend

#13113 opened Feb 11, 2025 by russellb

[Attention] Update to latest FA3 code that supports different K and V head dims ci/build perf-benchmarks

#13111 opened Feb 11, 2025 by LucasWilkinson

[Kernel] LoRA - Refactor sgmv kernels

#13110 opened Feb 11, 2025 by varun-sundar-rabindranath

Support AWQMarlin with MLA

#13109 opened Feb 11, 2025 by mgoin • Draft

[Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request ready

#13108 opened Feb 11, 2025 by bnellnm

Further reduce the HTTP calls to huggingface.co ready

#13107 opened Feb 11, 2025 by maxdebayser

[BUG] Addresses #3935 and #3683 by making intial_incremental_detokenization_offset configurable frontend v1

#13106 opened Feb 11, 2025 by sinamoeini

[Quant] Add SupportsQuant to phi3 and clip

#13104 opened Feb 11, 2025 by kylesayrs

[Bugfix] Do not crash V0 engine on input errors

#13101 opened Feb 11, 2025 by joerunde

[Bugfix] deepseek_r1_reasoning_parser puts reasoning content in the wrong field in certain edge cases frontend ready

#13097 opened Feb 11, 2025 by LikeSundayLikeRain

[V1] LoRA - Add triton kernels for V1 v1

#13096 opened Feb 11, 2025 by varun-sundar-rabindranath • Draft

[V1][Kernel] Refactor the prefix_prefill kernel so that the caller no longer has to pass in the context lengths

#13095 opened Feb 11, 2025 by SageMoore

Consolidate Llama model usage in tests ready

speculative-decoding v1

#13094 opened Feb 11, 2025 by hmellor

[V1][Bugfix] DeepSeek-V3 v1 attn_backend miss q_lora_rank

#13092 opened Feb 11, 2025 by AoyuQC

[V1][Minor] Restore V1 compatibility with LLMEngine class

#13090 opened Feb 11, 2025 by Ryp

[core][dist] init device with current_platform.device_type

#13086 opened Feb 11, 2025 by MengqingCao

[Misc] Add model list API in disagg proxy

#13083 opened Feb 11, 2025 by ggaaooppeenngg

[V1][Pixtral-HF] Add custom slice_encoder_output for Pixtral v1

#13080 opened Feb 11, 2025 by lk-chen

Support logit_bias in v1 Sampler v1

#13079 opened Feb 11, 2025 by houseroad • Draft

[V1] Allow sliding window + prefix caching

#13069 opened Feb 11, 2025 by WoosukKwon

Run v1 latency benchmark and integrate with PyTorch OSS benchmark database ci/build

#13068 opened Feb 11, 2025 by huydhn
