Pull requests: vllm-project/vllm
Pull requests list
[ci] Add more source file dependencies for some tests
ci/build
#13123
opened Feb 12, 2025 by
khluu
[Bugfix] Allow fallback to AWQ from AWQMarlin at per-layer granularity
quantization
ready
ONLY add when PR is ready to merge/full CI is needed
#13119
opened Feb 11, 2025 by
mgoin
[ci] Consolidate Qwen models
ready
structured-output
#13118
opened Feb 11, 2025 by
khluu
[Frontend] Pass pre-created socket to uvicorn
frontend
#13113
opened Feb 11, 2025 by
russellb
[Attention] Update to latest FA3 code that supports different K and V head dims
ci/build
perf-benchmarks
#13111
opened Feb 11, 2025 by
LucasWilkinson
[Kernel] LoRA - Refactor sgmv kernels
#13110
opened Feb 11, 2025 by
varun-sundar-rabindranath
[Bug] [V1] Try fetching stop_reason from EngineOutput before checking the request
ready
v1
#13108
opened Feb 11, 2025 by
bnellnm
Further reduce the HTTP calls to huggingface.co
ready
#13107
opened Feb 11, 2025 by
maxdebayser
[BUG] Addresses #3935 and #3683 by making intial_incremental_detokenization_offset configurable
frontend
v1
#13106
opened Feb 11, 2025 by
sinamoeini
[Bugfix] deepseek_r1_reasoning_parser puts reasoning content in the wrong field in certain edge cases
frontend
ready
#13097
opened Feb 11, 2025 by
LikeSundayLikeRain
[V1] LoRA - Add triton kernels for V1
v1
#13096
opened Feb 11, 2025 by
varun-sundar-rabindranath
Draft
[V1][Kernel] Refactor the prefix_prefill kernel so that the caller no longer has to pass in the context lengths
#13095
opened Feb 11, 2025 by
SageMoore
Consolidate Llama model usage in tests
ready
speculative-decoding
v1
#13094
opened Feb 11, 2025 by
hmellor
[V1][Bugfix] DeepSeek-V3 v1 attn_backend is missing q_lora_rank
#13092
opened Feb 11, 2025 by
AoyuQC
[V1][Minor] Restore V1 compatibility with LLMEngine class
#13090
opened Feb 11, 2025 by
Ryp
[core][dist] init device with current_platform.device_type
#13086
opened Feb 11, 2025 by
MengqingCao
[V1][Pixtral-HF] Add custom slice_encoder_output for Pixtral
v1
#13080
opened Feb 11, 2025 by
lk-chen
Run v1 latency benchmark and integrate with PyTorch OSS benchmark database
ci/build
#13068
opened Feb 11, 2025 by
huydhn