Merge main fix streaming #3239

Closed
wants to merge 15 commits into from
1 change: 1 addition & 0 deletions .clang-format-ignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
sgl-kernel/3rdparty/tensorrt_llm/*
35 changes: 35 additions & 0 deletions .devcontainer/Dockerfile
@@ -0,0 +1,35 @@
FROM lmsysorg/sglang:dev

# Create non-root user with specified UID and GID
# NOTE: Replace with your own UID and GID. This is a workaround from https://github.com/microsoft/vscode-remote-release/issues/49#issuecomment-489060908.
ARG HOST_UID=1003
ARG HOST_GID=1003
RUN groupadd -g $HOST_GID devuser && \
    useradd -m -u $HOST_UID -g $HOST_GID -s /bin/zsh devuser

# Give devuser sudo access
RUN apt-get update && apt-get install -y sudo && \
    echo "devuser ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/devuser && \
    rm -rf /var/lib/apt/lists/* && \
    apt-get clean

# Set up oh-my-zsh for devuser
RUN cp -r /root/.oh-my-zsh /home/devuser/.oh-my-zsh && \
    cp /root/.zshrc /home/devuser/.zshrc && \
    cp /root/.vimrc /home/devuser/.vimrc && \
    cp /root/.tmux.conf /home/devuser/.tmux.conf && \
    sed -i 's|/root/.oh-my-zsh|/home/devuser/.oh-my-zsh|g' /home/devuser/.zshrc && \
    chown -R devuser:devuser /home/devuser/

# Set workspace directory and ownership
WORKDIR /sgl-workspace/sglang
RUN chown -R devuser:devuser /sgl-workspace

# Switch to devuser
USER devuser

# Install uv
RUN curl -LsSf https://astral.sh/uv/install.sh | sh

# Install rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
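
The note in the Dockerfile above asks for `HOST_UID` and `HOST_GID` to be replaced by hand. As a minimal sketch of one alternative, assuming the image is built directly with `docker build` from the repository root rather than through VS Code, the two `ARG` defaults can be overridden with `--build-arg`. The helper name `devcontainer_build_args` and the image tag `sglang-dev` are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: derive the build args from the current host user instead of
# editing the ARG defaults in .devcontainer/Dockerfile.
# Assumptions: run from the repository root; "sglang-dev" is a made-up tag.

devcontainer_build_args() {
  # Print the --build-arg flags that override HOST_UID and HOST_GID
  # with the numeric IDs of the user running this script.
  printf -- '--build-arg HOST_UID=%s --build-arg HOST_GID=%s\n' "$(id -u)" "$(id -g)"
}

# Print the full command instead of running it, so it can be inspected first.
echo "docker build $(devcontainer_build_args) -t sglang-dev -f .devcontainer/Dockerfile .devcontainer"
```

Copying the printed command into a terminal builds the image with a `devuser` whose IDs match the host account, which is what the workaround linked in the Dockerfile comment is after.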
24 changes: 24 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,24 @@
{
    "name": "sglang",
    "build": {
        "dockerfile": "Dockerfile"
    },
    "remoteUser": "devuser",
    "customizations": {
        "vscode": {
            "extensions": [
                // Python development
                "ms-python.python",
                "charliermarsh.ruff",
                // Rust development
                "rust-lang.rust-analyzer",
                "tamasfe.even-better-toml"
            ]
        }
    },
    "forwardPorts": [],
    "runArgs": [
        "--gpus",
        "all"
    ]
}
4 changes: 3 additions & 1 deletion .github/CODEOWNERS
@@ -2,6 +2,7 @@
 /python/sglang/srt @merrymercy @Ying1123 @hnyls2002 @zhyncs @ispobock @ByronHsu
 /python/sglang/srt/constrained @hnyls2002
 /python/sglang/srt/layers @merrymercy @Ying1123 @zhyncs @ispobock
+/python/sglang/srt/layers/moe/fused_moe_triton @zhyncs @ispobock @HaiShaw
 /python/sglang/srt/lora @Ying1123
 /python/sglang/srt/managers @merrymercy @Ying1123 @hnyls2002
 /python/sglang/srt/mem_cache @merrymercy @Ying1123 @hnyls2002
@@ -11,4 +12,5 @@
 /python/sglang/srt/sampling @merrymercy @hnyls2002
 /test/lang @merrymercy @Ying1123 @ByronHsu
 /test/srt @merrymercy @Ying1123 @zhyncs
-/rust @ByronHsu @Ying1123
+/sgl-router @ByronHsu @Ying1123
+/sgl-kernel @zhyncs @ispobock @HandH1998 @BBuf @yizhang2077 @merrymercy
7 changes: 4 additions & 3 deletions .github/pull_request_template.md
@@ -10,6 +10,7 @@

 ## Checklist

-- [ ] Format your code according to the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/references/contributor_guide.md).
-- [ ] Add unit tests as outlined in the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/references/contributor_guide.md).
-- [ ] Update documentation as needed, including docstrings or example tutorials.
+- [ ] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit).
+- [ ] Add unit tests as outlined in the [Running Unit Tests](https://docs.sglang.ai/references/contribution_guide.html#running-unit-tests-adding-to-ci).
+- [ ] Update documentation / docstrings / example tutorials as needed, according to [Writing Documentation](https://docs.sglang.ai/references/contribution_guide.html#writing-documentation-running-docs-ci).
+- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to [Benchmark and Profiling](https://docs.sglang.ai/references/benchmark_and_profiling.html).
2 changes: 1 addition & 1 deletion .github/workflows/execute-notebook.yml
@@ -42,7 +42,7 @@ jobs:
           python -m ipykernel install --user --name python3 --display-name "Python 3"

       - name: Execute notebooks
-        timeout-minutes: 30
+        timeout-minutes: 40
         run: |
           cd docs
           make clean
30 changes: 30 additions & 0 deletions .github/workflows/experiment-runner.yml
@@ -0,0 +1,30 @@
name: Experiment Runner

on:
  workflow_dispatch:
    inputs:
      script:
        description: "Experiment Runner Script"
        default: "configs/sharegpt_config.yaml"

concurrency:
  group: experiment-runner-${{ github.ref }}
  cancel-in-progress: true

jobs:
  experiment-runner-1-gpu:
    if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
    runs-on: 1-gpu-runner
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Install dependencies
        run: |
          bash scripts/ci_install_dependency.sh

      - name: Test experiment runner
        timeout-minutes: 120
        run: |
          cd test/srt
          python3 experiment_runner.py --config ${{ inputs.script }}
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
-name: Nightly Evaluation
+name: Nightly Test

 on:
   schedule:
@@ -11,11 +11,11 @@ on:
   workflow_dispatch:

 concurrency:
-  group: nightly-eval-${{ github.ref }}
+  group: nightly-test-${{ github.ref }}
   cancel-in-progress: true

 jobs:
-  nightly-eval-2-gpu:
+  nightly-test:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
     runs-on: 2-gpu-runner
     steps:
@@ -27,14 +27,8 @@ jobs:
           bash scripts/ci_install_dependency.sh
           pip install --upgrade "evalplus[vllm] @ git+https://github.com/evalplus/evalplus"

-      - name: Test gsm8k
+      - name: Run test
         timeout-minutes: 120
         run: |
           cd test/srt
-          python3 test_nightly_gsm8k_eval.py
-
-      - name: Test human eval
-        timeout-minutes: 120
-        run: |
-          cd test/srt
-          python3 test_nightly_human_eval.py
+          python3 run_suite.py --suite nightly --timeout-per-file 2400
19 changes: 10 additions & 9 deletions .github/workflows/pr-test-rust.yml
@@ -4,11 +4,11 @@ on:
   push:
     branches: [ main ]
     paths:
-      - "rust/**"
+      - "sgl-router/**"
   pull_request:
     branches: [ main ]
     paths:
-      - "rust/**"
+      - "sgl-router/**"
   workflow_dispatch:

 concurrency:
@@ -30,17 +30,17 @@ jobs:
       - name: Run fmt
         run: |
           source "$HOME/.cargo/env"
-          cd rust/
+          cd sgl-router/
           cargo fmt -- --check

       - name: Run test
         timeout-minutes: 20
         run: |
           source "$HOME/.cargo/env"
-          cd rust/
+          cd sgl-router/
           cargo test

-  e2e-rust:
+  e2e-python:
     if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
     runs-on: 2-gpu-runner
     steps:
@@ -54,17 +54,18 @@ jobs:
       - name: Build python binding
         run: |
           source "$HOME/.cargo/env"
-          cd rust
+          cd sgl-router
           pip install setuptools-rust wheel build
           python3 -m build
-          pip install dist/*.whl
+          pip install --force-reinstall dist/*.whl
       - name: Run e2e test
         run: |
-          cd rust/py_test
+          bash scripts/killall_sglang.sh "nuk_gpus"
+          cd sgl-router/py_test
           python3 run_suite.py

   finish:
-    needs: [unit-test-rust, e2e-rust]
+    needs: [unit-test-rust, e2e-python]
     runs-on: ubuntu-latest
     steps:
       - name: Finish
103 changes: 103 additions & 0 deletions .github/workflows/pr-test-sgl-kernel.yml
@@ -0,0 +1,103 @@
name: PR Test (sgl-kernel)

on:
  push:
    branches: [ main ]
    paths:
      - "sgl-kernel/**"
  pull_request:
    branches: [ main ]
    paths:
      - "sgl-kernel/**"
  workflow_dispatch:

concurrency:
  group: pr-test-sgl-kernel-${{ github.ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Check clang-format
        uses: DoozyX/[email protected]
        with:
          source: sgl-kernel
          extensions: h,c,cpp,hpp,cu,cuh,cc
          clangFormatVersion: 16
          style: file

  build-wheels:
    if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
    runs-on: sgl-kernel-build-node
    strategy:
      matrix:
        python-version: ['3.9']
        cuda-version: ['12.4']

    steps:
      - name: Cleanup
        run: |
          sudo rm -rf $GITHUB_WORKSPACE/* || true

      - uses: actions/checkout@v4
        with:
          submodules: 'recursive'

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Build wheels for Python ${{ matrix.python-version }} and CUDA ${{ matrix.cuda-version }}
        run: |
          cd sgl-kernel
          chmod +x ./build.sh
          ./build.sh "${{ matrix.python-version }}" "${{ matrix.cuda-version }}"

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: wheel-python${{ matrix.python-version }}-cuda${{ matrix.cuda-version }}
          path: sgl-kernel/dist/*

  unit-test:
    if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'
    needs: build-wheels
    runs-on: 1-gpu-runner
    steps:
      - uses: actions/checkout@v4

      - name: Download artifacts
        uses: actions/download-artifact@v4
        with:
          path: sgl-kernel/dist/
          merge-multiple: true
          pattern: wheel-*

      - name: Install
        run: |
          pip3 install torch==2.5.1 && pip3 install pytest && pip3 install vllm==0.6.4.post1
          pip3 uninstall sgl-kernel -y || true
          pip3 install sgl-kernel/dist/*whl --force-reinstall --no-deps
          pip3 list | grep sgl-kernel

      - name: Run test
        timeout-minutes: 30
        run: |
          cd sgl-kernel
          find tests -name "test_*.py" | xargs -n 1 python3

      - name: Uninstall dependencies
        run: |
          pip3 uninstall sgl-kernel -y

  finish:
    needs: [unit-test, lint]
    runs-on: ubuntu-latest
    steps:
      - name: Finish
        run: echo "This is an empty step to ensure that all jobs are completed."