forked from sgl-project/sglang
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' of https://github.com/EvolvingLMMs-Lab/sglang
Showing 282 changed files with 24,307 additions and 8,083 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
name: 🐞 Bug report | ||
description: Create a report to help us reproduce and fix the bug | ||
title: "[Bug] " | ||
labels: ['Bug'] | ||
|
||
body: | ||
- type: checkboxes | ||
attributes: | ||
label: Checklist | ||
options: | ||
- label: 1. I have searched related issues but cannot get the expected help. | ||
- label: 2. The bug has not been fixed in the latest version. | ||
- label: 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback. | ||
- label: 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose instead. Otherwise, it will be closed. | ||
- label: 5. Please use English, otherwise it will be closed. | ||
- type: textarea | ||
attributes: | ||
label: Describe the bug | ||
description: A clear and concise description of what the bug is. | ||
validations: | ||
required: true | ||
- type: textarea | ||
attributes: | ||
label: Reproduction | ||
description: | | ||
What command or script did you run? Which **model** are you using? | ||
placeholder: | | ||
A placeholder for the command. | ||
validations: | ||
required: true | ||
- type: textarea | ||
attributes: | ||
label: Environment | ||
description: | | ||
Please provide necessary environment information here with `python3 -m sglang.check_env`. Otherwise the issue will be closed. | ||
placeholder: Environment here. | ||
validations: | ||
required: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
name: 🚀 Feature request | ||
description: Suggest an idea for this project | ||
title: "[Feature] " | ||
|
||
body: | ||
- type: checkboxes | ||
attributes: | ||
label: Checklist | ||
options: | ||
- label: 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose instead. Otherwise, it will be closed. | ||
- label: 2. Please use English, otherwise it will be closed. | ||
- type: textarea | ||
attributes: | ||
label: Motivation | ||
description: | | ||
A clear and concise description of the motivation of the feature. | ||
validations: | ||
required: true | ||
- type: textarea | ||
attributes: | ||
label: Related resources | ||
description: | | ||
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
<!-- Thank you for your contribution! We appreciate it. The following guidelines will help improve your pull request and facilitate feedback. If anything is unclear, don't hesitate to submit your pull request and ask the maintainers for assistance. --> | ||
|
||
## Motivation | ||
|
||
<!-- Explain the purpose of this PR and the goals it aims to achieve. --> | ||
|
||
## Modifications | ||
|
||
<!-- Describe the changes made in this PR. --> | ||
|
||
## Checklist | ||
|
||
- [ ] Format your code according to the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/en/contributor_guide.md). | ||
- [ ] Add unit tests as outlined in the [Contributor Guide](https://github.com/sgl-project/sglang/blob/main/docs/en/contributor_guide.md). | ||
- [ ] Update documentation as needed, including docstrings or example tutorials. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
name: Accuracy Test | ||
|
||
on: | ||
push: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
pull_request: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
workflow_dispatch: | ||
|
||
concurrency: | ||
group: accuracy-test-${{ github.ref }} | ||
cancel-in-progress: true | ||
|
||
jobs: | ||
accuracy-test: | ||
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request' | ||
runs-on: 1-gpu-runner | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v3 | ||
|
||
- name: Install dependencies | ||
run: | | ||
pip install --upgrade pip | ||
pip install -e "python[all]" | ||
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall | ||
git clone https://github.com/merrymercy/human-eval.git | ||
cd human-eval | ||
pip install -e . | ||
- name: Evaluate Accuracy | ||
timeout-minutes: 20 | ||
run: | | ||
cd test/srt | ||
python3 test_eval_accuracy_large.py |
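
The job-level `if:` in this workflow decides whether the job runs at all, which matters for a fork such as EvolvingLMMs-Lab/sglang. The following is a minimal Python sketch of that predicate, not part of the commit; the function name `should_run` is hypothetical, and the case-insensitive comparison reflects how GitHub Actions expressions compare strings.

```python
def should_run(repository: str, event_name: str) -> bool:
    """Mirror of: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request'."""
    # GitHub Actions ignores case when comparing strings, so fold case here too.
    is_upstream = repository.casefold() == "sgl-project/sglang".casefold()
    is_pull_request = event_name.casefold() == "pull_request"
    return is_upstream or is_pull_request


# Upstream repository: pushes, pull requests, and manual dispatches all run.
print(should_run("sgl-project/sglang", "push"))               # True
# A fork: pushes are skipped, but pull requests still run.
print(should_run("EvolvingLMMs-Lab/sglang", "push"))          # False
print(should_run("EvolvingLMMs-Lab/sglang", "pull_request"))  # True
```

Under this reading, a push to `main` on a fork skips the GPU job, while a pull request on the fork still requires a runner labeled `1-gpu-runner` to be available.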
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
name: Weekly Cache Purge | ||
|
||
on: | ||
schedule: | ||
- cron: '0 0 * * 0' # Every Sunday at 00:00 UTC | ||
workflow_dispatch: | ||
|
||
jobs: | ||
purge-cache: | ||
if: github.repository == 'sgl-project/sglang' | ||
runs-on: self-hosted | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v3 | ||
|
||
- name: Purge pip cache | ||
run: | | ||
source $HOME/venv/bin/activate | ||
echo "$HOME/venv/bin" >> $GITHUB_PATH | ||
pip cache purge | ||
- name: Update dependencies | ||
run: | | ||
pip install --upgrade pip | ||
pip install -e "python[all]" | ||
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall |
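
The schedule `0 0 * * 0` in this workflow reads as minute 0, hour 0, any day of month, any month, day of week 0 (Sunday), and GitHub evaluates scheduled workflows in UTC. As a rough illustration only, the sketch below computes the next matching time for that one expression; `next_weekly_purge` is a hypothetical helper, not a general cron parser, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_purge(now: datetime) -> datetime:
    """Return the next Sunday 00:00 UTC strictly after `now` (cron '0 0 * * 0')."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # Python numbers weekdays Monday=0 .. Sunday=6, whereas cron uses Sunday=0.
    days_until_sunday = (6 - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_until_sunday)
    if candidate <= now:
        # Today is Sunday and midnight has already passed: wait a full week.
        candidate += timedelta(days=7)
    return candidate


wednesday = datetime(2024, 9, 4, 12, 0, tzinfo=timezone.utc)
print(next_weekly_purge(wednesday).isoformat())  # 2024-09-08T00:00:00+00:00
```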
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
name: Cancel PR Workflows on Merge | ||
|
||
on: | ||
pull_request_target: | ||
types: | ||
- closed | ||
|
||
permissions: | ||
actions: write | ||
|
||
jobs: | ||
cancel: | ||
if: github.event.pull_request.merged == true | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Cancel Previous Runs | ||
uses: styfle/cancel-workflow-action@0.12.1 | ||
with: | ||
workflow_id: all | ||
access_token: ${{ secrets.GITHUB_TOKEN }} | ||
ignore_sha: true | ||
pr_number: ${{ github.event.pull_request.number }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
name: Close Inactive Issues | ||
|
||
on: | ||
schedule: | ||
- cron: '0 0 * * *' | ||
workflow_dispatch: | ||
|
||
permissions: | ||
issues: write | ||
contents: read | ||
|
||
jobs: | ||
close-inactive-issues: | ||
if: github.repository == 'sgl-project/sglang' | ||
runs-on: ubuntu-latest | ||
steps: | ||
- name: Check and close inactive issues | ||
uses: actions/github-script@v6 | ||
with: | ||
github-token: ${{secrets.GITHUB_TOKEN}} | ||
script: | | ||
const sixtyDaysAgo = new Date(Date.now() - 60 * 24 * 60 * 60 * 1000); | ||
const [owner, repo] = process.env.GITHUB_REPOSITORY.split('/'); | ||
console.log(`Owner: ${owner}, Repo: ${repo}`); | ||
async function fetchIssues(page = 1) { | ||
console.log(`Fetching issues for ${owner}/${repo}, page ${page}`); | ||
return await github.rest.issues.listForRepo({ | ||
owner, | ||
repo, | ||
state: 'open', | ||
sort: 'updated', | ||
direction: 'asc', | ||
per_page: 100, | ||
page: page | ||
}); | ||
} | ||
async function processIssues() { | ||
console.log('Starting to process issues'); | ||
console.log(`Repository: ${owner}/${repo}`); | ||
let page = 1; | ||
let hasMoreIssues = true; | ||
while (hasMoreIssues) { | ||
try { | ||
const issues = await fetchIssues(page); | ||
console.log(`Fetched ${issues.data.length} issues on page ${page}`); | ||
if (issues.data.length === 0) { | ||
hasMoreIssues = false; | ||
break; | ||
} | ||
for (const issue of issues.data) { | ||
if (new Date(issue.updated_at) < sixtyDaysAgo) { | ||
try { | ||
await github.rest.issues.update({ | ||
owner, | ||
repo, | ||
issue_number: issue.number, | ||
state: 'closed', | ||
labels: [...issue.labels.map(l => l.name), 'inactive'] | ||
}); | ||
await github.rest.issues.createComment({ | ||
owner, | ||
repo, | ||
issue_number: issue.number, | ||
body: 'This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.' | ||
}); | ||
console.log(`Closed issue #${issue.number} due to inactivity.`); | ||
} catch (error) { | ||
console.error(`Failed to close issue #${issue.number}: ${error.message}`); | ||
} | ||
} else { | ||
console.log(`Issue #${issue.number} is still active. Stopping processing.`); | ||
hasMoreIssues = false; | ||
break; | ||
} | ||
} | ||
page += 1; | ||
} catch (error) { | ||
console.error(`Error fetching issues on page ${page}: ${error.message}`); | ||
hasMoreIssues = false; | ||
} | ||
} | ||
console.log('Finished processing issues'); | ||
} | ||
await processIssues(); |
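
The script above relies on the listing being sorted by `updated` in ascending order: it closes issues until it meets the first one updated within the last 60 days, then stops. A minimal Python sketch of that selection rule follows, using plain dictionaries in place of API responses; `select_inactive` and the sample issue numbers are hypothetical, and no GitHub calls are made.

```python
from datetime import datetime, timedelta, timezone


def select_inactive(issues, now, max_idle_days=60):
    """Return numbers of issues to close, given issues sorted by updated_at ascending."""
    cutoff = now - timedelta(days=max_idle_days)
    inactive = []
    for issue in issues:
        # GitHub timestamps look like "2024-07-02T23:59:59Z"; make the offset explicit.
        updated = datetime.fromisoformat(issue["updated_at"].replace("Z", "+00:00"))
        if updated >= cutoff:
            # First active issue: everything after it is newer, so stop here.
            break
        inactive.append(issue["number"])
    return inactive


now = datetime(2024, 9, 1, tzinfo=timezone.utc)  # cutoff is 2024-07-03T00:00:00Z
issues = [
    {"number": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"number": 2, "updated_at": "2024-07-02T23:59:59Z"},
    {"number": 3, "updated_at": "2024-07-03T00:00:00Z"},
]
print(select_inactive(issues, now))  # [1, 2]
```

One point that may be worth checking in the workflow script itself: closing an issue removes it from the `state: 'open'` listing, so incrementing `page` after closing a full page could skip some issues until the next daily run.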
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
name: E2E Test | ||
|
||
on: | ||
push: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
pull_request: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
workflow_dispatch: | ||
|
||
concurrency: | ||
group: e2e-test-${{ github.ref }} | ||
cancel-in-progress: true | ||
|
||
jobs: | ||
e2e-test: | ||
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request' | ||
runs-on: 1-gpu-runner | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v3 | ||
|
||
- name: Install dependencies | ||
run: | | ||
pip install --upgrade pip | ||
pip install -e "python[all]" | ||
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall | ||
- name: Benchmark Serving Throughput | ||
timeout-minutes: 10 | ||
run: | | ||
cd test/srt | ||
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default | ||
- name: Benchmark Serving Latency | ||
timeout-minutes: 10 | ||
run: | | ||
python3 -m sglang.bench_latency --model meta-llama/Meta-Llama-3.1-8B-Instruct --batch-size 1 --input 128 --output 8 | ||
- name: Benchmark Serving Throughput (w/o RadixAttention) | ||
timeout-minutes: 10 | ||
run: | | ||
cd test/srt | ||
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_radix_cache | ||
- name: Benchmark Serving Throughput (w/o ChunkedPrefill) | ||
timeout-minutes: 10 | ||
run: | | ||
cd test/srt | ||
python3 -m unittest test_serving_throughput.TestServingThroughput.test_default_without_chunked_prefill |
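
The `concurrency` block in this workflow keys runs by `e2e-test-${{ github.ref }}` and sets `cancel-in-progress: true`, so a new run on the same ref replaces the one still in progress. The sketch below is a deliberately simplified model of that behavior, assuming at most one in-progress run per group and ignoring queued runs; `start_run`, the run ids, and the example ref are hypothetical.

```python
def start_run(in_progress: dict, workflow: str, ref: str, run_id: int):
    """Register a new run and return the id of the run it cancels, or None."""
    group = f"{workflow}-{ref}"          # same shape as e2e-test-${{ github.ref }}
    cancelled = in_progress.get(group)   # a run already in progress for this group, if any
    in_progress[group] = run_id          # the new run takes over the group
    return cancelled


runs = {}
print(start_run(runs, "e2e-test", "refs/pull/7/merge", 101))  # None
print(start_run(runs, "e2e-test", "refs/pull/7/merge", 102))  # 101
print(start_run(runs, "e2e-test", "refs/heads/main", 103))    # None
```

Because the workflow name is part of the group key, the E2E, accuracy, and MoE workflows do not cancel each other even when they run on the same ref.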
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
name: Lint | ||
|
||
on: [push, pull_request] | ||
|
||
jobs: | ||
lint: | ||
runs-on: ubuntu-20.04 | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python 3.8 | ||
uses: actions/setup-python@v2 | ||
with: | ||
python-version: 3.8 | ||
- name: Install pre-commit hook | ||
run: | | ||
python -m pip install pre-commit | ||
pre-commit install | ||
- name: Linting | ||
run: pre-commit run --all-files |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
name: MoE Test | ||
|
||
on: | ||
push: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
pull_request: | ||
branches: [ main ] | ||
paths: | ||
- "python/sglang/**" | ||
- "test/**" | ||
workflow_dispatch: | ||
|
||
concurrency: | ||
group: moe-test-${{ github.ref }} | ||
cancel-in-progress: true | ||
|
||
jobs: | ||
moe-test: | ||
if: github.repository == 'sgl-project/sglang' || github.event_name == 'pull_request' | ||
runs-on: 2-gpu-runner | ||
|
||
steps: | ||
- name: Checkout code | ||
uses: actions/checkout@v3 | ||
|
||
- name: Install dependencies | ||
run: | | ||
pip install --upgrade pip | ||
pip install -e "python[all]" | ||
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/ --force-reinstall | ||
- name: Benchmark MoE Serving Throughput | ||
timeout-minutes: 10 | ||
run: | | ||
cd test/srt | ||
python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default | ||
- name: Benchmark MoE Serving Throughput (w/o RadixAttention) | ||
timeout-minutes: 10 | ||
run: | | ||
cd test/srt | ||
python3 -m unittest test_moe_serving_throughput.TestServingThroughput.test_default_without_radix_cache |