[Doc] Add vllm-ascend usage doc & fix doc format #53

Open: shen-shanshan wants to merge 1 commit into base: main

@shen-shanshan commented Feb 12, 2025

What this PR does / why we need it?

  1. Add a vllm-ascend tutorial for serving the Qwen/Qwen2.5-7B-Instruct model.
  2. Fix the formatting of files in the docs directory, e.g. format tables, underline links, add line feeds.

Does this PR introduce any user-facing change?

no.

How was this patch tested?

no.

@shen-shanshan marked this pull request as draft February 12, 2025 09:04
@shen-shanshan (Contributor, Author)

cc: @Yikun @wangxiyuan @MengqingCao

@wangxiyuan (Collaborator)

No need to update the installation and quick start docs. They will be updated in a new PR.

@shen-shanshan (Contributor, Author)

> No need to update the installation and quick start docs. They will be updated in a new PR.

OK.

Outdated review threads (resolved):
docs/source/index.md
docs/source/installation.md
docs/source/quick_start.md
docs/source/running_vllm_with_ascend.md (6 threads)

```bash
cd /usr/local/Ascend/ascend-toolkit/latest/<arch>-linux # <arch>: aarch64 or x86_64
cat ascend_toolkit_install.info
```

Contributor

We can do this with a single command:

cat ~/Ascend/ascend-toolkit/latest/"$(uname -i)"-linux/ascend_toolkit_install.info
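
A slightly more defensive variant of that one-liner, as a sketch only. It assumes the toolkit lives under either `/usr/local/Ascend` (the path used in the doc under review) or `~/Ascend` (the path used in this suggestion). It falls back from `uname -i` to `uname -m` because `uname -i` is not portable and prints `unknown` on some distributions. The helper name `find_toolkit_info` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: locate and print the toolkit install info file.
# Assumption: the toolkit is installed under one of the two roots below.

find_toolkit_info() {
    local arch root candidate
    # `uname -i` may fail or print "unknown"; fall back to `uname -m`.
    arch="$(uname -i 2>/dev/null || true)"
    if [ -z "$arch" ] || [ "$arch" = "unknown" ]; then
        arch="$(uname -m)"
    fi
    for root in /usr/local/Ascend "$HOME/Ascend"; do
        candidate="$root/ascend-toolkit/latest/${arch}-linux/ascend_toolkit_install.info"
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if info_file="$(find_toolkit_info)"; then
    cat "$info_file"
else
    echo "ascend_toolkit_install.info not found" >&2
fi
```

The function prints the first matching path and returns non-zero when neither root contains the file, so a caller can tell "not installed" apart from an empty file.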

Contributor Author

This has been removed now.

Signed-off-by: Shanshan Shen <[email protected]>
@Yikun (Collaborator) left a comment

Overall, it has been greatly improved compared to the previous version, thank you!

```bash
# Use Modelscope mirror to speed up model download
export VLLM_USE_MODELSCOPE=True
export MODELSCOPE_CACHE=/root/models/
```

Collaborator

Suggested change: remove `export MODELSCOPE_CACHE=/root/models/`.

You can use the default cache instead and mount it with `-v /root/.cache:/root/.cache`.
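
To make the suggestion concrete, here is a sketch of a `docker run` invocation that mounts the default cache directory instead of setting `MODELSCOPE_CACHE`. It assumes, as the comment implies, that the default download cache lives under `/root/.cache`. The image name is a hypothetical placeholder (the review does not name one), the driver and device mounts from the doc are omitted for brevity, and the script only prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Sketch: rely on the default cache directory and mount it from the host,
# so downloaded models survive container restarts.

# Hypothetical placeholder; substitute the image you actually use.
IMAGE="${IMAGE:-your-vllm-ascend-image}"

# Note: no MODELSCOPE_CACHE variable and no /root/models mount.
docker_args=(
    run --rm
    -p 8000:8000
    -e VLLM_USE_MODELSCOPE=True
    -v /root/.cache:/root/.cache
    "$IMAGE"
)

# Dry run: print the command instead of executing it.
echo "docker ${docker_args[*]}"
```

Keeping the arguments in an array makes it easy to append the remaining mounts from the doc before swapping the `echo` for a real `docker "${docker_args[@]}"` call.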

-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/models:/root/models \
Collaborator

Suggested change: replace `-v /root/models:/root/models \` with `-v /root/.cache:/root/.cache \`.

-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/models:/root/models \
Collaborator

Suggested change: replace `-v /root/models:/root/models \` with `-v /root/.cache:/root/.cache \`.

-v /root/models:/root/models \
-p 8000:8000 \
-e VLLM_USE_MODELSCOPE=True \
-e MODELSCOPE_CACHE=/root/models/ \
Collaborator

Suggested change: remove `-e MODELSCOPE_CACHE=/root/models/ \`.

-v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
-v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /root/models:/root/models \
Collaborator

Suggested change: replace `-v /root/models:/root/models \` with `-v /root/.cache:/root/.cache \`.

```bash
# Use Modelscope mirror to speed up model download
export VLLM_USE_MODELSCOPE=True
export MODELSCOPE_CACHE=/root/models/
```

Collaborator

Suggested change: remove `export MODELSCOPE_CACHE=/root/models/`.

Comment on lines +160 to +164
def clean_up():
    destroy_model_parallel()
    destroy_distributed_environment()
    gc.collect()
    torch.npu.empty_cache()
Collaborator

This looks a little weird. Would you mind taking a look? @wangxiyuan

Collaborator

Since this is only a simple example, there is no need to do

del llm
clean_up()

4 participants