Helm Chart Lacks Clear Support for Multi-Node vLLM Deployment #50

Open
shohamyamin opened this issue Jan 31, 2025 · 10 comments
Labels
help wanted Extra attention is needed

Comments

@shohamyamin

The current Helm chart does not explicitly support deploying vLLM across multiple nodes on Kubernetes, or it is unclear how to configure it. Either multi-node support or improved documentation is needed for deploying LLMs that require more than one node.

@ahg-g

ahg-g commented Jan 31, 2025

For reference, and if someone would like to add that to helm charts, the LeaderWorkerSet API can be used to deploy multi-node vllm on k8s: https://docs.vllm.ai/en/latest/deployment/frameworks/lws.html

Also check examples on the LWS repo: https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm
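
As a rough illustration of what such a deployment might look like, here is a minimal sketch that builds a LeaderWorkerSet manifest in Python and prints it as JSON. It is not taken from the Helm chart or the linked examples. The function `build_lws_manifest`, the model name `example-org/example-model`, and the object name are hypothetical, and the sketch assumes the container image ships both `ray` and `vllm` on the PATH. It also assumes LWS injects `LWS_LEADER_ADDRESS` into the worker pods, and that tensor parallelism spans the GPUs within a node while pipeline parallelism spans the nodes.

```python
import json


def build_lws_manifest(name, image, model, nodes, gpus_per_node, ray_port=6379):
    """Return a LeaderWorkerSet manifest (as a dict) for one multi-node vLLM group.

    Hypothetical helper for illustration only; it is not part of the Helm chart.
    """
    # Leader: start the Ray head, then serve the model across the whole group.
    # Assumption: a real entrypoint would wait until all workers have joined
    # the Ray cluster before launching vLLM; that wait is omitted here.
    leader_cmd = (
        f"ray start --head --port={ray_port} && "
        f"vllm serve {model} "
        f"--tensor-parallel-size {gpus_per_node} "
        f"--pipeline-parallel-size {nodes}"
    )
    # Worker: join the leader's Ray cluster and block so the pod stays alive.
    # $(LWS_LEADER_ADDRESS) is expanded by Kubernetes from the container env.
    worker_cmd = f"ray start --address=$(LWS_LEADER_ADDRESS):{ray_port} --block"

    def container(cname, cmd):
        return {
            "name": cname,
            "image": image,
            "command": ["sh", "-c", cmd],
            "resources": {"limits": {"nvidia.com/gpu": str(gpus_per_node)}},
        }

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,  # one leader/worker group
            "leaderWorkerTemplate": {
                "size": nodes,  # pods per group: 1 leader + (nodes - 1) workers
                "leaderTemplate": {
                    "spec": {"containers": [container("vllm-leader", leader_cmd)]}
                },
                "workerTemplate": {
                    "spec": {"containers": [container("vllm-worker", worker_cmd)]}
                },
            },
        },
    }


manifest = build_lws_manifest(
    name="vllm-multinode",               # hypothetical object name
    image="vllm/vllm-openai:latest",
    model="example-org/example-model",   # placeholder model id
    nodes=2,
    gpus_per_node=8,
)
print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the output could be piped into `kubectl apply -f -` on a cluster where the LWS controller is installed. Whether the stock vLLM image can bootstrap a Ray cluster this way is exactly the open question in this thread, so treat the commands as placeholders.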

@YuhanLiu11
Collaborator

Thanks for submitting the issue!

Multi-node deployment should be supported. If you encounter any issues running a multi-node deployment, feel free to let us know.

We will improve the documentation to clarify how to configure multi-node deployment on Kubernetes.

@ApostaC added the "help wanted" label Feb 7, 2025
@ApostaC
Collaborator

ApostaC commented Feb 9, 2025

@shohamyamin we recently did some local tests and will update the helm charts and the docs soon!

@shohamyamin
Author

@ApostaC great to hear that. Do you know whether this configuration can run in a rootless environment (like OpenShift or other rootless k8s environments)?

@ApostaC
Collaborator

ApostaC commented Feb 9, 2025

Yeah, you should be able to do that.

I recently set up a rootless k8s environment with kubeadm + 2 physical nodes and successfully got it running.

@shohamyamin
Author

Great, I will try it as soon as the chart is updated.

@ApostaC
Collaborator

ApostaC commented Feb 11, 2025

@shohamyamin Hey, tensor parallelism support was added in PR #105

@simon-mo
Contributor

On LWS, it doesn't work out of the box with the vLLM container image. Please submit a PR to the vLLM repo so that the vLLM container has the ability to start a Ray cluster.

@0xThresh
Contributor

While this is specific to EKS, it looks like AWS has a Dockerfile that supports it here: https://aws-ia.github.io/terraform-aws-eks-blueprints/patterns/machine-learning/multi-node-vllm/#dockerfile

It also looks like there is a PR already open to help address this, so I'll keep an eye on that: vllm-project/vllm#12913

@ahg-g

ahg-g commented Feb 18, 2025

I think we can make the multi-node deployment generic if we have the following script as part of the vllm image: vllm-project/vllm#12913

6 participants