Helm Chart Lacks Clear Support for Multi-Node vLLM Deployment #50
Comments
For reference, in case someone would like to add this to the Helm charts: the LeaderWorkerSet API can be used to deploy multi-node vLLM on Kubernetes: https://docs.vllm.ai/en/latest/deployment/frameworks/lws.html Also see the examples in the LWS repo: https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm
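For a rough idea of the shape, here is a minimal LeaderWorkerSet sketch, assuming the LWS CRD is installed and an image that bundles both vLLM and Ray; the image name, model, port, and commands below are placeholders, not values from this thread:

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: vllm
spec:
  replicas: 1                 # one inference group
  leaderWorkerTemplate:
    size: 2                   # 1 leader + 1 worker pod per group
    leaderTemplate:
      spec:
        containers:
        - name: vllm-leader
          image: my-vllm-image:latest   # placeholder: needs vllm + ray
          command: ["sh", "-c"]
          args:
          - ray start --head --port=6379 &&
            vllm serve my-org/my-model --tensor-parallel-size 2
    workerTemplate:
      spec:
        containers:
        - name: vllm-worker
          image: my-vllm-image:latest   # placeholder
          command: ["sh", "-c"]
          args:
          - ray start --address=$(LWS_LEADER_ADDRESS):6379 --block
```

The full, tested manifests live in the LWS examples repo linked above; this is only meant to show how the leader/worker split maps onto a Ray head and Ray workers.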
Thanks for submitting the issue! Multi-node deployment should be supported. If you encounter any issues running a multi-node deployment, feel free to let us know. We will improve the documentation to clarify how to configure multi-node deployment on Kubernetes.
@shohamyamin we recently did some local tests and will update the helm charts and the docs soon!
@ApostaC great to hear that. Do you know if this configuration can be run in a rootless environment (like OpenShift or other rootless Kubernetes environments)?
Yeah, you should be able to do that. I recently set up a rootless k8s environment with kubeadm and 2 physical nodes and successfully got it running.
Great, I will try it as soon as the chart is updated.
@shohamyamin Hey, tensor parallelism support was added in PR #105
With LWS, it doesn't work out of the box with the vLLM container image. Please submit a PR to the vLLM repo so the vLLM container is able to start a Ray cluster.
While this is specific to EKS, it looks like AWS has a Dockerfile that can support this: https://aws-ia.github.io/terraform-aws-eks-blueprints/patterns/machine-learning/multi-node-vllm/#dockerfile It also looks like there is a PR already open to help address this, so I'll keep an eye on it: vllm-project/vllm#12913
I think we can make the multi-node deployment generic if we have the following script as part of the vllm image: vllm-project/vllm#12913
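To illustrate what such an in-image script would have to decide, here is a sketch of the role-selection logic only. It assumes (hypothetically) that the orchestrator injects `LWS_WORKER_INDEX` and `LWS_LEADER_ADDRESS` into each pod; the actual script in the linked PR may use different variables and flags.

```shell
#!/bin/sh
# Sketch of an entrypoint helper for a vLLM image joining a Ray cluster.
# Env var names are assumptions modeled on LeaderWorkerSet; verify
# against the LWS docs and the linked vLLM PR before relying on them.

ray_command() {
  # Pod index 0 in the group acts as the Ray head; all others join it.
  if [ "${LWS_WORKER_INDEX:-0}" -eq 0 ]; then
    echo "ray start --head --port=6379"
  else
    echo "ray start --address=${LWS_LEADER_ADDRESS}:6379 --block"
  fi
}

# A real entrypoint would then run the command and, on the leader,
# launch the server once all workers have joined, e.g.:
#   eval "$(ray_command)" && vllm serve "$MODEL" --tensor-parallel-size ...
```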
The current Helm chart does not explicitly support deploying vLLM across multiple nodes on Kubernetes, or it is unclear how to configure it. Improved documentation or explicit multi-node support is needed for serving models that require more than one node.