Can RamaLama support Kubernetes-based inference clustering on a Mac Mini M4? #742
Replies: 7 comments
-
It can be done. The first step is to install podman machine with krunkit (see https://podman.io/). There is also a Kubernetes YAML generator via `ramalama serve --generate kube` (see below).
Most of the bits are there; someone just needs to tie it all together.
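As a rough sketch of that first step, assuming Homebrew and a recent Podman; the krunkit install method and the provider setting may differ by version, so check the krunkit README:

```shell
# krunkit provides libkrun-backed VMs with GPU acceleration on
# Apple silicon; it may require a tap, see the krunkit README.
brew install podman krunkit

# Select libkrun as the machine provider instead of the default
# applehv, then create and start the VM.
export CONTAINERS_MACHINE_PROVIDER=libkrun
podman machine init
podman machine start
```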
-
Would make a great blog post!
-
`ramalama serve --generate kube MODEL` will generate a Kubernetes deployment for the containerized AI model.
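For anyone trying this end to end, a minimal sketch; the model name is just an example, and the output file name is an assumption (RamaLama prints the path it actually writes):

```shell
# Generate Kubernetes YAML for a containerized model service.
ramalama serve --generate kube tinyllama

# Apply it to a cluster...
kubectl apply -f tinyllama.yaml

# ...or run the same YAML directly on the podman machine,
# no cluster required.
podman kube play tinyllama.yaml
```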
-
Can Docker Desktop serve as a replacement for Podman in this setup?
-
Docker Desktop should be relatively compatible, though I haven't tried it personally. If you want accelerated GPU support on Docker Desktop, I recommend trying the Docker VMM beta (Podman Desktop is better tested with RamaLama, though). The Docker VMM beta uses krunkit: https://github.com/containers/krunkit
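If you do try it, a minimal sketch of pointing RamaLama at Docker, assuming the top-level --engine option is available in your RamaLama version (Podman is the default engine):

```shell
# Run the model server through Docker instead of Podman.
ramalama --engine docker serve tinyllama
```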
-
I am going to move this to a discussion.
-
Maybe Podman is a better choice.
-
I’m really impressed with the work being done on RamaLama and its ability to handle various models for inference. I have been exploring the possibility of leveraging Kubernetes (K8s) for distributed model inference in my local environment, specifically on a Mac Mini.
Is there currently support within RamaLama to deploy and manage model inference workloads using Kubernetes on a Mac Mini?
If not natively supported, are there any recommendations or best practices for integrating RamaLama with Kubernetes for local development/testing purposes? (One local-cluster approach is sketched below.)
Are there any known limitations or considerations when running Kubernetes-based inference clusters on ARM-based hardware (like the M4 chip) using RamaLama?
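On the local development/testing question, a minimal sketch using kind, which ships arm64 node images and so runs natively on the M4; the model and file names are illustrative, as above:

```shell
# Create a single-node local cluster on Apple silicon.
kind create cluster --name ramalama-test

# Confirm the node reports arm64 before scheduling inference pods.
kubectl get nodes -o wide

# Generate and deploy a RamaLama manifest (file name assumed;
# RamaLama prints the actual path on generation).
ramalama serve --generate kube tinyllama
kubectl apply -f tinyllama.yaml
```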