ramalama container: Make it possible to build basic container on all RHEL architectures #722
Conversation
Reviewer's Guide by Sourcery

This pull request enables building the ramalama container on all RHEL architectures by adding support for OpenBLAS when the architecture is not x86_64 or aarch64. Additionally, the platform detection logic was updated to use the container manager to determine the platform.

Flow diagram for container build architecture selection:

```mermaid
graph TD
    A[Start Build] --> B{Check Architecture}
    B -->|x86_64 or aarch64| C[Install Vulkan Dependencies]
    B -->|Other architectures| D[Install OpenBLAS Dependencies]
    C --> E[Build with Kompute]
    D --> F[Build with OpenBLAS]
    E --> G[Final Container]
    F --> G
```
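The architecture selection above could be sketched as a small shell helper of the kind a Containerfile build script might use. This is a hypothetical sketch, not the PR's actual diff: the `GGML_KOMPUTE`, `GGML_BLAS`, and `GGML_BLAS_VENDOR` flag names mirror llama.cpp's CMake options mentioned in the discussion below, while the function name and structure are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of the arch-based backend selection shown in the
# flow diagram. Flag names follow llama.cpp's CMake options; everything
# else (function name, exact arch list) is assumed, not taken from the PR.
select_backend() {
  case "$1" in
    x86_64|aarch64)
      # Vulkan-capable architectures: build the Kompute GPU backend
      echo "-DGGML_KOMPUTE=ON"
      ;;
    *)
      # Remaining RHEL architectures (e.g. ppc64le, s390x):
      # CPU-based inference via OpenBLAS
      echo "-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS"
      ;;
  esac
}

select_backend "$(uname -m)"
```

In a real Containerfile the same branch would also drive which `dnf` packages get installed (Vulkan headers versus `openblas-devel`), matching the two install paths in the diagram.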
Hey @jcajka - I've reviewed your changes - here's some feedback:
Overall Comments:
- Please expand the PR description to specify which RHEL architectures are now supported and explain the different build approaches used (Kompute for x86_64/aarch64 vs OpenBLAS for others).
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Could you describe what OpenBLAS gives us that makes it applicable only to ppc and s390? Worth putting in the commit message as well.
This PR needs a rebase.
RHEL architectures

Signed-off-by: Jakub Čajka <[email protected]>
Feels kind of nitpicky, arbitrary. Why are you using Kompute instead of naive CPU backends?
It's ok, don't let it block you, just curious. I know nothing about OpenBLAS personally.

For the generic RamaLama container, the intent is that it is used for CPU-based inferencing or as a generic GPU backend. The most generic GPU API around is probably Vulkan, and there are two Vulkan backends available for llama.cpp: one enabled via GGML_VULKAN=ON (this is probably the more popular), the other enabled via GGML_KOMPUTE=ON.

One of the GPUs we use this for is the one exposed via krunkit on podman-machine on macOS. @slp has had great joy using Kompute there.

If it felt nitpicky, I apologise. I am curious what the advantages of this backend over the other are. It's good that we give people choice over backends.
Make it possible to build ramalama container on all RHEL arches.
Summary by Sourcery
Build: