[FEA] Build wheel for cuml-cpu #6279

Open
marcosgalleterobbva opened this issue Jan 30, 2025 · 1 comment
Labels
? - Needs Triage (Need team to review and classify), feature request (New feature or request)

Comments

@marcosgalleterobbva

Is your feature request related to a problem? Please describe.
We are trying to use cuml-cpu in an environment that has no access to a GPU and can only install packages from a private PyPI repository. Right now the cuml-cpu package is only listed in the rapidsai conda repository, not on PyPI. There are contributions that create the wheel for the cuml package, but not for cuml-cpu.
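
To make the constraint concrete, here is a minimal sketch of the only kind of install step such an environment allows: a wheel-only install from a private index. The index URL is a hypothetical placeholder, and the install would only succeed once a cuml-cpu wheel has actually been published to that index, which is what this issue asks for.

```python
import sys


def install_command(package, index_url):
    """Build a pip command that installs `package` as a wheel from one index.

    `index_url` is an assumption: replace it with the private PyPI mirror.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,      # use only the private index
        "--only-binary", ":all:",      # refuse source builds, wheels only
        package,
    ]


if __name__ == "__main__":
    # Hypothetical private index URL, for illustration only.
    print(" ".join(install_command("cuml-cpu", "https://pypi.example.internal/simple")))
```

The `--only-binary :all:` option mirrors the restriction described above: without a published wheel, pip has nothing it is allowed to install.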

Describe the solution you'd like
We have seen a contribution by @jameslamb that creates the wheel for the cuml library. I have tried to write a script that passes the rapidsai.disable-cuda=true flag to the pip wheel command, but this package is highly complex and I do not think I am capable of doing it correctly. We would love to have a /ci/build_wheel_cuml-cpu.sh script that creates the wheel for the cuml-cpu version.
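
For reference, the attempt described above amounts to roughly the following. This is only a sketch under assumptions: the source directory `python/cuml` and the output directory `dist` are placeholders, and passing rapidsai.disable-cuda=true through pip's `--config-settings` option is not confirmed to be sufficient on its own to produce a working cuml-cpu wheel.

```python
import shlex
import sys


def build_wheel_command(source_dir, wheel_dir, disable_cuda=True):
    """Build a `pip wheel` command for the package in `source_dir`.

    `source_dir` and `wheel_dir` are assumptions, not paths confirmed
    by the cuml repository.
    """
    cmd = [
        sys.executable, "-m", "pip", "wheel", source_dir,
        "--wheel-dir", wheel_dir,   # where the built wheel is written
        "--no-deps",                # build only this package's wheel
    ]
    if disable_cuda:
        # Setting named in this issue; forwarded to the build backend by pip.
        cmd += ["--config-settings", "rapidsai.disable-cuda=true"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, since the build itself is untested.
    print(shlex.join(build_wheel_command("python/cuml", "dist")))
```

The function only assembles the command, so it can be inspected or adapted before anything is run with `subprocess.run`.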

Describe alternatives you've considered
We could maybe install the dependencies listed for the conda package and then build and install from source using the code found in the build.sh section for cuml-cpu. This could work, but we work in an environment that does not allow us much more than just installing a wheel from a PyPI repository prior to the creation of the environment.

Additional context
We are trying to use the very promising NVIDIA/spark-rapids-ml package in our Spark clusters so that we could use DBSCAN, UMAP and other algorithms. Unfortunately, we do not have access to clusters that have GPUs. We were hoping to be able to use the cuml-cpu package as a dependency for spark-rapids-ml and thus bypass the dependency that spark-rapids-ml has on GPUs and CUDA. Do you think that this could be possible?

Thank you very much for the amazing work that you are doing.

marcosgalleterobbva added the ? - Needs Triage and feature request labels Jan 30, 2025
@dantegd
Member

dantegd commented Feb 4, 2025

Thanks for the issue, @marcosgalleterobbva. The use case of spark-rapids-ml on CPU is not one I had thought about before. Let me check with the more Spark-savvy folks on the team and circle back here about enabling spark-rapids-ml on CPUs.
