[FEA] See if we can set spark GPU resource configs from the plugin #12108

Open
revans2 opened this issue Feb 11, 2025 · 2 comments
Labels: feature request (New feature or request)

Comments


revans2 commented Feb 11, 2025

Is your feature request related to a problem? Please describe.
In order to reduce the friction in adopting the RAPIDS accelerator it would be great if we could set/update Spark resource configs for the user when the plugin is enabled.

These configs are

spark.executor.resource.gpu.amount=1 and spark.task.resource.gpu.amount

We need to test whether we can set these values from the plugin and have them apply correctly across all of the Spark versions and configurations that we support. As for the value of spark.task.resource.gpu.amount, we can either set it to 1/16 (0.0625) to cap the task-to-GPU ratio at 16 to 1, or set it to a very small value so that the GPU amount is not what limits the number of concurrent tasks.
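For reference, here is a minimal sketch of the two settings discussed above, rendered as `spark-submit` flags. The helper names and the `max_tasks_per_gpu` parameter are hypothetical and exist only for illustration; they are not part of Spark or the plugin. The sketch only builds the strings and does not show how the plugin would apply them.

```python
def gpu_resource_confs(max_tasks_per_gpu=16):
    """Return the two Spark GPU resource settings as a dict of strings.

    max_tasks_per_gpu is a hypothetical knob, not a Spark or plugin option.
    """
    if max_tasks_per_gpu < 1:
        raise ValueError("max_tasks_per_gpu must be at least 1")
    return {
        # One GPU per executor, as proposed in the issue.
        "spark.executor.resource.gpu.amount": "1",
        # Fraction of a GPU that each task requests; 1/16 == 0.0625.
        "spark.task.resource.gpu.amount": str(1 / max_tasks_per_gpu),
    }


def as_submit_flags(confs):
    """Render the settings as spark-submit --conf arguments."""
    flags = []
    for key, value in confs.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


if __name__ == "__main__":
    print(" ".join(as_submit_flags(gpu_resource_confs())))
```

With the default of 16 tasks per GPU, the task amount comes out as 0.0625, matching the 1/16 value mentioned above.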

revans2 added the "? - Needs Triage" (Need team to review and classify) and "feature request" (New feature or request) labels Feb 11, 2025
mattahrens removed the "? - Needs Triage" (Need team to review and classify) label Feb 11, 2025

revans2 commented Feb 13, 2025

I have tried to set these configs in the plugin for both the driver and the executor, but it looks like the configs are not showing up. The issue appears to be with the worker. The worker does not include the plugin, but it is the one that reads some of these configs and reports back to the driver.


revans2 commented Feb 14, 2025

I just confirmed that our plugin also comes up too late, even on the driver, to set the configs we want. So this is something that can only be set by the auto-tuner or some kind of bootstrap script.

I know that the auto-tuner already sets the task amount.

https://github.com/NVIDIA/spark-rapids-tools/blob/86cacaf968cdd1434d4bf1bb6a5fb54a587854ea/user_tools/src/spark_rapids_pytools/rapids/rapids_tool.py#L398

But Spark complains if you do not set the executor.gpu.amount too. @amahussein have we seen issues with the executor.gpu.amount not being set? If not, then I will just close this issue as cannot fix. If so, then we need to spend some time to understand the right way to set it and the right value to use.
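As an illustration of the bootstrap-script option, a wrapper could add the GPU settings to the `spark-submit` arguments before Spark starts, but only when the user has not already set them. This is a rough sketch under assumptions: `GPU_DEFAULTS`, `configured_keys`, and `with_gpu_defaults` are hypothetical names, the default values are the ones proposed in this issue, and only the two-token `--conf key=value` form is recognised. It does not look at `spark-defaults.conf` and does not address the worker-side behaviour described earlier.

```python
# Hypothetical defaults, taken from the values proposed in this issue.
GPU_DEFAULTS = {
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "0.0625",
}


def configured_keys(args):
    """Collect config keys already passed as `--conf key=value`."""
    keys = set()
    for flag, value in zip(args, args[1:]):
        if flag == "--conf" and "=" in value:
            keys.add(value.split("=", 1)[0])
    return keys


def with_gpu_defaults(args, defaults=GPU_DEFAULTS):
    """Return spark-submit args with any missing GPU defaults prepended."""
    present = configured_keys(args)
    extra = []
    for key, value in defaults.items():
        if key not in present:
            extra += ["--conf", f"{key}={value}"]
    # Prepend so the options stay ahead of the application file.
    return extra + list(args)
```

A launcher would then run `spark-submit` with `with_gpu_defaults(sys.argv[1:])` instead of the raw arguments, so any value the user passes explicitly still wins.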

revans2 self-assigned this Feb 14, 2025