[AUDIT 4.0] Evaluate new tunables for Python UDFs #12085

Open
mythrocks opened this issue Feb 7, 2025 · 0 comments
Labels
audit_4.0.0 Audit related tasks for 4.0.0 performance A performance related task/issue

Comments

@mythrocks
Collaborator

We should evaluate whether the batch-size and buffer-size tunables introduced in SPARK-50752 have any bearing on spark-rapids' support for Python UDFs.

The change does not appear to affect the code here, but these could turn out to be knobs we can use for tuning on our end. This task is being raised out of an abundance of caution.

The configs in question are:

  1. spark.sql.execution.python.udf.maxRecordsPerBatch
  2. spark.sql.execution.python.udf.buffer.size
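
For anyone picking up the evaluation, here is a minimal sketch of how the two configs could be passed to a test job. The values are placeholders chosen for illustration, not recommendations or known defaults, and the helper name `to_spark_submit_args` is hypothetical. Only the config keys come from this issue.

```python
# Placeholder values for illustration only; they are NOT recommendations
# and are not claimed to be the Spark defaults.
PYTHON_UDF_TUNABLES = {
    "spark.sql.execution.python.udf.maxRecordsPerBatch": "100",
    "spark.sql.execution.python.udf.buffer.size": "65536",
}


def to_spark_submit_args(confs):
    """Turn a {key: value} mapping into a flat list of --conf arguments.

    Hypothetical helper: keys are sorted so the output is deterministic.
    """
    args = []
    for key, value in sorted(confs.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Prints the arguments to append to a spark-submit command line.
    print(" ".join(to_spark_submit_args(PYTHON_UDF_TUNABLES)))
```

The same key/value pairs could also be passed through `SparkSession.builder.config(key, value)` when the session is created. Whether either setting changes behavior once the plugin has replaced the Python UDF exec is exactly what this task needs to determine.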

mythrocks added the "? - Needs Triage" and "audit_4.0.0" labels on Feb 7, 2025
mattahrens added the "performance" label and removed the "? - Needs Triage" label on Feb 11, 2025