Add DataFrame::map utility .map function for DataFrame for modifying internal LogicalPlan #14317
Comments
This seems like a good idea to me. As I understand it, it would allow things like:

    let df = ctx.sql("SELECT * from foo").await?;
    let df = df.map(my_awesome_rewrite)?;
    ...
    // Takes ownership of the plan and returns the rewritten plan
    fn my_awesome_rewrite(plan: LogicalPlan) -> Result<LogicalPlan> {
        ...
        Ok(new_plan)
    }

Any thoughts @timsaucer or @Omega359?
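To make the proposal concrete, here is a minimal, self-contained sketch of what such a `map` could look like. Everything in it is a stand-in: `LogicalPlan`, `SessionState`, `DataFrame` and the `Result` alias are toy types defined only for this example, not the real DataFusion definitions, and `map` is the hypothetical method under discussion. The sketch assumes `map` would split the DataFrame into its session state and plan, apply the rewrite, and rebuild the DataFrame with the same state.

```rust
// Stand-in types for illustration only; these are NOT the DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    TableScan { table: String },
    Limit { fetch: usize, input: Box<LogicalPlan> },
}

#[derive(Debug, Clone, PartialEq)]
struct SessionState {
    session_id: String,
}

#[derive(Debug, Clone, PartialEq)]
struct DataFrame {
    state: SessionState,
    plan: LogicalPlan,
}

// Simplified error type so the example needs no external crates.
type Result<T> = std::result::Result<T, String>;

impl DataFrame {
    fn new(state: SessionState, plan: LogicalPlan) -> Self {
        DataFrame { state, plan }
    }

    fn into_parts(self) -> (SessionState, LogicalPlan) {
        (self.state, self.plan)
    }

    /// Hypothetical `map`: rewrite the inner plan and keep the session state.
    fn map<F>(self, f: F) -> Result<DataFrame>
    where
        F: FnOnce(LogicalPlan) -> Result<LogicalPlan>,
    {
        let (state, plan) = self.into_parts();
        // Any error from the rewrite is propagated to the caller.
        Ok(DataFrame::new(state, f(plan)?))
    }
}

/// Example rewrite: wrap whatever plan it receives in a `Limit` node.
fn my_awesome_rewrite(plan: LogicalPlan) -> Result<LogicalPlan> {
    Ok(LogicalPlan::Limit { fetch: 10, input: Box::new(plan) })
}

/// Builds a toy DataFrame over table `foo` and applies the rewrite via `map`.
fn demo() -> Result<DataFrame> {
    let state = SessionState { session_id: "s1".to_string() };
    let scan = LogicalPlan::TableScan { table: "foo".to_string() };
    DataFrame::new(state, scan).map(my_awesome_rewrite)
}
```

Taking `self` by value and an `FnOnce` closure means the rewrite owns the plan and does not need to clone it. Whether the real method should take `&self` instead is a design question this sketch does not settle.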
We have something similar in Python, where you can currently do something like this (this is from our unit tests):

    from typing import Any

    from datafusion import DataFrame, literal

    # Each function takes a DataFrame and returns a new DataFrame
    def add_string_col(df_internal) -> DataFrame:
        return df_internal.with_column("string_col", literal("string data"))

    def add_with_parameter(df_internal, value: Any) -> DataFrame:
        return df_internal.with_column("new_col", literal(value))

    # Extra positional arguments to transform are passed on to the function
    df = df.transform(add_string_col).transform(add_with_parameter, 3)

If you had a DataFrame …

But what I'm doing here doesn't exactly match what the issue requests, which is to work on the LogicalPlan. So we could do something similar, but @phisn would that meet your needs?
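For comparison on the Rust side, the `transform` pattern above can be sketched as a method that takes a DataFrame-to-DataFrame function. This is only an illustration under loud assumptions: the `DataFrame` below is a toy stand-in that records column names and literal values, its `with_column` is simplified (it does not take an expression or return a `Result`), and `transform` is a hypothetical method rather than an existing DataFusion API. Extra parameters are supplied by closure capture instead of additional arguments.

```rust
// Toy stand-in for illustration only; NOT the DataFusion `DataFrame`.
#[derive(Debug, Clone, PartialEq)]
struct DataFrame {
    // (column name, literal value) pairs, in the order they were added
    columns: Vec<(String, String)>,
}

impl DataFrame {
    /// Simplified `with_column`: appends a literal-valued column.
    fn with_column(mut self, name: &str, value: &str) -> DataFrame {
        self.columns.push((name.to_string(), value.to_string()));
        self
    }

    /// Hypothetical `transform`: apply a DataFrame -> DataFrame function,
    /// which keeps method chains readable.
    fn transform<F>(self, f: F) -> DataFrame
    where
        F: FnOnce(DataFrame) -> DataFrame,
    {
        f(self)
    }
}

fn add_string_col(df: DataFrame) -> DataFrame {
    df.with_column("string_col", "string data")
}

fn add_with_parameter(df: DataFrame, value: i64) -> DataFrame {
    df.with_column("new_col", &value.to_string())
}

/// Chains both helpers; the closure captures the extra parameter `3`.
fn demo() -> DataFrame {
    DataFrame { columns: vec![] }
        .transform(add_string_col)
        .transform(|df| add_with_parameter(df, 3))
}
```

The difference from the request in this issue is the level of abstraction: `transform` hands the callback a whole DataFrame, while the requested `.map` would hand it the inner LogicalPlan.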
I personally haven't had the need to go into a LogicalPlan from a DataFrame and back again, but I could see it being useful.
@timsaucer My specific use case is what @alamb described. The problem with the transform approach is that I am forced to … The problem arises when applying extensions and using functions that are not always called using a …
Is your feature request related to a problem or challenge?
Currently, when applying custom `LogicalPlan`s (extensions or transformation functions) to `DataFrame`s, I need to convert the `DataFrame` to a `LogicalPlan` and then back to a `DataFrame`, which makes for cumbersome code.

Describe the solution you'd like
It would be very helpful to have a `.map` function which transforms the `LogicalPlan` inside a `DataFrame` without reconstructing it.

Describe alternatives you've considered
No response

Additional context
No response