tutorial of multidplyr #154

Open
wbvguo opened this issue Jan 12, 2024 · 0 comments

wbvguo commented Jan 12, 2024

Dear multidplyr developer,

Thank you for maintaining this package. Is there a more detailed tutorial for it anywhere, beyond the documentation page https://multidplyr.tidyverse.org/articles/multidplyr.html?

For example, it took me a while to figure out the correct usage of mutate after partitioning the data:

# create data
set.seed(123)  # For reproducibility

num_groups = 5000
num_grp_obs= 10

df <- data.frame(
  id = seq_len(num_groups * num_grp_obs),  # one id per row; 1:num_groups*num_grp_obs parses as (1:num_groups)*num_grp_obs
  group = rep(seq(num_groups), each = num_grp_obs),
  x = rnorm(num_groups*num_grp_obs),
  y = rnorm(num_groups*num_grp_obs)
)

df$x[c(5, 15)] <- NA # Introduce some NA values


# parallel setting
library(dplyr)       # %>%, group_by, mutate, collect
library(tidyr)       # nest
library(multidplyr)
cluster <- new_cluster(4)
cluster_library(cluster, c("dplyr"))


# partition
x_part = df %>% group_by(group) %>% nest() %>% partition(cluster) 

This does not work:

x = x_part %>% mutate(fit = lm(y~x, data = .)) %>% collect()

Error in cluster_call():
! Remote computation failed in worker 1
Caused by error:
ℹ In argument: fit = lm(y ~ x, data = .).
ℹ In group 1: group = 1.
Caused by error:
! Native call to processx_connection_write_bytes failed
Caused by error:
! Invalid connection object @processx-connection.c:960 (processx_c_connection_write_bytes)
Run rlang::last_trace() to see where the error occurred.

This works:

x = x_part %>% mutate(fit = purrr::map(data, ~lm(y~x, data = .))) %>% collect()
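As far as I can tell, the difference is that after nest() the x and y columns live inside the list-column data, one tibble per group, so the model has to be fitted to each element of that column. In the failing call, the `.` passed to lm() is the left-hand side of the pipe, which is the whole partitioned object rather than one group's data. In case it helps others, here is a small self-contained sketch of the same pattern. The toy data sizes, the 2 workers, and the slope column are my own choices for illustration, and it assumes purrr is installed where the workers run.

```r
library(dplyr)
library(tidyr)
library(multidplyr)

set.seed(123)

# Toy data: 8 groups of 10 observations each (sizes chosen for illustration)
toy <- data.frame(
  group = rep(1:8, each = 10),
  x = rnorm(80),
  y = rnorm(80)
)

# Two workers, each with dplyr attached
cluster <- new_cluster(2)
cluster_library(cluster, "dplyr")

fits <- toy %>%
  group_by(group) %>%
  nest() %>%               # one row per group; x and y move into list-column `data`
  partition(cluster) %>%   # spread the groups across the workers
  mutate(
    # fit one model per nested tibble and keep only the slope on x
    slope = purrr::map_dbl(data, ~ coef(lm(y ~ x, data = .x))[["x"]])
  ) %>%
  collect() %>%            # bring the results back to the main session
  select(group, slope) %>%
  arrange(group)
```

Returning a plain number per group with map_dbl() instead of the full lm object keeps the collected result small. If the fitted models themselves are needed, purrr::map() as in the call above keeps them in a list-column.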

Thanks!
