For example, it took me a while to figure out the correct usage of mutate after partitioning the data:
# create data
set.seed(123)  # For reproducibility
num_groups = 5000
num_grp_obs = 10
df <- data.frame(
  # parentheses matter: `:` binds tighter than `*`, so this gives one unique id per row
  id = 1:(num_groups * num_grp_obs),
  group = rep(seq(num_groups), each = num_grp_obs),
  x = rnorm(num_groups * num_grp_obs),
  y = rnorm(num_groups * num_grp_obs)
)
df$x[c(5, 15)] <- NA  # Introduce some NA values

# parallel setting
library(dplyr)   # for %>%, group_by(), mutate()
library(tidyr)   # for nest()
library(multidplyr)
cluster <- new_cluster(4)
cluster_library(cluster, c("dplyr"))

# partition
x_part = df %>% group_by(group) %>% nest() %>% partition(cluster)
This will not work:
x = x_part %>% mutate(fit = lm(y~x, data = .)) %>% collect()
Error in cluster_call():
! Remote computation failed in worker 1
Caused by error:
ℹ In argument: fit = lm(y ~ x, data = .).
ℹ In group 1: group = 1.
Caused by error:
! Native call to processx_connection_write_bytes failed
Caused by error:
! Invalid connection object @processx-connection.c:960 (processx_c_connection_write_bytes)
Run rlang::last_trace() to see where the error occurred.
This will work:
x = x_part %>% mutate(fit = purrr::map(data, ~lm(y~x, data = .))) %>% collect()
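The difference between the two calls can be seen locally, without a cluster. The following is only a minimal sketch, assuming dplyr, tidyr and purrr are installed; the data frame `small` and the result `fits` are hypothetical names used for illustration. After nest(), `data` is a list-column, so the model has to be fitted once per element. Inside the purrr lambda, `.x` is one group's nested data frame, whereas a bare `.` in the piped mutate() call stands for the whole object on the left-hand side of the pipe.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(123)
# Small illustrative data set: 3 groups with 10 observations each
small <- data.frame(
  group = rep(1:3, each = 10),
  x = rnorm(30),
  y = rnorm(30)
)

# One row per group; `data` is a list-column of per-group data frames
nested <- small %>% group_by(group) %>% nest()

# Fit one model per group: map() iterates over the list-column,
# and `.x` is the data frame of the current group
fits <- nested %>%
  mutate(fit = map(data, ~ lm(y ~ x, data = .x)))

# Each element of `fit` is an ordinary lm object
coef(fits$fit[[1]])
```

The same map() call is what goes inside mutate() on the partitioned data frame; collect() then brings the fitted models back to the main session.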
Thanks!
Dear multidplyr developer,
Thank you for maintaining this package. I was wondering where we could find a more detailed tutorial for this package, beyond the documentation page https://multidplyr.tidyverse.org/articles/multidplyr.html?