-
-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring of analyze_vars
with specifics for .stat_names
#1361
base: main
Are you sure you want to change the base?
Conversation
all obs | ||
————————————————————— | ||
n 4 | ||
Mean (SD) 2.5 (1.3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be NAs
n 5 | ||
a 2 (40%) | ||
b 1 (20%) | ||
NA 2 (40%) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lost NA
8 mean_ci (4.57, 5.69) 0 Mean 95% CI | ||
9 mean_sei (4.89, 5.38) 0 Mean -/+ 1xSE | ||
10 mean_sdi (4.35, 5.91) 0 Mean -/+ 1xSD | ||
2 sum 48.7 0 Sum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this calc changed?
#' | ||
#' @return A `list` with modified elements `x`, `.formats`, `.labels`, and `.indent_mods`. | ||
#' @return A `list` with modified elements `stat_out`, `.formats`, `.labels`, `.levels`, and `.indent_mods`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to fix: ungroup_stats should be removed in favor of get_*_from_stats
empty_pval <- "pval" %in% names(stat_out) && length(stat_out[["pval"]]) == 0 | ||
empty_pval_counts <- "pval_counts" %in% names(stat_out) && length(stat_out[["pval_counts"]]) == 0 | ||
stat_out <- unlist(stat_out, recursive = FALSE) | ||
|
||
# If p-value is empty it is removed by unlist and needs to be re-added | ||
if (empty_pval) x[["pval"]] <- character() | ||
if (empty_pval_counts) x[["pval_counts"]] <- character() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are if pval is NULL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pval handling to be refactored
n 5 | ||
FALSE 1 (20%) | ||
TRUE 2 (40%) | ||
NA 2 (40%) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lost NA
|
||
if (na.rm) { | ||
if (na_rm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
na_rm can be reverted into na.rm if you think it is just creating issues downstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have always thought it would be nicer for it to be na_rm
but na.rm
was the standard so we never changed. I'm not sure if this will cause any issues but if not it would be nice to change, or at least start a deprecation cycle.
# Compare with reference group | ||
if (isTRUE(compare_with_ref_group)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compare functions have been deprecated and merged in here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a clearer error message/check for the presence of a reference group?
) | ||
|
||
y$n_blq <- sum(grepl("BLQ|LTR|<[1-9]|<PCLLOQ", x)) | ||
y$n_blq <- list("n_blq" = c("n_blq" = sum(grepl("BLQ|LTR|<[1-9]|<PCLLOQ", x)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
useful to get_stat_names
dots_extra_args <- list(...) | ||
|
||
# Check if there are user-defined functions | ||
default_and_custom_stats_list <- .split_std_from_custom_stats(.stats) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is possible to add custom stats
.N_col, # nolint | ||
.N_row, # nolint | ||
.var = NULL, | ||
.df_row = NULL, | ||
.ref_group = NULL, | ||
.in_ref_col = FALSE, | ||
compare = FALSE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all these vars are broadcasted from analyze through ...
# Auto format handling | ||
.formats <- apply_auto_formatting(.formats, x_stats, .df_row, .var) | ||
# Get and check statistical names from defaults | ||
.stat_names <- get_stat_names(x_stats, .stat_names_in) # note is x_stats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
core stat_names modification
@@ -140,7 +141,7 @@ afun_riskdiff <- function(df, | |||
stat <- ifelse("count_fraction" %in% names(s_x), "count_fraction", "unique") | |||
if ("flag_variables" %in% names(s_args)) { | |||
var_nms <- s_args$flag_variables | |||
} else if (!is.null(names(s_x[[stat]]))) { | |||
} else if (is.list(s_x[[stat]]) && !is.null(names(s_x[[stat]]))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now statistics are character vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*named numeric vectors
@@ -351,6 +387,72 @@ get_indents_from_stats <- function(stats, indents_in = NULL, row_nms = NULL) { | |||
out | |||
} | |||
|
|||
# Function to loop over each stat and levels to set correct values | |||
.adjust_stats_desc_by_in_def <- function(levels_per_stats, user_in, tern_defaults) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only to be extra sure that you can change the labels in anyway you want (to do add this for formats and indentation)
@@ -1361,7 +1603,7 @@ | |||
---------------------------- | |||
row_name formatted_cell indent_mod row_label | |||
1 pval <0.0001 3 pvalue | |||
2 median_ci -0.41 - 1.10 3 Median 95% CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!! Just a few comments from me regarding some improvements that could be made to the documentation, but the code all seems good!
# Compare with reference group | ||
if (isTRUE(compare_with_ref_group)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a clearer error message/check for the presence of a reference group?
|
||
if (na.rm) { | ||
if (na_rm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have always thought it would be nicer for it to be na_rm
but na.rm
was the standard so we never changed. I'm not sure if this will cause any issues but if not it would be nice to change, or at least start a deprecation cycle.
.var, | ||
...) { | ||
checkmate::assert_flag(na.rm) | ||
s_summary <- function(x, denom, control, ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...
is still described as "additional parameters passed to s_summary
" which I think should be updated with more detail since a lot of the additional parameters were collapsed into ...
. Maybe move the s_summary
methods to their own file since there are a lot of them and their param descriptions no longer align very well with analyze_vars
and the help file is getting pretty long? What do you think?
Also, since the description for na.rm
was in the argument_convention.R
file you should also add one for na_rm
there.
.N_col, # nolint | ||
...) { | ||
if (na.rm) x <- x[!is.na(x)] | ||
s_summary.logical <- function(x, denom = c("n", "N_col", "N_row"), ...) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why denom
wasn't put into the ...
here but was in the other methods?
Pull Request
Fixes #1352