Hi axelalmet ~
I noticed that your analysis code only uses two values for n_gems, 10 and 20. Could you please explain what criteria you used to determine the appropriate value for the n_gems parameter? Thanks! @axelalmet
This is a good question. For the applications considered in the paper, setting n_gems to 10 or 20 worked well at capturing meaningful differences with respect to 1) cell state heterogeneity, 2) spatially meaningful modules (that lined up with cell region annotation), or 3) different biological conditions, e.g., healthy vs moderate vs severe COVID-19. I evaluated this by looking at the mean cellwise membership for each module with respect to meaningful cell labels such as condition or cell type annotation. When I originally analysed the datasets, I tried 5, 10, 15, etc. GEMs and found that 10 or 20 often worked best.
But in general, there's no reason you have to pick 10 or 20 GEMs. I think picking the right number of GEMs is an incredibly non-trivial exercise, and, to the best of my knowledge, there's no single good method for choosing the number of factors for a matrix factorisation-based method. I know cNMF uses the silhouette score, but the literature has shown that this can have its drawbacks.
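The check described above (mean cellwise membership per module, grouped by a cell label, repeated over several GEM counts) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the helper names `nmf_memberships` and `mean_membership_by_label` are hypothetical, and a plain multiplicative-update NMF stands in for whatever factorisation you actually use to construct GEMs.

```python
import numpy as np

def nmf_memberships(X, n_gems, n_iter=200, seed=0):
    """Stand-in factorisation (basic multiplicative-update NMF).

    Assumption: this is NOT the method from the paper, just a placeholder
    that returns a non-negative cells x n_gems membership matrix.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    W = rng.random((n_cells, n_gems)) + 1e-3
    H = rng.random((n_gems, n_genes)) + 1e-3
    eps = 1e-10
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W

def mean_membership_by_label(W, labels):
    """Average each module's membership over the cells sharing a label."""
    labels = np.asarray(labels)
    # Normalise per cell so memberships are comparable across cells.
    W_norm = W / (W.sum(axis=1, keepdims=True) + 1e-10)
    return {
        lab: W_norm[labels == lab].mean(axis=0)
        for lab in sorted(set(labels.tolist()))
    }

# Synthetic toy data: 60 cells x 40 genes, two hypothetical conditions.
rng = np.random.default_rng(1)
X = rng.poisson(5.0, size=(60, 40)).astype(float) + 1.0
labels = np.array(["healthy"] * 30 + ["severe"] * 30)
X[labels == "severe", :10] += 10.0  # genes 0-9 are elevated in "severe" cells

# Sweep a few candidate GEM counts and collect the per-label mean memberships.
results = {}
for n_gems in (5, 10, 15, 20):
    W = nmf_memberships(X, n_gems)
    results[n_gems] = mean_membership_by_label(W, labels)

# Inspect, e.g., results[10]["severe"] vs results[10]["healthy"] to see
# which modules separate the two conditions at that GEM count.
```

Comparing the per-label vectors across the candidate values gives a qualitative sense of whether the modules still line up with the labels you care about; it does not by itself select a single best number of GEMs.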
Hi axelalmet,
Thanks for your insightful answer! I would also like to know which aspects of the results are affected by changing the n_gems parameter.
Hoping for your earliest reply!