Running CNAqc analyze_peaks by chromosome does not populate the object for fragmented samples #40

kmavrommatis · 2025-01-14T13:38:20Z

Hi, thank you for your help so far with #39.

Following up on the last point about checking by chromosome: for a different sample I have two versions of CNV predictions from the same CNV caller, run with different arguments. One is very fragmented, the other much less so. I would like to use CNAqc to decide which version of the segmentation is (more) correct. In this case I know from orthogonal data that the overfragmented one is not correct, but I am trying to devise a methodology for similar situations where I have no prior knowledge of the ploidy/segmentation of the sample and have to rely on what the CNV method produces.

For the fragmented version of the sample, all karyotypes now PASS QC after analyze_peaks. However, when I run it by chromosome, only a couple of chromosomes have results. In the resulting object, even for chromosomes that have no peak analysis, x$cna has QC_PASS = TRUE for all segments with the tested karyotypes.
The less fragmented version produces results for all chromosomes.

Does this mean that, for the fragmented sample, the overall QC estimate is determined by a few peaks on only a couple of chromosomes?
Should this be taken as an indication of a weak (?) estimate, and should it be weighted by chromosome size or some other metric?

Thanks again for your help

cnvs-notfragmented.rds.gz
mutations.rds.gz
cnvs-fragmented.rds.gz

library(dplyr)   # provides the %>% pipe used below
library(CNAqc)

mut = readRDS("~/Downloads/mutations.rds")
cnvs.f= readRDS("~/Downloads/cnas-fragmented.rds")
cnvs.u= readRDS("~/Downloads/cnas-notfragmented.rds")

x.f=CNAqc::init(
  mutations=mut,  # mutations predicted using GATK Mutect2 on tumor/matched normal 
  cna=cnvs.f,
  purity=0.33,  
  sample='test',
  ref='hg38'
)

x.u=CNAqc::init(
  mutations=mut,  # mutations predicted using GATK Mutect2 on tumor/matched normal 
  cna=cnvs.u,
  purity=0.33,  
  sample='test',
  ref='hg38'
)
x.u = CNAqc::analyze_peaks(x.u, n_bootstrap = 10)
plot_peaks_analysis(x.u)
x.f = CNAqc::analyze_peaks(x.f, n_bootstrap = 10)
plot_peaks_analysis(x.f)


x_chr.f = x.f %>% 
  split_by_chromosome() %>% 
  lapply(function(w) {analyze_peaks(w)})

x_chr.u = x.u %>% 
  split_by_chromosome() %>% 
  lapply(function(w) {analyze_peaks(w)})


x_chr.u$chr2 %>% plot_peaks_analysis() # works

x_chr.f$chr2 %>% plot_peaks_analysis() # does not produce anything
Warning message:
In plot_peaks_analysis(.) :
  Input does not have peaks, see ?peaks_analysis to run peaks analysis.
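
To put a number on "only a couple of chromosomes have results", one option is to tabulate which per-chromosome objects carry a peak analysis and what share of the mutations they account for. The following is only a minimal sketch on toy data: it assumes that an object without results has a NULL `peaks_analysis` slot, and the `n_mutations` field is a hypothetical stand-in for the mutation count of each per-chromosome object.

```r
# Toy stand-in for the list produced by split_by_chromosome() + analyze_peaks().
# Assumption: objects without results have a NULL `peaks_analysis` slot.
# `n_mutations` is a hypothetical field used only for illustration.
x_chr <- list(
  chr1 = list(peaks_analysis = list(QC = "PASS"), n_mutations = 120),
  chr2 = list(peaks_analysis = NULL,              n_mutations = 35),
  chr3 = list(peaks_analysis = list(QC = "PASS"), n_mutations = 90),
  chr4 = list(peaks_analysis = NULL,              n_mutations = 15)
)

# TRUE for chromosomes where a peak analysis is present
has_peaks <- vapply(x_chr, function(w) !is.null(w$peaks_analysis), logical(1))

# Mutation count per chromosome
n_mut <- vapply(x_chr, function(w) w$n_mutations, numeric(1))

# Fraction of chromosomes with results, unweighted
frac_chr <- mean(has_peaks)

# Fraction of mutations that sit on chromosomes with results
frac_mut <- sum(n_mut[has_peaks]) / sum(n_mut)

print(names(has_peaks)[has_peaks])
print(c(frac_chr = frac_chr, frac_mut = frac_mut))
```

A low `frac_mut` would suggest that the sample-level PASS rests on a small part of the data. Chromosome length could be used as the weight instead of the mutation count.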


caravagn · 2025-01-23T18:07:30Z

Hi, I see your points. Based on my experience, fragmented chromosomes do exist, but the patterns of over-fragmentation are usually localised. For instance, chromothripsis is often confined to a few megabases and involves at most a handful of chromosomes.

Your fragmented solution has

x.f=CNAqc::init(
  mutations=mut,  # mutations predicted using GATK Mutect2 on tumor/matched normal 
  cna=cnvs.f,
  purity=0.33,  
  sample='test',
  ref='hg38'
) %>% detect_arm_overfragmentation()

x.f %>% plot_segments()
x.f %>% plot_arm_fragmentation()

21 overfragmented chromosomes or so, which to me is very suspicious. On the other hand, your fragmented solution suggests a low-purity sample to work with, so the situation is complicated.

QC per chromosome is difficult here for one reason: right now the algorithm pools mutations only from chromosome segments with at least N mutations. If I try to subset your segments, they are so tiny that nothing remains. That is mainly a filtering issue.
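
The effect of such a filter can be illustrated without CNAqc at all. The following is a rough sketch on toy data, not the package's actual implementation: `count_mutations` and `min_mut` are hypothetical names, and the segments use the `chr`, `from`, `to` columns of the CNA input.

```r
# Toy segments and mutations (hypothetical coordinates)
segments <- data.frame(
  chr  = c("chr1", "chr1", "chr1", "chr2"),
  from = c(1, 1001, 2001, 1),
  to   = c(1000, 2000, 3000, 5000),
  stringsAsFactors = FALSE
)
mutations <- data.frame(
  chr = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr2"),
  pos = c(10, 500, 1500, 2500, 100, 2000, 4000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of mutations falling inside each segment
count_mutations <- function(segments, mutations) {
  vapply(seq_len(nrow(segments)), function(i) {
    sum(mutations$chr == segments$chr[i] &
          mutations$pos >= segments$from[i] &
          mutations$pos <= segments$to[i])
  }, integer(1))
}

segments$n_mut <- count_mutations(segments, mutations)

# Keep only segments with at least `min_mut` mutations (threshold is illustrative)
min_mut <- 2
kept <- segments[segments$n_mut >= min_mut, ]

print(segments)
print(kept)
```

With many tiny segments, most rows fall below the threshold, so a chromosome can end up with no segments left for the peak analysis.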

x.f=CNAqc::init(
  mutations=mut,  # mutations predicted using GATK Mutect2 on tumor/matched normal 
  cna=cnvs.f,
  purity=0.33,  
  sample='test',
  ref='hg38'
) %>% detect_arm_overfragmentation() %>% 
  subset_by_segment_minmutations(50)

x.f %>% plot_segments()

I think we have to find specific parameters or a workaround for your samples; it should not be impossible. You are working in a difficult setup, though.
