There are too few interaction signals to manually adjust them in Juicebox #3

Open
2benaszq opened this issue Dec 26, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@2benaszq

Hello,

First of all, thank you very much for developing such an excellent phasing and scaffolding tool. However, I have encountered some issues while using it.

I first assembled hap1.fa and hap2.fa using hifiasm, each approximately 1 GB in size. Then, I merged them into a single file named output.fa. Subsequently, I used the following commands:

cphasing pipeline -f output.fa -pcd porec.fq.gz -t 100 -n 18:2 -hcr
ln -sf cphasing_output/porec.pairs.gz ./
ln -sf cphasing_output/4.scaffolding/groups.agp ./
cphasing pairs2mnd porec.pairs.gz -o porec.mnd.txt
cphasing utils agp2assembly groups.agp > groups.assembly
docker run -i --rm -w ${PWD} -u $(id -u):$(id -g) -v /calculate:/calculate -v /data:/data hic:v2.1 /software/3d-dna-201008/visualize/run-assembly-visualizer.sh groups.assembly porec.mnd.txt

I then loaded the resulting groups.assembly and groups.hic files into Juicebox for manual adjustment, but I found that the interaction signals were too sparse to make adjustments effectively.
Here is a screenshot of the Juicebox view:
image

Additionally, the scaffolding plot generated directly by cphasing (without manual adjustment) looks as follows:
image

The results seem excellent, and I only need to adjust a few contigs to achieve a nearly perfect phased genome. However, due to the low signal visibility in Juicebox, I am unable to complete this task.

As mentioned earlier, the merged genome is 2 GB in size, and the porec data is only about 25 GB. I wonder whether the low data volume in phasing mode is causing the weak signal display in Juicebox, or whether there is any way to enhance the signal visibility in Juicebox to help me adjust misassembled contigs. Alternatively, how much porec data would this genome require to display strong interaction signals in Juicebox?

Thanks!

@wangyibin
Owner

Sorry for the late reply, and thank you for your feedback on this issue.

The heatmap plotted by cphasing suggests that 25 Gb of Pore-C data is enough to adjust the genome in Juicebox.

Your genome’s high homozygosity may cause most Pore-C fragments to have a mapping quality below 1.
By default, cphasing pairs2mnd removes interactions with a mapping quality < 1; you can set -q 0 to load all Pore-C interactions into Juicebox.

cphasing pairs2mnd porec.pairs.gz -o porec.mnd.txt -q 0
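
To check whether low mapping quality really explains the sparse map before rerunning with -q 0, the share of low-quality pairs can be counted directly. The following is only a sketch: it assumes the pairs file follows the 4DN .pairs layout with a `#columns:` header and that the quality column is named `mapq`. Neither is confirmed in this thread, and the function names are hypothetical, so check the header of your own file first.

```python
import gzip


def low_mapq_fraction(lines, threshold=1, mapq_column="mapq"):
    """Return the fraction of pairs whose mapping quality is below `threshold`.

    `lines` is any iterable of text lines in .pairs layout. The column
    name "mapq" is an assumption; it is looked up in the `#columns:` header.
    """
    column_index = None
    total = 0
    low = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header line: remember where the quality column sits.
            if line.startswith("#columns:"):
                names = line[len("#columns:"):].split()
                if mapq_column in names:
                    column_index = names.index(mapq_column)
            continue
        if column_index is None:
            raise ValueError(f"no '{mapq_column}' column declared in the #columns header")
        fields = line.split("\t")
        total += 1
        if int(fields[column_index]) < threshold:
            low += 1
    return low / total if total else 0.0


def low_mapq_fraction_from_file(path, threshold=1, mapq_column="mapq"):
    """Same calculation, reading a gzip-compressed pairs file from disk."""
    with gzip.open(path, "rt") as handle:
        return low_mapq_fraction(handle, threshold, mapq_column)
```

Calling `low_mapq_fraction_from_file("porec.pairs.gz")` would then give the share of pairs that the default filter drops. A value close to 1 would be consistent with the explanation above.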

Best regards,
Yibin

@2benaszq
Author

Dear Dr. Yibin,

Thank you for your response.
With your help, I was indeed able to achieve the results I wanted.
image

However, I have one more question that I didn’t mention last time.
I noticed that CPhasing seems to call the program Partig, which appears to limit contig length to 2**27 bp (approximately 134 Mb). My genome, however, has a contig as long as 136 Mb. To work around this, I manually split it into 100 Mb and 36 Mb segments, completed the scaffolding, and then merged them back together. This approach works, but it is a bit inconvenient.

Is there any way to avoid this issue, or will this minor limitation be fixed in future updates?
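
Until such a fix exists, the manual split described above could be scripted. The sketch below is not part of CPhasing: the function names and the `_partN` naming scheme are hypothetical, and the 2**27 bp limit is taken from this thread. It is not known here whether the limit is inclusive, so pass a smaller `max_length` for a safety margin. Oversized contigs are cut into near-equal pieces rather than the 100 Mb + 36 Mb split used above.

```python
MAX_CONTIG_LENGTH = 2 ** 27  # 134,217,728 bp, the limit reported in this thread


def chunk_bounds(length, max_length=MAX_CONTIG_LENGTH):
    """Return half-open (start, end) intervals covering [0, length),
    each at most `max_length` long and of near-equal size."""
    if length <= max_length:
        return [(0, length)]
    n_chunks = -(-length // max_length)  # ceiling division
    base, extra = divmod(length, n_chunks)
    bounds = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional base each.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def read_fasta(lines):
    """Yield (name, sequence) tuples from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def split_records(records, max_length=MAX_CONTIG_LENGTH):
    """Yield (name, sequence) with oversized contigs cut into pieces.

    Pieces are named <contig>_part1, <contig>_part2, ... (a hypothetical
    scheme) so they can be rejoined in order after scaffolding.
    """
    for name, seq in records:
        bounds = chunk_bounds(len(seq), max_length)
        if len(bounds) == 1:
            yield name, seq
            continue
        for index, (start, end) in enumerate(bounds, start=1):
            yield f"{name}_part{index}", seq[start:end]


def write_fasta(records, out_handle, width=80):
    """Write (name, sequence) records as FASTA with fixed-width lines."""
    for name, seq in records:
        out_handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            out_handle.write(seq[i:i + width] + "\n")
```

A typical use would be to open output.fa for reading and a new file for writing, then call `write_fasta(split_records(read_fasta(src)), dst)`. Note that `read_fasta` holds one contig in memory at a time, so a 136 Mb contig needs a few hundred megabytes of RAM.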

Best regards,
Iron Man

@wangyibin wangyibin added the enhancement New feature or request label Jan 2, 2025
@wangyibin
Owner

Thank you for your suggestion.

We will fix this limitation in the future or add a function to split long contigs.

Best regards,
Yibin.

@2benaszq
Author

2benaszq commented Jan 2, 2025

Dear Dr. Yibin,
Thank you for your reply. I look forward to the next CPhasing update with great anticipation!

Best regards,
Iron Man
