Thank you for providing such an excellent module as scirpy. I recently upgraded from version 0.11.2 to 0.20.0 and noticed that scirpy now uses an awkward array to store IR information. This is a great change, and I fully understand the reasoning behind it.
However, when I use scirpy.io.read_10x_vdj to read the filtered_contig_annotations.csv file, I found that the contig_id column is not stored in obsm['airr']. I tried renaming the contig_id column in the file to sequence_id to comply with AIRR standards, but it still wasn't stored correctly.
Upon inspecting the _read_10x_vdj_csv function, I found that only a subset of columns from the filtered_contig_annotations.csv file is written into the awkward matrix. Key columns like contig_id and origin, which are important to me, were discarded.
As a result:
After running ir.pp.index_chains, I cannot map the VJ and VDJ chains back to their specific contigs in the filtered_contig_annotations.csv file.
If the sample names of my GEX and AIRR data differ, I cannot match them due to the absence of the origin column.
How should I address this issue? If I modify the chain_dict.update section in your _read_10x_vdj_csv function and add the sequence_id information, would this satisfy the AIRR format requirements and correctly add the information to adata.obsm['airr'].sequence_id?
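Until the reader keeps these columns, one possible interim workaround is to read them from the CSV separately and key them by cell barcode. The sketch below uses only the standard library. The helper name `contigs_by_barcode` and the tiny in-memory table are made up for illustration. It assumes the file has the usual 10x columns `barcode`, `chain`, and `cdr3_nt`, and that `origin` is present only in some outputs.

```python
import csv
import io
from collections import defaultdict


def contigs_by_barcode(csv_file, extra_columns=("contig_id", "origin")):
    """Group selected columns of a 10x contig table by cell barcode.

    Hypothetical helper, not part of scirpy.
    """
    lookup = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Keep chain and CDR3 nucleotide sequence so that each entry can
        # later be matched against a chain record for the same cell.
        entry = {"chain": row["chain"], "cdr3_nt": row["cdr3_nt"]}
        for col in extra_columns:
            # Copy the extra columns only if the file actually has them.
            if col in row:
                entry[col] = row[col]
        lookup[row["barcode"]].append(entry)
    return dict(lookup)


# Made-up rows standing in for filtered_contig_annotations.csv
example = io.StringIO(
    "barcode,contig_id,chain,cdr3_nt,origin\n"
    "AAAC-1,AAAC-1_contig_1,TRA,TGTGCT,sample_a\n"
    "AAAC-1,AAAC-1_contig_2,TRB,TGTGCC,sample_a\n"
    "AAAG-1,AAAG-1_contig_1,TRB,TGCAGT,sample_a\n"
)
lookup = contigs_by_barcode(example)
```

Each barcode then maps to its contigs, and a chain can be matched to a contig by comparing its locus and junction sequence with `chain` and `cdr3_nt`, assuming those values are carried over unchanged by the reader.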
Thank you for your time and help!
thanks for opening the issue!
I think mapping contig_id to sequence_id in both read_10x_csv and read_10x_json is the correct thing to do. This would allow you to retrieve it from adata.obsm['airr'].sequence_id and also add it to adata.obs using scirpy.get.airr.
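As a rough illustration of that mapping (not scirpy's actual reader code), the sketch below translates one row of the contig CSV into a dict with AIRR-style field names, with `contig_id` going to `sequence_id`. The function name `contig_row_to_chain`, the selection of fields, and the example row are assumptions made for this sketch.

```python
def contig_row_to_chain(row):
    """Translate one 10x contig row (a dict of strings) into an
    AIRR-style chain dict. Hypothetical helper, not scirpy code."""
    return {
        # The 10x contig identifier becomes the AIRR sequence_id.
        "sequence_id": row["contig_id"],
        "locus": row["chain"],
        "junction": row["cdr3_nt"],
        "junction_aa": row["cdr3"],
        "v_call": row["v_gene"],
        "j_call": row["j_gene"],
        # Assumes the flag is spelled "True"/"true"; anything else is
        # treated as not productive in this sketch.
        "productive": row["productive"].lower() == "true",
    }


# Made-up example row with the usual 10x column names
row = {
    "contig_id": "AAAC-1_contig_1",
    "chain": "TRA",
    "cdr3_nt": "TGTGCT",
    "cdr3": "CA",
    "v_gene": "TRAV1-2",
    "j_gene": "TRAJ33",
    "productive": "True",
}
chain = contig_row_to_chain(row)
```

With `sequence_id` set this way, each chain record keeps a direct link back to its row in filtered_contig_annotations.csv.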