Converting GP identifiers from UniProt to mod ID's #4
Comments
Just to clarify: no conversion is needed for human at the moment, as GO currently treats UniProt as canonical for human. But this may change in the future.
The URIs used by NEO and GO-CAMs for mouse GPs take the form http://identifiers.org/mgi/MGI:1926134, for instance. If we do not follow this syntax, we have no access to the information in NEO that describes this gene, and because the GP is not linked with other models, it is as if we were describing a totally different GP. We need data model consistency, with the same rules applied to all models and annotations. As an example, because of this misalignment between GP URIs (this time between the way we reference GPs in GO-CAMs and the way they are referenced in SynGO), we have no access to the recommended name of the gene (nor other meta information):
@lpalbou In case you're interested, I have the UniProt-to-MOD-ID conversion code here:
@dustine32 thanks, but this is not a viable solution for live queries. Converting 2k+ (and hopefully soon 20k+) IDs on the fly would take too much time for the website. It also wouldn't solve the data model consistency problem, nor would it help when looking for overlaps between annotations and GO-CAMs, or when wanting to merge them into larger GO-CAMs: we should all use the same URIs for the same entities.
@lpalbou Yep, agreed! This is just the pre-data-loading, data-massaging step for generating the models.
@cmungall Just making sure: Is this the right URL to cross-reference prefixes for the SynGO model UniProt IDs? And to confirm your earlier clarification, I'm not converting the human gene IDs to HGNC and am just going with the UniProt IDs provided by SynGO. Only mouse, rat, fly and worm IDs are being converted. Thanks!
For geneontology/go-site#617
This ticket is to update the gene product (GP) identifiers in SynGO models to use the MOD-specific identifier. For example:
http://identifiers.org/uniprot/Q9JIR3 -> http://identifiers.org/rgd/628762
The prefixes will be sourced from here:
https://github.com/geneontology/minerva/blob/master/minerva-core/src/main/resources/go_context.jsonld
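As a rough illustration of how a prefix map could drive the URI rewrite above, here is a minimal Python sketch. It assumes the context file exposes a flat `@context` object mapping prefix to URI base; the inline `SAMPLE_CONTEXT_JSON` is a hand-written stand-in that only mirrors the URIs quoted in this thread, not the actual contents of go_context.jsonld, and the function names `load_prefixes` and `expand_curie` are hypothetical.

```python
import json

# Hand-written stand-in for a JSON-LD context (assumption: the real file
# has a flat "@context" object of prefix -> URI base). The entries below
# only mirror the URIs quoted in this thread.
SAMPLE_CONTEXT_JSON = """
{"@context": {
  "UniProtKB": "http://identifiers.org/uniprot/",
  "RGD": "http://identifiers.org/rgd/",
  "MGI": "http://identifiers.org/mgi/MGI:"
}}
"""


def load_prefixes(text):
    """Return the prefix -> URI base mapping from a JSON-LD context string."""
    return json.loads(text)["@context"]


def expand_curie(curie, prefixes):
    """Expand a 'PREFIX:local_id' CURIE into a full URI.

    Splits on the first colon only; raises ValueError if the CURIE has no
    local part or the prefix is not in the map.
    """
    prefix, _, local_id = curie.partition(":")
    if not local_id or prefix not in prefixes:
        raise ValueError(f"cannot expand {curie!r}")
    return prefixes[prefix] + local_id


if __name__ == "__main__":
    prefixes = load_prefixes(SAMPLE_CONTEXT_JSON)
    print(expand_curie("UniProtKB:Q9JIR3", prefixes))
    print(expand_curie("RGD:628762", prefixes))
```

Failing loudly on an unknown prefix, rather than passing the CURIE through, keeps unconverted identifiers from silently ending up in a model.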
With consistent identifiers for GPs, we can aggregate gene information across models, maintain label data, and more easily merge models.
To run this conversion, I have a script that grabs MOD IDs (if present) from UniProt's web service and appends this data to the SynGO annotation JSON that gets consumed by David OS' model code. I can add this code to my syngo2lego fork.
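The append step could look roughly like the sketch below. It is not the script from the fork: the record fields `uniprot_id` and `mod_id`, the function names, and the idea of passing in a pre-fetched `uniprot_to_mod` dictionary (instead of calling UniProt's web service inline) are all assumptions made for illustration. Accessions with no MOD ID, which includes human under the current policy, simply keep their UniProt identifier.

```python
def add_mod_ids(annotations, uniprot_to_mod):
    """Return copies of the annotation records with a 'mod_id' field added
    wherever the UniProt accession has a MOD ID in the lookup table.

    'uniprot_id' and 'mod_id' are hypothetical field names.
    """
    updated = []
    for record in annotations:
        new_record = dict(record)  # shallow copy; leave the input untouched
        mod_id = uniprot_to_mod.get(record["uniprot_id"])
        if mod_id is not None:
            new_record["mod_id"] = mod_id
        updated.append(new_record)
    return updated


def gp_curie(record):
    """Pick the identifier to use for the GP: the MOD ID when present,
    otherwise fall back to the UniProt accession."""
    if "mod_id" in record:
        return record["mod_id"]
    return "UniProtKB:" + record["uniprot_id"]


if __name__ == "__main__":
    # Q9JIR3 -> RGD:628762 is the pair from this ticket; the second
    # accession is a made-up placeholder with no MOD mapping.
    lookup = {"Q9JIR3": "RGD:628762"}
    records = [{"uniprot_id": "Q9JIR3"}, {"uniprot_id": "PLACEHOLDER_ACC"}]
    for rec in add_mod_ids(records, lookup):
        print(gp_curie(rec))
```

Keeping the original UniProt accession alongside the MOD ID means the mapping stays traceable in the annotation JSON.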
@ftwkoopmans I know we discussed this in the past. Will using non-UniProt IDs cause any issues for SynGO? Tagging @lpalbou @cmungall and @thomaspd for their opinions too.