Update goa.yaml to remove and add bacterial and viral species #2449

Open
pgaudet opened this issue Mar 4, 2025 · 2 comments

pgaudet commented Mar 4, 2025

We noticed that the list requested in geneontology/neo#116 included organisms without a reference proteome, so we were attempting to load >1.5M entries.

New list: https://docs.google.com/spreadsheets/d/1dQ0gN2HQjq2IFuXQrUdE5uRmMvZaUuylYQ1RvclrCqc/edit?gid=1998866383#gid=1998866383


| Organism | Taxon ID | Proteome ID | Entries |
| --- | --- | --- | --- |
| Citrobacter koseri (Citrobacter diversus) | 545 | UP000270272 | 5,617 |
| Klebsiella pneumoniae subsp. ozaenae (subspecies) | 574 | UP000255382 | 6,726 |
| Shigella flexneri | 623 | UP000001006 | 4,103 |
| Enterococcus gallinarum | 1353 | UP000254807 | 3,635 |
| Bacillus anthracis | 1392 | UP000000594 | 5,493 |
| Streptomyces clavuligerus | 1901 | UP000002357 | 7,290 |
| Vaccinia virus (strain Western Reserve) (VACV) (Vaccinia virus (strain WR)) | 10254 | UP000000344 | 218 |
| Human herpesvirus 1 (strain 17) (HHV-1) (Human herpes simplex virus 1) | 10299 | UP000009294 | 73 |
| Varicella-zoster virus (strain Dumas) (HHV-3) (Human herpesvirus 3) | 10338 | UP000002602 | 69 |
| Human cytomegalovirus (strain AD169) | 10360 | UP000008992 | 148 |
| Foot-and-mouth disease virus serotype O (FMDV) | 12118 | UP000008765 | 5 |
| Hepatitis C virus genotype 1a (isolate H77) (HCV) | 63746 | UP000000518 | 2 |
| Zika virus (ZIKV) | 64320 | UP000054557 | 2 |
| Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) | 71421 | UP000000579 | 1,704 |
| Escherichia coli (strain K12) | 83333 | EcoCyc | |
| Escherichia coli O157:H7 | 83334 | UP000000558 | 5,057 |
| Helicobacter pylori (strain ATCC 700392 / 26695) | 85962 | UP000000429 | 1,554 |
| Staphylococcus aureus (strain NCTC 8325 / PS 47) | 93061 | UP000008816 | 2,889 |
| Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720) | 99287 | UP000001014 | 4,533 |
| Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145) | 100226 | UP000001973 | 8,038 |
| Neisseria meningitidis serogroup B (strain ATCC BAA-335 / MC58) | 122586 | UP000000425 | 2,001 |
| Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e) | 169963 | UP000000817 | 2,844 |
| Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4) | 170187 | UP000000585 | 2,109 |
| Streptococcus pneumoniae (strain ATCC BAA-255 / R6) | 171101 | UP000000586 | 2,031 |
| Campylobacter jejuni subsp. jejuni serotype O:2 (strain ATCC 700819 / NCTC 11168) | 192222 | UP000000799 | 1,623 |
| Pseudomonas aeruginosa (strain ATCC 15692 / DSM 22644 / CIP 104116 / JCM 14847 / LMG 12228 / 1C / PRS 101 / PAO1) | 208964 | UP000002438 | 5,563 |
| Bacillus subtilis (strain 168) | 224308 | UP000001570 | 4,260 |
| Enterococcus faecalis (strain ATCC 700802 / V583) | 226185 | UP000001415 | 3,240 |
| Mycoplasma genitalium (strain ATCC 33530 / DSM 19775 / NCTC 10195 / G37) (Mycoplasmoides genitalium) | 243273 | UP000000807 | 483 |
| Halalkalibacterium halodurans (strain ATCC BAA-125 / DSM 18197 / FERM 7344 / JCM 9153 / C-125) (Bacillus halodurans) | 272558 | UP000001258 | 4,006 |
| Mycoplasma pneumoniae (strain ATCC 29342 / M129 / Subtype 1) (Mycoplasmoides pneumoniae) | 272634 | UP000000808 | 686 |
| Human cytomegalovirus (strain Merlin) (HHV-5) (Human herpesvirus 5) | 295027 | UP000000938 | 168 |
| Streptococcus pyogenes serotype M1 (Strain: ATCC 700294 / SF370 / Serotype M1) | 301447 | UP000000750 | 1,690 |
| Human immunodeficiency virus type 1 group O (isolate ANT70) | 327105 | UP000007689 | 9 |
| HPV16 | 333760 | UP000009251 | 9 |
| Streptococcus pneumoniae serotype 2 (strain D39 / NCTC 7466) | 373153 | UP000001452 | 1,915 |
| Mycobacterium tuberculosis (strain ATCC 25177 / H37Ra) | 419947 | UP000001988 | 3,990 |
| Enterococcus casseliflavus EC20 | 565655 | UP000012675 | 3,112 |
| Acinetobacter baumannii (strain ATCC 19606 / DSM 30007 / JCM 6841 / CCUG 19606 / CIP 70.34 / NBRC 109757 / NCIMB 12457 / NCTC 12156 / 81) | 575584 | UP000005740 | 3,765 |
| Variola virus (isolate Human/India/Ind3/1967) | 587200 | UP000002060 | 199 |
| Measles virus (strain Ichinose-B95a) (MeV) (Subacute sclerose panencephalitis virus) | 645098 | UP000008699 | 8 |
| Hepatitis B virus genotype C subtype ayr (isolate Human/Japan/Okamoto/-) (HBV-C) | 928302 | UP000008591 | 5 |
| Klebsiella pneumoniae subsp. pneumoniae (strain HS11286) | 1125630 | UP000007841 | 5,728 |

Did not include E. coli 83333 because we load that one via EcoCyc.


For this PR: removed the following taxa:

253
287
470
546
548
550
562
571
573
615
1313
1352
1390
10255
10298
10335
10359
11251
35703
36352
37734
90371
128958
286636
367830
416870
529507
941280

And added

| Organism | Taxon ID |
| --- | --- |
| Klebsiella pneumoniae subsp. ozaenae (subspecies) | 574 |
| Human herpesvirus 1 (strain 17) (HHV-1) (Human herpes simplex virus 1) | 10299 |
| Varicella-zoster virus (strain Dumas) (HHV-3) (Human herpesvirus 3) | 10338 |
| Human cytomegalovirus (strain AD169) | 10360 |
| Staphylococcus aureus (strain NCTC 8325 / PS 47) | 93061 |
| Human cytomegalovirus (strain Merlin) (HHV-5) (Human herpesvirus 5) | 295027 |
| Streptococcus pyogenes serotype M1 (Strain: ATCC 700294 / SF370 / Serotype M1) | 301447 |
| Enterococcus casseliflavus EC20 | 565655 |
| Variola virus (isolate Human/India/Ind3/1967) | 587200 |
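
The removed and added lists can be cross-checked against the new list with a few set operations. The sketch below is only an illustration: the taxon IDs are copied from this issue, but `check_taxon_changes` and `apply_changes` are hypothetical helpers, and the code makes no assumption about how `goa.yaml` itself is laid out. E. coli K12 (83333) is left out of `NEW_LIST` because it is loaded via EcoCyc.

```python
# Taxon IDs from the new list above, excluding 83333 (loaded via EcoCyc).
NEW_LIST = {
    545, 574, 623, 1353, 1392, 1901, 10254, 10299, 10338, 10360,
    12118, 63746, 64320, 71421, 83334, 85962, 93061, 99287, 100226,
    122586, 169963, 170187, 171101, 192222, 208964, 224308, 226185,
    243273, 272558, 272634, 295027, 301447, 327105, 333760, 373153,
    419947, 565655, 575584, 587200, 645098, 928302, 1125630,
}

# Taxon IDs listed as removed in this PR.
REMOVED = {
    253, 287, 470, 546, 548, 550, 562, 571, 573, 615, 1313, 1352,
    1390, 10255, 10298, 10335, 10359, 11251, 35703, 36352, 37734,
    90371, 128958, 286636, 367830, 416870, 529507, 941280,
}

# Taxon IDs listed as added in this PR.
ADDED = {574, 10299, 10338, 10360, 93061, 295027, 301447, 565655, 587200}


def check_taxon_changes(new_list, removed, added):
    """Return the inconsistencies found; every set is empty when the lists agree."""
    return {
        # A removed taxon should not still be in the new list.
        "removed_but_in_new_list": removed & new_list,
        # Every added taxon should appear in the new list.
        "added_but_not_in_new_list": added - new_list,
        # No taxon should be both removed and added.
        "both_removed_and_added": removed & added,
    }


def apply_changes(current, removed, added):
    """Return the taxon set after dropping `removed` and including `added`."""
    return (set(current) - removed) | added


if __name__ == "__main__":
    for name, taxa in check_taxon_changes(NEW_LIST, REMOVED, ADDED).items():
        print(name, sorted(taxa))
```

`apply_changes` takes whatever set of taxon IDs is currently configured (not shown in this issue) and returns the updated set, which can then be compared against `NEW_LIST`.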

pgaudet commented Mar 4, 2025

@kltm PR is here: #2448


kltm commented Mar 4, 2025

@pgaudet There were some inconsistencies; could you update the PR to iron those out? See #2448 (review)
