ICBO_2018_2: Adapting Disease Vocabularies for Curation at the Rat Genome Database

Title: ICBO_2018_2: Adapting Disease Vocabularies for Curation at the Rat Genome Database
Publication Type: Conference Paper
Year of Publication: 2018
Authors: Laulederkind, S; Hayman, GT; Wang, S-J; Bolton, E; Smith, JR; Tutaj, M; de Pons, J; Shimoyama, M; Dwinell, M
Conference Name: International Conference on Biomedical Ontology (ICBO 2018)
Date Published: 08/06/2018
Publisher: International Conference on Biological Ontology
Keywords: curation, disease vocabularies, online resource, Rat Genome Database

The Rat Genome Database (RGD) has been annotating genes, QTLs, and strains to disease terms for over 15 years. During that time the controlled vocabulary used for disease curation has changed a few times, because no single vocabulary or ontology was both freely accessible and complete enough to cover all of the disease states described in the biomedical literature. The first disease vocabulary used at RGD was the “C” branch of the National Library of Medicine’s Medical Subject Headings (MeSH). By 2011 RGD had switched disease curation to MEDIC (MErged DIsease voCabulary), a combination of MeSH and OMIM (Online Mendelian Inheritance in Man) constructed by curators at the Comparative Toxicogenomics Database (CTD). MEDIC was an improvement over MeSH because of the added coverage of OMIM terms, but RGD curators soon saw the need for more disease terms. Within a couple of years, RGD began adding terms to MEDIC under the name RGD Disease Ontology (RDO). Because RGD assigned a unique ID to every MEDIC term imported from CTD, it was easy to add specially coded IDs that identify the additional terms, which come from a separate, supplemental file. Meanwhile, the Human Disease Ontology (DO) had slowly been developing and expanding. As early as 2010, members of RGD were contributing to the development of DO. Based on the promise of improvements, it was determined that the Alliance of Genome Resources could use the DO as a unifying disease vocabulary across model organism databases. Despite the improvements in DO, RGD still had more than 1000 custom terms and 3800 MEDIC terms with annotations to deal with if it were to convert to DO. Mapping those non-DO disease terms to DO terms would have lost much granularity of meaning. To avoid that loss, it was decided to extend the DO after importing the merged, already axiomatized DO file.
With DO mapped completely to the RGD version of MEDIC, a broader, deeper disease vocabulary has been achieved.
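
The extension step described above can be illustrated with a minimal sketch. It is not RGD's actual pipeline: the `Term` class, the `extend_vocabulary` function, the two mapping tables, and every ID and name in the example are hypothetical placeholders. The assumption is that each non-DO term either has an exact DO equivalent (and is recorded as a cross-reference on it) or has none (and is kept as its own term under a DO parent, which preserves its granularity).

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str
    name: str
    parents: set = field(default_factory=set)
    xrefs: set = field(default_factory=set)


def extend_vocabulary(do_terms, extra_terms, exact_map, parent_map):
    """Return DO plus every extra term that has no exact DO equivalent.

    exact_map:  extra term ID -> equivalent DO term ID (hypothetical table)
    parent_map: extra term ID -> DO term ID to place it under (hypothetical table)
    """
    # Copy the DO terms so the input vocabulary is left untouched.
    merged = {
        tid: Term(t.term_id, t.name, set(t.parents), set(t.xrefs))
        for tid, t in do_terms.items()
    }
    for tid, term in extra_terms.items():
        if tid in exact_map:
            # Exact match: keep the DO term and record the old ID as a cross-reference.
            merged[exact_map[tid]].xrefs.add(tid)
        else:
            # No exact match: keep the term under its own ID as a child of a DO term.
            parent = parent_map.get(tid)
            parents = {parent} if parent in merged else set()
            merged[tid] = Term(tid, term.name, parents)
    return merged


# Placeholder data; none of these IDs or names are real vocabulary entries.
do_terms = {
    "DO:X1": Term("DO:X1", "disease group"),
    "DO:X2": Term("DO:X2", "disease A", parents={"DO:X1"}),
}
extra_terms = {
    "MEDIC:M1": Term("MEDIC:M1", "disease A"),
    "CUSTOM:C1": Term("CUSTOM:C1", "disease A, subtype 1"),
}
exact_map = {"MEDIC:M1": "DO:X2"}
parent_map = {"CUSTOM:C1": "DO:X2"}

merged = extend_vocabulary(do_terms, extra_terms, exact_map, parent_map)
```

In this sketch the more specific custom term survives as a child of the DO term instead of being collapsed into it, which is the loss of granularity the abstract says the extension was meant to avoid.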