W01: Phenotype Ontologies Traversing All The Organisms (POTATO)

ICBO2018 Workshop-W01

Subheading: Aligning phenotype ontologies using design patterns

Dates:

Monday, August 06 - 9am-5:30pm (invite only)
Location: Willamette Room 115A-B

Tuesday, August 07 - 4-6pm (open to all) 
Location: Cascade Ballroom 110

Organizer(s):

  • David Osumi-Sutherland
  • Nico Matentzoglu
  • Nicole Vasilevsky
  • Melissa Haendel
  • Chris Mungall
  • Jim Balhoff

Workshop type:

  • Workshop

The following software is required for our hands-on sessions:

  • Protégé 5 (https://protege.stanford.edu/products.php)
  • ELK Reasoner Plugin for Protégé (we have had good experiences with the latest unofficial release, which can be obtained from here)
  • Docker (Community Edition) - this is a framework for running container-based software; this is needed for running some of the ontology processing tools.
  • A git client (if you are on Windows or Mac and not familiar with git version control, consider installing the GitHub Desktop client)

You will also need an account on GitHub if you do not already have one. If you have problems with installation or other questions about the workshop, you can let us know in our Gitter channel or send an email to nicolas.matentzoglu@ebi.ac.uk.

Workshop Abstract

One of the challenges in computing over or integrating data annotated using multiple phenotype ontologies is the limited interoperability between ontologies built for different species. While most phenotype ontologies use logical definitions composed of terms from species-independent ontologies such as Uberon and PATO, the design pattern details of these logical definitions often differ in ways that make cross-species mapping and ontology maintenance difficult. To fix this, we need mechanisms for specifying and sharing design patterns as a way of standardising logical definitions both within and between phenotype ontologies.

A design pattern, in this context, is simply a template for generating OWL classes by specifying OWL entities (classes, relations) and literals (e.g. text strings) that fill variable slots in the template. For example, a template that specifies logical definitions for phenotypic classes might have variable slots for anatomical entities and their phenotypic qualities such as ‘abnormal’, ‘increased size’ or ‘decreased amount’.
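The idea of filling variable slots can be sketched in a few lines of Python. This is only an illustration of the template mechanism, not the DOSDP format or any workshop tool: the `Pattern` class, its field names, the slot names `quality` and `anatomical_entity`, and the relation labels used in the definition string are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    """Hypothetical, simplified stand-in for a design pattern template."""
    name_template: str        # template for the generated class label
    definition_template: str  # template for the logical definition text

    def fill(self, **slots: str) -> dict:
        # Substitute the slot values into both templates.
        return {
            "label": self.name_template.format(**slots),
            "equivalent_to": self.definition_template.format(**slots),
        }


# Illustrative pattern with two slots: a quality and an anatomical entity.
# The relation labels are assumptions chosen for readability.
QUALITY_OF_ENTITY = Pattern(
    name_template="{quality} {anatomical_entity}",
    definition_template=(
        "'has part' some ('{quality}' and "
        "('inheres in' some '{anatomical_entity}'))"
    ),
)

# Each row plays the role of one line in a table of slot fillers.
rows = [
    {"anatomical_entity": "heart", "quality": "increased size"},
    {"anatomical_entity": "kidney", "quality": "decreased amount"},
]

terms = [QUALITY_OF_ENTITY.fill(**row) for row in rows]

for term in terms:
    print(term["label"], "=>", term["equivalent_to"])
```

Because every generated term comes from the same template, changing the template once changes all of the terms consistently, which is the property that makes shared patterns useful for alignment.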

Recent innovations have made design-pattern-driven ontology development relatively straightforward. We now have formally specified standards for representing and validating design pattern templates and software to generate ontology terms using these templates. We also have user-friendly tools for constructing and maintaining tables specifying the information required to apply these templates to generate sets of new terms. We introduce the phenotype ontology community to the idea and benefits of pattern-based ontology development and present two suites of tools, one for analysing the state of alignment between phenotype ontologies and prioritising design pattern changes and another for implementing these changes. We illustrate how pattern standardization across species facilitates the generation of species-specific phenotype ontologies, while adding significant value to important problems such as disease identification.

The workshop will be divided into two parts: (1) a pre-meeting portion (1 day) for which we explicitly invite experienced phenotype ontology engineers involved in the definition of phenotypes for various organisms to work on cross-species phenotype patterns and pattern reconciliation and (2) a main, public tutorial that will introduce the life cycle for pattern-based development, provide instruction using relevant tools, and illustrate the benefits of pattern alignment across ontologies.

The public workshop will take the form of a half-day tutorial. This will start with an introduction to pattern-based ontology development emphasising its benefits and importance, illustrated using the Dead Simple OWL Design Patterns (DOSDP) framework and examples from phenotype ontologies. Following this, we will present and discuss strategies to maintain patterns in a multi-species context. We will then run a hands-on practical session on setting up ontology repositories for pattern-based development.

Rationale:

Understanding the genetic bases of human diseases and phenotypes is a central aim of much modern biomedical research. While we have increasing volumes of evidence from human studies linking genes to diseases and phenotypes, the majority of this evidence is only correlative. In contrast, model organism studies are a very rich source of evidence for causal links between phenotypes and mutations in specific genes. We can use genetic similarity to leverage this data, inferring the functions and phenotypes of human genes based on those of their model organism orthologs. Alternatively, we can use phenotypic similarity, predicting the genes that underlie human disease and phenotype by finding genes that cause similar phenotypes in model organisms. This latter approach is especially important when the genetic basis of a phenotype or disease is unknown, but it requires a unified semantics for diseases and phenotypes.

Much progress has been made in achieving this. In particular, many phenotype ontologies make extensive use of formal, logical definitions referencing classes from a shared or related set of ontologies, for example using species-neutral ontologies to refer to anatomy (Uberon) and cell types (CL). However, limited alignment of formal definition patterns reduces their usefulness for improving ontology structure by automating classification and for cross-species inference. For some species (e.g. Xenopus), formal recording of phenotypes is only just getting started, so guidance is needed to choose patterns that maximise alignment with existing ontologies.

Funding:

NIH: 1U13CA221044