Infectious Disease Ontology 2008
Background
This meeting builds on an earlier Infectious Disease Ontology workshop (IDO 2007), which was held at Cold Spring Harbor Laboratory in 2007. The workshop had four primary outcomes:
- Training of a core set of infectious disease researchers in ontology-development methods, facilitating their participation in ontology development;
- Development of a core Infectious Disease Ontology (IDO) which is designed to serve as a consensus-based controlled vocabulary resource for annotation of data representing all entities relevant to infectious diseases generally;
- Establishment of a method for creating, on the basis of IDO, a set of ontologies that can be developed in a distributed fashion yet together cover the entire infectious disease domain (the set consists of the above-described core IDO ontology plus sub-domain-specific extensions of the core, such as IDO-tuberculosis, IDO-malaria);
- Formation of an Infectious Disease Ontology Consortium (IDOC) whose members have agreed to contribute towards continued development of the core IDO and to develop seven different sub-domain-specific ontologies.
IDO 2008 Goals
To capitalize on these outcomes and to sustain the momentum gained from the 2007 workshop, we have scheduled a second Infectious Disease Ontology workshop, to be held in Buffalo, NY on September 16-17, 2008. The goals of this meeting are:
- i) to provide training for new consortium members,
- ii) to formulate and test a development methodology that can be adopted on a broad scale by experts in a wide variety of infectious disease sub-domains,
- iii) to critically evaluate the ontology development test cases initiated at IDO 2007 to improve both the ontologies themselves and the methodology used for their development,
- iv) to identify key application test cases, such as ontology-based natural language processing, to be developed over the coming year,
- v) to expand representation of sub-domains still lacking development effort, and
- vi) to involve representatives from key information sources and institutions that could be important contributors and users of the IDO set of ontologies.
IDO 2008 Schedule
Day 1: Tuesday September 16
- 8:30 am to 9:00 am Continental Breakfast
- 9:00 am to 10:00 am Introduction to Biomedical Ontology
- 10:00 am to 11:00 am Introduction to the Infectious Disease Ontology
- 11:00 am to 11:30 am Refreshment Break
- 11:30 am to 12:30 pm The Vaccine Ontology
- 12:30 pm to 1:30 pm Lunch
- 1:30 pm to 2:30 pm The Tuberculosis Ontology
- 2:30 pm to 3:30 pm The Staphylococcus aureus Ontology
- 3:30 pm to 4:00 pm Refreshment Break
- 4:00 pm to 5:00 pm The Infective Endocarditis Ontology
- 5:30 pm to 7:30 pm Dinner
- 7:30 pm to 9:00 pm Ontologies in the Future of Infectious Disease Research: A Discussion Introduced and Moderated by Christos Louis
Day 2: Wednesday September 17
- 8:30 am to 9:00 am Continental Breakfast
- 9:00 am to 10:00 am The Vector-borne Disease Ontology
- 10:00 am to 11:00 am The Dengue Fever Ontology
- 11:00 am to 11:30 am Refreshment Break
- 11:30 am to 12:30 pm The Influenza Ontology
- 12:30 pm to 1:30 pm Lunch
- 1:30 pm to 2:30 pm Current and Future Applications of the IDO Methodology
- 2:30 pm to 3:00 pm Refreshment Break
- 3:00 pm to 4:00 pm Goals for the Coming Year
Format
One person will be designated as moderator for each session. All sessions will emphasize group discussion over presentation. Moderators of the ontology evaluation sessions will begin the session with a brief presentation of the ontology and will be prepared to navigate and display the ontology throughout the discussion. Moderators of the remaining sessions will jump-start discussion with a brief outline of discussion points.
Confirmed Participants
Sivaram Arabandi
Lindsay Cowell
Alex Diehl
Vance Fowler
Steve Gill
Yongqun He
Joanne Luciano
Kitsos Louis
Saul Lozano-Fuentes
Anna Maria Masci
Chris Mungall
Darren Natale
Chimezie Ogbuji
Bjoern Peters
Alan Ruttenberg
Richard Scheuermann
Lynn Schriml
Barry Smith
Progress Since IDO 2007
Progress has been made in the development of IDO and seven sub-domain-specific extensions of IDO. The sub-domain-specific extensions include ontologies for the following diseases:
- Tuberculosis (Carol Dukes-Hamilton, Duke University Medical Center)
- Staphylococcus aureus bacteremia (Vance Fowler, Duke University Medical Center)
- Infective endocarditis (Sivaram Arabandi, Cleveland Clinic Foundation)
- Malaria and other vector-borne diseases (Christos Louis, Institute of Molecular Biology and Biotechnology – FORTH)
- Dengue fever (Saul Lozano-Fuentes, Colorado State University)
- Influenza (Stuart Sealfon, Mount Sinai School of Medicine; Richard Scheuermann, University of Texas, Southwestern Medical Center)
Development of IDO has continued along two fronts: expansion of content, driven by development of the sub-domain-specific ontologies, and refinement of the approach to representing infectious disease-relevant entities ontologically.
IDO is also supplemented by a Vaccine Ontology, which is being developed by Yongqun He (University of Michigan) in collaboration with Lindsay Cowell and Barry Smith.
In collaboration with Dr. Carol Dukes-Hamilton at Duke University Medical Center, Drs. Cowell and Smith have begun developing a draft ontology of tuberculosis and a method for defining ISO 11179 data elements using logical constructs based on terms derived from ontologies. Dr. Dukes-Hamilton’s research group has defined eighty tuberculosis data elements and curated these into the National Cancer Institute’s metadata repository, caDSR. Definition of these data elements using ontology terms provides not only a formal method for data element definition, significantly improving the resulting definitions, but also interoperability between data elements (along with the data associated therewith) and the vast amount of biomedical data and information annotated with terms from the same or an interoperable set of ontologies.
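The idea of defining a data element from ontology terms can be illustrated with a minimal sketch. It assumes the usual ISO 11179 decomposition of a data element into an object class, a property, and a value domain. The class names, the example data elements, and the `EX:` identifiers are hypothetical placeholders; they are not taken from caDSR, from the tuberculosis data elements described above, or from any real ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    term_id: str  # hypothetical placeholder identifier, not a real ontology ID
    label: str


@dataclass(frozen=True)
class DataElementConcept:
    object_class: OntologyTerm  # the kind of thing being described
    prop: OntologyTerm          # the characteristic being recorded


@dataclass(frozen=True)
class DataElement:
    name: str
    concept: DataElementConcept
    permitted_values: tuple  # value domain, given as a tuple of OntologyTerm

    def definition(self) -> str:
        """Build a textual definition from the labels of the constituent terms."""
        values = ", ".join(term.label for term in self.permitted_values)
        return (
            f"{self.concept.prop.label} of {self.concept.object_class.label}; "
            f"permitted values: {values}"
        )

    def term_ids(self) -> frozenset:
        """Return the identifiers of every ontology term used in this data element."""
        terms = (self.concept.object_class, self.concept.prop, *self.permitted_values)
        return frozenset(term.term_id for term in terms)


# Illustrative terms only; the identifiers are invented for this sketch.
specimen = OntologyTerm("EX:0000001", "sputum specimen")
smear_result = OntologyTerm("EX:0000002", "acid-fast smear result")
culture_result = OntologyTerm("EX:0000005", "culture result")
positive = OntologyTerm("EX:0000003", "positive")
negative = OntologyTerm("EX:0000004", "negative")

smear = DataElement(
    "SputumSmearResult",
    DataElementConcept(specimen, smear_result),
    (positive, negative),
)
culture = DataElement(
    "SputumCultureResult",
    DataElementConcept(specimen, culture_result),
    (positive, negative),
)

# Terms shared by both data elements are the points at which their data can be linked.
shared = smear.term_ids() & culture.term_ids()

print(smear.definition())
print(sorted(shared))
```

In this sketch, two data elements that reuse the same terms can be related to each other, and to other data annotated with those terms, without any hand-written mapping between them.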
In collaboration with Dr. Vance Fowler at Duke University Medical Center, Drs. Cowell and Smith have developed a draft ontology of Staphylococcus aureus bacteremia.
Dr. Sivaram Arabandi of the Cleveland Clinic Foundation is part of a large team developing SemanticDB technology, a semantic datastore with query functionality, having primary focus on Cardiology and Cardiothoracic Surgery. A portion of this work involves developing an IDO extension ontology for infective endocarditis.
Dr. Christos Louis’ research group at the Institute of Molecular Biology and Biotechnology (IMBB), one of the seven institutes of the Foundation for Research and Technology – Hellas (FORTH), based in Crete, is developing an IDO extension for malaria and other vector-borne diseases. The group is working in parallel to develop an ontology of the physiological processes of disease vectors that play a direct or indirect role in disease transmission. These ontology development efforts are being pursued within the context of VectorBase (http://www.vectorbase.org), an NIAID Bioinformatics Resource Center for invertebrate vectors of human pathogens, and they include efforts to construct decision support systems for vector-borne diseases.
A collaborative group of researchers, including Joanne Luciano (MITRE), Burke Squires (University of Texas, Southwestern Medical Center) and Lynn Schriml (University of Maryland, School of Medicine), has used the Ontology of Biomedical Investigations (OBI) components of materials/objects, qualities and processes to develop an influenza ontology and to map influenza virus sequence and surveillance terms to their respective materials and qualities.
The Influenza Ontology describes by category the Investigator, Event, Location, Strain Specimen, Amplified Strain Specimen, Virion RNA, Treatment, and Host. The groups from BHB, IGS and MITRE have consolidated influenza sequence and surveillance terms from resources such as the BioHealthBase (BHB), a Bioinformatics Resource Center (BRC) for Biodefense and Emerging and Re-emerging Infectious Diseases, the Centers for Excellence in Influenza Research and Surveillance (CEIRS), and the Gemina and Influenza Virus Genome Projects. The list of data fields that describe influenza virus isolates and surveillance data has been created by consolidating data fields from data contributors and separate CEIRS participants. The initial ontology of terms has been created with a cross reference of terms to existing OBO Foundry ontologies.
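The consolidation and cross-referencing steps described above might look roughly like the following sketch. It is only an illustration under stated assumptions: the source names, the field spellings, and the `EX:` identifiers are hypothetical, and the normalization rule is a simple stand-in for whatever curation the groups actually performed.

```python
from collections import defaultdict


def normalize(field_name: str) -> str:
    """Lower-case a field name and collapse separators so spelling variants compare equal."""
    cleaned = field_name.replace("_", " ").replace("-", " ").lower()
    return " ".join(cleaned.split())


def consolidate(fields_by_source: dict) -> dict:
    """Map each normalized field name to the sorted list of sources that supply it."""
    merged = defaultdict(set)
    for source, fields in fields_by_source.items():
        for field_name in fields:
            merged[normalize(field_name)].add(source)
    return {name: sorted(sources) for name, sources in sorted(merged.items())}


def unmapped(consolidated: dict, cross_reference: dict) -> list:
    """List consolidated fields that have no cross-reference to an ontology term yet."""
    return [name for name in consolidated if name not in cross_reference]


# Hypothetical contributor field lists, loosely following the categories named above.
fields_by_source = {
    "source_a": ["Host", "Strain_Specimen", "Location"],
    "source_b": ["host", "strain specimen", "Treatment"],
}

# Hypothetical cross-reference from field name to an existing ontology term ID.
cross_reference = {
    "host": "EX:0000001",
    "location": "EX:0000002",
}

consolidated = consolidate(fields_by_source)
print(consolidated)
print(unmapped(consolidated, cross_reference))
```

Fields reported as unmapped are the candidates for new terms in the ontology, while mapped fields reuse terms from existing ontologies.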
The CEIRS projects consist of two research areas: influenza virus surveillance and basic influenza virus sequence and genetic reassortment. Working in collaboration with Dr. Richard Scheuermann, University of Texas, Southwestern Medical Center, the immediate goal is to apply the Influenza Virus Ontology to data collected as part of the CEIRS projects in an effort to enable influenza researchers to more easily elucidate the causes of influenza virulence and pathogenesis. Once completed, a database schema based upon the OBI will serve as the repository for influenza sequence and surveillance data through the BHB portal.
For more information about IDO and its sub-domain extensions, see http://www.infectiousdiseaseontology.org.