Difference between revisions of "PATO Meeting"

Revision as of 14:22, 9 November 2006

General Information and Registration

The National Center for Biomedical Ontology will host a two-day meeting focused on the Phenotype and Trait Ontology (PATO) December 1-2, 2006 at Stanford University in Palo Alto, CA.

Register here.

Venue

Stanford University Clark Center, room S360

Directions to Stanford Medical Center

Map of Stanford Medical Center--see "C", Clark Center

Directions on taking Free Marguerite Shuttle Bus--take Line A to Medical Center

Accommodations

We aren't reserving a block of rooms anywhere, but these local hotels are close by and have Stanford shuttle service.

Draft agenda

Pheno workshop Dec2006.jpg

Please join us for dinner

  • Thursday, November 30, 2006
    • 7pm: Informal gathering for dinner (Gordon Biersch?)
  • Friday, December 1, 2006
    • 7pm:

Abstracts for Presentations

Challenges for Representing Phenotype in Pharmacogenomics (Russ Altman)

Abstract: The PharmGKB (http://www.pharmgkb.org/) is an online resource devoted to comprehensive cataloguing of genetic variations relevant to variation in drug response. We curate primary data (genotype, phenotype at the molecular, cellular, and clinical levels) as well as knowledge (literature curation, pathways, human annotations of key genes). We provide search and visualization tools for this information, in order to catalyze research in pharmacogenomics. For both activities, we need to index the relevant phenotypes to support aggregation, search, automatic summarization, and data mining. We need a flexible method that curators can use to annotate phenotypes described in the literature. We would prefer to adopt community-based standards that would allow PharmGKB to interoperate with other databases, both human and model organism.

Ontologies and Vocabularies Supporting Data Integration: Emphasis on Mouse Phenotypes and Disease (Janan Eppig)

Abstract: The mouse is an exceptional model system for connecting knowledge from sequence-to-phenotype-to-disease. The Mouse Genome Informatics Database (MGI, http://www.informatics.jax.org) supports biological knowledge building for mouse by integrating genetic, genomic, and biological data and facilitating data mining and complex querying. Full access to integrated data is enabled by extensive use of structured vocabularies and ontologies including the Gene Ontology (GO), mouse Embryonic and Adult Anatomical Dictionaries (EMAP and MA), Mammalian Phenotype (MP) Ontology, and Online Mendelian Inheritance in Man (OMIM) disease and syndrome terms. In addition, MGI is the authoritative source for nomenclature for mouse genes, alleles, and strains. Many smaller vocabularies, such as mutation class, sequence type, genetic marker type, expression assay type, etc., are also key to MGI data integration. Phenotypic descriptions in MGI rely on the MP Ontology and definition of specific genotypes and strain backgrounds. The MP Ontology has been adopted successfully to describe mouse (MGI), rat (RGD), human (NCBI), and animal (OMIA) phenotypes. As of July 2006, MGI included >16,000 alleles representing phenotypic mutations in >6,600 genes. Over 65,600 phenotype annotations in MGI have been made using MP Ontology terms. The MP Ontology itself has, thus far, grown to >4,400 defined terms. Over 1,700 mouse models are associated with OMIM disease terms. Supported by NIH grant HG00330.

Rat Genome Database Disease Portals: A Platform for Genetic and Genomic Research (Victoria Petri)

Abstract: The Disease Portals at RGD provide a comprehensive research platform through the integration of heterogeneous datasets into the context of the genome using multiple ontologies and tools for data mining and visualization. The portals provide both novice and experienced users with easy access to a comprehensive, integrated knowledgebase. Current/proposed components of the portals include: 1) comprehensive rat, human and mouse gene sets associated with diseases, related phenotypes, pathways and biological processes; 2) all rat QTLs related to a disease, associated mouse/human QTLs; 3) strains used as disease models; 4) phenotype data in a species-dependent manner; 5) references; 6) expression data; 7) genome-wide view of genes/QTLs via GViewer; 8) comparative maps of disease-related regions; 10) customization of datasets/download options; 11) analysis/visualization of function and cellular localization makeup of gene sets. The portals are designed to highlight genetic/genomic data generated from rat research in diseases related to the cardiovascular, nervous, musculoskeletal, digestive, endocrine and immune systems, as well as metabolic diseases and cancer. Disease data across the three species, along with species-dependent phenotypic data, provide the user with a means to distinguish between subtle differences in disease manifestations. Such differences could help elucidate the links between events and conditions, the mechanisms that lead from the normal to the diseased phenotype.


Ontology Engineering Approaches Based on Semi-Automated Curation of the Primary Literature (Gully Burns)

Abstract: The process of knowledge curation from the primary literature is time-consuming, laborious, and specialized. Fortunately, the similarity of the curation process to the annotation of text with semantic labels presents an opportunity to employ cutting-edge natural language processing techniques to facilitate ontology construction. The result is that manual curation work can support both the development of an automated curation system and the semi-automated construction of a formal model of the domain. We present a general approach based on the use of active learning methods in conjunction with text-mining systems using the Conditional Random Fields model. Ultimately, we wish to construct annotation tools that fit seamlessly into scientists' everyday interaction with the primary literature. Our secondary, and complementary, focus is the creation of a domain ontology of the types of information identified for curation, which may encode formally the expert's knowledge and help pinpoint errors or vagueness in his or her understanding. We present preliminary data taken from information extraction experiments performed on the neuroanatomical connectivity literature. While this data is not normally considered a 'phenotype' within neuroanatomy, we argue that it (along with other non-genomic data) should be considered by the PATO community. This work is funded by the Information Sciences Institute and the National Library of Medicine (LM-07061).

Suggested Ontology for Pharmacogenomics (SO-Pharm): Modular Construction and Preliminary Testing (Adrien Coulet)

Abstract: Pharmacogenomics studies the involvement of interindividual variations of DNA sequence in different drug responses (especially adverse drug reactions). The Knowledge Discovery in Databases (KDD) process is a means for discovering new pharmacogenomic knowledge in biological databases. However, data complexity makes it necessary to guide the KDD process with a representation of domain knowledge. At least three domains are concerned: genotype, drug, and phenotype. The approach described here aims to reuse existing domain knowledge whenever possible in order to build a modular formal representation of domain knowledge in pharmacogenomics. The resulting ontology is called SO-Pharm, for Suggested Ontology for Pharmacogenomics. Various situations encountered during the construction process are analyzed and discussed. A preliminary validation is provided by representing some well-known examples of pharmacogenomic knowledge with SO-Pharm concepts.