
Publications


Date: Abstract
June 27, 2008: A System for Ontology-Based Annotation of Biomedical Data, C. Jonquet, M. A. Musen, N. H. Shah. International Workshop on Data Integration in The Life Sciences 2008, DILS'08, Evry, France, Springer-Verlag, 5109, Lecture Notes in BioInformatics, 144-152. Published 2008.

We present a system for ontology based annotation and indexing of biomedical data; the key functionality of this system is to provide a service that enables users to locate biomedical data resources related to particular ontology concepts. The system’s indexing workflow processes the text metadata of diverse resource elements such as gene expression data sets, descriptions of radiology images, clinical-trial reports, and PubMed article abstracts to annotate and index them with concepts from appropriate ontologies. The system enables researchers to search biomedical data sources using ontology concepts...
May 1, 2008: Help Will be Provided for This Task: Ontology-Based Annotator Web Service, C. Jonquet, M. A. Musen, N. H. Shah. International Semantic Web Conference (ISWC08), Karlsruhe, Germany. May 2008.

... an ontology-based annotator web service methodology that can annotate a piece of text with ontology concepts and return annotations in OWL. Currently, the annotation workflow is based on syntactic concept recognition (using concept names and synonyms) and on a set of semantic expansion algorithms that leverage the semantics in ontologies. The paper also describes an implementation of this service for life sciences and biomedicine. Our biomedical annotator service uses one of the largest available sets of publicly available terminologies and ontologies. We used it to create an index of open biomedical resources.
May 1, 2008: Towards a richer description of our complete collection of genomes and metagenomes: the 'Minimum Information about a Genome Sequence' (MIGS) specification, Field, et al. Nature Biotechnology, 26(5): 541 - 547, May 2008.

With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.
March 28, 2008: BioPortal: A Web Portal to Biomedical Ontologies, Daniel L. Rubin, Dilvan A. Moreira, Pradip P. Kanjamala, and Mark A. Musen. AAAI Spring Symposium, March 28, 2008.

... We have created BioPortal, a Web portal to a virtual library of ontologies on the Semantic Web and a tool set enabling the community to access, critique, and improve ontologies. The BioPortal library contains over 50 ontologies from the biological and medical domains. In addition to a Web interface enabling researchers in cyberspace to locate these knowledge resources, BioPortal provides a suite of Web services, including ontology categorization, term search, graphical ontology visualization, and ontology version histories. ... we are also creating novel tools in BioPortal to enable the community to create mappings between classes in related ontologies and to critique ontology content, providing feedback to ontology developers. Preliminary user experience with BioPortal has been extremely positive. BioPortal appears promising for unifying and disseminating ontology content on the Semantic Web, and it is providing tools needed by the research community to exploit these rich resources.
March 12, 2008: Ontology-driven Indexing of Public Datasets for Translational Bioinformatics, Nigam H. Shah, M.B.B.S., PhD, Annie P. Chiang, PhD, Atul J. Butte, MD, PhD, Rong Chen, PhD and Mark A. Musen, MD, PhD. American Medical Informatics Association Symposium on Translational Bioinformatics, San Francisco, CA; March 10-12, 2008.

... We have previously developed methods to map text-annotations of tissue microarrays to concepts in the NCI thesaurus and SNOMED-CT. In this work we generalize our methods to map text annotations of gene expression datasets to concepts in the UMLS. We demonstrate the utility of our methods by processing annotations of datasets in the Gene Expression Omnibus. We demonstrate that we enable ontology-based querying and integration of tissue and gene expression microarray data. We enable identification of datasets on specific diseases across both repositories. Our approach provides the basis for ontology-driven data integration for translational research on gene and protein expression data.
January 1, 2008: Biomedical ontologies: a functional perspective, D. L. Rubin, N. H. Shah, N. F. Noy. Briefings in Bioinformatics, January 2008. 9(1):75-90.

... The objective of this review is to give an overview of biomedical ontology in practical terms by providing a functional perspective: describing how bio-ontologies can be and are being used. As biomedical scientists begin to recognize the many different ways ontologies enable biomedical research, they will drive the emergence of new computer applications that will help them exploit the wealth of research data now at their fingertips.
December 8, 2007: Translating the Foundational Model of Anatomy into OWL, N. F. Noy, D. L. Rubin. Journal of Web Semantics, In Press, Corrected Proof. Available online 8 December 2007.

The Foundational Model of Anatomy (FMA) represents the result of manual and disciplined modeling of the structural organization of the human body. It is a tremendous resource in bioinformatics that facilitates sharing of information among applications that use anatomy knowledge. The FMA was developed in Protégé and the Protégé frames language is the canonical representation language for the FMA. We present a translation of the original Protégé frame representation of the FMA into OWL. Our effort is complementary to the earlier efforts to represent FMA in OWL and is focused on two main goals: ...
November 11, 2007: Ontology Mapping - A User Survey, S. M. Falconer, N. F. Noy, M. A. Storey. November 2007.

Conference proceeding from the Second International Workshop on Ontology Matching at ISWC 07 + ASWC 07, Busan, Korea.
November 11, 2007: A cognitive support framework for ontology mapping, S. M. Falconer, M-A. Storey. International Semantic Web Conference, Busan, Korea. November 2007.

Ontology mapping is the key to data interoperability in the semantic web. This problem has received a lot of research attention; however, the research emphasis has been mostly devoted to automating the mapping process, even though the creation of mappings often involves the user. As industry interest in semantic web technologies grows and the number of widely adopted semantic web applications increases, we must begin to support the user. In this paper, we combine data gathered from background literature, theories of cognitive support and decision making, and an observational case study to propose a theoretical framework for cognitive support in ontology mapping tools. We also describe a tool called COGZ that is based on this framework.
November 11, 2007: Interpretation Errors related to the GO Annotation File Format, Dilvan A Moreira, Nigam H. Shah, Mark A. Musen. AMIA Annual Symposium, Chicago, IL, November 2007.

The Gene Ontology (GO) is the most widely used ontology for creating biomedical annotations. GO annotations are statements associating a biological entity with a GO term. These statements comprise a large dataset of biological knowledge that is used widely in biomedical research. GO Annotations are available as “gene association files” from the GO website in a tab-delimited file format (GO Annotation File Format) composed of rows of 15 tab-delimited fields. This simple format lacks the knowledge representation (KR) capabilities to represent unambiguously semantic relationships between each field. This paper demonstrates that this KR shortcoming leads users to interpret the files in ways that can be erroneous. We propose a complementary format to represent GO annotation files as knowledge bases using the W3C recommended Web Ontology Language (OWL).
November 7, 2007: The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration, Barry Smith, Michael Ashburner, Cornelius Rosse, Jonathan Bard, William Bug, Werner Ceusters, Louis J Goldberg, Karen Eilbeck, Amelia Ireland, Christopher J Mungall, the OBI Consortium, Neocles Leontis, Philippe Rocca-Serra, Alan Ruttenberg, Susanna-Assunta Sansone, Richard H Scheuermann, Nigam Shah, Patricia L Whetzel & Suzanna Lewis. Nature Biotechnology, November 2007. 25(11):1251-1255.

Existing OBO ontologies, including the Gene Ontology, are undergoing coordinated reform, and new ontologies are being created on the basis of an evolving set of shared principles governing ontology development. The result is an expanding family of ontologies designed to be interoperable and logically well formed and to incorporate accurate representations of biological reality. We describe this OBO Foundry initiative and provide guidelines for those who might wish to become involved.
November 4, 2007: The Gene Ontology Project in 2008, The Gene Ontology Consortium. Nucleic Acids Research, Advance Access published November 4, 2007. 36:D440-D444.

The Gene Ontology (GO) project (http://www.geneontology.org/) provides a set of structured, controlled vocabularies for community use in annotating genes, gene products and sequences (also see http://www.sequenceontology.org/). The ontologies have been extended and refined for several biological areas, and improvements to the structure of the ontologies have been implemented. To improve the quantity and quality of gene product annotations available from its public repository, the GO Consortium has launched a focused effort to provide comprehensive and detailed annotation of orthologous genes across a number of ‘reference’ genomes, including human and several key model organisms. Software developments include two releases of the ontology-editing tool OBO-Edit, and improvements to the AmiGO browser interface.
November 1, 2007: Ontrez Project Report, N. H. Shah, C. Jonquet. Technical Report, Published 2007.

November 1, 2007: Searching Ontologies Based on Content: Experiments in the Biomedical Domain, H. Alani, N. F. Noy, N. H. Shah, M. A. Musen. ACM 2007:55-62.

Conference Proceeding from the Fourth International Conference on Knowledge Capture (K-CAP 2007), Whistler, BC, Canada
October 11, 2007: ChEBI: a database and ontology for chemical entities of biological interest, Kirill Degtyarenko, Paula de Matos, Marcus Ennis, Janna Hastings, Martin Zbinden, Alan McNaught, Rafael Alcántara, Michael Darsow, Mickaël Guedj and Michael Ashburner. Nucleic Acids Research, Advance Access published October 11, 2007. 36:D344-D350.

Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. The molecular entities in question are either natural products or synthetic products used to intervene in the processes of living organisms. Genome-encoded macromolecules (nucleic acids, proteins and peptides derived from proteins by cleavage) are not as a rule included in ChEBI. In addition to molecular entities, ChEBI contains groups (parts of molecular entities) and classes of entities. ChEBI includes an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified. ChEBI is available online at http://www.ebi.ac.uk/chebi/
September 27, 2007: Which Annotation did you mean?, N. H. Shah, M. A. Musen. Technical Report, Published 2007.

Ontologies are widely used to create biomedical annotations. Considerable effort goes into curation of publications to create annotations of genes and gene products as well as in the annotation of data sets. However, ‘annotation’ has different meanings in these two contexts, leading to enormous confusion within the bioinformatics community. In this work, we delineate the two meanings of annotation. We demonstrate that the semantics of these two interpretations are different and have significant bearing on how annotations of each type are created, stored, and used.
September 1, 2007: Current progress in network research: toward reference networks for key model organisms, B. Srinivasan, N. H. Shah, J. A. Flannick, E. Abeliuk, A. F. Novak, S. Batzoglou. Briefings in Bioinformatics, September 2007. 8(5):318-332.

The collection of multiple genome-scale datasets is now routine, and the frontier of research in systems biology has shifted accordingly. Rather than clustering a single dataset to produce a static map of functional modules, the focus today is on data integration, network alignment, interactive visualization and ontological markup. Because of the intrinsic noisiness of high-throughput measurements, statistical methods have been central to this effort. In this review, we briefly survey available datasets in functional genomics, review methods for data integration and network alignment, and describe recent work on using network models to guide experimental validation. We explain how the integration and validation steps spring from a Bayesian description of network uncertainty, and conclude by describing an important near-term milestone for systems biology: the construction of a set of rich reference networks for key model organisms.
August 8, 2007: Annotation and query of tissue microarray data using the NCI Thesaurus, N. H. Shah, D. L. Rubin, I. Espinosa, K. Montgomery, M. A. Musen. BMC Bioinformatics, August 2007. 8:296.

The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields, specifying the pathological diagnoses for each sample. These text annotations are not structured according to any ontology, making future integration of this resource with other biological and clinical data difficult. Results: We developed methods to map these annotations to the NCI thesaurus. Using the NCI-T we can effectively represent annotations for about 86% of the samples. We demonstrate how this mapping enables ontology driven integration and querying of tissue microarray data. ...
July 1, 2007: Phenotype ontologies: the bridge between genomics and evolution, Paula M. Mabee, Michael Ashburner, Quentin Cronk, Georgios V. Gkoutos, Melissa Haendel, Erik Segerdell, Chris Mungall and Monte Westerfield. Trends in Ecology and Evolution, July 2007. 22(7):345-350.

Understanding the developmental and genetic underpinnings of particular evolutionary changes has been hindered by inadequate databases of evolutionary anatomy and by the lack of a computational approach to identify underlying candidate genes and regulators. By contrast, model organism studies have been enhanced by ontologies shared among genomic databases. Here, we suggest that evolutionary and genomics databases can be developed to exchange and use information through shared phenotype and anatomy ontologies. This would facilitate computing on evolutionary questions pertaining to the genetic basis of evolutionary change, the genetic and developmental bases of correlated characters and independent evolution, biomedical parallels to evolutionary change, and the ecological and paleontological correlates of particular types of change in genes, gene networks and developmental pathways.
June 27, 2007: Using Annotations from Controlled Vocabularies to Find Meaningful Associations, Woei-Jyh Lee, Louiqa Raschid, Padmini Srinivasan, Nigam Shah, Daniel Rubin, and Natasha Noy. DILS 2007, 4th International Workshop, Philadelphia, PA, June 2007. LNBI 4544:247-263. PDF: http://bmir.stanford.edu/file_asset/index.php/1220/Rubin_LNBI_45440247.pdf

This paper presents the LSLink (or Life Science Link) methodology that provides users with a set of tools to explore the rich Web of interconnected and annotated objects in multiple repositories, and to identify meaningful associations. Consider a physical link between objects in two repositories, where each of the objects is annotated with controlled vocabulary (CV) terms from two ontologies. Using a set of LSLink instances generated from a background dataset of knowledge we identify associations between pairs of CV terms that are potentially significant and may lead to new knowledge. ...
June 2, 2007: UMLS-Query: A Perl Module for Querying the UMLS, N. H. Shah, M. A. Musen. Technical Report, Published 2007.

The Metathesaurus from the Unified Medical Language System (UMLS) is a widely used ontology resource, which is mostly used in a relational database form for terminology research, mapping and information indexing. We describe UMLS-Query, a Perl module that provides functions for retrieving concept identifiers, mapping text-phrases to Metathesaurus concepts and graph traversal in the Metathesaurus stored in a MySQL database. UMLS-Query can be used to build applications for semi-automated sample annotation, terminology based browsers for tissue sample databases and for terminology research.
December 31, 2006: Clench2.0: Cluster enrichment analysis and visualization of expression, annotation and transcription factor binding site data, N. H. Shah, N. V. Fedoroff, M. A. Musen. Technical Report, Published 2006.

Motivation: The end result of analyzing microarray datasets is a list of differentially expressed genes. Such gene lists are grouped by the functions of the gene products and common transcription factor binding sites contained in their promoters. Functional categorization is most commonly accomplished using Gene Ontology (GO) categories and promoters are analyzed for the presence and enrichment of binding sites for transcription factors known to be involved in the process under study. Although there are several programs that identify and analyze functional categories, few of them analyze both promoter sequences and functional categories. Moreover, the integrated visualization of the three data types, expression, annotation and transcription factor binding sites in the promoters, is important for drawing meaningful inferences.
November 11, 2006: Ontology-based Annotation and Query of Tissue Microarray Data, N. H. Shah, D. L. Rubin, K. S. Supekar, M. A. Musen. AMIA Annual Symposium, Washington DC, November 2006. 709-713.

The Stanford Tissue Microarray Database (TMAD) is a repository of data amassed by a consortium of pathologists and biomedical researchers. The TMAD data are annotated with multiple free-text fields, specifying the pathological diagnoses for each tissue sample. These annotations are spread out over multiple text fields and are not structured according to any ontology, making it difficult to integrate this resource with other biological and clinical data. We developed methods to map these annotations to the NCI thesaurus and the SNOMED-CT ontologies. ...
June 1, 2006: The National Center for Biomedical Ontology: Advancing Biomedicine through Structured Organization of Scientific Knowledge, D. L. Rubin, N. F. Noy, J. D. Richter, B. Smith, M. A. Storey, H. Solbrig, C. G. Chute, I. Sim, M. Ashburner, M. Westerfield, S. Misra, C. J. Mungall, S. E. Lewis, M. A. Musen. OMICS: A Journal of Integrative Biology, June 2006. 10(2):185-198.

The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are...
October 12, 2005: The Zebrafish Information Network: the zebrafish model organism database, Judy Sprague, Leyla Bayraktaroglu, Dave Clements, Tom Conlin, David Fashena, Ken Frazer, Melissa Haendel, Douglas G. Howe, Prita Mani, Sridhar Ramachandran, Kevin Schaper, Erik Segerdell, Peiran Song, Brock Sprunger, Sierra Taylor, Ceri E. Van Slyke and Monte Westerfield. Nucleic Acids Research, 2006. 34:D581–D585.

The Zebrafish Information Network (ZFIN; http://zfin.org) is a web based community resource that implements the curation of zebrafish genetic, genomic and developmental data. ZFIN provides an integrated representation of mutants, genes, genetic markers, mapping panels, publications and community resources such as meeting announcements and contact information. Recent enhancements to ZFIN include (i) comprehensive curation of gene expression data from the literature and from directly submitted data, (ii) increased support and annotation of the genome sequence, (iii) expanded use of ontologies to support curation and query forms, (iv) curation of morpholino data from the literature, and (v) increased versatility of gene pages, with new data types, links and analysis tools.
September 1, 2000: Computers in Radiology: Interactive Software for Generation and Visualization of Structured Findings in Radiology Reports, U. Sinha, B. Dai, D. B. Johnson, R. Taira, M. Golamco, H. Kangarloo. American Journal of Roentgenology (AJR), September 2000. 175(3):609-612.

OBJECTIVES: To develop a user-friendly graphic interface for a module that integrates traditional radiology reporting, natural language processing, and editing capabilities; to facilitate the structuring of radiology reports as part of routine clinical practice; to use a commercial speech recognition module for online transcription; to implement the module in a hardware-independent environment.