OboInOwl:Main Page

From NCBO Wiki
Revision as of 15:07, 19 January 2007 by Nigam (talk | contribs)
Jump to navigation Jump to search

OboInOwl

This wiki is for discussing the mapping between Obo1.2 format and OWL. The first version of this mapping is at http://www.godatabase.org/dev/doc/mapping-obo-to-owl.html. We have finished work on a newer version of this mapping. This page provides a brief background for the effort and provides links to the relevant tools (plugins for OBOEdit and Protege) that implement the mapping.

The Gene Ontology and a significant number of biomedical ontologies are in the OBO-format. The OBO format, which originated along with the Gene Ontology, has evolved to support the needs of the biomedical ontologies that fall under the Open Biomedical Ontologies (OBO) umbrella. The OBO-format aims to have 1) human readability, 2) ease of parsing, 3) extensibility and 4) minimal redundancy. The OBO-format currently forms the backbone of most GO based annotation and data analysis tools.

In parallel with the developments in bio-ontologies, ontologies in general have become more prevalent in information technology; with the most visible push coming from the W3C in the form of the W3C recommendation of the Web Ontology Language (OWL) as an international standard for ontologies on the web. There has also been a corresponding increase in the number, diversity and quality of the tools available to construct, maintain and view ontologies in OWL.

As bio-ontologies become more popular and grow in size as well as complexity, they are becoming the focus of attention of the larger computer science research community. On one hand there is significant interest in using the life sciences domain as a “focus” for W3C semantic web activity. In this light, biological data annotated using OBO ontologies is a prime resource and there is great interest from the Semantic Web community to access the ontologies and the annotated data in OWL format. On the other hand, if bio-ontologies are to benefit from the rapid progress being made in computer science – especially the semantic web technologies – bio-ontologies need to interoperate with other ontologies which are in the OWL format. The relatively newer biomedical ontologies (such as BioPAX) are already in OWL. The NCI-thesaurus, being developed by the National Cancer Institute, is also in OWL.

As a result, there is a strong need to map the OBO-format to OWL and provide tools that enable the end user to perform the translation at the click of a button in a stable ontology editing environment without worrying about underlying formats.

Mail Lists

We have created the OBO to OWL mapping remaining faithful to the (declared) semantics of the OBO format. At places where we found the format to be vague, we have tightened the semantics and have update the documentation accordingly. We make the mapping tools available for other researchers to use and evaluate the mapping. Please feel free to contact us on these mailing lists if you find anything lacking, have suggestions or have any kind of feedback.

Obo Format List

Also of interest: Obo Cross-Product List

Tools for the mapping

Protege plugins

OBO Explorer Protege tab:

 OBO Explorer Tab

OBO Converter Protege tab:

 Source code for OBO Converter Tab
 Binaries for OBO Converter Tab
 Instructions for OBO Converter Tab

OboEdit OWL plugin

That is the development version of the OWL Export/Import plugin for OboEdit . Just download the distribution file, unzip it and copy its content to <OboEdit>/extensions folder. Start OboEdit and you should find the option "OWL Adapter" for loading ontologies, File->Load Terms..., and for saving, File->Save as...

OboEdit OWL plugin.

If you have problems with big ontologies, try to increase the size of the memory available to OboEdit.

URIs

Added a separate page on mapping OBO IDs to URIs:

OboInOwl:URIs

Overview of [Other] Mapping efforts

The spreadsheet below provides a summary of the various OBO to OWL conversion efforts that I am aware of. Email me (nigam .AT. stanford.edu) with additions/deletions as you come across them.

Spreadsheet comparing mappings

Publically viewable:

Editable (contact us to add your mapping & edit):

Progress Notes

I've overhauled the obo2owl mapping. I've pretty much followed Alan's recommendations (I made a lot of purely internal changes to the xslt too though which should make it much clearer). Hope these work for you Stuart. Sorry about the churn - but this will definitely be worth it in the end.

Example OWL file can be found here (also attached):

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.owl

(note that this example includes a cross-product example)

The OWL is generated from either of the following:

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.obo http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.obo-xml

The XSL can be found here:

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/xsl/obo2owl.xsl http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/xsl/obo2owl_obo_in_owl_metamodel.xsl

The XSL actually serves as fairly reasonable documentation about what's going on - but we'll also come up with a friendlier description once it's finalised

You can convert the obo-xml directly with the xslt. If you want to convert from obo you'll need the latest version of go-perl (from cvs)

Here are the changes and things still pending:

      Adopted Alan Ruttenberg's metamodel changes (see obo-format list)
      split into 2 separate xsl files
      subset (ontology views) now more consistent with obo
      * the oboInOwl class is SubsetDef
      * this does not appear in the owl:Ontology section, it stands alone
        (subsets can be used across ontologies)
      namespace changes -
      * the metamodel is now called oboInOwl
        (the format is owned by GO, so this maps to a GO URI)
      * the default ontology content namespace is now bioont
        (the URI for this will be some bioontologies.org URI)
      * slashes not hashes or underscores
      - example: rdf:about="oboContent/GO/0000001"
      fixed rdf:about/resource/ID issues
      - ID is never used
      - about and resource now used in correct places
      CHECKED
      - validates as DL in http://phoebus.cs.man.ac.uk:9999/OWL/Validator
      - works in SWOOP
      - works in Protege-OWL (but looks odd)
      TODO
      do we need an equivalentClass for intersectionOf?
      SWOOP saves this without
      decide on final URI scheme
      - Can we make the URIs less verbose? Use entities - or is this frowned on?
      new obo tags for obsoletion
      handling obsoletes
      decide on whether the oboInOwl metamodel should be exported as
      part of the content export, or linked to separately;
      and if linked to separately, do we need an owl:imports?