Difference between revisions of "OboInOwl:Main Page"

From NCBO Wiki

Revision as of 16:34, 19 January 2007

OboInOwl

This wiki is for discussing the mapping between the Obo1.2 format and OWL. The first version of this mapping is at http://www.godatabase.org/dev/doc/mapping-obo-to-owl.html. We have since finished work on a newer version. This page gives a brief background for the effort and links to the tools (plugins for OBOEdit and Protege) that implement the mapping.

The Gene Ontology and a significant number of biomedical ontologies are in the OBO format. The OBO format, which originated along with the Gene Ontology, has evolved to support the needs of the biomedical ontologies that fall under the Open Biomedical Ontologies (OBO) umbrella. It aims for 1) human readability, 2) ease of parsing, 3) extensibility and 4) minimal redundancy. The OBO format currently forms the backbone of most GO-based annotation and data analysis tools.

In parallel with the developments in bio-ontologies, ontologies in general have become more prevalent in information technology. The most visible push has come from the W3C, in the form of its recommendation of the Web Ontology Language (OWL) as an international standard for ontologies on the web. There has also been a corresponding increase in the number, diversity and quality of the tools available to construct, maintain and view ontologies in OWL.

As bio-ontologies become more popular and grow in size as well as complexity, they are becoming the focus of attention of the larger computer science research community. On one hand, there is significant interest in using the life sciences domain as a “focus” for W3C semantic web activity. In this light, biological data annotated using OBO ontologies is a prime resource, and the Semantic Web community has great interest in accessing the ontologies and the annotated data in OWL format. On the other hand, if bio-ontologies are to benefit from the rapid progress being made in computer science, especially in semantic web technologies, they need to interoperate with other ontologies that are in the OWL format. Some of the newer biomedical ontologies (such as BioPAX) are already in OWL. The NCI-thesaurus, being developed by the National Cancer Institute, is also in OWL.

As a result, there is a strong need to map the OBO-format to OWL and provide tools that enable the end user to perform the translation at the click of a button in a stable ontology editing environment without worrying about underlying formats.

Mail Lists

We have created the OBO to OWL mapping while remaining faithful to the (declared) semantics of the OBO format. In places where we found the format to be vague, we have tightened the semantics and have updated the documentation accordingly. We make the mapping tools available so that other researchers can use and evaluate the mapping. Please feel free to contact us on these mailing lists if you find anything lacking, have suggestions or have any kind of feedback.

Obo Format List

Also of interest: Obo Cross-Product List

The Mapping

We have made the mapping available as an online Google spreadsheet. You can view the sheet at http://spreadsheets.google.com/ccc?key=pWN_4sBrd9l1Umn1LN8WuQQ. If you want edit rights (or already have them) to leave comments, use the following link: http://spreadsheets.google.com/ccc?key=o06770842196506107736.4732937099693365844

Tools for the mapping

Protege plugins

OBO Converter Protege tab:

The OBO Converter is a Tab plugin for Protégé to convert OBO format files into OWL files and vice-versa (keeping in mind that OWL to OBO conversions can lose information if one encodes things in OWL that cannot be expressed in OBO). It is developed in such a way that it can also work as a standalone conversion program.

The OBO Converter Tab basically reads OBO files into Protégé OWL projects and saves those projects back as OBO 1.0 files. The Tab has two main panels, one to read OBO files and one to write (save) them. The save operation is straightforward as the user chooses the file name and the conversion is done. The read operation has the same functionality plus a set of options that can alter the way an OBO file is read (see figure).

OBOConverter.jpg

The read options allow users to choose the way the OWL class names will be generated from the Terms in the OBO format file. There are 3 options:

  • OBO id: This option will generate the name from the OBO term id. This is the default option and generates the OWL id in the way described in our mapping.
  • Class name: This option will generate the name from the OBO term name. This has to be used with care because the names are not required to be unique. If the names are not unique, there will be a parser error.
  • Class name + OBO id: This option will generate the name from the combination of the OBO term name and id.

In all cases, characters other than letters (a-z, A-Z) or numbers will be converted to underscore characters (_). For example, the OBO term name “nurse cell” will be converted to the OWL class name nurse_cell. The default behavior of the Tab is to generate the OWL class names from the OBO id. If the user wants to see the OBO names, rather than the opaque ids, as the class identifiers, Protégé has an option to display the OWL class label as the identifier. As OWL labels can also carry language identifiers (such as en for English), converted OBO ontologies can have names in different languages all pointing to the same entities, allowing for language localization while preserving the OBO ids. The other options are targeted at users who want to create their ontology using a specific OBO ontology as a starting point, but want to name their entities in a different way.
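
The naming rules described here can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: the function names, the mode strings, the way name and id are joined in the combined mode, and the term id EX:0000001 are all assumptions made for the example. The id form the Tab actually produces is the one defined in the mapping.

```python
import re


def sanitize(text):
    """Replace every character that is not a letter (a-z, A-Z) or digit with '_'."""
    return re.sub(r"[^A-Za-z0-9]", "_", text)


def owl_class_name(term_id, term_name, mode="id"):
    """Build an OWL class name from an OBO term (hypothetical helper).

    mode is one of "id" (the default), "name" or "name+id".
    """
    if mode == "id":
        return sanitize(term_id)
    if mode == "name":
        return sanitize(term_name)
    if mode == "name+id":
        # Assumption: name and id are joined with an underscore.
        return sanitize(term_name) + "_" + sanitize(term_id)
    raise ValueError("unknown mode: " + mode)


def class_names(terms, mode="id"):
    """Name every (term_id, term_name) pair, rejecting duplicate class names."""
    seen = {}
    for term_id, term_name in terms:
        name = owl_class_name(term_id, term_name, mode)
        if name in seen:
            # Stands in for the parser error raised on non-unique term names.
            raise ValueError(
                "duplicate class name %r for %s and %s" % (name, seen[name], term_id)
            )
        seen[name] = term_id
    return list(seen)


# EX:0000001 is a made-up id used only for illustration.
print(owl_class_name("EX:0000001", "nurse cell", mode="name"))  # nurse_cell
```

Calling class_names with mode="name" on two terms that share a name raises an error, which is why the "Class name" option has to be used with care.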

Downloads:

 Source code for OBO Converter Tab
 Binaries for OBO Converter Tab
 Instructions for OBO Converter Tab


OBO Explorer Protege tab:

The OWL format for OBO files uses anonymous nodes to represent definitions, synonyms, and DbxRefs, and the generic Protégé GUI components are not immediately suitable to display and edit them. The existing graphical components also do not allow the user to easily access or edit the lexical information associated with an OBO term. The OBO Explorer tab allows the user to do so in an interface that is similar to that of OBO-edit. This provides the user with the flexibility to edit these lexical features (such as synonyms and dbxrefs) in an intuitive manner.

 OBO Explorer Tab

OBOExplorer.jpg

OboEdit OWL plugin

We have also added the functionality to save an OBO format file as an OWL file from within OBOEdit. A development version of the OWL Export/Import plugin for OboEdit is available. Just download the distribution file, unzip it and copy its contents to the <OboEdit>/extensions folder. Start OboEdit and you should find the "OWL Adapter" option both when loading ontologies (File->Load Terms...) and when saving (File->Save as...).

OboEdit OWL plugin.

If you have problems with big ontologies, try to increase the size of the memory available to OboEdit.
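
The install step can also be scripted. The following is a minimal sketch in Python, assuming the distribution is a plain zip whose files belong directly in the extensions folder; install_plugin and both paths in the commented call are hypothetical, so substitute the real download name and your own OboEdit directory.

```python
import zipfile
from pathlib import Path


def install_plugin(dist_zip, oboedit_dir):
    """Unzip the plugin distribution into <OboEdit>/extensions (hypothetical helper).

    Returns the list of archive members that were extracted.
    """
    extensions = Path(oboedit_dir) / "extensions"
    extensions.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(dist_zip) as archive:
        archive.extractall(extensions)
        return archive.namelist()


# Example call with placeholder paths:
# install_plugin("owl-plugin.zip", "/path/to/OboEdit")
```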

URIs

Added a separate page on mapping OBO IDs to URIs:

OboInOwl:URIs

Overview of Other Mapping efforts

We are aware that there are several other groups that have created an OBO to OWL mapping to address immediate needs of their research groups. We have compiled a summary of the various OBO to OWL conversion efforts that we are aware of. Email me (nigam .AT. stanford.edu) with additions/deletions as you come across them.

 Spreadsheet comparing the mappings

Progress Notes

I've overhauled the obo2owl mapping. I've pretty much followed Alan's recommendations (though I also made a lot of purely internal changes to the xslt, which should make it much clearer). Hope these work for you, Stuart. Sorry about the churn, but this will definitely be worth it in the end.

Example OWL file can be found here (also attached):

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.owl

(note that this example includes a cross-product example)

The OWL is generated from either of the following:

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.obo
http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/examples/gotest.obo-xml

The XSL can be found here:

http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/xsl/obo2owl.xsl
http://geneontology.cvs.sourceforge.net/*checkout*/geneontology/go-dev/xml/xsl/obo2owl_obo_in_owl_metamodel.xsl

The XSL actually serves as fairly reasonable documentation of what's going on, but we'll also come up with a friendlier description once it's finalised.

You can convert the obo-xml directly with the xslt. If you want to convert from obo, you'll need the latest version of go-perl (from cvs).

Here are the changes and things still pending:

      Adopted Alan Ruttenberg's metamodel changes (see obo-format list)
      split into 2 separate xsl files
      subset (ontology views) now more consistent with obo
      * the oboInOwl class is SubsetDef
      * this does not appear in the owl:Ontology section, it stands alone
        (subsets can be used across ontologies)
      namespace changes -
      * the metamodel is now called oboInOwl
        (the format is owned by GO, so this maps to a GO URI)
      * the default ontology content namespace is now bioont
        (the URI for this will be some bioontologies.org URI)
      * slashes not hashes or underscores
      - example: rdf:about="oboContent/GO/0000001"
      fixed rdf:about/resource/ID issues
      - ID is never used
      - about and resource now used in correct places
      CHECKED
      - validates as DL in http://phoebus.cs.man.ac.uk:9999/OWL/Validator
      - works in SWOOP
      - works in Protege-OWL (but looks odd)
      TODO
      do we need an equivalentClass for intersectionOf?
      SWOOP saves this without
      decide on final URI scheme
      - Can we make the URIs less verbose? Use entities - or is this frowned on?
      new obo tags for obsoletion
      handling obsoletes
      decide on whether the oboInOwl metamodel should be exported as
      part of the content export, or linked to separately;
      and if linked to separately, do we need an owl:imports?
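
As an illustration of the "slashes not hashes or underscores" item in the notes above, the following Python sketch turns an OBO id into the relative form shown in the example (rdf:about="oboContent/GO/0000001"). The function name and the rule of splitting at the first colon are assumptions, and the base URI is left out because the final URI scheme is still on the TODO list.

```python
def obo_id_to_path(obo_id, content_root="oboContent"):
    """Map an OBO id such as 'GO:0000001' to 'oboContent/GO/0000001' (hypothetical helper)."""
    # Assumption: the id prefix and the local part are separated by the first colon.
    prefix, sep, local = obo_id.partition(":")
    if not sep or not prefix or not local:
        raise ValueError("expected an id of the form PREFIX:LOCAL, got %r" % obo_id)
    return "/".join([content_root, prefix, local])


print(obo_id_to_path("GO:0000001"))  # oboContent/GO/0000001
```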