LexGrid 2008 and OWL

From NCBO Wiki
Revision as of 14:15, 8 October 2008 by Noy (talk | contribs)

General notes:

  • We agreed that an OWL-LexGrid-OWL round-trip is not required for BioPortal purposes. This decision simplifies some of the handling of imported ontologies.
  • We understand that while the model seems to account for many OWL features, some critical ones have not been validated yet. By validation we mean using either the API or the UI (not necessarily both) to access the stored information. Some of the information has not been stored at all yet. We all agree that validating every feature of the model through API access (and, preferably, through the UI as well) is required to ensure that the information is represented faithfully.


Major features that the model does not seem to handle yet

Namespaces versus imported (composite) ontologies

This problem is probably the most serious one and the hardest to address in the current model. It has two parts:

  • The model does not distinguish between a namespace as a unit and an ontology as a unit. In reality, there is no one-to-one correspondence between an ontology and a namespace.
  • The model assumes that a namespace prefix is unique across all the coding schemes and uses the namespace prefix as a coding scheme id. This approach will fail in BioPortal, because many ontologies import the same ontologies (but, perhaps, different versions of them) with the same namespace prefix. For example, BIRNLex and OBI import some of the same ontologies. It seems that the current model will not be able to handle such a case.

Example: ontology 1:

 ...
 xmlns:p1="http://www.owl-ontologies.com/Ontology1222730448.owl">
 <owl:Ontology rdf:about="http://www.owl-ontologies.com/ontology1.owl">
   <owl:imports rdf:resource="http://www.owl-ontologies.com/Ontology1222730448.owl"/>
 </owl:Ontology>
 ...

ontology 2:

 ...
 xmlns:p1="http://www.owl-ontologies.com/Ontologyabc.owl">
 <owl:Ontology rdf:about="http://www.owl-ontologies.com/ontology2.owl">
   <owl:imports rdf:resource="http://www.owl-ontologies.com/Ontologyabc.owl"/>
 </owl:Ontology>
 ...

The same problem arises even without the imports, as long as both ontologies declare the prefix p1.
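
The collision can be illustrated with a minimal Python sketch. It assumes a hypothetical registry that, like the model described above, keys coding schemes by namespace prefix alone, and contrasts it with a registry keyed by the declaring ontology plus the prefix. The function names and the second keying scheme are illustrative assumptions, not part of the LexGrid model or API.

```python
# Hypothetical sketch: none of these names come from the LexGrid API.

def register_by_prefix(registry, prefix, namespace_uri):
    """Register a coding scheme keyed by namespace prefix alone."""
    if prefix in registry and registry[prefix] != namespace_uri:
        raise ValueError(
            f"prefix {prefix!r} already bound to {registry[prefix]!r}"
        )
    registry[prefix] = namespace_uri


def register_by_ontology(registry, ontology_uri, prefix, namespace_uri):
    """Register the same binding, scoped to the ontology that declares it."""
    registry[(ontology_uri, prefix)] = namespace_uri


ONTOLOGY_1 = "http://www.owl-ontologies.com/ontology1.owl"
ONTOLOGY_2 = "http://www.owl-ontologies.com/ontology2.owl"
NS_1 = "http://www.owl-ontologies.com/Ontology1222730448.owl"
NS_2 = "http://www.owl-ontologies.com/Ontologyabc.owl"

# Keyed by prefix: the second ontology's "p1" conflicts with the first.
by_prefix = {}
register_by_prefix(by_prefix, "p1", NS_1)
try:
    register_by_prefix(by_prefix, "p1", NS_2)
    collided = False
except ValueError:
    collided = True

# Keyed by (ontology, prefix): both bindings coexist.
scoped = {}
register_by_ontology(scoped, ONTOLOGY_1, "p1", NS_1)
register_by_ontology(scoped, ONTOLOGY_2, "p1", NS_2)
```

Scoping the prefix to the ontology is only one possible way out; it is shown here to make the failure mode concrete, not as a proposed change to the model.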

Inheritance

The issue here is not so much with the model as with the need to develop an API that handles inheritance. Currently, the LexGrid API cannot provide any inherited properties or restrictions for a class. There are two possible approaches:

  • have LexGrid pre-compute all the inheritance relationships at loading time and store them;
  • find the inherited relationships on the fly, when returning information about a class through the LexGrid API.

The first option is likely to increase the required disk space by an order of magnitude (from our experience with NCIT), but will allow using Protege to determine the inheritance relationships. The second option will require non-trivial processing on the API side, as our experience with Protege shows. Determining all the inherited information correctly is harder than it seems. Note that we need inheritance of restrictions, as well as of properties for which a class is a domain (see the next item).
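
As a rough illustration of the second option, the following Python sketch collects restrictions for a class by walking its superclasses at query time. The data layout (a dictionary from class to direct superclasses, and a dictionary from class to restriction tuples) is an assumption made for the example, not the LexGrid schema, and the function names are hypothetical.

```python
# Hypothetical sketch: the data layout is an assumption, not the LexGrid schema.

def all_superclasses(cls, parents):
    """Return every ancestor of cls, following multiple inheritance."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:  # also guards against cycles
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def inherited_restrictions(cls, parents, restrictions):
    """Collect restrictions asserted on cls and on all of its ancestors."""
    result = set(restrictions.get(cls, ()))
    for ancestor in all_superclasses(cls, parents):
        result |= set(restrictions.get(ancestor, ()))
    return result


# C is a subclass of B, which is a subclass of A.
parents = {"C": ["B"], "B": ["A"], "A": []}
restrictions = {
    "A": [("hasPart", "someValuesFrom", "X")],
    "B": [("hasColor", "allValuesFrom", "Y")],
}

restrictions_for_c = inherited_restrictions("C", parents, restrictions)
```

The sketch only follows named superclasses. It ignores, for example, restrictions reached through equivalent classes or anonymous class expressions, which is part of what makes a correct implementation non-trivial.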

Properties with no domains and ranges

The LexGrid/OWL specification says:

  • (OWLObjectProperty) "An association between two classes (hasDomain, hasRange)."
  • (OWLDatatypeProperty) "An association between one class (domain) and one association (hasDomain and hasDataProperty). The conceptProperty defines the range."

However, both datatype properties and object properties can be defined without any domains and ranges (and often are).

Multiple domains and ranges

It was not clear from our discussion whether the model can handle multiple domains and ranges. Usually the required semantics is that of a union. Suppose a property P has two domains, A and B, and two ranges, C and D. When rendering details for the class A, BioPortal will need to display all the properties where A is in the domain, and show all the ranges for those properties. So, in this case, when asked for A's properties, we would expect to get the property P, with two ranges, C and D. Note that we would also want to get the same result for any subclass of A (see the inheritance point above).
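
The expected behaviour for the P, A, B, C, D example can be sketched in Python as follows. The representation of a property as a pair of domain and range lists, the subclass A1, the property Q, and the function names are assumptions introduced for illustration. The sketch also assumes that a property with no declared domain is simply not listed for any class, which is a simplification and may not be the desired behaviour.

```python
# Hypothetical sketch: names and data layout are illustrative assumptions.

def ancestors(cls, parents):
    """Return every ancestor of cls in the class hierarchy."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def properties_for_class(cls, properties, parents):
    """Map each applicable property name to the union of its ranges.

    A property applies when cls, or any ancestor of cls, is among its
    domains (union semantics over multiple domains).
    """
    applicable = {cls} | ancestors(cls, parents)
    result = {}
    for name, (domains, ranges) in properties.items():
        if applicable & set(domains):
            result[name] = set(ranges)
    return result


properties = {
    "P": (["A", "B"], ["C", "D"]),  # two domains, two ranges
    "Q": ([], []),                  # no domain or range declared
}
parents = {"A1": ["A"]}             # A1 is a (hypothetical) subclass of A

for_a = properties_for_class("A", properties, parents)
for_a1 = properties_for_class("A1", properties, parents)
```

Here both A and its subclass A1 get the property P with the ranges C and D, which matches the behaviour described above.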

Major features in the model that have not been validated yet

  1. Instances
  2. Subproperties

Major features that may already be in the API or the UI, but that we have not validated yet

  1. Distinction between annotation properties and restrictions or properties that have a class in the domain (these are different kinds of properties on a class and will need to be returned/rendered differently)
  2. Domains and ranges of properties (in the case where there is one domain and one range)
  3. Property values (for datatype properties in particular)

Features that are missing from the model but that are easy to add

  • Distinction between necessary conditions and necessary-and-sufficient conditions for defined classes (an extra qualifier for the restriction)
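
A minimal Python sketch of such a qualifier is shown below. It relies on the usual OWL convention that a restriction asserted through rdfs:subClassOf is a necessary condition, while one asserted through owl:equivalentClass is necessary and sufficient. The Restriction record, its field names, and the qualifier values are hypothetical and are not taken from the LexGrid model.

```python
from dataclasses import dataclass

# Hypothetical qualifier values; not taken from the LexGrid model.
NECESSARY = "necessary"
NECESSARY_AND_SUFFICIENT = "necessary-and-sufficient"


@dataclass(frozen=True)
class Restriction:
    """A restriction on a class, tagged with the kind of condition it is."""
    on_property: str
    kind: str       # e.g. "someValuesFrom"
    filler: str
    condition: str  # NECESSARY or NECESSARY_AND_SUFFICIENT


def condition_for_axiom(axiom_predicate):
    """Derive the qualifier from the axiom that carries the restriction."""
    if axiom_predicate == "rdfs:subClassOf":
        return NECESSARY
    if axiom_predicate == "owl:equivalentClass":
        return NECESSARY_AND_SUFFICIENT
    raise ValueError(f"unsupported axiom predicate: {axiom_predicate!r}")


defining = Restriction(
    on_property="hasPart",
    kind="someValuesFrom",
    filler="X",
    condition=condition_for_axiom("owl:equivalentClass"),
)
```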