Annotator User Guide

From NCBO Wiki
Revision as of 16:42, 4 May 2009 by Cyoun

Service parameters

The OBA web service offers a set of parameters that let users customize the annotations according to their specific requirements. In particular, the annotation workflow can be restricted to a specific set of ontologies and a specific set of semantic types. In addition, the two steps of the annotation workflow can be parameterized.

The service level of the OBA web service (e.g., response time) depends on the selected components, as each consumes a different amount of resources. For example, the is_a transitive closure takes a long time to process, even with a pre-computed hierarchy table. Likewise, an annotation with wholeWordOnly=false takes significantly longer than one with wholeWordOnly=true.

The parameters and their possible values are listed below:

  • longestOnly={true, false} (default: false)

Specifies whether the concept recognition step (done with the University of Michigan Mgrep tool) must keep only the longest matches when several concepts match an expression.

For example, with longestOnly=true the phrase 'skin neoplasms' matches the concepts NCI/C0037286 (Skin Neoplasms) and NCI/C0027651 (Neoplasms). With longestOnly=false, the concept NCI/C1123023 (Skin) also matches.

The default setting is longestOnly=false.

  • wholeWordOnly={true, false} (default: true)

Specifies whether the concept recognition step must match whole words only when several concepts match a given word.

For example, with wholeWordOnly=true the phrase 'neoplasms' matches the concept NCI/C0027651 (Neoplasms) only. With wholeWordOnly=false, the concept NCI/C1551054 (S) and the concept NCI/C0242536 (ASM) also match (about 80 concepts in NCI).

The default setting is wholeWordOnly=true.

Note that the concept recognition step is case-insensitive.
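The difference between the two wholeWordOnly settings can be sketched with a toy matcher. This is only an illustration, not how Mgrep works internally: the three-entry dictionary and the function name match_concepts are assumptions, and only the concept IDs come from the example above.

```python
import re

# Toy dictionary (assumption): lower-cased term -> concept ID from the example above.
DICTIONARY = {
    "neoplasms": "NCI/C0027651",
    "s": "NCI/C1551054",
    "asm": "NCI/C0242536",
}


def match_concepts(text, whole_word_only=True):
    """Return the sorted concept IDs whose term occurs in text, ignoring case."""
    lowered = text.lower()
    found = set()
    for term, concept_id in DICTIONARY.items():
        if whole_word_only:
            # The term must be delimited by word boundaries.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found.add(concept_id)
        elif term in lowered:
            # Any substring occurrence counts.
            found.add(concept_id)
    return sorted(found)
```

With this toy dictionary, match_concepts("Neoplasms") returns only NCI/C0027651, while match_concepts("Neoplasms", whole_word_only=False) also returns the concepts for S and ASM.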

  • stopWords={stopWord1,...,stopWordN} (default: empty)

Specifies the list of stop words to use.

  • withDefaultStopWords={true, false} (default: false)

Specifies whether to use the default stop words. The default stop word list is available from the sample HTML page. If set to true, this overrides the value of stopWords given by the user.
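The interaction between stopWords and withDefaultStopWords can be sketched as follows. This is a minimal sketch under assumptions: the comma-separated format for stopWords, the function name effective_stop_words, and the contents of DEFAULT_STOP_WORDS are all hypothetical (the real default list is the one published on the sample HTML page).

```python
# Placeholder values only (assumption): the real default list is on the sample HTML page.
DEFAULT_STOP_WORDS = ["a", "of", "the"]


def effective_stop_words(stop_words="", with_default_stop_words=False):
    """Return the stop word list the service would apply under the rules above."""
    if with_default_stop_words:
        # The default list overrides any user-supplied stopWords value.
        return list(DEFAULT_STOP_WORDS)
    # Assumed format: comma-separated words; an empty value means no stop words.
    return [word for word in stop_words.split(",") if word]
```

For instance, effective_stop_words("foo,bar") yields the user list, whereas effective_stop_words("foo,bar", with_default_stop_words=True) ignores it in favour of the default list.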

  • scored={true, false} (default: true)

Specifies whether the annotations are scored. A score is a number assigned to an annotation that reflects its accuracy: the higher the score, the better the annotation. The scoring algorithm weights each annotation according to its context. For instance, an annotation obtained by matching a concept's preferred name receives a higher weight than one obtained by matching a concept synonym, or than one obtained from a level-3 parent in the is_a hierarchy. Details on the scoring algorithm are given in the section Scoring algorithm.

For example, the phrase 'melanoma' is annotated both with the concept NCI/C0025202 (melanoma) and the concept NCI/C1522102 (Mouse Melanoma). The former annotation is scored 10, whereas the latter is scored 8.

The default setting is scored=true.

  • ontologiesToExpand={localOntology1,...,localOntologyN} (default: all ontologies)

Specifies the list of ontologies to use for expansion in the annotation process. The list of ontologies that can be used is available in the sample HTML page. The values are separated by commas (without spaces).

For example, SNOMEDCT,NCI,13578,36625,MSH.

The default setting is to use all ontologies.

  • ontologiesToKeepInResult={localOntology1,...,localOntologyN} (default: all ontologies)

Specifies the list of ontologies to keep in the result of the annotation process. The list of ontologies that can be used is available in the sample HTML page. The values are separated by commas (without spaces).

For example, SNOMEDCT,NCI,MSH.

The default setting is to use all ontologies.

  • semanticTypes={semanticType1,...,semanticTypeN} (default: all semanticTypes)

Specifies the list of semantic types to use in the annotation process. The list of semantic types that can be used is available at the /obs/semanticTypes URL. Note that the restriction to semantic types is also applied during the semantic expansion steps.

For example, T047,T048,T191.

The default setting is to use all semantic types.

  • levelMax={integer} (default: 0)

Specifies the minimum (resp. maximum) level a parent concept must have to be considered in the is_a transitive closure expansion step. For example, an annotation done with levelMin=1 & levelMax=3 expands each direct annotation done with a concept up to the 3rd-level parent of this concept in the is_a hierarchy. An annotation done with levelMin=0 & levelMax=0 is equivalent to disabling the is_a transitive closure expansion step.
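The level bounds can be illustrated with a small traversal of a parent table. This is a sketch under stated assumptions, not the service's implementation: the IS_A table, its concept names, and the function expand_is_a are hypothetical, and the service itself is described as using a pre-computed hierarchy table rather than a traversal.

```python
# Hypothetical toy hierarchy (assumption): child -> list of direct is_a parents.
IS_A = {
    "melanoma": ["skin neoplasm"],
    "skin neoplasm": ["neoplasm"],
    "neoplasm": ["disease"],
    "disease": ["finding"],
}


def expand_is_a(concept, level_min, level_max):
    """Return {ancestor: level} for ancestors with level_min <= level <= level_max."""
    if level_max <= 0:
        # levelMax=0 disables the expansion step.
        return {}
    result = {}
    seen = {concept}
    frontier = [concept]
    for level in range(1, level_max + 1):
        next_frontier = []
        for node in frontier:
            for parent in IS_A.get(node, []):
                if parent in seen:
                    continue
                seen.add(parent)
                next_frontier.append(parent)
                if level >= level_min:
                    result[parent] = level
        frontier = next_frontier
    return result
```

In this toy hierarchy, expand_is_a("melanoma", 1, 3) reaches the parents at levels 1 to 3 and stops before "finding", and expand_is_a("melanoma", 0, 0) returns nothing.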

  • mappingTypes={null,mappingType1,...,mappingTypeN} (default: all mappingTypes)

Specifies the list of mapping types to use during the mapping expansion step. The list of mapping types that can be used is available at the /obs/mappingTypes URL. The current list is described in the section Mapping types.

For example, from-mrrel,Human.

Note that using the keyword null in the mappingTypes list disables the mapping expansion component. Note also that the mapping expansion is limited to the ontologies specified with the localOntologyIDs parameter.

The default setting is to use all mapping types.
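The three cases for mappingTypes (omitted, containing null, or an explicit list) can be told apart as in the sketch below. The function name parse_mapping_types and its return convention are assumptions made for illustration, as is treating an empty value like an omitted one.

```python
def parse_mapping_types(value=None):
    """Interpret a mappingTypes value.

    Returns None for "use all mapping types", an empty list when the
    mapping expansion is disabled, or the list of requested types.
    """
    if not value:
        # Parameter omitted (or empty, by assumption): default to all mapping types.
        return None
    types = value.split(",")
    if "null" in types:
        # The keyword null disables the mapping expansion component.
        return []
    return types
```

For example, parse_mapping_types("from-mrrel,Human") returns the two requested types, while parse_mapping_types("null") returns an empty list.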

  • textToAnnotate

The text to be annotated.

  • format={asXML,asText,asTabDelimited} (default: asXML)

Specifies the format of the result of the annotation.
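As an illustration of how these parameters combine, the sketch below encodes a set of them as a URL-encoded query string. It does not contact the service: the endpoint URL and HTTP method are not stated on this page, so they are left out, and the helper name build_annotator_query is hypothetical. Parameter names and value formats are the ones listed above.

```python
from urllib.parse import urlencode


def build_annotator_query(text, **options):
    """Encode textToAnnotate plus optional parameters as a query string."""
    params = {"textToAnnotate": text}
    for name, value in options.items():
        if isinstance(value, bool):
            # Boolean parameters are written as true/false.
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # List parameters are comma-separated, without spaces.
            value = ",".join(str(item) for item in value)
        params[name] = value
    return urlencode(params)


query = build_annotator_query(
    "skin neoplasms",
    longestOnly=True,
    ontologiesToExpand=["SNOMEDCT", "NCI"],
    levelMax=3,
    format="asXML",
)
```

The resulting string can be sent as the body or query of a request to the service; note that urlencode percent-encodes the commas in list values.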