Introduction

This document outlines the planned reaction ontology project. The first section gives an overview of several languages and syntaxes that are suitable for creating an ontology, together with the advantages and disadvantages of each. The second section describes the existing editors and APIs for creating or manipulating ontologies. The third section gives an overview of existing reaction ontologies, and a final section covers BioPAX and SBML.

Overview of several Ontology formats

                      | OWL                            | OBO                          | DAML+OIL                 | RDF(S)
Use                   | widely used                    | widely used                  | used, but now deprecated | widely used as part of OWL
Readability by humans | medium                         | good                         | medium                   | medium
Documentation         | very good                      | medium                       | good                     | good
Expressiveness        | very good                      | good                         | medium                   | poor
Graphs                | allows inverse relationships   | allows inverse relationships | DAGs                     | hierarchical
Editors               | COBrA, Protege, SWOOP          | AmiGO, OBO-Edit              | Protege                  | COBrA, Protege, SWOOP
APIs                  | OWL-API, Jena API, Protege API | (Protege API)                | -                        | Jena API, Protege API

OWL

OWL is the official successor of DAML+OIL and can be represented in RDF and XML syntax. It is available in three different versions, OWL Lite, OWL DL and OWL Full, in order of increasing expressiveness. OWL Full is probably the most powerful language for describing ontologies at the moment. Depending on the OWL version, it offers more ways to describe properties and classes than DAML+OIL or RDFS, for example cardinalities, more types and characteristics of properties, (in)equality between classes, and enumerated classes. Schema data and instance data are strictly separated within an OWL ontology. An OWL document can also import existing ontologies and reuse the classes or properties defined in them.
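As a minimal sketch of what such class definitions look like in the RDF/XML syntax, the following Python snippet (standard library only) writes three classes with a subclass relation and a disjointness statement. The namespace http://example.org/reaction# and the class names are hypothetical and not taken from any existing ontology; only the rdf:, rdfs: and owl: vocabulary terms are standard.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/reaction#"  # hypothetical namespace, for illustration only

# Make the serializer emit the usual rdf:, rdfs: and owl: prefixes.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def owl_class(root, name, parent=None, disjoint_with=None):
    """Append an owl:Class, optionally with a superclass and a disjoint class."""
    cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": BASE + name})
    if parent:
        ET.SubElement(cls, f"{{{RDFS}}}subClassOf",
                      {f"{{{RDF}}}resource": BASE + parent})
    if disjoint_with:
        ET.SubElement(cls, f"{{{OWL}}}disjointWith",
                      {f"{{{RDF}}}resource": BASE + disjoint_with})
    return cls


def build_ontology():
    """Build an rdf:RDF tree holding an ontology header and three classes."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": BASE.rstrip("#")})
    # Hypothetical class names chosen to match the topic of this document.
    owl_class(root, "Reaction")
    owl_class(root, "EnzymaticReaction", parent="Reaction")
    owl_class(root, "NonEnzymaticReaction", parent="Reaction",
              disjoint_with="EnzymaticReaction")
    return root


owl_xml = ET.tostring(build_ontology(), encoding="unicode")
print(owl_xml)
```

The disjointness statement is the kind of class-level constraint that plain RDFS cannot express.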

To illustrate the basic statements of OWL, consider the following simple model:

Man_Woman_Ontology

This model can be seen in OWL syntax.

A small collection of further OWL statements can be found here.

Organisations which use the OWL format for biological ontologies are among others:

OBO

OBO describes unrooted and possibly cyclic directed graphs. The concepts it models are a subset of those modeled in OWL; for example, you can't define as many property restrictions as in OWL. However, OBO has several extensions, such as references or synonyms, which can't be modeled in OWL. An ontology written in the OBO format can also be integrated into the Ontology Lookup Service (OLS). Furthermore it is:

  • easy for humans to read and understand
  • easy to parse
  • extensible with regard to further relationships
  • minimal with regard to redundancy within the document
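How little is needed to parse the format can be shown with a short sketch in Python (standard library only). The two term stanzas and their EX: identifiers are made up for illustration, and the parser is deliberately simplified: it only collects [Term] stanzas and does not handle escapes, trailing modifiers or a "!" inside quoted strings.

```python
from collections import defaultdict

# Hypothetical two-term ontology; the EX: identifiers are invented.
OBO_TEXT = """\
format-version: 1.2

[Term]
id: EX:0000001
name: reaction
synonym: "chemical reaction" EXACT []

[Term]
id: EX:0000002
name: enzymatic reaction
is_a: EX:0000001 ! reaction
"""


def parse_obo_terms(text):
    """Return one dict per [Term] stanza, mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = defaultdict(list) if line == "[Term]" else None
            if current is not None:
                terms.append(current)
            continue
        if current is None:
            continue  # header lines and stanzas of other types are skipped
        tag, _, value = line.partition(":")
        # Drop the trailing "! comment" that OBO allows after a value.
        value = value.split(" ! ")[0].strip()
        current[tag.strip()].append(value)
    return terms


terms = parse_obo_terms(OBO_TEXT)
for term in terms:
    print(term["id"][0], term["name"][0], term.get("is_a", []))
```

Each tag maps to a list because tags such as is_a and synonym may occur several times in one stanza.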

The Man_Woman_Ontology example from the OWL section can also be seen in OBO syntax. Organisations which use the OBO format for biological ontologies are among others:

DAML+OIL

DAML+OIL is the predecessor of OWL and also uses the RDF syntax. It offers somewhat more than RDFS: for example, it is more expressive with regard to defining classes and allows computers to draw simple conclusions from existing knowledge. Nevertheless, it is not as powerful as OWL or OBO. An example ontology written in DAML can be seen here:

DAML+OIL Example

RDFS

RDFS is the simplest of the four languages and also uses the RDF syntax. It was originally developed to provide metadata on the World Wide Web. It allows the user to describe simple domain ontologies by using the class-subclass concept and to name resource types and relationships.
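The class-subclass concept already supports a simple form of inference, because rdfs:subClassOf is transitive. The following Python sketch derives all superclasses of a class from the directly stated links; the class names and the dictionary representation are assumptions made for illustration and do not come from an existing ontology or RDF library.

```python
def superclasses(cls, sub_class_of):
    """Return every direct or inferred superclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        for parent in sub_class_of.get(stack.pop(), ()):
            if parent not in found:  # also guards against cyclic definitions
                found.add(parent)
                stack.append(parent)
    return found


# Hypothetical hierarchy, stated as direct rdfs:subClassOf links only.
SUB_CLASS_OF = {
    "RedoxReaction": {"EnzymaticReaction"},
    "EnzymaticReaction": {"Reaction"},
    "Reaction": {"Process"},
}

inferred = superclasses("RedoxReaction", SUB_CLASS_OF)
print(sorted(inferred))
```

Only the link to EnzymaticReaction is stated for RedoxReaction; the other two superclasses follow from transitivity.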

Available Editors and APIs

Editor specifications

COBrA is an ontology editor for editing and browsing ontologies written in the GO flat file format (which is now deprecated), OWL or RDF format. Ontologies written in the OBO format can't be loaded. It supports mapping of two different ontologies by linking related terms. It is therefore more suitable for comparing two ontologies than for creating a new one. It is available for Linux, Windows and OS X and requires Java 1.4.

The Protege platform supports creating and manipulating ontologies written in the OWL or RDF format. It has many functions that help the developer to create new ontologies, for instance creating multiple subclasses for existing superclasses, checking the consistency of the ontology or generating Java code with its API (see below). It also has an ontology graph visualization tool, and it is possible to develop plugins for it. Protege is available for all common operating systems, including Windows, Linux and OS X.

The ontology editor SWOOP also supports the OWL and RDF formats. One advantage of this tool is the built-in address bar, which allows the user to load ontologies from the web for mapping or comparison with other ontologies. At the moment there is no function to visualize the ontology as a graph. It is available for Windows; no information is available about other operating systems.

RDF Gravity and OGraph are tools for visualizing, but not for creating or editing new ontologies.

A tool which only supports the OBO format is OBO-Edit, formerly known as DAG-Edit. It offers the usual features for creating and editing ontologies as well as searching for terms in an ontology. It is available for the usual operating systems, such as Windows, Linux, Unix and OS X.

AmiGO was developed specifically to browse and visualize ontologies from the Gene Ontology (GO) database. Additionally, it is possible to query the available ontologies by searching for specific terms.

API specifications

OWL API is an open-source (CVS) interface written in Java that provides programmatic access to OWL ontologies. Concretely, you can build an OWL document from an external source as well as manipulate an existing OWL document. It can also be used to check whether an OWL document is written in correct OWL syntax and which OWL version it uses (OWL Lite, OWL DL or OWL Full). However, it is still in alpha stage.

The Jena API is an open-source (CVS) Java programming environment with APIs for OWL and RDF, coming from the Jena framework. It provides methods to create and manipulate OWL and RDF files or to run queries on these files. It also gives access to an inference engine.

Another open-source Java API for ontologies is the Protege API, which is part of the Protege project (see above). You can build Java interfaces, schema classes and Protege OWL Java code automatically. Bong is a plugin for Protege which supports the import of OBO ontologies into Protege and converts them into OWL DL or OWL Lite.

Existing Reaction Ontologies

Systems Biology Ontology

The Systems Biology Ontology was created by the group of Nicolas Le Novère at the EBI and is available in the OBO format. The main focus of this ontology is to describe the kinetics of biochemical reactions, such as the Michaelis-Menten equation or the Hill equation. It lists several kinds of reactions, like "binding" or "redox reaction", but only abstractly and not in sufficient detail. Besides, it doesn't distinguish between, for example, enzymatic and non-enzymatic or reversible and irreversible reactions.
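For reference, the two rate laws named above can be written as short Python functions. This is a sketch of the textbook forms of the equations; the parameter names are chosen here for illustration and are not terms of the ontology.

```python
def michaelis_menten(s, v_max, k_m):
    """Reaction rate v = v_max * s / (k_m + s) for substrate concentration s."""
    return v_max * s / (k_m + s)


def hill(s, v_max, k_half, n):
    """Reaction rate v = v_max * s**n / (k_half**n + s**n); n is the Hill coefficient."""
    return v_max * s**n / (k_half**n + s**n)


# At s == k_m the Michaelis-Menten rate is half of v_max.
print(michaelis_menten(s=2.0, v_max=10.0, k_m=2.0))
# With n == 1 the Hill equation reduces to the Michaelis-Menten form.
print(hill(s=2.0, v_max=10.0, k_half=2.0, n=1))
```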

SBO_Ontology

BioCyc Ontology

BioCyc is a collection of more than 200 pathway/genome databases. The BioCyc reaction ontology orders the reactions according to the EC hierarchy of enzymes, which means that only enzymatic reactions are considered. Furthermore, it doesn't distinguish between overall reactions and reaction steps, and it doesn't capture the order of single reaction steps. There is another BioCyc ontology, the pathway ontology, which is a classification hierarchy of metabolic pathways. The pathways are classified by their biological function and by the classes of metabolites they produce.
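What ordering reactions by the EC hierarchy amounts to can be sketched in a few lines of Python: an EC number has four dot-separated fields, and each prefix names a more general class. The function names and the two reaction labels are hypothetical; the EC numbers themselves are real (1.1.1.1 is alcohol dehydrogenase, 3.1.1.7 is acetylcholinesterase). This is not how BioCyc itself is implemented.

```python
def ec_ancestors(ec_number):
    """Return the chain of EC classes above ec_number, most general first."""
    parts = ec_number.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec_number!r}")
    return [".".join(parts[:depth]) for depth in range(1, 4)]


def group_by_ec_class(reactions):
    """Group reaction labels by the top-level EC class of the catalysing enzyme."""
    groups = {}
    for label, ec_number in reactions:
        groups.setdefault(ec_ancestors(ec_number)[0], []).append(label)
    return groups


# Hypothetical reaction labels paired with the EC number of their enzyme.
REACTIONS = [
    ("ethanol oxidation", "1.1.1.1"),
    ("acetylcholine hydrolysis", "3.1.1.7"),
]

print(ec_ancestors("1.1.1.1"))
print(group_by_ec_class(REACTIONS))
```

A non-enzymatic reaction has no EC number and therefore no place in such a hierarchy, which is the limitation noted above.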

BioCyc_Ontology

KEGG Ontology

KEGG Pathway Modules classifies enzymatic reactions and the corresponding enzymes on the basis of the biochemical process, for example metabolism, protein biosynthesis or lipid biosynthesis. The reactions are given unique numbers that are independent of the enzyme numbers. Each reaction is mapped to the corresponding pathway or to another reaction, for example a two-step reaction, but the order of the reaction steps isn't considered. Furthermore, all listed reactions are reversible reactions.

Kegg_Ontology

Physico-chemical process Ontology

The Physico-chemical process Ontology is available in the OBO format and considers biochemical reactions as a subset of other chemical reactions and physico-chemical processes. Biochemical reactions are arranged according to their kind of reaction, for instance biological electron transfer reaction, biomacromolecule-catalysed reaction or biotransformation reaction. Nevertheless, only abstract reactions are listed and no concrete ones. Furthermore, it doesn't use reversible/irreversible or enzymatic/non-enzymatic as a classification criterion. However, microscopic processes such as electron transfer or subatomic processes are also listed (not visible in the screenshot).

Rex_Ontology

Gene Ontology

Biochemical reactions are modeled in the Gene Ontology in two ways. On the one hand, they are ordered as abstract kinds of metabolism, like biosynthesis, catabolism or secondary metabolism. There is no distinction between reversible and irreversible or enzymatic and non-enzymatic reactions. A listing and an ordering of reaction steps is also missing. On the other hand, reactions are described as subentities of molecular functions, like binding or catalytic activity (not visible in the screenshot).

Gene_Ontology

Event Ontology

The Event Ontology is available in the OBO format and lists metabolic reactions as a kind of molecular event. The classification essentially follows the EC hierarchy, although the order differs. The main focus is the cellular compartment in which the reaction takes place; for example, there are entries for hydrolysis in the cytosol, hydrolysis in the nucleus, hydrolysis in the plasma membrane, etc. A distinction between enzymatic and non-enzymatic reactions is missing, as is a distinction between reversible and irreversible reactions.

EventOntology

BioPAX and SBML

BioPAX

BioPAX is a file format for the exchange of biological pathway data. BioPAX Level 1 was created to represent metabolic pathways. BioPAX Level 2 provides some additional possibilities to represent data, like molecular binding interactions, post-translational modifications or hierarchical pathways. The BioPAX Level 2 ontology is written in the OWL format and is the abstract representation of biological pathways, i.e. their concepts and the relationships between them. The BioPAX file format is the implementation of the BioPAX ontology and defines the syntax of the data representation. The Jena framework provides a Java API for creating BioPAX files.

SBML

The Systems Biology Markup Language (SBML) is an XML-based description language for modeling biochemical reaction networks. It also covers the representation of cell-signaling pathways and regulatory networks. However, its main focus is more on systems biology and kinetics than on reaction details. Unlike BioPAX, it doesn't provide, for example, Xrefs to other databases or synonyms for reaction participants.
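To give an impression of how a reaction appears in SBML, the following Python sketch (standard library only) reads the reactions from a small hand-written SBML Level 2 document. The model content (species S and P, reaction r1) is invented for illustration, and the helper read_reactions is not part of any SBML library; a real project would more likely use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

SBML_NS = {"sbml": "http://www.sbml.org/sbml/level2"}

# Hypothetical one-reaction model: S is converted irreversibly into P.
SBML_TEXT = """\
<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="example">
    <listOfCompartments>
      <compartment id="cytosol"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="S" compartment="cytosol"/>
      <species id="P" compartment="cytosol"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1" reversible="false">
        <listOfReactants><speciesReference species="S"/></listOfReactants>
        <listOfProducts><speciesReference species="P"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def species_ids(reaction, list_name):
    """Collect the species ids referenced in one participant list of a reaction."""
    path = f"sbml:{list_name}/sbml:speciesReference"
    return [ref.get("species") for ref in reaction.iterfind(path, SBML_NS)]


def read_reactions(sbml_text):
    """Return one dict per reaction with its id, reversibility and participants."""
    root = ET.fromstring(sbml_text)
    result = []
    for reaction in root.iterfind(".//sbml:reaction", SBML_NS):
        result.append({
            "id": reaction.get("id"),
            # SBML Level 2 treats a missing 'reversible' attribute as true.
            "reversible": reaction.get("reversible", "true") == "true",
            "reactants": species_ids(reaction, "listOfReactants"),
            "products": species_ids(reaction, "listOfProducts"),
        })
    return result


reactions = read_reactions(SBML_TEXT)
print(reactions)
```

The document carries reversibility and participants but no cross-references or synonyms for S and P.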