OntoClean

OntoClean is a methodology, developed by Nicola Guarino and Chris Welty, for analyzing ontologies based on formal, domain-independent properties of classes (the metaproperties).

Overview and History
OntoClean was the first attempt to formalize notions of ontological analysis for information systems. The idea was to justify the kinds of decisions that experienced ontology builders make, and explain the common mistakes of the inexperienced. Alan Rector, during a debate at the KR-2002 conference in Toulouse, said, "What you have done is reduce the amount of time I spend arguing with medics."

The notions Guarino & Welty focused on were drawn from philosophical ontology. They were not after the seemingly endless arguments about what the right ontology of the universe is, but rather the techniques these philosophers use to analyze, support, and criticize each other's arguments. These techniques make very little, if any, commitment to a particular ontology; instead, they expose what are often very subtle distinctions.

The ideas underlying OntoClean first appeared in the literature in a series of three papers published in 2000. The name OntoClean does not appear in the literature until 2002. According to Thomson-ISI, work on OntoClean was the most cited of academic papers on ontology. OntoClean was important as the first formal methodology for ontology engineering, applying scientific principles to a field whose practice was mostly art.

Note on terminology
In logic, a property is a unary predicate in intension; in other words, a property is what it means to be a member of a class. For example, we say that instances of the Person class have the property of "being a person." In the semantic web, by contrast, a property is a binary relation.

The distinction between property and class is subtle and probably not critical to understanding OntoClean. However, this article follows the OntoClean publications and consistently uses "property" according to its original meaning; one can treat "property" and "class" as synonymous. Thus a metaproperty is a property of a property or class.

Metaproperties
The basis of OntoClean is a set of domain-independent properties of classes, the OntoClean metaproperties: identity, unity, rigidity, and dependence. Later work by Welty & Andersen added two more metaproperties: permanence and actuality.

Identity
Identity is fundamental to ontology, and especially to information systems ontologies. Identity is well known in metaphysics and in database conceptual modeling. In the latter case, it is an accepted best practice to specify a primary key for rows in a table. If "two" rows have identical primary keys, they are considered the same row.

More important for ontology are questions of identity that expose the existence of, or at least the need to represent, other entities. Here the issue at stake is finding the conditions under which a proposed entity would be both the same and different. The classic example is an amount of clay that is shaped into a statue. If you use the same clay but reshape it into a different statue, is it the same entity? If so, how could it be different? If not, how could it be the same? In conceptual modeling, it is understood that when such an ambiguity arises, one should treat it as two different entities to account for a situation where one changes and the other stays the same.

In OntoClean, identity criteria are associated with, or carried by, some classes of entities, called sortals. A sortal is a class all of whose instances are identified in the same way. In information systems, these criteria are often extrinsic, like a social security number or universally unique id, which is not interesting from an ontological point of view. Identity criteria should be informative: they should help us and others understand what a class means. A triangle, for example, can be identified by the length of its three sides, or by two sides and an interior angle, etc. This says a lot about what is intended by the triangle class here, e.g. the same triangle could be in many places at the same time. Someone else may have an ontology in which the triangle class has different identity criteria, such that different drawings are always different triangles, even if they are the same size. Identity criteria (and OntoClean, for that matter) do not tell you that one of these definitions of triangle is right or wrong, just that they are different and thus that the classes are different.
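The two readings of the triangle class can be sketched in code. The following Python fragment is only an illustration and is not taken from the OntoClean publications; the class names, the identity_key method, and the same_entity helper are all hypothetical. It shows how two classes with the same data can carry different identity criteria and therefore give different answers to "is this the same triangle?".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SideLengthTriangle:
    """Hypothetical class: a triangle identified by its three side lengths."""
    a: float
    b: float
    c: float

    def identity_key(self):
        # The order in which the sides are listed is irrelevant to identity.
        return tuple(sorted((self.a, self.b, self.c)))


@dataclass(frozen=True)
class DrawnTriangle:
    """Hypothetical class: a triangle identified by the drawing it appears in."""
    drawing_id: str
    a: float
    b: float
    c: float

    def identity_key(self):
        # Each drawing is its own triangle, whatever its size.
        return self.drawing_id


def same_entity(x, y):
    """Same entity iff both belong to the same class and that class's
    identity criterion yields the same key for both."""
    return type(x) is type(y) and x.identity_key() == y.identity_key()


t1 = SideLengthTriangle(3, 4, 5)
t2 = SideLengthTriangle(5, 3, 4)
d1 = DrawnTriangle("page-1", 3, 4, 5)
d2 = DrawnTriangle("page-2", 3, 4, 5)

print(same_entity(t1, t2))  # True: same side lengths, so the same triangle
print(same_entity(d1, d2))  # False: same size, but different drawings
```

Neither class is the correct one; the sketch only makes explicit that they mean different things by "triangle".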

Identity criteria and sortals are intuitively meant to account for the linguistic habit of associating identity with certain classes. In the classical statue and clay example, we naturally say "the same clay" or "the same statue", indicating that there are identity criteria that are peculiar to each class.

Being a sortal is the first OntoClean metaproperty, indicated with the +I superscript (−I for non-sortals) on a class in the original notation. +I (but not −I) is inherited down the class hierarchy: if a class is a sortal, then all its subclasses are as well.

Unity
There are certain properties that only hold of individuals that are wholes. In formal ontology, wholes are often distinguished from mere sums, which are individuals whose boundaries are, in a sense, arbitrary. For example, consider the class clay. An instance of this class might be some amount of the material (this is only one possible meaning, of course), such that any (in fact, every) arbitrary subsection of the amount would be a different instance of the same class. By contrast, instances of the class Person are, typically, not decomposable in this fashion.

For the purposes of OntoClean, wholes are individuals all of whose parts are related to each other, and only to each other, by some distinguished relation. This relation can be viewed as a generalized connection relation. Mere sums have no such relation since any decomposition of a mere sum is connected to any larger sum, which is not one of its parts, by the same relation.

Unity is the metaproperty, indicated by +U, of classes all of whose individuals are wholes under the same relation. As with identity, OntoClean does not require that the relation itself be specified; often it is enough to know that the relation exists. Intuitively, a class has unity if all its instances are the same type of whole; this is typically true of classes of natural objects. Non-unity, indicated by −U, is the metaproperty of classes whose instances are not all wholes, or not all wholes by the same relation. A further and more useful refinement of non-unity is anti-unity, indicated by ~U, the metaproperty of classes none of whose instances are wholes, such as classes of mere sums. +U and ~U (but not −U) are inherited down the class hierarchy.

Rigidity
Leibniz's law, which says that identical entities share all their properties, makes good sense when first considered. However, it doesn't take long to see how considerations of time cause problems between most ontologies (especially semantic web ontologies) and Leibniz's law. For example, I might have a beard on one day and shave it off the next, yet I am the same entity at both times. How is it possible for me to be the same if I have changed?

There are many logical approaches to this classic dilemma (including simply ignoring it). The most common is to consider some properties to be essential: an essential property of an entity (and, as noted under terminology above, properties are unary predicates) is a property that cannot change, and these are the properties for which Leibniz's law holds. Other properties of an entity, which can change, are non-essential and cannot be involved in identity.

Some properties are essential to all their instances. Think of the property of being a person, usually represented by the class Person. For every entity that has this property, the property is essential. So at least one of the properties that has not changed about me when I shave my beard is that I am a person. Properties of this kind, which are essential to all their instances, are called rigid properties.

Rigid properties are designated by +R, and properties that are not rigid by −R. An important specialization of non-rigid properties is the anti-rigid properties (~R), which are properties that must be changeable. Think of being a student: all students must possibly not be students. ~R (but not −R or +R) is inherited down the class hierarchy.
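The three cases can be made concrete with a toy model. The Python sketch below is an illustration only and is not part of OntoClean: the function classify_rigidity and the data are hypothetical, and it rests on two strong assumptions, namely that the listed situations are all the situations that are possible, and that every entity exists in every situation. Under those assumptions, a property is essential to an entity exactly when the entity has it in every situation. The code writes −R with an ASCII hyphen as "-R".

```python
def classify_rigidity(prop, situations):
    """Classify `prop` as "+R", "~R", or "-R" in a toy model.

    `situations` is a list of dicts, each mapping a property name to the
    set of entities that have that property in that situation.
    Assumption: these situations are all the possible ones.
    """
    # Every entity that has the property in at least one situation.
    instances = set()
    for s in situations:
        instances |= s.get(prop, set())

    # Instances for which the property is essential (held in every situation).
    always = {e for e in instances
              if all(e in s.get(prop, set()) for s in situations)}

    if always == instances:
        return "+R"  # essential to all instances (vacuously so if there are none)
    if not always:
        return "~R"  # essential to no instance
    return "-R"      # essential to some instances but not to others


situations = [
    {"Person": {"ann", "bob"}, "Student": {"ann"}},
    {"Person": {"ann", "bob"}, "Student": {"bob"}},
    {"Person": {"ann", "bob"}, "Student": set()},
]

print(classify_rigidity("Person", situations))   # +R
print(classify_rigidity("Student", situations))  # ~R
```

In practice rigidity is a modeling decision about what is possible, not something read off data; observed situations can at most show that a property assumed to be rigid is not.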

Note that these are just examples — it is certainly possible to have an ontology in which Person is anti-rigid. Imagine an ontology of mystical beliefs, for example, in which an entity changes from Person to Spirit upon death. In order for the individual to be the same across this change, being a person must not be essential and furthermore must be changeable (i.e. anti-rigid).

Rigidity should not be confused with Kripke's notion of Rigid Designators, which are particulars. The term rigid in OntoClean is meant to describe the instanceOf link between an individual and a rigid class — it cannot be broken.

Dependence
Dependence is a varied notion. In the core OntoClean papers, Guarino & Welty used a kind of dependence that captures a metaproperty of certain relational roles. A property is dependent if each instance of it implies the existence of another entity. The property Student, for example, is dependent, since to be a student there must be a teacher; for every instance of Student there is at least one instance of Teacher. In later work for [Dolce], this was noted to subsume two kinds of property dependence: specific constant dependence and generic constant dependence. The former accounts for dependence on specific entities, e.g. each person is dependent on having a particular brain. The latter accounts for the Student/Teacher case, where any instance of Teacher will do.

There are many other kinds of dependence; see [Fine and Smith, 1983] and especially [Simons, 1987]. Adapting them to the OntoClean framework is an open problem.

Being dependent is indicated with +D, being independent with −D. +D (but not −D) is inherited down the class hierarchy.
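The inheritance rules stated for the four metaproperties (+I, +U, ~U, ~R, and +D are inherited; the other values are not) can be checked mechanically. The following Python sketch is a minimal illustration under stated assumptions, not an implementation from the OntoClean literature: the function inheritance_violations, the tag strings (with "-" standing for −), and the example taggings are hypothetical, and each class is assumed to have at most one direct superclass and at most one value per metaproperty.

```python
# Metaproperty values that the text describes as inherited by subclasses.
INHERITED = {"+I", "+U", "~U", "~R", "+D"}


def inheritance_violations(tags, parent_of):
    """Return (subclass, superclass, inherited_tag, conflicting_tag) tuples.

    tags:      dict mapping a class name to its set of metaproperty tags
    parent_of: dict mapping a class name to its direct superclass
               (assumption: single inheritance)
    """
    violations = []
    for cls in tags:
        ancestor = parent_of.get(cls)
        seen = set()  # guards against cycles in parent_of
        while ancestor is not None and ancestor not in seen:
            seen.add(ancestor)
            for tag in sorted(tags.get(ancestor, set()) & INHERITED):
                family = tag[-1]  # "I", "U", "R", or "D"
                actual = next((t for t in tags[cls] if t[-1] == family), None)
                # An untagged metaproperty is not reported as a conflict.
                if actual is not None and actual != tag:
                    violations.append((cls, ancestor, tag, actual))
            ancestor = parent_of.get(ancestor)
    return violations


tags = {
    "Person": {"+I", "+U", "+R", "-D"},
    "Student": {"+I", "+U", "~R", "+D"},
}

# Student as a subclass of Person is consistent with the rules.
print(inheritance_violations(tags, {"Student": "Person"}))  # []

# Person as a subclass of Student conflicts with the inherited +D and ~R.
print(inheritance_violations(tags, {"Person": "Student"}))
# [('Person', 'Student', '+D', '-D'), ('Person', 'Student', '~R', '+R')]
```

The check is asymmetric: because only some values are inherited, the direction of a subclass link determines whether a given pair of taggings is consistent.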