Expressive power (computer science)

In computer science, the expressive power (also called expressiveness or expressivity) of a language is the breadth of ideas that can be represented and communicated in that language. The more expressive a language is, the greater the variety and quantity of ideas it can be used to represent.

For example, the Web Ontology Language expression language profile (OWL2 EL) lacks ideas (such as negation) that can be expressed in OWL2 RL (rule language). OWL2 EL may therefore be said to have less expressive power than OWL2 RL. These restrictions allow for more efficient (polynomial time) reasoning in OWL2 EL than in OWL2 RL. So OWL2 EL trades some expressive power for more efficient reasoning (processing of the knowledge representation language).

Information description
The term expressive power may be used with a range of meaning. It may mean a measure of the ideas expressible in that language:
 * regardless of ease (theoretical expressivity)
 * concisely and readily (practical expressivity)

The first sense dominates in areas of mathematics and logic that deal with the formal description of languages and their meaning, such as formal language theory, mathematical logic and process algebra.

In informal discussions, the term often refers to the second sense, or to both. This is often the case when discussing programming languages. Efforts have been made to formalize these informal uses of the term.

The notion of expressive power is always relative to a particular kind of thing that the language in question can describe, and the term is normally used when comparing languages that describe the same kind of things, or at least comparable kinds of things.

The design of languages and formalisms involves a trade-off between expressive power and analyzability. The more a formalism can express, the harder it becomes to understand what instances of the formalism say. Decision problems become harder to answer or completely undecidable.

In formal language theory
Formal language theory mostly studies formalisms to describe sets of strings, such as context-free grammars and regular expressions. Each instance of a formalism, e.g. each grammar and each regular expression, describes a particular set of strings. In this context, the expressive power of a formalism is the set of sets of strings its instances describe, and comparing expressive power is a matter of comparing these sets.

An important yardstick for describing the relative expressive power of formalisms in this area is the Chomsky hierarchy. It says, for instance, that regular expressions, nondeterministic finite automata and regular grammars have equal expressive power, while that of context-free grammars is greater; what this means is that the sets of sets of strings described by the first three formalisms are equal, and a proper subset of the set of sets of strings described by context-free grammars.

In this area, the cost of expressive power is a central topic of study. It is known, for instance, that deciding whether two arbitrary regular expressions describe the same set of strings is hard, while doing the same for arbitrary context-free grammars is completely impossible. However, it can still be efficiently decided whether any given string is in the set.

For more expressive formalisms, this problem can be harder, or even undecidable. For a Turing complete formalism, such as arbitrary formal grammars, not only this problem, but every nontrivial property regarding the set of strings they describe is undecidable, a fact known as Rice's Theorem.

There are some results on conciseness as well; for instance, nondeterministic finite automata and regular grammars are more concise than regular expressions, in the sense that the latter can be translated to the former without a blowup in size (i.e. in O(1)), while the reverse is not possible.

Similar considerations apply to formalisms that describe not sets of strings, but sets of trees (e.g. XML schema languages), of graphs, or other structures.

In database theory
Database theory is concerned, among other things, with database queries, e.g. formulas that, given the contents of a database, specify certain information to be extracted from it. In the predominant relational database paradigm, the contents of a database are described as a finite set of finite mathematical relations; Boolean queries, that always yield true or false, are formulated in first-order logic.

It turns out that first-order logic is lacking in expressive power: it cannot express certain types of Boolean queries, e.g. queries involving transitive closure. However, adding expressive power must be done with care: it must still remain possible to evaluate queries with reasonable efficiency, which is not the case, e.g., for second-order logic. Consequently, a literature sprang up in which many query languages and language constructs were compared on the basis of expressive power and efficiency, e.g. various versions of Datalog.

Similar considerations apply for query languages on other types of data, e.g. XML query languages such as XQuery.