Menzerath's law

Menzerath's law, or the Menzerath–Altmann law (named after Paul Menzerath and Gabriel Altmann), is a linguistic law stating that as the size of a linguistic construct increases, the size of its constituents decreases, and vice versa.

For example, the longer a sentence (measured in number of clauses), the shorter its clauses (measured in number of words); likewise, the longer a word (measured in syllables or morphs), the shorter its syllables or morphs (measured in sounds).

According to Altmann (1980), the law can be stated mathematically as:

$$y=a \cdot x^{b} \cdot e^{-c x}$$

where:


 * $$y$$ is the constituent size (e.g. syllable length)
 * $$x$$ is the size of the linguistic construct being inspected (e.g. number of syllables per word)
 * $$a$$, $$b$$, $$c$$ are the model parameters
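
As an illustration of how the formula above might be evaluated and fitted, the following is a minimal sketch in Python using NumPy. It assumes that taking logarithms, which gives log y = log a + b log x - c x, and applying ordinary least squares is an acceptable way to estimate the parameters. This is one common approach, not necessarily the procedure used by Altmann. The function names and the numeric values are hypothetical and chosen only for demonstration.

```python
import numpy as np

def menzerath_altmann(x, a, b, c):
    """Evaluate y = a * x**b * exp(-c * x) for construct sizes x > 0."""
    x = np.asarray(x, dtype=float)
    return a * x**b * np.exp(-c * x)

def fit_menzerath_altmann(x, y):
    """Estimate (a, b, c) by least squares on the log-transformed model
    log y = log a + b * log x - c * x. Requires x > 0 and y > 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns correspond to the coefficients log a, b, and c.
    design = np.column_stack([np.ones_like(x), np.log(x), -x])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    log_a, b, c = coef
    return float(np.exp(log_a)), float(b), float(c)

# Synthetic, noise-free illustration (invented values, not empirical data).
x = np.arange(1, 9)                       # e.g. syllables per word
y = menzerath_altmann(x, a=3.0, b=-0.3, c=0.05)
a_hat, b_hat, c_hat = fit_menzerath_altmann(x, y)
```

Because the synthetic data contain no noise, the fit recovers the generating parameters up to floating-point error. With real measurements, least squares on the log scale weights the errors differently from a direct nonlinear fit, so the two approaches can give somewhat different estimates.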

One explanation of the law assumes that linguistic segments carry information about their own structure in addition to the information to be communicated. Assuming further that the length of this structural information is independent of the length of the segment's other content yields an alternative formula, which has also been tested empirically with success.

Beyond quantitative linguistics, Menzerath's law can be considered in any multi-level complex system. Given three levels, let $$x$$ be the number of middle-level units contained in a high-level unit and $$y$$ the average number of low-level units contained in each middle-level unit; Menzerath's law then claims a negative correlation between $$y$$ and $$x$$. The law has been shown to hold for the base-exon-gene levels in the human genome and for the base-chromosome-genome levels in genomes from a collection of species. In addition, it has been shown to accurately predict the distribution of protein lengths, measured in number of amino acids, in the proteomes of ten organisms.
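
The three-level formulation can be sketched in a few lines of Python. The example below assumes a toy corpus in which each high-level unit (a sentence) is represented as a list of the sizes of its middle-level units (clause lengths in words). The data and the helper name `construct_and_constituent_sizes` are hypothetical, made up only to show how $$x$$, $$y$$, and their correlation might be computed.

```python
import numpy as np

# Hypothetical toy corpus: each sentence is a list of clause lengths in words.
sentences = [[12], [9, 8], [7, 6, 7], [5, 6, 4, 5], [11], [8, 7]]

def construct_and_constituent_sizes(units):
    """Return x (middle-level units per high-level unit) and
    y (mean low-level units per middle-level unit) for each high-level unit."""
    x = np.array([len(u) for u in units], dtype=float)
    y = np.array([np.mean(u) for u in units], dtype=float)
    return x, y

x, y = construct_and_constituent_sizes(sentences)

# Pearson correlation between construct size and mean constituent size;
# Menzerath's law predicts a negative value.
r = np.corrcoef(x, y)[0, 1]
```

The same computation applies unchanged to other three-level systems, for instance with genes as high-level units and exon lengths in bases as the inner lists.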