The Complexity of Songs

"The Complexity of Songs" is a scholarly article by computer scientist Donald Knuth, published in 1977 as an in-joke about computational complexity theory. The article capitalizes on what it argues is the tendency of popular songs to devolve from long and content-rich ballads to highly repetitive texts with little or no meaningful content. The article notes that a song of length N words may be produced by remembering only, e.g., O(log N) words (the "space complexity" of the song), or even fewer.

Article summary
Knuth writes that "our ancient ancestors invented the concept of refrain" to reduce the space complexity of songs, which becomes crucial when a large number of songs is to be committed to one's memory. Knuth's Lemma 1 states that if N is the length of a song, then the refrain decreases the song complexity to cN, where the factor $c < 1$.
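As a sketch of why a refrain yields such a factor, assume a song with $m$ verses of length $V$ each and a refrain of length $R > 0$ that is sung before the first verse and after every verse. The symbols $m$, $V$ and $R$ are introduced here for illustration and are not defined in the summary above. Only the verses and a single copy of the refrain have to be remembered, so the ratio of memory to song length tends to a constant below 1:

$$N = R + m(V+R), \qquad \text{memory} = mV + R, \qquad \frac{mV+R}{R+m(V+R)} \;\to\; \frac{V}{V+R} < 1 \quad (m \to \infty).$$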

Knuth further demonstrates a way of producing songs with $O(\sqrt{N})$ complexity, an approach "further improved by a Scottish farmer named O. MacDonald".
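The summary does not spell out the construction behind the square-root bound. One plausible way to obtain it, assumed here purely for illustration, is a cumulative song in which verse k recalls all k items introduced so far. With m items to remember, the song then contains m(m+1)/2 item mentions, so the memory needed grows roughly like the square root of twice the song length. The item list and function names below are hypothetical.

```python
import math


def cumulative_song(items: list[str]) -> list[str]:
    """Verse k recalls items k, k-1, ..., 1 (a hypothetical cumulative scheme)."""
    return [", ".join(reversed(items[:k])) for k in range(1, len(items) + 1)]


def total_mentions(verses: list[str]) -> int:
    """Song length N, measured in item mentions rather than words."""
    return sum(len(verse.split(", ")) for verse in verses)


animals = ["cow", "pig", "duck", "horse"]  # illustrative items only
verses = cumulative_song(animals)
n = total_mentions(verses)  # equals m(m+1)/2 for m items

# Compare the number of remembered items with sqrt(2N).
print(n, len(animals), math.sqrt(2 * n))
```

Counting item mentions instead of words keeps the arithmetic simple; a fixed amount of text around each item would only change the constant factor.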

More ingenious approaches yield songs of complexity $$O(\log N)$$, a class known as "m bottles of beer on the wall".
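A minimal sketch of the logarithmic idea, assuming a simplified verse template rather than Knuth's exact wording: the singer remembers one fixed template plus the starting number m. The song length grows linearly in m, while writing down m takes only about log m digits. The function name and the template text are assumptions, and singular/plural grammar is ignored.

```python
def bottles_song(m: int) -> str:
    """Generate m verses from a fixed template and the single number m."""
    verses = []
    for k in range(m, 0, -1):
        verses.append(
            f"{k} bottles of beer on the wall, {k} bottles of beer; "
            f"you take one down and pass it around, "
            f"{k - 1} bottles of beer on the wall."
        )
    return "\n".join(verses)


# Song length in words versus the number of digits needed to store m.
for m in (9, 99, 999):
    n_words = len(bottles_song(m).split())
    print(m, n_words, len(str(m)))
```

Each verse in this template has 26 words, so the word count is 26m, while the only variable part of the description is the decimal representation of m.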

Finally, the progress during the 20th century—stimulated by the fact that "the advent of modern drugs has led to demands for still less memory"—leads to the ultimate improvement: Arbitrarily long songs with space complexity $O(1)$ exist, e.g. a song defined by the recurrence relation


 * $$S_0=\epsilon, S_k = V_kS_{k-1},\, k\ge 1,$$
 * $V_k =$ 'That's the way,' $U$ 'I like it,' $U$, for all $k \ge 1$
 * $U =$ 'uh huh,' 'uh huh'
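The recurrence above can be unrolled directly. The following sketch builds $S_k$ by prepending the constant verse $V_k$ to $S_{k-1}$; the function names and the newline used to separate verses are assumptions made for readability. Because every $V_k$ is the same string, the description of the song does not grow with k.

```python
U = "uh huh, uh huh"


def verse(k: int) -> str:
    """V_k: identical for every k >= 1, so k is ignored."""
    return f"That's the way, {U} I like it, {U}"


def song(k: int) -> str:
    """S_0 is the empty string; S_k is V_k followed by S_{k-1}."""
    s = ""
    for i in range(1, k + 1):
        # Prepend the new verse; the newline separator is an illustrative choice.
        s = verse(i) + ("\n" + s if s else "")
    return s


print(song(3))
```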

Further developments
Prof. Kurt Eisemann of San Diego State University, in a letter to the Communications of the ACM, further improves the latter, seemingly unbeatable estimate. He begins with the observation that in practical applications the value of the "hidden constant" c in big O notation may make the difference between feasibility and infeasibility: for example, a constant value of $10^{80}$ would exceed the capacity of any known device. He further notes that a technique was already known in Mediaeval Europe whereby the textual content of an arbitrary tune can be recorded based on the recurrence relation $$S_k = C_2S_{k-1}$$, where $$C_2 = \texttt{'la'}$$, yielding a value of the big-O constant c equal to 2. However, it turns out that another culture achieved the absolute lower bound of O(0). As Prof. Eisemann puts it: "When the Mayflower voyagers first descended on these shores, the native Americans proud of their achievement in the theory of information storage and retrieval, at first welcomed the strangers with the complete silence. This was meant to convey their peak achievement in the complexity of songs, namely the demonstration that a limit as low as c = 0 is indeed obtainable."

It is then claimed that the Europeans were unprepared to grasp this notion, and the chiefs, in order to establish common ground for conveying their achievements, later proceeded to demonstrate an approach described by the recurrence relation $$S_k = C_1S_{k-1}$$, where $$C_1 = \texttt{'i'}$$, with a suboptimal complexity given by $c = 1$.
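The recurrences $S_k = C_2S_{k-1}$ and $S_k = C_1S_{k-1}$ differ only in the constant string that is repeated, and the constant c in each case is the length of that string. A small sketch, with a hypothetical function name and with silence modelled as the empty string:

```python
def chant(syllable: str, k: int) -> str:
    """Unroll S_k = C S_{k-1} with S_0 empty, where C is a fixed syllable."""
    s = ""
    for _ in range(k):
        s = syllable + s
    return s


# The constant c is taken here to be the number of characters to remember.
for syllable in ("la", "i", ""):
    print(repr(chant(syllable, 4)), len(syllable))
```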

The O(1) space complexity result was also implemented by Guy L. Steele, Jr., perhaps challenged by Knuth's article. Dr. Steele's TELNET Song used a completely different algorithm based on exponential recursion, a parody of some implementations of TELNET.

Darrah Chavey suggested that the complexity analysis of human songs can be a useful pedagogic device for teaching students complexity theory.

The article "On Superpolylogarithmic Subexponential Functions" by Prof. Alan Sherman states that Knuth's article was seminal for the analysis of a special class of functions.