Thomae's function



Thomae's function is a real-valued function of a real variable that can be defined as: $$f(x) = \begin{cases} \frac{1}{q} &\text{if }x = \tfrac{p}{q}\quad (x \text{ is rational), with } p \in \mathbb Z \text{ and } q \in \mathbb N \text{ coprime}\\ 0          &\text{if }x \text{ is irrational.} \end{cases}$$

It is named after Carl Johannes Thomae, but has many other names: the popcorn function, the raindrop function, the countable cloud function, the modified Dirichlet function, the ruler function, the Riemann function, or the Stars over Babylon (John Horton Conway's name). Thomae mentioned it as an example of an integrable function with infinitely many discontinuities in an early textbook on Riemann's notion of integration.

Since every rational number has a unique representation with coprime (also termed relatively prime) $$p \in \mathbb Z$$ and $$q \in \mathbb N$$, the function is well-defined. Note that $$q = +1$$ is the only number in $$\mathbb N$$ that is coprime to $$p = 0.$$
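The role of the lowest-terms representation can be illustrated with a minimal Python sketch. It assumes the input is an exact rational given as a `fractions.Fraction`, which always stores the coprime form with a positive denominator; irrational inputs, where the function is 0, cannot be represented this way and are not handled. The name `thomae` is chosen here for illustration only.

```python
from fractions import Fraction


def thomae(x: Fraction) -> Fraction:
    """Value of Thomae's function at a rational point x (illustrative helper)."""
    # Fraction normalises to lowest terms with a positive denominator,
    # so x.denominator is the q of the unique coprime representation p/q.
    return Fraction(1, x.denominator)


print(thomae(Fraction(2, 4)))   # 2/4 reduces to 1/2, so this prints 1/2
print(thomae(Fraction(0)))      # 0 is stored as 0/1, so this prints 1
print(thomae(Fraction(-3, 7)))  # prints 1/7
```

Because the reduction to lowest terms happens before the denominator is read, equal rationals written differently (such as 2/4 and 1/2) give the same value, which is the well-definedness property described above.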

It is a modification of the Dirichlet function, which is 1 at rational numbers and 0 elsewhere.

Related probability distributions
Empirical probability distributions related to Thomae's function appear in DNA sequencing. The human genome is diploid, having two strands per chromosome. When sequenced, small pieces ("reads") are generated: for each spot on the genome, an integer number of reads overlaps with it. Ratios of these read counts are rational numbers, and are typically distributed similarly to Thomae's function.

If pairs of positive integers $$n, m$$ are sampled from a distribution $$f(n,m)$$ and used to generate ratios $$q=n/(n+m)$$, this gives rise to a distribution $$g(q)$$ on the rational numbers. If the two integers are independent and identically distributed, so that $$f(n,m) = f(n)f(m)$$, the distribution can be viewed as a convolution over the rational numbers: for coprime $$a$$ and $$b$$, $$g(a/(a+b)) = \sum_{t=1}^\infty f(ta)f(tb)$$. Closed-form solutions exist for power-law distributions with a cut-off. If $$f(k) =k^{-\alpha} e^{-\beta k}/\mathrm{Li}_\alpha(e^{-\beta})$$ (where $$\mathrm{Li}_\alpha$$ is the polylogarithm function), then $$g(a/(a+b)) = (ab)^{-\alpha} \mathrm{Li}_{2\alpha}(e^{-(a+b)\beta})/\mathrm{Li}^2_{\alpha}(e^{-\beta})$$. In the case of the uniform distribution on the set $$\{1,2,\ldots, L\}$$, $$g(a/(a+b)) = (1/L^2) \lfloor L/\max(a,b) \rfloor$$, which is very similar to Thomae's function.
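The uniform case can be checked by brute force. The following Python sketch enumerates all pairs from $$\{1,\ldots,L\}$$, tallies the reduced ratio $$n/(n+m)$$, and compares each probability with the closed form above. The function names and the choice $$L = 12$$ are illustrative assumptions, not part of the source.

```python
from collections import Counter
from fractions import Fraction


def ratio_distribution(L: int) -> dict:
    """Exact distribution of n/(n+m) for n, m independent and uniform on 1..L."""
    counts = Counter()
    for n in range(1, L + 1):
        for m in range(1, L + 1):
            # Fraction reduces n/(n+m) to lowest terms, i.e. to a/(a+b)
            # with a and b coprime.
            counts[Fraction(n, n + m)] += 1
    return {q: Fraction(c, L * L) for q, c in counts.items()}


def closed_form(a: int, b: int, L: int) -> Fraction:
    """(1/L^2) * floor(L / max(a, b)) for coprime a, b."""
    return Fraction(L // max(a, b), L * L)


L = 12  # illustrative choice
dist = ratio_distribution(L)
for q, prob in dist.items():
    a = q.numerator
    b = q.denominator - q.numerator
    assert prob == closed_form(a, b, L)
print(len(dist), "distinct ratios checked against the closed form")
```

The check works because a pair $$(n, m)$$ reduces to $$a/(a+b)$$ exactly when $$n = ta$$ and $$m = tb$$ for some positive integer $$t$$, and both stay within $$\{1,\ldots,L\}$$ only while $$t \le L/\max(a,b)$$.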

Probability distributions related to Thomae's function can also be derived from recurrent processes generated by uniform discrete distributions, such as the digits of π, rolls of a fair die, or live casino spins. In greater detail, the recurrent process is characterized as follows. A random variable $$C_i$$ is repeatedly sampled $$N$$ times from a discrete uniform distribution, where $$i$$ ranges from 1 to $$N$$; for instance, consider integer values ranging from 1 to 10. Moments of occurrence, $$T_k$$, signify when events $$C_i$$ repeat, defined as $$C_k = C_i$$ or $$C_i = C_{i-1}$$, where $$k$$ ranges from 1 to $$M$$, with $$M$$ less than $$N$$. Subsequently, define $$S_i$$ as the interval between successive $$T_{i-2}$$, representing the waiting time for an event to occur. Finally, introduce $$Z_j$$ as $$\ln(S_k) - \ln(S_l)$$, where $$l$$ ranges from 1 to $$U-1$$. The random variable $$Z$$ displays fractal properties, with a distribution whose shape resembles Thomae's function or the Dirichlet function.

The ruler function
For positive integers $$n$$, the exponent of the highest power of 2 dividing $$n$$ gives the sequence 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, ... . Adding 1 to each term, or removing the 0s, gives 1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1, ... . The values resemble tick-marks on a 1/16th graduated ruler, hence the name. These values correspond to the restriction of Thomae's function to the dyadic rationals: those rational numbers whose denominators are powers of 2.
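A short Python sketch reproduces the sequence and makes the link to the dyadic rationals concrete. It assumes the points of interest are $$n/16$$ for $$n = 1, \ldots, 15$$, matching the 1/16th ruler; the helper name `two_adic_valuation` is illustrative.

```python
from fractions import Fraction


def two_adic_valuation(n: int) -> int:
    """Exponent of the highest power of 2 dividing the positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # n & -n isolates the lowest set bit, which equals 2**v for the
    # valuation v; bit_length() - 1 recovers v.
    return (n & -n).bit_length() - 1


ruler = [two_adic_valuation(n) for n in range(1, 16)]
print(ruler)  # [0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0]

# At the dyadic point n/16 the reduced denominator is 16 / 2**v, so the
# value 1/denominator of Thomae's function there equals 2**v / 16.
for n, v in zip(range(1, 16), ruler):
    assert Fraction(1, Fraction(n, 16).denominator) == Fraction(2**v, 16)
```

The taller tick-marks therefore sit at the points where the fraction reduces further, which is where Thomae's function takes its larger values.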

Related functions
A natural follow-up question is whether there is a function which is continuous on the rational numbers and discontinuous on the irrational numbers. This turns out to be impossible. The set of discontinuities of any function must be an $F_\sigma$ set. If such a function existed, then the irrationals would be an $F_\sigma$ set. The irrationals would then be the countable union of closed sets $\bigcup_{i = 0}^\infty C_i$, but since the irrationals do not contain an interval, neither can any of the $$C_i$$. Therefore, each of the $$C_i$$ would be nowhere dense, and the irrationals would be a meager set. It would follow that the real numbers, being the union of the irrationals and the rationals (which, as a countable set, is evidently meager), would also be a meager set. This would contradict the Baire category theorem: because the reals form a complete metric space, they form a Baire space, which cannot be meager in itself.

A variant of Thomae's function can be used to show that any $F_\sigma$ subset of the real numbers can be the set of discontinuities of a function. If $A = \bigcup_{n=1}^{\infty} F_n$ is a countable union of closed sets $$F_n$$, define $$f_A(x) = \begin{cases} \frac{1}{n} & \text{if } x \text{ is rational and } n \text{ is minimal so that } x \in F_n\\ -\frac{1}{n} & \text{if } x \text{ is irrational and } n \text{ is minimal so that } x \in F_n\\ 0 & \text{if } x \notin A \end{cases}$$

Then an argument similar to that for Thomae's function shows that $$f_A$$ has $$A$$ as its set of discontinuities.