Offset filtration

The offset filtration (also called the "union-of-balls" or "union-of-disks" filtration) is a growing sequence of unions of metric balls used to detect the size and scale of topological features of a data set. It commonly arises in persistent homology and topological data analysis. Using a union of balls to approximate the shape of geometric objects was first suggested by Frosini in 1992 in the context of submanifolds of Euclidean space. The construction was independently explored by Robins in 1998, who extended it to the collection of offsets indexed over a series of increasing scale parameters (i.e., a growing sequence of balls) in order to observe the stability of topological features with respect to attractors. Homological persistence as introduced in these papers by Frosini and Robins was subsequently formalized by Edelsbrunner et al. in their seminal 2002 paper Topological Persistence and Simplification. Since then, the offset filtration has become a primary example in the study of computational topology and data analysis.

Definition
Let $$X$$ be a finite set in a metric space $$(M,d)$$, and for any $$x\in X$$ let $$B(x,\varepsilon) = \{y\in M \mid d(x,y) \leq \varepsilon \}$$ be the closed ball of radius $$\varepsilon$$ centered at $$x$$. Then the union $$X^{(\varepsilon)}:=\bigcup_{x\in X} B(x,\varepsilon)$$ is known as the offset of $$X$$ with respect to the parameter $$\varepsilon$$ (or simply the $$\varepsilon$$-offset of $$X$$).
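As a minimal sketch of this definition, assuming the ambient space is the Euclidean plane (the article allows any metric space), membership of a point in the $$\varepsilon$$-offset can be tested directly. The function name `in_offset` and the sample points are hypothetical, chosen only for illustration.

```python
import math

def in_offset(y, points, eps):
    """Return True if y lies in the eps-offset of `points`.

    Assumes the Euclidean metric; y is in the offset exactly when
    some closed ball B(x, eps) with x in `points` contains it.
    """
    return any(math.dist(y, x) <= eps for x in points)

# Hypothetical two-point data set in the plane.
X = [(0.0, 0.0), (2.0, 0.0)]

midpoint = (1.0, 0.0)
inside_at_1 = in_offset(midpoint, X, 1.0)     # both balls reach the midpoint
inside_at_half = in_offset(midpoint, X, 0.5)  # neither ball reaches it
```

Note that the test point need not belong to $$X$$: the balls are taken in the ambient space, so the offset is generally much larger than the data set itself.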

By considering the collection of offsets over all $$\varepsilon \in [0,\infty)$$ we get a family of spaces $$\mathcal O(X) := \{ X^{(\varepsilon)} \mid \varepsilon \in [0,\infty)\}$$ where $$X^{(\varepsilon)}\subseteq X^{(\varepsilon^\prime)}$$ whenever $$\varepsilon \leq \varepsilon^\prime$$. So $$\mathcal O(X)$$ is a family of nested topological spaces indexed over $$\varepsilon$$, which defines a filtration known as the offset filtration on $$X$$.
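To illustrate how the nested family records the scale of features, the following sketch counts the connected components of the offset at a few scales. It assumes points in the Euclidean plane, where two closed balls of radius $$\varepsilon$$ meet exactly when their centers are at distance at most $$2\varepsilon$$, so the components of the offset match those of the corresponding intersection graph. The helper `offset_components` is a hypothetical name.

```python
import math
from itertools import combinations

def offset_components(points, eps):
    """Count connected components of the eps-offset (Euclidean metric).

    Uses a small union-find over the graph that joins two points
    whenever their closed eps-balls intersect (distance <= 2 * eps).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= 2 * eps:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

# Hypothetical data: two nearby points and one far away.
X = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
scales = (0.0, 0.4, 0.5, 1.9, 2.0)
counts = [offset_components(X, eps) for eps in scales]
# counts == [3, 3, 2, 2, 1]
```

In this example the components merge at $$\varepsilon = 0.5$$ and $$\varepsilon = 2$$, which are half the gaps between neighboring points.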

Note that it is also possible to view the offset filtration as a functor $$\mathcal O(X) : [0, \infty) \to \mathbf{Top}$$ from the poset category of non-negative real numbers to the category of topological spaces and continuous maps. There are some advantages to the categorical viewpoint, as explored by Bubenik and others.

Properties
When the ambient space is Euclidean, a standard application of the nerve theorem shows that the union of balls has the same homotopy type as the nerve of the cover by those balls, since closed balls are convex and the intersection of convex sets is convex. This nerve is also known as the Čech complex, which is a subcomplex of the Vietoris-Rips complex. Therefore the offset filtration is weakly equivalent to the Čech filtration (defined as the nerve of each offset across all scale parameters), so their homology groups are isomorphic.
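The relationship between the two complexes can be sketched for small planar examples. The code below is an illustration under stated assumptions, not a general implementation: it works only up to triangles, uses the Euclidean metric, takes the Čech complex at ball radius `eps`, and compares it with the Vietoris-Rips complex at edge threshold `2 * eps`. It relies on the fact that closed balls of radius `eps` share a common point exactly when the smallest ball enclosing their centers has radius at most `eps`. All function names are hypothetical.

```python
import math
from itertools import combinations

def miniball_radius(pts):
    """Radius of the smallest ball enclosing 1, 2, or 3 points."""
    if len(pts) == 1:
        return 0.0
    if len(pts) == 2:
        return math.dist(pts[0], pts[1]) / 2
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(pts, 2))
    if a * a + b * b <= c * c:
        return c / 2  # right, obtuse, or degenerate: longest edge is a diameter
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    return a * b * c / (4 * area)  # circumradius of an acute triangle

def cech_complex(points, eps):
    """Simplices (index tuples, up to triangles) whose eps-balls share a point."""
    idx = range(len(points))
    return [s for k in (1, 2, 3) for s in combinations(idx, k)
            if miniball_radius([points[i] for i in s]) <= eps]

def rips_complex(points, scale):
    """Simplices (up to triangles) with all pairwise distances <= scale."""
    idx = range(len(points))
    return [s for k in (1, 2, 3) for s in combinations(idx, k)
            if all(math.dist(points[i], points[j]) <= scale
                   for i, j in combinations(s, 2))]

# Hypothetical acute triangle: its edges appear before the filled triangle does.
X = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.9)]
cech = cech_complex(X, 0.55)
rips = rips_complex(X, 2 * 0.55)
```

At this scale the Čech complex contains the three vertices and three edges but not the triangle, so it has a 1-dimensional hole matching the hole in the union of balls, while the Rips complex has already filled the triangle in.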

Although the Vietoris-Rips filtration is not identical to the Čech filtration in general, it is an approximation in a sense. In particular, for a set $$X \subset \mathbb R^d$$ we have a chain of inclusions $$\operatorname{Rips}_\varepsilon(X) \subset \operatorname{Cech}_{\varepsilon^\prime}(X) \subset \operatorname{Rips}_{\varepsilon^\prime}(X)$$ between the Rips and Čech complexes on $$X$$ whenever $$\varepsilon^\prime / \varepsilon \geq \sqrt{\frac{2d}{d+1}}$$. In general metric spaces, we have that $$\operatorname{Cech}_\varepsilon(X) \subset \operatorname{Rips}_{2\varepsilon}(X) \subset \operatorname{Cech}_{2\varepsilon}(X)$$ for all $$\varepsilon >0$$, implying that the Rips and Čech filtrations are 2-interleaved with respect to the interleaving distance as introduced by Chazal et al. in 2009.
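The Euclidean constant can be checked numerically against the worst case, a regular $$d$$-simplex. This sketch assumes a convention not spelled out in the text: $$\operatorname{Rips}_\varepsilon$$ has edges of length at most $$\varepsilon$$ and $$\operatorname{Cech}_\varepsilon$$ uses balls of radius $$\varepsilon/2$$. It also uses the standard formula for the circumradius of a regular $$d$$-simplex with unit edge length, $$\sqrt{d/(2(d+1))}$$. The function names are hypothetical.

```python
import math

def rips_cech_ratio(d):
    """The bound sqrt(2d / (d + 1)) on eps'/eps in dimension d."""
    return math.sqrt(2 * d / (d + 1))

def regular_simplex_circumradius(d, edge=1.0):
    """Circumradius of a regular d-simplex with the given edge length."""
    return edge * math.sqrt(d / (2 * (d + 1)))

for d in (1, 2, 3, 10, 100):
    # A regular simplex with unit edges enters Rips at scale 1 and enters
    # Cech (radius = scale / 2) at scale 2 * circumradius.
    cech_scale = 2 * regular_simplex_circumradius(d)
    print(d, round(rips_cech_ratio(d), 4), round(cech_scale, 4))
```

Under these assumptions the ratio equals 1 for $$d = 1$$, where the two complexes coincide, and increases toward $$\sqrt 2$$ as $$d$$ grows.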

It is a well-known result of Niyogi, Smale, and Weinberger that given a sufficiently dense random point cloud sample of a smooth submanifold in Euclidean space, the union of balls of a certain radius recovers, with high probability, the homology of the object via a deformation retraction of the Čech complex.

The offset filtration is also known to be stable with respect to perturbations of the underlying data set. This follows from the fact that the offset filtration can be viewed as a sublevel-set filtration of the distance function to $$X$$, namely $$d_X(y) = \min_{x\in X} d(x,y)$$, since $$X^{(\varepsilon)} = d_X^{-1}([0,\varepsilon])$$. The stability of sublevel-set filtrations can be stated as follows: given any two real-valued functions $$\gamma, \kappa$$ on a topological space $$T$$ whose $$i$$th-dimensional homology modules on the sublevel-set filtrations are point-wise finite dimensional for all $$i\geq 0$$, we have $$d_B (\mathcal B_i (\gamma), \mathcal B_i (\kappa)) \leq d_\infty (\gamma, \kappa)$$ for each $$i$$, where $$d_B(-)$$ and $$d_\infty(-)$$ denote the bottleneck and sup-norm distances, respectively, and $$\mathcal B_i (-)$$ denotes the $$i$$th-dimensional persistent homology barcode. While first stated in 2005, this sublevel stability result also follows directly from an algebraic stability property sometimes known as the "Isometry Theorem," which was proved in one direction in 2009, and the other direction in 2011.
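The right-hand side of the stability bound can be illustrated numerically. The sketch below assumes the Euclidean plane and a hypothetical perturbation in which every point moves by at most `delta`; it then estimates the sup-norm distance between the two distance-to-set functions on a sample grid. It does not compute barcodes: it only shows that the quantity bounding the bottleneck distance is itself controlled by the size of the perturbation. The names `dist_to_set`, `X`, `Y`, and `grid` are illustrative.

```python
import math

def dist_to_set(y, points):
    """Distance from y to the nearest point of a finite set (Euclidean)."""
    return min(math.dist(y, x) for x in points)

# Hypothetical data set and a perturbed copy, matched point by point.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Y = [(0.05, 0.0), (1.0, -0.05), (0.0, 1.05)]

# Largest displacement of any single point.
delta = max(math.dist(x, y) for x, y in zip(X, Y))

# Estimate the sup-norm distance between the two distance functions
# by sampling a grid over the square [-1, 2] x [-1, 2].
grid = [(i / 10, j / 10) for i in range(-10, 21) for j in range(-10, 21)]
sup_diff = max(abs(dist_to_set(g, X) - dist_to_set(g, Y)) for g in grid)
```

Since the two distance functions differ by at most `delta` everywhere, the stability theorem then bounds the bottleneck distance between the corresponding barcodes by `delta` as well.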

A multiparameter extension of the offset filtration defined by considering points covered by multiple balls is given by the multicover bifiltration, and has also been an object of interest in persistent homology and computational geometry.
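As a rough sketch of the idea, assuming the Euclidean plane and one common formulation in which a point belongs to the multicover at parameters $$(\varepsilon, k)$$ when at least $$k$$ of the closed $$\varepsilon$$-balls contain it, membership can be tested by counting. The function names are hypothetical.

```python
import math

def cover_count(y, points, eps):
    """Number of closed eps-balls centered at `points` that contain y."""
    return sum(1 for x in points if math.dist(y, x) <= eps)

def in_multicover(y, points, eps, k):
    """True if y is covered by at least k of the eps-balls."""
    return cover_count(y, points, eps) >= k

# Hypothetical two-point data set.
X = [(0.0, 0.0), (2.0, 0.0)]

mid_count = cover_count((1.0, 0.0), X, 1.0)        # covered by both balls
origin_count = cover_count((0.0, 0.0), X, 1.0)     # covered by one ball only
mid_in_2_cover = in_multicover((1.0, 0.0), X, 1.0, 2)
origin_in_2_cover = in_multicover((0.0, 0.0), X, 1.0, 2)
```

With $$k = 1$$ this reduces to membership in the ordinary $$\varepsilon$$-offset, and the sets grow as $$\varepsilon$$ increases and shrink as $$k$$ increases.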