Concept class

In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.

Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning. In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map $$c: X\to Y$$, i.e. from examples to classifier labels (where $$Y = \{0, 1\}$$ and where c is a subset of X), c is then said to be a concept. A concept class $$C$$ is then a collection of such concepts.

Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s. Not every subclass is reachable.

Background
A sample $$s$$ is a partial function from $$X$$ to $$\{0, 1\}$$. Identifying a concept with its characteristic function mapping $$X$$ to $$\{0, 1\}$$, it is a special case of a sample.

Two samples are consistent if they agree on the intersection of their domains. A sample $$s'$$ extends another sample $$s$$ if the two are consistent and the domain of $$s$$ is contained in the domain of $$s'$$.

Examples
Suppose that $$C = S^+(X)$$. Then:


 * the subclass $$\{\{x\}\}$$ is reachable with the sample $$s = \{(x, 1)\}$$;
 * the subclass $$S^+(Y)$$ for $$Y\subseteq X$$ are reachable with a sample that maps the elements of $$X - Y$$ to zero;
 * the subclass $$S(X)$$, which consists of the singleton sets, is not reachable.

Applications
Let $$C$$ be some concept class. For any concept $$c\in C$$, we call this concept $$1/d$$-good for a positive integer $$d$$ if, for all $$x\in X$$, at least $$1/d$$ of the concepts in $$C$$ agree with $$c$$ on the classification of $$x$$. The fingerprint dimension $$FD(C)$$ of the entire concept class $$C$$ is the least positive integer $$d$$ such that every reachable subclass $$C'\subseteq C$$ contains a concept that is $$1/d$$-good for it. This quantity can be used to bound the minimum number of equivalence queries needed to learn a class of concepts according to the following inequality:$FD(C) - 1\leq \#EQ(C)\leq \lceil FD(C)\ln(|C|)\rceil$.