Talk:Glivenko–Cantelli theorem

Glivenko–Cantelli class
In the section titled "Glivenko–Cantelli class", when the GC classes are defined, it is stated that almost sure convergence, convergence in probability and convergence in mean are equivalent for the definition. However, I cannot see why this would be true. For integrable functions this might be the case, but for general measurable functions this seems too strong of a statement. Nmdwolf (talk) 12:53, 22 October 2021 (UTC)


 * You are correct; they are not equivalent in general. Convergence in probability is associated with "weak" GC classes and almost-sure convergence is associated with "strong" GC classes, and are not equivalent except under regularity conditions (such as integrability and uniform boundedness).
 * I have edited the page to distinguish between the weak and strong versions, as well as added some two theorems describing when they are equivalent (with references for the theorems).
 * One major issue remains that still needs to be edited:
 * The statement of the Vapnik and Chervonenkis (1968) result isn't quite right. Similarly to the Dudley et. al (1991) theorem, the class C is uniformly GC iff it has finite VC dimension *so long as C is image admissible Suslin*. There are simple counterexamples to this theorem otherwise (e.g. consider the set L = {1, 2, 3, ..., ω_1} with the order topology and Borel sigma-algebra and let C the class {{x ∈ L | x < k} | k ∈ L}; this has VC dimension 1 but is not universal Glivenko-Cantelli).
 * There is another point that may need editing that I'm not sure about:
 * The current page does not distinguish between weak and strong universal GC classes. This may be unnecessary, since Proposition 4 of Dudley et. al. (1991) (currently reference [7]) shows that the weak and strong versions always coincide for classes of universally measurable real-valued functions. However, we may want to include the definitions of weak/strong universal GC for completeness and add their equivalence as a theorem.
 * Aopus (talk) 18:13, 24 December 2023 (UTC)
 * I've now also updated the statement of the Vapnik and Chervonenkis (1968) result. Rather than providing the consistency conditions given by the 1968 paper, I've added below the theorem two of the more modern versions of the necessary conditions (either the image-admissible Suslin property, or the universally separable property) that people usually check nowadays. Aopus (talk) 14:48, 25 December 2023 (UTC)

Untitled
Why is this theorem listed as part of the Robotics WikiProject. There is no reference to robotics anywhere, and this is a theoretical (not applied) result. --Tekhnofiend (talk) 20:30, 30 September 2008 (UTC)
 * Because it was under Category:Machine learning, which is under the scope of WikiProject Robotics. I don't really have an objection if you remove the article from the ML category and remove the project tag. --Jiuguang (talk) 20:46, 30 September 2008 (UTC)
 * Heh, and then the next question is, why is Robotics under Machine Learning, but there is probably some big boss somewhere so I understand if you do not know the answer. I would be tempted to say Robotics uses Machine Learning, among other things, and hence it partially falls under Machine Learning, rather than the other way around. After all, robots are machines that do learning. I am fine with leaving this tag here if it's a result of some hierarchy where a ML tag entails a Robotics tag. The ML tag makes some sense. --Tekhnofiend (talk) 22:16, 30 September 2008 (UTC)
 * There was a discussion a few months ago on the scope of WikiProject Robotics (see ). Basically, the general view is that since there are currently no WikiProjects for areas such as ML, AI, vision, etc, WikiProject Robotics will temporarily serve as the umbrella project for these fields, along with other intelligent systems-related topics. --Jiuguang (talk) 00:14, 1 October 2008 (UTC)

VC theorem
Was published in 1968 in Russian, and in 1971 in English. (Igny (talk) 23:34, 20 April 2010 (UTC))

Definition of empirical distribution function
Isn't it supposed to be I(-inf, Xi] rather than I[Xi, inf) in the definition of the empirical distribution function on this page? — Preceding unsigned comment added by 2001:6B0:2:2801:79C5:5B00:44C4:48A1 (talk) 12:15, 23 April 2019 (UTC)
 * Never mind, I confused Xi and x. Beginner mistake. Please feel free to delete this thread. — Preceding unsigned comment added by 2001:6B0:2:2801:20C8:F2C1:C2D7:CE85 (talk) 07:55, 24 April 2019 (UTC)

The "proof" is not a proof
what is currently presented could be part of a proof, but in my estimation this isn't even enough to be called a proof sketch. biggest issue is that the data points x_i are deterministic, but the theorem requires an argument for random x_i.

2001:16B8:680F:7100:4F01:C5A2:46D8:812C (talk) 00:49, 4 July 2020 (UTC)

pointwise and a.s. convergence
In the statement of the theorem it says " {\displaystyle F_{n}(x)} is a sequence of random variables which converge to F almost surely by the strong law of large numbers, that is, converges to  pointwise."

Almost surely convergence does not imply pointwise convergence (the other way around is true). Am I missing something? Rachel Buchuk Coca (talk) 17:38, 5 September 2022 (UTC)