Talk:Kernel (statistics)

Questionable reference in section "Kernel functions in common use"
In the list of functions, "tricube" is followed by a reference to Altman, but this article does not mention "tricube", or even "cube". — Preceding unsigned comment added by Prosaicpat (talk • contribs) 23:56, 27 March 2019 (UTC)

The figure with all the kernels in a common coordinate frame seems to have a bug
There's a curve on this plot that seems to be labeled "exponential". This curve does not look exponential to me, nor does an exponential kernel appear to be defined in the body of the article. —Preceding unsigned comment added by 151.203.248.95 (talk) 21:49, 16 April 2009 (UTC)

Epanechnikov Kernel
The Epanechnikov kernel shown on the page is only correct for one-dimensional problems.

There is a generalization to n dimensions, but the prefactor has to change; cf. Simon, Optimal State Estimation, p. 472.

There should be an explicit warning about the dimensional restrictions of this form of the kernel to avoid misuse.

Thanks, Frank —Preceding unsigned comment added by 76.124.111.215 (talk) 04:41, 9 November 2008 (UTC)
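
To illustrate how the prefactor depends on dimension, here is a minimal sketch. It assumes the radially symmetric form K(x) = c_d (1 − ‖x‖²) on the unit ball, which is one common multivariate generalization and not necessarily the one given in the cited book. The function names are hypothetical. Integrating 1 − ‖x‖² over the unit ball gives 2V_d/(d + 2), where V_d is the volume of the unit ball, so c_d = (d + 2)/(2V_d).

```python
import math

def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def epanechnikov_prefactor(d):
    """Constant c_d such that c_d * (1 - ||x||^2) integrates to 1 over the unit ball.

    Assumes the radially symmetric generalization, not a product kernel.
    """
    return (d + 2) / (2 * unit_ball_volume(d))

def epanechnikov(x):
    """Radially symmetric Epanechnikov kernel at x, a sequence of d coordinates."""
    r2 = sum(xi * xi for xi in x)
    if r2 > 1:
        return 0.0
    return epanechnikov_prefactor(len(x)) * (1 - r2)
```

For d = 1 this reduces to the familiar 3/4; for d = 2 the constant becomes 2/π, so reusing 3/4 in higher dimensions would not give a function that integrates to 1.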

A question from 167.191.250.81
triweight--is that the same as tricube? Which I understood to be:


 * $$K(u)=\left(1-\left|u^3\right|\right)^3$$
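
For comparison, a small sketch of the two kernels as they are usually defined: triweight as (35/32)(1 − u²)³ and tricube as (70/81)(1 − |u|³)³, both on |u| ≤ 1. The normalizing constant 70/81 is not part of the formula quoted above; it is added here as an assumption so that both functions integrate to 1. The helper `integrate` is a hypothetical midpoint-rule routine used only for checking.

```python
def triweight(u):
    """Triweight kernel: (35/32) * (1 - u^2)^3 on |u| <= 1."""
    return 35 / 32 * (1 - u * u) ** 3 if abs(u) <= 1 else 0.0

def tricube(u):
    """Tricube kernel: (70/81) * (1 - |u|^3)^3 on |u| <= 1."""
    return 70 / 81 * (1 - abs(u) ** 3) ** 3 if abs(u) <= 1 else 0.0

def integrate(f, a=-1.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Both integrate to (approximately) 1, but they are different functions.
area_triweight = integrate(triweight)
area_tricube = integrate(tricube)
gap_at_half = tricube(0.5) - triweight(0.5)
```

The two agree at u = 0 only up to their different constants (35/32 versus 70/81) and differ throughout the interior of the support, so they are not the same kernel.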

Why is this approach "non-parametric"
I have to admit I'm not as familiar with this subject as I ought to be. The stuff I remember about non-parametric statistics includes things like the rank-sum test, which really does not make any assumptions about the distribution of the two samples beyond the null hypothesis (both samples drawn from identical populations?).

After reading this article and a couple of related articles, it appears that the "kernel" is a sort of prior distribution. So why are these methods called non-parametric? And why isn't that question addressed somewhere in Wikipedia? Or have I just been reading the wrong articles? DavidCBryant 16:43, 11 August 2007 (UTC)


 * Actually, this question is addressed in Wikipedia. See Non-parametric statistics. Solarapex 23:14, 10 October 2007 (UTC)

Munaf Kernel is incorrect?
Something is wrong with this kernel. I could not find any references to this kernel. The current formula states:

$$ K(u)=\frac{45}{64}(1-u^2)^4\ 1_{(|u|\leq1)} $$

But it does not satisfy the requirement:

$$ \int K(u) du = 1 $$

The one that satisfies this requirement is:

$$ K(u)=\frac{315}{256}(1-u^2)^4\ 1_{(|u|\leq1)} $$

Otherwise, the Munaf kernel should be (to follow the ratio):

$$ K(u)=\frac{45}{64}(1-u^4)^2\ 1_{(|u|\leq1)} $$

Generalization
In general, Epanechnikov, Quartic, Triweight, and Munaf can be generalized as:

$$ K(u)=B (1-u^{2m})^n\ 1_{(|u|\leq1)} $$,

where

$$ B = \frac{1}{2 \times \sum_{j=0}^{n} C_j^n \frac{(-1)^j}{2mj + 1} } $$,

and $$C_j^n$$ is the number of combinations of j items out of n (the binomial coefficient).

I haven't seen this in books. So I have to warn you, this may be original research. Solarapex 23:14, 10 October 2007 (UTC)


 * Assuming this was incorrectly copied from a correct source, my money is on the formula (45/64)(1-u^4)^2. In light of the lack of sources and importance ("Munaf kernel" only gets a Google hit on this page), I've removed it from the article. --Lambiam 11:07, 29 October 2007 (UTC)


 * I cannot find the above generalization, but I have found two generalizations in (http://www.ssc.wisc.edu/~bhansen/718/NonParametrics1.pdf) and (http://www2.math.cycu.edu.tw/TEACHER/MSYANG/yang-pdf/yang-n-56-mean-shift.pdf). From one source: $$ k_s(u)=\frac{(2s+1)!!}{2^{s+1}\, s!}(1-u^2)^s\ 1_{(|u|\leq1)} $$. The other source is somewhat similar.
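
The constants discussed in this thread can be checked with exact rational arithmetic. The sketch below evaluates the proposed prefactor B for K(u) = B(1 − u^{2m})^n and compares it, for m = 1, with the double-factorial form quoted from the first source. The function names are hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb, factorial

def prefactor(m, n):
    """B = 1 / (2 * sum_{j=0}^{n} C(n, j) * (-1)^j / (2*m*j + 1)), as an exact fraction."""
    total = sum(Fraction((-1) ** j * comb(n, j), 2 * m * j + 1) for j in range(n + 1))
    return 1 / (2 * total)

def double_factorial(k):
    """k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def prefactor_m1(s):
    """(2s + 1)!! / (2^(s+1) * s!), the m = 1 constant from the quoted source."""
    return Fraction(double_factorial(2 * s + 1), 2 ** (s + 1) * factorial(s))

# Epanechnikov, quartic, triweight, and the two candidate "Munaf" forms.
constants = {
    "(1 - u^2)^1": prefactor(1, 1),
    "(1 - u^2)^2": prefactor(1, 2),
    "(1 - u^2)^3": prefactor(1, 3),
    "(1 - u^2)^4": prefactor(1, 4),
    "(1 - u^4)^2": prefactor(2, 2),
}
```

Under these definitions the two formulas agree whenever m = 1, and the values match the ones given above: 315/256 for (1 − u²)⁴ and 45/64 for (1 − u⁴)².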

Proposed merge of Kernel (statistics) and Kernel smoother
Isn't a merge with Kernel density estimation more appropriate? --Lambiam 20:11, 16 March 2008 (UTC)
 * I think they are different things, although perhaps the redundancy between them could be reduced somehow. --Zvika (talk) 05:32, 17 March 2008 (UTC)
 * The procedure of kernel density estimation clearly is a form of kernel smoothing. Could you give a hint what you see as an essential difference? --Lambiam 19:30, 17 March 2008 (UTC)
 * It seems to me that both the setting and the applications are different. In one case we want to estimate a function from noisy measurements, and in the other we want to estimate a pdf from iid realizations of its random variable. The technique itself is similar. However, I am not an expert on this, so barring any further resistance, go ahead and do what you think is right. --Zvika (talk) 07:03, 18 March 2008 (UTC)

In kernel density estimation you are actually trying to estimate the number of measurements inside each window, whereas with a kernel smoother you are estimating the values of data points (and not the number of data points). In some sense, both techniques are similar (density estimation is a particular case of a kernel smoother in which all the measurement points have the value 1). Anyway, I think that kernel density estimation is an important topic, and it should have its own article --Anry11 (talk) 17:13, 18 March 2008 (UTC)

There is also the page density estimation to consider for overlap. Melcombe (talk) 11:56, 9 April 2008 (UTC)

Missing inverse in the definition
The last line of the definition states that "If K is a kernel, then so is the function K* defined by K*(u) = λ^(−1) K(λu), where λ > 0." Isn't the λ "inside" the function supposed to be inverted as well? I'm pretty sure that this is the case but since I haven't written here before I don't dare to change it myself. —Preceding unsigned comment added by 217.10.116.172 (talk) 21:57, 15 April 2008 (UTC)
 * No, I think the definition is correct as it is: this is the way a scaling factor is used. However, please don't hesitate in the future to make changes yourself (even if you are not sure of them); in the worst case, someone will come along and correct you. --Zvika (talk) 04:05, 16 April 2008 (UTC)
 * This was indeed an error. I've fixed it. --Lambiam 05:06, 24 April 2008 (UTC)
 * You're right. I must have been confused that day :) --Zvika (talk) 10:46, 24 April 2008 (UTC)
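
A quick numerical check of the point raised in this thread, as a sketch only. It assumes a Gaussian K and λ = 2, and it assumes that the corrected definition inverts the inner λ as the original poster suggested. The substitution v = λu shows that λ^(−1) K(λu) integrates to λ^(−2), while λ^(−1) K(u/λ) integrates to 1. The helper `integrate` is a hypothetical midpoint-rule routine.

```python
import math

def gaussian(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def integrate(f, a=-20.0, b=20.0, steps=40000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

lam = 2.0

# Form as originally quoted: (1/lam) * K(lam * u). Integral is 1 / lam^2.
as_quoted = integrate(lambda u: gaussian(lam * u) / lam)

# Form with the inner lambda inverted: (1/lam) * K(u / lam). Integral is 1.
corrected = integrate(lambda u: gaussian(u / lam) / lam)
```

With λ = 2 the quoted form integrates to about 0.25 and the inverted form to about 1, so only the latter is again a kernel.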

Exponential Kernel
$$K(u) = e^{-u} 1_{u > 0}$$. It's referred to in the article, but there is no information on it. Needs a graph. Solar Apex (talk) 07:14, 24 October 2010 (UTC)

Optimal Smoothing/Bandwidth for Kernels?
I have read (see pages 31-32 or so of http://books.google.com/books?id=7WBMrZ9umRYC&printsec=frontcover#v=snippet&q=Optimal%20Smoothing&f=false) that the optimal smoothing parameter can be verified if the distribution is known. But there are a couple of commonly used widths for normal kernels:

$$ \left(\frac{4}{3n}\right)^{1/5} \sigma $$

where $$\sigma$$ is either the sample standard deviation or, for a more robust estimate, the median absolute deviation divided by 0.6745.

Can someone please create a section for optimal kernel widths assuming different kernels (as the above only applies to normal distributions) — Preceding unsigned comment added by 150.135.222.234 (talk) 20:02, 2 April 2013 (UTC)
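
For reference, a minimal sketch of the width described above, using only the Python standard library. The function names are hypothetical, and the robust option assumes that "median of absolute differences" means the median absolute deviation from the sample median.

```python
import statistics

def robust_sigma(data):
    """Median absolute deviation from the median, divided by 0.6745.

    Assumption: this is the robust scale estimate meant in the comment above.
    """
    med = statistics.median(data)
    return statistics.median(abs(x - med) for x in data) / 0.6745

def normal_reference_bandwidth(data, robust=False):
    """Bandwidth h = (4 / (3n))^(1/5) * sigma for a normal kernel."""
    n = len(data)
    sigma = robust_sigma(data) if robust else statistics.stdev(data)
    return (4 / (3 * n)) ** 0.2 * sigma

sample = [1.0, 2.0, 3.0, 4.0, 5.0]
h_std = normal_reference_bandwidth(sample)
h_robust = normal_reference_bandwidth(sample, robust=True)
```

The rule is derived under the assumption that the data are normally distributed and the kernel is normal, so it is a starting point rather than an optimum for other kernels or distributions.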

Symmetric, or just zero mean?
In the Definition sub-section of the non-parametric statistics section, the second bullet point says that the kernel must be symmetric, but the text below says that it just needs to have mean zero. Are there differing conventions, or is there just a typo? I've seen the zero-mean definition in Wasserman 2003 (20.3) -- anybody have a reference for the symmetric definition? If not, I'll change it. DWeissman (talk) 21:36, 25 November 2014 (UTC)

Indicator function
Rather than "indicator function", which is rather a jargon term, the article would be clearer if instead the section "Kernel functions in common use" simply defined the range of variable u. Geoffrey.landis (talk) 14:36, 14 April 2017 (UTC)