Talk:Prediction interval

Which percentile to use
I'm not sure about the line "where Ta is the 100(1 − (p/2))th percentile of Student's t-distribution..." for a 100p% prediction interval. For example, for a 90% prediction interval that would be the 55th percentile, which doesn't sound right - or am I missing something?

Perhaps it should instead read "where Ta is the 100(1 − (α/2))th percentile of Student's t-distribution..." for a 100(1-α)% interval, also replacing p by 1-α in the line above (i.e. α is the error rate in the prediction, whereas p was the success rate). For a 90% prediction interval (α=0.1) that would mean using the 95th percentile, which sounds more reasonable.

For possible support for this formulation see http://www.amstat.org/publications/jse/secure/v8n3/preston.cfm which defines α in the same way and uses the 100(α/2)th and 100(1-(α/2))th percentiles of a general distribution. Also http://www.math.umd.edu/~jjm/tpredictionintervals.pdf, which uses the 100(α/2)th percentile of the t-distribution - I assume that the choice of 100(α/2)th or 100(1-(α/2))th percentile depends on how your t-distribution tables are written.

Alternatively, the definition of p as a success rate in the article could be retained by referring to the 100((1+p)/2)th percentile of the t-distribution, in which case the error rate α would not need to be introduced.

Richard J Price 10:54, 22 March 2007 (UTC)

In agreement with the table on the Student's t page: if T_a is the 100((1+p)/2)th percentile, then P(T<T_a)=(1+p)/2=(1+1-alpha)/2=1-alpha/2, where T is Student's t-distributed, which is the correct percentile for a two-sided interval with error rate alpha (see confidence interval). Gummif (talk) 00:49, 3 August 2013 (UTC)

Michael Hardy kindly edited the article at 20:01, 22 March 2007 to address my first comment above. I've just noticed that much later on this change was reversed, I think by Rnjma99 at 12:13, 20 November 2015, and it has remained that way ever since. I don't understand why - can anyone explain? — Preceding unsigned comment added by RichardJPrice (talk • contribs)

OK, I see now that it was corrected in a different way on 14 March 2017 by an anonymous contributor, who changed the definition of p from the success rate of the prediction to its failure rate (what I called α above). Richard J Price (talk) 10:07, 25 October 2018 (UTC)
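
A minimal sketch of the percentile arithmetic discussed in this thread. The function names are hypothetical and chosen only for illustration; the three formulas are the ones quoted above, with p read as the coverage (success rate) and alpha as the error rate.

```python
def upper_percentile_from_coverage(p):
    """Upper percentile for a two-sided 100p% interval, p being the coverage."""
    return 100 * (1 + p) / 2


def upper_percentile_from_error(alpha):
    """Upper percentile for a two-sided 100(1 - alpha)% interval, alpha being the error rate."""
    return 100 * (1 - alpha / 2)


def questioned_percentile(p):
    """The formula as first quoted, 100(1 - p/2), with p read as the coverage."""
    return 100 * (1 - p / 2)


# 90% prediction interval: coverage p = 0.9, error rate alpha = 0.1.
print(upper_percentile_from_coverage(0.9))  # close to 95
print(upper_percentile_from_error(0.1))     # close to 95
print(questioned_percentile(0.9))           # close to 55, the value queried above
```

The first two functions agree whenever alpha = 1 − p, which is why the article can be stated consistently in either convention as long as p is defined to match the formula.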

Unclarity
Could we please get another example, with a population variable such as apple width or orange peel thickness, instead of a bunch of abstract equations? Thanks in advance. 75.35.79.113 21:41, 19 April 2007 (UTC)


 * I’ve elaborated and given some simpler and clearer examples, notably the simple non-parametric estimation – hope it’s clearer now!
 * —Nils von Barth (nbarth) (talk) 17:21, 19 April 2009 (UTC)

Bayesian Statistics
Why exactly is this stated --- "In Bayesian statistics, one can compute (Bayesian) prediction intervals from the posterior probability of the random variable, as a credible interval. In theoretical work, credible intervals are not often calculated for the prediction of future events, but for inference of parameters – i.e., credible intervals of a parameter, not for the outcomes of the variable itself"? --- It's quite common in practice to create a posterior predictive distribution which gives you an interval for the actual outcome of the variable itself. 97.125.169.175 (talk) 19:23, 30 April 2011 (UTC)

unclear on scope
The article could possibly be clarified by relating a prediction interval to a tolerance interval. The intro currently uses language that a prediction interval is not normally appropriate for, although terminology in this area can be a bit inconsistent:
 * an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed

If not read carefully, this could imply that a prediction interval is an interval bounding n% of all future samples from a process, which would be equivalent to n% population coverage, which is not typically what a prediction interval gives you (except on average). However I'm not entirely sure where to go in clarifying this article. I've started by expanding tolerance interval instead. --130.207.127.232 (talk) 14:14, 26 August 2011 (UTC)

Standard score
The source of confusion is clearly explained by Melcombe on the project page. A prediction interval [L,U] is an interval such that, for a future observation X, P(L<X<U) has a given value γ. For the standard score Z of X this gives:
 * $$P\left( \frac{L-\mu}{\sigma} < Z < \frac{U-\mu}{\sigma} \right) = \gamma$$

By determining the quantile z such that
 * $$P\left( -z < Z < z \right) = \gamma$$

it follows:
 * $$L=\mu-z\sigma,\ U=\mu+z\sigma$$

Notice Z is a standard score, z is not. Actually I don't think the use of the term standard score is much of a help. Nijdam (talk) 08:16, 13 May 2012 (UTC)


 * I still think it's necessary to mention standard score in the article. Let's continue the issue at that project page: Wikipedia_talk:WikiProject_Statistics. Mikael Häggström (talk) 18:57, 13 May 2012 (UTC)
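
A minimal sketch of the construction in this thread, assuming X is normally distributed with known mean and standard deviation (an assumption made here for illustration; the numbers 100, 15 and 0.95 are hypothetical). It uses `statistics.NormalDist` from the Python standard library to find the quantile z with P(−z < Z < z) = γ, which is the (1 + γ)/2 quantile of the standard normal distribution.

```python
from statistics import NormalDist


def normal_prediction_interval(mu, sigma, gamma):
    """Return (L, U) with P(L < X < U) = gamma for X ~ N(mu, sigma^2)."""
    # z is the quantile such that P(-z < Z < z) = gamma for standard normal Z.
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    return mu - z * sigma, mu + z * sigma


# Hypothetical example values: mean 100, standard deviation 15, gamma = 0.95.
lower, upper = normal_prediction_interval(mu=100.0, sigma=15.0, gamma=0.95)
print(lower, upper)
```

Here z is a fixed quantile, while Z = (X − μ)/σ is the random standard score of the future observation, which is the distinction drawn above.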

If Known mean and known variance then it is not a prediction interval but a tolerance interval
Maybe this is just about semantics, but if you agree then we should remove the example for "Known mean, known variance" and just link this case to Tolerance interval. What do you think?


 * There should be a link to Melcombe (talk) 07:03, 18 July 2012 (UTC)

On known mean, unknown variance
In this case, for normal population we have:
 * $$s^2=\frac{1}{n-1}\sum\limits_{i=1}^n(X_i- \mu)^2,$$

and, therefore
 * $$\frac{n-1}{\sigma^2}s^2\sim\chi_n^2$$

is chi-squared distributed with n degrees of freedom;
 * $$\sqrt{\frac{n}{n-1}}\frac{X- \mu}{s}\sim t_n$$

is t-distributed with n degrees of freedom, i.e. the statistic
 * $$T=\frac{X-\mu}{s}$$

is scaled Student $$t_n$$-distributed. J. Angelova — Preceding unsigned comment added by 46.10.58.124 (talk) 20:34, 12 October 2012 (UTC)


 * Your point being? Fgnievinski (talk) 20:31, 6 February 2013 (UTC)

>>> The text of the page reads $$T=\frac{X}{s}$$. It's missing the $$\mu$$, correct? Hoggenbit99 (talk) 05:01, 8 November 2019 (UTC)
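
A rough Monte Carlo check of the claim in this thread, offered as a sketch rather than a proof. It assumes a normal population with known mean, uses the known-mean variance estimate with the 1/(n−1) factor exactly as written above, and compares the simulated variance of $$\sqrt{n/(n-1)}\,(X-\mu)/s$$ with n/(n−2), the variance of a Student t-distribution with n degrees of freedom. The values n = 10, μ = 5, σ = 2, the seed and the number of trials are hypothetical choices.

```python
import math
import random


def scaled_statistic(n, mu, sigma, rng):
    """One draw of sqrt(n/(n-1)) * (X - mu) / s for a normal population with known mu."""
    # Past sample X_1..X_n and one future observation X, all N(mu, sigma^2).
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    future = rng.gauss(mu, sigma)
    # Variance estimate around the known mean, with the 1/(n-1) factor used above.
    s2 = sum((x - mu) ** 2 for x in sample) / (n - 1)
    return math.sqrt(n / (n - 1)) * (future - mu) / math.sqrt(s2)


def simulated_variance(n, trials=20000, seed=0):
    """Sample variance of the scaled statistic over repeated simulated data sets."""
    rng = random.Random(seed)
    values = [scaled_statistic(n, 5.0, 2.0, rng) for _ in range(trials)]
    mean = sum(values) / trials
    return sum((v - mean) ** 2 for v in values) / (trials - 1)


n = 10
# The two printed numbers should be close if the statistic is t-distributed with n degrees of freedom.
print(simulated_variance(n), n / (n - 2))
```

Dropping μ from the numerator, as in the $$T=X/s$$ form queried above, would shift the statistic away from zero whenever μ ≠ 0, so the centred form is the one this check supports.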

Non-parametric
I don't follow!

When forecasting a growth curve $$(x_1, x_2, \ldots, x_n)$$, we have $$P(x_i < x_{i+1}) > P(x_i > x_{i+1})$$.

In fact, $$P(x_i < x_{i+1}) = 1-e$$, where e is of the order of magnitude of the error on the data.

Please explain or cite references. — Preceding unsigned comment added by AlainD (talk • contribs) 18:09, 25 January 2014 (UTC)


 * I added a reference to conformal prediction, of which I think this is a special case. Biggerj1 (talk) 10:06, 1 October 2023 (UTC)

Regression
Looking in my textbook, I see that the best estimate for $$x_t$$ has an expectation of $$\bar{y_t} = \beta + \alpha x_t$$ and standard deviation $$\sqrt{ MS_E \left(1+\frac{1}{n} + \frac{(x_t - \bar x)^2}{S_{xx}}\right) }$$.

This implies that the error on the forecast estimate is minimum for $$x_0=\bar x$$ and widens as $$|x_0 - \bar x|$$ increases. It also implies that the confidence interval for the best estimator of $$x_0$$ is always wider than the confidence interval for $$x_0$$.

Is it the same concept? If so, is there a reason not to include the complete formula? AlainD (talk) 21:00, 25 January 2014 (UTC)
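
A minimal sketch of the standard deviation formula quoted in this thread, for simple linear regression fitted by ordinary least squares. The data values and the function name `forecast_std` are hypothetical. The `new_observation=False` variant, which drops the leading 1, is the usual textbook standard error of the fitted mean; it is not stated in the thread and is included only for comparison.

```python
import math


def forecast_std(xs, ys, x_new, new_observation=True):
    """sqrt(MS_E * (1 + 1/n + (x_new - x_bar)^2 / S_xx)); the leading 1 is dropped if new_observation is False."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Ordinary least squares fit of y = intercept + slope * x.
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Mean squared error of the residuals, with n - 2 degrees of freedom.
    ms_e = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    extra = 1.0 if new_observation else 0.0
    return math.sqrt(ms_e * (extra + 1.0 / n + (x_new - x_bar) ** 2 / s_xx))


# Hypothetical data, roughly linear with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

for x_new in (3.0, 5.0, 8.0):
    print(x_new, forecast_std(xs, ys, x_new), forecast_std(xs, ys, x_new, new_observation=False))
```

With these data the mean of the x values is 3, so the printed standard deviations are smallest at 3 and grow with distance from it, and the version for a new observation is always the larger of the two.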