Talk:Inferential statistics

Missing content - needs work?
This article seems to be missing a great deal of description/explanation. It looks to me like this is a stub, with added examples. How do I put up a template (or something) to indicate that this article needs work? DonkeyKong the mathematician (in training) 02:24, 14 June 2006 (UTC)
 * What do you want from the article? I will be happy to answer your questions. Bo Jacoby 18:46, 14 June 2006 (UTC)

User:Michael Hardy states that the article needs cleanup. Could you please be more specific? Bo Jacoby 15:28, 21 June 2006 (UTC)


 * More specifically: It could be mainly about inferential statistics. I'll answer at greater length later. Michael Hardy 16:07, 21 June 2006 (UTC)

I see what you mean. It seems to talk too much about deduction. The point is that the induction formula is a variant of the deduction formula, so induction cannot be explained in isolation from deduction. Perhaps the title should be changed. Bo Jacoby 08:25, 22 June 2006 (UTC)

cleanup
I usually work on narrower topics, but I'm going to pay some attention to this page over the next few days. I've deleted material that appeared to say that frequentist statistics relies ONLY on maximum likelihood estimation (for starters, this would actually suggest that frequentists don't use unbiased estimation!).

So I've marked this for cleanup again.

Wikipedia has not done nearly as well with statistics as with mathematics generally. Michael Hardy 18:15, 25 August 2006 (UTC)
 * I agree. Many of the statements made in this article are patent nonsense. Btyner 21:53, 18 October 2006 (UTC)
 * Please tell us what you consider to be nonsense. Bo Jacoby 14:49, 10 December 2006 (UTC)

Suggestion that this page should be merged with Inferential statistics
I support this suggestion.

A possible lead-in on the statistics page could be:

(... discuss historical meaning of 'statistics' ....)

Nowadays, statistics generally means either
 * data, or
 * the methodology of dealing with data. This includes
   * descriptive statistics - the methodologies of describing particular set(s) of data
   * inferential statistics, in which inferences are made from the particular set(s) of data to some larger population(s).


 * Johnbibby 20:40, 3 November 2006 (UTC)

If the material cut from the article is added, perhaps it should be kept. Otherwise, merge it. Dr. Payne 18:06, 11 December 2006 (UTC)

material cut from article

 * The article should not consist primarily of an example of Bayesian inference without first discussing at some length what statistical inference is and what the various schools of thought are. This material could perhaps belong in some other article. I'm pasting it below:

This is an example of the latter [i.e. of Bayesian inference].

From a population containing N items of which I are special, a sample containing n items of which i are special can be chosen in


 * $$ {I \choose i}{{N-I} \choose {n-i}} $$

ways (see multiset and binomial coefficient).

Fixing (N,n,I), this expression is the unnormalized deduction distribution function of i.

Fixing (N,n,i), this expression is the unnormalized induction distribution function of I.
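
A minimal Python sketch of the counting expression above, normalized in both directions. The function names (`ways`, `deduction_pmf`, `induction_pmf`) are illustrative and do not come from the article. Normalizing over I gives every value of I from 0 to N the same weight beforehand, which is an assumption about what "induction distribution" is meant to be here.

```python
from math import comb

def ways(N, n, I, i):
    """Count samples of n items with i special ones, drawn from N items of which I are special."""
    if not (0 <= i <= n <= N and 0 <= I <= N):
        return 0
    # comb returns 0 when the lower index exceeds the upper one
    return comb(I, i) * comb(N - I, n - i)

def deduction_pmf(N, n, I):
    """Distribution of i for fixed (N, n, I): normalize over i = 0..n."""
    w = [ways(N, n, I, i) for i in range(n + 1)]
    total = sum(w)
    return [x / total for x in w]

def induction_pmf(N, n, i):
    """Distribution of I for fixed (N, n, i): normalize over I = 0..N (assumed flat weighting)."""
    w = [ways(N, n, I, i) for I in range(N + 1)]
    total = sum(w)
    return [x / total for x in w]

print(deduction_pmf(2, 1, 1))   # population of 2, one special, sample of 1
print(induction_pmf(1, 0, 0))   # population of 1, empty sample
```

Both calls correspond to the two small examples worked further down in the pasted material.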

The two most important parameters of a probability distribution are the mean value and the standard deviation. The plus-minus sign, ±, is used here to separate the mean from the standard deviation.

Deduction distribution formula
The mean value ± the standard deviation of the deduction distribution is used for estimating i knowing (N, n, I)


 * $$i \approx f(N,n,I)$$


 * $$f(N,n,I)=\frac{nI\pm\sqrt{\frac{nI(N-n)(N-I)}{N-1}}}{N}$$

where a(b ± c) = ab ± ac. Note that f defines two functions of three variables.

Example: The population contains two items one of which is special, and the sample contains one item. (N, n, I) = (2, 1, 1) gives


 * $$i\approx f(2,1,1)=\frac{1}{2}\pm\frac{1}{2}$$

confirming that the number of special items in the sample is either 0 or 1.
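
As a rough check of the deduction formula, the following Python sketch evaluates f and compares it with the mean and standard deviation computed directly from the counts. The helper name `brute_force` and the test values (10, 4, 3) are illustrative choices, not part of the original text; the formula requires N > 1.

```python
from math import comb, sqrt

def f(N, n, I):
    """Mean and standard deviation of i, following the deduction formula."""
    mean = n * I / N
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, sd

def brute_force(N, n, I):
    """The same two numbers computed directly from the normalized counts."""
    w = [comb(I, i) * comb(N - I, n - i) for i in range(n + 1)]
    total = sum(w)
    mean = sum(i * x for i, x in enumerate(w)) / total
    var = sum((i - mean) ** 2 * x for i, x in enumerate(w)) / total
    return mean, sqrt(var)

print(f(2, 1, 1))
print(f(10, 4, 3), brute_force(10, 4, 3))
```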

Induction distribution formula
The mean value ± the standard deviation of the induction distribution is used for estimating I knowing (N,n,i)
 * $$I \approx -1-f(-2-n,-2-N,-1-i)$$

where a+(b±c)=(a+b)±c.

Thus deduction is translated into induction by means of the involution


 * $$(N,n,I,i) \leftrightarrow (-2-n,-2-N,-1-i,-1-I).$$

Example: The population contains a single item and the sample is empty. (N,n,i)=(1,0,0) gives
 * $$I\approx -1-f(-2-0,-2-1,-1-0)=\frac{1}{2}\pm\frac{1}{2}$$

confirming that the number of special items in the population is either 0 or 1.

Note that the frequency probability solution to this problem is $$I\approx \frac{Ni}{n}=\frac{0}{0}$$, which is undefined.
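
The substitution can be tried out numerically. The sketch below is only illustrative: it applies the involution to f and compares the result with moments computed from the counts normalized over I = 0..N. The comparison function `induction_brute_force` and its flat weighting of I are assumptions made for this check, not something stated in the text.

```python
from math import comb, sqrt

def f(N, n, I):
    """Deduction formula: returns (mean, signed deviation term)."""
    mean = n * I / N
    dev = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, dev

def induction(N, n, i):
    """Estimate I from (N, n, i) via I = -1 - f(-2-n, -2-N, -1-i)."""
    mean, dev = f(-2 - n, -2 - N, -1 - i)
    # -1 - (m ± s) = (-1 - m) ± s; the sign of s does not matter under ±
    return -1 - mean, abs(dev)

def induction_brute_force(N, n, i):
    """Mean and sd of I from counts normalized over I = 0..N (assumed flat weighting)."""
    w = [comb(I, i) * comb(N - I, n - i) for I in range(N + 1)]
    total = sum(w)
    mean = sum(I * x for I, x in enumerate(w)) / total
    var = sum((I - mean) ** 2 * x for I, x in enumerate(w)) / total
    return mean, sqrt(var)

print(induction(1, 0, 0))
print(induction(10, 4, 1), induction_brute_force(10, 4, 1))
```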

Binomial distribution formula
In the limiting case where N is a large number, the deduction distribution of i tends towards the binomial distribution with the probability $$P=\frac{I}{N}$$ as a parameter,


 * $$i\approx nP\left (1\pm\sqrt{\frac{\frac{1}{P}-1}{n}}\right )$$

Example: The population is big, the probability $$P=\frac{I}{N}=\frac{1}{2}$$, and the sample contains one item. n = 1 gives
 * $$i\approx \frac{1}{2}\pm\frac{1}{2}$$

confirming that the sample contains 0 or 1 special items, with equal probability.
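
A small Python sketch of the limit, assuming a fixed sample size n = 10 and P = 1/2 (both chosen only for illustration): as N grows, the finite-population estimate approaches the binomial one. The function names are hypothetical.

```python
from math import sqrt

def hypergeometric_estimate(N, n, I):
    """Finite-population (deduction) mean and sd of i."""
    mean = n * I / N
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, sd

def binomial_estimate(n, P):
    """i ≈ nP(1 ± sqrt((1/P - 1)/n)), returned as (mean, sd)."""
    mean = n * P
    return mean, mean * sqrt((1 / P - 1) / n)

print(binomial_estimate(1, 0.5))
for N in (20, 200, 2000, 20000):
    print(N, hypergeometric_estimate(N, 10, N // 2), binomial_estimate(10, 0.5))
```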

Beta distribution formula
In the limiting case where N is a large number, the induction distribution of $$P=\frac{I}{N}$$ tends towards the beta distribution
 * $$P\approx\frac{i+1\pm\sqrt{\frac{(i+1)(n-i+1)}{n+3}}}{n+2}.$$

The frequency probability solution to this problem is $$P \approx \frac{i}{n}$$. The probability is estimated by the relative frequency.

Example: The population is big and the sample is empty. n = i = 0 gives
 * $$P \approx(50 \pm 29)\%$$.

The frequency probability solution to this problem is $$P \approx \frac{i}{n}=\frac{0}{0}$$, which is undefined.
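
The beta formula is easy to evaluate directly. A minimal sketch, with the function name `beta_estimate` and the second test case (n = 10, i = 3) chosen here purely for illustration:

```python
from math import sqrt

def beta_estimate(n, i):
    """P ≈ (i+1 ± sqrt((i+1)(n-i+1)/(n+3))) / (n+2), returned as (mean, sd)."""
    mean = (i + 1) / (n + 2)
    sd = sqrt((i + 1) * (n - i + 1) / (n + 3)) / (n + 2)
    return mean, sd

# Empty sample: the estimate is still defined, unlike i/n
mean, sd = beta_estimate(0, 0)
print(f"P ≈ ({100 * mean:.0f} ± {100 * sd:.0f})%")

# Non-empty sample: compare with the relative frequency i/n
print(beta_estimate(10, 3), 3 / 10)
```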

Poisson distribution formula
In the limiting case where $$\frac{N}{n}$$ and $$n$$ are large numbers, the deduction distribution of i tends towards the Poisson distribution with the intensity $$M=\frac{nI}{N}$$ as a parameter,


 * $$i \approx M \pm \sqrt{M}$$

Example: The population is big and the sample is big, and the intensity $$M=\frac{nI}{N}=1$$ gives
 * $$i\approx 1 \pm 1$$.
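
To see where M ± √M comes from, one can sum the Poisson probabilities numerically. This is only a sketch: the function name `poisson_moments` and the truncation at 200 terms are arbitrary choices that are adequate for small M.

```python
from math import exp, sqrt

def poisson_moments(M, terms=200):
    """Mean and sd of a Poisson distribution with intensity M, by summing the pmf."""
    p = exp(-M)            # P(i = 0)
    mean = 0.0
    second = 0.0
    for k in range(terms):
        mean += k * p
        second += k * k * p
        p *= M / (k + 1)   # P(i = k + 1) obtained from P(i = k)
    return mean, sqrt(second - mean * mean)

print(poisson_moments(1.0))
print(poisson_moments(4.0))
```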

Gamma distribution formula
In the limiting case where $$\frac{N}{n}$$ and $$n$$ are large numbers, the induction distribution of $$M=\frac{nI}{N}$$ tends towards the gamma distribution with i as a parameter:


 * $$M \approx i+1 \pm \sqrt{i+1}.$$

Example: The population is big and the sample is big but contains no special items. i = 0 gives
 * $$M\approx 1 \pm 1$$.

The frequency probability solution to this problem is $$M\approx 0$$, which is misleading. Even if you have not been wounded you may still be vulnerable.
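
A numerical sketch of the gamma case, assuming the induction density of M is proportional to $$M^i e^{-M}$$ (a gamma distribution with shape i + 1, which corresponds to a flat weighting of M beforehand). The function name, the integration cutoff and the step count are illustrative choices, suitable only for small i.

```python
from math import exp, factorial, sqrt

def gamma_moments(i, upper=80.0, steps=80000):
    """Mean and sd of M under the density M**i * exp(-M) / i!, by the midpoint rule."""
    h = upper / steps
    norm = mean = second = 0.0
    for k in range(steps):
        m = (k + 0.5) * h
        w = m ** i * exp(-m) / factorial(i) * h
        norm += w
        mean += m * w
        second += m * m * w
    mean /= norm
    second /= norm
    return mean, sqrt(second - mean * mean)

print(gamma_moments(0))   # compare with 1 ± 1
print(gamma_moments(3))   # compare with 4 ± 2
```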