User:Tenoc.1776/StatProbSet1

Problem 1
After sorting the states by population, I chose intervals of roughly the same size and used millions (M) as the unit.

See ProblemSet1_LucasLaib_distr-table.pdf for the distribution table and ProblemSet1_LucasLaib_hist.png for the histogram (not perfect due to temporary technical limitations).

2a
$$A(y)=\frac{1+2+1+3+1+4+1+2+2+3}{10}=2$$

2b
$$\sigma(y)=\sqrt{\frac{4*(1-2)^{2}+3*(2-2)^{2}+2*(3-2)^{2}+(4-2)^{2}}{10}}=\sqrt{\frac{10}{10}}=1$$
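The average and the standard deviation of this data set can be cross-checked numerically. The following is a minimal Python sketch, assuming the ten values listed in 2a and the population form of the standard deviation (division by the number of values, as in the formula above). The variable names are illustrative only.

```python
from statistics import mean, pstdev

# The ten observations used in 2a.
y = [1, 2, 1, 3, 1, 4, 1, 2, 2, 3]

avg = mean(y)      # arithmetic average: sum of the values divided by their count
sigma = pstdev(y)  # population standard deviation (divides by n, not n - 1)

print(avg, sigma)
```

Note that `statistics.stdev` would instead divide by n - 1 (the sample form), which gives a slightly larger value for the same data.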

2c
$$Cov(x,y)=\frac{2*(1-2)(1-\sqrt{\frac{3}{5}})+(1-2)(2-\sqrt{\frac{3}{5}})+(1-2)(3-\sqrt{\frac{3}{5}})+2*(3-2)(2-\sqrt{\frac{3}{5}})+(4-2)(3-\sqrt{\frac{3}{5}})}{3}=\frac{3}{3}=1$$

$$\varrho(x,y)=\frac{Cov(x,y)}{\sigma(x) \sigma(y)}=\frac{1}{1*2}=\frac{1}{2}$$

3a
$$z_m=\frac{600-500}{120}=\frac{5}{6} \approx 0.83$$

$$P(x_m > 600) = 1 - \Phi(0.83) = 1 - 0.7967 = 0.2033 = 20.33\%$$

3b
$$z_f=\frac{600-460}{120}=\frac{7}{6} \approx 1.17$$

$$P(x_f > 600) = 1 - \Phi(1.17) = 1 - 0.8790 = 0.1210 = 12.10\%$$
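The table lookups in 3a and 3b can be reproduced with the standard normal distribution from Python's standard library. This is only a sketch under the assumptions used above (normally distributed values, means 500 and 460, standard deviation 120, threshold 600). The helper name `tail_probability` is hypothetical, and z is rounded to two decimals to mimic a printed table.

```python
from statistics import NormalDist

def tail_probability(threshold, mean, sd):
    """Return P(X > threshold) for X ~ N(mean, sd^2).

    z is rounded to two decimals, as it would be when read from a printed table.
    """
    z = round((threshold - mean) / sd, 2)
    return 1 - NormalDist().cdf(z)

p_m = tail_probability(600, 500, 120)  # 3a: mean 500
p_f = tail_probability(600, 460, 120)  # 3b: mean 460

print(f"{p_m:.4f} {p_f:.4f}")
```

Without the rounding step the probabilities shift slightly in the third decimal place, which is the usual gap between table values and exact evaluation.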

4a and 4b
The average (arithmetic mean) is the sum of a set of values divided by the number of values in that set; the median is the value in the middle of the set after it has been sorted. The difference becomes clear if the values describe, for example, the income distribution among workers. The average then describes the situation in which the total income is distributed evenly across all workers, and in that case the median equals the average. If in reality the money is unevenly distributed, the median generally differs from the average: if the richer workers earn a larger share of the money than the poorer workers, the median lies below the average.
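The income example can be made concrete with a small Python sketch. The two income lists below are invented purely for illustration (they are not taken from the problem set); both have the same total, so the arithmetic average (`statistics.mean`) is identical, while the middle value of the sorted list (`statistics.median`) differs.

```python
from statistics import mean, median

# Hypothetical monthly incomes of five workers; both lists sum to 15000.
even_incomes = [3000, 3000, 3000, 3000, 3000]    # money distributed evenly
skewed_incomes = [1000, 1500, 2000, 2500, 8000]  # one worker earns a large share

for label, incomes in [("even", even_incomes), ("skewed", skewed_incomes)]:
    # mean: total divided by count; median: middle value after sorting
    print(label, mean(incomes), median(incomes))
```

In the even case both measures coincide; in the skewed case the middle value falls below the average because the single large income pulls the average up.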