Wikipedia:Reference desk/Archives/Mathematics/2009 July 31

= July 31 =

Bay window measurement
I'm measuring my backyard and I've run into a problem with a bay window area. There are three segments, each being 3' 5". The middle segment sticks out from the main exterior wall by 2'. What formula could I use to find the total width of the bay, taking angles into consideration, and then find what degree the angles should be at (for the left and right segments)? 3' 8" * 3 obviously wouldn't work. Here's an image that might help: http://img354.imageshack.us/img354/5492/confusion.jpg

I'd go out and measure it myself, but it's too dark, and it's also a torrential rainfall out there right now.

--69.154.119.41 (talk) 00:15, 31 July 2009 (UTC)


 * Your best bet is to measure it when the weather gets better, since the abstract math may leave out some stuff like thickness of frames. Also, your diagram is confusing--are the outer segments supposed to be angled towards the wall?  If you imagine a right triangle whose hypotenuse is the 3'8" piece of glass, and the side adjacent to the angle at the end runs along the wall, and the opposite side of the angle is the 2' distance from the wall to the middle segment, then the ratio "opposite over hypotenuse" (i.e. 2' / 3'8") is the sine of the angle.  Since you know the sine, you can find the angle with your calculator's arcsine function (the arcsine "undoes" the sine function, like squaring undoes a square root).  If you want to know how much wall space the tilted segment covers, the ratio "adjacent over hypotenuse" is the cosine of the angle.  There are a bunch of mathematical relationships ("trig identities") between the sine and the cosine, but conceptually it's simplest in this case to just use a calculator, figure out the sine, use the arcsine button to find the angle, then use the cosine button to get the ratio.  The length along the wall will be 3'8"*(cos x) where x is the angle.  Preceding unsigned comment added by 67.117.147.249 (talk) 01:19, 31 July 2009 (UTC)

The floor plan can be viewed as a rectangle and two right triangles: the rectangle is 2' by 3'5". Each right triangle has a 3'5" hypotenuse and a 2' leg. The question is how long the other leg is. So use the Pythagorean theorem:
 * $$\text{other leg} = \sqrt{(41\text{ inches})^2 - (24\text{ inches})^2} \cong 33 + 1/4\text{ inches}. \, $$

Thus the whole width of the bay is about
 * $$ (33 + 1/4) + 41 + (33 + 1/4) = 107 + 1/2\text{ inches}.\, $$

Michael Hardy (talk) 02:47, 31 July 2009 (UTC)
 * Wait: Your diagram says 3'8", but your text says 3'5". For 3'8", change the numbers in what I did above.  But use the same method.  The bottom line is a bigger number. Michael Hardy (talk) 02:50, 31 July 2009 (UTC)
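
As a rough check of the two approaches above (Pythagorean theorem for the width, arcsine for the angle), here is a minimal Python sketch. The function name `bay_geometry` is made up for illustration, and it assumes an idealized floor plan with no frame thickness. It runs both the 3'5" (41 inch) and 3'8" (44 inch) readings, since the post and the diagram disagree.

```python
import math

def bay_geometry(segment_in, depth_in):
    """Idealized bay: three equal segments, middle one parallel to the wall.

    Returns (side_run, total_width, wall_angle_deg), lengths in inches.
    """
    # Each angled segment is the hypotenuse of a right triangle whose
    # one leg is the depth of the bay; the other leg runs along the wall.
    side_run = math.sqrt(segment_in**2 - depth_in**2)
    total_width = segment_in + 2 * side_run
    # Angle between an angled segment and the wall: sin(angle) = depth / segment.
    wall_angle_deg = math.degrees(math.asin(depth_in / segment_in))
    return side_run, total_width, wall_angle_deg

for label, segment in (("3'5\"", 41), ("3'8\"", 44)):
    run, width, angle = bay_geometry(segment, 24)
    print(f"{label}: side run {run:.2f} in, total width {width:.2f} in, "
          f"angle to wall {angle:.1f} deg")
```

For the 41 inch reading this gives a total width of about 107.5 inches and an angle of roughly 36 degrees between each side segment and the wall; real measurements will differ by the thickness of the frames.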

Picking four true properties of population mean - identified three, but stuck on the last.
Statistics, population mean - question that I'm 75% done with, but need one more:


 * A study reports that college students work, on average, between 4.63 and 12.63 hours a week, with confidence coefficient .95. Which of the following statements are correct?


 * MARK ALL THAT ARE TRUE. There are four correct answers. (MARK ALL)


 * A. The interval was produced by a technique that captures mu 95% of the time.
 * B. 95% of all college students work between 4.63 and 12.63 hours a week.
 * C. 95% of all samples will have x-bar between 4.63 and 12.63.
 * D. The probability that mu is between 4.63 and 12.63 is .95.
 * E. 95% of samples will produce intervals that contain mu.
 * F. The probability that mu is included in a 95% CI is .95.
 * G. We are 95% confident that the population mean time that college students work is between 4.63 and 12.63 hours a week.

I've determined that A, F, and G are TRUE. I'm also sure that D is FALSE (each sample will likely have a different mean and end points).

So that means that one of B, C, and E is true, and the remaining two are false. Any insights on which of B, C, and E is the true property? 70.169.186.78 (talk) 03:02, 31 July 2009 (UTC)
 * F is equivalent to D. You say D is false, therefore F is not true either. (Igny (talk) 03:37, 31 July 2009 (UTC))


 * I'm not so sure. See -
 * A. The interval was produced by a technique that captures mu 95% of the time. TRUE - That is what "95% confidence" means.
 * B. 95% of all college students work between 4.63 and 12.63 hours a week.
 * C. 95% of all samples will have x-bar between 4.63 and 12.63.
 * D. The probability that mu is between 4.63 and 12.63 is .95. FALSE - The population mean will be between the end points of the interval for 95% of all samples. But since each sample will have a different mean, the end points of those intervals will also change. This interval either includes mu, or it doesn't. Once I take a sample and compute x-bar, there is no more probability involved.
 * E. 95% of samples will produce intervals that contain mu.
 * F. The probability that mu is included in a 95% CI is .95. TRUE - As long as we talk about the probability of AN interval, not THIS interval.
 * G. We are 95% confident that the population mean time that college students work is between 4.63 and 12.63 hours a week. TRUE - We are trying to estimate the population mean.

-- 70.169.186.78 (talk) 03:50, 31 July 2009 (UTC)
 * You convinced me as long as you are talking about a random confidence interval, and not random (fixed) mu. (Igny (talk) 04:24, 31 July 2009 (UTC))


 * A study reports that college students work, on average, between 4.63 and 12.63 hours a week, with confidence coefficient .95. Which of the following statements are correct? The number of hours a particular student works in a particular week depends on which student and which week. You can average over students (giving an average value for each week), or over weeks (giving an average value for each student), or over both (giving one average). It is unclear which of those is the case here. Mentioning confidence indicates that the averages in question are unknown population averages rather than known sample averages. A sample of size N has a mean value &mu; and a standard deviation &sigma;. These sample numbers, N, &mu; and &sigma;, are known to the authors of the study, but not revealed to you. From these numbers the 95% confidence interval limits for the population average, 4.63 hours and 12.63 hours, are computed, and you can try to compute backwards. The population average is estimated to be close to the sample average, which must have been (4.63+12.63)/2 = 8.63 hours. The sample average is supposed to be normally distributed with a standard deviation equal to the population standard deviation divided by the square root of N. (See Normal distribution). So the difference between the sample average and the population average is normally distributed with mean zero and standard deviation &sigma;/&radic;N. The total width of the 95% confidence interval, 12.63&minus;4.63 = 8 hours, is about four times this standard deviation, so we have 4&sigma;/&radic;N = 8 hours. Assuming that no student works less than 0 hours or more than 40 hours a week, the sample standard deviation is at most 40&middot;&radic;(p(1&minus;p)) where p = 0.22 (such that &mu; = p&middot;40+(1&minus;p)&middot;0 = 8.63), so &sigma; &le; 16, approximately. (Equality if only the extremes 0 hours and 40 hours are observed). So 4&middot;16/&radic;N &ge; 8 hours. So N &le; 64. It must have been a very small study.
The decimals on 4.63 and 12.63 hours give an unjustified impression of precision; the population average is given with a huge inaccuracy; and the variation of the observations is not reported. So B is false. Bo Jacoby (talk) 20:21, 31 July 2009 (UTC).
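
The back-calculation above can be sketched in a few lines of Python. The 0 to 40 hour range and the "width is about four standard errors" rule are the assumptions made in the post; the variable names are only illustrative. Without rounding, the bound on the standard deviation comes out near 16.5 rather than 16, so the bound on N is a little under 70 rather than 64, which does not change the conclusion that the study was small.

```python
import math

lower, upper = 4.63, 12.63   # reported 95% interval, hours per week
max_hours = 40.0             # assumed maximum weekly hours (assumption from the post)

sample_mean = (lower + upper) / 2      # midpoint of the interval
std_error = (upper - lower) / 4        # width is about 4 standard errors

# The largest possible standard deviation for values in [0, max_hours] with
# this mean occurs when only the extremes 0 and max_hours are observed.
p = sample_mean / max_hours
sigma_max = max_hours * math.sqrt(p * (1 - p))

# std_error = sigma / sqrt(N), so N = (sigma / std_error)^2 <= (sigma_max / std_error)^2
n_max = (sigma_max / std_error) ** 2

print(f"sample mean = {sample_mean:.2f} h, standard error = {std_error:.2f} h")
print(f"sigma is at most {sigma_max:.2f} h, so N is at most about {n_max:.0f}")
```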

B is absolutely false. And I was surprised the first time I saw students interpreting it that way. D is false and F is true if you construe the words the way they're usually construed, but there I have some bones to pick. As someone pointed out, the question is whether we're talking about confidence intervals in general or about THIS interval. That's what "Igny" seems to be missing. F is true of confidence intervals in general, but not of any one particular confidence interval, unless we do some funny and somewhat elaborate revisions of our concept of probability. Where it gets involved is that maybe we SHOULD do some strange and elaborate revisions of the concept, but that's more than I could get into here. E is true, and it's just another way of restating the statement about what the 95% probability means. Michael Hardy (talk) 23:56, 31 July 2009 (UTC)
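
Statements A and E describe the procedure, not one particular interval, and that can be illustrated by simulation. The sketch below assumes a normal population with a known standard deviation; the values of `mu`, `sigma` and `n` are invented purely for illustration and are not taken from the study. It draws many samples, builds a 95% interval from each, and counts how often the fixed population mean falls inside.

```python
import random
import statistics

def coverage(mu=8.63, sigma=10.0, n=25, trials=5000, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)   # about 1.96
    half_width = z * sigma / n ** 0.5            # known-sigma interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # mu is fixed; the interval endpoints are what vary between samples.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / trials

print(f"share of intervals containing mu: {coverage():.3f}")
```

The printed share should be close to 0.95. Any single interval, such as 4.63 to 12.63, either contains mu or does not, which is the distinction between D and F discussed above.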

hypothesis tests
Q2:

For a test of Ho: p = 0.5, the Z-test statistic equals -1.52. Find the p-value for Ha: p < 0.5.


 * A. 0.0643
 * B. 0.0668
 * C. 0.9357
 * D. 0.9332
 * E. 0.1286
 * F. 0.1336

I looked it up in the z-table and found a value of 0.0643 (A) for the area under the standard normal curve. But is this the solution, or is there something more? 70.169.186.78 (talk) 20:03, 31 July 2009 (UTC)


 * Looks OK. But your use of the letter p makes me wonder about whether it involved a population proportion, and whether you should therefore think about using a continuity correction.  Depends on context: if it's an exercise from a textbook, then looking at the exact phrasing of the question might quickly settle the matter. Michael Hardy (talk) 00:01, 1 August 2009 (UTC)
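
The table lookup can be reproduced with Python's standard library. This is a minimal sketch without any continuity correction, so it matches the z-table approach only.

```python
from statistics import NormalDist

z = -1.52
# For Ha: p < 0.5 the p-value is the left-tail area below the observed z.
p_value = NormalDist().cdf(z)
print(f"one-sided p-value = {p_value:.4f}")
```

A two-sided alternative would double this area, which is where options E and F come from.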

Bilateral statistical hypothesis test
Homework question, sorry. I just can't figure something out.

I'm assuming that I need to do a bilateral statistical hypothesis test.

Info: [...]In 1993-1994, a little more than half (52 %) of foreign exchange students were from Asia.[...]

Assuming that a random sample of 1,346 foreign students contains 756 students from Asia, in 1996.

a) With a significance level (not sure of the term in English) of 5 % can we conclude that the proportion of Asian foreign exchange students, in 1996, is 52 % ?

What I did:

1) H0: µ = 0.52

H1: µ ≠ 0.52

2) The significance level is 5 %

3) n = 756 ≥ 30

4) Since the size of the sample is n = 756 ≥ 30, we can use the normal distribution.

5) If Zx < -1.96 or if Zx > 1.96, the proposed average will be rejected and consequently the alternative hypothesis H1 will be accepted. If -1.96 ≤ Zx ≤ 1.96, the average proposed by the null hypothesis H0 will not be rejected.

6) Calculations: http://i31.tinypic.com/xljhc9_th.jpg

However, the standard deviation is given NOWHERE. How am I supposed to calculate Zx? Is there an error in the manual? Rachmaninov Khan (talk) 20:02, 31 July 2009 (UTC)


 * You're running a z-test for proportions. The putative standard deviation is computed as $$\sqrt{\frac{p(1-p)}{n}}$$ (cf. variance of a variable under a binomial distribution). You have the corresponding formula for the statistic z in statistical hypothesis testing, one-proportion z-test. Pallida  Mors  05:48, 1 August 2009 (UTC)

I can't say how thankful I am. Thanks again!!! Rachmaninov Khan (talk) 15:08, 1 August 2009 (UTC)
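
For reference, the one-proportion z-test described in the reply can be sketched as follows. It takes n to be the full sample of 1,346 students (756 is the number from Asia, not the sample size), which is how the question statement reads, and it uses the hypothesized proportion 0.52 in the standard deviation.

```python
import math

p0 = 0.52    # proportion under H0
n = 1346     # sample size
x = 756      # students from Asia in the sample

p_hat = x / n
# Standard deviation of the sample proportion when H0 is true.
std_dev = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / std_dev

critical = 1.96                 # two-sided test at the 5% level
reject_h0 = abs(z) > critical

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, reject H0: {reject_h0}")
```

With these numbers z is a little above 3, so it lies outside the interval from -1.96 to 1.96 and H0 is rejected at the 5% level.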

Finding sample size
We want to determine the true average number of drinks University of Michigan students have over a weeklong period. Assume the standard deviation is ~6.3. How many students should be sampled to be within 0.5 drink of population mean with 95% probability?

Is it 610? 70.169.186.78 (talk) 04:39, 31 July 2009 (UTC)


 * Assuming it's a Normal distribution you can work out p(drinks), and therefore p(drinks-averagedrinks > 0.5) for 1 sample.
 * You could use a computer to calculate (integrate) the same probability for 2 samples, e.g. integrate over all p(x)p(2d-x) to find p(d) (d=drinks)
 * And repeat that process to get the probability for 3, 4, 5 etc. samples until p(drinks-meandrinks > 0.5) is less than 2.5%
 * Finding the mathematical formula is beyond me. Maybe someone else will know. 83.100.250.79 (talk) 22:53, 31 July 2009 (UTC)


 * See also Standard error (statistics), Normal_distribution 83.100.250.79 (talk) 23:03, 31 July 2009 (UTC)

The answer above misses the point completely and is entirely wrong.

The standard deviation of the sampling distribution of the sample average is &sigma;/&radic;n, where n is the sample size and &sigma; is the population standard deviation, in this case 6.3. Since you've asked for 95%, you get 1.96 out of the normal table (that's the 97.5th percentile, so 2.5% is above that, and another 2.5% is below &minus;1.96, which leaves 95% between ±1.96). So you need
 * $$ 1.96\cdot{\sigma \over \sqrt{n}} = 1.96\cdot{6.3 \over \sqrt{n}} \le 0.5. $$

That tells you that
 * $$ \sqrt{n} \ge {1.96\sigma \over 0.5}. $$

and so on....... Michael Hardy (talk) 00:12, 1 August 2009 (UTC)
 * Actually it's not entirely wrong - it's a computational method, also the links link to exactly the answer you gave :) 83.100.250.79 (talk) 00:24, 1 August 2009 (UTC)
 * You wrote "p(drinks-averagedrinks > 0.5)". The idea that this is what was being asked about is entirely wrong. Michael Hardy (talk) 01:00, 1 August 2009 (UTC)
 * Is this better :
 * p(drinksaverage after n samples-averagedrinks > 0.5) is less than 2.5%
 * I can't see the confusion, it's what is being asked? (assuming the distribution is symmetrical) 83.100.250.79 (talk) 12:57, 1 August 2009 (UTC)
 * That's it if you use the absolute value of the difference, not just the difference, and if by "average" you mean the (unobservable) population average. And you're using the word "sample" incorrectly.  There's only one sample, which includes n observations. Michael Hardy (talk) 16:45, 1 August 2009 (UTC)
 * ok buddy, I'll bow to your superior knowledge of naming conventions - but do I actually need the absolute value - that's why I used 2.5% (instead of 5%), and commented that the distribution is symmetrical. 83.100.250.79 (talk) 17:28, 1 August 2009 (UTC)

OK, I see what you mean: no absolute value needed when you've got "2.5%" there. Michael Hardy (talk) 17:32, 1 August 2009 (UTC)
 * oh silly me - I've just realised that they may not know the mean.. 83.100.250.79 (talk) 18:59, 1 August 2009 (UTC)

Clarification: The statement we've derived is that
 * $$ \Pr\left( - 1.96\frac{6.3}{\sqrt{n}} < \overline{X} - \mu < 1.96\frac{6.3}{\sqrt{n}} \right) = 0.95. $$

But now we want to rearrange it like this:
 * $$ \Pr\left( \overline{X} - 1.96\frac{6.3}{\sqrt{n}} < \mu < \overline{X} + 1.96\frac{6.3}{\sqrt{n}} \right) = 0.95, $$

with the unobservable population mean &mu; in the middle, and only observable quantities as the endpoints of the interval. That observable interval is the 95% confidence interval for the unobservable population mean &mu;. Michael Hardy (talk) 17:38, 1 August 2009 (UTC)
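
To finish the "and so on" step of the sample-size inequality earlier in this thread, here is a minimal Python sketch. It assumes, as the question does, that the population standard deviation is 6.3 and that the sample mean is approximately normal; the variable names are only illustrative.

```python
import math
from statistics import NormalDist

sigma = 6.3        # assumed population standard deviation (from the question)
margin = 0.5       # desired maximum distance from the population mean
confidence = 0.95

# 97.5th percentile of the standard normal, about 1.96.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# z * sigma / sqrt(n) <= margin  is equivalent to  n >= (z * sigma / margin)^2
n = math.ceil((z * sigma / margin) ** 2)
print(f"required sample size: {n}")
```

Rounding up gives 610, which agrees with the figure proposed in the original question.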