Wikipedia:Reference desk/Archives/Mathematics/2014 April 2

= April 2 =

Is my statistics book wrong?
Hi

I've encountered the following question: "Three balls are chosen randomly from an urn containing 7 black balls and one yellow ball. What is the probability that a yellow ball was chosen (at least one time)?" The book claims the answer is C(7,2)/C(8,3) = 0.375. But doesn't this answer assume the black balls are distinguishable, while they are not? I've run a simulation of this problem in Python, and after 10,000,000 "experiments" the probability stabilizes at 0.330, not 0.375. Thanks! 212.179.46.23 (talk) 08:43, 2 April 2014 (UTC)
 * The question is ambiguous, but the "(at least one time)" condition implies to me that the balls are selected with replacement: i.e. choose one, put it back, choose another, etc. In that case the probability of choosing at least one yellow is 1 - (probability of choosing all black) = 1 - (7/8)^3 ~= 0.33, as in your simulation. If the balls are chosen without replacement then the answer given in the book is correct: there are C(8,3) possible choices, of which C(7,2) include the yellow ball. AndrewWTaylor (talk) 09:05, 2 April 2014 (UTC)
 * Yep I wonder if the original question included the "at least one time" bit? If it did then it certainly was posed wrong as it implies replacement. Dmcq (talk) 09:09, 2 April 2014 (UTC)
 * Sorry - It didn't include "at least one time", it's my fault. Yet, doesn't the answer in the book assume the black balls are distinguishable, while they are not? 212.179.46.23 (talk) 09:32, 2 April 2014 (UTC)
 * Whether or not the black balls are distinguishable has no effect on the answer. In general, assuming objects are distinguishable won't affect probability, it will only affect questions that begin "In how many ways can you..."--80.109.80.78 (talk) 09:52, 2 April 2014 (UTC)
 * Here's a way to calculate the answer without assuming distinguished balls: You have a 1/8 probability of drawing the yellow ball with your first draw and a 7/8 probability of drawing a black ball. If you draw a black ball, there are 7 remaining balls, so you have a 1/7 probability of drawing the yellow ball on your second draw, and a 6/7 probability of drawing a black.  If you drew a black for your first and second ball, you have a 1/6 probability of drawing the yellow with your third draw.  So (1/8) + (7/8)(1/7) + (7/8)(6/7)(1/6) = 3/8 = .375.--80.109.80.78 (talk) 09:57, 2 April 2014 (UTC)
 * That's a nice way of looking at it - basically a Tree diagram (probability theory). Generalising it shows that with $$r$$ choices (without replacement) from $$n$$ balls, one of which is yellow, the probability of choosing the yellow is $$r/n$$. I'm not sure whether that result is surprising or obvious... AndrewWTaylor (talk) 18:12, 2 April 2014 (UTC)
 * It's obvious if you think about it backwards: you draw your $$r$$ balls first, then the yellow one decides which ball it's going to be.--80.109.80.78 (talk) 19:41, 2 April 2014 (UTC)
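
The two readings of the question discussed above can be checked numerically. The following is a minimal sketch, not the original poster's simulation (which was not posted); the function names, the seed, and the trial count are assumptions chosen for illustration. It computes both exact answers and estimates each by simulation.

```python
import random
from fractions import Fraction
from math import comb

def p_yellow_without_replacement(n=8, r=3):
    """Exact P(yellow is among r draws) when 1 of n balls is yellow, no replacement."""
    return Fraction(comb(n - 1, r - 1), comb(n, r))

def p_yellow_with_replacement(n=8, r=3):
    """Exact P(at least one yellow in r draws) when each ball is put back."""
    return 1 - Fraction(n - 1, n) ** r

def simulate(n=8, r=3, trials=200_000, replace=False, seed=0):
    """Monte Carlo estimate; ball 0 plays the role of the yellow ball."""
    rng = random.Random(seed)
    balls = list(range(n))
    hits = 0
    for _ in range(trials):
        if replace:
            drawn = [rng.choice(balls) for _ in range(r)]
        else:
            drawn = rng.sample(balls, r)
        hits += 0 in drawn
    return hits / trials

if __name__ == "__main__":
    print("without replacement:", p_yellow_without_replacement(), simulate(replace=False))
    print("with replacement:   ", p_yellow_with_replacement(), simulate(replace=True))
```

The exact without-replacement value reduces to r/n for any n and r, which is the generalisation noted in the thread; a simulated value near 0.330 corresponds to drawing with replacement.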

Equation to fit a curve
I found a table of measured emissivities vs temperature for a metal in an old textbook. I plotted them on a graph at http://i60.tinypic.com/2cy2wj5.jpg. Is there a simple formula for e as a function of T that, with suitably chosen constants, will closely conform to this graph? Below about 1900 K, $$e = 0.0000338\,T^{1.18}$$ is an accurate fit and above about 2350 K, $$e = 0.00606\,T^{0.5}$$ is a close fit, with a smooth transition between 1900 and 2350 K. Unfortunately, the textbook does not give an explanation of why emissivity follows this curve, nor does it have any math. 120.145.97.4 (talk) 13:24, 2 April 2014 (UTC)


 * Emissivity is in general a function of the material, the surface condition of the material, the temperature of the material, the orientation of the material surface relative to the pyrometer, etc. People have related the empirical thermal dependence of emissivity to the empirical spectral dependence of emissivity, but there is no general theory that I know of for this. With regard to the plot, it is a mice nice smooth curve; such curves can be well-fit by for instance splines. --Mark viking (talk) 18:46, 2 April 2014 (UTC)


 * "Mice smooth" = "As curvature continuous as a mouse's tail" ? :-) StuRat (talk) 19:52, 2 April 2014 (UTC)


 * Unintentionally funny misspelling. :-) But it's true that mouse tails can be modeled in computer graphics with inverse kinematic splines.  --Mark viking (talk) 20:10, 2 April 2014 (UTC)


 * I see one inflection point, and no discontinuities in position, angle, or curvature, so a single polynomial arc of degree 3 ought to fit it fairly well, no need to go for a multi-arc spline. The general equation of such a curve is $$y = ax^3 + bx^2 + cx + d$$.  Customizing the variable names to your application we get:  $$e = aT^3 + bT^2 + cT + d$$.  A curve fitting program can find the values of the coefficients a, b, and c, and the constant d.  Note however, that extrapolation of the curve beyond the start and end data points is unreliable. StuRat (talk) 19:42, 2 April 2014 (UTC)


 * Yes, a cubic polynomial gives something like $$e = 0.19 + 0.00014\,(t - 1500) - (0.0002\,t - 0.2)^3$$ 84.209.89.214 (talk) 00:44, 3 April 2014 (UTC)


 * Thank you StuRat and 84.209.89.214, but the suggested polynomial does not fit very well at all. The constants suggested give a worst-case error of 23%.  Casual inspection shows that there are at least 3 points of inflection, because the function has an asymptotic minimum value, an asymptotic maximum value and a curvy slope in between.  120.145.97.4 (talk) 10:36, 3 April 2014 (UTC)


 * That wasn't apparent from the graph shown. If it has two more inflection points, try upping the polynomial by two degrees, to get:  $$e = aT^5 + bT^4 + cT^3 + dT^2 + eT + f$$. StuRat (talk) 11:55, 4 April 2014 (UTC)


 * I'm not going to use a 5th order polynomial - a cop-out method for the mentally lazy. If every scientist or engineer used multi-order polynomials or splines to fit measured data, we would have no understanding of the underlying physics of anything.  Nor would we efficiently spot errors in data taking.  Nor could we extrapolate beyond measurements.  And if you actually try the fit of the two separate equations I gave for the low and high temperature ranges, you would see, apart from the fact that emissivity always has (by definition) a minimum value and a maximum value (as Tardis pointed out), the curve has three distinct regions: a) a low temperature region (below ~1900 K) conforming precisely to $$aT^b$$, $$b>1$$; b) a high temperature region (above ~2350 K but less than ~3000 K) conforming to some equation I have been unable to identify but for which $$cT^d$$, $$d<1$$ is a reasonable fit for practical purposes; and c) a transition region between 1900 and 2350 K.  In other words, it is obvious that there must be at least 3 inflection points (because of the upper and lower limits and the change of exponent between 1900 and 2350 K), and by careful curve fitting, it is seen that there are actually 4 points of inflection.  Knowing that there are 4 points of inflection, I can at least try to identify physical reasons for them. 120.145.97.4 (talk) 11:46, 5 April 2014 (UTC)


 * Because emissivity is intrinsically bounded by 0 and 1 and yours is strictly increasing, it is natural to seek a fitting form that shares those properties. The obvious choice is the logistic function, with T scaled and shifted (and e scaled, since you needn't ever reach 1).  I had to guess values from your plot rather than using the actual data, but I got a reasonable fit with $$e=\frac{0.34717}{1+e^{(T-1401.03\text{ K})/541.171\text{ K}}}$$.  --Tardis (talk) 00:05, 3 April 2014 (UTC)


 * Tardis, thanks for that.  I presume you meant
 * $$e=\frac{0.34717}{1+e^{(1401.03\text{ K}-T)/541.171\text{ K}}}$$.
 * It isn't a good fit, with a worst-case error of 27%. Choosing constants of 0.347, 1400, 500 produces the best RMS error of 4.4% with a worst-case error of 8.2% - still not very good.  However, reading the logistic function article and links therein has given me some ideas to follow up on. 120.145.97.4 (talk) 02:48, 3 April 2014 (UTC)


 * One of the alternatives mentioned in the logistic function article is the Gompertz curve - which I had not previously heard of:-
 * $$e = a\,e^{b\,e^{cT}}$$


 * Trying it gives a good fit, RMS error of only 1.55% with constants of a=0.398, b=-3.4, c=-0.00102. However, a function that takes an exponent of an exponent does not seem a natural process, and a better fit would be nice.  Any ideas? 120.145.97.4 (talk) 10:17, 3 April 2014 (UTC)


 * Yes, of course; sorry for the sign error (it crept in while converting from gnuplot syntax). But I'm surprised by your statement of its error: is it for the dashed section of your curve (which I ignored)?  In any event, I'm crippled by having to guess points (and don't feel like trying to automate extracting them).  If you post the actual data, I'll see if it's really as hard to fit as you say.  --Tardis (talk) 00:40, 4 April 2014 (UTC)


 * The red dashed section is unimportant. The data in the textbook does not have it - I added it by guesstimation, knowing that emissivity asymptotes to a minimum value.  But the values at and just above 300 K are important, because what happens in practice is you measure things at laboratory ambient (about 300 K) or the lab oven (max temperature 400 K) and then correct for operating temperatures.  As measurement is in any case subject to error (especially if in a home laboratory using home-constructed instruments, e.g. mine), you don't want to build in more error.  The textbook data was obtained by careful measurement by the guarded concentric cylinder method and is accurate to the number of digits given.  Here's the data:-
 * [three columns of data - does anyone know how to format this in Wikipedia?]

 Temperature (K)  Textbook Value  0.347/(1+e^((1400-T)/500))
 ===============  ==============  ==========================
        0               -                   0.020
      100               -                   0.024
      200               -                   0.029
      300             0.032                 0.035
      400             0.042                 0.041
      500             0.053                 0.049
      600             0.064                 0.058
      700             0.076                 0.069
      800             0.088                 0.080
      900             0.101                 0.093
     1000             0.114                 0.108
     1200             0.143                 0.139
     1400             0.175                 0.174
     1600             0.207                 0.208
     1800             0.236                 0.239
     2000             0.260                 0.267
     2200             0.279                 0.289
     2400             0.296                 0.306
     2600             0.311                 0.318
     2800             0.324                 0.327
     2900             0.329                 0.331
     3000             0.334                 0.333
     3100             0.337                 0.336


 * 120.145.97.4 (talk) 07:45, 4 April 2014 (UTC)


 * I did some quick formatting above. Somebody else can make it into a proper wikitable, if they care to take the time. StuRat (talk) 12:03, 4 April 2014 (UTC)


 * The basic logistic fit to that data is $$e=\frac{0.34812}{1+e^{(1403.79\text{ K}-T)/540.196\text{ K}}}$$. The worst-case error is +25%, at the smallest temperatures/emissivities; since all errors are weighted equally, we would expect them to be relatively larger there.  We can combat this (since you say the fit near the bottom is important) by weighting the values proportionally to some power (I used -2) of themselves (or to the temperature): $$e=\frac{0.332847}{1+e^{(1319.96\text{ K}-T)/479.451\text{ K}}}$$ (10%, still at the bottom).  Or you can choose the relative differences as your metric by fitting the logarithms: $$e=\frac{0.333243}{1+e^{(1319.96\text{ K}-T)/480.910\text{ K}}}$$ (12%).  Your fit has a larger relative error (-10%) at $$T=700\text{ K}$$ than the +8.2% at the low end.  No one of these is "best", of course; we have to balance error, simplicity, and smoothness.  (A spline would go through all the data points smoothly but would sacrifice the simplicity; linear interpolation would sacrifice smoothness; etc.)  --Tardis (talk) 02:52, 5 April 2014 (UTC)
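
The error figures quoted in this thread can be reproduced, approximately, from the textbook values posted above. This is only a sketch: the helper names (`logistic`, `gompertz`, `worst_case`, `rms`) are illustrative, and measuring error relative to the textbook value is an assumption about how the posters computed their percentages.

```python
import math

# (temperature in K, textbook emissivity), as posted earlier in this thread
DATA = [(300, 0.032), (400, 0.042), (500, 0.053), (600, 0.064), (700, 0.076),
        (800, 0.088), (900, 0.101), (1000, 0.114), (1200, 0.143), (1400, 0.175),
        (1600, 0.207), (1800, 0.236), (2000, 0.260), (2200, 0.279), (2400, 0.296),
        (2600, 0.311), (2800, 0.324), (2900, 0.329), (3000, 0.334), (3100, 0.337)]

def logistic(a, t0, w):
    """Return the fit a / (1 + exp((t0 - T) / w)) as a function of T."""
    return lambda T: a / (1 + math.exp((t0 - T) / w))

def gompertz(a, b, c):
    """Return the fit a * exp(b * exp(c * T)) as a function of T."""
    return lambda T: a * math.exp(b * math.exp(c * T))

def relative_errors(fit):
    """List of (T, fitted / textbook - 1) over the table."""
    return [(T, fit(T) / e - 1) for T, e in DATA]

def worst_case(fit):
    """(T, relative error) at the point with the largest absolute relative error."""
    return max(relative_errors(fit), key=lambda pair: abs(pair[1]))

def rms(fit):
    """Root-mean-square of the relative errors."""
    errs = [err for _, err in relative_errors(fit)]
    return math.sqrt(sum(err * err for err in errs) / len(errs))

FITS = {
    "logistic, unweighted (Tardis)": logistic(0.34812, 1403.79, 540.196),
    "logistic, weighted (Tardis)": logistic(0.332847, 1319.96, 479.451),
    "logistic, 0.347/1400/500": logistic(0.347, 1400, 500),
    "Gompertz, 0.398/-3.4/-0.00102": gompertz(0.398, -3.4, -0.00102),
}

if __name__ == "__main__":
    for name, fit in FITS.items():
        T, err = worst_case(fit)
        print(f"{name:32s} worst {err:+.1%} at {T} K, RMS {rms(fit):.2%}")
```

Relative error is used here because the low-temperature points, which were said to matter most, have the smallest absolute values.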


 * Thanks for your help, Tardis. 10% error at 700 K is not ideal, but less important than the 8.2% error at 300 K.  Unless someone comes up with a formula that fits better, what I'll do is this:  Calculate using $$0.0000338\,T^{1.18}$$ and also using $$0.00606\,T^{0.5}$$, and take whichever is the lesser value at the temperature of interest.  This will be very accurate below 1900 K, pretty good above 1900 K, much simpler than splines, is easily "inverted" to find T as a function of emissivity, and more likely has some relevance to the underlying physics, at least below 1900 K.
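
The "take the lesser of the two power laws" rule, and its inversion, might be sketched as follows. The function names are hypothetical; the constants are the ones given in the thread. Because both branches increase with T, the minimum of the two is inverted by taking the larger of the two branch inverses.

```python
A_LOW, B_LOW = 0.0000338, 1.18   # low-temperature power law from the thread
A_HIGH, B_HIGH = 0.00606, 0.5    # high-temperature power law from the thread

def emissivity(T):
    """Lesser of the two power-law estimates at temperature T (kelvin)."""
    return min(A_LOW * T ** B_LOW, A_HIGH * T ** B_HIGH)

def temperature(e):
    """Invert emissivity(): larger of the two branch inverses, in kelvin."""
    t_low = (e / A_LOW) ** (1 / B_LOW)
    t_high = (e / A_HIGH) ** (1 / B_HIGH)
    return max(t_low, t_high)

def crossover():
    """Temperature at which the two power laws give the same value."""
    return (A_HIGH / A_LOW) ** (1 / (B_LOW - B_HIGH))

if __name__ == "__main__":
    for T in (300, 1000, 2400, 3000):
        e = emissivity(T)
        print(f"T = {T} K  e = {e:.4f}  recovered T = {temperature(e):.1f} K")
    print(f"branches cross near {crossover():.0f} K")
```

One consequence of this design is a kink in slope where the branches cross, which falls inside the 1900 to 2350 K transition region described earlier.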

Thank you for providing tabulated data. This 3rd order polynomial $$e = A + Bt + Ct^2 + Dt^3$$ with A=0.005488067, B=0.00006750678, C=5.912991e-8, D=-1.509679e-11 achieves MSE=0.0000071. This asymmetrical sigmoid $$e = D + \frac{A - D}{\left[1 + (t/C)^B\right]^M}$$ with A=0.02429779, B=1.920627, C=19239.42, D=0.3577777, M=93.93041 achieves MSE=0.0000024. A curve fitting tool is here. 84.209.89.214 (talk) 01:30, 6 April 2014 (UTC)


 * Thanks for that, 84.209.89.214. I don't think I will use your second equation, with its 8.2% error and high sensitivity to constant roundoff. 120.145.70.125 (talk) 05:27, 6 April 2014 (UTC)


 * Regarding your not wanting to use a 5th degree polynomial, there might be a split between how the curve theoretically should look and how it actually looks, in which case the 5th degree polynomial may work better than those equations which theoretically should fit exactly. For an example from another field, a ballistic track should theoretically follow a parabolic arc, but when you add in air resistance, wind, spin, and other effects, it may deviate significantly from that shape.  So, if trying to hit a specific target, you would do better to allow curves which can take those additional constraints into account, when plotting the previous trajectories.


 * In your case, I suggest you try a 5th degree polynomial, to see if it's a better fit, then you can make an informed decision on whether to use that or another equation. StuRat (talk) 15:28, 6 April 2014 (UTC)
 * No.  I don't doubt that with sufficient degree, a polynomial can be found that fits as closely as one desires - as you said, no discontinuities are apparent. There is everywhere only one value of e for any value of T and vice versa.  And as I said, using splines or a high-degree polynomial is a cop-out.  Your example of ballistic trajectories is a good one, but not for the reason you said.  If our only knowledge of aiming was to fit high-degree polynomials to what happened each time we fired a gun, we would learn nothing.  But someone (Newton) worked out the basic physics of mass, acceleration, and gravity.  That allowed some other person to say "Well there is some error, I'll work out how to correct for wind".  And then somebody else could come along and say "Well, it's better, but there is still some error.  I'll look at the aerodynamics of spin."  And now, since before World War 2, man has understood enough that he can fire dirty great military weapons with accuracy - he can predict the outcome before the gun is fired, each time, not just fit a curve after he's fired it.
 * In my case, I need a formula to calculate emissivity at a given temperature for tungsten. A polynomial may do the job.  But later I might need it for some other metal, a similar refractory metal or some other.  I might need to calculate the temperature for a given emissivity instead.  Accurate and well-corroborated measured data is available for tungsten due to its long and widespread use in the vacuum tube and incandescent lamp industries.  Similarly accurate and verified data is not readily available for other metals.  But if I find out the what and why of the points of inflection, I'm probably going to be home and hosed.
 * Another example is the variation of specific heat of gases with temperature. It's a curve that looks something like a rounded off square root sign - as temperature is increased, spec heat falls rapidly, then flattens out to a minimum value, then rises again steeply, then flattens out somewhat and continues to rise slowly and more slowly as temperature goes way up.  A 5th order polynomial fits it quite nicely.  But you need a different polynomial for each gas, and the coefficients change with pressure in complex ways.  However, some boffin realised that the basis of most of it is degrees of freedom of molecular Brownian motion.  Then someone else realised that molecular spin altered things quite a bit.  Then someone else looked at atomic bond stretching.  And now we can use a quite simple calculation to work out the spec heat for ANY gas, under ANY conditions of temperature and pressure.
 * 120.145.70.125 (talk) 00:27, 7 April 2014 (UTC)
 * Now we're getting into WP:RD/S territory. The simple explanation is the Stefan–Boltzmann law; the emissivity is the correcting factor, and is in general dependent on all manner of quantum mechanical processes.  (You might start a real study at Fresnel equations and complex refractive index.)  Our article only just touches on this, but it does have a link to Sakuma–Hattori equation, which appears to be the sort of thing you're looking for.  Unfortunately, it has many forms, and the "standard" choice is said to be relevant only to the low end of your data.  (This is good in that you care about that portion, but bad in that we have too little low-temperature data to make a good fit.)  --Tardis (talk) 03:36, 9 April 2014 (UTC)

Math question
Can anyone please quickly tell me the answer to :

PI * (365 / 0.6) + (1 * 10) = ? — Preceding unsigned comment added by 14.39.170.53 (talk) 21:55, 2 April 2014 (UTC)
 * 1921.135531 but did you really mean to write (1*10) ? 84.209.89.214 (talk) 23:18, 2 April 2014 (UTC)
 * I am assuming that the OP wants to know how to answer that type of question, not the answer itself. Besides, we don't do homework here. But I see nothing wrong with providing the method to do a problem. As such, I would suggest using Please Excuse My Dear Aunt Sally (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction) Order of Operations.

The provided question, I must say, is an iconic example of using the proper Order of Operations. For humor's sake, I like to pretend that the Order of Operations is a distinct medieval religious order. Father Superior Parenthesis is at the top of the Order, and his little monks (Brother Exponent, Brother Multiplication, etc.) fulfill the lower ranks. 140.254.227.76 (talk) 14:42, 4 April 2014 (UTC)
 * PI is a constant.
 * Do the operations in parentheses first.
 * Do multiplication second.
 * Do addition last.
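
The steps listed above can be followed one at a time. This short sketch uses Python's `math.pi` for PI; the intermediate variable names are only for illustration.

```python
import math

# Step 1: evaluate the parentheses.
quotient = 365 / 0.6   # (365 / 0.6)
product_in_parens = 1 * 10   # (1 * 10)

# Step 2: do the remaining multiplication.
scaled = math.pi * quotient   # PI * (365 / 0.6)

# Step 3: do the addition last.
result = scaled + product_in_parens

if __name__ == "__main__":
    print(f"{result:.6f}")
```

Evaluating the whole expression in one line, `math.pi * (365 / 0.6) + (1 * 10)`, follows the same order and agrees with the value 1921.135531 given earlier in this thread.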