Wikipedia:Reference desk/Archives/Mathematics/2015 March 12

= March 12 =

Approximate-value-with-margin-of-error as data type
What statistical software packages, if any, allow an approximate value with a specified error margin to replace an exact number in a data set, with calculations such as regression analysis automatically adjusting for them (e.g. least-squares regression being based on the Z-score of each point within its own confidence distribution, rather than the raw residual)? What other options, if any, are there when one needs to account for a mix of more-precise and less-precise measurements, for an inherently heteroskedastic measurement method (where the heteroskedasticity can't just be eliminated by nonlinear transformations), or for the inclusion of estimated or imputed values in a data set? Neon Merlin  05:48, 12 March 2015 (UTC)


 * If using a curve fitting method that allows for variable weighting of data points, you might weight those points with wider margins of error more lightly than those with smaller margins. Another option is to not use data points at all with particularly wide margins of error, and then observe if the solution happens to fall within that range anyway.  Since data fitting usually works better (quicker result, more likely to converge, & fewer inflections/lower degrees in the resulting curve) with fewer data points, this is a good option when it happens to work out.  StuRat (talk) 05:52, 12 March 2015 (UTC)
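The weighting idea above can be sketched as a straight-line fit in which each point's residual is divided by its own margin of error before squaring, so each point gets weight 1/sigma^2. This is only an illustration, not a reference to any particular statistics package: the function name `weighted_linear_fit` and the sample data are hypothetical, and the margins are assumed to be independent standard deviations.

```python
def weighted_linear_fit(xs, ys, sigmas):
    """Fit y = a + b*x by minimizing sum(((y - a - b*x) / sigma)**2).

    Each sigma is assumed to be the standard deviation of the
    corresponding y measurement, so each point gets weight 1/sigma**2.
    """
    ws = [1.0 / s ** 2 for s in sigmas]
    sw = sum(ws)
    # Weighted means of x and y.
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    # Weighted sums of squares and cross-products about those means.
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Hypothetical data: three precise points on y = 1 + 2x and one
# imprecise point far off the line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 20.0]
sigmas = [0.1, 0.1, 0.1, 10.0]
intercept, slope = weighted_linear_fit(xs, ys, sigmas)
print(intercept, slope)
```

With these made-up numbers the imprecise fourth point barely moves the fit, whereas an unweighted fit of the same data would give a slope of about 5.9 instead of roughly 2.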


 * I think it would help to look more carefully at what the margin of error means. Let's say you have two measurements A and B, with margins of error $$s_A$$ and $$s_B$$. Interpret that to mean that if the actual value is M, then the probabilities of getting measurements A and B are proportional to
 * $$e^{ -\frac{(A-M)^2}{2{(s_A)}^2} }$$ and $$e^{ -\frac{(B-M)^2}{2{(s_B)}^2} }.$$
 * Assuming independence, which may or may not be justified, then the joint distribution is proportional to
 * $$e^{ -\frac{(A-M)^2}{2{(s_A)}^2} -\frac{(B-M)^2}{2{(s_B)}^2}}.$$
 * The maximum likelihood estimator for M can then be found by minimizing
 * $$\frac{(A-M)^2}{2{(s_A)}^2} +\frac{(B-M)^2}{2{(s_B)}^2}$$
 * which, after a bit of first-year calculus, comes out as
 * $$M\left(\frac{1}{s_A^2}+\frac{1}{s_B^2}\right) = \frac{A}{s_A^2}+\frac{B}{s_B^2}.$$
 * In other words, the estimate for M is the weighted average of A and B with weights $$1/s_A^2$$ and $$1/s_B^2$$. I think this is what you would expect, but it assumes independent normal distributions. There could be a situation where the B result is obtained by rounding A, in which case you should just throw B out and stick with A as your estimate. Also, if A and B are themselves statistics based on other data, then it would be better to go back to the original data if possible. If you're going to make bread then it's better to start with flour than bread crumbs. --RDBury (talk) 13:18, 12 March 2015 (UTC)
 * When everything else fails, try brute force. In this case brute force means simulation. If you have some computation, Y=f(X), where X is random with some odd distribution, and f is nonlinear, then it may be hard to compute the distribution of Y. In that case, draw a sample (x_1, x_2, x_3, ..., x_10000) from the distribution of X, compute y_i = f(x_i) for i = 1, 2, ..., 10000, and estimate the mean, the standard deviation, and so on of Y from the y_i. Bo Jacoby (talk) 21:17, 12 March 2015 (UTC).
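The last two suggestions in this thread can be sketched in a few lines of Python, using only the standard library. The names `inverse_variance_mean` and `monte_carlo` and all the numbers are hypothetical, chosen for illustration. The first function assumes independent, normally distributed errors, as in the derivation above; the combined standard error it returns, sqrt(1 / sum of weights), is a standard result under those same assumptions and is not stated in the thread.

```python
import math
import random
import statistics


def inverse_variance_mean(values, sigmas):
    """Weighted average with weights 1/sigma**2 (independent normal errors assumed)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    # Standard error of the combined estimate under the same assumptions.
    return mean, math.sqrt(1.0 / total)


def monte_carlo(f, sampler, n=10000, seed=0):
    """Estimate the mean and standard deviation of Y = f(X) by simulation.

    `sampler(rng)` must return one random draw of X using the supplied
    random.Random instance.
    """
    rng = random.Random(seed)
    ys = [f(sampler(rng)) for _ in range(n)]
    return statistics.mean(ys), statistics.stdev(ys)


# Two hypothetical measurements of the same quantity: 10 +/- 1 and 12 +/- 2.
m, se = inverse_variance_mean([10.0, 12.0], [1.0, 2.0])
print(m, se)

# Hypothetical nonlinear computation: Y = X**2 with X ~ Normal(3, 0.5).
mean_y, sd_y = monte_carlo(lambda x: x ** 2, lambda rng: rng.gauss(3.0, 0.5))
print(mean_y, sd_y)
```

In the first example the more precise measurement gets four times the weight of the other, so the estimate (10.4) lands closer to 10 than to 12. The simulation makes no normality assumption about Y; any sampler for X can be substituted.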

Examples of arbitrariness in mathematics and statistics
An example of arbitrariness in statistics is the selection of the p-value. What other examples of arbitrariness exist in mathematics and statistics? 140.254.136.157 (talk) 13:52, 12 March 2015 (UTC)


 * In any type of estimation or approximate iterative solution, the accuracy of the answer which is considered acceptable is arbitrary. Approximate curve fitting is an example.  Using hill climbing to find a maximum/minimum is another.  Also in curve fitting, which outliers are far enough outside the norm to prune off is rather arbitrary. StuRat (talk) 16:07, 12 March 2015 (UTC)


 * p-values are not arbitrary; they are a specially defined quantity that gives you valuable information. Now, what p-value threshold should be used to reject the null hypothesis can seem somewhat arbitrary, but that is not a matter of math or statistics, but rather a matter of science and politics. See also these relevant XKCD cartoons: . SemanticMantis (talk) 20:37, 12 March 2015 (UTC)


 * In a broad view, mathematical theories are a collection of theorems and results that ultimately derive from a set of axioms, rules for combining those axioms, and a chosen system of logic. Insofar as the axioms are unprovable, the choice of which axioms to use is arbitrary. Of course there are more useful axioms and less useful axioms in any given context, depending on what the mathematician is trying to achieve, but the rules of the game are made up and arbitrary. --Mark viking (talk) 20:55, 12 March 2015 (UTC)
 * For more information on this, see Formalism (mathematics). To have your life changed forever (well, it worked for me!), read "Outlines of a Formalist Philosophy of Mathematics" (1951) by Haskell Curry (yes, that Haskell, and, yes, that Curry!). In the formalist tradition it is not unreasonable to claim that all of mathematics is completely arbitrary. Useful, sometimes, but completely arbitrary. RomanSpa (talk) 11:56, 14 March 2015 (UTC)