Wikipedia:Reference desk/Archives/Mathematics/2016 January 21

= January 21 =

General question on complex solutions to polynomial equations.
Assume a polynomial f from C to C of degree n has n distinct roots (no double roots). Draw green curves through those points where the real part of f(x) = 0, and red curves through those points where the imaginary part of f(x) = 0. The n solutions then occur where a green curve and a red curve cross. Is it possible for two solutions of f(x) = 0 to be on the same green curve and different red curves (or vice versa)? If so, are there any characteristics of f which would determine whether or not it occurs? Naraht (talk) 03:21, 21 January 2016 (UTC)


 * Do you mean that the green curves are the preimages of points on the imaginary axis, and the red curves are the preimages of the real axis? Can you describe your picture for, say, z^2+1?--JBL (talk) 03:24, 21 January 2016 (UTC)
 * Exactly. The red curve would be where z^2+1 = a+bi with b = 0, so z^2+1 = a, which means z^2 is a real number, so the red curve is the union of the real and imaginary axes. OTOH the green curves are those where z^2+1 = a+bi with a = 0, so z^2+1 = bi for some b, i.e. z^2 = −1 + bi; one such curve goes through i and is asymptotic to the lines Re(z) = Im(z) and Re(z) = −Im(z) (and a second curve goes through −i, asymptotic to the same lines). So in that case the two solutions share the same red curve. Naraht (talk) 04:20, 21 January 2016 (UTC)
 * Surely you're interested in more than degenerate cases such as $$x = 0$$, where the red and green curves are both simply the origin and the solution is the origin (and then your proposition is vacuously true).--Jasper Deng (talk) 16:28, 21 January 2016 (UTC)
 * I do not understand this comment. If the polynomial is f(z) = z then the two curves are the two axes and they intersect in a single point (the origin).  [And something similar will be true for any linear polynomial.]  For a quadratic polynomial, the curves will always be conic sections (in the example above, the red curve is a degenerate hyperbola x y = 0). --JBL (talk) 18:52, 21 January 2016 (UTC)
 * And that is why I should not be commenting here when I'm sleep deprived!--Jasper Deng (talk) 21:17, 22 January 2016 (UTC)
 * I'm still not entirely sure I understand the question, but if you eliminate the case where f has a critical point on one of the axes as in the above example, then the answer is no: each branch of the red and green curves contains exactly one point of intersection. To see this, let P be a point moving along one of the branches without changing direction. Then f(P) moves along one of the axes, also without changing direction since otherwise f would have a critical point where the change occurs. So f(P) can cross the origin only once, and therefore there is at most one point where the branch crosses the curve of the opposite color. In fact, since the "ends" of a branch corresponds to the "ends" of the corresponding axis, f(P) must cross the origin at least once. As a corollary, there must be n branches of each curve, though there are probably more direct ways of showing this. --RDBury (talk) 18:58, 21 January 2016 (UTC)
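A quick numeric sketch (mine, not part of the original thread) of Naraht's z^2 + 1 example above: the red curve is where Im(f) = 0, the green curve where Re(f) = 0, and both roots ±i sit on a single red branch, the imaginary axis.

```python
import math

# Hypothetical check of the z^2 + 1 example discussed above.
def f(z: complex) -> complex:
    return z * z + 1

# Both roots +/- i lie on the imaginary axis:
assert all(abs(f(r)) < 1e-12 for r in (1j, -1j))

# Along z = i*t, f(z) = 1 - t^2 is real, so the whole axis lies on the
# red curve Im(f) = 0 -- one branch containing two solutions.
assert all(abs(f(complex(0, t / 10)).imag) < 1e-12 for t in range(-50, 51))

# The green curve Re(f) = 0 is the hyperbola x^2 - y^2 + 1 = 0;
# e.g. the point x = 1, y = sqrt(2) lies on the branch through i.
assert abs(f(complex(1, math.sqrt(2))).real) < 1e-12
```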

Deciding whether some dice are fair or not
I want to discover whether some dice are fair, that is, whether each number has exactly a 1/6 chance of coming up. I let my pet monkey throw them 10 times and write down the results. With one die I got "6666666666", with the other "2351361633". After that I calculate the chance of getting each sequence exactly, which is 6^−10 in both cases. Am I any step nearer to knowing which die is truly fair? --Llaanngg (talk) 14:17, 21 January 2016 (UTC)


 * See Randomness tests. Loraof (talk) 14:43, 21 January 2016 (UTC)


 * Also, maybe you could do a Bayesian probability approach: Start with prior probabilities of 1/6 for each number, then look at the first observed digit and revise upward (according to a formula) the probability of that number and revise downward the probabilities of the other numbers; then observe another number and revise all the probabilities accordingly; and so on. After a certain number of rolls, you could take some kind of measure of how far from (1/6, 1/6, ..., 1/6) the updated set of probabilities is. Surely your first die will deviate from this a lot more than your second die. Loraof (talk) 15:14, 21 January 2016 (UTC)
 * As Loraof mentioned, Bayesian inference is the correct approach. Given that the die is fair, 6666666666 and 2351361633 are exactly as likely. But given that the die is unfair, 6666666666 is much more likely than 2351361633 - because an unfair die allows being biased towards 6, which suddenly makes 6666666666 have significant probability (whereas 2351361633 is still unlikely no matter what the bias is). So seeing 6666666666 is strong evidence in favor of being unfair. But it depends on the prior - if you strongly believe the die is fair, this result might not be enough to convince you otherwise (but it will weaken your confidence). On average, the more you roll the dice, the more you will be confident in the truth.
 * One problem is that "unfair die" is not a single possibility, there are many ways the die can be unfair. You can have a model where you assume that any bias would likely be subtle to avoid detection. So you assume 94% chance that the die is fair, and for each number, 1% chance that the die is biased so it lands on this number 20% of the times, and 16% on any other number. If you do Bayesian inference under this model, the longer a streak of 6's that you observe, the more you will be convinced that the die is biased towards 6. -- Meni Rosenfeld (talk) 15:32, 21 January 2016 (UTC)
 * Nicely explained, thanks. Dmcq (talk) 16:08, 21 January 2016 (UTC)
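Meni Rosenfeld's toy model above can be worked through exactly. The sketch below is my code (not from the thread), using his numbers: a 94% prior that the die is fair, and for each face a 1% prior that the die favors that face at 20% with 16% on each of the rest.

```python
from fractions import Fraction

# A sketch (my code, not a library API) of the mixture model described
# above: P(fair) = 94%; for each face k, a 1% prior that the die lands
# on k 20% of the time and on each other face 16% of the time.
def posterior_fair(counts):
    """counts[k] = observed occurrences of face k+1; exact arithmetic."""
    rolls = sum(counts)
    like_fair = Fraction(1, 6) ** rolls
    evidence = Fraction(94, 100) * like_fair
    for k in range(6):
        like_k = Fraction(1, 5) ** counts[k] * Fraction(4, 25) ** (rolls - counts[k])
        evidence += Fraction(1, 100) * like_k
    return Fraction(94, 100) * like_fair / evidence

sixes = [0, 0, 0, 0, 0, 10]   # 6666666666
mixed = [2, 1, 4, 0, 1, 2]    # 2351361633

# The streak is evidence against fairness; the mixed sequence is not.
assert posterior_fair(sixes) < posterior_fair(mixed)
# But with only ten rolls the prior still dominates, as noted above:
assert posterior_fair(sixes) > Fraction(1, 2)
```

With a stronger prior on fairness or a longer streak, the posterior moves accordingly, which is exactly the prior-dependence point made above.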


 * 2351361633 looks a little suspicious as well, because of the lack of 4s. The probability of seeing no 4s in 10 throws with a fair die is about 16%. So not a hugely significant result, but worth investigating further. Gandalf61 (talk) 16:11, 21 January 2016 (UTC)


 * There is the problem that you got that by inspecting the data afterwards. You might find there aren't any 1s in a second sequence or 2s in another. Dmcq (talk) 19:36, 21 January 2016 (UTC)
 * Thought of another way, in 10 throws there is a 72.8% chance that there will be at least one missing number. So the fact that there is a missing number in the result (which happens to be 4) really doesn't indicate much. -- Meni Rosenfeld (talk) 14:18, 24 January 2016 (UTC)
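Both figures quoted above can be checked by inclusion-exclusion over which faces never appear; a quick sketch (my code):

```python
from math import comb

# P(every face appears at least once in `rolls` throws of a fair die),
# by inclusion-exclusion over the set of missing faces.
def p_all_faces_appear(rolls=10, faces=6):
    total = sum((-1) ** j * comb(faces, j) * (faces - j) ** rolls
                for j in range(faces + 1))
    return total / faces ** rolls

# Gandalf61's figure: about a 16% chance of no 4s in 10 throws.
assert abs((5 / 6) ** 10 - 0.16) < 0.005
# Meni Rosenfeld's figure: about a 72.8% chance of some missing face.
assert abs((1 - p_all_faces_appear()) - 0.728) < 0.001
```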

Bayesian Chi-squared test
If the die is fair the probabilities are p_i = 1/6 ≈ 0.1667 for i = 1, 2, 3, 4, 5, 6. If you do not know whether the die is fair or not, then (p_1, …, p_6) has a Dirichlet distribution. When you have observed x_i occurrences of outcome i the parameters are
 * α_i = 1 + x_i.

The sum is
 * β = Σ_i α_i.

The mean value of p_i is
 * μ_i = α_i / β

The standard deviation of p_i is
 * σ_i = √(α_i (β − α_i)) / (β √(1 + β))

The deviation from fairness, measured in standard deviations, is
 * Z_i = (μ_i − 1/6) / σ_i

The chi-squared value is
 * χ²_5 = Σ_i Z_i²

noting that there are 5 degrees of freedom.

The observation 2351361633 means that
 * x = (2, 1, 4, 0, 1, 2)
 * α = (3, 2, 5, 1, 2, 3)
 * β = 16.
 * μ = ( 0.1875, 0.1250, 0.3125, 0.0625, 0.1250, 0.1875)
 * σ = ( 0.0947, 0.0802, 0.1124, 0.0587, 0.0802, 0.0947)
 * Z = ( 0.2201,−0.5195, 1.2972,−1.7743,−0.5195, 0.2201)
 * χ²_5 = 5.5

The p-value is about 0.36. The hypothesis that the die is fair cannot be rejected.

The observation 6666666666 means that
 * x = (0, 0, 0, 0, 0, 10)
 * α = (1, 1, 1, 1, 1, 11)
 * β = 16.
 * μ = ( 0.0625, 0.0625, 0.0625, 0.0625, 0.0625, 0.6875)
 * σ = ( 0.0587, 0.0587, 0.0587, 0.0587, 0.0587, 0.1124)
 * Z = (−1.7743,−1.7743,−1.7743,−1.7743,−1.7743, 4.6330)
 * χ²_5 = 37.2

The p-value is below 0.001. The hypothesis that this die is fair is rejected. Bo Jacoby (talk) 23:21, 21 January 2016 (UTC).
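The computation above is easy to reproduce; here is a short sketch (function name mine) that recovers both χ²_5 values from the raw sequences:

```python
from collections import Counter
from math import sqrt

# Bo Jacoby's Dirichlet-based statistic: a uniform prior (all alpha_i = 1),
# then alpha_i = 1 + x_i, beta = sum(alpha), and per-face Z-scores.
def dirichlet_chi2(sequence):
    counts = Counter(int(c) for c in sequence)
    alpha = [1 + counts.get(i, 0) for i in range(1, 7)]
    beta = sum(alpha)
    mu = [a / beta for a in alpha]
    sigma = [sqrt(a * (beta - a)) / (beta * sqrt(1 + beta)) for a in alpha]
    z = [(m - 1 / 6) / s for m, s in zip(mu, sigma)]
    return sum(v * v for v in z)

assert round(dirichlet_chi2("2351361633"), 1) == 5.5
assert round(dirichlet_chi2("6666666666"), 1) == 37.2
```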

Data compression
There is an easy way to test for a fair die. A fair die is one where each face has the same probability of coming up in a roll of the die. In other words, it is impossible to predict in advance which face will come up with probability greater than 1/n. So, using information theory, to check whether the die is "fair", just use a data compression program to compress the sequence of results of rolling the die. The string "66666666666666666666" can easily be compressed by the compression program, hence the die is NOT FAIR. Whereas "2351361633" cannot be compressed, hence the die is NOT NOT FAIR, or is FAIR. 175.45.116.66 (talk) 23:59, 21 January 2016 (UTC)


 * The string "111122223333444455556666" can easily be compressed by the compression program, still the die is FAIR. Bo Jacoby (talk) 10:08, 22 January 2016 (UTC).
 * But probably not IID. Jheald (talk) 10:27, 22 January 2016 (UTC)
 * Well, as explained earlier, that depends on the Bayesian prior. --Trovatore (talk) 11:00, 22 January 2016 (UTC)
 * The prior is the Dirichlet distribution for x_i = 0. This means that we don't know. If we knew, there would be no point in doing the test. Bo Jacoby (talk) 19:19, 22 January 2016 (UTC).
 * It's true that fairness and incompressibility are the same thing in theory. It is certainly not true that you can use ordinary compression programs as randomness tests. They are not theoretical ideal compressors. They just look for certain kinds of redundancy that are common in commonly compressed file types, but very different from the biases you tend to see in broken random number generators. You should use randomness tests that are designed to find those biases. -- BenRG (talk) 19:05, 22 January 2016 (UTC)
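BenRG's caveat is easy to see with a real compressor. The sketch below is my code (using Python's zlib as the "compression program"); it also reproduces Bo Jacoby's counterexample: a patterned sequence with perfectly fair face counts still compresses well.

```python
import random
import zlib

# Compression ratio as a crude randomness score (smaller = more redundant).
def ratio(s: str) -> float:
    return len(zlib.compress(s.encode())) / len(s)

rng = random.Random(0)                          # reproducible "fair" rolls
fair = "".join(rng.choice("123456") for _ in range(1200))
streak = "6" * 1200
patterned = "111122223333444455556666" * 50     # fair counts, obvious pattern

# The streak and the patterned string both compress far better than the
# fair sequence -- so compressibility alone cannot certify unfairness,
# and a general-purpose compressor only finds certain kinds of redundancy.
assert ratio(streak) < ratio(fair)
assert ratio(patterned) < ratio(fair)
```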

Applying the extended Euclidean algorithm to finite fields
The part of the extended Euclidean algorithm article explaining how to apply it to finite fields is extremely unclear. Given a polynomial, I simply want to be able to calculate its multiplicative inverse over some $$GF(2^n)$$. Here is what I have in Python: I need to write a function that takes a polynomial, the field's reduction polynomial, and the degree, and gives me a polynomial such that multiplying the two returns 1. Presumably, the extended Euclidean algorithm is what I would use, but it contains operations that I don't know how to compute for finite fields. Any help? — Melab±1 ☎ 23:00, 21 January 2016 (UTC)
 * GenerateReductionConstant seems to just mask off the high bit of its second argument, but does it in an unnecessarily complicated way. You could replace it with return Polynomial & ((1 << (Polynomial.bit_length() - 1)) - 1) for example. But you don't want to clear the high bit anyway: if you keep it set then A ^= Polynomial will clear it in A, which is what you want.
 * You can replace if A & 65536: A ^= ReductionConstant with A = min(A, A ^ Polynomial) to avoid having to find the high bit at all. And replace return C&65535 with return C since C will always be in the right range if the arguments are. And drop the Degree argument since you don't need it.
 * For the extended Euclidean algorithm you need to compute quotients and remainders of polynomials over Z_2 (not elements of GF(2^n)). You can do this using grade-school long division in base 2 without borrows/carries. Here's a worked example of dividing 11001 (x^4 + x^3 + 1) by 1001, giving a quotient of 11 and a remainder of 10:

           11
        -----
   1001)11001
        1001
        -----
         1011
         1001
         ----
           10
 * You already implemented carryless Russian peasant multiplication for the Galois product, and implementing long division is similar. -- BenRG (talk) 05:13, 22 January 2016 (UTC)
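Putting BenRG's pieces together, here is a hedged sketch (all function names are mine, not from the thread) of carryless long division and the extended Euclidean inverse, with polynomials over Z_2 stored as Python ints whose bits are the coefficients:

```python
def poly_divmod(a: int, b: int):
    """Quotient and remainder of a / b as polynomials over Z_2."""
    q = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift      # record the quotient bit
        a ^= b << shift      # subtract (XOR) the shifted divisor
    return q, a

def poly_mul(a: int, b: int) -> int:
    """Carryless 'Russian peasant' multiplication (no reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf_inverse(a: int, modulus: int) -> int:
    """Inverse of a modulo the field polynomial, via extended Euclid.
    Invariant: t1 * a == r1 (mod modulus) at every step."""
    t0, t1 = 0, 1
    r0, r1 = modulus, a
    while r1 != 1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, t0 ^ poly_mul(q, t1)   # subtraction is XOR over Z_2
    return t1

# The worked example above: 11001 / 1001 -> quotient 11, remainder 10.
assert poly_divmod(0b11001, 0b1001) == (0b11, 0b10)

# Sanity check in the AES field GF(2^8), x^8 + x^4 + x^3 + x + 1 (0x11B):
inv = gf_inverse(0x53, 0x11B)
assert poly_divmod(poly_mul(0x53, inv), 0x11B)[1] == 1
```

The inverse check multiplies carrylessly and then reduces by the field polynomial, so it verifies the extended-Euclid invariant rather than a hardcoded answer.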