Wikipedia:Reference desk/Archives/Mathematics/2010 April 14

= April 14 =

Randomness Testing
The following string of 44 digits represents final digits of a sequence of 33-digit results on the calculator that comes with Windows98 (Specifically: The 4th root of 17797577777.77...7 minus that of 17797577777, where the 7s after the decimal point range from 1 to 44, where calculation is consistently done with the square root operation, where beyond the 20th term the whole and decimal parts are added to each other and beyond the 30th term this entails a further addition of 7*10-31, 7.7*10-31, etc.). This sequence was arrived at without discarding previous similar data (i.e., the claim is that this was not part of a data-mining operation).

$$ 38297838339867556722539777859777712092576764 $$

Now, I chose to identify the salient feature of this string to be that three of our digits are missing through precisely 33 terms, the last to arise being 4 at the 44th, and the digit 0 being one of the other two to allow me to notice this all; and a random sequence of digits with this description results with probability 3.132*10-7.

I have more inaccurate digits to work with--the penultimate, third from last, etc.--and I'm wondering if there is a specific best approach to finding a non-random pattern or other indication. This may be the sort of thing I should know by now, but I don't. Any help would be appreciated. Thanks, and sorry I couldn't re-type the whole sequence including non-final digits after my flub with this laptop touchpad.Julzes (talk) 01:13, 14 April 2010 (UTC)


 * Are you basically asking about checking whether a coin is fair? 66.127.52.47 (talk) 09:55, 14 April 2010 (UTC)

I guess my own calculation on the final digits is close to that, but I was looking for something perhaps a little deeper along the lines of picking a signal of any type out of noise. I don't suppose I'm looking for something simple enough to be fully described in an encyclopedia article, but perhaps a description of the basic theory behind something hard with a link to an adequate tool based upon it. Note that the description of the method for obtaining the sequence is given in full. If someone wants to replicate and analyze it and report back on that, that would be almost equally interesting as telling me how to do it myself.Julzes (talk) 14:17, 14 April 2010 (UTC)
 * If you want to get a signal from data you have to formulate your hypothesis before you examine the data. You can always retrofit properties that make any given data meaningful or surprising. When I expand 1/3 to 10 digits, I get 0.333333333 - the chance for that string of digits is 10-10 (or 1, depending on how you see it). Note that nearly all real numbers are normal numbers, meaning that the probability that the decimal representation of a (uniformly) random real number contains an encoding of the full text of Hamlet and the dedication "For Lizzy, with love! Your Oxi!" is 1. -Stephan Schulz (talk) 14:54, 14 April 2010 (UTC)
 * Out of curiosity, just what do you mean by "a uniformly random real number"?
 * (I think a complete answer would be very long. There's a rather natural notion of what it means for a real number to be, roughly speaking, random against all computable tests &mdash; three fairly natural and distinct formulations, by Kolmogorov, by Martin-Löf, and a third I don't remember, turn out to be equivalent.  However, by requiring randomness against wider classes of tests, you get ever stronger notions, each of which still holds for almost all reals.  My sense is that there is no coherent notion of what it means for an individual real to be "truly" random.  This is a similar conundrum to the one about what, if anything, it means to be a definable real number.) --Trovatore (talk) 01:55, 16 April 2010 (UTC)
 * I was imprecise, in that I should have restricted it to some finite, non-empty interval. In that case, the notion is clear (in my head ;-). --Stephan Schulz (talk) 02:01, 16 April 2010 (UTC)
 * Well, no, the finiteness of the interval is not really the point. Everything I said above holds without modification on [0,1]. --Trovatore (talk) 02:04, 16 April 2010 (UTC)
 * Oh, wait, I didn't see that you specified "the probability that ... of a random real number:. OK, that's different; that's linguistic shorthand for something else, and doesn't imply that you have a notion of what it means for an individual real to be random.  My apologies; I misinterpreted you.  Too bad, though, in a sense, because it's a very interesting subject. --Trovatore (talk) 02:09, 16 April 2010 (UTC)


 * Finding patterns can be extremely difficult (cryptanalysis amounts to that), but one simple approach for the example you gave would be to find a Markov chain that describes the data. See also dynamic Markov compression. 66.127.52.47 (talk) 15:37, 14 April 2010 (UTC)

I guess my hypothesis is that whatever signal there may be from a purely mathematical standpoint in the array of inaccurate 6 or 7 final digits is pretty much entirely in the final digits. Apparently, according to Stephan Schulz, I've ruined the final digits by noticing a pattern after seeing the data. Thanks for the article suggestions, unnamed number person.Julzes (talk) 14:27, 15 April 2010 (UTC)

Diophantine equation
Using the equation $${1\over x} + {1\over y} = {1\over n}$$, I have found that the number of solutions when n is a power of any prime (n^x) to be: $$xn-x+1$$. Similarly, if n has 2 prime factors, $$n = p_1^x*p_2^y$$, the number of solutions is: $$(2xy)n^2+(4xy+x+y)n-(2xy+x+y-1)$$ and if n has 3 prime factors: $$(4xyz)n^3-(10xyz-2xy-2z)n^2+(12xyz-4xy-4xz-4yz+x+y+z)n-(4xyz-2xz-2xy-2yz+x+y+z-1)$$

However, I basically just arrived at these equations through trial and error. Where are they coming from, and how can I generalize this? 129.219.184.186 (talk) 02:02, 14 April 2010 (UTC)


 * Are you sure about those formulas? For 1/x + 1/y = 1/pk I find that there are only 2k+1 integer solutions regardless of what p is.  They are the pairs (pk(pi+1), pk-i(pi+1)) for all 0 ≤ i ≤ k, as well as those pairs with x and y reversed. Rckrone (talk) 06:32, 14 April 2010 (UTC)
 * Oh wait, I only considered positive x, y. Allowing negative values adds an additional 2k solutions: (-pk(pi-1), pk-i(pi-1)) for all 0 < i ≤ k and their reverses.  That makes a total of 4k+1 integer solutions. Rckrone (talk) 06:47, 14 April 2010 (UTC)
 * Thinking more generally about the shape of the graph in the x-y plane, we can conclude that the number of integer points can never exceed 4n-3, which rules out a quadratic or higher in n. Rckrone (talk) 08:08, 14 April 2010 (UTC)
 * I get a parametric solution as x=(a+b)ak, y=(a+b)bk, n=abk. You can assume a and b are relatively prime and you can then determine all the solutions for a given n if you know its factorization.--RDBury (talk) 20:20, 14 April 2010 (UTC)
 * Actually there is a simpler solution: x=n+a, y=n+b where ab=n2. That gives the number of solutions as d(n2), the number of divisors of n2, if you count only positive solutions, and 2d(n2)-1 solutions if you allow negatives. (You can eliminate a=b=-n since that produces x=y=0 which is impossible.)--RDBury (talk) 03:18, 15 April 2010 (UTC)
 * Wow, nice. Rckrone (talk) 06:46, 15 April 2010 (UTC)

RSA
Reading about the number theoretic details of the RSA algorithm, I know that if n=pq (p and q are two large distinct primes) is factored, then $$\phi(n)=(p-1)(q-1)$$ can be easily computered and then using the euclidean algorithm, we can use the encryption exponent e to compute the decryption exponent d and we can view the plaintext for any ciphertext. In reverse, if I know n, d, and e, how can I efficiently factor n? I got up to the step where I have $$k=ed-1$$ but then how can I find p and q?  In fact, I don't even have $$\phi(n)$$. I just know k, which is one of its multiple. Thanks! 174.29.105.219 (talk) 02:12, 14 April 2010 (UTC)


 * If ed-1 is significantly smaller than n, you can factor ed-1 to get prospects for p-1 and q-1 using a smaller search space than you would factoring n itself. Then, with the small number of prospects for p-1 and q-1, you just add one to each of them and see if it evenly divides n. --  k a i n a w &trade; 02:26, 14 April 2010 (UTC)


 * You can write


 * $$\phi(n) = (p-1)(q-1) = n + 1 - (p+q)$$


 * p and q are of order sqrt(n), so you can easily extract which multiple of phi(n) you have. Thus phi(n) follows from that and thus also p + q, and from that p and q themselves as the product of p and q is known to be n. Count Iblis (talk) 02:32, 14 April 2010 (UTC)

The multiple of phi(n) that I have ed-1 is actually much larger than n itself so there is no hope of trying to factor it. It would have been faster to probably just factor n itself to begin with. Also, how can I extract which multiple of phi(n) do I have? 174.29.105.219 (talk) 05:24, 14 April 2010 (UTC)


 * Suppose that n is number of many hudreds of digits, then p and q being of order sqrt(n), are many orders of magnitude less than n, so ed-1 divided by n is almost equal to that integer you seek. If you divide ed-1 by that integer you get phi(n) and once you have phi(n), you only have to solve a quadratic equation to extact p and q. Count Iblis (talk) 14:26, 14 April 2010 (UTC)
 * ed − 1 is not divisible by n, and it can be almost as large as n2. Which integer do you want it to be divided by, again?—Emil J. 14:37, 14 April 2010 (UTC)
 * Oh, I see: you approximately divide ed − 1 by n, and you try integers "nearby" the result to find one of them which divides ed − 1. What is "nearby"? Well, p and q are of order √n, hence φ(n) = n(1 + O(1/√n)). ed − 1 will be of order O(n2), hence (ed − 1)/n will be of order O(n), and it will differ from (ed − 1)/φ(n) by something of order O(n/√n) = O(√n). In other words, the search space for the divisor of ed − 1 is about as large as the search space for factoring n by trial division, unless you are lucky and ed − 1 is much smaller than usual.—Emil J. 15:34, 14 April 2010 (UTC)
 * OTOH, I gather from our RSA page that some practical implementations apparently fix e to be as low as 3 (I can't imagine why, as any artificial restriction of the key space reduces security, but whatever). If this is the case, then ed − 1 is O(n), and Count Iblis' method works like charm. As a matter of fact, if e = 3, and p, q are large enough (read: not equal to 3), then the condition of e being coprime to φ(n) implies that p and q are both congruent to 2 modulo 3, hence φ(n) is 1 mod 3. It follows that d = (2φ(n) + 1)/3 and ed − 1 = 2φ(n), so one does not even have to perform the division, we already know the result. However, the OP's statement that ed − 1 is much larger than n suggests that their e is not that small.—Emil J. 12:36, 15 April 2010 (UTC)
 * ed-1 is not necessarily hard to factor, despite its size, as it may have many relatively small prime factors. Once you have factorised ed-1, create a list of its divisors, then for each pair of divisors a, b test (a+1)(b+1) to see if it is equal to n. As phi is a divisor of ed-1 then so are p-1 and q-1, so you recover p=a+1 and q=b+1. Tedious to do by hand, but fairly simple for a computerised search. And even if you can only partly factorise ed-1, you might get lucky. Gandalf61 (talk) 10:13, 14 April 2010 (UTC)
 * ed-1 is not necessarily hard to factor, but there is no guarantee that it isn't. Miller has described a polynomial-time algorithm for factoring n using any nonzero polynomial-sized multiple of φ(n) assuming the generalized Riemann hypothesis, which assumption can be removed at the expense of making the algorithm randomized using an idea of Rabin. The unconditional randomized polynomial-time version of the algorithm goes similar to Miller–Rabin primality test, except that the given multiple m of φ(n) is used in place of n − 1. Once the algorithm identifies a compositeness witness a, we know that one of the following two things happens: (1) $$b:=a^{d2^r}\not\equiv\pm1\pmod n$$ but $$b^2\equiv1\pmod n$$ for some r < s, which implies that gcd(b ± 1,n) is a nontrivial factor of n; (2) $$a^m=a^{d2^s}\not\equiv1\pmod n$$, which implies that gcd(a,n) is a nontrivial factor of n.—Emil J. 12:10, 14 April 2010 (UTC)
 * Yes, I see. And as our RSA article suggests that p and q should be chosen so that p-1 and q-1 both have at least one large prime factor, this would make factoring ed-1 especially difficult. So does this mean that if p and q are chosen appropriately, knowing both public and private keys does not make it any easier to find p, q or phi than if we just know n (although, as we can now decrypt any message, this may only be of theoretical interest) ? Gandalf61 (talk) 08:48, 15 April 2010 (UTC)
 * ?? I just described an efficient algorithm how to find p and q given the pair of keys.—Emil J. 11:41, 15 April 2010 (UTC)
 * Maybe it's worth clarification that the condition on p − 1 and q − 1 having a large prime factor is imposed so that one cannot too easily factorize n without knowing the private key (and thus to break the cipher), using factoring algorithms such as Pollard's p − 1 algorithm.—Emil J. 13:22, 15 April 2010 (UTC)

As EmilJ explained, if ed -1 is of order n^2 (or even just of order n^(3/2)) then you won't be able to find p and q easily. However, it is not clear to me why that should be the case. If I design the RSA code, I pick p and q and take some e. Then I compute d using the p and q that are known to me. Why would I choose the d that only I will be using to decript messages such that e d - 1 will be a huge multiple of phi(n)? Count Iblis (talk) 14:54, 15 April 2010 (UTC)
 * First, for the third time: you will be able to find p and q easily in any case, you just have to use the right algorithm. Second, you don't get to choose d, it is uniquely determined by e (or rather, if you want to choose d, you can't also choose e, you have to compute it from d).


 * Now: why would you choose either of them such that ed − 1 will be a huge multiple of φ(n)? Because the size of ed − 1 is completely irrelevant, this number is not used in any way in either the encryption or decryption process. On the other hand, what is relevant is that as in any other encryption scheme, the cipher can be only as strong as the keys being used. If you restrict the set of keys you are choosing from, or if you select parts of the key in a predictable way, you are giving the adversary an advantage which they may exploit to break the encryption more easily. In this case it may not be entirely obvious how could the adversary benefit from having a small e when e is publicly known anyway, so here are some suggestions: (1) The adversary may be trying to break the cipher by some kind of an algorithm which uses precomputed tables; it's much easier for them to arrange this if they know beforehand that a large fraction of the key space will not be used at all. (2) If e is fixed to a specific value like 3 above, it may turn out that there are more efficient algorithms for breaking these particular instances of the cipher that do not apply to the general case.—Emil J. 16:14, 15 April 2010 (UTC)


 * Ooops, sorry. I wasn't thinking straight. Of course, d is found by finding the inverse of e Mod phi(n). Count Iblis (talk) 22:16, 17 April 2010 (UTC)

Sum
how i can calculate Rumman sum —Preceding unsigned comment added by Abdullmajeed (talk • contribs) 06:31, 14 April 2010 (UTC)
 * Surely you mean Riemann sum. Staecker (talk) 19:17, 14 April 2010 (UTC)