Wikipedia:Reference desk/Archives/Mathematics/2010 October 27

= October 27 =

Primitive roots modulo 5^n
Hi all,

Could anyone suggest an elegant (number theoretic) proof of the fact that both 2 and 3 are primitive roots modulo 5^n for all n, i.e. 2^i or 3^i both generate the multiplicative group of Z/((5^n)Z)? I've looked at inductive proof but it seems a bit messy and I'm hoping there's a nice clean/concise solution - I have no doubt the reference desk brains are far more capable of coming up with an elegant proof than I am ;) (or if there is a particularly concise inductive proof, I have nothing against induction!) Could anyone suggest anything, please?

Thankyou for the help! Delaypoems101 (talk) 00:02, 27 October 2010 (UTC)
 * $$(\mathbb Z/5^n\mathbb Z)^\times$$ is cyclic of order 4⋅5n−1. If a = 2,3 were not a primitive root, its order would have to be a proper divisor of 4⋅5n−1, or in other words, a would be a square mod 5n, or n > 1 and a would be a fifth power mod 5n. In the former case, a would also be a square mod 5, which it is not (the only squares are ±1). In the latter case, a would be a fifth power mod 25, which it also is not. (You can check that by simply enumerating all the 20 possibilities, or you can reverse the argument above: if a were a fifth power mod 25, then a4 ≡ 1 (mod 25), which does not actually hold.)—Emil J. 15:14, 27 October 2010 (UTC)


 * The moral of all this is that when p is an odd prime and n ≥ 2, then r is a primitive root modulo pn if and only if r is a primitive root modulo p2.—Emil J. 15:39, 27 October 2010 (UTC)

Standard error and standard deviation
In lit review today, the professor was going on about how poor the author's data was, and how we can infer this point from his use of standard error rather than standard deviation -- he explained that because standard error is equal to standard deviation/square root of the population (I think that's what he said), if the standard deviation appears too large for him to publish without everyone laughing at his research data, he can trick readers into viewing a smaller number by reporting standard error instead, and only the most erudite will catch on because most people haven't the foggiest notion of what any of this means anyway. I tried to read both articles above, but they a) both confuse me (the former much more than the latter) and b) they didn't really discuss either in terms of the other, which is what I'm really trying to get at. I guess that's my question...if you could please explain it without sigmas and thetas, 'cause then I might as well read the article again. Thanx!  DRosenbach  ( Talk 02:08, 27 October 2010 (UTC)
 * Your formula for the standard error of the mean is right (standard deviation / sqrt(n), n = population size), so generally standard error of the mean is much smaller than standard deviation. The specifics really matter, though... it can be perfectly reasonable to report standard error of the mean instead of standard deviation. It would of course be possible to report one when the other was appropriate, since they measure quite different things.
 * The standard deviation measures how dispersed your sample data is. Standard error of the mean measures how accurate the average you compute from your sample data is. For instance, if you measure the heights of your classmates, you can get a standard deviation from that sample which says how diverse the heights are. You can then compute the average height of your class, but you don't know how accurate that average is. The standard error of the mean is a "reasonable" estimate of the accuracy of your class's average height when compared to that of, say, the average height of anyone in the world.
 * For the sake of the argument, suppose half of everyone is 2 feet tall and the other half is 8 feet tall. The standard deviation of any sample of those people's heights would be relatively large--say around 3 feet (a guesstimate), which should be roughly the same for each sample size. However, if you were to compute the average height from any sample, you'd get pretty close to 5 feet. In fact, as you took more and more people in your sample, you'd expect to get sample averages very very close to 5 feet--the error of the mean would start to vanish. In a crude way, that's why you have to divide by an expression involving the number of data points in your sample to compute the error of the mean.
 * So again, standard deviation = dispersion of your sample; standard error of the mean = accuracy of the average you calculate from your sample.
 * Note: I've dropped out a lot of details and technicalities that you clearly don't want to hear about, like that the above formula is an estimate for an infinitely large population where your samples are independent. But, I avoided almost all equations, sigmas, and thetas! 67.158.43.41 (talk) 05:41, 27 October 2010 (UTC)


 * Maybe a bit simpler (if not precisely correct): the standard deviation measures the variation expected from one individual item in the group, while the standard error measures the variation expected in the average of a group of size n of the items. For instance, you know that if you pick one person at random you have a reasonable chance of getting someone with a high IQ or a low IQ, but if you pick twenty people you have a much lower likelihood of the average IQ of the group being high or low (to get a very high or very low average IQ you'd need to pick 20 people all with high IQs or all with low IQs).  Standard deviation is a descriptive statistic (it describes a population) and is usually reported but isn't often used in meaningful ways. Standard error is used extensively in hypothesis testing, to test if samples come from different populations (this is why sample size matters in hypothesis testing - the larger n is, the smaller standard error is, and the easier it is to see differences between the sample mean and the expected mean of the population).  -- Ludwigs 2  06:12, 27 October 2010 (UTC)
 * Nice answers, both of you. &#x2013; b_jonas 12:58, 29 October 2010 (UTC)

1 = -1 ?
in my textbook, a puzzle (not homework)

we have learnt $$\sqrt{\frac{a}{b}}=\frac{\sqrt{a}}{\sqrt{b}}$$ and $$\sqrt{a}\times\sqrt{a}=a$$

also $$-1=\frac{1}{-1}=\frac{-1}{1}$$

thus $$\sqrt{\frac{1}{-1}}=\sqrt{\frac{-1}{1}}$$

multiply both sides by $$\sqrt{1}\sqrt{-1}$$ we have $$\sqrt{\frac{1}{-1}}\sqrt{1}\sqrt{-1}=\sqrt{\frac{-1}{1}}\sqrt{1}\sqrt{-1}$$ therefore $$\sqrt{1}\sqrt{1}=\sqrt{-1}\sqrt{-1}$$

hence $$1=-1$$

can you spot the wrong reasoning ??? --Umar1996 (talk) 15:14, 27 October 2010 (UTC)


 * Try Mathematical_fallacy. -- SGBailey (talk) 15:20, 27 October 2010 (UTC)
 * Specifically, the rule $$\sqrt{\frac{a}{b}}=\frac{\sqrt{a}}{\sqrt{b}}$$ works when b is positive, but not in general. -- Meni Rosenfeld (talk) 15:24, 27 October 2010 (UTC)


 * yes! $$\sqrt{-b}$$ is impossible if b is positive. --Umar1996 (talk) 11:55, 28 October 2010 (UTC)
 * Not impossible, but different. — Anonymous Dissident  Talk 12:05, 28 October 2010 (UTC)

how high do cryptographic functions scale (how big primes can you pick?)
take an algorithm like RSA, which uses big primes. Now, it is always big news when we generate a large prime, so, obviously there are limits to how many digits the primes can be. What is that limit? Can it work with 5000 (decimal) digit numbers, or 10,000 digit ones? Million digit ones? Where is the limit, before it becomes too slow to use realistically? 84.153.230.5 (talk) 16:13, 27 October 2010 (UTC)
 * In general it's not a really interesting question. The important question is, does it scale better than algorithms for cracking the cryptograms.  And good encryption algorithms should be polynomial while requiring an exponential time to crack.  So the answer is that, long before the algorithm becomes unusuable it will be essentially impossible to break the encryption.  As to spesific limits, it will depend on application.  Taemyr (talk) 23:57, 27 October 2010 (UTC)


 * Primality testing can be done in polynomial time in the number of digits in the input using the AKS algorithm. The polynomial exponent is (off the top of my head, so take it with a grain of salt) 6. That is, a million digit prime would take 1,000,000^6 = 10^36 operations to prove primality for. Probable primes are easier to check. The key step in RSA is modular exponentiation by squaring, which is shockingly fast. I recently had Python compute the largest known Mersenne prime, something in the neighborhood of 40 million binary digits, so, say, 10 million decimal ones, and take it modulo around 40 million. I didn't even combine the steps, which should have been much quicker. It completed several seconds later. Last I heard, RSA usually uses ~thousand digit primes, which, as Taemyr said, should be good enough.
 * Note: I've actually glossed over many details. I understand AKS is merely asymptotically polynomial, that certain primes take different amounts of time to prove primality for, that Mersenne primes are historically easier to find, that my example isn't representative, etc.; I don't believe the extra complication is worth it, for me or the OP. 67.158.43.41 (talk) 04:32, 28 October 2010 (UTC)
 * Was that several seconds for a single modulo operation? Keep in mind that while exponentiation by squaring is efficient, it still takes time proportional to the log of the exponent. I think in RSA decryption you use an exponent roughly the size of the primes. So for 40 million-bit primes, you'll need to multiply by 40 million the time it takes for a single operation, and several seconds quickly become several years. -- Meni Rosenfeld (talk) 08:41, 28 October 2010 (UTC)
 * If I recall RSA, you exponentiate by the encoding string, which can be on the order of the prime (product of the primes?). So you certainly have a good point--if the exponent was gigantic it would indeed take much, much longer than my admittedly nonrepresentative anecdote. If the OP is interested, perhaps they could try implementing RSA and see how quickly it works if they feed in some gigantic primes. I would be genuinely interested in the results. 67.158.43.41 (talk) 10:01, 28 October 2010 (UTC)

finding angles in 3D figures
Could you please give me an expression that gives the angle subtended by points placed symmetrically on a spherical surface. For instance, I understand that the angle between tetrahedrally-disposed covalent bonds on a carbon atom is ~109.5deg. How is this derived?

As a more general example suppose we had to place radio transmitters to produce optimum coverage over the Earth's surface (assumed spherical). For certain numbers e.g. 6, one of the many valid placements would be self-evident (e.g one at each pole and four at 90deg. intervals on the equator). But suppose we had a number such as 7 or 17? And would there be a unique configuration (disregarding "symmetry" equivalents) or could there be alternatives?

(Digression: It occurs to me that a physical analogue could be devised - with difficulty - to yield a solution: e.g. by placing the required number of equal repulsive electric charges confined within a thin spherical shell or surface).

Thanks. —Preceding unsigned comment added by Paul Renshaw (talk • contribs) 20:11, 27 October 2010 (UTC)
 * You can generate a tetrahedra, so the distribution needed, by placing points at non-adjacent vertices of a cube. The angle is then the angle between any two of these, such as between the vectors to (1, 1, 1) and (1, -1, -1) from the origin. Similar solutions exist I think for other regular polyhedra such as the octahedron that you describe. On the more general problem there is information at circle packing which links to Tammes problem and Thomson problem.-- JohnBlackburne wordsdeeds 20:20, 27 October 2010 (UTC)


 * There are only a few platonic solids, that is, regular (i.e. "symmetric", by what I think you mean) shapes that fit on the surface of a sphere. These each have well-known angles. The number of vertexes in these shapes determines the allowable solutions to your question, 4, 8, 6, 20, and 12. Apparently, if you place any other number of repelling charges on a sphere (edit, see below: under a to-be-determined and non-standard potential function), they'll end up in a configuration slightly different from perfect symmetry. A special case is <= 3 points, since they don't form a polyhedron in that case, and the platonic solids argument doesn't apply. Clearly 1 point doesn't provide symmetry; 2 points do if they're placed on opposite sides of the sphere; 3 points do not, since they determine a plane which, from symmetry, must divide the sphere in half--which isn't a spherically symmetric configuration. 67.158.43.41 (talk) 22:44, 27 October 2010 (UTC)
 * Interestingly, with 8 repelling charges on a sphere, the corresponding platonic solid (a cube) will not occur. Rather, 8 points arranged around a central atom typically form a Square antiprism, rather than a cube.  It's not hard to see why.  Even though some of the symmetry is broken in going from the cube to the anticube, you actually increase the average distance between the points.  There aren't many chemical compounds that are 8 coordinated, but there are some: TeF82-, for example. Buddy431 (talk) 03:12, 29 October 2010 (UTC)
 * Addendum: presumably the twenty coordinate case would also do something similar, forming some sort of skewed Dodecahedron. I don't know of any 20 coordinate compounds, though, so the point is largely moot. Buddy431 (talk) 03:24, 29 October 2010 (UTC)


 * That makes me wonder under what potential function point charges on a sphere arrange themselves into the platonic solids, where possible. I wonder if the usual potential function with distance taken to be geodesic distance works. Anybody know? 67.158.43.41 (talk) 04:13, 29 October 2010 (UTC)

Clustering algorithm
Is an efficient algorithm known for the kind of problem of which this is an example:

$$\begin{matrix} 5 & 3 & 8 & 17 & 2 \\ 12 & 6 & 4 & 2 & 6 \\ 8 & 8 & 3 & 4 & 11 \\ 12 & 13 & 15 & 12 & 20 \\ 9 & 6 & 3 & 1 & 8 \end{matrix}$$

The figures show the distribution of population in a 5 by 5 block of 1 km squares, the total being 198. The aim is to cluster the squares into six groups each of size 198/6 = 33, or as close to this as possible in terms of minimising total absolute deviation. The groups can be of any shape, but no component square may be attached at just one or more of its corners.

The general problem will have a rectangular array, probably with some zero elements.→86.155.186.37 (talk) 22:17, 27 October 2010 (UTC)


 * I'm sorry, I do not know an algorithm. The general case feels quite hard to me. Thinking a little more, your problem is harder than the partition problem, a known NP-Complete problem. (Proof: take a 1D array of inputs; create a 2D array with 0's off the diagonal, stringing the 1D array along the diagonal. Cluster the 2D array into two groups. The clustering must consider cases where the first cluster takes the lower triangle and the second takes the upper triangle. From here, you can get all possible subsets of the diagonal split between the two clusters by choosing which cluster to put each diagonal element in. This is precisely the set of subsets considered by the partition problem, and you're considering them with respect to the same fitness function. Any other clustering also divides the diagonal into two sets, so is redundant. This instance of your problem is thus precisely equivalent to the full partition problem.) So, short of giving up, I would just throw a greedy algorithm at it and hope for the best. The article I linked is perhaps more hopeful: "The pseudo-polynomial time dynamic programming solution for the subset sum problem applies to the partition problem as well, and gives an exact answer in polynomial time when the size of the given integers is bounded." Perhaps you could use a variant of that solution and execute it in reasonable time. Good luck! 67.158.43.41 (talk) 04:19, 28 October 2010 (UTC)


 * This may be generalized into an NP problem, so an absolutely correct solution would be exhaustive to run - it would be a brute force check of every possible combination. However, a heuristic may be used.  One that comes to mind would be to initially divide this into two subsets of equal size.  It is similar to the partition problem mentioned above, but it is a bit easier because the limitations on which items may be included once another number is included.  Once you have two equal size sets, divide each one into two equal size sets.  If the heuristic fails to produce four nice sets, try rotating the entire set to see if your algorithm grabs a better group (since it will likely be biased to look in one direction for new numbers, rotating the set will make it technically look in a different direction). --  k a i n a w &trade; 12:39, 28 October 2010 (UTC)
 * Thanks for the response. There are analogies with the partition problem, as stated, and also with bin packing for a fixed number of bins. The latter analogy breaks down, as each "bin" has to be able to hold more than its notional capacity to cater for under-filling elsewhere, and the spatial nature of the inputs prevents clustering purely by value. The problem is a simplified version of the one involved in constructing UK parliamentary constituencies, the latest set of which contains the oddity of York Outer, colloquially known as Polo because of its resemblance to the eponymous sweet made by the million in the city.→86.155.186.37 (talk) 15:16, 28 October 2010 (UTC)


 * If (somehow) you knew that two particular values simply had to be in separate groups, you could use the two of them as starting points to create two equal-sized groups. Then, the problem becomes much easier because you'll add some to one group and some to another - back and forth.  Because you have no means of knowing immediately if two values must be in different groups, you don't have a starting point.  You have to guess at one, then try it.  If you don't like the results, try another.  That is the real reason that this is a hard problem. --  k a i n a w &trade; 18:20, 28 October 2010 (UTC)
 * Even if you knew beforehand two values were in separate groups, you might well make a mistake in creating equal-sized groups and take a sub-optimal part of the map which goes to the other cluster in the optimal solution. I don't think the problem is much easier even with this magically derived information, especially considering there are k instead of just 2 clusters total. Also, to be clear, Kainaw had said "It is similar to the partition problem mentioned above, but it is a bit easier because the limitations on which items may be included once another number is included" but my first post shows that this problem is actually strictly harder than the partition problem inasmuch as any solution of this problem automatically solves the partition problem exactly. Disallowing 0's (which you've explicitly allowed), it might become easier than the partition problem.


 * Again I would just try a greedy algorithm a whole bunch of times and hope for the best--like randomly start your k clusters in k places and keep adding to them while respecting the bounds as well as possible. A genetic algorithm might do well too, come to think of it. Mating clusters cleverly would be the key. The method would need to make clusters of nearly the right size change very little, while clusters of the wrong size should be quite flexible. Given two adjacent clusters, one with too many people and one with too few, you could extend the border of the smaller one into the larger relative to the difference needed. 67.158.43.41 (talk) 02:11, 29 October 2010 (UTC)


 * This sounds related to elections. &#x2013; b_jonas 12:55, 29 October 2010 (UTC)