Talk:Trial division/Archive 1


 * However, for n with at least one small factor, trial division can be a quick way to find that small factor. It is worthwhile to note that for random n, there is a 50% chance that 2 is a factor of n, and a 33% chance that 3 is a factor, and so on. It can be shown that 88% of all positive integers have a factor under 100, and that 91% have a factor under 1000.

I'm not a mathematician, but this seems off to me. It is obviously true that any random positive integer has a 50% chance of being even, but is "88% of all positive integers have a factor under 100" meaningful? Can you apply percentages to an infinite set? Is this really shorthand for "for every random positive integer there is an 88% chance that it has a factor under 100" (which in turn presumably means "for every set of 100 random integers approximately 88 will have a factor under 100")? This is probably just me not understanding how these expressions are used, but still: is "88% of all positive integers" strictly speaking correct? 82.92.119.11 23:14, 28 April 2006 (UTC)


 * You're right; you can't assign proportions to the entire set of the integers in this way. However, you can look at it as a limit: as n tends to infinity, the proportion of integers below n that have a factor under 100 tends to 88%. &mdash; ciphergoth 08:05, 29 April 2006 (UTC)
 * So you'd say that the expression as used is indeed meaningful, just shorthand for something formally more complicated? 82.92.119.11 09:47, 29 April 2006 (UTC)


 * Yes. Normally I'd be wary of simplifications like this, but it's such a well behaved proportion that I don't think there are any problems expressing it this way. &mdash; ciphergoth 09:54, 29 April 2006 (UTC)

Reducing the number of trials to nearly (39/105)*N1/2 instead of (1/2)*N1/2
The naive solution to getting the prime factors of a number N by trial division is picking them up one by one and applying modulo division, and if the remainder is zero, it is a prime factor. Trial division begins with 2,3,4,5,6... and ends with N1/2 (or the square root of N) rounded down. Thus the number of trial divisions is roughly equal to -1+N1/2 at the worst case. (The worst case happens when the number is prime or semiprime. Semiprime numbers are numbers with two prime factors of roughly equal size.)

Meanwhile, we can afford to take out all other even factors, since it is the case that if a number K is not divisible by 2, it is also not divisible by positive multiples of 2. (In general, if K is not divisible by L, K is not divisible by all positive multiples of L.) Thus we can first try dividing by 2, then 3,5,7,9,11,13,15,17... until N1/2 rounded down. The number of trials is roughly equal to -1+(1/2)*N1/2 at the worst case. This is the notation which can be found on the current article. The number of divisions is (roughly) equal to the square root of N over 2.

From here it must be noted that it is much cost-efficient to naively apply trial division without regarding the primality testing of such factors. The overhead of getting all prime factors less than or equal to N1/2 is too large for sufficiently large N.

Now a good question is: Is there a way to reduce the number of trials? We saw that by removing all other even numbers from the trials, we basically cut down the trials in half. Now how about removing all remaining multiples of 3? Obviously we do not reduce the number of trials by another 1/3 because some multiples of 3 are also multiples of two. Thus we will only be able to reduce the trials by 1/3 of 1/2, or equivalently, 1/6. How does this happen?

Initially, the numbers in your sequence are 2,3,4,5,6,7,8,9,10,... Applying a 2-sieve, you get 3,5,7,9,11,13,15,17,19,... which is exactly the sequence of trial division without the even numbers. Now by applying a 3-sieve we get 5,7,11,13,17,19,23,25,29,31,35,37,...

Thus in this case we will only have to divide by 2 then 3 then 5,7,11,13,17,19,23,25,29,31,35,37,... We have thus reduced the number of divisions to 1+(1/3)*N1/2.

In the trial sequence 2,3,4,5,6,7,8,9,10,11,12... our common difference between the terms is 1. In the trial sequence 3,5,7,9,11,13,15,17,19... our common difference between the terms is 2. In the trial sequence 5,7,11,13,17,19,23,25,29,31... we do not have a common difference but we do have a difference sequence of 2,4 that repeats. Meaning, we add 2 to the first term to get the next term (5+2=7) and then add 4 to the next term which should give you 7+4=11, etc.

From these three sequence we will easily see that the first composite number that appears in the sequence is q2, where q is the first number in the sequence. It is also true that in order to find all the prime numbers from 2 to some number N, we only have to sieve using prime numbers from 2 to the prime number less than or equal to the square root of N. Thus in order to get the prime numbers up to 100, we only need to sieve by the prime numbers less than or equal to 10, which is the square root of 100. Or in other words, it is sufficient to make 2-sieve, 3-sieve, 5-sieve and 7-sieve to get the prime numbers up to 100. The proof to this is based on the same proof as to why you should do trial division on factors less than or equal to the square root of N.

Of course, we can make improvements on our latest trial sequence by taking out all multiples of 5, the current smallest prime number in the sequence. The sequence will now be: 7,11,13,17,19,23,29,31,37,41,43,47,49,53,59,61,67,71,73,77,79,83,89,91,97,101,103,107,...

Again, the first composite number in the sequence is 72 or 49. The new difference sequence is 4,2,4,2,4,6,2,6. (Meaning from 7, add 4 to get the 2nd term, 2 to get the 3rd term, and so on.)

How much did we reduce? Since some multiples of 5 are also multiples of 2 and/or 3, we were only able to take out 1/30 of the multiples. You now only have approximately 2+(3/10)*N1/2 trials at the worst case.

Does continuous sieving like so advantageous for us in the long run? Not actually. If we try to apply a 7-sieve this time, we will only be able to take out 1/210 of the trials and we will now have 48 numbers in our difference sequence. Our trial sequence is now 11,13,17,19,23,29,31,37,41, 43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,121,127,131,137,139,143,149,151,157,163,167,169,173,181,187,191, 193,197,199,209,211,221,223,227,229,...

The new difference sequence, with 48 numbers, is the following: 2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10.

The approximate number of trials at the worst case is approximately equal to 3+(31/105)*N1/2. This algorithm when applied is approximately (and asymptotically) 59.04% faster than the conventional trial division with 2 followed by odd numbers. The prime number 923,456,789,012,347 has been tried for this algorithm and the program was able to factor it (trivially, since it is prime) in less than 2 seconds. (The program was written in Java 1.5.0 and run on a Pentium IV 2.6GHz with Windows 2000 SP4. The source code can be found below.)

--- public static void main(String [] args) { double N=Double.parseDouble(JOptionPane.showInputDialog(null, "Input number to factorize.")); System.out.println(d2S(N)+"="+factorize(N).trim.replaceAll(" ","*")); System.exit(1); /* IMPORTANT: save these three methods in one class. /* And don't forget to add: /*              import java.io.*; /*              import javax.swing.*; */ }

public static String factorize(double N) { int [] inc={2,4,2,4,6,2,6,4,2,4,6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6,2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10}; if (N<1) return ""; else if (N<3) return d2S(N)+""; else { String output=""; while (N%2==0) {   output+="2 ";    N/=2;   } if (N==1) return output; else while (N%3==0) {   output+="3 ";    N/=3;   } if (N==1) return output; else while (N%5==0) {   output+="5 ";    N/=5;   } if (N==1) return output; else while (N%7==0) {   output+="7 ";    N/=7;   } if (N==1) return output; else { double y=Math.sqrt(N); for (int i=11, j=0; i0) { int temp=(int)(u%10); output=temp+output; u=u/10; u=u-(u%1); } return output; }

---

The length of the increment sequence grows faster than O(N!) and yet the reduction of terms converges to almost zero. Thus even though it is practical to implement a trial division algorithm with less trials, the number of terms in the difference sequence is another issue to tackle.

Although this is not yet proven, we can see from our examples above some sort of pattern found in the difference sequences. It is generally a palindromic sequence of even integers followed by an even integer Q (which is apparently the largest in the sequence) followed by 2 and another Q. For example:

2,4 can be expanded into 2,4,2,4. Thus the palindromic sequence is 2, and the Q2Q sequence is 4,2,4. In 4,2,4,2,4,6,2,6 the palindromic sequence is 4,2,4,2,4 and the Q2Q sequence is 6,2,6. In the difference sequence with 48 terms, we can verify that the preceding sequence of 45 even integers is a palindrome, followed by a 10,2,10.

Another thing to note is the sum of the digits of each difference sequence: 2 = 2 = 2, 2+4 = 6 = 2*3, 4+2+4+2+4+6+2+6 = 30 = 2*3*5, 2+4+2+4+6+2+6+4+2+4+6+6+2+6+4+2+6+4+6+8+4+2+4+2+4+8+6+4+6+2+4+6+2+6+6+4+2+4+6+2+6+4+2+4+2+10+2+10 = 210 = 2*3*5*7, etc.

These sums also represent the reciprocals of the amount of composites removed for every sieve you do. (For example, by making a 2-sieve, you remove 1/2. By making a 3-sieve, you remove 1/6. By making a 5-sieve, you remove 1/30. etc.)

Right now I am trying to prove the convergence of the sum of the reciprocals of 2, 6, 30, 210, 2310, 30030, 510510, 9699690, 223092870, 6469693230, etc. They seem to converge somewhere close to 0.70523. It could mean that approximately 70.523% of the positive integers are composite, and the rest are prime (except for 1) which is coincident with the theorem that the set of prime numbers are infinite. This sum of infinite series is asymptotically the number of composites that you will be able to remove by k-sieving from 2 to N1/2 for any sufficiently large N. Best123 04:00, 29 November 2006 (UTC)


 * I don't understand that. The sum of reciprocals of primorial numbers clearly converges faster than a geometric series, and as you say to about 0.705230....  But the density of primes falls towards zero due to the prime number theorem.  --Henrygb 21:45, 5 December 2006 (UTC)


 * Clarification: I was not trying to obtain the reciprocals of primorial numbers. I was trying to obtain the sum of the reciprocals of 2, 6, 30, 210, 210, 2310, etc. What is this trying to show? This is showing the approximate amount of composite numbers removed after a certain amount of times you k-sieved the set of natural numbers. (Let us not forget that the goal here is to be able to reduce the number of trial divisions by taking out numbers that are surely composites. Recall: if N is not divisible by some $$k<\sqrt{N}$$, then N is not divisible by all other positive multiples of k.) For example, by 2-sieving the set of natural numbers, you got one sure prime number (2) and slashed down all other multiples of 2. Since the set of natural numbers are countable, we can safely assume that we "cut down half" of the trials when dividing from 2 to N1/2. We now have a factor of 0.5, because we will be trial dividing only by half of what we should have if we didn't 2-sieve the possible factors. If what we did next was to make a 3-sieve, we would reduce further the set of possible factors, but not by $$1/3$$ because some multiples of 3 are also multiples of 2. Thus the number of multiples of 3 which are not factors of 2 are $$1/2 * 1/3 = 1/6$$. Approximately one-sixth of the set of composite natural numbers are divisible by 3 but not by 2. Thus we now have $$1-(1/2+1/2*1/3) = 1 - (2/3) = 1/3$$. After 2-sieving and 3-sieving we will only need to make approximately $$1/3$$ of the trial divisions we would have initially had if we tried from 2 to N1/2. If we tried to make an additional 5-sieve then we would now have $$1 - (1/2+1/2*1/3+1/2*1/3*1/5) = 1 - (7/10) = 3/10$$. (This is because some multiples of 5 are also multiples of 2 and 3 which were already eliminated during the 2-sieve and 3-sieve.) After 5-sieving we only need to do only approximately 30% of the amount of trial divisions. If we tried to make a 7-sieve (reduces $$(1/2*1/3*1/5*1/7)$$), an 11-sieve (reduces $$(1/2*1/3*1/5*1/7*1/11)$$), a 13-sieve (reduces $$(1/2*1/3*1/5*1/7*1/11*1/13)$$), and so on, the amount of comparisons reduced asymptotically is:


 * $$ Q_m = 1/2+1/6+1/30+1/210+1/2310+1/30030+1/510510+1/9699690+... \approx 0.70523... $$


 * where m is the last prime number we sieved (m-sieve.) Clearly, the infinite series above converges because it is bounded (upper bound) by a geometric sequence with common ratio of ½, which is also convergent. Also, this means that in an asymptotic basis, we can potentially reduce the number of trial divisions to nearly 0.29477 of the original number of trials. However, we can say that it is quite sufficient to reduce the number of trials to 0.3 of the original, since repeatedly sieving our possible factors yields only diminishing returns. Up to 5-sieve should be enough for practical purposes.


 * In corollary to this we can also say that the trial division algorithm can get no better than $$O(\sqrt{N})$$, since no matter how many composites we remove, the number of trials at the worst case is still proportional to $$\sqrt{N}$$. Best123 07:03, 15 December 2006 (UTC)


 * Rather than your sum I think you should be looking at
 * $${1 \over 2}\times {2 \over 3}\times {4 \over 5}\times {6 \over 7}\times {10 \over 11}\times {12 \over 13}\times {16 \over 17}\cdots$$
 * which converges to zero. For example, after sieving 2, 3 and 5 you are left with 4/15 of the numbers and have taken out 11/15 (22/30) of the numbers rather than the 21/30 your calculation would suggest. To check this, you are left with numbers which modulo 30 are 1, 7, 11, 13, 17, 19, 23 and 29, i.e. eight out of thirty.--Henrygb 20:41, 16 December 2006 (UTC)

(Independently of this page) I wrote an applet which uses a similar method, but it finds the difference sequence when you launch it, removing multiples of primes up to 17, making the difference sequence 92160 numbers long. It takes about 5 and a half seconds to prove the the primality of 923,456,789,012,347 on my computer (it's probably highly slowed-down by the fact that it uses arbitrary-length integers). Second external link.24.107.235.184 05:38, 6 January 2007 (UTC)

Testing for primality
Maybe I am missing something, but the statement about testing for primality in the article sounds like nonsense to me, or at least requires clarification. Surely testing the primality of a number $$a$$ is more costly than dividing $$n$$ by that number, at least when $$a$$ is of the same order of magnitude as $$n$$. Hence it would seem to me that it is actually best not to test for primality in trial division. LR (talk) 17:11, 24 February 2009 (UTC)
 * There are a number of ways of looking at this. A very common optimization is to skip even values, which can be generalized. It's not difficult to simultaneously sieve the range [1, sqrt(n)] for primes while doing trial division, if space is available, which seems more likely to be beneficial for large enough values, but I don't know if trial division would be useful for values that large. Also keep in mind that the test factors are on average about half the magnitude of n.
 * Most importantly though, I see this as a sort of lower bound - it's the number of values you have to divide by in trial division, to perform the test correctly (even if primality testing were free). Dcoetzee 23:02, 24 February 2009 (UTC)


 * You are not missing anything. Maybe a polite way to put things is to say that the article is badly worded.  But in actuality, the article is plain wrong.  One does not typically explicitly test factors for primality when doing trial division - at least, not when division is relatively cheap, e.g., in single precision.  It is common (but not required) to use a sieve to generate primes, but that is not an explicit primality test. When testing multiple precision numbers with trial division, it makes sense to use a sieve.  ATBS 19:39, 10 September 2009 (UTC)  —Preceding unsigned comment added by ATBS (talk • contribs)

Misleading Opener
The opening of the article states the following:

"Given a composite integer n (throughout this article, n means "the integer to be factored"), trial division consists of trial-dividing n by every prime number less than or equal to \scriptstyle\sqrt{n}. If a number is found which divides evenly into n, that number is a factor of n"

Personally, I find this somewhat misleading. As I understand it, dividing by all prime numbers <= sqrt(n) proves whether n is prime or not. This won't find all prime factors of n. A trivial example of this is 26. I think the opening should be changed, since it gives the impression that this particular method with this particular constraint is used to find prime factors. Either state that trial by division is used to prove if a number is prime, or remove the reference to sqrt(n). —Preceding unsigned comment added by 124.188.99.169 (talk) 09:37, 4 March 2009 (UTC)

The opening paragraph is wrong in another way - it says that trial division implies dividing by primes. It's perfectly legitimate to divide by non-primes. Sometimes this is very fast and convenient, as one does not have to keep track of primality, or use a sieve. I believe this is mentioned in Crandall & Pomerance, "Primes - a Computational Perspective." This sentence: "trial division consists of trial-dividing n by every prime number less than or equal" is really bad, too. It's not always necessary to divide by every prime number less than or equal to n. For example, if you're factoring a power of 2, you can divide by 2 (repeatedly), and the number is factored - you would likely not want to divide by 3. Also, it's not correct to use the term being defined inside its definition. ATBS 19:33, 10 September 2009 (UTC) —Preceding unsigned comment added by ATBS (talk • contribs)

Elaboration on computational complexity
Perhaps it's worth including some more discussion of the complexity of this algorithm. A casual reader might read that the worst-case complexity of the algorithm is sqrt(n), and think that the algorithm is very fast indeed. The article on Integer factorization, says that no polynomial-time algorithm is known to exist for integer factorization, and here we have one that's in O(sqrt(n))! It's not until you go back and read more carefully that you realize that the complexity of such algorithms is conventionally stated with respect to the number of bits that it takes to represent n (log_2(n)) rather than n itself. So our O(sqrt(n)) algorithm ends up looking more like O(sqrt(2^b)), where b is the number of bits in n's binary representation. Perhaps some mention of this in the article would emphasize the infeasibility of this algorithm for large n. Dindon (talk) 21:30, 3 November 2010 (UTC)

I'd say go ahead and add a more detailed explanation, Dindon (maybe even a section on computational complexity). It does need some clarification. WP:Be Bold.

Zimagic (talk) 16:35, 25 July 2011 (UTC)

Eliminating from primality test category
I'm eliminating this article from "Primality Test" category since it clearly says at the header it's intended for integer factorization. Also, there are parts stating that no polynomial time algorithm exists, which is true only for factorization but not for primality testing. I don't know how to remove it from the "Primality Tests" section under the Number Theory article table.

--179.34.48.190 (talk) 23:32, 15 March 2015 (UTC)

Primes to consider
Here are the ranges for which certain primes need to be considered. GeoffreyT2000 (talk) 21:26, 2 May 2015 (UTC)
 * 4–8: 2
 * 9–24: 2, 3
 * 25–48: 2, 3, 5
 * 49–120: 2, 3, 5, 7
 * 121–168: 2, 3, 5, 7, 11
 * 169–288: 2, 3, 5, 7, 11, 13
 * 289–360: 2, 3, 5, 7, 11, 13, 17
 * 361–528: 2, 3, 5, 7, 11, 13, 17, 19
 * 529–840: 2, 3, 5, 7, 11, 13, 17, 19, 23
 * 841–960: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
 * pn2–pn+12-1: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, ..., pn

For the numbers 2 and 3, no trial division ever needs to be done. Those numbers are vacuously prime. GeoffreyT2000 (talk) 21:26, 2 May 2015 (UTC)