Talk:Proofs of Fermat's theorem on sums of two squares

Don Zagier's proof
A recent addition to Don Zagier gives references to a short proof that perhaps deserves mention here. 165.189.91.148 20:43, 14 September 2006 (UTC)

Other proofs
Two more proofs can be seen here: http://planetmath.org/encyclopedia/ProofOfThuesLemma.html (I don't know why they call the result Thue's Lemma.)


 * Feel free to add an outline of those proofs, or to add the reference. Magidin 15:08, 21 February 2006 (UTC)

I added a link to those other proofs. Is that you, Arturo? :) LDH 10:54, 22 February 2006 (UTC)

Some of the proofs given in the article would be greatly simplified if they made use of Euler's criterion. This shows that -1 is a quadratic residue modulo p if and only if p is congruent to 1 modulo 4 (sorry, I don't know how to input the mathematical formula!). One day I'll have a go at rewriting the article to use this. --Rmw1246 (talk) 08:50, 22 June 2010 (UTC)
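For instance, the stated consequence of Euler's criterion can be checked numerically. A small Python sketch (the helper names are mine, and the brute-force search over residues is purely for illustration):

```python
def is_qr_minus_one(p):
    # Direct check: does x^2 = -1 (mod p) have a solution?
    return any(x * x % p == p - 1 for x in range(1, p))

def euler_criterion_minus_one(p):
    # Euler's criterion applied to -1: a residue iff (-1)^((p-1)/2) = 1 (mod p).
    return pow(p - 1, (p - 1) // 2, p) == 1

# The two checks agree, and succeed exactly for the primes p = 1 (mod 4):
for p in [5, 13, 17, 29, 37]:
    assert is_qr_minus_one(p) and euler_criterion_minus_one(p)
for p in [3, 7, 11, 19, 23]:
    assert not is_qr_minus_one(p) and not euler_criterion_minus_one(p)
```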

When a recent TV programme demonstrated this result, without attribution, my companion was astonished, and I couldn't say why it was true. None of the proofs on this page is public-friendly though (not a criticism), and I quite liked the second proof of Thue's lemma, which can be presented as a constructive algorithm that would run, if immensely more slowly than a brute-force search. My companion used to be a programmer and may understand it. Would it be worth outlining it here? It goes like this (please delete if you disagree). PS must learn LaTeX. PPS why does blockquote reduce the font size?

    y = ((p-1)/2)! mod p       -- Wilson's theorem gives y^2 = -1 (mod p)
    x = 1
    m = (1 + y^2) / p          -- this step needs a pointer to Euler's result
    while m > 1
        if m is even
            m' = m / 2
            x' = (x + y) / 2
            y' = (x - y) / 2
        else
            a = +/- x mod m    -- i.e. a is in (-m/2 .. m/2)
            b = +/- y mod m
            m' = (a^2 + b^2) / m
            x' = (a*x + b*y) / m
            y' = (a*y - b*x) / m
        fi
        x = x'; y = y'; m = m'
    end while

This can be seen to be a terminating loop, since m strictly decreases at each step, and the loop invariant (to be proved) is m*p = x^2 + y^2. When m is 1, x and y are a solution. A1jrj (talk) 19:20, 27 October 2008 (UTC)
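The descent above can be sketched in Python. This is a hedged sketch of the algorithm as outlined, not of any published implementation; in particular, the symmetric-remainder helper is my reading of what "+/- x mod m" means, and the integers stay small enough that exact integer division applies throughout:

```python
from math import factorial

def sum_of_two_squares(p):
    """Fermat-style descent: write the prime p = x^2 + y^2, for p = 1 (mod 4)."""
    assert p % 4 == 1

    def symmetric(v, m):
        # representative of v mod m lying in (-m/2, m/2), for odd m
        r = v % m
        return r - m if r > m // 2 else r

    # Wilson's theorem gives y^2 = -1 (mod p) for y = ((p-1)/2)! mod p.
    y = factorial((p - 1) // 2) % p
    x, m = 1, (1 + y * y) // p          # invariant: m*p = x^2 + y^2
    while m > 1:
        if m % 2 == 0:
            # x, y have the same parity, so these divisions are exact
            x, y, m = (x + y) // 2, (x - y) // 2, m // 2
        else:
            a, b = symmetric(x, m), symmetric(y, m)
            # all three divisions are exact, and the new m is < m/2
            x, y, m = (a * x + b * y) // m, (a * y - b * x) // m, (a * a + b * b) // m
    return abs(x), abs(y)
```

For example, sum_of_two_squares(13) gives (3, 2), since 13 = 9 + 4.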

Unique Factorisation Domain
It seems misleading to say "Since Z[i] is a unique factorization domain (in fact, a Euclidean domain), every ideal is principal."

Z[i] is a Euclidean domain and as such it is a PID, but this is not guaranteed by the fact that it is a unique factorisation domain.

Suggest the following revision: "Since Z[i] is a principal ideal domain (in fact, a Euclidean domain), every ideal is principal." Or, alternatively, just say that this follows from the fact that Z[i] is a Euclidean domain. —The preceding unsigned comment was added by 163.1.146.236 (talk) 18:27, 1 February 2007 (UTC).


 * I made the change, but I wanted to point out the reason it was written as it was: basically, it was paraphrased from Dedekind's argument. For rings of integers of number fields, the conditions of being a PID and being a UFD are equivalent; at the time, these were the only general "rings" of interest. Dedekind was interested in unique factorization, not the PID property, though he had shown that the two were equivalent. Magidin 19:57, 3 February 2007 (UTC)

Euler's proof
In Euler's proof, in his second statement, it is not necessary to state that p^2 + q^2 divides (ap+bq)^2, because in the final expression a fraction is left over. A better statement would be that the left-hand side of the final expression is an integer and the second term on the right-hand side is an integer, so the first term had better be an integer. This is needed because the above statement does not imply that (p^2 + q^2)^2 divides (ap+bq)^2.


 * Whether this is better or not is not really material. The proof is as it was quoted in Edwards's book, and is meant to be the proof that was actually given by Euler. No doubt it can be improved, but that is not the point here. Also, please remember to sign and date your comments. Magidin 18:24, 25 March 2007 (UTC)

The first term on the left-hand side of the expression is ((ap+bq)/(p^2 + q^2))^2. So (ap+bq)^2/(p^2 + q^2)^2 is an integer. Therefore the remaining term is a fraction (1/(p^2 + q^2)). This cannot be, because the quotient is an integer and the second term on the right-hand side is also an integer, so the first term had better be an integer. This is contradictory.

There's a problem in Step 2 of the proof, which states:
 * "If a number which is a sum of two squares is divisible by a prime which is a sum of two squares, then the quotient is a sum of two squares."

Now 45 = 36 + 9 = 6^2 + 3^2, so 45 is a sum of two squares. And 5 = 1^2 + 2^2, so 5 is a prime which is a sum of two squares. But 45/5 = 9, and 9 is not the sum of two squares.

I don't see a quick and easy edit to fix the problem and don't have access to an authoritative text of Euler's proof. Anyone know what's wrong? --Michael Ross (talk) 01:37, 3 October 2010 (UTC)


 * 9 is the sum of the square of 3 and the square of 0. Every square is the sum of two squares. Magidin (talk) 01:49, 3 October 2010 (UTC)

0 is to be allowed in the sum of squares definition? Yes, I see it now - the broader definition is assumed throughout the proof, except in steps 4 and 5 where it's explicitly narrowed to only coprime pairs. It all makes sense now. Thanks. --Michael Ross (talk) 20:44, 3 October 2010 (UTC)

Magidin, I don't know if one of us should copy the previous posts in our discussion to this page; I only continued posting to your page in response to your posting to mine, so that there was some kind of continuity. My point is precisely context: neither you nor I can know in what context a user will access these pages, as they are accessed by the general public, and so we cannot know what meaning they will ascribe to the ambiguous term "number". I agree anyone who was reading the Edwards text in full would know the context and what meaning to ascribe within that context, but when it is quoted here outside that context the ambiguous term should be defined. I also agree any mathematician will get the implication even without knowing the full context, but I feel that a member of the general public who comes to this page shouldn't be expected to have to work it out. Again, I already pointed out that personally I am aware that the statements are correct (that was what I stated when I first made the edit), but they may confuse some visitors to these pages. The very fact that even whether zero should be included in the natural numbers is debatable should illustrate my point about using ambiguous terms such as "number". In other wiki proofs that I have read, the domain is often stated even though it would be apparent to any mathematician. As for my assumptions, I did not assume you misunderstood me due to lack of knowledge on your part; however, as you had not specifically responded to my suggestion of a preamble in either the positive or the negative, I felt I should repeat it in the manner I did, both to make clear I understood and took your point about not changing the quoted part of the text, and to make clear that although I agreed with that, I still felt a short preamble would make the context clear to any reader. If I gave offence it was not intended. Rhuaidhri (talk) 22:59, 18 August 2015 (UTC)
 * Perhaps you can say explicitly what you want to add as a preamble to the paragraph, and where the addition would be. It would be easier to discuss specifics. Note that because this is in a sense a subsidiary page focusing on proofs, a certain amount of mathematical maturity and knowledge is assumed; the main page is the one supposed to be aimed at members of the general public, with this page being referenced there as one providing more mathematical detail and necessarily assuming more mathematical understanding from the reader. Magidin (talk) 04:11, 19 August 2015 (UTC)

Lagrange's proof through quadratic forms
"Lagrange proved that all forms of discriminant − 1 are equivalent." This can't be right. (x2 + y2) and (- x2 - y2) represent different sets of integers, nonnegative and nonpositive, respectively, so they are not equivalent. Should this read "Lagrange proved that all forms of discriminant -1 with a>0 are equivalent"? 68.162.145.63 (talk) 12:50, 11 March 2009 (UTC)
 * Hrmph. I took that out of Stillwell's introduction to Dedekind's Theory of Algebraic Integers, and there is no such restriction. He seems to be fudging over forms and reduced forms: Lagrange showed that any form with negative discriminant can be transformed into an equivalent form which is reduced in the sense that $$|b|\leq a\leq c$$ (the restriction to reduced forms occurs just before and just after stating that forms of discriminant -1 are all equivalent). But even taking that into account, it is clear there is something getting lost somewhere... Magidin (talk) 13:28, 13 March 2009 (UTC)
 * I really believe that in the meantime that section should be deleted, or maybe there should be an alert that warns unsuspecting users like me against relying on that statement. Peleg (talk) 18:27, 10 April 2009 (UTC)
 * Deleting the whole section is incredibly draconian. I added the restriction to reduced forms for now. Magidin (talk) 01:31, 11 April 2009 (UTC)

Dedekind's first proof
In Dedekind's first proof there is a nontrivial step: $$\omega^p \equiv \omega \pmod{p}$$ implies that the ideal (p) is not prime. I suggest some sort of proof, or at least an argument, for this step should be presented in the article. There might be an easier proof of this; however, I found the following proof:

Let $$p = 4n + 1$$. Then

$$\omega^p - \omega = \omega(\omega^{2n} - 1)(\omega^n + i)(\omega^n - i)$$

Now the polynomial $$\omega(\omega^{2n} - 1)$$ has at most $$2n + 1$$ roots in the field $$\mathbb{Z}_p$$. By choosing a non-root rational integer for $$\omega$$ we get the desired result. In fact, this argument resembles Dedekind's second proof. —Preceding unsigned comment added by 130.243.203.220 (talk) 13:06, 5 June 2009 (UTC)


 * The fact that p is congruent to 1 modulo 4 is used to show that $$\omega^p \equiv \omega\pmod{p}$$; you don't need to use it again. The result is really immediate: from $$\omega^p\equiv \omega\pmod{p}$$ it follows that p always divides $$\omega^p - \omega = \omega(\omega^{p-1}-1)$$. Since not every $$\omega$$ is either divisible by p or has a p-1st power congruent to 1 modulo p, it follows that while $$\omega^p - \omega$$ is always a product divisible by p, it is not always the case that p divides at least one factor, so (p) is not a prime. Magidin (talk) 15:56, 5 June 2009 (UTC)


 * You claim "not every $$\omega$$ is either divisible by p or has a p-1st power congruent to 1 modulo p". To me, this is the nontrivial step. Do you have a simple proof of it? —Preceding unsigned comment added by 130.243.203.220 (talk) 18:12, 5 June 2009 (UTC)
 * Actually, I led you down the wrong path and introduced a red herring; the proof does not depend on counting roots of polynomials or finding non-roots. The claim that the ideal (p) is not prime actually follows from something else, namely the result that factors the prime (p) in number fields in terms of the smallest value of a such that $$\omega^{p^{a}}\equiv \omega \pmod{p}$$ holds for all values of $$\omega$$. I took the proof from Dedekind's writing, which assumes that result (having proven it a couple of pages earlier), and from which it immediately follows that (p) is a product of two primes. But you are correct that it should be expanded a bit here.
 * Also, note that your argument above doesn't work in any case: in fact, every rational integer satisfies $$\omega^p-\omega\equiv 0 \pmod{p}$$ (this is Fermat's Little Theorem). Your error lies in thinking that the i in the factorization does not lie in the field of p elements. But when p is congruent to 1 modulo 4, then the congruence $$x^2 \equiv -1 \pmod{p}$$ has two solutions, which correspond to the symbols i and -i; so in fact the polynomials $$\omega^n +i$$ and $$\omega^n - i$$ do have solutions in that field. For example, if p=5, then i and -i "really" means 2 and 3 (in some order), since their squares are both -1. The polynomial you are looking at is identically zero when evaluated in that field (that is precisely why the congruence always holds modulo p), so picking a rational integer that is not a root of $$\omega(\omega^{2n}-1)$$ modulo p is insufficient: it will be a root of one of the other two factors.
 * I'll add the short explanation. Also, don't forget to sign your posts. Magidin (talk) 19:10, 5 June 2009 (UTC)
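The point that i and -i "really mean" 2 and 3 when p = 5 is easy to check numerically (a purely illustrative Python sketch; here n = 1, since p = 4n + 1 = 5):

```python
p = 5
# The two square roots of -1 modulo 5 play the role of i and -i.
roots = sorted(r for r in range(p) if r * r % p == p - 1)
assert roots == [2, 3]

# Fermat's Little Theorem: every rational integer is a root of w^p - w mod p.
# So a non-root of w*(w^2 - 1) mod 5 is necessarily a root of w - 2 or w - 3
# (that is, of w^n - i or w^n + i), which is why the counting argument fails.
for w in range(p):
    assert pow(w, p, p) == w % p
    assert any(f % p == 0 for f in (w, w * w - 1, w - 2, w - 3))
```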
 * The equation above is in $$\mathbb{Z}[i]$$ and not in $$\mathbb{Z}_p$$. I only moved to $$\mathbb{Z}_p$$ to find a non-root of the first factor. Then I take an integer representative of that non-root as $$\omega$$. This makes sure $$p$$ divides the integer $${(\omega^{n})}^2 + 1$$. The rest follows Dedekind's second proof.
 * But of course, Dedekind's original proof is the important one. Djalal 130.243.135.182 (talk) 06:41, 6 June 2009 (UTC)
 * Actually, you were working over the ring of polynomials, not over the domain in question; you moved to the field of p elements to count roots of the first factor; there is an implicit argument that there must be non-roots by counting there, but if you are not working there then your counting argument does not apply. The point is that you cannot take "an integer representative of that non-root", because there is no such thing. All rational integers satisfy the relevant equation by Fermat's Little Theorem. Magidin (talk) 18:42, 6 June 2009 (UTC)
 * It is a matter of taste whether you look at the equation as an equality between two elements in a polynomial ring or as an equality between two expressions in one variable.
 * About the proof: Consider an element $$c$$ of $$\mathbb{Z}/p\mathbb{Z}$$ satisfying $$c^{2n+1}-c \ne 0$$. Choose a $$d \in \mathbb{Z}$$ such that $$d + p\mathbb{Z} = c$$.
 * We know that $$p = 4n+1$$ divides $$d^p - d = (d^{2n+1} - d)(d^{2n} + 1)$$ in $$\mathbb{Z}$$ and that it does not divide $$d^{2n+1} - d$$. But $$p$$ is a prime number hence $$p$$ divides the integer $$d^{2n} + 1$$. Djalal 130.243.203.220 (talk) 20:01, 6 June 2009 (UTC)

Ehr, no, it is not simply a "matter of taste". When working over finite fields, polynomials are not in one-to-one correspondence with polynomial functions. The expression $$\omega^p$$ and the expression $$\omega$$ are equal when viewed as expressions of one variable evaluated in the field of p elements, but they are not equal when viewed as polynomials in one variable with coefficients in the field of p elements. Magidin (talk) 01:19, 7 June 2009 (UTC)
 * Clearly you misunderstood me once again. I am talking about a particular equation
 * $$\omega^{4n+1} - \omega = \omega(\omega^{2n} - 1)(\omega^n + i)(\omega^n - i)$$
 * this equality holds in both senses.
 * When it comes to the general case one doesn't usually view expressions as functions, which is what you seem to do. The article Expression_(mathematics) says "Being an expression is a syntactic concept". Djalal 130.243.135.182 (talk) 08:00, 7 June 2009 (UTC)
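The distinction at issue (equal as polynomials vs. equal as functions on a finite field) can be made concrete. A Python sketch, with a polynomial represented as its list of coefficients, lowest degree first (representation and helper name are mine):

```python
p = 5
w_to_the_p = [0] * p + [1]   # the polynomial w^5
w_itself = [0, 1]            # the polynomial w

# They are different polynomials (different coefficient lists)...
assert w_to_the_p != w_itself

def evaluate(poly, v, p):
    # value of the polynomial at v, computed in the field of p elements
    return sum(c * pow(v, k, p) for k, c in enumerate(poly)) % p

# ...yet equal as functions on the field of p elements (Fermat's Little Theorem).
assert all(evaluate(w_to_the_p, v, p) == evaluate(w_itself, v, p) for v in range(p))
```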

Z[i]
I had reverted the addition of the symbol Z[i] after "Gaussian integers"; the editor who put them there in the first place reverted that with a summary that said "a correspondence between Z[i] and gaussian integers is needed". I do not understand what he means by "correspondence"; Z[i] is the common notation for the Gaussian integers, but there is no "correspondence" at play. The notation is mentioned in the lead paragraph of the Gaussian integers article; I do not see what it adds here. Someone who doesn't know what the Gaussian integers are is unlikely to be informed by the simple addition of "Z[i]" after the name, and will have to follow the link anyway. I would ask that the editor explain his edit comment, as to what "correspondence" he believes is needed, why, and how the (in my opinion unnecessary and uninformative) addition of the four characters achieves this. Magidin (talk) 01:07, 29 August 2010 (UTC)


 * I was unfamiliar with Gaussian integers, and the article does become more readable if it is made clear that Z[i] is the symbol for Gaussian integers. Otherwise, you encounter the symbol without knowing where to find its meaning. Now other readers in my situation simply click the link. Actually, I do not understand the objection to it. - JT —Preceding unsigned comment added by JeffTowers (talk • contribs) 02:48, 29 August 2010 (UTC)
 * So... before, if you did not know what the Gaussian integers were, you would not click on the link... and now, because there is this extra symbol after the words "Gaussian integers", you would click on the link? How does the extra symbol encourage you to click, or its absence discourage you?
 * As to the objection, I don't see that it adds anything and it is actually rather awkward where you placed it; the symbol is irrelevant to that sentence, hence my removal. If you want to make explicit that Z[i] represents the Gaussian integers (which is a point that might have some merit), then the correct place to do it is not in this sentence, it is in the sentence that starts "Kummer had already established..." which is where it is first used. If you did not know what the Gaussian integers were before, the symbol will not tell you, and the second sentence of the first proof will already be unintelligible. I don't see how the symbol encourages you to click on the link, or how its absence would deter you from doing so. Magidin (talk) 03:03, 29 August 2010 (UTC)

It was hard to make the connection between Z[i] and Gaussian integers, at least for a non-specialist. I am in the wider audience of this article; I am hoping to find or construct a fun proof for non-specialists, and one using complex numbers like this would be ideal. A younger audience learning Pythagoras' theorem could also first encounter Pythagorean triples, Fermat's Last Theorem, and this little gem.

Incidentally, I found this link http://www.vex.net/~trebla/numbertheory/gaussian.html to be helpful in understanding primeness in Gaussian integers, maybe more so than the Wikipedia link, as I did not have to stare at a vertical bar until the word "divides" occurred to me. The generalization of regular integers to Gaussian integers was clearer with less effort. - JT JeffTowers (talk) 03:47, 29 August 2010 (UTC)

It looks to me like the article uses Gaussian integers and Z[i] interchangeably, with Z[i] being a shorthand. So I would think the best place is at the first occurrence in the first sentence. I was hoping to make the suggested change but unfortunately found I was of another mind. JT (my tilde just became a / so I cannot sign) —Preceding unsigned comment added by JeffTowers (talk • contribs) 04:14, 29 August 2010 (UTC)
 * I think it does not make sense to put it in the introductory sentence to that section. It makes far more sense to establish the correspondence immediately before it is used, which is in the sentence I mentioned before: "Kummer had already established..." The symbol is not used before then. I don't understand your last sentence. Magidin (talk) 21:51, 29 August 2010 (UTC)
 * This is a tempest in a teapot, but for what it's worth, the symbol Z[i] should be placed right after the first mention of the Gaussian integers in the text. Magidin, it makes little sense to place it where you put it because the fact that Gaussian integers can be written as x+yi has been used several times by then. The phrase "a correspondence between Z[i] and gaussian integers is needed" makes perfect sense to me, both as a description of the edit and as a desirable feature. On the other hand, previewing the link to "Gaussian integers" (if you have that option enabled) displays the notation for them. Magidin, I think that you are being unreasonably obstinate and somewhat overbearing. Even if you were right, it would still not be worth it to fight over little things like that, especially since your energy may be used more productively to improve the readability of the article (e.g. the section on Lagrange's proof begs for a clean-up). I suggest that you restore Z[i] to the place where Jeff inserted it first and take a fresh look at what's truly important. Arcfrk (talk) 08:42, 30 August 2010 (UTC)
 * What I don't understand is, if there is a problem of understanding what the Gaussian integers are (which seems to be the case), how the simple mention of the symbol will solve that problem. Simply writing the symbol does not "establish a correspondence", does not explain what the symbol means or what Gaussian integers are. What you are saying here is that there is a need to explain, at least briefly, what the Gaussian integers are; fine. I would have thought that was what the link was for, but there you go. (As I understand it, this being a page about proofs, it's not necessarily expected to be at the level of "any interested reader should be able to follow".) If we are going to assume unfamiliarity with the Gaussian integers, then why are we assuming that the symbol will be intelligible by itself? If the problem is that the section needs expansion, then it needs expansion, not just a symbol thrown in. Magidin (talk) 14:14, 30 August 2010 (UTC)
 * Thanks for your input, Arcfrk. I agree that the article has bigger issues. When I was googling further I found a text that was too similar for comfort. When I google a small clip of the article, like "succeeded in proving Fermat's theorem on sums of two squares in 1747", I get a hit with section 2.4 of Harold Edwards's book. While it is referenced in the article, I think we are quoting the book more or less verbatim. Is there a way to increase the credit given to Harold Edwards in this article? More like: in his book[ref] Harold Edwards makes the following points... And then proceed to fill in the gaps. Which brings us to gaps. Anything that increases the lucidity of the article improves it. As Strunk and White say, the writer has to remember that the reader is in a swamp and needs to be rescued. It is a skill to perceive lucidity, and any given author can run hot and cold at times. I suspect an academic community can also foster/abet an attitude that is 180 degrees off. I think these are two important areas for improvement on this article. -JT —Preceding unsigned comment added by JeffTowers (talk • contribs) 07:44, 3 September 2010 (UTC)
 * I should also add that I did benefit from the article as it stands. Also I found the following article helpful (even admirable): http://fermatslasttheorem.blogspot.com/2005/06/properties-of-gaussian-integers.html by Larry Freeman. -JT JeffTowers (talk) 07:59, 3 September 2010 (UTC) —Preceding unsigned comment added by JeffTowers (talk • contribs) 07:56, 3 September 2010 (UTC)

Degree of a prime ideal?
I've just reworded the proofs by Dedekind. One notable point is that I dropped the mention of "degree of a prime ideal" in the first proof, because it plays no role, and because I have no idea what it meant in this context. But maybe somebody would like to explain this anyway? Also it might be worthwhile to note that the second proof actually provides an efficient method to find x and y, provided an integer m is known (which is easy: just try $$\tfrac{p-1}4$$-th powers of random integers, each of which has a 50% chance of being such an m): compute the gcd of p and m+i in the Gaussian integers. Or maybe this should go in the parent article? Marc van Leeuwen (talk) 12:43, 5 March 2011 (UTC)
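The gcd method described above can be sketched in Python. This is a hedged sketch (the function names are mine, and m is found by a deterministic search over $$\tfrac{p-1}4$$-th powers rather than random trials); Gaussian integers are represented as (re, im) pairs, and floating-point rounding in the quotient is adequate for moderate p:

```python
def gaussian_gcd(a, b):
    """Euclidean algorithm in Z[i], with Gaussian integers as (re, im) pairs."""
    while b != (0, 0):
        ar, ai = a
        br, bi = b
        n = br * br + bi * bi
        # q = nearest Gaussian integer to a/b
        qr, qi = round((ar * br + ai * bi) / n), round((ai * br - ar * bi) / n)
        # remainder a - b*q has norm at most n/2 < n, so the loop terminates
        a, b = b, (ar - (br * qr - bi * qi), ai - (br * qi + bi * qr))
    return a

def two_squares(p):
    """Write the prime p = 1 (mod 4) as x^2 + y^2 via gcd(p, m + i)."""
    assert p % 4 == 1
    # find m with m^2 = -1 (mod p): a^((p-1)/4) works whenever a is a non-residue
    m = next(pow(a, (p - 1) // 4, p) for a in range(2, p)
             if pow(a, (p - 1) // 2, p) == p - 1)
    x, y = gaussian_gcd((p, 0), (m, 1))
    return abs(x), abs(y)
```

For example, two_squares(13) returns (3, 2), since 13 = 9 + 4.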

Euler's proof: the $$k$$th differences of the sequence $$1^k, 2^k, 3^k,\dots$$ are all equal to $$k!$$
Hi,

Can someone provide a proof or a link to a proof on that?

If I got it right, it would depend upon the identity:

$$k! = n^k - \sum_{i=1}^{k} \binom ki (n-i)^k$$, for any $$n>k$$ at least

Wisapi (talk) 23:46, 9 May 2011 (UTC)


 * Don't know what this question does here. Look at Finite difference for an answer. Marc van Leeuwen (talk) 13:57, 10 May 2011 (UTC)


 * Thanks, I'm going to look into this article. I posted the question here because the claim above was made without any explanation in the fifth step of Euler's proof (haven't you found it?), contrary, for example, to the argument that the product of two sums of two squares is always a sum of two squares, which has a link to another page, while this one does not.


 * I also realized later on that my attempt to explain this claim with an identity was wrong. The identity does not describe what is obtained by taking the kth difference; I made a mistake and I have to think it over again. It is just a pity that everything else in this proof was explained so well, except the last step. Wisapi (talk) 18:44, 14 May 2011 (UTC)


 * The explanation is that the difference operator applied to polynomials reduces the degree by 1 and multiplies the leading coefficient by the degree of the polynomial (which is easily checked). As a consequence, applying the difference operator $$n$$ times to any sequence given by a monic polynomial of degree $$n$$, such as $$x^n$$, results in a constant sequence with value $$n!$$. This can now be found at the end of Finite difference, which I just modified. Marc van Leeuwen (talk) 07:49, 15 May 2011 (UTC)
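The degree-lowering argument is easy to verify for a concrete n (a Python check of my own, with step h = 1 as in the discussion above):

```python
from math import factorial

def diff(seq):
    # forward difference with step h = 1
    return [b - a for a, b in zip(seq, seq[1:])]

n = 4
seq = [k ** n for k in range(1, 12)]  # 1^4, 2^4, 3^4, ...
for _ in range(n):
    seq = diff(seq)                   # each pass lowers the degree by one
assert seq == [factorial(n)] * len(seq)  # all 4th differences equal 4! = 24
```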


 * Hm, I guess the identity:
 * $$\Delta_h^n (x^n)=n!\,.$$
 * is only true for h=1, or am I wrong? Anyways, this is what we have here in the article, so I'm satisfied. I'll add the link onto the article. Thank you for your help :) ! Wisapi (talk) 13:33, 15 May 2011 (UTC)

Before you explain the theorem, illustrate what the theorem means!
This article is completely remiss in not providing an example of any kind to demonstrate the practical application of the proof - that is, not the proof itself, but just what the theorem shows.

129.63.2.71 (talk) 10:09, 17 July 2014 (UTC)
 * This article is about the proofs; the main page contains examples of what the theorem says. Magidin (talk) 16:32, 17 July 2014 (UTC)