Talk:Pumping lemma for regular languages

Error in examples/statements regarding the lemma?
The second paragraph says "w = xyz where the middle portion y must not be empty", but then immediately lists xz as an example? This would mean y is the empty string. The paragraph also goes on and says, "...constructed by repeating y zero or more times..." This seems a tiny bit misleading. It might mean repeating it 0 or more times AFTER the first occurrence of 'y', but it seems to imply y itself can be listed 0 or more times, which again, contradicts the definition. — Preceding unsigned comment added by 73.145.200.192 (talk) 08:24, 19 January 2019 (UTC)

Okay, so I think here is the answer: y, as a string itself must not be the empty string. However, in the context of the pumped string (xyz), it may be repeated 0 or more times, so xz is valid. This could use some more articulation in the article itself though, because the wording is confusing. — Preceding unsigned comment added by 73.145.200.192 (talk) 08:54, 21 January 2019 (UTC)

Easy explanation
I added the the Informal statement and explanation section after remembering the frustration I felt when I first tried to understand the pumping lemma. Any input is welcome. I'm not a native English speaker, so I might have messed up at places. Ulfalizer

TeXify
Any thoughts on TeXifying at least the lemma statement? Might look a little nicer. 141.210.102.198 16:49, 20 April 2007 (UTC)

Lemma statement
Where is the statement of the lemma? Shouldn't it be front and center?
 * Good point. It used to be there.  I haven't found the edit yet that removed it. Fixed. Jrincayc 04:14, 14 November 2006 (UTC)

Don't understand how y can be bcbc
I made the change to y so it would be 'bc' rather than 'bcbc' (below) because [so I thought] |xy| must be less than or equal to p. The pumping length appears to be 4 (4 states in the FSA), so if x=a and y=bcbc, |xy|=6 and the language would not be regular. The conditions in Sipser (p.78 2nd ed.) don't seem, IMHO, consistent, but could be read to say y must be at least 'bc'.

"In terms of the pumping lemma, the string abcbcd is broken into a x portion a, a y portion bcbc and a z portion d. (Note, it could be broken in different ways such as a x portion a, a y portion bc and a z portion bcd)"


 * Yes, you are correct. I will fix the page.  Jrincayc 03:24, 6 October 2006 (UTC)

Example in Lemma Not Sufficient section
The example is rather difficult to follow and actually seems very wrong to me. It needs to prove that for any p > 0, for all strings in the example language, there is a way of dividing it into xyz such that for all integers i &ge; 0, xyiz is also in the example language (where, of course, p i x y z are as in the PL for regular languages). Am I being ignorant or does this need some serious fixes? Citrus538 15:47, 17 December 2006 (UTC)


 * As far as I see, it needs at least one fix. For short palindromic prefixes uuR we can pump the fisrt letter of the remaining part v (as in the article). However, if the prefix is long we must pump the palindrome. Not only pumping up (which is OK as in the article), but also pumping down (taking k=0). This seems not possible in general. Jochgem 07:43, 13 April 2007 (UTC)
 * Indeed, the example is incorrect (as far as I can see) because it does not account for pumping down (k=0). I shall take the liberty of replacing it with an example of my own contrivance, which, though somewhat more complicated, has the advantage of being correct. If someone comes up with a simpler example, please be bold and edit it in.
 * My example involves a 4-character alphabet (0,1,2,3) and a language consisting of {w| at least one substring of w that has length 3 contains at least one duplicate character} U {w| precisely 1/7 of the characters of w are 3s}. The first condition ensures that any string at all whose length is greater than 5 is pumpable (both up and down)--even if the original string is not in the language. The second condition ensures that the language is irregular. JudahH (talk) 04:25, 19 March 2009 (UTC)
 * The example is not correct: pumping destroys the 1/7 property.Talan Gwynek (talk) 02:18, 12 June 2012 (UTC)
 * My notation above wasn't clear because I used ordinary ASCII characters to approximate mathematical symbols, but the capital U was meant to stand for the operation of "union". In other words, the rule is that the language consists of all strings that satisfy one of the two conditions. So pumping will always produce a string that's in the language because it satisfies the first condition, yet there will be other strings that are in the language because they satisfy the second, alternative condition, and some of those are not producible by pumping. JudahH (talk) 14:41, 28 June 2012 (UTC)

Unwarranted assumption in first example?
The first example says: "If we let |xy|=p and |z|=p, then xy is the first half of w, or all p of the as". I don't think we can do this - don't we have to consider any case where |xy| &le; p? The important thing is that y contains at least one a and no bs, which does follow directly from |xy| &le; p. Dcoetzee 22:14, 14 September 2007 (UTC)

I had exactly this confusion when I read the proof. I think it should be assumed that |xy| &le; p and then be observed that y can only contain a's. If we pump y then we get an unbalanced string which is not in the language. This should be changed but I don't feel qualified to do it. -Max

Philosophy?
Why would this be in the philosophy section? It should be in the computer science / lingustics section. —Preceding unsigned comment added by 24.108.202.22 (talk) 09:05, 19 November 2007 (UTC)

Lemma Statement too weak?
IMHO the inner part of the Lemma is always true (edit: which reduces the lemma to $$\forall L\subseteq \Sigma^* (\mbox{regular}(L) \Rightarrow true)$$):

$$\exists x,y,z \in \Sigma ^* (w=xyz \Rightarrow [\text{three conditions}])$$

There always exists a combination of x,y,z which is unequal to w which makes the whole term true without even having to consider the three conditions. This becomes more clear when trying to negate the lemma (in order to prove a language is not regular). The inner part in this case becomes:

$$\forall x,y,z \in \Sigma ^* (w = xyz \land \neg[\text{three conditions}])$$

Obviously w = xyz cannot be true for all x,y,z in $$\Sigma ^*$$.

But what the lemma should say is that there is a concatenation xyz = w which also satisfies the three conditions. So imho the implication should be replaced by a conjunction:

$$\exists x,y,z \in \Sigma ^* (w=xyz \land [\text{three conditions}])$$

Please correct me if I'm wrong. Otherwise I'll change the lemma in a few days.

Tcommbee (talk) 00:57, 22 June 2009 (UTC)
 * This is correct, a small oversight on the part of the original author. Dcoetzee 22:38, 22 June 2009 (UTC)

Confused
Let Σ = {a, b}, is L = {ab} (a language that consists of only one word) a formal language? If so, is it a regular language? [My guess is that it is, since it can be constructed by the grammar ({A, S}, Σ, {S -> aA, A -> b}, S)] If so, does the pumping lemma hold for it? (It doesn't seem to...) Wyverald (talk) 02:37, 8 November 2010 (UTC)


 * Oh wait. Does the pumping lemma hold true for all limited-length languages because you can just let p be greater than the length of the longest word in the language? Silly me. On second thought, shouldn't we clarify this in the article? Wyverald (talk) 02:49, 8 November 2010 (UTC)


 * It is there: "Note that finite languages satisfy the pumping lemma trivially by having p equal to the maximum string length in L plus one." It could probably be clearer though.  Jrincayc (talk) 13:33, 8 November 2010 (UTC)

Removed example of balanced parenthesis
Pumping lemma holds true for a language of balanced parentheses (which is still non regular):

It is always possible to find a substring of balanced parentheses inside any string of balanced parenthesis. Such substring can be safely removed or repeated any number of times without ruining the balance.

For example there are two possible strings of length 4: () and . We can define y to be ' in both cases and see that any number of adding or removing ' from those strings will not break the balance.

WhitestOwl (talk) 01:10, 27 February 2011 (UTC)

General version of the pumping lemma
((Wikipedia email from, received 28 Sep 2015:))

Hello, Jochen!

Thank you for providing an example for the general version of the pumping lemma for regular languages.

But one can solve the problem you've mentioned ({ ambncn : m≥1 and n≥1 }) using a homomorphism from the alphabet to a set of strings. We can just erase "a" and the rest is easy. Could you provide an example where a homomorphism is of no use but the general version of the pumping lemma shows in its full glory?

I'm sorry if I'm asking too much or I've chosen a wrong way to communicate (I'm still new to the whole Wikipedia stuff).

Thanks in advance


 * Hello, I prefer to answer this way, so other editors may help us with the issue.
 * You are right with your objection to the example from Hopcroft and Ullman, but I don't have a full-glory example at hand. I'll try to think about one; possibly { cmbncn : m≥1 and n≥1 } would work? However, someone might ask for a citation, anyway. - Jochen Burghardt (talk) 16:20, 28 September 2015 (UTC)

Hi, the example of a non-regular language given to satisfy the non-generalised Pumping Lemma does not satisfy it. For example the word abc can be pumped down to bc which is not in the given language. This is a deep flaw in the example that cannot be easily fixed - as it relies on m ≥ 1.

131.111.5.175 (talk) 11:11, 16 May 2016 (UTC) Luke Shirley


 * Oops, I think you are right. In fact, Hopcroft and Ullman didn't claim that { ambncn : m≥1 and n≥1 } satisfies the non-generalised Pumping Lemma; they just asked to prove its non-regularity by the generalized version. Also, I agree that there is no obvious way to fix the example. For now, I'll just withdraw the wrong claim from the article. You don't happen to know a good example language that (1) satistifes the non-generalised Pumping Lemma, (2) is nevertheless non-regular, and (3) can easily be proven so by the generalized Pumping Lemma? Or maybe even a language that in addition (4) can't be proven to be non-regular by some of the regular languages' closure properties, such as closure w.r.t. homomorphism mentioned above by ? - Jochen Burghardt (talk) 12:23, 16 May 2016 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified 1 one external link on Pumping lemma for regular languages. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Corrected formatting/usage for http://www.cse.chalmers.se/~coquand/AUTOMATA/rs.pdf

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at ).

Cheers.—cyberbot II  Talk to my owner :Online 12:40, 2 April 2016 (UTC)

Use of lemma section does not make it clear that there must be no p that satisfies the lemma
Re: my initial somewhat confused edit.

Shouldn't the condition be not valid for all $$p$$s?

Basically we need to negate $$(\exists p\geq 1) ( (\forall w\in L) ((|w|\geq p) \Rightarrow ((\exists x,y,z \in \Sigma^*) ...$$, obtaining $$(!\exists p\geq 1) ( (\forall w\in L) ((|w|\geq p) \Rightarrow ((\exists x,y,z \in \Sigma^*) ...$$, but the wording of the usage example does not make it explicit that the $$L$$ can not be regular because there is no $$p$$ that satisfies the lemma (and this because not all (in this case, none) of the $$|w|\geq p$$ are such that $$\exists x,y,z \in \Sigma^* ...$$, for any $$p$$).

A counterexample for a $$L = \{a^n b^m : n \geq 0, m \geq 0\}$$ would also be nice (even if proving the lemma doesn't strictly prove that the language is regular, a counterexample of the usage of the lemma would still be nice).

Danogentili (talk) 16:23, 12 June 2023 (UTC)


 * Yes, in order to prove that $$L$$ is not regular, you need to prove the negation of
 * $$\exists p \geq 1.\; \forall w \in L.\; |w| \geq p \Rightarrow \exists x,y,z \in \Sigma^*.\; w=xyz \land |y| \geq 1 \land |xy| \leq p \;\land\; \forall n \geq 0.\; xy^nz \in L$$, which computes as
 * $$\forall p \geq 1.\; \exists w \in L.\; |w| \geq p \;\land\; \forall x,y,z \in \Sigma^*.\; w=xyz \land |y| \geq 1 \land |xy| \leq p \Rightarrow \exists n \geq 0.\; xy^nz \not\in L$$.
 * [I assume that each quantifier has a maximally long scope, and that $$\land$$ binds stronger than $$\Rightarrow$$ to save parantheses.]
 * So the proof has to start from an arbitrarily given $$p$$ and to find some $$w$$ which has no appropriate decomposition. we chose $$w = a^p b^p$$ and show that for every possible decomposition $$w = xyz$$ with $$|y| \geq 1$$ and $$|xy| \leq p$$, some of its pumpings, viz. $$xyyz$$ isn't in $$L$$. The proof follows the negated formula quite closely.


 * As for $$L = \{a^n b^m : n \geq 0, m \geq 0\}$$, this language is regular, so no counter-example can be found. - Jochen Burghardt (talk) 19:09, 12 June 2023 (UTC)
 * >So the proof has to start from an arbitrarily given $$p$$ and to find some $$w$$ which has no appropriate decomposition.
 * Again, shouldn't we prove that for every $$p$$, not just one (as shown by the fact that the negated expression starts with $$\forall p$$)?
 * $$L = {a^nb^m : n \geq 0, m \geq 0}$$ is indeed regular, so a counter-example could still be made, and it should at least find one $$p$$ such that $$\exists p \geq 1.\; \forall w \in L.\; |w| \geq p \Rightarrow \exists x,y,z \in \Sigma^*.\; w=xyz \land |y| \geq 1 \land |xy| \leq p \;\land\; \forall n \geq 0.\; xy^nz \in L$$, even if finding the $$p$$ does not strictly prove that the language is regular. Danogentili (talk) 08:18, 14 June 2023 (UTC)


 * Proving a universally quantified statement "∀x.φ(x)" is usually done by proving "φ(y)" for an arbitrarily given y; cf. any math textbook that elaborates on proof methods, and e.g. rule "∀R" in Sequent_calculus. Intuitively, since nothing has been assumed about y, if φ holds for that y, then it holds for every value.
 * The pumping lemma can't be used to prove that a given Language L is regular, since it provides a necessary, but not sufficient condition for regularity; cf. the "⇒" after "regular(L)" in the formal expression, and section Pumping_lemma_for_regular_languages. - Jochen Burghardt (talk) 08:47, 14 June 2023 (UTC)

"Converse not true" is grammatically incomplete
Headings are usually complete subjects, however "Converse not true" is not. "Problems with the converse" would be more accurate, or "Invalidity of the converse". Even just "The Converse" would be better, although I suspect it was changed from that specifically to make clear that the converse is untrue. Bearlyalivent (talk) 08:42, 11 February 2024 (UTC)
 * You have a point here. (As I'm not a native English speaker, I confine myself to some mathematical considerations.) Of your suggestions, I dislike "Problems with the converse", for the same reason I dislike e.g. "Problems with 2+2 being 4" - complaining about math facts is of no use. "Invalidity of the converse" is ok, while "The Converse" might suggest at first glance that we have to say something important about it, beyond that it is just false (by analogy, we wouldn't make a heading "2+2 equals 5" only to explain that this is a false proposition). I also thought about using "Not an equivalence", and adding in the section body a sentence like "The implication ($\Rightarrow$) in the lemma cannot be sharpened to an equivalence ($\Leftrightarrow$)". - Jochen Burghardt (talk) 18:12, 11 February 2024 (UTC)