Talk:Context-sensitive grammar

(Contrary to a previous version of this article, the decision problem is not undecidable.)

Natural Language?
The article says that Chomsky invented CSG for natural languages. Are CSGs really used in linguistics? I've only seen context-free grammars (or some mild extensions) in that context.


 * Yes, context-sensitive rewrite rules have been used in linguistics, but I do not know whether they are still in use today. Rp 13:36, 25 July 2007 (UTC)

Added by Paul Ogilvie: in computer science, efficient parsing algorithms have been developed only for context-free languages. These are now common, such as the yacc compiler-generator, which reads a CFL definition and generates a program that can recognize the context-free language. See Aho, Sethi, Ullman, Compilers - Principles, Techniques, and Tools, 1986. No algorithms have been developed (to my knowledge) that can parse a CSL, except heuristically.


 * Yacc doesn't parse arbitrary context-free languages, but only LALR(1) ones. An example of a general context-free parsing framework is ASF+SDF. Rp 13:36, 25 July 2007 (UTC)

Yes, there are natural languages that are not context free (take for example verbs in Swiss German). The algorithm for parsing CSL is quite straightforward (the complexity is high, obviously).

Contradicts definition?
The example has the rule

cB → Bc

but that means α = c on the left-hand side, but α is empty on the right-hand side? Or is the example simply using the alternative definition of context-sensitive? As you can see, I've only studied context-free grammars so far :)


 * I think the example is using a monotonic grammar for the sake of simplicity (as an equivalent context-sensitive grammar can be constructed, but it probably wouldn't be as simple).


 * A monotonic grammar and a context-sensitive one aren't necessarily the same, so maybe the difference between the grammar and the generated language should be further illuminated. --Bernhard Bauer 00:46, 28 July 2006 (UTC)


 * I think that the following grammar will work:
 * S -> aRc
 * R -> aRT | b
 * bTc -> bbcc
 * bTT -> bbUT
 * UT -> UU
 * UUc -> VUc -> Vcc
 * UV -> VV
 * bVc -> bbcc
 * bVV -> bbWV
 * WV -> WW
 * WWc -> TWc -> Tcc
 * WT -> TT
 * As you can see, it is rather more complicated than the one in the article. 66.218.45.98 00:39, 20 January 2007 (UTC)

There is a more standard definition of context sensitive rules (used in most textbooks): a rule x -> y is context sensitive iff |x| <= |y|.
 * Why should people believe those two definitions are equivalent? Liberulo (talk) 21:06, 30 December 2012 (UTC)
 * UPDATE: This paper has a proof of the equivalence of the context-sensitive languages and monotonic languages, but I cannot vouch for its correctness. Liberulo (talk) 21:20, 30 December 2012 (UTC)

According to this definition, Aa -> aA, a is terminal and A non terminal symbol, is context sensitive. Moreover, the authors claim that these 2 definitions are equivalent! How can this be possible? —Preceding unsigned comment added by 85.73.192.119 (talk) 21:01, 11 September 2008 (UTC)
 * I second this question: I'd like to know if and how these two definitions of CSGs are equivalent. Liberulo (talk) 21:06, 30 December 2012 (UTC)
 * UPDATE: Look here, see if this paper contains a satisfactory proof. Liberulo (talk) 21:20, 30 December 2012 (UTC)
 * This is the definition I have seen, for example in Peter Linz, 'An Introduction to Formal Languages and Automata (2nd Ed.)', Chapter 11.3 (1997). Note that if the empty word is to be in the language that needs to be directly specified as a special case, as 'non-contracting' rules obviously can't specify it. The reference Linz cites 'for example' to show that such a grammar is indeed 'context-sensitive' in the sense of the current state of this article is A. Salomaa, 'Formal Languages' (1973). If anyone can track this down it might be the best reference. 150.203.209.134 (talk) 05:08, 18 September 2013 (UTC)

Wrong grammar for language $$ \{ a^n b^n c^n : n \ge 1 \} $$
The grammar given for the language mentioned is not right according to the definition. Moreover the rule CB -> BC makes the grammar unrestricted.

I suggest the following context-sensitive grammar which does apply to the definition given.
 * 1) $$S \rightarrow aSBC$$
 * 2) $$S \rightarrow aBC   $$
 * 3) $$CB \rightarrow HB   $$
 * 4) $$HB \rightarrow HC   $$
 * 5) $$HC \rightarrow BC   $$
 * 6) $$aB \rightarrow ab   $$
 * 7) $$bB \rightarrow bb   $$
 * 8) $$bC \rightarrow bc   $$
 * 9) $$cC \rightarrow cc   $$

Again the derivation for "aaa bbb ccc" is:


 * $$S$$
 * $$\Rightarrow_1 aSBC$$
 * $$\Rightarrow_1 a\boldsymbol{aSBC}BC    $$
 * $$\Rightarrow_2 aa\boldsymbol{aBC}BCBC  $$
 * $$\Rightarrow_3 aaaB\boldsymbol{HB}CBC  $$
 * $$\Rightarrow_4 aaaB\boldsymbol{HC}CBC  $$
 * $$\Rightarrow_5 aaaB\boldsymbol{BC}CBC  $$
 * $$\Rightarrow_3 aaaBBC\boldsymbol{HB}C  $$
 * $$\Rightarrow_4 aaaBBC\boldsymbol{HC}C  $$
 * $$\Rightarrow_5 aaaBBC\boldsymbol{BC}C  $$
 * $$\Rightarrow_3 aaaBB\boldsymbol{HB}CC  $$
 * $$\Rightarrow_4 aaaBB\boldsymbol{HC}CC  $$
 * $$\Rightarrow_5 aaaBB\boldsymbol{BC}CC  $$
 * $$\Rightarrow_6 aa\boldsymbol{ab}BBCCC  $$
 * $$\Rightarrow_7 aaa\boldsymbol{bb}BCCC  $$
 * $$\Rightarrow_7 aaab\boldsymbol{bb}CCC  $$
 * $$\Rightarrow_8 aaabb\boldsymbol{bc}CC  $$
 * $$\Rightarrow_9 aaabbb\boldsymbol{cc}C  $$
 * $$\Rightarrow_9 aaabbbc\boldsymbol{cc}  $$

--Gerel (talk) 15:20, 19 December 2008 (UTC)


 * Thanks. Much simpler than the one I cooked up. Ben Standeven (talk) 07:22, 13 February 2009 (UTC)

This might be a dumb question, but is a context-sensitive grammar allowed to "crash"? As in, end up with non-terminals and have no valid rules to follow. If it is not, then I think I found a derivation that would cause it to crash. Again, sorry if this is legal, I'm just learning about these now and it was not mentioned in class; my assumption was that a valid grammar had to always return a string of all terminals. Here's the derivation that would "crash" it:


 * $$S$$
 * $$\Rightarrow aSBC$$
 * $$\Rightarrow a\boldsymbol{aBC}BC    $$
 * $$\Rightarrow a\boldsymbol{ab}CBC  $$
 * $$\Rightarrow aa\boldsymbol{bc}BC  $$

Again, sorry if this is a dumb question, I just had to answer this question for a problem set and came up with a different answer (that I believe is correct), that does not ever "crash" like this one.


 * It seems that the grammar does not produce only $$ \{a^nb^nc^n\} $$. For example this grammar can produce "aaa bb cccc":
 * $$S	$$
 * $$\Rightarrow_1 \boldsymbol{aSBC}	$$
 * $$\Rightarrow_1 a\boldsymbol{aSBC}BC	$$
 * $$\Rightarrow_2 aa\boldsymbol{aBC}BCBC	$$
 * $$\Rightarrow_3 aaaBCB\boldsymbol{HB}C	$$
 * $$\Rightarrow_3 aaaB\boldsymbol{HB}HBC	$$
 * $$\Rightarrow_4 aaaBHB\boldsymbol{HC}C	$$
 * $$\Rightarrow_4 aaaB\boldsymbol{HC}HCC	$$      <--- ("H" in question below was introduced here)
 * $$\Rightarrow_5 aaaBHC\boldsymbol{BC}C	$$
 * $$\Rightarrow_3 aaaBH\boldsymbol{HB}CC	$$
 * $$\Rightarrow_4 aaaBH\boldsymbol{HC}CC	$$
 * $$\Rightarrow_5 aaaBH\boldsymbol{BC}CC	$$      <--- "H" stemming from a "B" ...
 * $$\Rightarrow_4 aaaB\boldsymbol{HC}CCC	$$      <--- but transformed to "C"
 * $$\Rightarrow_5 aaaB\boldsymbol{BC}CCC	$$
 * $$\Rightarrow_6 aa\boldsymbol{ab}BCCCC	$$
 * $$\Rightarrow_7 aaa\boldsymbol{bb}CCCC	$$
 * $$\Rightarrow_8 aaab\boldsymbol{bc}CCC	$$
 * $$\Rightarrow_9 aaabb\boldsymbol{cc}CC	$$
 * $$\Rightarrow_9 aaabbc\boldsymbol{cc}C	$$
 * $$\Rightarrow_9 aaabbcc\boldsymbol{cc}	$$

(Or am I missing something?)
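A derivation like the one above can be cross-checked mechanically. The following is only a minimal Python sketch, not taken from any cited source: it assumes that uppercase letters are nonterminals and that every rule is noncontracting, so sentential forms longer than the bound can be discarded. The names `language_up_to` and `GEREL` are made up for this illustration.

```python
from collections import deque

def language_up_to(rules, start, max_len):
    """Collect all terminal strings of length <= max_len derivable from start.

    Assumption of this sketch: every rule is noncontracting
    (len(lhs) <= len(rhs)), so forms longer than max_len can never shrink
    back and may be dropped. Uppercase letters are treated as nonterminals.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(ch.isupper() for ch in form):
            words.add(form)
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:  # try the rule at every position where it matches
                new = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return words

# The nine rules proposed above by Gerel.
GEREL = [("S", "aSBC"), ("S", "aBC"), ("CB", "HB"), ("HB", "HC"), ("HC", "BC"),
         ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]

words = language_up_to(GEREL, "S", 9)
print("aaabbbccc" in words)  # the intended word
print("aaabbcccc" in words)  # the unintended word derived above
```

Under these assumptions both checks should come out true, which agrees with the hand derivation of "aaa bb cccc". The search is exhaustive up to the length bound, so it is only practical for short words.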

It seems, however, that the following grammar works, although it does not follow the rule given in the page. Ref: http://www.cs.cmu.edu/~./FLAC/pdf/ContSens-6up.pdf
 * 1) $$S \rightarrow aSBC$$
 * 2) $$S \rightarrow abc   $$
 * 3) $$cB \rightarrow Bc   $$
 * 4) $$bB \rightarrow bb   $$

Metaxal (talk) 13:33, 11 February 2014 (UTC)


 * I agree with your derivation of aaabbcccc. My informal understanding of rules 3-5 is that they are used to swap CB to BC, and that the total count of Bs and Cs can't be changed by these rules if H is counted as B or C in an appropriate way (i.e. 3: CB→$$H_C$$B, 4: $$H_C$$B→$$H_B$$C, 5: $$H_B$$C→BC). Based on that, I tried to locate the "error", and came up with my flagging of your derivation above. The original idea of the grammar's author might have been that the "meaning" of an H (i.e. whether it is to be counted as B or as C) is always determined from the nonterminal immediately to its right: count H as B in HC, count H as C in HB. However, in your derivation, the nonterminal immediately to the right of the "critical" H is changed from C to B due to some unexpected swapping.


 * I wonder what the source of the flawed grammar is; it doesn't appear in Hopcroft+Ullman 1979, which is the only text on CSG I have. If it remains unsourced, it should eventually be removed, anyway. When I've time, I could elaborate H+U's $$a^{2^i}$$ example as a replacement.


 * I had a look at your Ref: Sutner has (on slide 4-->p.1) the same restriction as the wikipedia article, and his rule cB→Bc doesn't satisfy them (the left and right embedding context of B on the rule's left-hand side, viz. c and ε, respectively, should reappear on its right-hand side, but c doesn't). Maybe that is what Sutner expects as an answer to his question "Right?" on slide 7-->p.2. Moreover, not even the Kuroda normal form (slide 11-->p.2) fits into the scheme. Probably Sutner implicitly used the notion of a Noncontracting grammar. The wikipedia article contains his $$a^n b^n c^n$$ grammar as well as the Kuroda NF, and claims equivalence to CSG. - Jochen Burghardt (talk) 09:14, 12 February 2014 (UTC)


 * I got a look into the Mateescu & Salomaa (1997) cited by the Noncontracting grammar article and explained their transformation of noncontracting grammars to context-sensitive grammars, using the $$a^n b^n c^n$$ language as an example. The resulting grammar is different from that you revealed as flawed. - Jochen Burghardt (talk) 16:38, 12 February 2014 (UTC)


 * Today, I changed the $$a^n b^n c^n$$ grammar to the grammar from Noncontracting_grammar, simplified by
 * replacing  [a] by a,   [b] by b,   [c] by c,   Z1 by W,   Z2 by X,   and
 * contracting the last four rules into bB → bb.
 * I hope the simplifications (and the source grammar) are correct; please cross-check. Apparently, Metaxal's above derivation of aaabbcccc doesn't work any longer now. - Jochen Burghardt (talk) 20:33, 1 April 2014 (UTC)
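For cross-checking, the four rules cB → WB, WB → WX, WX → BX, BX → Bc follow, as far as I understand it, the usual pattern for replacing a swap rule xy → yx by rules that each rewrite a single symbol, using two fresh nonterminals. A minimal Python sketch of that pattern; the function name `eliminate_swap` and the encoding of rules as string pairs are assumptions of this illustration, not notation from Mateescu & Salomaa:

```python
def eliminate_swap(x, y, w, z):
    """Return four rules that together simulate the swap rule xy -> yx.

    Each returned rule changes exactly one symbol and keeps its neighbour.
    w and z are assumed to be fresh nonterminals not used elsewhere.
    """
    return [
        (x + y, w + y),  # x becomes w, with y kept on the right
        (w + y, w + z),  # y becomes z, with w kept on the left
        (w + z, y + z),  # w becomes y, with z kept on the right
        (y + z, y + x),  # z becomes x, with y kept on the left
    ]

for lhs, rhs in eliminate_swap("c", "B", "W", "X"):
    print(lhs, "->", rhs)
```

Called with c, B, W, X it reproduces the four rules now in the article, so at least the shape of the simplified grammar matches the pattern.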


 * Looks good to me (I'm actually testing the grammar with random derivations), except maybe that some derivations can get stuck with the X non-terminal alone, making the branch invalid, like aaabbbcXc. Metaxal (talk) 15:58, 7 April 2014 (UTC)

Confusing
Using a grammar that contradicts the definition is highly confusing. What is a monotonic grammar?


 * I'm moving the definition of "monotonic" grammar to its own page. It's wrong to include it here, since this page is not about classes of grammars that happen to describe the context-sensitive languages, but about context-sensitive grammars proper.  Rp 13:38, 25 July 2007 (UTC)

HERE IS THE EASY ANSWER:

S → aSBC | abc CB → BC aB → ab bB → bb bC → bc cC → cc


 * The problem with your solution lies in the clause "S -> abc". As far as "abc" terminates your recursion, the structure "aabcBC" always gets created whenever you try to use recursion. The way to go is the usage of "S -> aBC", which is totally equivalent with the grammar in the Wiki page. --AdamDi (talk) 11:11, 29 April 2012 (UTC)

can you please explain this textbook problem
Namely, why is AB -> BA not a type 1 grammar rule? —The preceding unsigned comment was added by Ra.ravi.rav (talk • contribs) 11:24, 18 February 2007 (UTC).

Is context-dependent the same thing as context-sensitive?
In the Turing completeness article, there is a redlink for context-dependent grammar. If that is the same thing as context-sensitive grammar, please fix the link. Paul Foxworthy (talk) 06:51, 10 June 2010 (UTC)

Formal Definition is not accurate
Specifically, "...and S does not appear on the right-hand side of any rule..." It seems the definition is not accurate, and I would like to propose changing it to a more accurate one. A more accurate definition would be that the length of the left-hand side of each rule is less than or equal to the length of its right-hand side and the grammar cannot be represented in Chomsky Normal Form (i.e. there must be at least one string on the right that is non-contracting and has at least three symbols). Thus, the start symbol can still appear on the right side of a rule as long as those conditions are met. Allowing the start symbol on the right would simplify many essentially non-contracting context-sensitive grammars compared with equivalent constructions where the start symbol is not allowed on the right.

Consider the quick and dirty example below: writing it without the start symbol on the right would require many more production rules, but the start symbol on the right does not affect the fact that the grammar is essentially non-contracting and context sensitive.


 * 1) $$S \rightarrow \lambda $$
 * 2) $$S \rightarrow aBC   $$
 * 3) $$S \rightarrow DSd   $$
 * 4) $$S \rightarrow DSe   $$
 * 5) $$B \rightarrow bb   $$
 * 6) $$C \rightarrow ccc   $$
 * 7) $$D \rightarrow dcc   $$

Not having the S symbol on the right is more of a rule of thumb or maybe a notation convention, not a formal definition. It might be beneficial for students to avoid S on the right, to prevent the confusion that results from the following situation:


 * 1) $$S \rightarrow \lambda $$
 * 2) $$S \rightarrow SAB   $$
 * 3) $$A \rightarrow Saa   $$
 * 4) $$B \rightarrow SAb   $$

Which might appear to be context sensitive, but isn't, because it could be rewritten in Chomsky Normal Form.

Thus, I suggest we change or clarify the definition in this article.

reference: http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Context-sensitive_grammar.html

Jmark13 (talk) 16:50, 8 August 2013 (UTC)

Update:

After reading a few more texts on the matter It seems clear to me that the formal definition of a Context Sensitive grammar is simply the following:

Beginning of Formalism:

A context-sensitive grammar G is a quadruple (V, $$\Sigma$$, R, S) where V is a finite set of symbols, $$\Sigma$$ is the subset of V which contains only the terminal symbols and S is the start symbol in V, V$$\notin \Sigma$$.

R is a finite set of production rules in the form $$\alpha \rightarrow \beta $$ such that $$\alpha$$ and $$\beta$$ are members of V and $$\Sigma$$ and |$$\alpha$$| $$ \leq $$ |$$\beta$$| where |x| is the length of x.

End of Formalism

Also, we should note that an essentially non-contracting non-context sensitive grammar is that context sensitive grammar which can be represented in such a way where no Start symbol appears on the right of the production Rule set. This is in contrast to what is written now, which states that a grammar cannot be context sensitive if there is a Start symbol on the right hand side of the production rule set which can go to the empty string. The actual formal definition of context sensitive grammars is broader based on the references cited.

Jmark13 (talk) 19:27, 9 August 2013 (UTC)


 * Hi Jmark13.
 * I don't think it's obvious why your references should be used to change the definition. Here are some other search results supporting the tendency of the current definition:, ,.
 * However, I agree that the exceptional rule concerning $$S\rightarrow\lambda$$ is a bit informal (and even inaccurate, because it does not clearly state that S may appear on the right side, if there is no rule $$S\rightarrow\lambda$$). In addition, there are more quite important authors who defined context-sensitive grammars as non-contracting (Aho and Ullman, for example, and Ullman and John Hopcroft in Introduction to Automata Theory, Languages, and Computation). So it seems legitimate to adopt their definition in this article.
 * Still, since there are authors who distinguish context-sensitive grammars and monotonous grammars (e.g. Grzegorz Rozenberg and Arto Salomaa in ), I would object to replacing one definition by another. Rather than claiming that some authors additionally require non-contracting rules, it should be mentioned that the current definition implies non-contracting rules (except the exception) and that the generative power is the same (see my first link).
 * There's one thing I don't understand: Why do you mention Chomsky Normal Form as a negative criterion? Consider a grammar with two rules, $$S\rightarrow AA$$, and $$A\rightarrow a$$, the grammar clearly is context-sensitive according to both definitions, yet it is in CNF – or am I missing something?
 * --Zahnradzacken (talk) 22:57, 19 August 2013 (UTC)

Thank You, Zahnradzacken, I don't think you are missing anything, but I do think something in the formal definition of context-sensitive grammars is a bit ambiguous. And no, we should definitely not change the formal definition in so far as it agrees with the sources mentioned.

However, a less ambiguous definition of CSG would be one that would be formalized in terms of LBA, and be those re-writing rules that form the languages accepted by an LBA, since it has been proven that the languages produced are equivalent.

In terms of CNF, I was incorrect in my assertion, as the languages produced by context-free grammars are a strict subset of those produced by context-sensitive grammars. I apologize for any confusion.

All this said, essentially non-contracting context sensitive grammars is itself a subset of uniform context sensitive grammars, which may be "mildly contracting" (not to be confused with mildly context sensitive) in that there is a contraction, but the length of the right hand side of every rule, even after contraction, is still strictly greater than or equal to the left... The languages produced by this definition should be obviously equivalent to those languages produced by essentially non-contracting grammars.

In my opinion, a set of grammar production rules should be in its simplest form, i.e. the fewest rules that produce all strings in the language. However, this often requires "mild contraction" on a set of rules that would otherwise be context-sensitive and essentially non-contracting. And while this might seem like splitting hairs here, there are a lot of results that depend on grammars that are deemed essentially non-contracting and context sensitive, and these would have to be re-proved for a "mildly contracting" context sensitive grammar. Still, despite the fact there is an S on the right, it should be obvious that these mildly contracting context sensitive grammars can be re-written as essentially non-contracting context sensitive grammars (just add more rules and replace each S symbol accordingly), and have an equivalent expressive power.

So, "What is true?" And if it is true that mildly contracting CSGs produces the same languages as essentially non-contracting CSGs, then the source definition, which includes that S cannot be on the right hand side, isn't a uniform or optimal definition, but applies only to essentially non-contracting CSGs. By eliminating this extra rule, and observing the class of difference between Recursively Enumerable languages with unrestricted grammars that are not CSGs with mildly contracting CSGs, we may simplify our Rule sets when appropriate and still have a CSG.

Anyway, if you see the logic above, and the benefits of being able to reduce some grammar rule sets to a mildly contracting CSG, then I do propose to at least add to the article that only some definitions add the extra requirement that S not occur on the right hand side and that these grammars are called "essentially non-contracting".

Jmark13 (talk) 00:49, 21 August 2013 (UTC)

On definitions and types of equivalence
There seems to be some confusion about the equivalence between context-sensitive grammars and noncontracting grammars. It's true that CSGs and noncontracting grammars are equivalent in the sense that they can describe the same sets of languages. But the definitions of the grammars aren't equivalent.

A definition is basically a sentence that talks about mathematical objects (formally speaking, it's a formula, as sentences are formulas without free variables). An example of a definition is "an integer x such that there exists integer y such that y*2 = x". The defined concept (even numbers) consists of those objects from the universe of discourse which yield a true sentence when substituted for x in the definition. Another definition of even numbers is "an integer x such that there exists integer y such that y + y = x". Those two definitions are equivalent. Any object either satisfies both definitions or doesn't satisfy either of them.

The definitions of CSGs and noncontracting grammars aren't equivalent, because e.g. the grammar with productions (S -> Bc; Bc -> Bd; Bd -> bd) satisfies the definition of noncontracting grammars but doesn't satisfy the definition of context-sensitive grammars, as the middle production changes a terminal symbol. That's that.

CSGs and noncontracting grammars are equivalent in their ability to describe languages. For any language L which is generated by a context-sensitive grammar G, there exists a noncontracting grammar G' which generates language L. For any language L generated by a noncontracting grammar G, there exists a CSG G' which also generates L.

Some writers may equivocate between those two kinds of equivalence and say that two definitions are equivalent when in fact they define distinct concepts with equal expressive power. However, that should be limited to situations when the definitions differ in minor details and the expressive equivalence of the defined concepts is trivial to see. This is not quite the case here. 178.182.26.47 (talk) 18:19, 28 June 2014 (UTC)
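To make the difference between the two definitions concrete, here is a small Python sketch that tests a single rule against each of them. It is only an illustration under stated assumptions: uppercase letters are taken to be nonterminals, rules are encoded as pairs of strings, and the function names are made up here.

```python
def is_noncontracting(lhs, rhs):
    """A rule u -> v is noncontracting if |u| <= |v|."""
    return len(lhs) <= len(rhs)

def has_context_sensitive_form(lhs, rhs):
    """Check whether lhs -> rhs can be read as alpha A beta -> alpha gamma beta,
    with A a single nonterminal (uppercase here) and gamma nonempty."""
    for i, sym in enumerate(lhs):
        if not sym.isupper():
            continue  # only a nonterminal may be the rewritten symbol A
        alpha, beta = lhs[:i], lhs[i + 1:]
        gamma_len = len(rhs) - len(alpha) - len(beta)
        if gamma_len >= 1 and rhs.startswith(alpha) and rhs.endswith(beta):
            return True
    return False

# The three productions from the comment above, plus Aa -> aA from an earlier thread.
for lhs, rhs in [("S", "Bc"), ("Bc", "Bd"), ("Bd", "bd"), ("Aa", "aA")]:
    print(lhs, "->", rhs, is_noncontracting(lhs, rhs), has_context_sensitive_form(lhs, rhs))
```

With this encoding, Bc → Bd and Aa → aA pass the length test but fail the form test, which is exactly the gap between the two definitions: a grammar can satisfy one without satisfying the other, even though the classes of languages coincide.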


 * Thank you for your explanations; I agree with you. We should distinguish between equality of grammars and equality of their languages. The notion of weak equivalence (formal languages) could be used for the latter relation.
 * However, I wonder why you changed the grammar in section "Examples". The former version was obtained from Noncontracting grammar and Noncontracting grammar which is based on Mateescu & Salomaa (1997, see full ref. at Noncontracting grammar); however I'd forgotten to mention the source up to some minutes ago. There was a lot of confusion about that example (see section "Wrong grammar for language $$ \{ a^n b^n c^n : n \ge 1 \} $$" above), which we shouldn't repeat. Also, following Mateescu & Salomaa (1997), sect.3.1, p.29-30, there is no need to forbid S in a production's rhs, unless a production S→ε exists. - Jochen Burghardt (talk) 17:27, 29 June 2014 (UTC)


 * Today, I changed "equivalent" to "weakly equivalent" where appropriate in the article, but restored the original grammar version. - Jochen Burghardt (talk) 11:43, 9 July 2014 (UTC)

Continuing issue With the listed grammar
There is a problem with the grammar listed on the site as of now. As I'm a new user, I'm reluctant to edit the page without the consensus of the group.


 * 1) $$S \rightarrow abc $$
 * 2) $$S \rightarrow aSBc   $$
 * 3) $$cB \rightarrow WB   $$
 * 4) $$WB \rightarrow WX   $$
 * 5) $$WX \rightarrow BX   $$
 * 6) $$BX \rightarrow Bc   $$
 * 7) $$bB \rightarrow bb   $$

This leads to problems like:

$$S \rightarrow aSBc \rightarrow aabcBc \rightarrow aabWBc \rightarrow aabWXc \rightarrow aabBXc \rightarrow aabbXc $$

There is no available non-terminal to fix this; it looks like this could be fixed with the additional rule:
 * 1) $$bX \rightarrow bc $$
If there is agreement here, I'll happily edit the page. Pmeixner (talk) 22:19, 7 August 2014 (UTC)


 * The definitions of Context-sensitive grammar, derivation, and language don't require every derivation to result in a string of only terminal symbols, so your example is not an issue of the given grammar. However, the notion of a derivation isn't mentioned at all in the article, nor is the notion of the language of a grammar explained there, so your example reveals an issue of the article.
 * I intend to fix this soon, also hinting at Garden path sentences, a related phenomenon known from natural languages (e.g. the sentence "The horse raced past the barn fell" tempts the reader to build a derivation that gets stuck, similar to your example; however, there is another one that properly derives the sentence, and the same applies to your example). - Jochen Burghardt (talk) 08:52, 8 August 2014 (UTC)


 * On second thought, sentences like "The horse raced past the barn fell" are quite a different kind of garden path, since there a terminal symbol string is given that should be derived from S; in your example, no such string is given. So, I didn't refer to Garden path sentences in the article, but just defined "⇒", "⇒*", and "L(G)", and explicitly stated that derivations that get stuck in a mixed string of nonterminal and terminal symbols are allowed, but don't contribute to L(G). - Jochen Burghardt (talk) 10:29, 10 August 2014 (UTC)
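Stuck sentential forms of this kind can also be listed by brute force. The sketch below is a rough Python illustration only, assuming uppercase letters are nonterminals and all rules are noncontracting; the names `stuck_forms` and `ARTICLE` are hypothetical, and the rule list is the seven-rule grammar quoted at the top of this section.

```python
from collections import deque

def stuck_forms(rules, start, max_len):
    """List reachable sentential forms (length <= max_len) that still contain
    a nonterminal but to which no rule applies.

    Assumes noncontracting rules, so longer forms can be ignored safely.
    """
    seen = {start}
    queue = deque([start])
    stuck = set()
    while queue:
        form = queue.popleft()
        successors = []
        for lhs, rhs in rules:
            pos = form.find(lhs)
            while pos != -1:
                successors.append(form[:pos] + rhs + form[pos + len(lhs):])
                pos = form.find(lhs, pos + 1)
        if not successors and any(ch.isupper() for ch in form):
            stuck.add(form)  # dead end: contributes nothing to L(G)
        for new in successors:
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return stuck

ARTICLE = [("S", "abc"), ("S", "aSBc"), ("cB", "WB"), ("WB", "WX"),
           ("WX", "BX"), ("BX", "Bc"), ("bB", "bb")]

print(sorted(stuck_forms(ARTICLE, "S", 6)))
```

The form aabbXc from the derivation above should appear in the result. Such dead ends do not change L(G); they only mean that some derivation orders fail to reach a terminal string.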

I just noticed too that the example contradicts the one given (from a reliable source) in Noncontracting grammar, which doesn't have the extra non-terminals and rules. I don't have time right now to figure it all out, but I don't see anything wrong with the simpler grammar right now. JMP EAX (talk) 00:05, 16 August 2014 (UTC)
 * I see the distinction now between grammar and language, but I do have to wonder if everyone defines CSG like this. This article is basically citing only one source... JMP EAX (talk) 00:22, 16 August 2014 (UTC)

It seems that the distinction between Context-sensitive grammar and Noncontracting grammar is a source of confusion for many readers. Probably, many authors use the name "context-sensitive grammar" for what wikipedia calls a "noncontracting grammar"; the sentence "Some definitions of a context-sensitive grammar only require that for any production rule of the form u → v, the length of u shall be less than or equal to the length of v." in Context-sensitive grammar tries to make that clear, but it might be necessary to rephrase it (e.g. to "Some authors define ...") to give it more emphasis. Another possibility could be to merge the articles Noncontracting grammar and Context-sensitive grammar. Hopcroft+Ullman define (on p.223-224) a CSG as wikipedia does in Noncontracting grammar, mentioning in their next sentence that the definition at Context-sensitive grammar is a normal form for them, and leaving the proof as exercise 9.9 (p.230); I think that is a reasonable treatment of the issue.

I would like to discuss about the example issues, but I didn't understand which example you found to contradict to which other one. You didn't mean Noncontracting grammar (which is simpler, but not context-sensitive in the wikipedia sense) vs. Context-sensitive grammar, did you? Jochen Burghardt (talk) 09:22, 16 August 2014 (UTC)
 * I think it would help to move/copy to the lead the equivalence to non-contracting grammars and the "some authors [consequently] define CSG this way". A brief survey of the textbooks that Google Books indexes finds that it's not uncommon to have CSG defined as non-contracting: some examples (from the 1st page of hits) . This includes authors like Martin Davis who are normally very scrupulous about historical accuracy and such (def as contracting on p. 189 the Chomskyan def given way later on p. 330).  JMP EAX (talk) 12:03, 16 August 2014 (UTC)

Missing info and possibly the missing link
The so-called left-context and right-context grammars, which have rules of the form $$\alpha A$$ -> $$\alpha \gamma $$ (and the dual) are [weakly-only I assume] equivalent to CSG. I do have to wonder what you get if you use "forced swaps" like $$\alpha A$$ -> $$\gamma \alpha$$. Probably the same thing. JMP EAX (talk) 00:42, 16 August 2014 (UTC)


 * By duplicating each nonterminal A to A and A2 and transforming each rule αA→αγ to αA→A2α and A2α→αγ it should at least be possible to establish that you get no less expressive power by forced swapping. Vice versa, since the forced swapping rules are noncontracting (I assume), you can't get more, either. - Jochen Burghardt (talk) 09:30, 16 August 2014 (UTC)
 * Yeah, swapping twice was what I had in mind too. The other direction is obvious. JMP EAX (talk) 11:44, 16 August 2014 (UTC)
 * Somewhat related: interesting enough that CFG + AB -> BA rules doesn't give rise to the whole CSG class . JMP EAX (talk) 00:47, 17 August 2014 (UTC)

On a slightly different tack, It would be interesting to find and add historical info about: when (1) Chomsky defined his CSG, (2) who[ever] gave the non-contracting def, (3) equivalence to LBA was proven. JMP EAX (talk) 11:44, 16 August 2014 (UTC)
 * I put Hopcroft+Ullman's "Bibliographic notes" (p.232) here, in order not to interfere with your article editing. Please insert it where appropriate.

 The Chomsky hierarchy was defined in Chomsky (1956, 1959). (...) Kuroda (1964) showed the equivalence of LBA's and CSG's. Previously, Myhill (1960) had defined deterministic LBA's, and Landweber (1963) showed that deterministic LBA languages are contained in the CSL's. Chomsky (1959) showed that the r.e. sets are equivalent to the languages generated by type-0 grammars. (...) 


 * Mateescu+Salomaa prove the equivalence of noncontracting and context-sensitive grammars on p.187; they refer to Salomaa (1973) for details.


 * In Chomsky (1956), I found on p.118 (=p.6 in the pdf file) the quote: "A rule of the form Z X W → Z Y W indicates that X can be rewritten as Y only in the context Z--W." That seems to indicate that Chomsky had wikipedia's CSG definition in mind, not the noncontracting grammar definition.


 * - Jochen Burghardt (talk) 17:55, 16 August 2014 (UTC)
 * In 1963 Chomsky gave the non-contracting def too. See the history section I added to noncontracting grammar. JMP EAX (talk) 00:50, 17 August 2014 (UTC)
 * Alas his notion of strong equivalence appears to have little practical relevance (per ), so I'm not sure it is worth qualifying equivalence with "weakly" every time, because hardly anyone seems to consider the strong one interesting. JMP EAX (talk) 01:24, 17 August 2014 (UTC)

Duplication of properties etc. with the CSL page
I'm not really sure what to do about that; there's more at context-sensitive language, but the two pages evolved independently so they aren't really a super-set of each other. But except for the normal forms, I'm not sure there are really any properties that are of CSGs per se but don't belong to the CSL page (too). JMP EAX (talk) 13:20, 16 August 2014 (UTC)
 * The CFG vs CFL page have the same issue. I've started a centralized discussion at JMP EAX (talk) 15:24, 16 August 2014 (UTC)

Kuroda normal form
Maybe I'm missing something again, but it seems to me that the "Kuroda normal form" is not really a normal form for CSG as defined by Chomsky. The first rule AB → CD doesn't seem to fit the CSG template of expanding a single non-terminal. JMP EAX (talk) 13:24, 16 August 2014 (UTC)

Also there's "Kuroda normal form" for unrestricted grammars as well. JMP EAX (talk) 13:32, 16 August 2014 (UTC)

Is the grammar in the example a context-sensitive one?
3. $$cB \rightarrow WB  $$

It would be context-free if c were a non-terminal, wouldn't it? -- Harp (talk) 09:57, 8 October 2014 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Context-sensitive grammar. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20110708224600/https://danielmattosroberts.com/earley/context-sensitive-earley.pdf to http://danielmattosroberts.com/earley/context-sensitive-earley.pdf

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at ).

Cheers.— InternetArchiveBot  (Report bug) 18:05, 11 September 2016 (UTC)

COBOL language context sensitive?
Is the 1974 COBOL programming language context-sensitive?

It seems that the divisions each having their own specific syntax would make it context-sensitive.

In the data division you can have picture-defined variables with picture clauses like 999.99, which defines a five-digit numeric field with two decimal places. The same string in the procedure division would be a numeric constant. Other picture strings would not even be valid in other divisions. Basically, each COBOL division has its own syntax. This, I think, makes it context-sensitive in the Chomsky hierarchy.

The ENVIRONMENT division was just comments, skipped by the compilers of the day.

So having different syntax bounded by constant division-name strings, i.e. terminal symbols, seems to make a context.

I haven't studied formal linguistics, except for reading a PDF on Chomsky's grammars that I found on the web. I worked on a COBOL 74 compiler in the late 1970s. Its syntax was defined by an analytical grammar/programming language.

Steamerandy (talk) 21:37, 21 August 2018 (UTC)

k repetitions of a string
Using the grammar in section Context-sensitive_grammar, what about the following derivation:
 * $$S$$
 * $$\rightarrow _2   S B_1 B_2 B_3$$
 * $$\rightarrow _4   A_{L1} A_2    A_3    B_1    B_2    B_3   $$
 * $$\rightarrow _8   a      A_{L2} A_3    B_1    B_2    B_3   $$       using the instance "$$A_{L1} A_2 \rightarrow a A_{L2}$$"
 * $$\rightarrow _8   a      a      A_{L3} B_1    B_2    B_3   $$       using the instance "$$A_{L2} A_3 \rightarrow a A_{L3}$$"
 * $$\rightarrow _8   a      a      a      B_{L1} B_2    B_3   $$       using the instance "$$A_{L3} B_1 \rightarrow a B_{L1}$$"
 * $$\rightarrow _8   a      a      a      b      B_{L2} B_3   $$       using the instance "$$B_{L1} B_2 \rightarrow b B_{L2}$$"
 * $$\rightarrow _8   a      a      a      b      b      B_{L3}$$       using the instance "$$B_{L2} B_3 \rightarrow b B_{L3}$$"
 * $$\rightarrow _{10} a     a      a      b      b      b     $$

Did I overlook some restriction in the grammar that forbids this derivation? - Jochen Burghardt (talk) 08:10, 2 February 2022 (UTC)


 * Yes. If you look closer, you will notice that none of your uses of rule 8 satisfied its prerequisites. The $$i$$ doesn't stand for three occurrences of any three numbers; it stands for three occurrences of a single number. 2001:718:2:22:0:0:0:52 (talk) 21:18, 18 February 2022 (UTC)
 * Ok, I see. I'm beginning to get the idea behind your grammar. Could you rephrase rule 8 in a more fool-proof way? You don't happen to have a source for the grammar? By the way: you could use rules 7 and 8 "as is" if you use a noncontracting grammar. - Jochen Burghardt (talk) 22:06, 18 February 2022 (UTC)
 * I think it's fairly foolproof, but maybe explicitly specifying that $$i = i = i$$ could help. As for a source, I made it up. Also, this is supposed to be an example of a context-sensitive grammar, not of a noncontracting grammar. 2001:718:2:22:0:0:0:52 (talk) 06:22, 19 February 2022 (UTC)
 * Haha! Rule 8 uses e.g. $$\tau_i$$ and $$\tau_{Li}$$, but "defines" (in the "where"-clause) just $$\tau$$. Apparently, you don't use $$\tau_i$$ as a variable (like in rule 7, and even like $$\sigma_{Li}$$ in rule 8), but as a pair $$(\tau,i)$$, where $$\tau \in \{A,B,C\}$$ and $$i \in \{1,2,3\}$$; similarly for $$\tau_{Li}$$. - Your example can be turned into a noncontracting grammar by omitting the "using rules ..." explanations in rules 7 and 8. This would result in an easier-to-understand presentation. Conversely, applying the algorithm at Noncontracting_grammar would just re-add the "using rules ...".
 * No offense, but I think it can be understood by anyone trying to understand it. At the time of writing it seemed to me that explicitly declaring every combination of indexes and major nonterminals would make this grammar even more unbearable, and I still stand by my opinion. I'm not saying you technically don't have a point, but I seem to have missed a proposed solution. - This page is about context-sensitive grammars. I don't see a problem with mentioning that if you omit the "using rules ..." parts, you get a noncontracting grammar, but I don't really see a point either. It doesn't get that much shorter by omitting these rules, because you still have to specify the prerequisites. 2001:718:2:22:0:0:0:52 (talk) 17:39, 19 February 2022 (UTC)
 * I understand the rule now, but I didn't understand it when I devised my (flawed) counter-example. And while it is easily possible to transform most of your rules into, say, a Prolog program to parse or generate strings, rule 8 can't be transformed in the same way. I'll try to come up with a suggestion, but this may take some time. - As for context-sensitive vs. noncontracting, what about moving the example to the latter article, which currently doesn't have any example? However, I won't insist on that. - Jochen Burghardt (talk) 08:10, 20 February 2022 (UTC)
 * I see. Would moving it to serve as an example of a noncontracting grammar and linking it from this article be an option? I don't see a reason not to place it in the other article, other than that I wrote it because I couldn't find an equivalent context-sensitive grammar anywhere else. 2001:718:2:22:0:0:0:52 (talk) 21:19, 21 February 2022 (UTC)
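
For readers following the thread, the index prerequisite of rule 8 discussed above can be sketched in Python. This is only an illustration under assumptions: the rule's exact wording is not reproduced on this page, so the shape "$$\sigma_{Li} \tau_i \rightarrow s \tau_{Li}$$ with the same $$i$$ on both symbols" is inferred from the replies and from the generation chain for abcabcabc. The names `NT`, `rule8_applies` and `rule8` are hypothetical helpers, not part of any source.

```python
from typing import NamedTuple, Tuple


class NT(NamedTuple):
    """A non-terminal such as A_2 or B_{L1} (hypothetical representation)."""
    letter: str             # "A", "B" or "C"
    index: int              # 1, 2 or 3
    leftmost: bool = False  # True for the "L"-marked variant, e.g. A_{L1}


def rule8_applies(left: NT, right: NT) -> bool:
    # Assumed prerequisite: the left symbol carries the L marker, the right
    # one does not, and both carry the SAME index i.
    return left.leftmost and not right.leftmost and left.index == right.index


def rule8(left: NT, right: NT) -> Tuple[str, NT]:
    """Rewrite sigma_{Li} tau_i into the terminal s followed by tau_{Li}."""
    if not rule8_applies(left, right):
        raise ValueError("rule 8 prerequisite violated: indices differ")
    return left.letter.lower(), NT(right.letter, right.index, True)


# The five instances used in the questioned derivation (aaabbb).
questioned = [
    (NT("A", 1, True), NT("A", 2)),
    (NT("A", 2, True), NT("A", 3)),
    (NT("A", 3, True), NT("B", 1)),
    (NT("B", 1, True), NT("B", 2)),
    (NT("B", 2, True), NT("B", 3)),
]
```

Under this reading, every pair in `questioned` mixes two different indices and is rejected, whereas an instance such as `rule8(NT("A", 1, True), NT("B", 1))` from the abcabcabc chain is accepted.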

(No reply for almost 2 weeks; moving dubious section to here, see below) - Jochen Burghardt (talk) 17:41, 13 February 2022 (UTC)

=== k repetitions of a string ===
The following context-sensitive grammar with start symbol S generates $$\left \lbrace www : w \in \left \lbrace a,b,c \right \rbrace ^ + \right \rbrace $$:

Because rule 7 can only switch two non-terminal symbols if the index of the symbol on the left is larger, non-terminals with the same index can never be switched around, which would malform the generated string. If, however, rule 9 or 10 is used prematurely, the transformation into a string of terminal symbols becomes impossible, either because the index of the leftmost non-terminal will be too high to use rule 8, or because it will be transformed into a terminal symbol with other non-terminal symbols remaining. Therefore this grammar does indeed generate only the language $$\left \lbrace www : w \in \left \lbrace a,b,c \right \rbrace ^ + \right \rbrace $$.
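
The argument about rule 7 can be illustrated with a small Python sketch. It assumes (the rule table itself is not shown here) that rule 7 swaps two adjacent non-terminals exactly when the left index is strictly larger, and it ignores the L marker on the leftmost symbol. Under that assumption, applying the rule until nothing changes is a bubble sort on the index, and since equal indices are never swapped, the letters within each index group keep their original order. The function name `apply_rule7_exhaustively` and the pair representation are hypothetical.

```python
from typing import List, Tuple

Symbol = Tuple[str, int]  # (letter, index), e.g. ("B", 2) for B_2


def apply_rule7_exhaustively(symbols: List[Symbol]) -> List[Symbol]:
    """Swap adjacent symbols while the left index is strictly larger."""
    s = list(symbols)
    changed = True
    while changed:
        changed = False
        for k in range(len(s) - 1):
            if s[k][1] > s[k + 1][1]:  # assumed prerequisite of rule 7
                s[k], s[k + 1] = s[k + 1], s[k]
                changed = True
    return s


# Sentential form after rules 3, 2 and 4 (L marker on A_1 omitted).
start = [(x, i) for x in "ABC" for i in (1, 2, 3)]
result = apply_rule7_exhaustively(start)
```

Here `result` is the arrangement A_1 B_1 C_1 A_2 B_2 C_2 A_3 B_3 C_3, i.e. three copies of the letter sequence ABC, which matches the form reached in the generation chain before rule 8 is first applied.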

With little effort, this grammar can also be extended to generate the empty string (by adding a new start symbol that can produce either the empty string or the current start symbol), to generate more copies of $$w$$, or to work with a larger alphabet. A generation chain for abcabcabc is:
 * $$S$$
 * $$\rightarrow _3 S C_1 C_2 C_3$$
 * $$\rightarrow _2 S B_1 B_2 B_3 C_1 C_2 C_3$$
 * $$\rightarrow _4 A_{L1} A_2 A_3 B_1 B_2 B_3 C_1 C_2 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 A_3 B_1 B_2 C_1 B_3 C_2 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 A_3 B_1 B_2 C_1 C_2 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 B_1 A_3 B_2 C_1 C_2 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 B_1 B_2 A_3 C_1 C_2 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 B_1 B_2 C_1 A_3 C_2 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 B_1 B_2 C_1 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} A_2 B_1 C_1 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} B_1 A_2 C_1 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _7 A_{L1} B_1 C_1 A_2 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _8 a B_{L1} C_1 A_2 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _8 a b C_{L1} A_2 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _9 a b C_{L2} A_2 B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _8 a b c A_{L2} B_2 C_2 A_3 B_3 C_3$$
 * $$\rightarrow _8 a b c a B_{L2} C_2 A_3 B_3 C_3$$
 * $$\rightarrow _8 a b c a b C_{L2} A_3 B_3 C_3$$
 * $$\rightarrow _9 a b c a b C_{L3} A_3 B_3 C_3$$
 * $$\rightarrow _8 a b c a b c A_{L3} B_3 C_3$$
 * $$\rightarrow _8 a b c a b c a B_{L3} C_3$$
 * $$\rightarrow _8 a b c a b c a b C_{L3}$$
 * $$\rightarrow _{10} a b c a b c a b c$$
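
As a sanity check on the end results of the derivations discussed in this section, membership in $$\left \lbrace www : w \in \left \lbrace a,b,c \right \rbrace ^ + \right \rbrace $$ can be tested directly. The following is a minimal Python sketch; it checks only the target language, not the grammar, and the function name `is_triple` is a hypothetical helper.

```python
def is_triple(s: str, alphabet: str = "abc") -> bool:
    """Return True iff s == w + w + w for some non-empty w over alphabet."""
    n, remainder = divmod(len(s), 3)
    if n == 0 or remainder != 0:
        return False  # empty string, or length not a multiple of three
    if not set(s) <= set(alphabet):
        return False  # contains a symbol outside the alphabet
    w = s[:n]
    return s == w * 3
```

With this check, `is_triple("abcabcabc")` holds for the chain above, while `is_triple("aaabbb")` fails for the string produced by the questioned derivation earlier in this section.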