Talk:TREE-META

Corrections
From Alan Kay

Thanks for starting this page on Tree-Meta (and its ancestors).

Here are a few more historical references and minor corrections.

Steve Carr, Dave Luther and I were all grad students together at Utah and also colleagues and friends.

The master paper for this style of meta-translation is by its inventor Val Schorre, originally at UCLA. Published in 1964, this is a gem of a paper.

Schorre, D.V., META II: a syntax-oriented compiler writing language, Proceedings of the 1964 19th ACM National Conference, pp. 41.301-41.3011, 1964

(The "style" is to give a procedural meaning to a BNF-like grammar by making a few modifications, limiting the lookahead, etc. The result is beautifully compact and simple, yet is able to handle a large number of interesting real language cases.)

The paper itself is a wonderful gem which includes a number of excellent examples, including the bootstrapping of Meta II in itself (all this was done on an 8K (six-bit bytes) 1401!). Schorre then went on to System Development Corp to continue developing this style of translator. Tree Meta was one of these styles, so called because it mapped the parse into a tree (a kind of abstract syntax), and this allowed manipulations for optimizations very familiar today. So Tree-Meta had two passes and two syntaxes, the first for parsing an input language, and the second for manipulating the trees (and this latter could have been done more usefully and generally).

The NLS group at SRI had the happy combination of a visionary (Doug), a great funder (ARPA via Bob Taylor) and one of the best action groups of the 60s (headed by Bill English, with the major systems design and implementation by Jeff Rulifson, Don Andrews, Bill Paxton, and others).

They were fortunate to have the SDS-940 appear just when they were ready to design and implement the second version of their system. (This machine was actually created at UC Berkeley by Mel Pirtle, Butler Lampson, Peter Deutsch, and others. Bob Taylor at ARPA "induced" SDS to make more of them for the ARPA research projects.)

The SDS-940 that they used had 192K bytes of memory and was about 1/2 MIP(!) Anyone who looks at the videos of "the mother of all demos" will wonder "how did they do it?".

Part of it was vision, part was rapid prototyping, part used NLS "live" to improve itself. And part was to "go meta" to make special languages that would strongly fit the problems they were working on, and then to compile these very efficiently to make their system.

Steve Carr and I (grad students at the ARPA project at Utah) were frequent visitors to the two ARPA supported groups at SRI (the other one was Bert Raphael's robot project). Jeff Rulifson was not only a kingpin on the NLS project, but eventually also did his Stanford PhD thesis by inventing an AI language (QA4).

Reference [1] was a reimplementation and bootstrapping of the Tree-Meta system at Utah. It was based very strongly on the NLS work.

I did not use the Tree version of Meta in any of my systems. One of the FLEX extensible languages incorporated a Meta-like facility live in the language.

(However William Newman and others did use a version of Meta to implement a subset of FLEX on the PDP-9.)

The Meta idea was used even more strongly by me in 1972 to describe Smalltalk-72 on one page (basically to replace McCarthy's nice way of doing eval with one that could match patterns procedurally, so that almost all of the eval could be removed and distributed to the objects as extensible interface languages). This is described in "The Early History of Smalltalk".

An extremely fun modern extension of the Meta ideas is by Alex Warth which can be played with online at Some papers about this can be found at

12.71.90.2 (talk) 18:29, 10 December 2008 (UTC) Alan Kay


 * Thank you for this information! May I add it to my less formal page about TREE-META? The bit about your thesis and TREE-META was based on this quote: 'Henri Gouraud did the FLEX compiler in TREE-META (12) on the SRI (Engelbart) SDS-940.' I've added some clarification to the reference (all of which need reformatting). Feel free to change the verbiage yourself too of course.
 * JamesPaulWhite (talk) 23:49, 10 December 2008 (UTC)


 * The Schorre paper is available if you have an ACM membership: http://portal.acm.org/citation.cfm?id=808896

Metageek (talk) 14:38, 11 December 2008 (UTC)


 * Also at http://www.hcs64.com/files/pd1-3-schorre.pdf and http://ibm-1401.info/Meta-II-schorre.pdf I've also got a transcript (i.e. text) that I made with permission in the 1980s, but I'm reluctant to upload it anywhere lest the ACM accuse me of violating their copyright. MarkMLl (talk) 08:38, 9 October 2014 (UTC)

I think we have here
These are very old programs and we didn't have the compiler lingo we have today. Take a look at Parsing expression grammar. META II, TREEMETA and CWIC syntax rules are a parsing expression grammar, not BNF. Or at least PEG is closer to what they are. They all say they are BNF-like, but that was basically all they had back then. META II and TREEMETA do not have token rules but I would still say they are PEGs and should be given credit for being so. They predate the conception of the term PEG, but that doesn't prevent them from being one. You program the parsing. It's not just a set of rules to be interpreted anyway. Their order of execution is specifically defined: rules work left to right, deviating only by looping, grouping and alternation. The programmer controls the tree produced:

expr = term $(('+':ADD | '-':SUB) term [2]);

generates a left-handed tree: given a+b-c it would generate SUB[ADD[a,b],c]. The rule:

expr = term (('+':ADD | '-':SUB) expr [2]|.EMPTY);

parsing a+b-c generates a right-handed tree ADD[a,SUB[b,c]]. CWIC added list-generating notations and used ! to build trees.

expr = < term $('+' term | '-':MNS term !1)>:SUM!1;

generates a list: SUM[[a,b,MNS[c]]];

If this isn't an example of a PEG what is it? Maybe even more than a PEG. The programmer has control of the parsing and the output structuring. Much closer to a PEG than to BNF. Also note that the tree output is a form of functional notation. They are early syntax-directed compilers.
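The difference between the two expr rules above can be sketched in Python. This is only an illustration, not TREE-META or CWIC code: the function names, the tuple representation of trees and the one-token "term" are assumptions made for the example. The loop form folds each new operand into the tree built so far, while the recursive form parses the rest of the input first and hangs it on the right.

```python
import re

# Node names attached to each operator, as in '+':ADD and '-':SUB.
OPS = {"+": "ADD", "-": "SUB"}


def tokenize(text):
    # Identifiers and the two operators are the only tokens in this sketch.
    return re.findall(r"[A-Za-z_]\w*|[+-]", text)


def parse_left(tokens):
    """Loop form: expr = term $(('+':ADD | '-':SUB) term [2]);"""
    pos = 0
    tree = tokens[pos]  # first term
    pos += 1
    while pos < len(tokens) and tokens[pos] in OPS:
        node = OPS[tokens[pos]]
        right = tokens[pos + 1]  # next term
        tree = (node, tree, right)  # [2]: two-branch node over what was built so far
        pos += 2
    return tree


def parse_right(tokens, pos=0):
    """Recursive form: expr = term (('+':ADD | '-':SUB) expr [2] | .EMPTY);"""
    left = tokens[pos]
    pos += 1
    if pos < len(tokens) and tokens[pos] in OPS:
        node = OPS[tokens[pos]]
        right, pos = parse_right(tokens, pos + 1)  # rest of the input becomes the right branch
        return (node, left, right), pos
    return left, pos  # the .EMPTY alternative


def show(tree):
    """Print a tree in the NODE[left,right] notation used on this page."""
    if isinstance(tree, str):
        return tree
    name, left, right = tree
    return f"{name}[{show(left)},{show(right)}]"


print(show(parse_left(tokenize("a+b-c"))))      # SUB[ADD[a,b],c]
print(show(parse_right(tokenize("a+b-c"))[0]))  # ADD[a,SUB[b,c]]
```

Both functions accept the same input; only the way the rule is written decides which way the tree leans. The sketch does no error handling for malformed input.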

CWIC was developed at System Development Corporation. Dewie was a member of the CWIC development team. I developed SLIC (System of Languages for Implementing Compilers) at Cerritos College. My contribution was the PSEUDO and MACHOP instruction languages. In the MACHOP language you define assembly instructions and their translation to a sequence of bit fields. PSEUDO instructions performed the functions of assembly macros, calling MACHOPs to output relocatable machine code.

--Steamerandy (talk) 20:30, 22 October 2014 (UTC)

maybe missed why unparse rules called unparse rules
Changing unparse rule I think was wrong!! They are rules and can fail just the same as a syntax rule. Unparse is descriptive of what they do. For example the case of matching x=x+1 can be matched by the unparse rule:

STORE[-,ADD[*1,"1"]]=>

"*1" is the left leg of the STORE. It can be used to match another leg, as in matching the left leg of the ADD node above.

STORE[-,ADD[-,*1]]=>

or the right leg of the ADD in the above unparse rule. Looks like it is unparsing what was parsed to me. These are called patterns in CWIC. But after reading the TREEMETA docs I think unparse is more descriptive. A parse rule returns success or failure. And so does the unparse rule.
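As a rough illustration of how such patterns can succeed or fail, here is a small Python sketch of the two STORE patterns above. Everything in it is an assumption made for the example: trees are tuples of (name, branch, ...), ANY stands in for "-", Ref(n) stands in for "*n", and the strings returned by the actions are placeholders rather than real output. It also compares whole subtrees for "*1", whereas the TREE-META manual quoted further down this page describes a shallower test on the node name.

```python
ANY = object()  # plays the role of '-': any node may sit in this branch


class Ref:
    """Plays the role of '*n': must equal branch n (1-based) of the top node."""

    def __init__(self, n):
        self.n = n


def match(pattern, tree, top=None):
    """Return True if tree has the shape described by pattern."""
    if top is None:
        top = tree
    if pattern is ANY:
        return True
    if isinstance(pattern, Ref):
        # top[0] is the node name, so top[n] is the n-th branch.
        return isinstance(top, tuple) and pattern.n < len(top) and tree == top[pattern.n]
    if isinstance(pattern, str):
        return tree == pattern  # literal leaf such as "1"
    if not isinstance(tree, tuple) or len(tree) != len(pattern):
        return False
    if tree[0] != pattern[0]:
        return False  # node names differ
    return all(match(p, t, top) for p, t in zip(pattern[1:], tree[1:]))


# Alternatives are tried in order; the action strings are hypothetical.
RULES = [
    (("STORE", ANY, ("ADD", Ref(1), "1")), lambda t: f"increment {t[1]}"),
    (("STORE", ANY, ("ADD", ANY, Ref(1))), lambda t: f"add {t[2][1]} into {t[1]}"),
    (("STORE", ANY, ANY), lambda t: f"store into {t[1]}"),
]


def unparse(tree, rules=RULES):
    """Run the first alternative whose pattern matches; None signals failure."""
    for pattern, action in rules:
        if match(pattern, tree):
            return action(tree)
    return None


print(unparse(("STORE", "x", ("ADD", "x", "1"))))  # increment x
print(unparse(("STORE", "x", ("ADD", "y", "1"))))  # store into x
print(unparse(("ADD", "x", "1")))                  # None
```

The last call shows the point made above: like a parse rule, an unparse rule can fail, here by returning None when no alternative matches.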

The x=x+1 is from an example in one of the TREEMETA documents referenced here. CWIC can match complex trees as well. Lists can be produced from the parser.

See the META II page for other examples of controlling the tree. These are not toys. They are real compiler-writing tools. There is lots to be said about them. I got a CWIC manual from Erwin Book at an ACM meeting before it was classified. I never got the news. I wrote SLIC (System of Languages for Implementing Compilers) when a student (sort of) at Cerritos College. SLIC implemented CWIC exactly except for outputting code. For that I added a machine instruction language that defined the assembly syntax. The machine operations produced binary object code. A pseudo instruction was implemented that was output by the generator language instead of 360 code. Pseudo instructions in turn emitted the machine instructions. Pseudo instructions were emitted into declared named sections. Sections were flushed by a call from a generator. Pseudo instructions were simply a list attached to the section. An ISO (in-sequence optimizer) could optionally process the pseudo instructions before they were executed. During pseudo execution the machine operations were called and output relocatable binary code. The last stage was a linker that processed Polish prefix fixup blocks and converted the binary to a loadable format of the target machine. Oh, and machine operations were defined in a bit-addressable memory space, so any addressable memory storage size could be generated: a 36-bit word or an 8-bit byte, or a 16-bit word-aligned instruction in an 8-bit byte-addressable memory. The TI-990 is an example. SLIC was developed and ran on a DEC-System-10. It was used to write a COBOL cross compiler running on the DEC-10 outputting TI-990 code. The compiler ran faster than the native DEC COBOL compiler. I have forgotten the lines-per-minute number but it was as fast as or faster than other COBOL compilers of the day.

The generator language of SLIC and CWIC was a LISP 2 clone. Erwin Book was one of the developers of LISP 2 at SDC.

--Steamerandy (talk) 09:25, 1 October 2014 (UTC)


 * I'm a little confused here, having followed this article for a few years. You say "I got a CWIC manual from Erwin Book at an ACM meeting before it was classified.", but you also edited this article to remove the statement about TREE-META's successors being classified specifically with the comment, "Removed incorrect statement. CWIC was not classified tek." Forgive me for finding that more than slightly confusing. Also, I am curious as to why you so extensively edited the example program and used the comment "Who wrote this in the first place?" - originally this example was copied verbatim from a TREE-META manual, and the article still sort of implies that it is (though, since your edits, I can only conclude that it is no longer). 96.89.42.188 (talk) 18:32, 20 March 2017 (UTC)

I was informed that CWIC was not classified, but internally classed as proprietary when SDC went private. However at that time programming languages were not patentable.

I have talked to a lawyer and such a thing would be hard to defend. My SLIC compiler development package has syntax and generator languages very similar to CWIC's. SLIC's syntax language is very close to CWIC's. But so are TREEMETA's and META II's. My generator language is very similar but also different, outputting PSEUDO instructions programmed in the PSEUDO language. There is a generalized list processing runtime used by all the sub-languages except the MACHOP sub-language. Steamerandy (talk) 22:13, 18 May 2019 (UTC)

TREEMETA UNPARSE rules are UNPARSE rules.
Why are not the unparse rules described as unparse rules? From the TREEMETA Manual:

27 Unparse Rules

27A Syntax

27A1 outrul = '[ outr (outrul / .empty);

27A2 outr = items '] "=>" outexp;

27A3 items = item (', items / .empty);

27A4 item = '- / .id '[ outest / nsimpl / '. .id / .sr / '' .chr / '.pound;

27B Semantics

27B1 The unparse rules are similar to the parse rules in that they test something and return a true or false value in the general flag. The difference is that the parse rules test the input stream, delete characters from the input stream, and build a tree, while the unparse rules test the tree, collapse sections of the tree, and write output.

27B2 There are two levels of alternation in the unparse rules. The highest level is not written in the normal style of Tree Meta as a series of expressions separated by slashes; rather, it is written in a way intended to reflect the matching of nodes and structure within the tree. Each unparse rule is a series of these highest-level alternations. The tree-matching parts of the alternations are tried in sequence until one is found that successfully matches the tree. The rest of the alternation is then executed. There may be further test within the alternation, but not complete failure as with the parse rules.

27B3 The syntax for a tree-matching pattern is a left bracket, a series of items separated by commas, and a right bracket. The items are matched against the branches emanating from the current top node. The matching is done in a left-to-right order. As soon as a match fails the next alternation is tried.

27B4 If no alternation is successful a false value is returned.

27B5 Each item of an unparse alternation test may be one of five different kinds of test.

27B5A A hyphen is merely a test to be sure that a node is there. This sets up appropriate flags and pointers so that the node may be referred to later in the unparse expression if the complete match is successful.

27B5B The name of the node may be tested by writing an identifier that is the name of a rule. The identifier must then be followed by a test on the subnodes.

27B5C A nonsimple construct, primarily an asterisk-number-colon sequence, may be used to test for node equivalence. Note that this does not test for complete substructure equivalence, but merely to see if the node being tested has the same name as the node specified by the construct.

27B5D The .id, .num, .chr, .let, or .sr checks to see if the node is terminal and was put on the tree by a .id recognizer, .num recognizer, etc. during the parse phase. This test is very simple, for it merely checks a flag in the upper part of a word.

27B5E If a node is a terminal node in the tree, and if it has been recognized by one of the basic recognizers in Meta, it may be tested against a literal string. This is done by writing the string as an item. The literal string does not have to be put into the tree with a .sr recognizer; it can be any string, even one put in with a .let.

27B5F If the node is terminal and was generated by the .chr recognizer it may be matched against another specific character by writing the apostrophe-character construct as an item.

27B5G Finally, the node may be tested to see if it is a generated label. The labels may be generated in the unparse expressions and then passed down to other unparse rules. The test is made by writing a .pound-number construct as an item. If the node is a generated label, not only is this match successful but the label is made available to the elements of the unparse expression as the number following the .pound.

28 Unparse Expressions

28A Syntax

28A1 outexp = subout ('/ outexp / .empty);

28A2 subout = outt (rest / .empty) / rest;

28A3 rest = outt (rest / .empty) / gen (rest / .empty);

28A4 outt = .id '[ arglst '] / '( outexp ') / nsimpl (': (S / L / N / C) / .empty);

28A5 arglst = argmnt (', arglst / .empty) / .empty;

28A6 argmnt = nsimp / '.pound .num;

28A7 nsimpl = 't nsimp / nsimp;

28A8 nsimp = '* .num (': nsimp / .empty);

28A9 genl = (out / comm) (genl / .empty);

28A10 gen = comm / genu / '< / '> ;

28B Semantics

28B1 The rest of the unparse rules follow more closely the style of the parse rules. Each expression is a series of alternations separated by slash marks.

28B2 Each alternation is a test followed by a series of output instructions, calls of other unparse rules, and parenthesized expressions. Once an unparse expression has begun executing calls on other rules, elements may not fail; if they do a compiler error is indicated and the system stops. -- Preceding unsigned comment added by Steamerandy (talk • contribs) 21:11, 14 October 2014 (UTC)

Observation on the unparse stage
It's interesting to note that Schorre et al. put an explicit marker in the syntax description to indicate at what point a constructed parse tree should be "unparsed" to a linear sequence of machine instructions. It appears that at least some compilers of the 1970s era built the entire parse tree in memory and only generated code when it was intact, for example see http://people.cs.clemson.edu/~mark/s1.html on Stallman's early experience with the Livermore/Stanford "Pastel" compiler. MarkMLl (talk) 09:09, 9 October 2014 (UTC)
 * Many compilers wrote the parse tree to a file to be processed by other passes.


 * I am not sure that Schorre was directly involved with TREE-META. I have never seen a TREE-META document with Schorre as an author. It is a Schorre language. CWIC, with Schorre a named participant, builds a tree structure exactly the same way and has similar unparse rules. The only difference is the tree construction operator, TREE-META using [ ] and CWIC using ! . CWIC was far more advanced, having a symbol table and token rules. The unparse actions are written in a LISP 2 dialect. All manipulable data are objects that carry their type. Variables are not typed, as they are all pointers to a general object.


 * CWIC could write the tree to a file to be processed by subsequent passes or process it in small chunks as in the TREE-META example here.


 * These Schorre-based metacompilers actually compile the languages as written. Giving the programmer control of the parsing and transformation by the way it is coded is the major defining difference that separates them from other so-called compiler compilers. In TREE-META a left- or right-handed tree can be produced by the way a rule is coded. CWIC can be just as easily coded to control tree production and adds the ability to generate lists.

Steamerandy (talk) 23:56, 15 March 2015 (UTC)

Unparse rules
In the SUB[-,-] unparse rule, what does the % do (casual perusal of the Andrews/Rulifson ref doesn't turn this up)? Is there any easy way of explaining *1:*1 to the casual reader? MarkMLl (talk) 09:45, 1 August 2016 (UTC)


 * I've added something for that, leaving the < > to be explained. MarkMLl (talk) 10:53, 3 August 2016 (UTC)

ALGOL in Tree Meta
As a placeholder for possible future research, HP Journal June '78 makes brief reference to using Tree Meta to implement an ALGOL compiler for an instrument's firmware, based on Lynn McArthur Wheelwright's Masters thesis "An Algol Implementation Using Tree Meta". Can anybody lay hands on this? MarkMLl (talk) 19:19, 17 December 2017 (UTC)

RFC 101
RFC101 has a few lines on SRI converting from their XDS 940 to a PDP-10, with the help of Utah. https://www.ietf.org/rfc/rfc101.html

Also there are a few general thoughts and links at http://lambda-the-ultimate.org/node/3122, although I'm not sure whether that includes anything not touched on here already. MarkMLl (talk) 21:17, 27 January 2022 (UTC)

Also (via http://lambda-the-ultimate.org/node/5648) http://venge.net/graydon/talks/CompilerTalk-2019.pdf?utm_source=thenewstack&utm_medium=website&utm_campaign=platform p28 mentions Tree-Meta favourably, commenting that it was bootstrapped from Meta-II and that it was used to create MPL https://exhibits.stanford.edu/stanford-pubs/catalog/jh370ps3049 MarkMLl (talk) 07:55, 31 December 2023 (UTC)