User:Cscott/Ideas/Improved for-loops for Lua

What for-loops are missing
At the March 2024 MediaWiki Engineering Offsite, User:MatmaRex proposed a lightning talk titled "What the for-each loop is missing". Without spoiling his talk too much, his observations boil down to two common features programmers have to manually implement on top of standard for-loops:

1. Some way to detect whether you are at the first or last iteration of the loop.
2. Code which executes "only if the loop iterated at least once" or "only if the loop never executed".
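In standard Lua today, both features have to be written by hand. The following is a minimal sketch of that bookkeeping, not an example from MatmaRex's talk or from this proposal; the helper name join_with_and and its output format are made up for illustration.

```lua
-- Standard Lua 5.1: the bookkeeping a programmer writes by hand today.
-- join_with_and is a hypothetical helper, not part of the proposal.
local function join_with_and(items)
  local out = {}
  local n = #items
  for i, item in ipairs(items) do
    local first = (i == 1)  -- manual "first iteration" test
    local last = (i == n)   -- manual "last iteration" test; needs the length up front
    if not first then
      out[#out + 1] = last and " and " or ", "
    end
    out[#out + 1] = item
  end
  -- Manual "only if the loop never executed" branch.
  if n == 0 then
    return "(nothing)"
  end
  return table.concat(out)
end

print(join_with_and({ "a", "b", "c" }))  -- a, b and c
print(join_with_and({}))                 -- (nothing)
```

Note that the "last" test only works here because the length of the table is known in advance; with a general iterator that is not the case.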

Implementation in Lua
MatmaRex presented his proposal in a language-independent way, and partial implementations exist for various languages. For example, Python 3 has an else clause for for-loops which executes "only if loop completes normally". A custom iterator was also written in PHP that provides the first-iteration and last-iteration booleans. Since I had a Lua grammar and interpreter handy, I decided to take a shot at a Lua implementation of the full proposal.

There are two additions to the Lua grammar. First, an optional "with" clause is added in both the for-in and for-num productions. The first identifier names a boolean local variable which is true during the first iteration, and the second identifier names a boolean local variable which is true during the final iteration. Note that these can be named anything, although in most of my examples they will be named "first" and "last" for clarity. For nested for loops you will likely want a distinct pair of names for each loop.

For simplicity and clarity I've chosen to make both the "first" and "last" identifiers mandatory; there is no way to ask for only "first" and not "last", or only "last" and not "first". This has runtime implications in the for-in case: with Lua's implementation of iterators/generators, we can't determine whether we are on the last element without actually requesting the next one. Thus, when a "with" clause is present, a for-in loop always executes "one iteration behind"; that is, it requests element N+1 before executing the loop body with element N. In some corner cases with user-implemented generator functions this behavior might be observable.

Consider this example, adapted from the Lua manual's description of how an iterator function is implemented:
Without the "with" clause in the for-in loop this prints:
But when the "with" clause is added this prints:

It would be possible to add a single-identifier "with" clause (naming only the "first" boolean) as an alternative production, and when only the "first" boolean is required we wouldn't need to execute one iteration behind, but I haven't done that in this implementation.
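The "one iteration behind" effect can be reproduced in standard Lua with a lookahead wrapper. This is only a sketch of the idea, not the interpreter's actual implementation and not the original example; with_first_last, logging_iter, and logging_ipairs are hypothetical names, and the wrapper assumes an iterator that returns at most two values per step.

```lua
-- with_first_last is a hypothetical helper: it wraps a generic-for triple
-- (f, s, ctl) and yields (first, last, k, v) for each element.
local function with_first_last(f, s, ctl)
  -- Prefetch element 1 before the loop body ever runs.
  local k, v = f(s, ctl)
  local first = true
  return function()
    if k == nil then return nil end
    local cur_k, cur_v, cur_first = k, v, first
    -- Request element N+1 *before* handing element N to the loop body.
    k, v = f(s, cur_k)
    first = false
    return cur_first, (k == nil), cur_k, cur_v
  end
end

-- A user-written generator that records every call, so the lookahead
-- becomes observable.
local log = {}
local function logging_iter(t, i)
  i = i + 1
  log[#log + 1] = "fetch " .. i
  local v = t[i]
  if v ~= nil then return i, v end
end
local function logging_ipairs(t) return logging_iter, t, 0 end

for first, last, i, v in with_first_last(logging_ipairs({ "x", "y" })) do
  log[#log + 1] = "body " .. i .. (first and " first" or "") .. (last and " last" or "")
end
print(table.concat(log, "\n"))
-- fetch 1
-- fetch 2
-- body 1 first
-- fetch 3
-- body 2 last
```

A plain for-in loop over the same generator would interleave the calls as fetch 1, body 1, fetch 2, body 2, fetch 3; the wrapper shifts every fetch one step earlier.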

The second grammar feature is adding optional "then" and "else" clauses to the for-in and for-num loops. We have a number of design questions here: what local variables should be visible in the scope of the "then" and/or "else" block, and what should their values be? What should the behavior of the "then" and "else" blocks be when "break" is used in the "for" loop? (Lua does not have a "continue" statement.) And, finally: our choice regarding "break" behavior made it desirable sometimes to combine the "more than zero iterations" ("then") and "zero iterations" ("else") cases; how should this be done?

I made the following choices:

1. In "then" blocks, the iteration variable is visible and is reset to the value it had on the last iteration of the loop. (Any local writes to this variable in the loop body are discarded.) In "else" blocks, the iteration variable is not defined; it would not have a useful value in that case anyway. This makes this example adapted from Python work in Lua as well:

If a "with" clause is present, neither of its local variables is defined in the "then" or "else" block.

2. When executing a "break" statement, the "then" and "else" blocks are skipped. This tweaks the semantics of the "then" and "else" blocks: "then" runs only when the "for" loop completes normally (without "break") after at least one iteration, and "else" runs only when it completes normally after zero iterations. This makes this example adapted from Python work in Lua as well:

3. If a "then" block is present without an "else" block, then the "then" block is executed on any normal completion of the loop, even if it had zero iterations. This seems to match the common use case when only a "then" block is present, as in the example above. You can think of this as effectively duplicating the "then" block and using it as the "else" block as well, but note that (unlike in an ordinary "else" block) the loop variables are declared in the body of the "then" block; if the loop was not executed they will all be set to nil. It could be argued that we should use a different grammatical marker for this case, perhaps a dedicated single keyword, but we've opted to keep it simple in our implementation.
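To make these choices concrete, here is how the three possible outcomes of a loop can be told apart by hand in standard Lua. This is a sketch written for illustration, not output of the extended grammar; the function classify and its messages are hypothetical.

```lua
-- classify is a hypothetical example in the spirit of Python's for/else.
-- It distinguishes the three outcomes described in choices 1 and 2.
local function classify(n, candidates)
  local broke = false     -- set when the loop exits via break
  local iterated = false  -- set when the body runs at least once
  local last_d            -- value of the loop variable on the final iteration
  for _, d in ipairs(candidates) do
    iterated = true
    last_d = d
    if n % d == 0 then
      broke = true
      break
    end
  end
  if broke then
    -- After break, neither a "then" nor an "else" block would run.
    return "divisible by " .. last_d
  elseif iterated then
    -- The "then" case: normal completion after one or more iterations,
    -- with the loop variable still holding its last value.
    return "no divisor found, last tried " .. last_d
  else
    -- The "else" case: normal completion after zero iterations.
    return "no candidates"
  end
end

print(classify(9, { 2, 3 }))  -- divisible by 3
print(classify(7, { 2, 3 }))  -- no divisor found, last tried 3
print(classify(7, {}))        -- no candidates
```

Under choice 3, a loop with only a "then" block would take the second branch in the zero-iteration case as well, with last_d being nil there.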

The final grammar, using the LPegRex grammar formalism, looks like:

Using this syntax in Scribunto
The Lua grammar and interpreter are written to be compatible with Scribunto and can be used on wiki. One caveat is that Scribunto enforces syntax-checking on Lua code stored in the "Module" namespace, which means that Lua code using with/for-then/for-else can't be successfully saved in that namespace. However, we can parse and execute modules from other namespaces; my examples will use Lua code stored under my user namespace.

To execute code using mlua, which supports this extended for-loop syntax, you just need to replace the Scribunto "#invoke" call in your wikitext with the corresponding mlua call. Note that mlua's default invocation method resolves titles in the "Module" namespace, as Scribunto's "#invoke" does; because our extended syntax "is not syntactically-correct lua" we need to use the mlua method which can execute from the "User" (or other) namespace. The remaining arguments are the title of the module and then the function name within that module, just as with "#invoke".

Live examples using mlua:
 * Source code: /example1
 * Executing :
 * Executing :
 * 41 =
 * 42 =
 * Executing  (see below):

This works in French as well (use ):
 * Source code: /example1/fr
 * Executing :
 * Executing :
 * 41 =
 * 42 =
 * Executing  (see below):