Most generated parsers don’t let you select the target production at run time: they are always looking for the same syntax. But what if you want to call your parser in different ways, for example sometimes expecting a statement, and sometimes just an expression? Here’s a trick I came up with to use any parser as a switchable-start parser.
ID EQ expr SEMI
| PRINT expr SEMI
| IF expr STMT
| IF expr STMT ELSE stmt
| WHILE expr STMT
LPAREN expr RPAREN
| expr op expr
op -> PLUS | MINUS | STAR | SLASH ;
(This is a very small language, expressed in PyGgy’s syntax.) The parser generated by this file will parse a statement. Let’s say you want to now call the parser to parse for just an expression. How to do it? PyGgy doesn’t have the option of specifying the start production at run time: it always starts with the first non-terminal.
Here’s a trick I came up with: Extend the grammar with a special token that indicates what to parse for. Then at run time, “tell” the parser what it’s looking for by pushing that token into the parser before the real text.
We add a new rule to the grammar:
| EXPECT_EXPR expr
When we start the parser, if we’re looking for a statement, we push EXPECT_STMT as the first token in the stream. If we’re looking for an expression, we push EXPECT_EXPR. The first token switches the parser onto the proper track, and the parser takes care of the rest. Of course, this can be extended to any number of start states.
I wanted to come up with a catch alliterative name for this technique, like Duff’s Device. But there’s aren’t many useful words starting with N or B, so I’m going with Ned’s Prescriptive Prefix Pushing Ploy.