Ned's Prescriptive Prefix Pushing Ploy

Thursday 14 October 2004

Most generated parsers don't let you select the target production at run time: they are always looking for the same syntax. But what if you want to call your parser in different ways, for example sometimes expecting a statement, and sometimes just an expression? Here's a trick I came up with to use any parser as a switchable-start parser.

Let's say you're using a parser generator (such as Bison, Lemon, or PyGgy). You've defined a grammar for a little language with statements:

stmt ->
      ID EQ expr SEMI
    | PRINT expr SEMI
    | IF expr STMT
    | IF expr STMT ELSE stmt
    | WHILE expr STMT
    ;

expr ->
      LPAREN expr RPAREN
    | expr op expr
    | NUMBER
    | ID
    ;

op -> PLUS | MINUS | STAR | SLASH ;

(This is a very small language, expressed in PyGgy's syntax.) The parser generated by this file will parse a statement. Let's say you want to now call the parser to parse for just an expression. How to do it? PyGgy doesn't have the option of specifying the start production at run time: it always starts with the first non-terminal.

Here's a trick I came up with: Extend the grammar with a special token that indicates what to parse for. Then at run time, "tell" the parser what it's looking for by pushing that token into the parser before the real text.

We add a new rule to the grammar:

main ->
      EXPECT_STMT stmt
    | EXPECT_EXPR expr
    ;

When we start the parser, if we're looking for a statement, we push EXPECT_STMT as the first token in the stream. If we're looking for an expression, we push EXPECT_EXPR. The first token switches the parser onto the proper track, and the parser takes care of the rest. Of course, this can be extended to any number of start states.

I wanted to come up with a catch alliterative name for this technique, like Duff's Device. But there's aren't many useful words starting with N or B, so I'm going with Ned's Prescriptive Prefix Pushing Ploy.

Comments

[gravatar]
Stuart Dootson 1:43 PM on 14 Oct 2004

Ned - don't know if you've seen Antlr - but that allows you to select the starting production, as all productions are methods on a generated class. To use your example, you'd have something like:

class SomeParser
{
void stmt();
void expr();
void op();
};

No Python, though - only C++, Java or C#...

[gravatar]
Ned Batchelder 1:58 PM on 14 Oct 2004

I have seen (and even used!) ANTLR. In fact, I knew one or more parser generators let you select the start production, but I couldn't remember which it was. It was probably ANTLR I was thinking of.

[gravatar]
Phillip J. Eby 1:32 PM on 15 Oct 2004

How about calling it "Ned's Nicety"? Or maybe "Batchelder's Bait-and-Switch"? :)

[gravatar]
Jon 3:04 PM on 15 Oct 2004

... Ned's Knob? ... maybe not.

[gravatar]
Bob 9:44 AM on 16 Oct 2004

Here are a few:

Ned's Notion
Batchelder's Band-aid
Batchelder's Brainchild
Batchelder's Breakthrough
Batchelder's Blackbox

--bob

Add a comment:

name
email
Ignore this:
not displayed and no spam.
Leave this empty:
www
not searched.
 
Name and either email or www are required.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.