Python parsing tools
Ned Batchelder
Created 30 September 2004, last updated 19 July 2008

Recently I went looking for Python parsing tools. I spent a long time researching the various options. When I was done, I had a cheat sheet on the different alternatives. This is that cheat sheet, cleaned up a bit. It is very spotty. If you have updates to the information here, let me know.

Because this is a compilation of factoids freely available on the web, it is in the public domain.

The tools are presented here in random order. I tried organizing them, but I couldn't find a scheme that seemed to help.

Some points of comparison:
The tools

PyGgy
Ply
BisonGen
pyparsing
ANTLR
Simple Top-Down Parsing in Python
Aperiot
Parsing
Rparse
SableCC
GOLD Parser
Plex
yeanpypa
ZestyParser
DParser for Python
Yapps
PyBison
Yappy
Toy Parser Generator
kwParsing
Martel
SimpleParse
SPARK
mxTextTools
FlexBisonModule
Bison In A Box
Berkeley Yacc
PyLR

Standard Modules

The Python standard library includes a few modules for special-purpose parsing problems. These are not general-purpose parsers, but don't overlook them. If your need overlaps with their capabilities, they're perfect:
See also
Comments
Re BisonGen.
True: docs are sparse. We developed it really as an internal tool for generating parsers needed in 4Suite, but got some interest in using it standalone, so started releasing versions of it.
Earlier versions of BisonGen used to generate a bison and flex file for second-stage processing by the GNU tools, but Jeremy, in a fit of brilliant madness, rewrote all the state table analysis and construction code from those packages in Python, so now you're right, it really has little to do with bison. Perhaps a name change is in order, but again given our shallow follow-thru w.r.t. BisonGen...
I'll at least cobble together a home page.
Thanks, all, for your comments. I've updated the page to include them.
You haven't mentioned Antlr. From version 2.7.5 it supports python as a target language.
Very good generator IMO.
Thanks, I've added it to the page.
PLY 1.6 was released.
Thanks: I (finally) updated the PLY info.
Thanks for maintaining your list. It really helped me find a python parser.
Thanks a lot. Great list, came across pyparser thanks to this.
Nice compilation!
It was very helpful for learning many things I did not know previously.
Also, it would be great if you could add ratings and reviews to the tools, so that novices like me can select a parser and get going. It would be much better in the long run.
Suman, I don't have the time to rate and review each of these. I tried to objectively describe them. Trying out each of them would be a much larger undertaking. And I don't know that my criteria would be the same as yours.
There is also a comparison on
http://wiki.python.org/moin/LanguageParsing
Another one for the list: SableCC has a Python backend.
http://www.mare.ee/indrek/sablecc/
In the standard modules, you might consider adding the cmd module. It's convenient when you don't want to bother with writing a grammar for a simple command-line tool. It uses a naming convention to map the user's input to function names; if a match is found, the function is called, otherwise, an error function is called.
--dang
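The naming convention described in this comment can be sketched in a few lines. This is only a minimal illustration, not something from the original page: the `Shell` class and its `greet` and `quit` commands are made-up names, while `cmd.Cmd`, `onecmd`, and `default` come from the standard library.

```python
import cmd
import io


class Shell(cmd.Cmd):
    """Hypothetical two-command shell, for illustration only."""

    prompt = "> "

    def do_greet(self, arg):
        """greet NAME: write a greeting."""
        # Input "greet Ned" is routed here because the method is named do_greet.
        self.stdout.write("hello, %s\n" % (arg or "world"))

    def do_quit(self, arg):
        """quit: leave the command loop."""
        return True  # a true return value stops cmdloop()

    def default(self, line):
        # Called when no do_<name> method matches the first word of the input.
        self.stdout.write("unknown command: %s\n" % line)


# Drive the shell without a terminal by feeding it one line at a time.
out = io.StringIO()
shell = Shell(stdout=out)
shell.onecmd("greet Ned")
shell.onecmd("frobnicate now")
print(out.getvalue(), end="")
```

For interactive use, `Shell().cmdloop()` would read lines from standard input instead of calling `onecmd` by hand.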
Thanks for the suggestions, I've incorporated them above.
Nice listing of resources. I hate to admit it, but I don't know if I need a parser or not. Essentially, I know I spend a lot of time using regular expressions, but I don't know if I can get a better deal with a parser.
The links I have so far focus on the technical aspects. So far I cannot find detail on where lex / parse should and should not be used. Probably I should keep reading... Thanks for the info.
It would be so helpful if there were some sort of blurb about how fast these are. I am currently writing a mud and will definitely need a parser. I looked into many of these parsers, and honestly speed is a very big issue. Unfortunately I haven't seen any kind of benchmarks for most of them.
Ned -- thanks.
I appreciate the overview. This is helping jumpstart me.
I also second the request for some kind of review.
-- joe
Niiice list. Ply's 1.8, updated just this month, might want to update the entry.
Hello, thanks for this page! We are using your work to choose the best tool for our needs. We work on a Flight Management project (100 engineers)...
Hi. I recently found a **great** public-domain Python-based parsing library.
It's called "yeanpypa" (YEt ANother PYthon PArsing lib) and is inspired by PyParsing and Boost::Spirit (a C++-based parsing lib that I've used a fair bit).
Indeed, IMO yeanpypa feels very much like Spirit.
The main difference to some other parsing libs is that with (say) Spirit, you specify the BNF-grammar from the top down. So, to use a nonsensical but easy-to-understand example, if you did a BNF grammar for a book (say a novel), Spirit would do it as (in pseudo-code)-
Book = one-or-more chapters
Chapter = one-or-more pages
Page = one-or-more paragraphs
Paragraph = one-or-more lines
Line = one-or-more words
... and so on. In yeanpypa, you would do -
Word = one-or-more letters
Line = one-or-more words
Paragraph = one-or-more lines
Page = one-or-more paragraphs
Chapter = one-or-more pages
Book = one-or-more chapters
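One plausible reason for the bottom-up order in a grammar written as plain Python objects is that a Python name has to be bound before it can be used, so the smallest rules come first. The sketch below shows that with a toy combinator set. It is not yeanpypa's API: `token`, `one_or_more`, and `terminated` are hypothetical helpers written only for this illustration.

```python
import re


def token(pattern):
    """Parser for one regex match, skipping leading spaces and tabs."""
    regex = re.compile(r"[ \t]*(" + pattern + ")")

    def parse(text, pos):
        m = regex.match(text, pos)
        return (m.group(1), m.end()) if m else None

    return parse


def one_or_more(parser):
    """Parser that applies `parser` repeatedly and needs at least one match."""

    def parse(text, pos):
        items = []
        result = parser(text, pos)
        while result is not None:
            value, pos = result
            items.append(value)
            result = parser(text, pos)
        return (items, pos) if items else None

    return parse


def terminated(parser, end_pattern):
    """Parser that requires `end_pattern` right after `parser` succeeds."""
    end = re.compile(end_pattern)

    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        m = end.match(text, pos)
        return (value, m.end()) if m else None

    return parse


# Bottom-up: each rule only refers to names that already exist.
word = token(r"[A-Za-z]+")
line = terminated(one_or_more(word), r"[ \t]*\n")
paragraph = one_or_more(line)

result = paragraph("the quick fox\njumps over\n", 0)
print(result)
```

Writing `paragraph` before `line` would raise a `NameError` here, which is why libraries that allow top-down grammars usually provide some form of forward declaration.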
Yeanpypa is great! I've tried PyParsing but just couldn't get the hang of it. Then I tried yeanpypa and (having used Spirit) I "got it" *immediately!*
Here are the URLs for yeanpypa -
http://freshmeat.net/projects/yeanpypa/
http://www.slash-me.net/dev/snippets/yeanpypa/documentation.html
'Construct' is a declarative framework for the definition of arbitrary data structures. These data structures, called 'constructs', allow both parsing and building (symmetrically).
http://construct.wikispaces.com
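To give a feel for what "parsing and building (symmetrically)" means, here is a small sketch that uses the standard library's `struct` module as a stand-in. It is not Construct's API, and the three-field header layout and the `build_header` / `parse_header` helpers are assumptions made up for the example. The point is only that one declarative format description drives both directions.

```python
import struct

# One declaration: big-endian, two unsigned 16-bit fields and one unsigned byte.
# (Hypothetical layout: width, height, colour depth.)
HEADER = struct.Struct(">HHB")


def build_header(width, height, depth):
    """Turn field values into bytes."""
    return HEADER.pack(width, height, depth)


def parse_header(data):
    """Turn bytes back into field values, using the same declaration."""
    width, height, depth = HEADER.unpack(data)
    return {"width": width, "height": height, "depth": depth}


raw = build_header(640, 480, 24)
parsed = parse_header(raw)
print(raw, parsed)
```

Building from `parsed` again yields the same bytes, which is the round-trip property the comment describes.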
Ned, pyparsing.
pyparsing!!!!!!!!!!