Accidental haikus

Wednesday 16 September 2009

Jonathan Feinberg has made a neat hack: Haiku Finder. It uses NLTK to parse English text, looking for sentences that happen to fit the syllabic pattern required of haikus. It seems to work really well. I ran it on my longer text pieces, and it found these:

These are personal
    tools, meaning they do just what
I want them to do.

People who visit
    the page in their browser will
see the new entry.

But this powerful
    feature of C++ is missing
in those languages.

I know this sounds like
    coddling, or bending over
backward, and it is.

And maybe you don’t
    want to put effort into
improving your log.

It sounds simple, but
    there are right ways and wrong ways
to go about it.

Ask them to tell you
    what they’re thinking as they look
for the solution.

If you get it wrong,
    the object will be freed out
from under you: crash!

This is a macro
    that creates the initial
fields in the structure.

It seemed like I was
    buried in that dark harsh towel
cyclone for ages.

It did a great job, though it didn’t know that “C++” is three syllables, not one.
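To make the “C++” miscount concrete, here is a minimal sketch of the idea. It is not Haiku Finder's code: it swaps NLTK for a crude vowel-group heuristic, and the names `SYLLABLE_OVERRIDES`, `count_syllables`, and `split_haiku` are hypothetical, invented for this illustration. The override table is one assumed way to teach the counter about tokens like “C++”.

```python
import re

# Hypothetical override table for tokens the heuristic cannot handle.
SYLLABLE_OVERRIDES = {"c++": 3}


def count_syllables(word, overrides=SYLLABLE_OVERRIDES):
    """Rough syllable estimate: count vowel groups, drop a silent final 'e'."""
    key = word.lower().strip(".,;:!?\"'()")
    if key in overrides:
        return overrides[key]
    letters = re.sub(r"[^a-z]", "", key)
    if not letters:
        return 0
    count = len(re.findall(r"[aeiouy]+", letters))
    # A trailing 'e' is usually silent ("these", "page"),
    # but words ending in "le" ("simple") keep it.
    if letters.endswith("e") and not letters.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def split_haiku(sentence, overrides=SYLLABLE_OVERRIDES):
    """Return three lines if the sentence breaks 5/7/5 on word boundaries, else None."""
    targets = [5, 7, 5]
    lines, current, total = [], [], 0
    for word in sentence.split():
        if len(lines) == 3:
            return None  # words left over after 17 syllables
        current.append(word)
        total += count_syllables(word, overrides)
        target = targets[len(lines)]
        if total == target:
            lines.append(" ".join(current))
            current, total = [], 0
        elif total > target:
            return None  # a word straddles the line break
    return lines if len(lines) == 3 and not current else None


sentence = "But this powerful feature of C++ is missing in those languages."
print(split_haiku(sentence, overrides={}))  # "C++" counted as 1: splits 5/7/5
print(split_haiku(sentence))                # "C++" counted as 3: None
```

With an empty override table the heuristic counts “C++” as one syllable and the sentence passes as a haiku, which matches the miscount above. With the override in place the middle line runs long and the sentence is rejected.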

If you want to try this, you’ll have to install NLTK, which is a large package. Haiku Finder also requires the punkt dataset, which you have to install from the NLTK page after the code is installed. The whole process is automated, but perhaps more involved than you expected, so be forewarned.

You may remember Feinberg as the creator of Wordle, so I expect we’ll see more inspired language-related hacks from him...


I like the first and last of these the best. Very cool tool.

worker bees can leave
even drones can fly away
the queen is their slave

Jonathan Feinberg 12:34 PM on 17 Sep 2009
You'll be relieved to know that I've made a couple of changes that make "C++" work as expected, and furthermore permit you to customize the syllable-count look-up at runtime.

Whew! I'm so relieved! :)

One step closer to testing the million monkey => sonnet problem. How well does it find randomly generated iambic pentameter?

I ran this over some of my own writing. Here were the best ones it found.

From my description of attending the opening-night Patriots game at Gillette Stadium:

Part of the problem
might have been the luxury
box experience.

From my description of a fireworks show:

There's just a barge out
there with the entire show
programmed into it.

And from my write-up of why we abandoned attending the "NFL Experience" at the Super Bowl in favor of exploring San Diego:

Although I heard good
things about the event, I'm
glad we walked around.

I love this one, because it describes interacting with customers perfectly!

Ask them to tell you
what they're thinking as they look
for the solution.
