Ned Batchelder : Blog | Code | Text | Site A regular crossword » Home : Blog : February 2013 |

## A regular crosswordSaturday 9 February 2013 A friend shared a link to an unusual puzzle (click for a full-page PDF): Each row has a regular expression indicating the letters to fill the row. Each cell is at the intersection of three rows, so there are a number of constraints to satisfy at each point. Overlapping constraints are a good basis for logic puzzles. Sudoku, Ken-Ken, Nonograms, and plenty of other puzzle forms follow the same recipe: determine the contents of a cell, based on multiple simultaneous constraints. But the regexes here add an extra dimension. Each of the regexes
here has a different form, resulting in different levels of
information. Two rows have I wrote to the author, Dan Gulotta, about how he constructed it, and he told me,
A few times during my solving of the puzzle, I used the classic piece of puzzle meta-information as part of the deduction: there is a unique solution. A friend of mine said he solved it without using that fact, but I don't see a reason to avoid it. By the way, I didn't realize this when I solved it, but there's another level to the puzzle, which is to identify the phrase in it. It was part of the 2013 MIT Mystery Hunt. | |

## Comments

ChrisF8:34 PM on 10 Feb 2013There is provably more than one unique solution to the puzzle:

First row, second to last slot, has the following constraints:

Across: could be anything, including one of the 2 mandatory H's (tho there could be more than 2 on that row).

Down: A, M or C.

Up: due to a partial solve putting an M somewhere else on that line, not an M (the regex dictates only one M on that line). If this were not the case, M would also be a possibility.

Thus, that cell could be at least either of A or C, but there are

no outstanding constraintsthat would decide between the two, thus there are at least 2 solutions.ChrisF8:38 PM on 10 Feb 2013...then again, now I look at it again, the down regex indicates that if it were C then there would not be a CM sequence elsewhere down that line, this a constraint that

woulddictate which of A or C it was, given more of the solution... Poopy.erez10:00 AM on 11 Feb 2013"Another has [CR]*, so we know the entire row is C's and R's."

That's incorrect, [CR]* means 'matches zero or more instances of either C or R'. Any kind of string could fit this one.

Ned Batchelder10:25 AM on 11 Feb 2013@erez, perhaps I (or the author) should have made clear: each regex matches the entire row. The only way [CR]* can match the entire row is for the row to be composed entirely of C's and R's.

ChrisF1:33 PM on 11 Feb 2013I am treating all the regexes as if I'd done a

on them.Ivan Mikhailov3:28 PM on 11 Feb 2013200% perfect idea! BTW an automatic solver of the puzzle is a nice task for CS stundents, as well as a generator of new puzzles of the sort.

ChrisF3:35 PM on 11 Feb 2013The interesting thing in terms of creating a solver is programatically identifying the sub-constraints from each RE - it's the set of those that is solvable. E.g. things like if that cell is an X then the two following on this line are both C's, or the set of characters this cell could be is just C or R. That could be the guts of the task tho - then let prolog do the rest (c:

It's a lot easier to write a checker than a solver, and not feasible to brute force even a single line normally, even given the only implicitly stated constriction of each cell containing only an uppercase letter.

Aron Griffis3:50 PM on 11 Feb 2013I have a checker at https://gist.github.com/agriffis/4746429

An entirely brute force solver is easy to write, but would take a very long time to run. I think a true solver can't lean on an regex engine to execute the given patterns against a search space, because the overlapping constraints can't execute against the same space. Rather you have to explode them to establish individual cell constraints then chain discovered information to the crossed patterns.

ChrisF5:55 PM on 11 Feb 2013One thing that could help is the PCRE partial match support but that only really helps for prefixes - doesn't do much to help where the interesting constraints are at the end of the line.

Michael Brazier7:57 PM on 12 Feb 2013Having just finished solving the puzzle by hand, I can say that not only isn't it necessary to assume a unique solution, there is no point at which you need to guess a character for a cell. At every stage of solving, there is at least one cell which has only one possible character remaining.

Audrey Tang2:02 AM on 13 Feb 2013I'm happily surprised to find my earlier work, regex-genex ( http://hackage.haskell.org/package/regex-genex ), has been adapted by LeventErkok++ into a 100-LOC solver for the crossword:

https://gist.github.com/LeventErkok/4942496

It uses the Z3 theorem prover to find the unique solution.

Jordan Dimov10:16 AM on 13 Feb 2013Amazing work, Audrey. In fact, Mr. Erkok's Haskell solution is significantly less than 100 LOC, seeing as 60 lines of the source code are taken to represent the puzzle data itself.

Joe4:44 PM on 14 Feb 2013I solved it. I agree with michael. It was a very well written puzzle and now I'm totally hungry for more!

Sylvain Frandaz11:03 AM on 5 Apr 2013I solved it too. Took me between 2 and 3 hours of cumulated time (I spent it by short bursts at start), and the start is noticeably slower than the finish (due to the fact that you have much more regexps to explore).

Also, I thought at first that there would probalbly be several solutions, but solving it sequentially made those "alternate" solutions disappear by themselves, by showing new constraints (due to letters having been placed already, for instance).

All in all, a very nice puzzle, and I'm looking forward to more of that kind, if you ever lay your hands on those :)

kuba5:15 PM on 8 Apr 2013Too bad link is broken ;)

Ned Batchelder8:43 PM on 8 Apr 2013@kuba: I've restored the PDF.

## Add a comment: