Here’s a Python gem: a small bit of Python that uses the power of the language and standard library well.
It’s a function to list strings generated by a pattern with embedded alternatives. It takes an input string with brace-wrapped possibilities, and generates all the strings made from making choices among them:
>>> list(explode("{Alice,Bob} ate a {banana,donut}."))
[
'Alice ate a banana.',
'Alice ate a donut.',
'Bob ate a banana.',
'Bob ate a donut.'
]
Here’s the function:
def explode(pattern: str) -> Iterable[str]:
"""
Expand the brace-delimited possibilities in a string.
"""
seg_choices = []
for segment in re.split(r"(\{.*?\})", pattern):
if segment.startswith("{"):
seg_choices.append(segment.strip("{}").split(","))
else:
seg_choices.append([segment])
for parts in itertools.product(*seg_choices):
yield "".join(parts)
I call this a gem because it’s concise without being tricky, and uses Python’s tools to strong effect. Let’s look at how it works.
re.split: The first step is to break the string into pieces. I used re.split(): it takes a regular expression, divides the string wherever the pattern matches, and returns a list of parts.
A subtlety that I make use of here: if the splitting regex has capturing groups (parenthesized pieces), then those captured strings are also included in the result list. The pattern is anything enclosed in curly braces, and I’ve parenthesized the whole thing so that I’ll get those bracketed pieces in the split list.
For our sample string, re.split will return these segments:
['', '{Alice,Bob}', ' ate a ', '{banana,donut}', '.']
There’s an initial empty string which might seem concerning, but it won’t be a problem.
Grouping: I used that list of segments to make another list, the choices for each segment. If a segment starts with a brace, then I strip off the braces and split on commas to get a list of the alternatives. The segments that don’t start with a brace are the non-choices part of the string, so I add them as a one-choice list. This gives us a uniform list of lists, where the inner lists are the choices for each segment of the result.
For our sample string, this is the parts list:
[[''], ['Alice', 'Bob'], [' ate a '], ['banana', 'donut'], ['.']]
itertools.product: To generate all the combinations, I used itertools.product(), which does much of the heavy lifting of this function. It takes a number of iterables as arguments and generates all the different combinations of choosing one element from each. My seg_choices list is exactly the list of arguments needed for itertools.product, and I can apply it with the star syntax.
The values from itertools.product are tuples of the choices made. The last step is to join them together and use yield to provide the value to the caller.
Nice.
Comments
For fun I tried a Ruby version of this for comparison, although maybe not as much of a gem (no pun) because I don’t think Ruby has a nice way to return an iterable (lazy enumerable) for a cartesian product
Very nice. I would use
segment[1:-1]
instead of
segment.strip("{}")
since strip() would strip all leading and trailing braces and thus a pathological case like
"{Alice,Bob,}}"
would not work?
@Kari: in fact, I originally had [1:-1], until I wrote in the text, “strip off the braces”. Then .strip() seemed like a better expression of the idea. But I could go either way.
Really beatiful gem and it is straight-forward to transform it into comprehension-style:
You can do something similar in Bash shell, with brace expansion
Alice ate a banana
Alice ate a donut
Bob ate a banana
Bob ate a donut
Calvin ate a banana
Calvin ate a donut
Mary ate a banana
Mary ate a donut
Add a comment: