Partition for Python

Sunday 30 July 2006This is over 18 years old. Be careful.

This article about new assignment semantics in JavaScript 1.7 used Ruby’s partition method as an example of multiple assignment. It was a funny example to me, since I’m quite familiar with multiple assignment, having used it many many times in Python code. The new thing for me was partition. I’d never heard of it, though a number of programming languages have it.

It’s easy to implement in Python:

# An implementation of partition for Python

def partition(l, pred):
    yes, no = [], []
    for e in l:
        if pred(e):
            yes.append(e)
        else:
            no.append(e)
    return yes, no

And it’s easy to test:

import unittest

def is_odd(n):
    return (n % 2) == 1

class PartitionTest(unittest.TestCase):

    def testSimple(self):
        odd, even = partition(range(10), is_odd)
        self.assertEqual(even, [0,2,4,6,8])
        self.assertEqual(odd, [1,3,5,7,9])

    def testAllYes(self):
        odd, even = partition([1,3,5], is_odd)
        self.assertEqual(odd, [1,3,5])
        self.assertEqual(even, [])

    def testAllNo(self):
        odd, even = partition([2,4,6], is_odd)
        self.assertEqual(odd, [])
        self.assertEqual(even, [2,4,6])

    def testEmpty(self):
        odd, even = partition([], is_odd)
        self.assertEqual(odd, [])
        self.assertEqual(even, [])

if __name__ == '__main__':
    unittest.main()

I also wondered about generalizations: take a function returning a small integer, and return a list where the i’th element is a list of all the values for which the function return i? Or return a dictionary whose keys are the values returned by the function, and the values are lists of input values that caused them? I imagine different cases could be made for either of these.

I don’t have a use for any of this at the moment, just an idle mental exercise.

BTW: JavaScript 1.7 turns out to have a ton of new stuff. It sounds very cool, but how will developers make use of it generally? They can’t count on these features being in general deployment. I guess it’s just for version-specific development.

Comments

[gravatar]
Maybe an interesting technique would be:

def group(seq):
. result = {}
. for item, category in seq:
. . result.setdefault(category, []).append(item)
. return result

partitioned = group((color, is_primary(color)) for color in colors)
by_first_letter = group((name, name[0]) for name in names)

Comprehensions allow you to avoid lambdas in this case.
[gravatar]
It's also useful for people using Mozilla as a platform; deploying apps built in XUL on top of the xulrunner, for example.
[gravatar]
I didn't now that the technical term for this was "partitioning" but this is what I describe as the "Aschenputtel problem" (after the sorting of the peas in the Cinderella fairy-tale).

We discussed some different approaches to the problem on the tutor mailing list a while ago and I did some benchmarking as well:

http://tinyurl.com/hvly7.

I still haven't found the ideal, elegant syntax for it. I think there should be a function implemented in C for this (maybe in itertools). Maybe there already is and I just missed it ;-)
[gravatar]
1.7 will be useful to the mozilla developers for Firefox, Thubderbird etc. And to other mozilla platform developers such as Active State. Actually, Active State developed pyxpcom so they use lots of python too. Python will soon be a fuully supported language on the mozilla platform:
http://conferences.oreillynet.com/presentations/os2006/hammond_mark.pdf">
[gravatar]
Here's a generator version. It passes the unit tests, with a slight change to list() the result from partition. As to if it's useful -- that's a different topic entirely. :)


import collections

def partition(l, pred):
l_iter = iter(l)
pending_true = collections.deque()
pending_false = collections.deque()
def partition_true():
while 1:
if pending_true:
yield pending_true.popleft()
else:
while 1:
item = l_iter.next()
if pred(item):
yield item
break
else:
pending_false.append(item)

def partition_false():
while 1:
if pending_false:
yield pending_false.popleft()
else:
while 1:
item = l_iter.next()
if pred(item):
pending_true.append(item)
else:
yield item
break

return partition_true(), partition_false()

import unittest

def is_odd(n):
return (n % 2) == 1

class PartitionTest(unittest.TestCase):

def testSimple(self):
odd, even = partition(range(10), is_odd)
self.assertEqual(list(even), [0,2,4,6,8])
self.assertEqual(list(odd), [1,3,5,7,9])

def testAllYes(self):
odd, even = partition([1,3,5], is_odd)
self.assertEqual(list(odd), [1,3,5])
self.assertEqual(list(even), [])

def testAllNo(self):
odd, even = partition([2,4,6], is_odd)
self.assertEqual(list(odd), [])
self.assertEqual(list(even), [2,4,6])

def testEmpty(self):
odd, even = partition([], is_odd)
self.assertEqual(list(odd), [])
self.assertEqual(list(even), [])

if __name__ == '__main__':
unittest.main()



BTW, it's hard to put Python code in your comments. I ended up using a P element with style="white-space:pre" because a pre element was converted into br elements, not keeping indentation.
[gravatar]
You could generalize this to:

>>> def partition(seq, cond, parts=2):
. . . .res = [[] for i in range(parts)]
. . . .for e in seq:
. . . . . . . .res[int(cond(e))].append(e)
. . . .return res

Usage:
>>>a,b,c = partition(range(10), lambda e:e%3, 3)
>>>a
[0, 3, 6, 9]
[gravatar]
That above partition function is absolutely brilliant.

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.