Partition for Python

Sunday 30 July 2006

This article about new assignment semantics in JavaScript 1.7 used Ruby's partition method as an example of multiple assignment. It was a funny example to me, since I'm quite familiar with multiple assignment, having used it many many times in Python code. The new thing for me was partition. I'd never heard of it, though a number of programming languages have it.

It's easy to implement in Python:

# An implementation of partition for Python

def partition(l, pred):
    yes, no = [], []
    for e in l:
        if pred(e):
            yes.append(e)
        else:
            no.append(e)
    return yes, no

And it's easy to test:

import unittest

def is_odd(n):
    return (n % 2) == 1

class PartitionTest(unittest.TestCase):

    def testSimple(self):
        odd, even = partition(range(10), is_odd)
        self.assertEqual(even, [0,2,4,6,8])
        self.assertEqual(odd, [1,3,5,7,9])

    def testAllYes(self):
        odd, even = partition([1,3,5], is_odd)
        self.assertEqual(odd, [1,3,5])
        self.assertEqual(even, [])

    def testAllNo(self):
        odd, even = partition([2,4,6], is_odd)
        self.assertEqual(odd, [])
        self.assertEqual(even, [2,4,6])

    def testEmpty(self):
        odd, even = partition([], is_odd)
        self.assertEqual(odd, [])
        self.assertEqual(even, [])

if __name__ == '__main__':
    unittest.main()

I also wondered about generalizations: take a function returning a small integer, and return a list where the i'th element is a list of all the values for which the function return i? Or return a dictionary whose keys are the values returned by the function, and the values are lists of input values that caused them? I imagine different cases could be made for either of these.

I don't have a use for any of this at the moment, just an idle mental exercise.

BTW: JavaScript 1.7 turns out to have a ton of new stuff. It sounds very cool, but how will developers make use of it generally? They can't count on these features being in general deployment. I guess it's just for version-specific development.

tagged: » 7 reactions

Comments

[gravatar]
Ian Bicking 11:31 PM on 30 Jul 2006

Maybe an interesting technique would be:

def group(seq):
. result = {}
. for item, category in seq:
. . result.setdefault(category, []).append(item)
. return result

partitioned = group((color, is_primary(color)) for color in colors)
by_first_letter = group((name, name[0]) for name in names)

Comprehensions allow you to avoid lambdas in this case.

[gravatar]
Stuart Langridge 3:30 AM on 31 Jul 2006

It's also useful for people using Mozilla as a platform; deploying apps built in XUL on top of the xulrunner, for example.

[gravatar]
Chris Arndt 7:05 AM on 31 Jul 2006

I didn't now that the technical term for this was "partitioning" but this is what I describe as the "Aschenputtel problem" (after the sorting of the peas in the Cinderella fairy-tale).

We discussed some different approaches to the problem on the tutor mailing list a while ago and I did some benchmarking as well:

http://tinyurl.com/hvly7.

I still haven't found the ideal, elegant syntax for it. I think there should be a function implemented in C for this (maybe in itertools). Maybe there already is and I just missed it ;-)

[gravatar]
Dan Sickles 10:24 AM on 31 Jul 2006

1.7 will be useful to the mozilla developers for Firefox, Thubderbird etc. And to other mozilla platform developers such as Active State. Actually, Active State developed pyxpcom so they use lots of python too. Python will soon be a fuully supported language on the mozilla platform:
http://conferences.oreillynet.com/presentations/os2006/hammond_mark.pdf">

[gravatar]
Andrew Dalke 4:48 PM on 31 Jul 2006

Here's a generator version. It passes the unit tests, with a slight change to list() the result from partition. As to if it's useful -- that's a different topic entirely. :)


import collections

def partition(l, pred):
l_iter = iter(l)
pending_true = collections.deque()
pending_false = collections.deque()
def partition_true():
while 1:
if pending_true:
yield pending_true.popleft()
else:
while 1:
item = l_iter.next()
if pred(item):
yield item
break
else:
pending_false.append(item)

def partition_false():
while 1:
if pending_false:
yield pending_false.popleft()
else:
while 1:
item = l_iter.next()
if pred(item):
pending_true.append(item)
else:
yield item
break

return partition_true(), partition_false()

import unittest

def is_odd(n):
return (n % 2) == 1

class PartitionTest(unittest.TestCase):

def testSimple(self):
odd, even = partition(range(10), is_odd)
self.assertEqual(list(even), [0,2,4,6,8])
self.assertEqual(list(odd), [1,3,5,7,9])

def testAllYes(self):
odd, even = partition([1,3,5], is_odd)
self.assertEqual(list(odd), [1,3,5])
self.assertEqual(list(even), [])

def testAllNo(self):
odd, even = partition([2,4,6], is_odd)
self.assertEqual(list(odd), [])
self.assertEqual(list(even), [2,4,6])

def testEmpty(self):
odd, even = partition([], is_odd)
self.assertEqual(list(odd), [])
self.assertEqual(list(even), [])

if __name__ == '__main__':
unittest.main()



BTW, it's hard to put Python code in your comments. I ended up using a P element with style="white-space:pre" because a pre element was converted into br elements, not keeping indentation.

[gravatar]
Erik Wilsher 3:23 PM on 1 Aug 2006

You could generalize this to:

>>> def partition(seq, cond, parts=2):
. . . .res = [[] for i in range(parts)]
. . . .for e in seq:
. . . . . . . .res[int(cond(e))].append(e)
. . . .return res

Usage:
>>>a,b,c = partition(range(10), lambda e:e%3, 3)
>>>a
[0, 3, 6, 9]

[gravatar]
Pzelnip 1:24 PM on 4 Jul 2011

That above partition function is absolutely brilliant.

Add a comment:

name
email
Ignore this:
not displayed and no spam.
Leave this empty:
www
not searched.
 
Name and either email or www are required.
Don't put anything here:
Leave this empty:
URLs auto-link and some tags are allowed: <a><b><i><p><br><pre>.