PyCon 2008 notes

Monday 17 March 2008

I got back from PyCon last night. I’d taken notes on all the sessions I attended. They’re kind of sketchy, and I don’t know if they’ll be of any use to anyone else, but I figured I’d put them up anyway. My apologies to speakers whom I have crudely paraphrased here. The quality of these notes varies as my energy level waxed and waned.

Michael H Goldwasser Using Python To Teach Object-Oriented Programming in CS1

  • Objects later, or objects first?
  • Objects first sounds better, but:
    • Students are overwhelmed by Java.
    • People are trying to make Java simpler to teach, but why not just use a different language?
  • Python is better:
    • Students can use classes before they create classes
      • stdlib is full of classes: str, list, etc.
    • Interactive interpreter is a great tool for experimentation.
    • Ironically, Java code can only be written inside a class, yet Java’s primitive types are not classes, while Python’s are.
  • Writing classes
    • self is a bit of an obstacle.
      • Code of a method is very consistent.
      • But callers see a different number of parameters, error messages can be confusing.
    • “we’re all adults” access control can be confusing for students.
    • Generics and polymorphism are a strong plus for Python.
  • After Python, need to transition to other languages.
    • Don’t have to unteach any Python lessons, unlike other languages.
    • Switch from dynamic to static is easily motivated: static allows the compiler to do lots of work for you.
    • If they can do Python and C++, everything else is in the range in the middle.
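
The point about self is easy to see in a few lines. This is a minimal sketch, with a made-up Greeter class: the method is declared with two parameters, the caller passes one, and when the caller passes one too many, the error message counts self as well.

```python
class Greeter:
    def greet(self, name):
        # Declared with two parameters: self and name.
        return "Hello, " + name


g = Greeter()

# The caller passes one argument; Python supplies self automatically.
print(g.greet("Ada"))

# The caller writes two arguments, but the error message talks about
# three, because self is included in the count.
try:
    g.greet("Ada", "Lovelace")
except TypeError as err:
    message = str(err)
    print(message)
```

The mismatch between what the student typed and what the message reports is the confusing part.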

Peter Skomoroch, datawrangling.com MPI Cluster Programming with Python and Amazon EC2

  • Netflix prize, large dataset.
    • need to do lots of runs, lots of power.
    • How can I get a beowulf cluster?
  • Amazon EC2
    • __init__( <machine> ) graphic.
    • Launch instances of machines. Price is $.10 per hour.
    • http://del.icio.us/pskomoroch/ec2
  • Parallel programming in Python
    • lots of tools available
  • ElasticWulf
  • batteries included.
  • MPI
    • high performance message passing interface.
    • Standard, point-to-point.
    • flexible but complex.
  • examples of pyMPI code.
  • I was in over my head here!
    • He kept saying, “This is really easy”
    • I kept hearing, “This is really complicated”

Bruce Frederiksen Applying Expert System Technology to Code Reuse with Pyke

  • Pyke grew out of a consulting gig that wanted to increase code reuse
  • What is Code Reuse?
    • Function reuse, but could happen at multiple layers.
    • Current solutions: Adapters, Zope stuff.
    • Limitations: Need to identify functions to call before calling them
      • Dynamic adaptation won’t help.
      • Backward chaining algorithms help here.
  • Applying Backward Chaining to Code Reuse: Works great.
  • Needed a better explanation of the problem area.
  • Pyke is a language for specifying rules with Python code attached.
  • Essentially: an inference engine, but the resulting plan from matching rules is Python code to execute.
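
To make the last bullet concrete, here is a toy backward chainer in plain Python. It is not Pyke's rule syntax or API; the rule format, the goal names, and the plan_for and run helpers are all invented for illustration. Each rule says "to get this goal, first get these other goals, then run this function", and the engine works backward from the goal to produce an ordered plan of Python callables.

```python
# Hypothetical rules: (goal, goals it needs first, step to run).
# This is an illustration only, not Pyke's syntax.
RULES = [
    ("page", ["rows", "template"],
     lambda ctx: ctx["template"].format(rows=ctx["rows"])),
    ("rows", ["connection"], lambda ctx: ", ".join(ctx["connection"])),
    ("connection", [], lambda ctx: ["alice", "bob"]),
    ("template", [], lambda ctx: "<ul>{rows}</ul>"),
]


def plan_for(goal, rules, seen=()):
    """Chain backward from goal; return an ordered list of (goal, step)."""
    if goal in seen:
        return None  # circular rule, give up on this branch
    for head, needs, step in rules:
        if head != goal:
            continue
        plan = []
        for need in needs:
            sub_plan = plan_for(need, rules, seen + (goal,))
            if sub_plan is None:
                break  # this rule cannot be satisfied, try the next one
            for item in sub_plan:
                if item not in plan:
                    plan.append(item)
        else:
            plan.append((goal, step))
            return plan
    return None


def run(plan):
    """Execute the plan in order, storing each result under its goal name."""
    ctx = {}
    for goal, step in plan:
        ctx[goal] = step(ctx)
    return ctx


plan = plan_for("page", RULES)
result = run(plan)["page"]
print([goal for goal, _ in plan])
print(result)
```

The interesting part is that the output of the matching phase is a list of functions to call, which is the "plan is Python code to execute" idea in miniature.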

Kevin Dangoor Rich UI Webapps with TurboGears 2 and Dojo

  • Client-side apps in Django and TurboGears
  • web apps where almost all logic is at the client.
  • Dojo templates, Dijit, etc.
  • Similar concepts to how server-side web apps work, but moved into Javascript.
  • Comet is cool too.
  • Summary: Hyper-ajax where as much as possible happens on the browser.

Adrian Holovaty State of Django

  • PyCon has been a nexus of Django activity over the years.
  • 0.96 is the latest release, a year old, very conservative.
  • trunk == hot_action
  • What’s new:
    • Unicode
    • autoescaping
    • Oracle
    • GeoDjango
    • a few sprints.
  • Community stuff:
    • Djangosites
    • Djangosnippets
    • Djangogigs
    • Djangopeople.net
    • The Django Book
    • Five other Django books
  • What’s coming
    • Django is mostly mature
    • queryset refactor
      • model subclassing.
      • 1-1 models
      • finer-grained select_related
    • newforms admin
      • better separation of model and admin
      • hooks for controlling authorization for viewing, changing, etc.
  • Announcement: Django Software Foundation formed!

Michael Carter High performance Network IO with Python + Libevent

  • Alternatives
    • Twisted
    • asyncore
    • ctypes+libevent
  • Quick overview of Network Communication
  • Pyevent:
    • Python wrapper over libevent
    • fast: lots happens in C
    • Not many docs

Friday lightning talks

  • Noonhat: connect for lunch
  • Saturday House: Take your big ideas, make them small, change the world.

Plenary: Twisted announcement: they have a foundation.

Plenary: You *can* Fool All of the People All of the Time Brian “Fitz” Fitzpatrick

  • Three good reasons to lie to your users
  • Perception is .9 of the law
  • It isn’t lying if you have both sizzle and steak
  • Software is additive
  • Chinese menu vs zen menu
  • Abstractions: leaky
  • Keep it simple.
    • Google screen is simple, but is a lie
    • It hides enormous complexity.
    • iPod is a lie: the elegance hides the complexity.
  • Don’t be lazy
    • MS Word toolbar craziness: lazy.
    • Larding on options is avoiding decisions.
  • Put users first.
    • Listen to what they want.
    • But: they don’t know what they want.
  • Speed matters.
    • 1M users * 20 req/day * 500ms * 365 days/yr = 116 years
    • Stop killing your users!
  • Google anecdotes:
    • “I didn’t know people worked there!”
    • The whole co going to Disneyland. “But then who’ll do the searches!?”
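
The arithmetic in the “Speed matters” bullet checks out if the 116 years is read as total user waiting time per year of operation. A quick sketch, with variable names of my own choosing:

```python
users = 1_000_000
requests_per_day = 20
seconds_per_request = 0.5  # 500 ms
days_per_year = 365

# Total time all users spend waiting over one year, in seconds.
waiting_seconds = users * requests_per_day * seconds_per_request * days_per_year

# Convert that total back into years.
seconds_per_year = 365 * 24 * 60 * 60
waiting_years = waiting_seconds / seconds_per_year

print(round(waiting_years))  # 116
```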

Plenary Keynote: Intellectual Property and Open Source Van Lindberg

  • First kill all the lawyers
    • Greatest hits: bad things that came out of IP suits, SCO, RIAA, etc.
    • These days, the most valuable part of a business is its IP.
  • Frankly, too much information about the legal aspects of goods, IP, etc.

Mike Bayer SQLAlchemy 0.4 and Beyond

  • SQLAlchemy 0.4
    • Lots of developers helping
    • Improved speed
    • simplified code
    • SQL Expression language
      • Smart operators
      • Generic functions
      • Lots of engine-aware differences handled automatically.
    • Collections API: auto-map records to collection classes.
    • “Dynamic” relations:
      • Handle very large many-to-one relationships.
    • Polymorphic Inheritance
    • Transactional Sessions
      • Including nested transactions where savepoints are supported
      • Two phase commit, for coordinated transactions.
    • Mutable primary keys
    • Can assign SQL expressions to columns for atomic updates.
    • Lots of dialects: Sybase, DB2, Informix
    • Horizontal sharding.
    • Connection event hooks
  • Coming up:
    • Migrate

Matt Harrison Managing Complexity (and testing)

  • Discussion of code paths, branch coverage, motivations for path analysis
  • Continuum of coverage:
    • No testing
    • Line testing
    • Branch testing
    • Structured testing
    • Path testing
  • PyMetrics: doesn’t get the metrics quite right.
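
The gap between branch testing and path testing shows up even in a tiny function. In this sketch (the shipping_label function is made up for the example), two calls are enough to run every line and take every branch both ways, yet two of the four paths through the function are never exercised.

```python
from itertools import product


def shipping_label(express, fragile):
    label = "STANDARD"
    if express:
        label = "EXPRESS"
    if fragile:
        label += " FRAGILE"
    return label


# These two calls execute every line and take each branch both ways.
branch_suite = [(True, True), (False, False)]

# Two independent decisions give four distinct paths.
all_paths = list(product([True, False], repeat=2))

# Paths that full branch coverage still leaves untested.
untested_paths = [path for path in all_paths if path not in branch_suite]
for express, fragile in untested_paths:
    print(express, fragile, shipping_label(express, fragile))
```

With n independent decisions the number of paths grows as 2**n, which is why full path testing sits at the far end of the continuum.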

Brandon Rhodes Using Grok to Walk Like a Duck

  • duck typing: cool, but what if your object doesn’t quack at all?
  • subclassing:
    • Works, but:
    • Creation methods need to take factory arguments.
    • Testing can be difficult: real classes like that may be expensive.
    • May be conflicts between base class methods and the needs of derived methods to implement the duck interface.
  • mixin: like subclassing, but new methods are in a mixin
    • tests can be easier, b/c you mixin to a new dummy class for testing.
  • monkeypatching
    • mentioned only for completeness.
    • Ruby people do this: F
    • “Ruby people are excited b/c their language is prettier than Perl.”
  • adapters
    • Give up on the idea of making Messages do something they don’t already do.
    • make a new object using has-a instead of is-a
    • Tests can easily use mock objects.
    • Works great, but wrapping is annoying.
  • Adapters work b/c they provide what another piece of code needs.
    • Interfaces are the way to express this need.
  • Zope provides interface and adaptation tools
  • Grok makes those tools even better.
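
The adapter idea can be sketched in plain Python, without any of the Zope or Grok machinery. The class names here (Message, MessageDuck, FakeMessage) are invented for the example; the point is only the has-a wrapping and how easy it makes testing.

```python
class Message:
    """Stands in for an existing class that does not quack."""

    def __init__(self, text):
        self.text = text


class MessageDuck:
    """Adapter: has-a Message rather than is-a Message."""

    def __init__(self, message):
        self.message = message

    def quack(self):
        return "Quack: " + self.message.text


def make_noise(duck):
    # Client code relies only on the duck interface.
    return duck.quack()


class FakeMessage:
    """Minimal mock: the adapter only needs a .text attribute."""

    text = "test"


print(make_noise(MessageDuck(Message("hello"))))
print(make_noise(MessageDuck(FakeMessage())))
```

The annoying part is that every caller has to remember to do the wrapping by hand. As I understood it, that manual wrapping is what the interface and adaptation tools take over.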

Jono DiCarlo Case Study of Python Application Development -- Humanized Enso

  • build bot
  • contractify: program by contract
  • Crash exceptions reported back automatically.
  • Isn’t Python slow?
    • Profile
    • Take an algorithms course (n00b!)
    • Rewrite critical parts as C extensions.
    • SWIG is cool, SCons is cool.
  • Drawing on the screen
    • prototype with wxpython or pygame
    • real thing: pyCairo
  • OS interaction
    • win32 extensions.
  • Releasing
    • py2exe
    • NSIS

Anna Ravenscroft To RE or not to RE -- Parsing text in Python

  • Simple string methods to deal with text.
  • Tips and tricks for regular expressions.
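
I didn't write down the examples from the talk, so this one is my own: the same small parsing job done first with string methods and then with a regular expression. The input line is made up.

```python
import re

line = "  timeout = 30  # seconds"

# String methods: drop the comment, split on the first "=", trim whitespace.
setting, _, comment = line.partition("#")
key, _, value = setting.partition("=")
key, value = key.strip(), value.strip()

# The same extraction with a regular expression.
match = re.match(r"\s*(\w+)\s*=\s*(\w+)", line)
re_key, re_value = match.groups()

print(key, value)
print(re_key, re_value)
```

For input this regular, the string methods are arguably easier to read. The regex starts to pay off when the format gets less uniform.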

Plenary: Making Client-Side Python Suck Less Aza Raskin

  • Python on the desktop sucks.
  • Big download, silos, ugly
  • toolness.com has a prototype of installing Python as a platform à la .NET.

Plenary Keynote: Snake Charming the Dragon: the past, present and future of Python and Mozilla Mark Hammond

  • Python is the first of the second-class languages
  • History, 1998
    • open source was still a curiosity.
    • Netscape released source code to Navigator
    • Then tried to rewrite it from scratch
  • Same code used to do the chrome and the meat.
  • MS-COM and CORBA models adapted as XPCOM, cross-platform, language-independent
  • The 1st second-class language:
    • #1 goal is fast stds-compliant browser
    • be the platform for experimenting with new web stds.
  • Why the first?
    • first chronologically
    • language features borrowed from python: generators, for ex.
  • Hired by ActiveState to build Python XPCOM bindings.
  • Python in Mozilla:
    • Python can be used directly
    • <window script-type="application/x-python"><script src="...py"/>
    • XULRunner lets you use CSS and XHTML to make native apps.
  • pyXPCOM experiences
    • Users’ reactions to pyXPCOM
    • Some love it, but perception is no community
  • Why no community?
    • Complexity is a barrier to hobbyists.
    • Used by large projects like Mozilla, already busy with their own communities
    • Mozilla and Python each think the other owns it.
  • The future of Mozilla
    • 1.8/1.9: mozilla version numbers, not firefox version numbers.
  • Mozilla 2.0:
    • Smaller, faster
    • JS 2.0, JIT
  • Tamarin virtual machine
    • Created for ActionScript
    • Open-sourced by Adobe
    • JIT, light-weight
    • Trying to fit it in 100K.

Plenary: OLPC Update Ivan Krstić

  • Working on power management
  • Plugins are the new ifdef
    • Need discipline to make it work
  • Deployed laptops in Uruguay and Peru
    • Uruguay: centralized, good tech
    • Peru: dispersed, understand constructionism
  • Peru Arahuay pilot
    • Hilltop village in the middle of nowhere
    • What happened when laptops were handed out? Everyone very engaged.

Ian Bicking Consuming HTML

  • HTML is democratic, therefore tag soup
  • HTML is the most important markup language in the world.
  • What to do about bad stuff?
    • Punish, or Guess
  • XML’s philosophy is Punish
  • Postel’s Law:
    • Be conservative in what you produce, liberal in what you consume.
    • HTML does one half, XML does the other.
  • Presentation vs. Semantics
  • XHTML: it will never catch on.
  • BeautifulSoup
    • Written for screen scraping
    • Forgiving
  • html5lib:
    • Reference impl of HTML 5 parsing
    • In theory, *the* correct parsing.
  • HTMLParser:
    • Old, rejects lots of HTML, awkward (SAXish)
  • lxml:
    • libxml2
    • pretty good parser, similar results to html5lib
    • fast
  • minidom: horrible
  • lxml: same API as ElementTree, plus a parent pointer, plus lxml.html
  • lxml.html: adds methods to Element for HTML.

Jason Pellerin nose: testing for the lazy coder

  • Laziness is good:
    • Don’t want to waste your time.
  • Write tests first so you never write wasted code.
  • How did laziness drive nose
    • Traditional unittest is high-friction
    • py.test is complex
  • Demo of writing a simple application with nose tests.
  • TDD panic sets in.
    • Remain calm.
    • Write a high-level test that documents and expresses what your project does.
  • Nose is easier than unittest
    • Don’t have to write scaffolding to find tests.
  • Basics of nose
    • Extends unittest, doesn’t replace it.
    • Simple test definition
    • Automatic test discovery
    • Same output as unittest.
  • Writing useful tests
    • Organize tests into modules and packages, with fixtures at every level.
    • Use plain-old assert to test.
    • Use print for debugging.
    • Generate tests from other data.
  • Plugins
    • pdb
    • coverage
  • Test selection
    • rich methods for subsetting the tests.
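
Here is roughly what nose-style tests look like: plain functions whose names start with test, plain assert, and a generator that yields one test per row of data. The add function and the test names are made up, and the block at the bottom is only a stand-in runner so the sketch works without nose installed.

```python
def add(a, b):
    return a + b


def test_add_small_numbers():
    # Plain assert: no TestCase subclass, no assertEqual.
    assert add(2, 3) == 5


def check_add(a, b, expected):
    assert add(a, b) == expected


def test_add_from_table():
    # Test generator: each yielded tuple is a callable plus its
    # arguments, and nose runs each one as a separate test.
    cases = [(0, 0, 0), (1, -1, 0), (10, 5, 15)]
    for a, b, expected in cases:
        yield check_add, a, b, expected


if __name__ == "__main__":
    # Stand-in for the nose runner, just to exercise the tests here.
    test_add_small_numbers()
    for func, *args in test_add_from_table():
        func(*args)
    print("ok")
```

With nose itself there is no runner code at all: it finds the test functions by name and runs them.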

Titus Brown Introducing Agile Testing Techniques to the OLPC Project

  • Solving social problems technically
  • Agile methods include lots of stuff like automated testing.
  • Forensic code analysis
    • large sw projects are living organisms
    • We lack tools to study them.
    • CPython has cool hooks
  • sys.settrace lets you measure line execution
    • Wouldn’t it be nice to get that from long-running processes?
  • OLPC
    • OSS community is intolerant of manual builds
    • They should be intolerant of manual tests
  • OLPC GUI has no automated tests
  • Testing Death Spiral
    • As features are added, code starts to break.
  • Cascade of Attention-Deficit Teenagers
  • Continuous Integration
    • sugar-jhbuild pulls 51 packages from the internet!
    • Breaks frequently.
  • Simple GUI automation + Live coverage data
  • Figserve showing lines of code executed in a running Sugar process.
    • Great for reverse engineering code.
  • Conclusions
    • Runtime code tracing is fun
    • GUI automation is not fun

Comments

Josh 10:37 AM on 18 Mar 2008

Very complete-looking notes, Ned.

"XHTML will never catch on."

Sadly, IAWTC.

Pete Skomoroch 1:54 PM on 18 Mar 2008

Ned,

Glad you made the Elasticwulf talk. I just subscribed last week, but have been reading your blog for a while. Sorry about the MPI & live demo sections, I rushed through the material towards the end as I was confused about the talk timing (along with being sleep-deprived).

On the MPI being easy/hard... you are right, it is a pain to write MPI code, difficult to debug, and if one node dies your job is kaput.

More time should have been dedicated to parallel programming with IPython1, but I was only recently turned on to it myself. It allows you to juggle numpy arrays in a parallel fashion without the ugly MPI syntax. It is installed and configured on the ElasticWulf images, and seems well suited to embarrassingly parallel problems which can be handled on EC2 clusters.

I should be writing up an introductory post on running IPython1 on EC2 soon.

-Pete

Shawn Wheatley 2:32 PM on 18 Mar 2008

I disagree that it will never "catch on", although I appreciate Ian's pragmatic approach to dealing with the data. I attended the talk, and definitely agree with most of the sentiments. I do think, however, that the more platforms we have for content management and app development that do a good job of supporting XHTML, the easier it will be for it to continue to "catch on".

BTW, did anyone who caught the talk notice Ian getting heckled by an accessibility fanatic?

Ned Batchelder 8:22 PM on 18 Mar 2008

@Pete: your talk was kind of a classic Pycon talk: a lot of real meat, squeezed into the aggressive 30-minute time slots. I went hoping to learn something I knew almost nothing about. And you structured the talk assuming previous knowledge, which is fine. Your words and my ears were just not ready for each other!

Ned Batchelder 8:25 PM on 18 Mar 2008

@Shawn: How could I miss the heckler? He was pretty impressive. For those not there, in the middle of Ian's talk, he made a point about how average users use b and i tags rather than em and strong, because they just know they want bold, etc. This guy just shouts out, "That leaves out blind people". Ian calmly responded, and the guy returned with some other comment, and after Ian responded again, he finally ended with an emphatic, "I disagree!" It definitely stood out in the normally quite friendly Pycon environment.

est 5:33 AM on 17 Nov 2008

> duck typing: cool, but what if your objcet doesn't quack at all?

This should be *object*?
