We are in full swing now in the presidential campaign, and we are constantly bombarded with poll numbers. Funny thing is, most of those polls are just national polls, a prediction of how the nation-wide popular vote will turn out. But as the 2000 election underscored, that doesn't matter at all: what matters is the electoral vote. To predict that, you'd have to track individual state-by-state polls to see who wins the popular vote in each state, and compute the electoral vote totals. Sounds like a lot of work, but FiveThirtyEight.com (Electoral Predictions Done Right) has done all the work already. They also run statistical simulations to predict the likelihood of various outcomes (for example: the chance of McCain losing the popular vote but winning the election is 1.7%).

Add extensive tables of data detailing the poll data, the simulations, their predictions, maps of outcomes, more of the same for congressional races, and so on, and you have a quantitative political junkie's dream site.

BTW, as of this moment, they predict an Obama win, with 339 electoral votes to McCain's 199.

And they aren't the only game in town: there's also Electoral-vote.com (currently predicting a 329 over 194 win for Obama), and Election Projection (364 to 174 for Obama).
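
For the curious, here is a minimal sketch of the idea: tally electoral votes state by state, then rerun the tally many times with random polling error added. I don't know how any of these sites actually build their models, so this is only an illustration. The three states, their vote counts, their poll numbers, and the 3% error are all invented, and treating each state's error as independent is a simplifying assumption.

```python
import random

# Hypothetical three-state "country": (electoral votes, candidate A's polled
# share of the two-party vote). These numbers are invented for illustration.
STATES = {
    "Alpha": (10, 0.53),
    "Beta": (6, 0.49),
    "Gamma": (5, 0.50),
}

def tally(shares):
    """Electoral votes for candidate A, given each state's vote share."""
    return sum(ev for name, (ev, _) in STATES.items() if shares[name] > 0.5)

def simulate(trials=10000, poll_error=0.03, seed=1):
    """Estimate A's chance of winning by adding random error to each poll."""
    rng = random.Random(seed)
    total = sum(ev for ev, _ in STATES.values())
    wins = 0
    for _ in range(trials):
        # Assumption: each state's error is an independent Gaussian draw.
        shares = {name: rng.gauss(share, poll_error)
                  for name, (_, share) in STATES.items()}
        if tally(shares) * 2 > total:
            wins += 1
    return wins / trials
```

Calling `simulate()` returns the fraction of simulated elections that candidate A wins, which is the kind of likelihood figure quoted above.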

tagged: politics» 4 reactions

Aptus 2.0, the latest version of my Mandelbrot explorer, is now available. It has a lot of improvements over the previous version, including better speed, multiple top-level windows, tool windows for displaying information, and Julia set support.

fractal image from the Mandelbrot set

It's built with wxPython, so it runs on Windows, Linux, and Mac.

tagged: my code, math» 1 reaction

In writing the Python registry switcher, I needed to search the registry for references to my old Python version. Another good use for a Python script:

""" Search the Windows registry.
"""

import _winreg as reg
import itertools

RegRoots = {
    reg.HKEY_CLASSES_ROOT:   'HKEY_CLASSES_ROOT',
    reg.HKEY_CURRENT_USER:   'HKEY_CURRENT_USER',
    reg.HKEY_LOCAL_MACHINE:  'HKEY_LOCAL_MACHINE',
    reg.HKEY_USERS:          'HKEY_USERS',
    }

class RegKey:
    """ A handy wrapper around the raw stuff in the _winreg module.
    """
    def __init__(self, rawkey, root, path):
        self.key = rawkey
        self.root = root
        self.path = path
        
    def __str__(self):
        return "%s\\%s" % (RegRoots.get(self.root, hex(self.root)), self.path)
    
    def close(self):
        reg.CloseKey(self.key)

    def values(self):
        """ Enumerate the values in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumValue(self.key, ikey)
            except EnvironmentError:
                break

    def subkey_names(self):
        """ Enumerate the names of the subkeys in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumKey(self.key, ikey)
            except EnvironmentError:
                break
        
    def subkeys(self):
        """ Enumerate the subkeys in this key.
        """
        for subkey_name in self.subkey_names():
            if self.path:
                sub = self.path + '\\' + subkey_name
            else:
                sub = subkey_name
            yield OpenRegKey(self.root, sub)

def OpenRegKey(root, path):
    try:
        rawkey = reg.OpenKey(root, path)
    except Exception, e:
        #print "Couldn't open %r %r: %s" % (root, path, e)
        return None
    return RegKey(rawkey, root, path)

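def grep_key(key, target):
    """ Print every string value at or below key that contains target.
    """
    for name, value, typ in key.values():
        if isinstance(value, basestring) and target in value:
            print "%s\\%s = %r" % (key, name, value)

    for subkey in key.subkeys():
        # OpenRegKey returns None for keys that couldn't be opened.
        if subkey is None:
            continue
        grep_key(subkey, target)
        subkey.close()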

def grep_registry(args):
    for root in RegRoots.keys():
        grep_key(OpenRegKey(root, ""), args[1])

if __name__ == '__main__':
    import sys
    grep_registry(sys.argv)

Most of this is a pythonic wrapper around the _winreg module, with a few simple functions at the end to actually search the registry.

I forget what software first set up these associations, but I have .py files registered with Windows so that they can execute directly. The registry defines .py as a Python.File, which has a shell open command of:

"C:\Python24\python.exe" "%1" %*

My PATHEXT environment variable includes .py, so the command prompt will attempt to execute .py files, using the registry associations to find the executable.
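
If you want to check whether your own PATHEXT includes .py, here is a small sketch. It's written for Python 3, and `pathext_includes` is a helper name I made up for illustration. It only inspects the environment variable; it doesn't look at the registry associations.

```python
import os

def pathext_includes(extension, pathext=None):
    """Return True if `extension` (e.g. ".py") is listed in PATHEXT.

    `pathext` defaults to the PATHEXT environment variable, which is a
    semicolon-separated list such as ".COM;.EXE;.BAT;.PY".
    """
    if pathext is None:
        pathext = os.environ.get("PATHEXT", "")
    listed = [ext.strip().lower() for ext in pathext.split(";") if ext.strip()]
    return extension.lower() in listed

if __name__ == "__main__":
    print(pathext_includes(".py"))
```

The comparison is case-insensitive because PATHEXT entries are usually written in upper case.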

But: I wanted to switch from Python 2.4 to Python 2.5. That meant updating the registry in a handful of places. A Python script to the rescue!

""" Change the .py file extension to point to a different
    Python installation.
"""
import _winreg as reg
import sys

pydir = sys.argv[1]

# (registry path under HKEY_CLASSES_ROOT, value template) pairs.
# Raw strings keep the backslashes literal.
todo = [
    (r'Applications\python.exe\shell\open\command',
                r'"PYDIR\python.exe" "%1" %*'),
    (r'Applications\pythonw.exe\shell\open\command',
                r'"PYDIR\pythonw.exe" "%1" %*'),
    (r'Python.CompiledFile\DefaultIcon',
                r'PYDIR\pyc.ico'),
    (r'Python.CompiledFile\shell\open\command',
                r'"PYDIR\python.exe" "%1" %*'),
    (r'Python.File\DefaultIcon',
                r'PYDIR\py.ico'),
    (r'Python.File\shell\open\command',
                r'"PYDIR\python.exe" "%1" %*'),
    (r'Python.NoConFile\DefaultIcon',
                r'PYDIR\py.ico'),
    (r'Python.NoConFile\shell\open\command',
                r'"PYDIR\pythonw.exe" "%1" %*'),
    ]

# Set the default value of each key, substituting the real directory.
classes_root = reg.OpenKey(reg.HKEY_CLASSES_ROOT, "")
for path, value in todo:
    key = reg.OpenKey(classes_root, path, 0, reg.KEY_SET_VALUE)
    reg.SetValue(key, '', reg.REG_SZ, value.replace('PYDIR', pydir))
    reg.CloseKey(key)

Invoke this with your desired Python installation directory, and the registry is updated to point to it.

Note that this doesn't affect what the command "python" means: that's determined by your PATH environment variable. These registry settings change which Python executable is found when you invoke a .py file as a command.
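
Before letting a script loose on the registry, it can be reassuring to see what it would write. Here is a minimal dry-run sketch of the substitution step, written for Python 3. It touches no registry at all; `planned_changes` is a hypothetical helper, the list is a shortened copy of the one in the script, and `C:\Python25` is just an example directory.

```python
# A shortened copy of the (registry path, value template) pairs from the script.
TODO = [
    (r'Python.File\DefaultIcon', r'PYDIR\py.ico'),
    (r'Python.File\shell\open\command', r'"PYDIR\python.exe" "%1" %*'),
    (r'Python.NoConFile\shell\open\command', r'"PYDIR\pythonw.exe" "%1" %*'),
]

def planned_changes(pydir, todo=TODO):
    """Return the (path, value) pairs that would be written for `pydir`."""
    pydir = pydir.rstrip('\\')  # tolerate a trailing backslash
    return [(path, template.replace('PYDIR', pydir)) for path, template in todo]

if __name__ == '__main__':
    # Print the plan instead of writing anything.
    for path, value in planned_changes(r'C:\Python25'):
        print("HKEY_CLASSES_ROOT\\%s = %s" % (path, value))
```

Stripping the trailing backslash is my addition: the script as written would produce a doubled backslash if you passed a directory ending in one.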

We pushed new code to our production servers last week. There were a lot of changes, including our upgrade to Django 1.0. As soon as the servers restarted, they immediately suffered, with Python processes bloated to 2 GB or more of memory each. Yikes! We reverted to the old code, and began the process of finding the leak.

These are details on what we (Dave, Peter, and I, mostly them) did to find and fix the problem.

» read more of: A server memory leak... (34 paragraphs)

One of those simple typos that turns into an embarrassing public mistake: Cisco home page FAIL, where (it is theorized) a regex that should have had \t had only t, and as a result, all lowercase t's were removed from the page, breaking it completely.
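
Here is a small sketch of how that kind of typo plays out. Nobody outside Cisco knows what their code looked like, so the function names and the sample text are made up. The only point is the difference one backslash makes.

```python
import re

def strip_tabs(text):
    """Intended behavior: remove tab characters."""
    return re.sub(r"\t", "", text)

def strip_tabs_typo(text):
    """The theorized typo: without the backslash, the pattern is the letter t."""
    return re.sub(r"t", "", text)

if __name__ == "__main__":
    sample = "<title>\tNetworking that matters</title>"
    print(strip_tabs(sample))
    print(strip_tabs_typo(sample))
```

The second call leaves the tab in place and removes every lowercase t, including the ones inside the tag names.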

tagged: funny, web   /   via: Justin Mason» react

I really don't know what Apple is thinking. First they release a really cool phone: good. Then they release an SDK for it: also good. But developers aren't allowed to talk to each other about developing for the phone. That's bad. Doesn't Apple realize how developers learn? Then Apple sets up a store and keeps control over what apps can be sold there. Partly good (no malware can pollute the ecosystem), but partly bad (no one knows how Apple will decide what can be sold).

Then Apple started to reject apps from the app store, which is bad, because app developers only find out they've been rejected after they've expended all the effort to build the app, and it can be hard to predict whether an app will be rejected or not, making it risky to build iPhone apps.

After this breathtaking descent into cluelessness, Apple has topped itself by deciding that app rejections are subject to the non-disclosure agreement, making it illegal for developers to talk about the fact that their app has been rejected! Is Apple actively trying to discourage app development? Is there any other company that could act this way without raising the ire of the development community? This is the company that used Gandhi in an ad? What exactly is Apple thinking?

tagged: business, mac» 9 reactions

I've just bought a new car: a Honda Civic hybrid. I don't buy cars that often. The car I just replaced was a 1994 Civic. To keep the same pace, I'll add an entry to my calendar for 2022 to buy my next car.

I like the Civic for its gas mileage, 45 mpg highway. The extra expense over a non-hybrid Civic is actually more than I'll save on gas over the life of the car, but I like being the change I want to see in the world.

One thing that surprised me about this car is how familiar it felt after having driven a 1994 Civic. Lots of extra bells and whistles that I'd gotten used to in my wife's larger cars are still absent in this car.

Features in the hybrid I didn't have in my 1994 Civic (other than the hybrid engine):

  • A temperature setting in the climate control
  • Front seat map lights
  • A chime to alert me that I've left my headlights on
  • An auxiliary jack for the stereo
  • Electronic dashboard with thermometer, etc

Things that work in the hybrid that used to work in the 1994 Civic, but no longer do:

  • Remote entry buttons
  • Reliable low-speed wipers
  • Rear left passenger door handle
  • Exhaust system. The last thing that failed on the 94 was the exhaust. For its last two days, it sounded like a four-door Harley.

Fancy features the Hybrid doesn't have that my wife's car does:

  • Motorized seat adjustments with memory
  • Heated seats
  • Lighted mirrors in visors
  • Fold-in side mirrors
  • Leather seats
  • Separate temperature settings for driver and passenger
  • Individual lights for rear passengers

I'm pleased to have a new car that just works, and especially one that does so well on gas.

tagged: cars» 18 reactions

Having observed Hewlett-Packard from the inside for almost 18 months now, I'm struck by a paradox: our economy is a chaotic marketplace of capitalist competition, practiced and championed by corporations, but internally, companies are run as top-down, centrally-planned dictatorships. Why is that? Why isn't a company simply a microcosm of the larger economy?

Take the case of IT services: inside HP, there is a large IT organization, and they provide services to the rest of the company. When my group joined HP, we had no choice about how to get, for example, email service. The IT group provided email, and we used it. When we need to buy a laptop, there is one group that provides that service. When we need servers hosted, we have only one place to turn.

I'm sure the reason for this is the efficiency gained by eliminating redundancy. If there were two groups providing email services, surely one group could do the job of both, with less total staff, equipment, and so on.

That's certainly true, but then why don't we apply the same logic to the larger economy? After all, HP's email group has a huge overlap with Dell's, IBM's, Sun's, Microsoft's, and so on. Couldn't our economy gain by eliminating the overlap? When these questions are considered at the national level, we tout the increased efficiency produced by competition. The economy as a whole gains from the pressure competition puts on each company. Without competition, there is no incentive to improve, no reason to do your best. In a centrally-planned nationalized economy, incompetence is not punished, incentives are mis-aligned, and apathy takes over. There's no reason to improve because your customers have nowhere else to turn, poor service will not lead to loss of business, there's no price pressure, and your existence is guaranteed by the state.

That's logic that every capitalist believes, and we laugh at economies that have tried central planning and failed. So why doesn't the same logic hold inside companies? Why are monopolies and lack of competition not just accepted, but enforced? Don't we believe the same forces will be at work? Is there any compelling reason to improve if you have no competition?

Why couldn't a company have three IT groups (call them Red, Green, and Blue)? Each is separate, and lives or dies based on its ability to attract business from the rest of the company. When my group needs servers hosted, we shop around. Maybe Red is the deluxe service, and Blue is economy, and we've heard from friends that Green has the best service. For whatever reason, we choose one of them, and spend our internal dollars with them. The groups will compete, and that competition will force them to optimize and find the best solutions for their customers. If they don't, they will go out of business.

I know it seems wasteful to have all that going on inside a company. There will be duplication. But remember the capitalist logic: without competition, there's no reason to do your best. Just as with the larger economy, the duplication will be worth it because of the increased efficiency forced by competition. And without competition, your only option will be a poor one.

Of course, not all work inside corporations could be run this way. For example, legal departments deal with the outside world, and the corporation must speak with one voice there. But couldn't competition be used in at least some parts of large companies?

Where's the flaw in this logic? Why isn't competition inside corporations a good idea?

At work we upgraded to the shiny-new Django 1.0, and we had to make a lot of small changes in the process. Most were what you would expect: adapting to the 1.0 way from the older 0.96 code we had been using.

But some of them were undoing ad-hoc patches to Django that we had accreted over the two years we'd been banging away at it. Over the course of a week or so, we'd found dozens of things broken, pointing to work yet to be done to finish the 1.0 upgrade, just as you'd expect. We have a large code base, and many things changed between 0.96 and 1.0.

Yesterday, I couldn't log in on my dev server. Everyone else had been working just fine for the last few days, so it seemed mysterious. I asked our main Django guy Dave for help, and together we logged some session information, and saw that there was no session being established at all. He realized what the problem was. "Oh, I changed SESSION_COOKIE_DOMAIN back to a string, we don't use the list any more." Turns out it was one of our ad-hoc Django changes that we threw overboard, and my settings file still had the old setting in it.

This is where the software should have diagnosed itself. If the settings/main.py file had these two lines added to it:

if isinstance(SESSION_COOKIE_DOMAIN, list):
    raise Exception("SESSION_COOKIE_DOMAIN should be just a string now.")

Then I would have immediately gotten an exception on my server console (and browser) pointing to precisely what the problem was. I could have fixed it, and been running in two minutes, rather than being frustrated for half an hour, and bothering Dave for another ten minutes.
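
The same idea can be stretched to cover more than one setting. This is only a sketch, written for Python 3 and not something we actually run: `check_settings` and `RULES` are hypothetical names, and the settings are passed in as a plain dict so the example stands on its own without Django.

```python
def check_settings(settings, rules):
    """Raise a descriptive error if a setting has an outdated type.

    `settings` maps setting names to values. `rules` is a list of
    (name, allowed_types, message) tuples. Unset or None values are skipped.
    """
    for name, allowed_types, message in rules:
        value = settings.get(name)
        if value is not None and not isinstance(value, allowed_types):
            raise TypeError("%s is a %s: %s" % (
                name, type(value).__name__, message))

# One rule per change that might bite an old settings file.
RULES = [
    ("SESSION_COOKIE_DOMAIN", str, "it should be just a string now."),
]
```

Each time an infrastructure change invalidates an old setting, the cost of warning everyone is one more line in `RULES`.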

Our development team is small (five), and all sit next to each other most days of the week, so the cost of this sort of out of band communication about changes to infrastructure is small. Also, I seem to have been the only developer who had a list in their settings file. So perhaps the cost here was a total of about an hour. Not so much, but adding those two lines in the first place would have cost about five minutes. And in addition to the five developers, there are probably five other "development environments" floating around for other purposes: intern work, demos, backups, evaluation tarballs sent to other groups, etc, and who knows if those will have the same problem.

And besides the simple time spent, there's the loss of focus, the distraction of the other developers, the frustration, and so on. Developer attention is a very valuable resource. A speed bump like this in the road is like a CPU cache miss: your pipelines are flushed, and you have to re-focus. The time taken doesn't tell the whole story.

Yesterday was just one of those days, because later, I was entering a zipcode into my dev machine, and was consistently told that there were no facilities near that zipcode, even though I knew there should be.

Turns out that somehow, my database table of zipcodes was empty. We still don't know how that happened, but it would have been great if the software could have helped diagnose this anomalous condition. I changed this:

try:
    z = ZipCode.objects.get(pk=zipcode)
except ZipCode.DoesNotExist:
    raise KeyError

to this:

try:
    z = ZipCode.objects.get(pk=zipcode)
except ZipCode.DoesNotExist:
    if settings.DEBUG:
        # Sometimes the problem isn't one bad zipcode, but that there
        # are no zipcodes in the db at all!
        if ZipCode.objects.all().count() == 0:
            print "*** You have no zipcodes! Run bin/load_zipcodes.py"
    raise KeyError

It would have been another half-hour saved. I don't know how the zipcodes were deleted, so it's hard to guess how often someone will be in this position again, but I know it is worth it to add these sorts of diagnostics. I'll take a guess that the next time the zipcodes are missing will be five minutes before a critical demo, when everyone is panicky and no one will be able to think through the possible causes clearly. An unambiguous diagnostic will be very welcome.
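
Stripped of the Django specifics, the pattern looks something like the sketch below, written for Python 3. The `lookup` helper and its arguments are made up for illustration, with a plain dict standing in for the database table. Unlike my real change, it puts the diagnostic in the exception message instead of printing it.

```python
def lookup(table, key, hint, debug=True):
    """Look up `key` in `table`, with a louder error if the table is empty.

    `hint` says how to repopulate the table, and is only used in the
    diagnostic. `debug` stands in for settings.DEBUG.
    """
    try:
        return table[key]
    except KeyError:
        if debug and not table:
            # The problem isn't one bad key: there is no data at all.
            raise KeyError("%r not found, and the table is empty! %s"
                           % (key, hint)) from None
        raise
```

For example, `lookup({}, "02134", "Run bin/load_zipcodes.py")` raises a KeyError that names the likely cause, while a missing key in a populated table raises the ordinary KeyError.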

Take the time to make your software self-diagnosing. The more you can automate about the job of writing software, the better your software will be.

tagged: coding» 17 reactions

At work, there are security awareness posters that read,

HP is protected by you

A colleague, in a fit of linguistic pique, railed against the passive voice. He pasted a new message over the poster:

You protect us.

I suggested a more powerful version:

Protect us!

Or even,

Help!

Maybe something got lost along the way...

tagged: language» 4 reactions

OpenID is one of those web technologies I would love to love: it addresses a need, seems pretty well thought-out, and all the cool kids are doing it. But the fact is, it's still a bit too hard for what it's trying to be. When I first heard about OpenID, I read about it, and didn't quite get it. People kept talking about it, so I kept going back to read about it, and it still mystified me.

Big players started adopting it (AOL, Yahoo), so it seemed like it was here to stay, but I still didn't have the incentive to get over the learning curve. Earlier this week I visited yet another site that encouraged me to get an OpenID, and I decided I would finally cross OpenID off my list of technologies I should at least understand and probably use.

The simplest way to use OpenID is to pick a provider like Yahoo, go to their OpenID page, and enable your Yahoo account to be an OpenID. This in itself was a little complicated, because when I was done, I got to a page that showed me my "OpenID identifiers", which had one item in it:

https://me.yahoo.com/a/.DuSz_IEq5Vw5NZLAHUFHWEKLSfQnRFuebro-

What!? What is that, what do I do with it? Am I supposed to paste that into OpenID fields on other sites? Are you kidding me? Also, in the text on that page is a stern warning:

This step is completely optional. After you choose an identifier, you cannot edit or delete it.

(Emphasis theirs). So now I have a mystifying string of junk, with a big warning all over it that I can't go back. "This step" claims it's optional, but I seem to have already done it! Now I'm afraid, and I'm a technical person — you expect my wife to do this?

Luckily I can choose to enable other identifiers, so I also enable my flickr account as an OpenID.

Since I am a technical person, I've learned that OpenID supports delegation. That's a way to have your website act as an OpenID simply by adding some HTML to your page. The HTML points to another OpenID behind the scenes. That way, I can use nedbatchelder.com as my OpenID, and later be able to change who is actually hosting my OpenID.

Simon Willison shows the simple way to delegate your OpenID on your home page. You need the id you just got from your provider, and you need a URL for the provider's server. Oh, bad news: Yahoo won't say what their server's URL is. I can't delegate to Yahoo. Why? Don't know. Time to get another provider.

So I go to a more savvy provider, get an ID and a delegate server URL, edit my page, and I can't log in to my desired site. I must have messed something up. A good debugging tool for this is to log in to jyte.com. Since it was built by JanRain, the company behind a lot of OpenID, they helpfully provide very geeky error messages if the OpenID login fails for some reason. Turns out I had omitted one place in the HTML that I had to put my user id. Once I fixed that, all was well.
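
For reference, the delegation HTML amounts to two link tags in the page's head. Here is a sketch in Python that generates them. The `openid.server` and `openid.delegate` rel values are the OpenID 1.x ones (OpenID 2.0 adds `openid2.provider` and `openid2.local_id`). The function name is mine, and the URLs in the usage example are placeholders, not a real provider.

```python
from html import escape

def delegation_links(server_url, delegate_url):
    """Return the two <link> tags used for OpenID 1.x delegation.

    `server_url` is the provider's server endpoint, and `delegate_url` is
    the identifier you got from the provider.
    """
    return "\n".join([
        '<link rel="openid.server" href="%s">' % escape(server_url, quote=True),
        '<link rel="openid.delegate" href="%s">' % escape(delegate_url, quote=True),
    ])

if __name__ == "__main__":
    # Placeholder URLs: substitute the ones your provider gives you.
    print(delegation_links("https://openid.example.com/server",
                           "https://someuser.openid.example.com/"))
```

Both URLs matter: leave either one out and the login fails.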

But what have I really gained? Ted Dziuba exuberantly rants about OpenID, since it is why he hates the Internet, and his points are accurate: OpenID is still really difficult, and doesn't gain you that much.

Stefan Brands rounds up lots of issues with OpenID, and I think they need to be taken seriously. OpenID may be one of those Internet technologies that will be fabulous among the savvy and well-intentioned, but falters when spread to the wider population on the web.
