
The first servers

Wednesday 29 October 2008

Doing some research into the theory of load testing and traffic loads, I read about the Poisson distribution, which led to Agner Krarup Erlang, which led to early phone switches.

It's fascinating to realize that the work we do every day with web servers, which seems like a recent modern technology, was predated by guys like Erlang working with early phone switches over 100 years ago. Phone switches were the first servers: central machines connected to a large number of potential clients. In building these switches, the early engineers had to figure out from scratch how to anticipate the possible work load, so they could build switches large enough but not too large. The whole of queueing theory springs from the theories worked out by telephone switch engineers.

And they were clever guys, even adjusting the UI to lighten the load on the switches. When dialing a rotary phone, the particular digits determined how long the switch was engaged before the call could be routed. So when they allocated area codes,

the biggest population areas [got] the numbers that took the shortest time to dial on rotary phones. That is why New York City was given 212, Los Angeles given 213, Chicago 312, and Detroit 313, while Vermont received 802 (a total of 20 clicks, 8+10+2). Four areas received the then-maximum number of 21 clicks: South Dakota (605), North Carolina (704), South Carolina (803), and Nova Scotia/Prince Edward Island in the Canadian Maritimes (902).
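The click arithmetic is easy to check with a few lines of Python. This is only a sketch: dial_clicks is a name I made up for illustration, and it assumes the rotary convention implied by the 8+10+2 example, where each digit sends that many pulses and 0 sends ten.

```python
def dial_clicks(area_code):
    """Return the number of rotary-dial clicks needed to dial a string of digits.

    Assumes digits 1-9 send that many pulses and 0 sends ten.
    """
    return sum(10 if digit == "0" else int(digit) for digit in area_code)


# The area codes mentioned in the quote above.
clicks = dict((code, dial_clicks(code)) for code in
              ["212", "213", "312", "313", "802", "605", "704", "803", "902"])
```

By this count New York's 212 is the cheapest possible code at 5 clicks, and the four 21-click codes sit at the other end of the scale.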

Tabblo is hiring: web front-end developer

Monday 27 October 2008

Tabblo is looking for a web front-end developer, an HTML guru who can turn mock-ups into functioning web pages, isn't afraid of Javascript, has dealt with the dynamic nature of web 2.0 applications, and can function in a (cliché alert) dynamic, fast-paced environment.

Tabblo, a part of Hewlett-Packard, is the group that built Tabblo.com, but has also created a half-dozen other web experiences exploring the boundaries between online and offline content. It's a fun group doing fun things.

Here's a link to a full job posting. If you apply, drop me a line to let me know...

Funkload ftw

Saturday 25 October 2008

We are getting ready to roll out a new service at work, and are particularly concerned about the traffic levels it could receive, so we embarked on a quick-and-dirty load test.

I took a quick look at the open source tools available, and most are only capable of hammering on a URL, or on a list of URLs. Because our application is dynamic, with newly-created unique IDs being passed from URL to URL, we needed more expressive power than a data file of URLs could give us.

Then I found funkload. As the name (sort of) suggests, it can do both functional and load testing, and that is where its power lies. The functional testing focus means that you write your tests as Python unit tests, with the full power of Python and webunit to build the test case. Funkload can run the tests like functional tests (run each once and see if they succeed), or like load tests (run just one, but over and over).

I wrote a test case that runs through the entire experience end-to-end, about a dozen URLs, screen-scraping the returned HTML where I needed to extract dynamic data for use in the next URL. Funkload then ran the test on many threads at a measured rate, producing a report on tests per second, successes and failures, and so on. Looking at funkload's numbers and the server stats during the test, we could determine the traffic load our servers are capable of.
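To give a flavor of the screen-scraping step, here is a minimal sketch in plain Python. It is not funkload's API and not our actual test: the HTML snippet, the album_id field, the URL pattern, and the helper names are all made up for illustration.

```python
import re

# Hypothetical pattern: a hidden form field carrying the newly created ID.
ID_PATTERN = re.compile(r'name="album_id"\s+value="([^"]+)"')


def extract_id(html):
    """Pull the dynamic ID out of a returned page, or fail loudly."""
    match = ID_PATTERN.search(html)
    if match is None:
        raise ValueError("no album_id found in page")
    return match.group(1)


def next_url(base, html):
    """Build the URL for the following step from the scraped ID."""
    return "%s/album/%s/edit" % (base, extract_id(html))


# A stand-in for the HTML returned by the previous request.
page = '<form><input type="hidden" name="album_id" value="a1b2c3"></form>'
url = next_url("http://example.com", page)
```

Failing loudly when the ID is missing matters in a load test: a page that silently stops returning the expected markup should count as a failure, not as a fast success.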

Funkload isn't the fanciest load tester out there, and if you only need to hammer on a home page, it's more than you need, but for load testing dynamic applications, it's just right.

Animated sorting algorithms

Saturday 25 October 2008

David Martin has a great page showcasing Animated Sorting Algorithms. Eight different algorithms are shown sorting four different input sets, animated before your very eyes, so that you can see how the different algorithms behave. Not only is this a good looking site, but he has the right pedagogical goals:

These visualizations are intended to:

  • Show how each algorithm operates.
  • Show that there is no best sorting algorithm.
  • Show the advantages and disadvantages of each algorithm.
  • Show that worst-case asymptotic behavior is not the deciding factor in choosing an algorithm.
  • Show that the initial condition (input order and key distribution) affects performance as much as the algorithm choice.

Contrary to some reports, the animations are not coded in Javascript. They are animated gifs, but that's a fine tradeoff. Professor Martin is spending his server's bandwidth to save browsers around the world from reproducing the same computations over and over again.

Authonomy

Sunday 19 October 2008

Authonomy is a great use of familiar web technologies to help the remarkably backward world of traditional publishing. HarperCollins took a problem of theirs, trying to separate the wheat from the chaff in their unsolicited manuscripts, and has delegated it to the internet. It's almost obvious in retrospect: their editors don't have the time to read through every manuscript sent to them, and the internet is full of book lovers who would gladly read and rate new work, even if it is incomplete.

Once the site is out of beta, the top five manuscripts each month will get the attention of a HarperCollins editor. That's certainly carrot enough for hopeful writers to put energy into the site.

The last question in the Authonomy FAQ explains it well:

Why won’t HarperCollins read all the manuscripts itself, instead of enlisting internet users to help them?

HarperCollins, like all publishers, is inundated with new manuscripts, and cannot hope to consider them all fairly. We don’t feel that our current, closed 'slush pile' system is fair to authors themselves — nor do we believe it is giving us the best chance of finding the brightest new talent. authonomy is a genuine attempt to find a better way to determine the books on our shelves — and it hands selective power to the readers who will ultimately be buying them.

Python mystery #6237: solved

Sunday 19 October 2008

I was refactoring some code today, and couldn't figure out why my simple textbook change was making code break, until I relearned a subtlety of Python.

The code looked like this:

def __eq__(self, other):
    # Compare the parts, but be clever when comparing the specs.
    if self.__class__ != other.__class__:
        return False
    return self.spec.make_div_only() == other.spec.make_div_only() and self.chunks == other.chunks

I didn't like that long last line, so I made the simple change to this:

def __eq__(self, other):
    # Compare the parts, but be clever when comparing the specs.
    if self.__class__ != other.__class__:
        return False
    if self.spec.make_div_only() != other.spec.make_div_only():
        return False
    return self.chunks == other.chunks

and ran my unit tests, and they failed. What!? How could doing a simple boolean refactoring cause breakage? I scratched my head, re-read the lines, questioned my understanding of the short-circuit nature of the and operator, and so on. The usual "am I going crazy?" debugging techniques.

Undoing the refactor made the tests work again. I changed it again to the shorter lines, and the tests failed again. Adding some print statements to see the actual values being compared, I realized that the result of make_div_only is an object (of class Spec), and that object defines its own Spec.__eq__ method to define the meaning of the == operator for its instances.

Then it hit me: the class doesn't define a __ne__ method. My refactoring changed the operator from == to !=. The first was overridden to provide custom semantics, but the second was not, so != fell back to the default identity-based comparison. Two distinct Spec objects therefore always compared unequal, the make_div_only inequality test was always true, and the method incorrectly always returned False.

The Python docs are clear on this point:

There are no implied relationships among the comparison operators. The truth of x==y does not imply that x!=y is false. Accordingly, when defining __eq__(), one should also define __ne__() so that the operators will behave as expected.

Adding a __ne__ method to my Spec class made everything work properly:

def __ne__(self, other):
    return not self.__eq__(other)
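Here is a small self-contained sketch of the fix in context. The Spec class below is a stand-in, not my real one, and its parts attribute is invented for the example. Note that this pitfall is specific to Python 2: in Python 3 the default __ne__ delegates to __eq__ and inverts the result, so the explicit method is harmless there but no longer required.

```python
class Spec(object):
    """Stand-in for the real Spec class; `parts` is a made-up attribute."""

    def __init__(self, parts):
        self.parts = parts

    def __eq__(self, other):
        if self.__class__ != other.__class__:
            return False
        return self.parts == other.parts

    def __ne__(self, other):
        # Needed in Python 2 so that != agrees with ==.
        return not self.__eq__(other)


a = Spec(["div"])
b = Spec(["div"])   # a distinct object, equal by value
c = Spec(["span"])  # a distinct object, different by value
```

With both methods defined, a != b is False and a != c is True, so the refactored early-return version of the outer __eq__ behaves the same as the original one-liner.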

Ugly pages

Tuesday 14 October 2008

I can't explain why exactly, but I am fascinated by a pair of truly horrendous web pages I ran across in the last week.

  • Opening Page is a screeching roll of outrageous claims of wealth that will befall whoever buys the domain openingpage.com. He can't really believe this stuff, can he?
  • I think HavenWorks is trying to be a helpful news site, but with the oops-wrong-palette color scheme, rows of tiny photos with blue link borders, and razor-thin columns running down the page, it's a usability nightmare, a case study in all the ways not to make a web page.

When I was growing up in New York City, Carvel ice cream had commercials on TV with Tom Carvel himself doing the narration. These commercials stood out even to children as having remarkably low production values. The theory at the time was that it was done intentionally to make an impression. Are we seeing the same tactic with these web sites? Or am I reading too much into it, and these are simply horribly designed?

In the case of HavenWorks, the owner seems proud of his distinction, since at the bottom of the page is this notice:

!-! Nominated for Most Poorly Designed Website in the World by Digg.com

and he even has a page of his other designs.

3 down, 47 to go

Saturday 11 October 2008

Connecticut has joined the ranks of states allowing gay marriage, good for them. The process was similar to Massachusetts and California: couples sue for the right to marry, eventually the state Supreme Court finds that either existing laws don't preclude gay marriage, or the state constitution won't allow distinguishing between straight and gay couples. I for one am glad. I believe that eventually this will be accepted across the country, and people will wonder what the fuss was about. Those predicting the downfall of society will be proven wrong. We continue to have thriving families here in Massachusetts even after four years of gay marriage.

For a vibrant "debate" on the issue, check out the comments on Hot Air's post about the news. The post itself, while disagreeing with the decision, does a good job analyzing the legal arguments in it. The comments, though, consist mostly of people hurling invective at each other, no one being swayed by either side's arguments.

This decision will bring the usual complaints of judicial activism (actually, they were interpreting the constitution, that's their job), the collapse of morality (how exactly?), harm to families (by creating more of them? I don't get it), the disenfranchisement of the people (the whole point of judges is to decide independently of public opinion) and so on. To all of them I say, open your eyes and close your mouths. Everything is fine. The boogey-man of gay marriage simply doesn't exist.

Five thirty eight

Monday 6 October 2008

We are in full swing now in the presidential campaign, and we are constantly bombarded with poll numbers. Funny thing is, most of those polls are just national polls, a prediction of how the nation-wide popular vote will turn out. But as the 2000 election underscored, that doesn't matter at all: what matters is the electoral vote. To predict that, you'd have to track individual state-by-state polls to see who wins the popular vote in each state, and compute the electoral vote totals. Sounds like a lot of work, but FiveThirtyEight.com (Electoral Predictions Done Right) has done all the work already. They also run statistical simulations to predict the likelihood of various outcomes (for example: the chance of McCain losing the popular vote but winning the election is 1.7%).

Add extensive tables of data detailing the poll data, the simulations, their predictions, maps of outcomes, more of the same for congressional races, and so on, and you have a quantitative political junkie's dream site.

BTW, as of this moment, they predict an Obama win, with 339 electoral votes to McCain's 199.

And they aren't the only game in town: there's also Electoral-vote.com (currently predicting a 329 over 194 win for Obama), and Election Projection (364 to 174 for Obama).
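The basic idea of tallying electoral votes state by state and running simulations can be sketched in a few lines. To be clear, this is a toy and not how any of these sites actually work: the three states, their vote counts, and their win probabilities are placeholders rather than real polling data, the function names are mine, and the states are treated as independent, which a serious model would not necessarily assume.

```python
import random

# Placeholder data: state -> (electoral votes, probability candidate A wins it).
STATES = {
    "State X": (10, 0.9),
    "State Y": (7, 0.5),
    "State Z": (4, 0.2),
}


def simulate_once(states, rng):
    """Run one simulated election; return candidate A's electoral votes."""
    return sum(votes for votes, p_win in states.values()
               if rng.random() < p_win)


def win_probability(states, trials=10000, seed=0):
    """Estimate how often candidate A wins a majority of electoral votes."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    total = sum(votes for votes, _ in states.values())
    wins = sum(1 for _ in range(trials)
               if simulate_once(states, rng) * 2 > total)
    return wins / float(trials)


estimate = win_probability(STATES)
```

Even in this toy, the candidate can carry the biggest state and still lose, because the other two states together hold a majority. That is the kind of outcome a single national poll number can't show.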

Aptus 2.0

Sunday 5 October 2008

Aptus 2.0, the latest version of my Mandelbrot explorer, is now available. It's got a lot of improvements over the previous version, including speed improvements, multiple top-level windows, tool windows for displaying information, and Julia set support.

fractal image from the Mandelbrot set

It's built with wxPython, so it runs on Windows, Linux, and Mac.

Python registry grepper

Thursday 2 October 2008

In writing the python registry switcher, I needed to search the registry for references to my old Python version. Another good use for a Python script:

""" Search the Windows registry.
"""

import _winreg as reg
import itertools

RegRoots = {
    reg.HKEY_CLASSES_ROOT:   'HKEY_CLASSES_ROOT',
    reg.HKEY_CURRENT_USER:   'HKEY_CURRENT_USER',
    reg.HKEY_LOCAL_MACHINE:  'HKEY_LOCAL_MACHINE',
    reg.HKEY_USERS:          'HKEY_USERS',
    }

class RegKey:
    """ A handy wrapper around the raw stuff in the _winreg module.
    """
    def __init__(self, rawkey, root, path):
        self.key = rawkey
        self.root = root
        self.path = path
        
    def __str__(self):
        return "%s\\%s" % (RegRoots.get(self.root, hex(self.root)), self.path)
    
    def close(self):
        reg.CloseKey(self.key)

    def values(self):
        """ Enumerate the values in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumValue(self.key, ikey)
            except EnvironmentError:
                break

    def subkey_names(self):
        """ Enumerate the names of the subkeys in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumKey(self.key, ikey)
            except EnvironmentError:
                break
        
    def subkeys(self):
        """ Enumerate the subkeys in this key.
        """
        for subkey_name in self.subkey_names():
            if self.path:
                sub = self.path + '\\' + subkey_name
            else:
                sub = subkey_name
            yield OpenRegKey(self.root, sub)

def OpenRegKey(root, path):
    try:
        rawkey = reg.OpenKey(root, path)
    except Exception, e:
        #print "Couldn't open %r %r: %s" % (root, path, e)
        return None
    return RegKey(rawkey, root, path)

def grep_key(key, target):
    for name, value, typ in key.values():
        if isinstance(value, basestring) and target in value:
            print "%s\\%s = %r" % (key, name, value)

    for subkey in key.subkeys():
        if not subkey:
            continue
        grep_key(subkey, target)
        subkey.close()

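def grep_registry(args):
    for root in RegRoots.keys():
        rootkey = OpenRegKey(root, "")
        if not rootkey:
            # Skip a root we couldn't open instead of crashing in grep_key.
            continue
        grep_key(rootkey, args[1])
        rootkey.close()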

if __name__ == '__main__':
    import sys
    grep_registry(sys.argv)

Most of this is a pythonic wrapper around the _winreg module, with a few simple functions at the end to actually search the registry.

Switching python versions on windows

Wednesday 1 October 2008

I forget what software first set up these associations, but I have .py files registered with Windows so that they can execute directly. The registry defines .py as a Python.File, which has a shell open command of:

"C:\Python24\python.exe" "%1" %*

My PATHEXT environment variable includes .py, so the command prompt will attempt to execute .py files, using the registry associations to find the executable.

But: I wanted to switch from Python 2.4 to Python 2.5. That meant updating the registry in a handful of places. A Python script to the rescue!

""" Change the .py file extension to point to a different
    Python installation.
"""
import _winreg as reg
import sys

pydir = sys.argv[1]

# Registry paths are raw strings so the backslashes are never read as escapes.
todo = [
    (r'Applications\python.exe\shell\open\command',
                '"PYDIR\\python.exe" "%1" %*'),
    (r'Applications\pythonw.exe\shell\open\command',
                '"PYDIR\\pythonw.exe" "%1" %*'),
    (r'Python.CompiledFile\DefaultIcon',
                'PYDIR\\pyc.ico'),
    (r'Python.CompiledFile\shell\open\command',
                '"PYDIR\\python.exe" "%1" %*'),
    (r'Python.File\DefaultIcon',
                'PYDIR\\py.ico'),
    (r'Python.File\shell\open\command',
                '"PYDIR\\python.exe" "%1" %*'),
    (r'Python.NoConFile\DefaultIcon',
                'PYDIR\\py.ico'),
    (r'Python.NoConFile\shell\open\command',
                '"PYDIR\\pythonw.exe" "%1" %*'),
    ]

classes_root = reg.OpenKey(reg.HKEY_CLASSES_ROOT, "")
for path, value in todo:
    key = reg.OpenKey(classes_root, path, 0, reg.KEY_SET_VALUE)
    reg.SetValue(key, '', reg.REG_SZ, value.replace('PYDIR', pydir))

Invoke this with your desired Python installation directory, and the registry is updated to point to it.
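If you'd rather see what would be written before touching the registry, the substitution step can be tried on its own. This is a sketch under assumptions: preview is a hypothetical helper that isn't part of the script, the list is a shortened copy of todo, and C:\Python25 is just an assumed install directory.

```python
# A shortened copy of the todo list from the script above.
TODO = [
    (r'Python.File\DefaultIcon',
                'PYDIR\\py.ico'),
    (r'Python.File\shell\open\command',
                '"PYDIR\\python.exe" "%1" %*'),
]


def preview(todo, pydir):
    """Return (registry path, new value) pairs without touching the registry."""
    return [(path, value.replace('PYDIR', pydir)) for path, value in todo]


# Assumed install directory; substitute your own.
changes = preview(TODO, r'C:\Python25')
```

Each pair in changes holds a registry path and the value the script would set for it, so the open command for Python.File comes out in the same shape as the Python24 one shown earlier.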

Note that this doesn't affect what the command "python" means. That's determined by your PATH environment variable. These registry settings change which Python executable is found when you invoke a .py file as a command.
