|Ned Batchelder : Blog | Code | Text | Site|
A longish tale of debugging a failed server assertion, in which I suspect revelation of deep mysteries, but in the end am merely graphically reminded of what everyone knew all along.
CNN reports that Single-letter domains might earn seven figures. Really? Do people still believe these domains are that valuable? Why? Some are already in use. For example, q.com belongs to Qwest, and x.com goes to Paypal. Are these companies reaping some benefit from these domain names? Would companies really pay a million dollars for other such names? Go figure.
Two interesting pieces: Tim O'Reilly on What Is Web 2.0, and Evan Williams on Ten Rules for Web Startups, though most of them have nothing to do with the Web. I intend to continue to work for startups, and I've always got my eye on Web technologies, so these are good manifestos to keep at hand.
Over the years, a number of people have asked how I produce this site. The process was never architected and polished, so I've been reluctant to throw open the curtains. But I'm not ashamed of it, so here it is: Xuff, px, bx, etc. There's a brief explanation there, and a download of tools and site sources. Enjoy!
For Your-Welcome-Giving Day, some leftovers:
¶ Interpreting the Data: Parallel Analysis with Sawzall. Who knew Rob Pike was at Google, creating new languages, and with a cool email address of r at google dot com?
¶ All the water and all the air, nifty visualization.
¶ wish jar journal: fun with stickers, cute photos.
¶ Comparison of different SQL implementations, these tables may come in handy some day.
¶ Best Word Book Ever, how Richard Scarry's classic has been updated to be more politically correct.
Mitt Romney, the Republican governor of Massachusetts, recently had a death penalty bill defeated. I don't think anyone was surprised by the outcome. Romney did it for political reasons. Not that he doesn't want a death penalty law, I'm sure he does. But he knew it would be defeated. He proposed it because he is going to be making moves on the national political stage, and he wants some conservative credentials to talk up in the red states. Now he'll be able to say that he pushed for the death penalty in Massachusetts.
I don't know what Romney is thinking when it comes to national politics anyway. Not only is he a RINO, but he's from Massachusetts, the bluest of the blue states. Here's my prediction for how the Republican primary debates will go:
BTW: in other death penalty news, even trigger-happy Texas is realizing the death penalty is prone to error: Executed man may have been innocent. In my mind, this is what is wrong with the death penalty. No matter how many safeguards are in a bill, it is still people who apply judgement all along the line, and those people may not behave impartially, honestly, or flawlessly. Abuses of power, however slight or even unintentional, happen all the time. The death penalty is too final.
It's interesting to see how Amazon is opening up their infrastructure as web services. I just heard about Amazon Mechanical Turk, which is subtitled "Artificial Artificial Intelligence". It's a clever idea: if you want a computer to do a task automatically, but the job is too hard for a computer, then farm it out to humans. Kind of like SETI@home in reverse. Where SETI@home farms out difficult computations to lots of computers (which are good at computation), Mechanical Turk farms out difficult image recognition to lots of humans (which are good at image recognition).
I'd heard of this approach before, for example, a security firm handing out images from security cameras, asking people to tell whether there's a person lurking in the photo someplace. Amazon seems to have set this up to find businesses in photos for their BlockView maps.
Then once they had it set up, they exposed it via a Mechanical Turk web service. It's a great idea. Since they had the infrastructure built, building a web service opened up their marketplace of human workers to others who wanted to provide work and needed to find people. That will bring more human workers (since there is more work there of a more varied nature), which will bring competition among work providers to pay well, which will bring more workers. Capitalism in action. It'll be interesting to see if there are enough businesses with work needing to be done to give it critical mass.
I was talking to a fellow Python enthusiast the other day, and the topic came up of Python's great dynamic nature. I said that I didn't think it had had that big an effect on me. He was surprised. "You don't like dynamic typing?" he asked. It's not that. Looking back over my experience with Python, it's not the dynamic typing that has affected me most. Let me explain:
Python fits my expectations: the dynamic typing was simply what my primitive monkey coding brain expected. There's a strong contrast to C++'s static-typing friction, and include file hell, but because of my mental expectations, that contrast feels like a point taken from C++, not a point given to Python.
When working in C++, what I miss from Python more than dynamic typing is its easy-to-use built-in data types like lists and dictionaries. The amount of noise you have to introduce into your code when using the STL is just staggering. It's a true impediment to progress. Let's say I have an A and want a B, and there's a simple mental path between the two (string to split array, array to joined string, and so on). In Python (and most other "scripting" languages), it's a simple expression that makes sense. In C++ or Java, it's three or four lines of idiomatic yet still not memorizable code. I find myself having to go and find the last place I did it in the code, and copying.
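As a small illustration of the "simple expression" point, here is the string-to-list and list-to-string round trip in Python. The sample data is made up for the example.

```python
# Illustrative data only; any delimiter-separated string works the same way.
line = "alpha,beta,gamma"

parts = line.split(",")    # string -> list of strings
joined = "-".join(parts)   # list of strings -> string

print(parts)    # ['alpha', 'beta', 'gamma']
print(joined)   # alpha-beta-gamma
```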
Also, I haven't yet used a lot of Python's deeper mysteries: metaclasses, decorators, and so on. I've benefited from them in other people's code (like SQLObject's magic), but I haven't made use of them myself. I've built much larger systems in C++ and Java than I have in Python, so I haven't yet needed some of the raw power in Python.
But if I had to pick the biggest effect Python has had on me as a programmer, it's been about testing. My experience writing automated tests in Python projects is that I am freed from fear. I don't have to worry about forgotten corners of the code. I've tried to apply that mindset in non-Python projects. The change in philosophy about testing is a much bigger change in me because I can apply it outside of Python. I write code differently, and approach development differently. And I can use those concepts in C++ code, whereas dynamic stuff is left behind with the Python.
So no, Python's dynamic nature is not its defining characteristic for me. And to be perfectly honest, I have a lot of static typing habits that I'm still trying to break.
Peter Callesen makes paper sculptures cut from a single sheet of paper. He leaves the leftover paper as a base so you can marvel at his skills. Some are intricate, some simple and poetic. Amazing. (Boing Boing linked to the second site as a Japanese site: it is Korean.) He has also made floating castles of styrofoam.
In a posting about bug triage, Bob also linked to the Wikipedia page about made-up words in The Simpsons. It's a funny list, but I was surprised that Professor Frink's signature word, "flavin", was missing. Maybe it's because flavin is actually a word?
Tomorrow is Nat's birthday, and tonight we made him a swimming pool birthday cake, since he likes to swim:
The lane lines are Smarties (which it turns out don't taste good with cake and frosting), the diving board is a graham cracker on a fun-size Snickers base. The droopy things on the left are supposed to be flags suspended over the pool, but the Starburst and candy-corn flags were too heavy for the Tootsie Roll supports! The Lego figures were created by Max and Ben, who came up with the genius idea of having partial figures to represent partially submerged swimmers.
Last year, I wanted a parser generator for a Python program I was working on, and I went in search of a tool. As is typical in the Python world, I found many, of varying degrees of maturity, activity, and depth. To make sense of it all, I compiled a list of Python parsing tools and put it on my site.
Since then, many people have found it. About ten people a day come to this site with search queries like "python parsing", "bison for python", "lexical analyzer python", and the like.
Recently, though, Robert Kieffer wrote to me, asking me to do the Right Thing:
Last week, my post about the French riots drew more comments (37!) than any other post on this blog ever. Also, during the time when most of those comments were accumulating, I was getting about twice the number of hits as usual. Looking at the stats, it looks like a lot of it was people refreshing the comment page to see if there were any new ones.
So I've added a new feature to my home-grown comment system. Now at the bottom of the form is a checkbox: Email me future comments. If you check it and provide an email address, then all comments after yours on that post will be emailed to you. Please try it out, and let me know if you have any problems.
I wrote two weeks ago about Google cancelling my AdWords account. Since then, Google responded to my announcement of cutting off their payment. They sent me an email that apologized for my frustration and said, "Enter a new credit card number, and your account will be reactivated." I was skeptical, but tried it. After two days with the new number, the account was still not reactivated.
I sent another angry email that questioned whether they were even reading these emails properly. Finally, they responded saying the account had been reactivated, and indeed it has. So far, it has not been shut down again.
So in a way, it is a happy ending: my ads are running, as I wanted them to. But I still never got an answer about why the account was closed in the first place, and it would have been good service for them to say something like, "We're sorry for the mess-up, here's a week of ads free." At the very least, they could have admitted it was a mistake, and apologized. So I'm reserving judgement on the whole experience.
Ben, my seven-year-old, loves to tell us about the imaginary world he's devised, called Stickfus, so-named because it is populated with stick figures. He's meticulous about the details, and a bit impatient with the rest of us when we don't remember things he's told us before.
So when I found The Big Big Big Book of Tashi, I knew it was for us. Tashi is the creation of a boy named Jack, who tells the stories to his parents, who don't always follow the story line properly ("Dad! You always ask the wrong questions!").
The stories take place in a foreign old-world vaguely Chinese land populated by dragons, ghosts, warlords, witches, and the like. Tashi is a clever can-do fellow who always outsmarts the bad guys with little more than his own optimistic good will. The stories are short enough to read in one bedtime sitting, but meaty enough to hold a 2nd-grader's attention, or to be read by the child himself.
The illustrations are an awesome bonus: they are surefooted pencil drawings full of character, one or two on each page. The Tashi series consists of a dozen short books. The Big Big Big Book collects the first seven into a chunky volume. The series comes from Australia, which may explain why Tashi is not better known here. He should be.
For you math geeks out there: have you ever wanted to know what was unusual about a particular number and had nowhere to go? What's Special About This Number? can tell you. For example, 1789 is the smallest number with the property that its first 4 multiples contain the digit 7. And that's one of the more ordinary factoids in the list...
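A claim like the 1789 one is easy to poke at with a brute-force search. This is just a sketch: the function names and the search limit are my own choices for the example, and I'm taking "first 4 multiples" to mean n, 2n, 3n, and 4n.

```python
def multiples_contain_digit(n, digit="7", count=4):
    """True if n, 2n, ..., count*n all contain the given digit."""
    return all(digit in str(k * n) for k in range(1, count + 1))


def smallest_with_property(limit=10_000):
    """Brute-force search; returns None if nothing is found below limit."""
    for n in range(1, limit):
        if multiples_contain_digit(n):
            return n
    return None


print([k * 1789 for k in range(1, 5)])   # [1789, 3578, 5367, 7156]
print(smallest_with_property())
```

The first line of output shows that 1789 does have the property; the second reports the smallest such number the search finds, which is the part of the claim worth checking for yourself.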
One of the difficult things about programming in Windows, and especially with COM, is that when you receive an error code, it can be difficult to find the message that goes with it. I was writing some new LDAP code today against Active Directory, and got this message:
Couldn't read LDAP property member: COM exception 0x80005010
What does it mean? The "Couldn't read LDAP property member" was my text, the 0x80005010 part is the HRESULT. Our error handling code tries to find a text message, but it couldn't, and so made do with just the hex. I fired up errlook, the Microsoft error code lookup tool, but it couldn't find a message either. Using DevStudio, I searched all of the include files for "80005010". It found this section of adserr.h:
OK, now I knew what the error meant, and I had a symbol to use in my code if I wanted to handle it specially. But I'd like to display error messages at run time. How can I get that text message automatically? The errlook tool couldn't find the message, so it was natural that my application couldn't, but could I look further?
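As an aside for anyone who hasn't stared at HRESULTs before: the hex value packs several fields together. Here is a minimal Python sketch of pulling one apart, assuming the standard layout (failure flag in the top bit, an 11-bit facility, a 16-bit code). The helper name is made up for this example and isn't part of any Windows API.

```python
def split_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    hr &= 0xFFFFFFFF                 # normalize negative ints to unsigned
    failed = bool(hr >> 31)          # severity bit: 1 means failure
    facility = (hr >> 16) & 0x7FF    # 11-bit facility field
    code = hr & 0xFFFF               # low 16 bits: the error code proper
    return failed, facility, code


failed, facility, code = split_hresult(0x80005010)
print(failed, facility, hex(code))   # True 0 0x5010
```

This only decodes the bit fields. It says nothing about which DLL holds the message text, which is the harder part of the problem.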
Windows stores error messages in DLLs, and the FormatMessage function can take an explicit module handle to a module containing error messages. My application already has a list of modules it searches when looking for error messages. If I could find the DLL with this message, I could get a nice message next time.
To find the error message, I resorted to the blunt instrument of the command line tool strings. It searches binary files for printable strings. Here I've told it to search the entire file (rather than interpret the executable format), look for a minimum string length of 10, print the file name along with each string, and look for little-endian 16-bit characters:
$ cd c:\windows\system32
Aha! ActiveDS.DLL is my guy. I registered it into my application's search list of error modules, ran the app again, and got a nice error message:
Couldn't read LDAP property member: COM exception 0x80005010:
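The kind of search I asked strings to do can be roughly approximated in a few lines of Python. This is a sketch of the idea, not a reimplementation of the tool: it looks for runs of printable ASCII characters stored as little-endian 16-bit characters, with a minimum length of 10. The function name and the sample bytes are invented for the example.

```python
import re


def utf16le_strings(data, min_len=10):
    """Yield runs of at least min_len printable ASCII characters stored as
    little-endian 16-bit characters (each byte followed by a zero byte)."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("utf-16-le")


# Made-up binary blob: junk bytes, one long string, more junk, one short string.
sample = (
    b"\x00\x01\x02"
    + "sample message text".encode("utf-16-le")
    + b"\xff\xff"
    + "short".encode("utf-16-le")
)

print(list(utf16le_strings(sample)))   # ['sample message text']
```

To scan a real file, read it in binary mode and pass the bytes to the function. Like the command-line search, this treats the file as a flat run of bytes rather than interpreting the executable format.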
An awesome table of all of the known CSS Hacks, and which browsers see which. If you don't know what I'm talking about: CSS Hacks are the dirty underbelly of Cascading Style Sheets, where the differences in CSS implementation in the different browsers are worked around by exploiting differences in the parsers to have certain rules exposed only to certain browsers. Yuk!
Dan Bricklin has announced his next Software Garden product: WikiCalc. I haven't tried it, but it looks like a client-side wiki thing with embedded spreadsheet behaviour. I'm fascinated by the addition of structure to wikis. At work, I installed TWiki, which has a wide array of features for dealing with wiki pages in organized ways. It reminded me a little bit of Notes, the way documents carry structured information, and the system provides ways to add little bits of "code" to pages to build applications. TWiki doesn't rise to Notes' level of programmability, but is far more flexible in the ways parts can be combined.
I know there are others as well: JotSpot comes to mind, but I've never used it. I'd love to build a structured wiki: it's in a sweet spot at the intersection of document production, web technologies, and developers as customers.
There is a great deal of violence in the world today caused by Muslims. Many of the perpetrators attribute their violence to their religion, or to their desire to defend it against perceived attacks. There's no denying that the rioters in France are young Muslims.
But they aren't rioting because they are Muslim. They are rioting because they are an underclass in France, they are experiencing 40% unemployment, they are reviled, they are poor, and they feel disenfranchised. They aren't considered French, although they were born and raised in France, and they aren't considered Northern African, because they've never lived there. There are deep difficult issues at work here.
I'm not excusing what they are doing: the rioters, whatever brought them to this point, are now violent criminals, and should be arrested and punished. The rioting is inexcusable. But when people say, "See, I told you Islam was a bad thing," they're only compounding the problem, and doing themselves a disservice in their efforts to understand the tumultuous times we live in.
Consider the race riots in the US in the 1960's (Watts in 1965, Detroit in 1967, and Newark in 1967 for example). The parallels are striking. Blacks rioted then for precisely the same reasons the Muslims in France are rioting now. A poor population felt oppressed, disenfranchised and stuck. In each case, a seemingly minor police action touched off a multi-day riot that killed dozens and left millions of dollars in property damage. In some ways, the US riots were far worse than today's: those three US riots claimed a total of 100 lives; so far, the Paris riot has only a single death.
I'm sure in the 1960's there were those who felt the riots proved their point that blacks were dangerous violent people. Does anyone feel that way now about those riots? How is the Paris riot significantly different from the US riots?
Simplistically blaming Islam for the riots in Paris is missing the point. The Muslim population of Europe is going to be a big problem for a long time, but not because they are Muslim. They are a minority that is not integrating well into the larger culture. Some of that problem is their own fault, and some of that is because of their religion. But it can't help to demonize the religion. Islam is not going away, and while it is mixed into this problem, it is not the problem.
Yesterday's post was my 1337th post. If I were leeter, I would have known that yesterday!
I'm only just beginning to dig into all of the Python web frameworks. (The Boston Python Meetup group is doing a quickie comparison of frameworks on Thursday, and I'm on the hook for a TurboGears application.) For the most part, I still don't understand what all the parts are, though I recognize the names going by in my RSS feeds. I read about the developments with a vague interest, because it doesn't apply to me.
But when I saw Ian Bicking's Ajaxy Exception Catching screencast, my mouth dropped open. He's built something (WSGI middleware, what the heck is that?) that catches exceptions in your web application, and displays a stack trace in your browser. Big deal. But you can also expand the line numbers to see context lines of code around the lines themselves. Nice. Then you can expand some more to see all of the local variables at each frame in the stack. Whoa! Then you can type arbitrary Python into an edit control and have it evaluated in the context of that frame. Magic!!
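To take a little of the mystery out of "WSGI middleware": it is a callable that wraps a WSGI application and looks like a WSGI application itself. Below is a bare-bones sketch of the exception-catching idea using only the standard library. It is not Ian's code and has none of the interactive features from the screencast; the class and function names are invented for the example.

```python
import sys
import traceback


class ExceptionCatcher:
    """Wrap a WSGI app; on an uncaught exception, respond with the traceback.

    Only exceptions raised while the wrapped app is being called are caught,
    not ones raised later while its response is being iterated.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Exception:
            body = traceback.format_exc().encode("utf-8")
            headers = [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ]
            # exc_info lets the server cope if headers were already sent.
            start_response("500 Internal Server Error", headers, sys.exc_info())
            return [body]


def broken_app(environ, start_response):
    """A deliberately failing application, for demonstration only."""
    raise ValueError("something went wrong")


app = ExceptionCatcher(broken_app)
```

To see it in a browser, the standard library's wsgiref.simple_server.make_server can serve app. Showing tracebacks to visitors is only sensible during development.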
A conversation with a friend yesterday turned to photography. He said he wanted to gather statistics from the EXIF data in his photos. I said I didn't know how EXIF data was stored, and would have to go look it up. David responded,
which I can't argue with.
Turns out EXIF data is actually stored as a TIFF file embedded in a JPEG record!
I'm probably more interested in file formats than your average guy. I'm fascinated by the different choices made by the designers of these formats. For example, JPEG is a sequentially-read record-oriented format. The file is composed of chunks, each of which has a tag number and a length. PNG files are also record oriented, but the records are identified by a four-character id. In a clever hack, four bits of record metadata are stored in the 0x20 bit of each character (the ASCII case bit), so the case of the letters in the tag is significant in interesting ways. Typical tags include tIME, pHYs, and bKGD.
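The PNG case trick is easy to demonstrate. The ASCII case bit is 0x20, so testing that bit in each of the four characters recovers the four flags. The flag names below follow the PNG specification's meanings (ancillary, private, reserved, safe-to-copy); the function name is my own.

```python
def chunk_properties(chunk_type):
    """Decode the four property bits of a PNG chunk type such as 'tIME'.

    Bit 5 (0x20) of each byte is the ASCII case bit: set means lowercase.
    """
    names = ("ancillary", "private", "reserved", "safe_to_copy")
    return {name: bool(ord(ch) & 0x20) for name, ch in zip(names, chunk_type)}


for tag in ("IHDR", "tIME", "pHYs", "bKGD"):
    print(tag, chunk_properties(tag))
```

So the lowercase first letter in tIME, pHYs, and bKGD marks them as ancillary chunks a decoder may skip, and the lowercase final letter in pHYs says an editor may copy the chunk even if it doesn't understand it.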
TIFF files are a bit harder to pick apart: they include byte offset pointers within the file, so reading the file may involve jumping to the end only to be directed back toward the beginning to find the data you want.
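Here is a sketch of that jumping around, using a tiny hand-built TIFF-style byte string rather than a real image. The header points to a directory (IFD) at the end of the data, and the directory entry points back to text stored near the beginning. The helper only handles the one case it needs (a first entry that is an ASCII field stored by offset), and the sample text is invented.

```python
import struct


def read_first_ascii_entry(data):
    """Follow a TIFF header to its first IFD and return (tag, text) for the
    first entry, assuming that entry is an ASCII field stored by offset."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]          # byte-order mark
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF header")
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    if count < 1:
        raise ValueError("empty directory")
    entry = data[ifd_offset + 2:ifd_offset + 14]        # first 12-byte entry
    tag, field_type, n, value_offset = struct.unpack(order + "HHII", entry)
    if field_type != 2:                                 # 2 means ASCII
        raise ValueError("first entry is not an ASCII field")
    raw = data[value_offset:value_offset + n]           # jump back for the value
    return tag, raw.rstrip(b"\x00").decode("ascii")


# Build the sample: header, then the text, then the directory at the end.
text = b"hello, tiff\x00"
ifd_offset = 8 + len(text)
sample = (
    b"II" + struct.pack("<HI", 42, ifd_offset)          # little-endian header
    + text                                              # value lives at offset 8
    + struct.pack("<H", 1)                              # one directory entry
    + struct.pack("<HHII", 270, 2, len(text), 8)        # tag 270: ImageDescription
    + struct.pack("<I", 0)                              # no further directories
)

print(read_first_ascii_entry(sample))   # (270, 'hello, tiff')
```

A sequential reader can't handle this layout in one pass: it has to seek to the directory first, then seek again to wherever each value lives.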
David and I also discussed RAW files. We both knew they were a straight capture from the CCD, and that there'd been no compression loss or "developing" interpretation, but couldn't put our finger on exactly how that differed from typical image files.
The downside of RAW files is that different cameras use slightly different formats. Adobe has a format called DNG, Digital Negative, which is designed to retain all the benefits of RAW files, but without all the vendor-specific differences.
Duane Keiser is an accomplished painter. He has a one-a-day blog of his paintings, and a companion site of movies of the paintings being painted. Half of these look like photographs to me. I can't imagine how to use a paint brush to capture the gooey sheen of a chocolate covered cherry, or the subtle shades of ravioli in marinara sauce. They're good enough to eat, simple, but intimate and sensual.
An Eye for Annai is a very sweet animated short, and I mean sweet as in innocent and child-like, not as in, "dude, you gotta check it out, it's sweet!". It will make you smile, what more could you want?
Oracle is jumping on the free database bandwagon: Oracle Database XE is a free (as in beer) Oracle database. This parallels Microsoft's SQL Server Express and of course all of the open source databases. The press release indicates that XE will be limited in both RAM (1 GB) and storage (4 GB). Microsoft's earlier free database, Desktop Edition, was unfortunately also limited in how many concurrent activities it would support, which made it just a toy.
Keep in mind, XE is still Oracle 10g under the hood, so it's going to have a big footprint: the download is 150 MB! Harry Fuecks has some details on building a PHP site on top of it.
Rands in Repose zeros in on a problem I definitely have: Repetitive Information Injury. It's the nervous habit of circling around your information feeds, pressing the lever hoping to get another pellet. I don't know if Rands has any advice about it, I got halfway through and then had to go check my email.
He recommends admitting you have a problem, then focusing on positive forward-moving information gathering. Sounds good. I think I get into click-for-a-pellet mode when I am faced with a job I don't want to do, or when I am stuck and want something easy. It's the couch potato in me that scans Bloglines for something entertaining. Sometimes I feel like there are too many distractions around me, I know I will be interrupted, so I go ahead and interrupt myself before someone else gets a chance. Rather than dig into a meaty problem and face the frustration of not making progress on it, I'll punt and click around instead.
A worse problem is when I have two tasks before me: one I want to do, and one I have to do. I'm not supposed to be working on the want-to-do, but I can't quite get up the hill of the have-to-do. So instead I waste time surfing around. My internal taskmaster dictates that it's ethically wrong to work on the want-to-do, but surfing doesn't incur negative karma points. I know, it doesn't make any sense. Maybe I feel like if I give into the want-to-do, I'll never get back to the have-to-do. Like Rands said, the first step is admitting you have a problem.
I've long adored Sysinternals' Windows tools. They are hands-down the most technically advanced Windows system tools around, and they are free. What I didn't know was that Sysinternals' Mark Russinovich has a blog. His latest post, Sony, Rootkits and Digital Rights Management Gone Too Far, is a fascinating piece of detective work through some pretty deep Windows internals, reverse-engineering an irresponsible and intrusive bit of Sony digital rights management code.