|Ned Batchelder : Blog | Code | Text | Site|
The Vietnam of Computer Science is an interesting piece, on a number of levels. Ted Neward has undertaken to explain why Object Relational Mappers are difficult, and to illustrate his case, draws extensive analogies to the conflict in Vietnam.
It is a good (if dense) overview of the technical challenges involved in mediating between the relational model and the object model, and there are many. It even seems to be a good historical overview of the Vietnam War.
The problem, if I can get a little meta here, is that Neward has taken on a challenge he can't neatly complete: to use Vietnam as a way to make the problems clearer. The hook is a good one, and very catchy, but he bogs down his points with stretched analogies and confusing references. There's just way too much Vietnam in the piece:
How does it help to pull Westmoreland into this? In fact, I glossed over the ten paragraphs of detailed history of the war. I get the big picture: ORMs have poorly-specified conflicting goals, and once you are drawn into using (or building!) one, things can get more difficult rather than easier, until you are experiencing diminishing returns.
Beyond that vague theme, the details of the war are merely a distraction from the computer science points (or is the ORM discussion a distraction from the Vietnam points?). In the end, the analogy is no help. Are we to look back on Vietnam and decide how it should have been handled, and do "the same thing" for our object/database interface? What is the appropriate analogy here?
And if we're choosing a war to compare to, why choose Vietnam? Surely Iraq is the more interesting comparison. After all, we all agree about Vietnam, but the ORM conflict is still in full swing, with proponents of all sides heatedly pursuing their causes.
On the technical points, I like the discipline Ted Neward brings to the piece, including getting the root of "relational" right (it refers to the mathematical concept of a relation, which is a set of tuples, not to relationships expressed via foreign keys). I recommend you read about the tough technical challenges, and skip the Vietnam quagmire.
Neward has also posted some follow-up thoughts in response to criticisms of his post.
A news story that sounds like the Onion but is not: Warren Buffett giving most of his money away... to Bill Gates. I understand the notion of giving huge sums of money away. And I think Bill Gates is doing great stuff with his Gates Foundation. But isn't there something to be said for diversity? The $32B that Buffett is giving to the Gates Foundation could create a really big foundation all by itself (as big as the mighty Gates Foundation is now). It could do similar work to the Gates Foundation if it wanted to, but could go in a different direction if that seemed important. The rich get richer, I guess. The good news is that the Gates Foundation is using the money to do good work.
My twin sister Sarai got married over the weekend. I gave a toast (twice!) for her, but it was kind of impromptu. It was a good toast, but it left me thinking about other things I could have said, in particular, on the nature of marriage.
The funny thing about marriages is that they start with weddings. Marriages and weddings are very different things. Weddings are formalized rituals. Even the most ardent individualist feels the tug of the ingrained cultural icons. A bride in a white gown, a groom in a tuxedo, marching down the aisle, and so on. We all put on our finery and conform to our roles. A wedding is a public ceremony conducted in front of dozens or even hundreds of people.
But marriages are a private matter. The husband and wife decide for themselves how they will conduct it. Of course there are societal pressures, and roles to fill, but they are far more diffuse. The demands on a couple living their lives together for decades are great enough that they will have to negotiate their own terms for it to work.
A wedding is a short event, six hours or so, a whole weekend if you really go overboard. In the scheme of things, it's over in the blink of an eye. Even taking into account the long planning period, a wedding is a tiny fraction of the length of even a short marriage.
But a marriage is for a lifetime. No amount of planning can take everything into account. The Bridezilla mentality that demands perfection in every detail of the wedding simply can't work for a marriage. Too much happens over the years, too many factors are out of your control, there are too many unexpected turns in the road. As a couple, you have to be flexible, and learn as you go. You have to stay in touch with each other to understand where you both are steering the partnership.
So my advice: listen to each other, listen to yourself. Decide what you want. Do what you need to do to make it happen. You and your sweetie can decide together on the ground rules. Love each other, live and be well. Mazel Tov.
I was IM'ing with a friend, and we were discussing the variety of outcomes for small tech companies. I typed something mundane, but a slip of the fingers left out a space, making it pithier than I intended:
Yesterday, Walt Mossberg wrote a great review of Tabblo in his Personal Technology column: A Photo-Sharing Web Site Offers New Services.
We're of course super-pleased that Mossberg wrote about us at all, not to mention the positive review he gave us. For example:
Naturally, we had a spike in traffic yesterday, signing up something like ten times more users than in a typical day. The servers hummed along nicely under the load, a relaxing anti-climax that let us get on with the business of building new features and improving old ones rather than fighting infrastructure fires.
It's really gratifying to work on a product that can be seen and used by hundreds of thousands of real people, and that can get the attention of someone like Walt Mossberg. It's going to be exciting to keep building it and see where we can take it.
Over the weekend, we watched two off-the-beaten-path movies that I recommend highly:
Jesus is Magic is a Sarah Silverman comedy concert, with extras like songs thrown in. It is hilarious, but also quite crude. If you haven't seen Sarah, her shtick is to offend absolutely everyone, whether ethnically, sexually, or scatologically. For example:
Dirty Filthy Love is a great movie about obsessive compulsive disorder. It's a touching and realistic portrayal of a man slipping into the grip of a debilitating illness. It seems to be billed as a romantic comedy, but I would not call it that. There is romance of a sort, but it's more of a drama with light moments than a comedy. And the romance is about people finding each other through pain rather than the sort of "aren't we in love?" lightness you'd find in a typical romantic comedy.
12² - 10² = (12 - 10) × (12 + 10) = 2 × 22 = 44
Peter Thomas gives us the full picture of a Java web application call stack. It's very impressive. It shows about 100 call frames, annotated with the different layers of the architecture. The comments there debate the question of whether this is a good thing or a bad thing.
I was curious about the equivalent stack for our current mod_python/Django/MySQL architecture. I won't paste the actual stack trace here (opinions differ about the extent to which work details can be discussed in a blog), but here's the breakdown by layer:
for a grand total of 19 python stack frames between Apache and MySQL, six of which are our code. I won't claim to know whether this is better or worse, just comparing.
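For anyone who wants to make the same kind of tally for their own stack, here is a minimal sketch of one way to count frames by layer. It is not the method used for the numbers above: it simply walks the current call stack and groups frames by the top-level package of the module each frame belongs to. The `view` and `model` functions are hypothetical stand-ins for application code, and `inspect.currentframe()` may return None on Python implementations without stack frame support.

```python
import inspect
from collections import Counter

def frames_by_layer():
    """Count the frames on the current call stack, grouped by top-level package."""
    counts = Counter()
    frame = inspect.currentframe()
    while frame is not None:
        # Each frame knows the globals of the module its code was defined in.
        module = frame.f_globals.get("__name__", "?")
        counts[module.split(".")[0]] += 1
        frame = frame.f_back
    return counts

def view():   # hypothetical stand-in for application code
    return model()

def model():  # hypothetical stand-in for a lower layer
    return frames_by_layer()

counts = view()
total = sum(counts.values())
for layer, n in counts.most_common():
    print(f"{layer}: {n}")
print("total frames:", total)
```

Called from deep inside a real request handler, the same function would report one line per package on the stack, which is roughly the breakdown described here.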
I was discussing a new list display with a co-worker the other day, and the question of its ordering came up. "It's got no ORDER BY clause," I said, "so it will be randomly ordered."
"No, it will be ordered by the primary key, the id," he insisted.
"No, if you don't specify an ordering, then you're allowing MySQL to return the data in any order it finds convenient, and it will return the data to you as it finds it, and who knows what order that will be?"
The debate continued. My co-worker claimed that since the id is the primary key, that will be the order records are stored on disk, and that would certainly be the order in which they would be found.
I pointed out that there are many factors that contribute to determining the order of records on disk. For example, if the database strictly orders the records by their id, it has to be prepared to move records or create overflow blocks when a new id is inserted into the middle of an id range.
"Yes, but it's got to be in id order," he continued, and then in a fit of confidence, "I'll bet you my car that if you select some records with an integer id and no ORDER BY clause, they will be returned in id order."
I turned to my SQL prompt and typed a query off the top of my head:
mysql> select id from listitems where list_id = 4000;
As you can see, the records are returned in id order, except for the first two, which are reversed. Why? Who knows?
There's a lot of complexity in a relational database, and the implementers generally take every advantage they can. If you don't specify an ordering, you will get your records in an arbitrary order. Often, when trying out code for the first time, they will seem to be returned in order, but that's because your database is small. As your data grows, more randomness will appear as deletes and inserts become more jumbled.
As always, specific databases may make more guarantees. For example, I am told that Microsoft SQL Server always stores records in primary key order, and that you need to account for this in designing your schema to get maximum performance. I don't know if this is true or not. I don't know if it is true for all versions of SQL Server, or all combinations of table creation options.
This is one of those cases of confusing an implementation with a standard. SQL itself makes no guarantees about the ordering of records, and it makes no claims about what a primary key "means" other than it is a unique non-null index into the records of a table. But specific implementations (SQL Server, MySQL, SQLite, whatever) may make more specific guarantees about the meaning of these things.
But do yourself a favor: if you care about the order in which your data is returned from a SQL query, add an ORDER BY clause. If you think the data is naturally ordered that way, then the ORDER BY clause won't add extra work, and if the data isn't naturally ordered that way, adding the clause will set things right.
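Here is a small sketch of the same point, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name and list_id value echo the query above, but the rows are made up for illustration. The unordered query may well come back in id order on a tiny table like this one; the point is that only the second query promises it.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table in the story.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listitems (id INTEGER PRIMARY KEY, list_id INTEGER)")

# Made-up rows, inserted out of id sequence.
rows = [(7, 4000), (3, 4000), (9, 4000), (1, 4000), (5, 1234)]
conn.executemany("INSERT INTO listitems (id, list_id) VALUES (?, ?)", rows)

# No ORDER BY: the database may return these rows in any order it finds convenient.
unordered = [r[0] for r in conn.execute(
    "SELECT id FROM listitems WHERE list_id = 4000")]

# With ORDER BY: the ordering is part of what the query asks for.
ordered = [r[0] for r in conn.execute(
    "SELECT id FROM listitems WHERE list_id = 4000 ORDER BY id")]

print("without ORDER BY:", unordered)
print("with ORDER BY:   ", ordered)
```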
BTW: I didn't take the car.
I saw the video of the Bellagio-like fountains of Mentos and Diet Coke today. This is something that my son Max had tried to accomplish a few weeks back, and we were stumped as to how to get the Mentos into the soda bottle efficiently enough to get a good geyser. Fritz Grobe and Stephen Voltz described their techniques well enough for us to duplicate their method. Now I'm providing an illustrated guide of our process:
I saw this video a long time ago, and must have shown my kids, because Ben was just singing, "I'm a robot, programmed not to know, that I'm a robot, programmed not to know, that I'm a robot..." It sounded kind of familiar, so I looked it up, and re-discovered the video: Scent of a Robot.
From the technically-amazing-but-why-would-I-want-that department: The World Cup, live, in ASCII art. You can use telnet to get a live stream of ASCII art of the World Cup games:
$ telnet ascii-wm.net 2006
Of course, most soccer coverage involves long shots of a large field with small people on it, which doesn't translate well to ASCII art. Here's a pretty good picture of a single player moving the ball ($RO) left:
TRINIDAD & T. - SCHWEDEN 0:0
Yesterday was the 20th anniversary of my first day of work at Digital Equipment Corporation. It was my first "real" job after college (I worked for Penn's robotics lab for a few years after graduating).
The topic of Digital had come up the night before at the Boston Python Meetup because we had a discussion about last weekend's BarCamp Boston un-conference, which was held at what is now Monster's headquarters in Maynard. (Antonio attended and made a tabblo of his impressions). As anyone knows, calling the mill in Maynard "Monster's headquarters" is like calling Paul McCartney the lead singer for Wings. The mill in Maynard was famously the home to Digital.
While at Digital, I worked in Maynard, though not in the Mill. I worked in a big complex on Parker Street named PKO3. Digital was like a universe unto itself. Buildings at first all had two-letter abbreviations (kind of like the two-letter postal codes for states), and then once the company grew large enough, three-letter abbreviations. The older buildings were given an extra O to make all the abbreviations three letters. So PK3 became PKO3. Find any ex-DECcie, and you can talk about places with names like ZKO, MLO, PKO, LTN and so on.
An internal phone system meant that seven digits would let you call any Digital facility in the world as if it were a local call. And of course, the DECnet networking infrastructure was world-wide so you could copy files from a server in Tokyo exactly as you could from one down the hallway. This is commonplace and obvious now, but in 1986 when I joined Digital, it was amazing and not to be taken for granted.
Digital was a very large place, and had many of the problems of large companies, including difficulty adapting to changing markets and technology, and too many inefficiencies. When I joined, I had badge number 196314! I worked in the printer group doing PostScript work, and got to do all sorts of different work, including:
Digital was a good place for a young kid to get exposed to all sorts of technologies and processes for developing them. I learned a lot in my seven years there. Ultimately, the company was too large and incapable of taking advantage of the PC changeover for me to stay, but it was good while it lasted.
Fortress is a new programming language from Sun Research. It claims to be trying to take over the Fortran mantle of being a programming language for mathematicians and scientists. It has plenty of interesting stuff, including complex rules for how to richly present Fortress source code, so that the programs can be displayed as complex mathematical texts.
But perhaps the most unusual thing in the specification was this:
That's right, folks! No tab characters in the source code, they are forbidden. Will this end the tabs-vs-spaces debate, or simply enrage one side?
The big news from Iraq yesterday was the death of Al-Zarqawi, the leader of Al Qaeda in Iraq. Most people were nothing but happy about the news that the terrorist had been killed by US airstrikes. If we are locked in battle with guys like this, I am glad that we are winning.
But I also sympathize with the viewpoint that no death is good news, and that rather than representing a step toward closure, Al-Zarqawi's death more likely is simply another step on a very long treadmill of violence. I saw a headline about reactions to Al-Zarqawi's death from the father of Nicholas Berg, the man beheaded in Iraq. Reading the interview, I was surprised to see that he feels 100% that the killing is bad on both sides:
I can't imagine what it must be like to have a grown son brutally and publicly murdered, and in such a way that the entire nation is using his death as a call to arms. Michael Berg went through it, and remains committed to his viewpoint that all violence on either side is a bad thing. His resolution is impressive.
He goes on to compare Bush and Saddam Hussein, and he believes that neither is the worse, which I can't agree with. I think there are times when violence is the only remaining course, and there have been times when it has been used successfully. Only history will tell whether Al-Zarqawi's death or even the entire Iraq war has been one of those times.
Although I don't agree completely with Michael Berg, I am glad that there are people like him willing to speak out against violence as a strategy, if only to open a few people's eyes that there are other possible world views.
A tale of a bone-headed PR person at Google from Jon Udell:
I've dealt with chunks of code that seemed done, in that nothing had changed in them for a long time (years), and they were doing their job admirably. Then suddenly one day, a bug turns up in that code that seemed so finished.
Joshua Bloch has me topped: in Nearly All Binary Searches and Mergesorts are Broken, he identifies a bug in binary searches that has been around for 50 or so years. Like the archaeo-bugs I've dealt with, it arises because the old code is being used on a slightly new or unusual case. In the search and sort algorithms, it's arrays with a billion or more elements.
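The bug Bloch describes is in the midpoint calculation: with 32-bit signed ints, (low + high) / 2 overflows once the sum passes 2^31 - 1, and the result goes negative. Python's integers don't overflow, so this sketch simulates Java's int arithmetic with a helper (`to_int32` is my own illustrative function, not anything from Bloch's post) and compares the broken midpoint with one of the fixes he suggests, low + (high - low) / 2.

```python
def to_int32(n):
    """Wrap an integer to a signed 32-bit value, the way Java's int arithmetic does."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n & 0x80000000 else n

def overflowing_mid(low, high):
    """Midpoint as (low + high) / 2 using simulated 32-bit ints."""
    total = to_int32(low + high)
    # Java's integer division truncates toward zero, unlike Python's //.
    return -((-total) // 2) if total < 0 else total // 2

def safe_mid(low, high):
    """Midpoint computed without ever forming low + high."""
    return low + (high - low) // 2

# Indexes near the top of an array with a little over a billion elements.
low, high = 2**30, 2**30 + 10
print("broken:", overflowing_mid(low, high))
print("fixed: ", safe_mid(low, high))
```

For small arrays the two functions agree, which is why the bug could sit unnoticed for so long.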