
Date Difference and different dates

Thursday 31 July 2008

My son Max released his first app last week: Date Difference is a small OS X application that simply gives you the time delta between two dates. It's been fun helping him puzzle through Objective-C, Cocoa, dates, and all the rest of the details that go into even a very small application. Go give it a try. With it, we realized that Max is coming up on his 6000-day birthday.

It's also been an interesting jumping-off point for talking about all of the weirdnesses with time and dates. First off: time zones. Although motivated by physical reality, time zones are really political creations, and so can be quite arbitrary. Some are not even whole numbers of hours. In fact, Iran, Afghanistan and Myanmar are all in the small club of countries a half-hour out of step. Perhaps an explanation for bad behavior? Think about it...

Another date anomaly: switching from the old-style Julian calendar to the more accurate Gregorian calendar meant correcting accumulated error by skipping days. For example, in Britain and its colonies, the day after Sept 2, 1752 was Sept 14, and in Russia, Jan 31, 1918 was followed by Feb 14. Hard to imagine the logistics of making that happen.
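The kind of delta Date Difference reports is easy to sketch in Python. This is only an illustration, not the app's Objective-C code, and the birth date below is a made-up placeholder. One thing worth knowing: Python's datetime.date uses the proleptic Gregorian calendar, so it has no idea that any days were ever skipped.

```python
from datetime import date, timedelta

def days_between(start, end):
    """Whole days from start to end (negative if end is earlier)."""
    return (end - start).days

# Hypothetical birth date, purely for illustration.
born = date(2000, 1, 1)
print("6000-day birthday:", born + timedelta(days=6000))

# Britain, 1752: historically these two dates were consecutive days,
# but the proleptic Gregorian calendar still counts the gap between them.
print(days_between(date(1752, 9, 2), date(1752, 9, 14)))
```

So a naive date calculation that spans one of these calendar switches will disagree with the number of days people actually lived through.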


Friday 25 July 2008

I never took a course in economics, but it always seemed like a fascinating subject. It strikes me as either cutting to the heart of the matter, or completely missing the point. I've long wanted to understand more about how economists think, and what they think about, so the EconTalk podcast has been a great find.

Russ Roberts does a good job interviewing other economics thinkers, mostly keeping out of the way, and giving them a chance to explain their ideas. He'll chime in with his own ideas, but rarely falls into the trap of hogging the spotlight for himself.

One thing is clear: Russ and most of his guests have a particular world view. They are unapologetic free-market small-government guys. For example, the episode on Subsidies and Externalities is really a chummy conversation about the absurdity of government planning doing a better job than the free market. I was disappointed that Roberts couldn't put aside his politics to find even one example of a good reason for a subsidy. To chalk up North Carolina's subsidizing of their forestry farmers to the powerful farmer lobby seems simplistic.

The most interesting episode politically was Sowell on Economic Facts and Fallacies, wherein Thomas Sowell shed light on his view of economic fallacies. While I disagree with him politically, it was interesting to hear a consistent approach to a number of issues ranging from CEO pay to third-world labor, all of them coming down clearly on the side of letting the market have its way. He attributes concern over CEO pay to people (he never says liberals, but you know he wants to) wanting to design society according to their own ideas. He comes back to this misguided notion a few times. The final topic he discusses is immigration, where he suddenly does an about-face and without the least shame, declares that we need to control who can immigrate to this country, so that we can make sure that the right people come in and the wrong people stay out!

I'd be interested to know if there are similar podcasts making counter-arguments out there. It would be good to get the other side as well.

These political peccadilloes aside, many of the podcasts are very interesting as a discussion of economics, though sometimes quite advanced. Since these are professors talking together, the conversation can also veer a bit to the inside: Roberts will throw around names and terms like Austrian School without explaining them.

But it has helped deepen my understanding, both in the specifics of economics, and in familiarizing myself with the idiosyncrasies of economic thought. For example, there's the classic mindset that says that if the world does something a particular way, then by definition it must be a good way to do it. This overlooks the possibility of a radical shift that uncovers a previously untried possibility. Take food markets: in the first half of the 20th century, they employed counter clerks to retrieve items from shelves. An economist of the time would have said, "everyone does it this way, there must be an economic advantage". But then someone invented the supermarket, where customers could get things for themselves, and everyone switched.

Economists often pretend real-world forces like culture and non-economic interests ("I invest in cattle stocks because I always wanted to be a cowboy") don't exist. Ideal markets are an abstraction. But economics is the best way we've found to explain the anthill of human business activity. EconTalk has helped deepen my understanding of it.

On the counter-intuitiveness of speed

Sunday 20 July 2008

I had an idea this morning that I thought would make my Mandelbrot viewer Aptus run a little bit faster. The compute engine is written in C for speed, but with a Python progress callback function passed in to get updates on the state of the computation. The code is structured like this:

// Code A: .1 sec
for each scanline:
    compute the pixels
    call the progress function

The progress function is called once per scanline, so for the default 600×600-pixel view, it is called 600 times. Computing this default view takes .1 second.

My idea was to invoke the progress function less often. There's no need to invoke it as often as 600 times in a tenth of a second. Since the progress function is written in Python, I figured I could save some time by avoiding some of that Python execution.

As a quick test of my thesis, I commented out the progress call entirely:

// Code B: .06 sec
for each scanline:
    compute the pixels
    //call the progress function

Now the computation took .06 seconds, a significant improvement! It looks like 40% of our time is spent reporting progress through a Python function.

The basic unit of computation for the Mandelbrot set is an iteration, and I was already counting the total number of iterations. So I changed the code to call the progress function only if a minimum number of iterations had been calculated since the last progress call:

// Code C: .1 sec
for each scanline:
    compute the pixels
    if min_progress_delta (1M) exceeded:    
        call the progress function

With this code in place, the computation still took the original .1 seconds. That's odd. The total iterations in this case is 1.9 million, so we only exceed min_progress_delta once, and the progress function is only called once. How can this be? In Code A, we invoke the progress function 600 times, and in Code C we invoke it once, and yet they take the same amount of time. In Code B, we invoke it not at all, and it speeds up by 40%. How can the one call change between Code B and Code C make such a difference?

Odder still, suppose we change min_progress_delta to two million, so that the progress function is never invoked:

// Code D: .1 sec
for each scanline:
    compute the pixels
    if min_progress_delta (2M) exceeded:    
        call the progress function

It still takes .1 second! More experimentation: comment out the call of the uncalled progress function:

// Code E: .06 sec
for each scanline:
    compute the pixels
    if min_progress_delta (2M) exceeded:    
        //call the progress function

Now it takes .06 seconds! How can commenting out code that is never called make a difference? We're starting to zero in on the issue here: we didn't simply comment out uncalled code, we commented out the entire body of the if clause. And that meant that our C compiler eliminated the test of the if since it was unneeded.

It's beginning to look like the simple act of doing anything at the bottom of the loop is taking time. That's the only explanation for the data we have so far.

I'm no expert in these matters, but I've read enough about pipelines and caches to know that this is entirely plausible. When the code is uncluttered with detours, it goes much faster than when the end of the loop pauses to consider whether to invoke the progress function. Ironically, actually calling the function and invoking all of the Python overhead is insignificant compared to the time lost to simply deciding (in C!) whether to call it.

Or am I missing something here? Is there a way to invoke my callback function without putting hiccups in my pipeline?
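For reference, here is the control flow of Code C as a small Python sketch. It is not the Aptus code: the function and parameter names are mine, and spreading the 1.9 million iterations evenly across the 600 scanlines is an assumption. It only shows how often the progress function fires for a given min_progress_delta; it says nothing about the timing, which depends on what the C compiler and the CPU do with the loop.

```python
def compute(scanlines, progress, min_progress_delta=1_000_000):
    """Call progress() only after enough iterations have accumulated."""
    total = 0
    last_reported = 0
    calls = 0
    for iterations_in_line in scanlines:
        total += iterations_in_line  # stand-in for "compute the pixels"
        if total - last_reported >= min_progress_delta:
            progress(total)
            last_reported = total
            calls += 1
    return total, calls

# 600 scanlines totalling roughly 1.9 million iterations (evenly spread).
lines = [1_900_000 // 600] * 600

for delta in (1, 1_000_000, 2_000_000):
    total, calls = compute(lines, lambda done: None, delta)
    print(delta, calls)
```

With a delta of one million the callback fires once, and with two million it never fires, matching Code C and Code D.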


Saturday 19 July 2008

FirePHP is an intriguing Firebug plug-in: it uses a server-side library to send debugging information in response headers, which are extracted and nicely displayed in the browser.

One of the complexities of developing web applications is that your server is busy with lots of stuff at once. Load a single page, and there could be dozens of requests, once all the scripts, styles, images, and Ajax calls are through. If you have some debugging traces in the main page, they'll get buried in all the other noise created by the subsequent requests.

FirePHP looks like it gives a nice compartmentalization: the log messages from the main request stay tidy in Firebug. I haven't used it yet, because I need to whip up some Python server-side code (the name FirePHP doesn't limit it to PHP debugging).
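The general idea of shipping log messages in response headers can be sketched with Python's standard logging module. To be clear about the assumptions: this is not FirePHP's actual wire format, and the header name X-Debug-Log-N and the class name are invented for illustration. A real implementation would have to emit whatever headers the Firebug plug-in expects.

```python
import json
import logging

class HeaderLogHandler(logging.Handler):
    """Collect log records for one request so they can be sent as headers."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            {"level": record.levelname, "msg": record.getMessage()}
        )

    def as_headers(self):
        # "X-Debug-Log-N" is a made-up header name, not FirePHP's format.
        return [
            ("X-Debug-Log-%d" % i, json.dumps(m))
            for i, m in enumerate(self.messages, 1)
        ]

handler = HeaderLogHandler()
log = logging.getLogger("request-debug")
log.setLevel(logging.DEBUG)
log.addHandler(handler)

log.debug("loaded %d stories", 3)
for name, value in handler.as_headers():
    print(name + ": " + value)
```

Because the messages ride along with the one response that produced them, they can't get mixed in with the noise from the other requests.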

Personalized videos

Friday 18 July 2008

I recently saw two different viral marketing videos with a twist: they incorporate user content to pull in the viewer. Stepbrothers: Duel is an ad for an upcoming movie comedy, and uses your name, photo, and phone number (!) to create a video of Will Ferrell and John C. Reilly fighting over which of them is your best friend.

Election '08 prank is purportedly a news story about a groundswell presidential candidate. Scroll to the bottom of the page, enter your name, click Update, and play the video. They insert your name as text elements in a number of places. If you listen carefully, you'll notice they cleverly avoid any gender-specific pronouns, unlike the Stepbrothers video, which will only work for a male.

This is clever technology, and will keep viral marketing campaigns alive (I just spread two of them), that is, until it gets old.

Database naming

Tuesday 15 July 2008

Jeff Atwood has a good post about database design: Maybe Normalizing Isn't Normal, all about data normalization. I like the quip it ends with: normalize until it hurts, denormalize until it works.

In the comments, a few people quibble with the naming of his tables and columns. It's an age-old debate: are relational database tables named with a singular or a plural? The original proponents of relational design used singular nouns: User, Employee, Manager. But conceptually, a table is a set of things, and so should be plural, no?

On the plus side for singulars, it works better in today's ORM-heavy world. It simplifies the transition from objects to tables, since the class name will be singular. Typically, database table names have to be explicitly specified to make them plural since pluralization is hard to do automatically. At Tabblo, we have a table correctly named stories, but also one incorrectly named addresss.
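A tiny sketch shows how a table like addresss comes about. It assumes a naming rule that just lowercases the class name and appends an "s", with an optional explicit override; the function names are hypothetical and not taken from any particular ORM.

```python
def naive_plural(class_name):
    """Lowercase the class name and append 's': the naive rule."""
    return class_name.lower() + "s"

def table_name(class_name, overrides=None):
    """Use an explicit table name when given, else the naive rule."""
    overrides = overrides or {}
    return overrides.get(class_name, naive_plural(class_name))

print(table_name("User"))                         # users
print(table_name("Address"))                      # addresss
print(table_name("Story", {"Story": "stories"}))  # stories
```

Forget the override just once, and the misspelled table name is with you for good.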

Another minor point for the singular camp is the case where a single row in the table is actually described by a plural already. Suppose you have a class called UserFlags. What's the table called? If you favor plural table names you'd have to pluralize it again to UserFlagses?

An advantage for plurals is that it makes SQL queries sound right. The statement "select * from users" simply sounds right. Although, if you have to qualify column names, it sounds odd again:

select users.name from users where blahblah

I suspect which side you lean toward will depend on how you were raised, like any religious argument.

On another point, Jeff shows a User table with a user_id column as the primary key. At first I recoiled: shouldn't the primary key be named "id"? But as it happens, I have often made the mistake when typing ad-hoc SQL queries of using user_id in the User table, simply because I've been using it everywhere else.

If I type

select * from stories where user_id = 6

to see user 6's stories, then it's natural to type

select * from users where user_id = 6

to see user 6 himself. Of course it's natural to use a table-qualified id column name for foreign keys, but for a primary key? I've never tried it, so I don't know what pain I might incur further down the road.
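To make the comparison concrete, here is a throwaway experiment with Python's sqlite3 module and an in-memory database. The schema and the sample rows are invented for illustration; only the user_id naming comes from Jeff's post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (user_id integer primary key, name text);
    create table stories (
        story_id integer primary key,
        user_id integer references users(user_id),
        title text
    );
    insert into users values (6, 'alice');
    insert into stories values (1, 6, 'first'), (2, 6, 'second');
""")

# The same where clause works against both tables.
stories = conn.execute("select * from stories where user_id = 6").fetchall()
user = conn.execute("select * from users where user_id = 6").fetchall()

# With identical column names, the join can use "using".
joined = conn.execute(
    "select title, name from stories join users using (user_id) "
    "order by story_id"
).fetchall()
print(stories, user, joined)
```

One small point in favor of the table-qualified name: the join reads a little more simply when both sides call the column the same thing.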

BTW: Joel's forum has an old thread on the issue.

Encouragement from unlikely places

Sunday 13 July 2008

Here's a sweet story about a Pixar fan who teared up at the original WALL-E trailer: Pixar Honors the Girl Who Cried at the 'WALL-E' Teaser. I like a few things about this story. The original video is simple but heartfelt, showing a genuine reaction to the trailer. The story of Pixar flying Courtney in for the wrap party is terrific. But the part I liked best was what Andrew Stanton said about her:

Six months ago, when the first trailer for WALL-E came out, we were only halfway done with the film, and we weren't exactly sure how we were going to get it done. We were exhausted. And then, one day, a movie showed up on YouTube showing a girl watching the trailer for WALL-E. And every time she watched it, she would cry on cue. When we saw that, we knew we were on the right track.

I've never made a motion picture, but I know what it is like to fix onto one particular customer and use them as a proxy for all of them. Making a movie I imagine can become pretty removed from the simple question of how people will react to the film. When the Pixar people saw Courtney's strong reaction to WALL-E's character design, it was a huge boost for them. They had heard from their customer, and the response was positive. What more encouragement do you need?

Gas station tv

Saturday 12 July 2008

The latest step in our march toward all advertising all the time is Gas Station TV, which consists of screens on top of gas pumps, blaring news, weather, and ads. These remind me of the screens in elevators, pestering us for the few minutes it takes to ride to our floor.

I understand the economic forces driving these micro-channels of ad-laced info-bits. What I don't understand is why they have to be limited to news headlines and weather forecasts. If the information is only there to engage our interest, why not branch out? What about classic paintings? What about cartoons from the New Yorker? How about poetry? Opening paragraphs from random Wikipedia articles? With so much content flowing all over the net, why are we forced to see the same news, stocks, sports, and weather all the time?

Dealing with parking pigs

Tuesday 8 July 2008

I work in a large Hewlett-Packard facility in Marlborough, Massachusetts. It has enough parking for thousands of cars (though much of it is unused). Of all of those spots, one row is unusual: due to its proximity to a row of tall trees, those spots are the only ones that are in full shade at the end of the work day. In these muggy summer days, those spots are highly prized.

Recently, one individual has been taking two spots for his Mustang. He parks at a slight diagonal across two spaces, I assume to ensure that his car is not dinged by those on either side of him. People often park over the lines, due to trying to back into spaces, but those people clearly were trying to fit into one spot. It's a little annoying that these people don't work at their parking a little better to avoid taking two spots, but I don't think they're being malicious.

The Mustang, though, is clearly parking purposefully. He has decided that he deserves two spots. I tried leaving a note on his windshield ("Please don't take two parking spaces, thanks."), but the next day he had parked diagonally again.

So now I'm torn about what to do. He's being inconsiderate and greedy in deciding that he deserves two spots, no doubt about it. When I see him parked like that, I think, "if he didn't do that, I could park there." But that's not true. All the other cars would be parked one spot closer, and I'd park only one spot closer than I did, an insignificant improvement.

I'm pretty sure he's the kind of person who wouldn't do something like this face-to-face. The anonymity of the parking lot makes it possible for him to act this way. Ironically, it also allows me to put a note on his windshield. If I were to see him park diagonally, I'm not sure I would approach him in person to ask him not to do it.

So, what to do? Should I continue leaving notes on his car? Notes with my phone number? Should I tell the security guy about it? Should I take a deep breath and focus on more important things?


Tuesday 1 July 2008

I saw Pixar's latest, WALL-E, yesterday, and am of two minds about it. I really enjoyed it; it's a great movie. But it's not as great as all the gushing reviews are making it out to be.

The first half-hour of the movie is outstanding: it's a moody evocative story drawn with an apocalyptic palette, with a cute protagonist in the middle of it all. Without words, WALL-E draws us in and makes us feel for him. This is the part of the movie people are talking about when they say it is a great sci-fi film.

The second half of the movie takes place in space, but ironically is where the movie falls under the cartoon gravity of a kid's movie again. The plot takes over, and simplistic characters and turns are the norm. Don't get me wrong, it's a great kid's movie, one of Pixar's best. WALL-E himself is a great character, a believable robot with big expressive eyes that perfectly convey his emotion to us.

But let's get something straight: the people who are talking about a Best Picture Oscar are crazy. Forget the insider movie calculus that says it won't happen: it's just not that good a movie. It isn't that it's animated: WALL-E's earth-bound segment proves that CGI animation can carry a rich story just fine. It's that it's a cartoon: by the time the movie is over, the plot has been neatly wrapped up, a few strange holes in logic have been glossed over, our heartstrings have been expertly tugged, and we can go home happy.

A Best Picture can end happily of course, but stepping back to take in WALL-E's full structure, you can see that it's a kid's movie. That means trading subtlety for some physical humor, all of which is fine, but it means you have to give up your seat at the grown-ups' table. WALL-E is rated G, a sign that the material is mild enough for small children, without the depth of story that a Best Picture needs to have. Oliver! was the only G-rated Best Picture, and that was in the early days of the rating system. It's hard to imagine it would be rated G today.

I'm not even sure that WALL-E is my favorite Pixar movie. Finding Nemo and The Incredibles I thought did a better job consistently hitting their mark, and finding richness in the stories they told. The ultimate mark of a great story is the development of characters over the course of the movie, and for that, it's hard to beat Pixar's first, Toy Story.
