|Ned Batchelder : Blog | Code | Text | Site|
Because I purchased Photoshop, I received an offer today from Adobe for a choice of free extras, one of which was a font family: Garamond Premier Pro.
Actually, calling this a font family is kind of like calling the US Navy a boat family. This is the most extensive type system I've ever seen. It varies along every dimension possible. Not only are there five different weights (Light, Regular, Medium, Semibold, and Bold), but there are four different optical sizes (Caption, Subhead, Text, and Display), three major scripts (Latin, Greek, and Cyrillic), as well as all sorts of typographic variants: ligatures, lining/non-lining and tabular/proportional figures, stylistic variants, small caps, and on and on. The full character set for just one of the fonts runs to four pages.
The typophile thread about Garamond Premier Pro goes into a great deal of detail about the type, especially the polytonic Greek support.
Now that I'm building a serious db-backed site with Python, I have a better appreciation for what database mapping packages are like. I'm using Django (pre-magic-removal for the moment), and am already bumping up against the limitations of the object relational mapper (ORM).
So I was impressed to see SQLAlchemy's philosophy statement:
We're not moving away from the Django ORM (this week), but it's good to have an understanding of the options out there, and SQLAlchemy looks quite promising.
This may sound heretical in these days of standards for everything, but I've had the best successes by designing my own ad-hoc data formats. Rather than adopting (or worse, adapting) a standard to fit your purposes, you should create your own data representation. It will give you the best fit for the problem at hand.
Yesterday, a reader of this blog sent me an email to let me know that he had just become a father. He wrote to me because of my Programming Madlibs story, which intrigued him as a Dad-to-be. I was touched that he would think to send me a note during what must be a hectic time for him.
In the same spirit, here's another story of a Dad teaching his son to program: Haaarg, world! His six-year-old picked up everything way faster than my kids did, but the flavor of the adventure is the same.
I don't know about other bloggers out there, but I'm seeing a different kind of spam in my comments these days. It looks like actual people are reading the blog posts and writing minimally appropriate comments, just to get a link to their website.
For example, on my post about The Wave, I got this comment from "christy":
The website linked to was a brand-new spam blog about televisions.
My post about the David Copperfield spoof video (which I called Magic is all around us) garnered this comment from "laura":
Laura linked to a spam blog about dogs with an identical design to Christy's.
These comments have clearly been written by a person because of their goofy but tenuous applicability to the blog post. Someone out there is interested enough in page rank to pay someone to write comments in middling English that are "about" the blog post, and not about their cheesy portal site at all. The good news is that this makes me think the comment spam preventions are working, and it's increasing the cost of spam for the spammers. The bad news is that these comments have to be cleaned by hand.
In my comments (and in the link above), I use the rel="nofollow" attribute to ensure that search engines don't lend any credibility to the link. As of now, I advertise that fact on the comments form. I doubt it will stop the spammers from trying, but one can hope...
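As a rough illustration of the nofollow technique, here is a minimal Python sketch of rendering a commenter's link. The function name comment_link and the example URL are made up for this illustration; this is not the code that runs this blog.

```python
from html import escape


def comment_link(name, url):
    """Render a commenter's name as a link that search engines should not credit.

    Hypothetical helper: escapes both values so a commenter cannot inject markup,
    and always adds rel="nofollow" to the anchor.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True),
        escape(name),
    )


print(comment_link("christy", "http://example.com/?a=1&b=2"))
```

The point of the design is that the attribute is applied unconditionally to every commenter link, so there is nothing for a spammer to game.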
I've noticed a pattern in software systems: secondary channels into the guts of the software are implemented as fake objects from the primary channel. That's a horrible sentence; let me demonstrate with a few examples:
I know in my own work, I often have need to create a way to get into the internals of the software, either to change configuration or to get debugging or administrative information out. If the product is in the business of serving information (such as a web application), it's straightforward to add a URL which serves the information desired.
With dynamic web applications (the word dynamic is redundant at this point!), there's practically no cost to adding another URL running another piece of code. It's almost not even worth noting that we can create URLs to serve not just our customers, but ourselves.
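To make that concrete, here is a small sketch of a web application that serves an extra URL for its own operators. It is a plain WSGI callable in Python; the /_status path, the counters, and the JSON shape are all assumptions invented for the example, not a description of any real product.

```python
import json
import time

START_TIME = time.time()
REQUEST_COUNT = 0


def app(environ, start_response):
    """A tiny WSGI app: one URL for customers, one fake 'page' for ourselves."""
    global REQUEST_COUNT
    REQUEST_COUNT += 1
    path = environ.get("PATH_INFO", "/")

    if path == "/_status":
        # The secondary channel: internal state served as if it were a normal page.
        body = json.dumps({
            "uptime_seconds": round(time.time() - START_TIME, 1),
            "requests": REQUEST_COUNT,
        })
        content_type = "application/json"
    else:
        # The primary channel: what the product actually serves.
        body = "Hello, customer!"
        content_type = "text/plain"

    data = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", content_type),
        ("Content-Length", str(len(data))),
    ])
    return [data]
```

In a real deployment a URL like this would need access control, since it exposes internals through the same front door the customers use.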
I'm wondering what other more exotic examples there are out there of machines producing pseudo-products as a way of communicating with the outside world? What's the most creative use of this technique?
An awkward thing about programming in Python: there are lots of double underscores. For example, the standard methods beneath the syntactic sugar have names like __getattr__, initializers are named __init__, built-in operators can be overloaded with __add__, and so on. In the Django framework (at least before they integrated the magic-removal branch), the object-relational mapper used keyword arguments named things like user__id__exact.
My problem with the double underscore is that it's hard to say. How do you pronounce __init__? "underscore underscore init underscore underscore"? "under under init under under"? Just plain "init" seems to leave out something important.
I have a solution: double underscore should be pronounced "dunder". So __init__ is "dunder init dunder", or just "dunder init".
I'll leave it to someone else to decide what "dunderhead" means now.
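For anyone who hasn't met these methods yet, here is a small sketch of the three mentioned above. The Money class is invented purely for illustration.

```python
class Money:
    def __init__(self, cents):
        # "dunder init": runs when Money(...) is called
        self.cents = cents

    def __add__(self, other):
        # "dunder add": what the + operator calls behind the scenes
        return Money(self.cents + other.cents)

    def __getattr__(self, name):
        # "dunder getattr": called only when normal attribute lookup fails
        if name == "dollars":
            return self.cents / 100
        raise AttributeError(name)


total = Money(150) + Money(275)
print(total.cents)    # 425
print(total.dollars)  # 4.25
```

Note that nobody ever calls these methods by name in ordinary code, which is exactly why the question of how to say them out loud comes up.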
One of the things I've had to learn how to do at Tabblo is to monitor the state of our servers. I'm not the front-line guy for this, but I need to be knowledgeable about it. We have a number of Linux servers running the site, so the top command is very helpful for seeing what's going on in real time.
top is an info-junkie's dream: it provides a compact dynamic presentation of a thousand factoids about what a Linux box is doing:
Unfortunately, the help is about as compact:
Luckily, the always-helpful O'Reilly devcenter has an actual man page: Linux command directory: top.
Another mashed-up movie trailer, this time it's the Ten Commandments re-spun as a high-school comedy: 10 Things I Hate About Commandments.
The ongoing debate about global warming fascinates me, and one of the more interesting questions is how scientists determine that human actions are the cause. Many conservatives seem bent on chalking it up to natural cycles. So it's interesting to see the Wall Street Journal publishing Scientists Explain How They Attribute Climate-Change Data. It's a quick (too quick) overview of the ways scientists rule out various explanations for the warming being observed, and conclude that it is human action after all.
I'm really pleased to announce that as of now, Tabblo is open for business.
Tabblo is a new photo-sharing site focused on telling stories. Or is it a story-sharing site focused on photos? Unlike most other photo sites, it lets you create pages (we call them tabblos) that include photos and text, so that you can share an entire story rather than just a stack of snapshots. Dozens of gorgeous layouts and styles give your tabblo a polished feel, and you can buy a high-quality 12×18 glossy poster for a song.
I've been working hard on Tabblo since January, and I think it has turned out really well. We have a small but talented and dedicated team, and we've tried hard to build an exceptional site. For example, I think we've pushed the envelope on the kind of editing power that can be provided in a browser. We're really proud of what we've done, and today we begin our public beta (grrrr, I hate that word).
There's lots more I'd like to say about Tabblo, about the last four months, about the next year, about the product, about the technology, but I'll save all that for other posts. Please try Tabblo and tell us your story.
My friend Damien Katz likes to write technical articles with goofy humor thrown in. He's mulling over a comparison of Erlang and Java, and he wanted to illustrate it with comics about two superheroes named Erlang and Java. But he doesn't think he can draw. As it happens, my eight-year-old son Ben draws all the time, so Damien asked him for an illustration, and he readily agreed. The result is posted on Damien's blog: Erlang vs. Java.
For work (which I'll be saying much more about very shortly), I've been having to consider color calibration with a printer. By printer I don't mean a thing the size of a bread box that sits on my desk and squirts ink onto paper. I mean a person who runs a printing shop using room-sized machines like an HP Indigo 5000.
I send him PDF files, and sometimes they look fabulous, and sometimes they look like someone dipped them in a yellow wash. I know a bit about printing technology: for example, I was Digital's PostScript expert for a while. I can even throw around terms like tristimulus values and color gamut without completely faking it.
But he and I don't see eye-to-eye on this problem. My PDF files are RGB, mostly because it was the simplest thing to do. He wants CMYK, but my understanding is that the mapping from RGB to CMYK is a very fluid thing, depending a great deal on the actual printing technology and printing machine used. How could I convert the colors into CMYK better than the HP Indigo's RIP could itself?
Does anyone have any insight or good pointers for me? I took a look at the International Color Consortium site (you know, the ICC in ICC profiles), but man, that's one thick site.
How does one make a PDF file that can be confirmed correct once it is printed?
Last night I attended the Boston Web Innovator's Group sixth meeting, otherwise known as WebInno6. It's a fairly informal event. There were about 100 people, and two companies were pre-chosen to present their products in presentations of about 10 minutes each. The rest of the time is schmoozing with the other attendees, like a meetup.
The two presentations last night were:
The two products seemed interesting enough, but frankly, both Julian and Margaret need to polish up their presenting skills. I could easily imagine more compelling and dynamic demos of their products in the ten minutes allotted.
There were four other companies doing "side-dish" demos, but I didn't see any of them.
Also covering the event were Brian and Glenn from BostonWTF. Brian took a lot of photos. Three of us from Tabblo were there wearing identical obnoxious red T-shirts, so we'll probably show up pretty well in the snaps.
At least, that's what this David Copperfield parody demonstrates.
I was at Fenway Park last night watching the Red Sox play the Orioles. (I'm not a sports fan. I've lived in Boston for 20 years, and have been to Fenway exactly twice, both times in the last few weeks!)
At last night's game a Wave went around the stadium. I know The Wave can be a polarizing event: some people love it, others wish everyone would sit down and watch the game. I can sympathize with the die-hard fans that don't like the distraction, but I think The Wave is great.
Here are 30,000 people gathered to watch a game. They are in a buoyant mood, and are focused together on the baseball game before them. They'd like to participate, but there's no way they can from their seats. As individuals, there's nothing they can do in the stands that could get the attention of the 29,999 other fans the way the ball game can.
But with The Wave, the fans have managed it: they are active participants rather than spectators, and from the cramped confines of their seats, they've created a game large enough for the entire stadium to watch. I think it's great.
Last night, a single Wave went around the stadium three times before petering out. I was especially impressed because the geometry of Fenway is somewhat ad hoc, meaning the Wave has to traverse a number of rough boundaries between sections to keep going.
I wonder how the idea of the Wave got started? It's not like a guy in a stadium had a great idea and said to his neighbor, "let's try this". There had to be at least a little coordination ahead of time.
Serge Bondar did a controlled experiment to see how much text on a page matters to the big three search engines. He created very long pages with unique made-up words inserted into them at regular intervals. Then he watched the server logs to see how much data the search engines actually pulled down. Then he waited for his unique marker words to turn up in search results.
He found that Yahoo only cares about the first 210Kb, Google gets bored after 520Kb, and MSN persists through 1.1Mb. Read about the entire experiment: Search Engine Indexing Limits: Where Do the Bots Stop?
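The page-building half of an experiment like this is easy to picture in code. The following Python sketch is only a guess at the general approach, not Serge's actual method: the function names, filler text, marker length, and interval are all made up for illustration.

```python
import random
import string


def make_marker(rng, length=12):
    """Build a made-up word that is very unlikely to exist anywhere else."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def build_test_page(total_bytes, interval_bytes, seed=0):
    """Return (page_text, markers); markers maps each made-up word to its offset.

    The text is pure ASCII, so character offsets equal byte offsets.
    """
    rng = random.Random(seed)
    filler = "lorem ipsum dolor sit amet "
    parts = []
    markers = {}
    size = 0
    next_marker_at = 0
    while size < total_bytes:
        if size >= next_marker_at:
            # Time for another marker: record where it lands in the page.
            word = make_marker(rng)
            markers[word] = size
            chunk = word + " "
            next_marker_at += interval_bytes
        else:
            chunk = filler
        parts.append(chunk)
        size += len(chunk)
    return "".join(parts), markers


def estimated_limit(markers, found_words):
    """Deepest offset among the markers that turned up in search results."""
    return max((markers[word] for word in found_words), default=0)


page, markers = build_test_page(total_bytes=5000, interval_bytes=1000)
```

Once the search engines have crawled the page, searching for each marker word and passing the ones that were found to estimated_limit gives a lower bound on how far into the page the indexer read.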
When I read the results, I worried because my blog archives are organized into monthly pages. I thought maybe they were long enough that some content wasn't getting indexed properly. Turns out, no need to worry. My longest archive page is November 2003, at 119Kb. My complete archive listing is longer, just at Yahoo's limit of 210Kb, but I prevent that page from being indexed anyway, since it is just a long list of post titles.
This has got to be one of the more unusual mash-ups: ASCII Maps, a fully-working version of Google Maps using ASCII art instead of images.
Triaging bugs is an important part of any development process. It's the simple but treacherous process of deciding what bugs should get fixed when. Simple because there's only one thing to decide: when should we fix this bug? Treacherous because it is tempting to turn the triage process into a long drawn-out affair with many people in a conference room. I've often seen triages like that, but it doesn't have to be.
Read more: Painless Bug Triaging