Unwitting bloggers and witless structured blogging

Sunday 22 January 2006

Antonio wrote today about Unwitting Bloggers. He started by pointing to Bill Burnham's pompously titled A Unified Theory of Search, Social Networking, Structured Blogging, RSS and the Active Web, so I started reading there.

Bill is very excited about RSS, and its extension into Structured Blogging. He foresees a world in which everyone has a web page, and rather than post bits of content on sites like eBay or Craigslist, everyone will post structured blog entries on their own sites. Specialized search engines will find those pieces and aggregate them for people looking for movie reviews or apartment listings.

This is a nice ideal, but there are a pile of things wrong with it.

First, I don't believe that in the future everyone will have a web page the way everyone has an email address now. Having a web page is an active endeavor that not everyone is willing to undertake. Even the services that have created unwitting bloggers require some energy and participation. Second, the proliferation of those services will splinter people's digital identities rather than unify them. Yes, more and more people will have some sort of page on some sort of social site, but also more and more will have more than one page.

This proliferation may not be a bad thing, and doesn't even invalidate Bill's vision. My 13-year-old son has a number of online identities which he nurtures like a garden, and so do all of his friends. His AOL buddy list has over 200 entries, representing just a few dozen people. But with all of these sites representing different facets of a person online, there will still be a need to actively create the single page that is you. No one has an answer for that, other than that their social site is better than all the rest, so you should just use theirs.

The larger problem with structured blogging is the naive dream of the semantic web. Don't get me wrong: I would love to see this happen. But semantics are really hard. Look at the definition of the Atom Syndication Format. It defines the semantics of blog posts, something that at first blush seems pretty straightforward. But the semantics are not straightforward. It has taken many smart people a long time (RSS is roughly ten years old, the Atom mailing list goes back 2½ years) to arrive at these semantics. Even after all that work, the definitions in the RFC seem kind of vague:

4.2.9. The "atom:published" Element

The "atom:published" element is a Date construct indicating an instant in time associated with an event early in the life cycle of the entry.

Typically, atom:published will be associated with the initial creation or first availability of the resource.

"Typically"? I understand why the semantics have to be vaguely described. It's because there's a great deal of variation in the world. Blog posts, as simple and uniform as they seem, are actually quite squishy when you get them under a microscope and consider the entire range of possibilities, as the Atom authors did. I'm not criticizing Atom. I've lived through the earlier incarnations of feed specifications, and it's miles above the rest. There's actually enough information in the RFC to make systems that work together. I'm merely pointing out how difficult it is to specify semantics, and even when you do the best job you can, you're far from where the starry-eyed semantic web crowd thinks we're going to get.
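To make the squishiness concrete, here is a minimal Python sketch (my own illustration, not anything from the RFC) that reads the two date elements out of an Atom entry. The sample entry, its dates, and the entry_dates helper are all invented for the example; only the Atom namespace and the atom:published and atom:updated element names come from the spec.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The Atom namespace, in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry, made up for illustration only.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example post</title>
  <id>tag:example.org,2006:1</id>
  <published>2006-01-20T09:00:00+00:00</published>
  <updated>2006-01-22T14:30:00+00:00</updated>
</entry>"""


def entry_dates(xml_text):
    """Return the published and updated timestamps of one Atom entry.

    A missing element yields None, since atom:published is optional.
    """
    entry = ET.fromstring(xml_text)
    dates = {}
    for name in ("published", "updated"):
        text = entry.findtext(ATOM + name)
        dates[name] = datetime.fromisoformat(text) if text else None
    return dates


dates = entry_dates(ENTRY)
print("published:", dates["published"])
print("updated:  ", dates["updated"])
```

Extracting the values is the easy part. Deciding which of the two is "the date of the post", or what to show when atom:published is absent, is left to whoever consumes the feed, and that is exactly where the semantics stop helping.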

If Atom took as long as it did, and still leaves open questions of interpretation, how will more interesting data be somehow unified into Structured Blogging? I don't believe it will. I've written about this before: Semantic Web Difficulties, about how even CD descriptions are too difficult to describe just once for all audiences.

Even if the semantics could be nailed down, there is a vanishingly small fraction of blog posts that even qualify as structurable. And people are notoriously unreliable when it comes to entering data. Data in the wild is very dirty. It requires hard work to make it clean enough to use in the aggregate. Let's not even start on fraud and spam.

Maybe I'm just being cranky. Maybe the market forces will encourage uniform adoption of standards. Maybe people will realize their postings don't "work" if they do them wrong, and will do them right. But I've been hearing about the semantic web for a long time, and haven't seen much come of it yet. No one is talking about overcoming the basic difficulties; they are still just waving their hands and rhapsodizing about how wonderful it all will be.

Luckily, Antonio didn't buy into the structured blogging thing either. He's excited about unwitting bloggers, and I think he's right. People are more and more willing to have bits of themselves online. Some are more than willing; they are eager, but need help getting started with a substantial presence. I think there's lots that online services could do to turn members into unwitting bloggers. There's lots of exciting stuff coming down the pike.

Comments

Bill Burnham 12:05 PM on 23 Jan 2006

Hi Ned,

Couple quick comments:

1. I agree that there are a lot of difficulties with the Semantic Web and actually expressly avoided mentioning the semantic web in my piece for that reason. I don't think that structured blogging, which is really just people agreeing on conventions for certain kinds of XML tags, is even close to the Semantic Web vision. It's just a slightly more organized approach to tagging than what we have today. I think once the search engines start organizing content based on these tags that people will quickly adopt standards.

2. I think you are wrong about people not having a website. Perhaps they will have multiple websites, but I think it is pretty much a slam dunk that people will have personal websites given the increasing interaction that people will have with the web. I guess we will have to wait and see who is right 10 years from now :-)

Nice post overall, always good to get discussion going on things like this!

Bill

Shea 5:47 PM on 23 Jan 2006

Hi Ned!
Great post - you mention that "... there will still be a need to actively create the single page that is you" and that people "...need help getting started with a substantial presence". Do you think publishing services like blogger.com and sixapart.com will make this process easier or do you think a new platform can emerge and challenge the social networking sites?
