Sunday, September 30, 2007

The semantic web is not just today's web with special sauce

Alex Iskold writes on Read/WriteWeb that the semantic web can be achieved today, to some degree, if we create services that can do some basic semantic processing on extant web content. For example, the Spock search engine is optimised to find information about people and relationships. Iskold's basic point is that some end-user value can be derived from a semantics-driven approach without having every web site owner re-engineer their site to use RDF and OWL. While that's not false, to my mind it starts from a faulty set of assumptions.

A basic, recurring problem is that many people assume that the semantic web equals today's web, with some extra semantic goodness added on top. The WWW plus special sauce. Personally I don't think that's a helpful way to approach it. Today's web is highly optimised for human interaction. This is a good thing, but it does rather limit what we can do with machine processing to assist those human interactions: human brains are very capable of processing vague, ambiguous, sometimes noisy content that relies on social constructs to interpret. We can't do that with machine-based processing yet. Better to ask what else we can offer human users, rather than take the current interaction modalities and fiddle with them.

So if the semantic web is not about tweaking the current, human-facing World Wide Web, why is it called the semantic web at all? I guess Tim Berners-Lee is the person to answer that definitively, since it was his term originally. To my mind, it's all about applying the metaphor of the web to machine-based information processing. To explain: the web brought about a revolution in human information handling thanks to some basic design features:

  • open and distributed were foundational design assumptions
  • simple, resilient protocols that quickly became ubiquitous
  • no central point of failure or control
  • dramatically lower barriers to entry than pre-Internet publishing

There are probably others; my point is not to be definitive but to draw out some of the features that produced a democratization of information publishing. Anyone can say anything now, and potentially be heard around the planet. Is this uniformly a good thing? No, there are dark corners of the web we might wish were not there. Is this on balance a good thing? Yes. OK, so the web democratized information publishing for humans. What's the relevance to the semantic web? The metaphor is that, just as the web freed human-processed information from newspapers, books and TV shows, so the semantic web aims to free machine-processed information from databases and documents, and to do so on a massive distributed scale, with no central point of control, and so on.

Ultimately, though, we produce information systems for people to use, to satisfy some need or desire. So the value of the semantic web, of allowing machines to do some of the information processing legwork, is the extent to which it either helps people do the things they do today more effectively (cheaper, faster, easier, ...) or enables people to do things that they can't do today. The key, it seems to me, is automation. When I'm driving a car, changing from manual transmission to automatic gives me one less task to do, but doesn't fundamentally change my engagement with the task of driving. Whereas an automated highway would let me read the newspaper for part of my journey, even though I'm ostensibly the driver.

If it comes about, the semantic web could be as big a transition as the one from the pre-web world to the web. What's difficult to see, I suppose, is an obvious smooth transition from here to where we want to be. Iskold might be right that taking baby steps will keep the idea alive while we work on the hard problems in the lab, but there's a real danger that those steps dilute the vision without making any significant progress towards the underlying goal.
