Bosworth’s Web of Data

Daniel H. Steinberg covers Adam Bosworth’s MySQL Users Conference talk, entitled: Bosworth’s Web of Data.

Bosworth calls on the people present to “do for information what HTTP did for user interface.” Steinberg reports:

As a result of a simple, sloppy, standards-based, scalable platform, we have information at our fingertips from Google, Amazon, eBay, and Salesforce. Bosworth’s own company, Google, gets hundreds of millions of hard queries a day. He said they see it as putting Ph.Ds in tanks to drive through walls rather than around them.

I, for one, am in favor of sloppiness. I was having a discussion with one of my professors this evening about the Semantic Web vs. emergent semantics/microformats. We both agreed that technologies which enable mass production of data that can be consumed, scraped, and aggregated are preferable to systems which produce small amounts of tightly controlled data. You see, given any large data set, someone will figure out ways to mine it for the relevant information.

As a side note, Bosworth also touched on hardware:

In addition to the advantages in software, there have been great gains in hardware. Bosworth said that one million dollars buys you five hundred machines with 2TB of in-memory data, a PetaByte of on-disk data, and a reasonable throughput of fifty thousand requests per second. This amounts to one billion requests per day.

Anyone have an extra million they want to give me?
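
For a rough sense of what that million buys per machine, here is a back-of-the-envelope sketch in Python. It assumes the quoted totals (2TB of memory, a petabyte of disk, fifty thousand requests per second) describe the whole five-hundred-machine cluster and that decimal units apply; the variable names are mine, not Bosworth's.

```python
# Back-of-the-envelope check of the quoted cluster figures.
# Assumptions (not stated in the talk write-up): the totals describe the
# whole cluster, and decimal units are used (1 TB = 1000 GB, 1 PB = 1000 TB).

BUDGET_USD = 1_000_000
MACHINES = 500
RAM_TOTAL_GB = 2_000          # 2 TB of in-memory data
DISK_TOTAL_TB = 1_000         # 1 PB of on-disk data
REQUESTS_PER_SECOND = 50_000
SECONDS_PER_DAY = 24 * 60 * 60

# Per-machine share of each quoted total.
cost_per_machine = BUDGET_USD // MACHINES
ram_per_machine_gb = RAM_TOTAL_GB // MACHINES
disk_per_machine_tb = DISK_TOTAL_TB // MACHINES
rps_per_machine = REQUESTS_PER_SECOND // MACHINES

# Daily volume if the quoted rate were sustained around the clock.
requests_per_day_sustained = REQUESTS_PER_SECOND * SECONDS_PER_DAY

# Average rate that would produce the quoted one billion requests per day.
average_rps_for_one_billion = 1_000_000_000 / SECONDS_PER_DAY

print(f"Cost per machine:        ${cost_per_machine}")
print(f"RAM per machine:         {ram_per_machine_gb} GB")
print(f"Disk per machine:        {disk_per_machine_tb} TB")
print(f"Requests/s per machine:  {rps_per_machine}")
print(f"Requests/day, sustained: {requests_per_day_sustained}")
print(f"Avg req/s for 1e9/day:   {average_rps_for_one_billion:.0f}")
```

Under those assumptions each box works out to about $2,000, with 4 GB of memory, 2 TB of disk, and 100 requests per second. Fifty thousand requests per second sustained for a full day would come to roughly 4.3 billion requests, so the quoted one billion per day presumably reflects an average load well below peak; that reading is my guess, not something the write-up states.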

One Response to “Bosworth’s Web of Data”

  1. Daniel O'Connor Says:

    “Bosworth predicts that RSS 2.0 and Atom will be the lingua franca that will be used to consume all data from everywhere. These are simple formats that are sloppily extensible. Anyone who wants to can use these formats to consume content or to author content. Contrast this with the Semantic Web, which requires that you get a large group of people to agree on the schema of everything.”

    I know it’s true, but I don’t like it. Where’s my lovely, lovely RDF & SPARQL :(