Linked Data - Part 2

Last time we defined our basic data structure, which is very similar to RDF, with the exception that nodes may contain byte-streams. Another main difference between RDF/RDFS and our model is that we will not provide hard-coded special relations in extension ontologies to give the data more structure (for easier knowledge querying). We do have special knowledge, but in our case this knowledge is used to infer information dynamically instead of hard-coding it in a standardized ontology. Hopefully I manage to get this idea across on this page by giving examples of how this is supposed to work.

But first things first. Let's bash on Linked Data a little bit more and be a total hypocrite by whining about existing technology, without offering any alternatives :).

Linked Data - The cracks are showing

Standardizing linked data architectures and communication protocols is a good idea, mandatory even, if we want to create queries that access multiple linked databases. Forcing people to use standardized knowledge in some W3C-managed ontology is definitely NOT a good idea. W3C is slow to adopt new technology (e.g. Canvas, WebGL, or Flash-like layered vector animations), which creates exactly the situation that W3C was created to prevent: people will create their own standards because new technology is fancy and looks good in reviews. Developers can't trust W3C to adopt new technology quickly enough to justify a wait-and-see attitude. But this is not even the main reason why I think it is a bad idea!

Let's say that I want to create a knowledge database. What is my motivation for doing so? Probably that I want to store knowledge and make it easy for that knowledge to be queried (just as with a relational database). As an additional bonus, it will be possible to link my knowledge database to existing knowledge databases, so queries will return a lot of interesting information that would have been too tedious to add to our own. Another big feature that may convince me to use a knowledge database is the ease and freedom of adding information without having to worry about adding columns to a table. So who are the people that create knowledge databases? Currently just the big companies, and hopefully all companies and institutions in the near future. Perhaps in some distant future even people like you and me will be extending knowledge databases (a man can dream).

An example

So let's say I'm a hypothetical distributor of meat and want to store products, locations, routes and vehicles in my knowledge base. My motivation is purely out of self-interest, but because I'm living in a Utopian world we'll share this knowledge with the world.

Step 1 We have a product X, which is 90% beef and 10% horse. How do we store this information? Well, if we want to be able to use those percentages in queries, maybe we need to search for an existing ontology that has knowledge about percentages. If it is a standardized ontology, hopefully the query parser is able to handle numerical comparisons. So which company is the leading expert on percentages and shares its ontology with the world? Hmmm, forget it, let's just continue...
Step 2 We have a vehicle, and the driver notified us that the window is broken. How do we store this knowledge? First let's store the vehicle, so we'll need a (globally) unique identifier for it, perhaps the number plate (seems reasonable enough). Do we need to create identifiers for 'all' objects? Can't we just use auto-numbering like in the good old times? It's a Hyundai, so let's link this object to some other knowledge database. Which database to use, Wikipedia? Hmm, they do not have the specific Hyundai car that we are using. Perhaps our car dealer has an ontology; well, at least Hyundai should have one. Too bad, they don't share their data in RDF format. You know what? Forget it! Let's just store the knowledge of the broken window so we can query "Give a list of all vehicles with problems". We can't expect to order the result by problem severity, because that would be too difficult for our little RDF data structure. Let's just store the relation "Vehicle5 HasProblem BrokenWindow". Currently I'm feeling a lot of pain for not using a normal relational database, but maybe things will get better. Let's store the knowledge that vehicles with broken windows can be fixed by visiting company 'CarGlassX'. Obviously we want our application to automatically show solutions to existing problems. Darn, we can't! We could create complex queries that show todo-items like "Vehicle5 NeedsToBeBroughtTo CarGlassX", but wasn't the whole idea of knowledge databases that this information could be inferred? Perhaps there exists some W3C ontology that contains standardized elements like 'Problem' and 'Solution' so this info can be coupled in a query-lan... STOP
Step 3 What about the routes our vehicles use? We are currently using planner software to figure out the optimal routes. Are you aware that our planning problem is NP-complete (a fancy way of saying that computers have a really hard time calculating optimal routes)? Every day the routes will be different, so storing fixed routes as knowledge is not possible (or at least stupid). Information and computation are intrinsically linked. We could convert the daily route information to RDF form and store it in our knowledge base, but do we really have a reason to do this?
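
To make the pain from steps 1 and 2 a bit more concrete, here is a minimal sketch of what we are asking for, written as a toy in-memory triple store in Python. This is not how any real RDF store or query language works; the predicates `PercentBeef`, `PercentHorse` and `FixedBy` and all function names are made up for this illustration (only `HasProblem`, `NeedsToBeBroughtTo`, `Vehicle5`, `BrokenWindow` and `CarGlassX` come from the text above). Note that both the numeric comparison and the "problem implies solution" rule have to be written by hand in the host language.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# PercentBeef, PercentHorse and FixedBy are hypothetical predicates.
triples = {
    ("ProductX", "PercentBeef", 90),
    ("ProductX", "PercentHorse", 10),
    ("Vehicle5", "HasProblem", "BrokenWindow"),
    ("BrokenWindow", "FixedBy", "CarGlassX"),
}


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def subjects_with_at_least(store, predicate, threshold):
    """Step 1: the numeric comparison is done by Python, not by the store."""
    return sorted(s for s, _, o in match(store, p=predicate)
                  if isinstance(o, (int, float)) and o >= threshold)


def infer_todos(store):
    """Step 2: one hand-written rule, applied once.

    (v HasProblem x) and (x FixedBy c)  =>  (v NeedsToBeBroughtTo c)
    """
    inferred = set()
    for vehicle, _, problem in match(store, p="HasProblem"):
        for _, _, company in match(store, s=problem, p="FixedBy"):
            inferred.add((vehicle, "NeedsToBeBroughtTo", company))
    return inferred


print(subjects_with_at_least(triples, "PercentBeef", 50))
print(infer_todos(triples))
```

The rule in `infer_todos` is exactly the kind of knowledge we would like to store next to the data itself, instead of burying it in application code or waiting for a standardized 'Problem'/'Solution' ontology.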

By now I'm feeling pretty stupid for ever trying to store our company's knowledge as RDF. The pain I'm feeling is not at all eased by knowing that the knowledge we have stored complies with W3C standards. All I was doing was searching for existing ontologies to link our data to and being annoyed by obvious limitations, and for what? Using a knowledge database was not really advantageous to our company. What is my incentive to extend this library of human knowledge? Is my only reason this Utopian dream of creating a Star Trek-like computer? All in all it was a horrendous experience and definitely not worth my time.

Conclusions

  • The people that we want to use Linked Data are mainly institutions and companies, which usually act out of self-interest. They should have reasons to use Linked Data even if they do not link their private knowledge database with the rest of the world (otherwise it will never become widespread).
  • A big problem is querying unstructured data. If we want to query structured data, then we're most likely better off using a relational database.
  • We can't rely on some consortium that prescribes the standardized way of doing things. Knowledge is inherently messy. If some part of the knowledge is nice and structured, then we'll probably see a consensus arise by itself. It should be a free market.
  • Although knowledge is inherently messy and unstructured, the author of the knowledge base should be able to impose some form of structure on his data. This user-defined structure should be enforced by the model and may be used in generating intuitive queries.
  • Computation may lead to new information. For example: the simple math function sin(x) may be stored as computational knowledge, which yields an infinite amount of inferred knowledge (the outcome for each possible x).
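
A rough sketch of that last point, assuming a hypothetical store where a predicate can be backed either by stored facts or by a function (the names `facts`, `computed` and `lookup` are invented for this example and are not part of any model defined so far):

```python
import math

# Explicitly stored knowledge: (subject, predicate) -> object.
facts = {("Vehicle5", "HasProblem"): "BrokenWindow"}

# Computational knowledge: predicate -> function of the subject.
# One entry stands in for an unbounded number of (x, sin, y) facts.
computed = {"sin": math.sin}


def lookup(subject, predicate):
    """Return a stored fact if there is one, otherwise compute it on demand."""
    if (subject, predicate) in facts:
        return facts[(subject, predicate)]
    if predicate in computed:
        return computed[predicate](subject)
    return None  # nothing known about this pair


print(lookup("Vehicle5", "HasProblem"))
print(lookup(math.pi / 2, "sin"))
```

From the point of view of whoever asks the question, a stored fact and a computed one look the same, which is the whole idea.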
  Somehow I am soooo hungry! Let's finish this at some later time.   *afk, food
