Monday, November 06, 2006

Why go Semantic?

A recent post on the mindswap weblog (6th November 2006) raised the need to discuss and clarify what semantic web technologies bring that is new to the field of data analysis, and why we should not simply remain with the relational model.
I think they have a very good point. Why go semantic when we can stay with the relational model? After all, anyone who matters already knows the relational model, so why learn RDF or OWL, right?
No. I think what makes the semantic web such an amazing, breakthrough technology is its intuitiveness. Anyone can understand and visualize nodes connected to other nodes that together make up a whole, whereas having to memorize tables and how they connect, primary and foreign keys, etc., is a bit more cumbersome.
The semantic web will make data resources accessible to more people, and to the people who matter: the ones generating the data. I think what defines a technology is the ratio between its usefulness and the amount of computational support needed.
The semantic web will win, I think, mostly for sociological reasons: if the biologist is the data modeler, and knows that the tools to analyse the data can be called by complying with a particular ontology (this is my definition of ontology-driven data analysis), then that becomes an incentive to use such an ontology. Once both databases and algorithms that depend on ontologies are widespread, changes in the ontology will not necessarily affect the flow of analysis, whereas schema changes in relational databases can.
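To make the idea of calling a tool "by complying with an ontology" a bit more concrete, here is a minimal sketch in plain Python, with no RDF library. Everything in it is an assumption made for illustration: the terms ex:Sample, ex:expressionLevel and ex:tissue, the sample data, and the function names are all hypothetical. The analysis function depends only on two ontology terms, so extending the model with a new property leaves it untouched.

```python
# Hypothetical data as (subject, predicate, object) triples.
# All terms (ex:Sample, ex:expressionLevel, ex:tissue) are illustrative only.
triples = [
    ("ex:s1", "rdf:type", "ex:Sample"),
    ("ex:s1", "ex:expressionLevel", 2.0),
    ("ex:s2", "rdf:type", "ex:Sample"),
    ("ex:s2", "ex:expressionLevel", 4.0),
]


def instances_of(graph, cls):
    """Return every subject declared to be of the given class."""
    return [s for s, p, o in graph if p == "rdf:type" and o == cls]


def values_of(graph, subject, predicate):
    """Return every object attached to subject through predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def mean_expression(graph):
    """Analysis tool that depends only on ex:Sample and ex:expressionLevel."""
    levels = []
    for sample in instances_of(graph, "ex:Sample"):
        levels.extend(values_of(graph, sample, "ex:expressionLevel"))
    return sum(levels) / len(levels)


before = mean_expression(triples)

# The data modeler adds a new property; there is no table to alter.
extended = triples + [
    ("ex:s1", "ex:tissue", "liver"),
    ("ex:s2", "ex:tissue", "kidney"),
]
after = mean_expression(extended)
```

The tool gives the same result before and after the extension because it never looks at terms it does not know about. A relational equivalent could be written just as defensively, so this only illustrates the argument rather than proving it.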

Saturday, November 04, 2006

Ontology-Driven Data Analysis

Many researchers have come to realize that ontologies will definitely bridge the gap between databases and analysis algorithms. But how can that be done at a level of abstraction that is useful regardless of the data structure? Data analysis tools often become obsolete and need to be adapted as new significant parameters emerge from data collection. Ontologies are already being used as effective tools for integrating databases with data mining tools to derive knowledge. But this often happens at the large scale of data warehouses, where the bench biologists trying to derive conclusions from their data have little say in how the data analysis is conducted.
Ontology-driven analysis tools should be flexible enough to accept new parameters that may, or may not, improve the probability that the conclusions hold.
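One way such flexibility could look, as a rough sketch under stated assumptions: the tool asks the ontology which parameters exist instead of hard-coding them. The terms ex:measuredParameter, ex:expressionLevel and ex:cellCount, and the function names, are hypothetical and invented for this example. Only the rdfs:subPropertyOf relation is borrowed from RDF Schema, and it is used here as a plain string, not through any real library.

```python
# Hypothetical ontology: each parameter is declared as a sub-property
# of an assumed umbrella term, ex:measuredParameter.
ontology = [
    ("ex:expressionLevel", "rdfs:subPropertyOf", "ex:measuredParameter"),
]

# Hypothetical bench data, again as (subject, predicate, object) triples.
data = [
    ("ex:s1", "ex:expressionLevel", 2.0),
    ("ex:s1", "ex:cellCount", 120),
    ("ex:s2", "ex:expressionLevel", 4.0),
    ("ex:s2", "ex:cellCount", 95),
]


def declared_parameters(onto):
    """List the parameters the ontology currently declares, sorted by name."""
    return sorted(
        s for s, p, o in onto
        if p == "rdfs:subPropertyOf" and o == "ex:measuredParameter"
    )


def feature_table(onto, graph):
    """Build one row per subject, using only the declared parameters."""
    params = declared_parameters(onto)
    rows = {}
    for s, p, o in graph:
        if p in params:
            rows.setdefault(s, {})[p] = o
    return params, rows


params_v1, rows_v1 = feature_table(ontology, data)

# The biologist declares a new parameter; the tool itself is not edited.
ontology_v2 = ontology + [
    ("ex:cellCount", "rdfs:subPropertyOf", "ex:measuredParameter"),
]
params_v2, rows_v2 = feature_table(ontology_v2, data)
```

Here the new parameter enters the analysis through a single added statement in the ontology. Whether it actually improves the conclusions is a separate question that the downstream analysis would have to answer.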