Wednesday, June 27, 2007

imho Ocean Semantic………….


Semantics is just a fancy word for understanding what things truly mean.

Semantic Web is the “identity of things” taken to its logical extreme. I keep coming back to that thought. In this discipline, the core question is: Do two (or more) phenomenologically distinct content instances refer, ontologically, to the same "thing" (aka "subject" or "entity")? And Semantic Web's core grammar is, of course, RDF, which is built on the notion that we can define meaningful ontological statements as consisting of discrete “subjects,” “predicates,” and “objects,” and that each of those "parts of speech" (my term) is itself a thing that can be given its own unique identity, designated with a URI, within an RDF triple. Every source or target content thing/subject can have its own identity/URI, as can every attribute/predicate-value of that thing/subject. In the process of determining semantic equivalence between two phenomenologically distinct semantic content instances (i.e., things such as customer records from separate applications or databases), our inference engines resolve them to a single thing defined under a common, shared ontology (defined in RDF/OWL). In other words, resolve (match, merge, reconcile) distinct things down to a single unique name—same semantics means, memetically, the same meaningful things, heteronymously, are tamed to assume the same names—Plus ça change, plus c'est la même chose.
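To make that resolve-to-one-name step concrete, here is a toy sketch in plain Python (no RDF library). Everything in it is an assumption for illustration: the `http://example.org/` namespace, the `name` and `email` predicates, the sample records, and above all the matching rule, which treats two subjects as the same thing when their normalized email values agree. A real inference engine working against an RDF/OWL ontology would do far more than this. The only non-invented identifier is the `owl:sameAs` URI.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the attribute or relationship
    obj: str        # URI or literal value


EX = "http://example.org/"  # hypothetical namespace, illustration only
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
NAME = EX + "ontology/name"    # hypothetical predicate
EMAIL = EX + "ontology/email"  # hypothetical predicate

# Customer records from two separate (imaginary) applications.
triples = [
    Triple(EX + "crm/customer/17", NAME, "J. Smith"),
    Triple(EX + "crm/customer/17", EMAIL, "JSmith@Example.com"),
    Triple(EX + "billing/account/9004", NAME, "John Smith"),
    Triple(EX + "billing/account/9004", EMAIL, "jsmith@example.com "),
    Triple(EX + "billing/account/9005", EMAIL, "other@example.com"),
]


def resolve(triples, key_predicate):
    """Map each subject URI to a canonical URI.

    Assumed rule: subjects whose key_predicate values match after
    trimming and lowercasing refer to the same thing. The first
    subject seen for a key becomes the canonical name.
    """
    canonical_by_key = {}
    canonical_of = {}
    for t in triples:
        if t.predicate != key_predicate:
            continue
        key = t.obj.strip().lower()
        canonical_of[t.subject] = canonical_by_key.setdefault(key, t.subject)
    return canonical_of


def merge(triples, canonical_of):
    """Rewrite subjects to their canonical URI and record owl:sameAs links."""
    merged = set()
    for t in triples:
        canonical = canonical_of.get(t.subject, t.subject)
        merged.add(Triple(canonical, t.predicate, t.obj))
        if canonical != t.subject:
            merged.add(Triple(t.subject, OWL_SAME_AS, canonical))
    return merged


canonical_of = resolve(triples, EMAIL)
merged = merge(triples, canonical_of)
```

After this runs, the billing account 9004 and the CRM customer 17 share one canonical URI, both names hang off that single subject, and an `owl:sameAs` triple keeps the old name pointing at the new one. Account 9005 stays its own thing.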

That’s the Semantic Web (it’s also the core function of the data quality space, in which, near as I can tell, the only vendor doing Semantic Web at this moment is Silver Creek Systems). Now, here’s something I wrote in this blog on February 10, 2005, in a different context (referencing ID dataweb architectures built on XRI, a URI-based identification scheme):

“[W]hat the heck does the ‘identity of things’ refer to? On one level, it sounds like some metaphysical plane of existence, some mythical spirit world, some platonic ideal, like the ‘secret life of plants’ or the ‘lifestyles of the rich and famous.’ Like animism: the identities/souls of the inanimate starstuff from which we’re all, magically, composed….

broad scope of the term, in terms of concrete, real-world, commercial technical approaches, such as IP addressing, RFID, and ID dataweb. …..

That’s one of the big problems with the ‘identity of things.’ There are just too many ‘things’ in the universe. Try giving every star in the sky its own unique name, including the billions upon billions embedded in galaxies, and don’t forget to give each of the countless galaxies their own unique names. After identifying every discrete point of light uniquely, now try storing and managing all those names (plus the associated descriptive attributes of each star) in some master directory database in the sky. Clearly, the directory itself would have sufficiently massive gravitation to form its own black hole, sucking all of the named ‘objects’ in the universe down into some freaky meta-universe, never to be heard from again. ….

ID dataweb—aka federated resource sharing environments built on emerging Web services standards, especially Extensible Resource Identifier (XRI) and XRI Data Interchange (XDI). …

ID dataweb (actually, there are many synonyms for this emerging space—I’m partial to ‘federated resource sharing’) is an approach under which every data element in every database can conceivably be given a unique, fine-grained identifier—thanks to XRI, which is backward-compatible with the URI/URN naming scheme that has achieved ubiquity on the Web……the World Wide Web was built on the ‘identity of things’ (aka pages, scripts, etc.), leveraging URI, DNS, and IP. …

ID dataweb is an environment within which autonomous data domains can choose to selectively grant fine-grained data-access rights to external parties—and unilaterally rescind those rights. It leverages the identity federation and trust infrastructure being implemented everywhere through open standards such as WS-Security, SAML, Liberty Alliance, and others. It’s a standards-based flexible way of securely setting up and managing as-needed data-integration connections between autonomous organizations. Such as manufacturers, suppliers, distributors, and other participants in a supply chain. Or financial services firms engaging in dynamic partnering on equities underwritings. And so forth. Data integration/exchange/transfer is one of the principal tasks in any B2B collaborative-commerce partnering…..

Here’s an issue that the ID dataweb community must grapple with: As organizations expose/share/protect more of their fine-grained data resources through XRI/XDI, how are they going to manage the massive databases underlying the humongous ‘directories of things’ that result?”

Maybe we should call it “thing-centric identity,” to adapt a phrase from my just-previous multi-month multi-post meditation. Is an RDF triple store the nucleus of that "directory of things"? How big will triple stores need to grow to encompass the universe of semantic things? At some point, will these stores grow so large as to mash it all gravitationally, resolve it all ontologically, down into a semantically massive and mighty thingularity?

More to come.