
Summary:

The focus on the semantic web was fun, but it ultimately missed the big picture: people care less about knowledge graphs than about the people and current events in their social graphs.


The semantic web is the vision of a web of interconnected data and meaning. This global web of knowledge would be something computers could understand and therefore provide us with a new frontier of information retrieval and intelligent agents.

After two decades of failed attempts, “semantic web” has become a dirty word with investors and consumers. So what exactly went wrong? Why are we still so far away from the web of data? Here’s my take on it.

The web of Obsoledge

Most attempts at creating a knowledge repository have involved converting “expert knowledge” into a web of data. The result is an inherently boring web of data. Google’s Knowledge Graph promotional video is a great example of how boring this web can be. “Let’s say you’re searching for Renaissance Painters”…. Really? Who searches for that?

More accessible technology is causing an explosion of information. This makes the shelf-life of knowledge shorter and shorter. In his seminal book Revolutionary Wealth, Alvin Toffler coined the term Obsoledge to refer to this growing body of obsolete knowledge.

If we want to create a web of data we need to expand our definition of knowledge to go beyond obsolete knowledge and geeky factoids. I really don’t care what Leonardo da Vinci’s height was or which Nobel prize winners were born before 1945. I care about how other people feel about last night’s Breaking Bad series finale. How did they find the ending? What other series or movies might I enjoy based on those experiences?

We are living in the Now. The Now is eating ever greater quantities of our attention. It’s drowning out the obsolete past. Human attention, sentiment and emotion are key elements to today’s information age. They cannot be ignored. They need to be at the very core of any web of data.

Documents are dead

Deriving structured information from Wikipedia documents – a common practice – is fundamentally flawed. Not only does this create a web of boring facts, it somehow assumes that documents are the source of knowledge. They’re not. They are only a small sliver of the stuff that matters. It’s the underlying conversation and activity that matter.

There is a sea change happening in the web and how we use it. It’s an evolution to a second phase of the web – the real-time web, or what I call the “Stream.” In the Stream, the focus is on messages, not web pages. These vast amounts of messages are generated by social interaction, by conversation, by attention, by ideas, by little chunks of thought unleashed into a gigantic stream of data.

This also changes the way machines communicate with each other. Machines are still programmed by humans, and humans – especially programmers – are going to be lazy. They will use the easiest, most pragmatic way to get machines to communicate. They aren’t going to spend days learning complicated RDF or OWL specs. They will use simple communication using JSON. And all the cool kids have abandoned XML.
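
To make the contrast concrete, here is a minimal sketch in Python (standard library only). It shows a message the way a pragmatic programmer would ship it, as a flat JSON object, next to the subject-predicate-object triples a web of data would want. The field names and the `to_triples` helper are hypothetical, made up for illustration; this is not how any particular RDF toolkit works.

```python
import json

# Hypothetical message as a stream might carry it; every field name is an assumption.
message = {
    "id": "msg-1",
    "author": "alice",
    "topic": "Breaking Bad finale",
    "sentiment": "positive",
}


def to_triples(msg, id_key="id"):
    """Flatten a flat JSON object into (subject, predicate, object) triples."""
    subject = msg[id_key]
    return [(subject, key, value) for key, value in msg.items() if key != id_key]


# What programmers actually send over the wire.
print(json.dumps(message))

# The same information, restated as triples.
for triple in to_triples(message):
    print(triple)
```

The information is identical in both forms. The difference is how much the programmer has to learn before sending the first message.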

Information should be pushed, not pulled

One less obvious problem is one of information retrieval. For the past two decades we’ve gotten so used to keyword search that Google became an actual verb. Unfortunately, keyword search is now fundamentally broken. The more information is out there, the worse keyword search performs.

Advanced query systems like Facebook’s Graph Search or Wolfram Alpha are only marginally better than keyword search. Even conversation engines like Siri have a fundamental problem. No one knows what questions to ask.

We need a web in which information (both questions and answers) finds you based on how your attention, emotions and thinking interconnect with the rest of the world.
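
As a rough illustration of push versus pull, here is a toy in-memory stream in Python. The `Stream` class, its methods, and the topic names are all assumptions made for this sketch, and explicit topic subscriptions stand in for the attention and emotion signals that a real system would have to infer.

```python
from collections import defaultdict


class Stream:
    """Toy in-memory stream: interests are registered once, then messages find them."""

    def __init__(self):
        # Maps a topic name to the list of inboxes interested in it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, inbox):
        """Register an inbox (a plain list) as interested in a topic."""
        self._subscribers[topic].append(inbox)

    def publish(self, topic, text):
        """Push a message to every interested inbox; return the delivery count."""
        inboxes = self._subscribers.get(topic, [])
        for inbox in inboxes:
            inbox.append(text)
        return len(inboxes)


stream = Stream()
alice_inbox = []
stream.subscribe("breaking-bad", alice_inbox)

# Alice never issues a query; matching messages simply arrive.
stream.publish("breaking-bad", "What did you think of the finale?")
stream.publish("renaissance-painters", "A fact about Leonardo da Vinci")

print(alice_inbox)
```

Nobody had to know what question to ask. The hard part, which this sketch skips entirely, is working out the interests without asking for them.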

Meet the synaptic web

Keyword search is broken and we’re drowning in an unstoppable stream of information. The need for a next generation of information retrieval is now higher than ever. Is the semantic web going to be that next paradigm? I don’t think so. Not unless we radically revisit what a ‘web of data’ means.

It’s time to ditch the old paradigms of documents, knowledge and keyword search. We live in a world of big data, real-time streams and human emotions. It’s time for a revolution in information retrieval. We need a web that’s dynamic and centered around humans. A web in which data flows in a smarter way. A web that understands you and makes the proper data find you. This web doesn’t look like a database or a graph. It’s a web that’s intelligent, dynamic and sometimes chaotic. It’s the digital equivalent of the human brain. I call it the Synaptic Web.

Dominiek ter Heide is the CTO and co-founder of Bottlenose, which combines big data technologies with specialized data mining to make sense out of streams.

  1. > We need a web in which information (both questions and answers) finds you based on how your attention, emotions and thinking interconnect with the rest of the world.

    This requires pervasive surveillance of attention, emotions and thinking.

    Does pervasive observation change the observed human’s identity/goals/behavior?

  2. We want obsoledge, and we want interactions. I am not sure we want the web to care about what is good for us. Personally, I don’t!

  3. If you mean the semantic web as in RDF/linked data, it has imho failed because of reliance on human editors. We need semantic search to enable a “true” semantic web, i.e. going from unstructured documents to some form of structured representation (be it RDF or any other fancy format). And documents are dead? They’ve changed, what you call messages equals documents, imo :). And finally, extracting and mining “boring” Wikipedia-style facts makes a lot of sense in plenty of fields/use-cases.

  4. Love this article. To my mind (and yours) “knowledge” is the contextualization of ideas that may relate to each other in a myriad of ways. Human intelligence learns by contextualizing what it observes. “This is a lot like that, therefore I can predict the behavior of that”–I may never have seen a brick flying towards my head but I know to duck because I’ve seen other objects flying toward my head. Search is an algorithmic contextualization of information that presents a list of items approximately related to each other–it is implicit contextualization. But in a world where both structured and unstructured information can be generated by any app, any person and any “thing”, implicit web search begins to look pretty lame. The problem goes well beyond disambiguation. There are simply too many possible contexts for any idea. My view of the “Synaptic Web” mirrors how I imagine the brain to work, millions (billions, trillions) of contextual streams, with each stream defined by a highly specific context that is explicitly defined (not implicitly defined by natural language processing). A given stream may have structure (“eco-friendly homes for sale in Brooklyn”) or no structure. Another stream may have real-time processing rules that may filter and route information from that stream to many tributary streams defined in even more granular ways. The innumerable streams of this “Synaptic Web” can each be individually shared by thousands of apps and people, with permissions for each stream that define who can add information to a given stream and who may simply read that stream. He who creates a stream defines such permission rules, its purpose, its data structure, its audience–the more open, the better. This Synaptic Web is a real-time “data exchange” that allows the contextual “flow” of information with the ability to trigger actions when certain pieces of information are identified in particular streams (e.g. when it’s time to “duck”).
The Web, algorithmic search and activity streams are all training wheels for this new form of contextualization. My company, Flow, has been building this architecture for almost three years, with initial products available at Flow.net and iFlow.com. But what we want to do now is to open up our “Synaptic Web” to individuals, developers, researchers, businesses, internet of things…information streams defined and shared by anyone for any purpose. I’m looking forward to hearing from those intrigued by this new model of knowledge.

    1. will there be paragraphs in the synaptic web?

  5. “Really? Who searches for that?”

    People who aren’t social media crack addicts, like every online journalist under the sun.

    1. The author of this article isn’t a journalist, but the CTO of a company that has a vested interest in making money from people who aren’t all that into ‘knowledge’. Because that group of people is easily the majority of people on the planet, and also the easiest to convince to part with their money, it is a sound business strategy.

  6. I’d like to disagree with most of the article. Your argument “the Semantic Web has failed” does not follow from your “reasons”.
    Sure, I’m pretty familiar with the Semantic Web and able to understand RDF (really, it’s not impossible to understand) and (most of) OWL, but that is not why I think a Synaptic Web can live next to a Semantic Web. To start: wouldn’t it be great for your streaming web interpreters to be presented with structured information next to unstructured text? Let it live on top of the Semantic Web (and the rest of the Web).

    Do you want to exclude facts from knowledge? I, too, couldn’t care less about Leonardo da Vinci’s height, but if I see the Mona Lisa in Paris, I might want to know what else he painted and did and where I can see that. You need boring facts for that. Boring, but useful facts.
    For human consumption “messages” are only part of knowledge. Take science for example. Science doesn’t only live in conversation; loads of scientific knowledge is transferred in documents.

    The Semantic Web doesn’t depend on XML. Or JSON – although JSON-LD is gaining lots of ground. Human end users shouldn’t need to see raw facts in any text format, only developers. Turtle is the easiest to read and write by hand, I think, but eventually programmers will do that just as rarely as they read and write JSON.

    We’re still a long way from having phones that measure brain activity to decipher our thoughts before they become pieces of knowledge consisting of concepts and, err, facts about things we do, want, and feel. In light of my privacy, I’d like my phone to not push my thoughts and activities to the Synaptic Web. It could ask specific questions to the Web that I would like answered, but those questions are likely to be based around concepts, time and place (“what museums are open around here tomorrow?”). That almost works and looks like keyword search.

    I like the vision of a Synaptic Web (I heard the term for the first time today), but to call the Semantic Web failed because people actually want a Synaptic Web was not proven today.

  7. Well, one look at Schema.org and I’d have to say the author here is just another buzzword hack. Synaptic Web – come on! Let’s see you move some TVs, Events or Deals with that. Just because Nova couldn’t make it pay doesn’t mean the rest of us are doomed!

  8. Ben’s right on the nose here!

  9. I agree with your comments on Lazy Consumption and Curation Latency, but mining social for more than relationships and interaction is basically worthless because you get the sentiment without the underlying bias that drives that sentiment…in essence you *are* judging the book by its cover and that is not just fuzzy data, it is dangerous data.

  10. “Information should be pushed, not pulled”: The idea of “push” over the web was a big buzzword in the late nineties, and it failed. People want to retrieve the information that they want to retrieve; they don’t want vendors like you deciding what to send them, despite any new buzz phrases you make up to claim that your system knows what the end users really want. (Judging from the date on http://synaptify.com/?p=613680, it looks like your buzz phrase has had three years to catch on, and it obviously hasn’t.)

    1. “pushing” information to me could be effective only if the pusher knew MUCH more about me than I’m comfortable with.

