IBM’s Watson and Analytics: Less Than It Seems, Maybe More Than It Will Seem

Updated: February 10, 2011

Deep Analysis of Deep Analysis

First, let's pierce through the hype to understand what, from my viewpoint, Watson is doing. It appears that Watson is building on top of a huge amount of "domain knowledge" amassed in the past at such research centers as GTE Labs, plus the enormous amount of text that the Internet has placed in the public domain - that's its data. On top of that data, it layers well-established natural-language processing, AI (both rules-based and machine-learning-based), querying, and analytics capabilities, with its own "special sauce" being the fine-tuning of these for a Jeopardy-type answer-question interaction. Note that sometimes Watson must combine two or more different knowledge domains in order to supply its question (in Jeopardy, the response is phrased as a question): "We call the first version of this an abacus (history). What is a calculator (electronics)?"
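To make that cross-domain, evidence-combining pattern concrete, here is a minimal sketch in Python. It is emphatically not IBM's DeepQA implementation; every domain name, candidate, score, and threshold in it is a hypothetical placeholder. It only illustrates the general shape the paragraph describes: merge per-domain confidence scores for each candidate response and answer only when the combined confidence clears a threshold.

    # A toy sketch, NOT IBM's DeepQA: merge per-domain evidence scores for each
    # candidate response and respond only when combined confidence is high enough.
    # Every domain name, candidate, score, and threshold here is hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Evidence:
        domain: str      # e.g. "history" or "electronics"
        candidate: str   # proposed response
        score: float     # how strongly this domain supports the candidate, 0..1

    def combine(evidence, threshold=0.7):
        """Merge per-domain scores per candidate; respond only if confident."""
        totals = {}
        for ev in evidence:
            # Treat domains as independent evidence: 1 - product of (1 - score_i).
            prev = totals.get(ev.candidate, 0.0)
            totals[ev.candidate] = 1.0 - (1.0 - prev) * (1.0 - ev.score)
        best = max(totals, key=totals.get)
        if totals[best] < threshold:
            return None, totals[best]      # stay silent, i.e. don't buzz in
        return "What is a %s?" % best, totals[best]

    # Two domains each weakly support "calculator"; neither alone clears the
    # threshold, but combined they do - the cross-domain behavior noted above.
    clue_evidence = [
        Evidence("history",     "calculator", 0.55),
        Evidence("electronics", "calculator", 0.60),
        Evidence("history",     "abacus",     0.40),
    ]
    print(combine(clue_evidence))          # ('What is a calculator?', ~0.82)

The only point of the toy scoring is that several weak, independent pieces of evidence can add up to one confident answer; the real system's candidate generation and scoring are of course far more elaborate.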

Nothing in this design suggests that Watson has made a giant leap in AI (or natural-language processing, or analytics). For 40 years and more, researchers have been building up AI rules, domains, natural-language translators, and learning algorithms - but progress towards passing a true Turing test, in which the human participant can never tell that a computer is on the other side of the interaction, has been achingly slow. All that the Jeopardy challenge shows is that the computer can now provide short answers to a particular type of tricky question - using beyond-human amounts of data and processing parallelism.

Nor should we expect this situation to change soon. The key and fundamental insight of AI is that when faced with a shallow layer of knowledge above a vast sea of ignorance, the most effective learning strategy is to make mistakes and adjust your model accordingly. As a result, brute-force computations without good models don't get you to intelligence, models that attempt to approximate human learning fall far short of reality, and models that try to invent a new way of learning have turned out to be very inefficient. To get as far as it does, Watson draws on 40 years of mistake-driven improvements in all three approaches, which suggests that it will take many years of further improvements - not just letting the present approach "learn" more - before we can seriously compare human and computer intelligence apples to apples.
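For readers who want the mistake-driven loop spelled out - predict, compare with the truth, adjust only when wrong - here is a textbook perceptron update in Python. It is standard classroom material, not anything specific to Watson, and the sample data is made up.

    # Mistake-driven learning in miniature: predict, compare with the label,
    # and adjust the model only on errors (a standard perceptron update).

    def train_perceptron(samples, epochs=10, lr=1.0):
        """samples: list of (feature_vector, label) pairs with label in {-1, +1}."""
        n = len(samples[0][0])
        w, b = [0.0] * n, 0.0
        for _ in range(epochs):
            for x, y in samples:
                predicted = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
                if predicted != y:                      # a mistake...
                    w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                    b += lr * y                         # ...drives the adjustment
        return w, b

    # Tiny, made-up, linearly separable data: learn "first feature > second".
    data = [([2.0, 1.0], 1), ([1.0, 3.0], -1), ([4.0, 0.5], 1), ([0.2, 2.0], -1)]
    print(train_perceptron(data))

Everything the paragraph says about the limits of this approach still applies: the loop adjusts a model it was handed; it does not invent a better way of learning.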

The next point is that Jeopardy is all about text data: not numbers, yes, but not video, audio, or graphics (so-called "unstructured" data), either. The amount of text on Web sites is enormous, but it's dwarfed by the amount of other data from our senses inside and outside the business, and in our heads. In fact, even in the "semi-structured data" category to which Watson's Jeopardy data belongs, other types of information such as e-mails, text messages, and perhaps spreadsheets are now comparable in amount - although Watson could, to some extent, be extended to these with little additional effort. In any case, the name of the game in BI/analytics these days is to tap into not only the text on Facebook and Twitter, but also the information inherent in the videos and pictures provided via Facebook, GPS locators, and cell phones. As a result, Watson is still a ways away from providing good unstructured "context" to analytics - rendering it far less useful for BI/analytics. And bear in mind that analysis of visual information in AI, as evidenced in such areas as robotics, is still in its infancy, used primarily in small doses to direct an individual robot.

As noted above, I see the immediate value of Watson's capabilities to the large enterprise (although I suppose the cloud can make it available to the SMB as well) as lying more in the area of cross-domain correlation across existing text databases, including archived e-mails. There, Watson could be used in historical and legal querying to do preliminary context analysis - to avoid, say, having eDiscovery take every reference to nuking one's competitors as a terrorist threat. Ex post facto analysis of help-desk interactions (one example that IBM cites) may improve understanding of what the caller wants, but Watson will likely do nothing for user irritation at language or dialect barriers from offshoring, and it may even encourage the kind of "interaction speedup" that the most recent Sloan Management Review suggests actually loses customers.
