Thursday, February 17, 2011
Thoughts on Watson: the Jeopardy robot Part III
OK. Watson won. He won kinda big time, and I was gonna post a picture of a chain-link-fence-holding Sarah Connor engulfed in flames on Judgment Day, but it was sort of gruesome and anyway I don't really feel that way about this particular robot triumph. That said, burying an arms cache in the Mexican desert and maybe learning a few tips on how to outsmart logic-beholden machines via a re-read of Asimov's I, Robot might not be a terrible idea. (No, watching the movie won't help.)
Before we get into the gloomy excitement of computers getting really really smart all of a sudden, let's quickly discuss the saving grace for humanity here, which is: Ken Jennings managed a (reasonably funny and certainly appropriate) Simpsons reference within his Final Jeopardy answer!
We're still good at something! Being funny! Take that, machines!
In case you're not a Simpsons nerd, see the classic Kent Brockman clip below.
Maybe IBM's next challenge should be to develop a robot that wins Last Comic Standing. No, seriously, give it a shot, IBM. It's even OK if it uses props, or ventriloquism, or a redneck-y catch phrase, but I think your best bet is to have it write 45 minutes about the quirks of being robotic, then stock the audience with robots. But perhaps I digress.
Back to the show. What can I say except to ask more questions about how Watson works? I find this stuff sort of endlessly fascinating.
Why does it seem like Watson's so much better at Double Jeopardy than Single?
Does Watson benefit from any sort of momentum or confidence factor after a series of correct answers (like a human would)? (Presumably not...so, follow-up question: Would that be an inherently bad thing to build into a computer's programming? Discuss.)
How did Watson put together Moldavia and Wallachia to get Bram Stoker but not get the Chicago airport question?
Let me jump into that one.
As a reasonably intelligent human being, I gauged both of these (final jeopardy) questions as fairly easy, although I would agree that the Chicago one was a little tougher. Although the inherent difficulty of any question is debatable and of course skewed by whether someone knows or does not know the answer, I feel my opinion is at least somewhat founded in objective analysis and also supported anecdotally by the fact that both human players got both questions correct.
I'm being presumptuous about how Watson thinks through these questions, but what the hell (bbq). To get to last night's Final Jeopardy solution (clue paraphrase: "So and so's published anthropological survey of Moldavia and Wallachia was the inspiration for this author's most famous work"), I feel Watson must have had to throw "Moldavia" and "Wallachia" into the gears and realize (quickly) they were a reference to Romania. (Wikipedia informs me that these comprise the north and south historical regions of what is now modern-day Romania.) From there the path gets a little murky, though. The category was "19th Century Authors". So now Watson must filter through a list of authors with whom Romania is associated? Or does he, in his geographical search, come up with the keyword "Transylvania"...which, when filtered through the category title, yields "Bram Stoker", author of Dracula? Perhaps "Dracula" has to occur first to Watson, but I sort of doubt it. This all seems pretty reasonable, except if this type of database list generation and subsequent list cross-referencing is how Watson arrives at answers...
then why not nail that Chicago airport one?
Pull up a list of major airports, filter by US cities (or don't, even) and cross-reference with cities with 2 airports. Even without the US filter (or if the US/American filter includes all of the cities in the Americas) this ought to yield a fairly short list. Then filter by historical names, and see which are associated with WWII. As I mentioned, Toronto's airport is Lester Pearson International (YYZ), but he's got no direct association with WWII (that I can see). A good but incorrect guess would have been New York City (because of JFK).
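For what it's worth, here is a minimal Python sketch of that filter-and-cross-reference idea. It is only my guess at the approach, not how Watson actually works. The airport list is a tiny hand-entered sample, and the tags ("wwii", "person", "battle") and the function name cities_matching are labels I made up for illustration.

```python
from collections import defaultdict

# Hand-entered sample records: (city, airport, namesake, tags).
# The tags are hand-assigned labels for this sketch only; nothing here
# reflects Watson's real data or internals.
AIRPORTS = [
    ("Chicago", "O'Hare International", "Edward 'Butch' O'Hare", {"wwii", "person"}),
    ("Chicago", "Midway International", "Battle of Midway", {"wwii", "battle"}),
    ("New York", "John F. Kennedy International", "John F. Kennedy", {"wwii", "person"}),
    ("New York", "LaGuardia", "Fiorello La Guardia", {"person"}),
    ("Toronto", "Lester B. Pearson International", "Lester B. Pearson", {"person"}),
]


def cities_matching(airports, required_tag, min_airports=2):
    """Return cities with at least `min_airports` airports in the list,
    where every listed airport's namesake carries `required_tag`."""
    tags_by_city = defaultdict(list)
    for city, _airport, _namesake, tags in airports:
        tags_by_city[city].append(tags)
    return sorted(
        city
        for city, tag_sets in tags_by_city.items()
        if len(tag_sets) >= min_airports
        and all(required_tag in tags for tags in tag_sets)
    )


print(cities_matching(AIRPORTS, "wwii"))
```

With this sample, New York drops out because only one of its two airports has a WWII-tagged namesake, and Toronto drops out because the sample lists just one airport for it. Of course, the hard part for a machine is getting tags like these out of free text in the first place, which this sketch simply assumes away.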
I find all this puzzling, is all. Hopefully we'll get more coverage and learn more about how Watson ticks in the near future. For now I'll take our triumphs where we can get them, but more pertinently...where's my holodeck?
Remember, in my comment on Watson Post I, how I noted that some of the early questions raised by Watson's missteps/the show's questionable handling of one of his answers point to Watson not fully being programmed to grasp the nuances of the show?
From that question comes the possible answer that A) has allowed me to sleep at least a few scattered hours a night and B) seems, to ME, to be the best way to approach your "follow-up question"/discussion point about a momentum/confidence factor being programmed into Watson.
Watson was not built to be The Perfect JEOPARDY!-Playing Robot™. Though, it is an AWESOME test of its abilities.
Watson's a cross-referencing dynamo, but kind of in the way that a search engine comes up with results based on the words you enter. Emotion? Momentum? Not important. It certainly seems more brilliant and faster and more impressive than your average search engine, but...if it had been built for the specific application of winning at JEOPARDY! (or at least a program capable of replicating the greatest of human players, the most-perfect and challenging JEOPARDY!-specific Player 2 ever designed), the quirks of the game would've been included.
Watson's technology is being designed for multiple applications, presumably. This was merely an interesting, February Sweeps destination test run. Programmers who want to improve him FOR a future rematch might include JEP-specific conditional things like recognizing that his top answer has been given and proven incorrect. That'll make the program better at JEOPARDY!, of course.
But, in the grand scheme of Whatever Watson Ultimately Is To Become/Exists To Do™, will programmers bother?
They probably should, if Watson's purpose is to always produce reliable answers - recalculating on the fly is something a cheapo GPS can do, you know? Conditional information has to be able to tweak its responses for it to be a perfect answer-giving machine, I suppose.
But...regardless of how they decide to improve Watson going forward, it seems to me that he wasn't designed JUST to succeed at this one game show.
Hence, the show-related problems, and the plot-holes we're asking to have filled in.
His betting on Daily Double/Final JEP! questions also gave me pause. I hope they explain that. Just a random number from within a given range? He can't have been thinking strategically, or certainly not greedily.
Fascinating. All around.
Your point is well taken; however, I think that IBM went out of their way to say they were designing a machine that could beat the top human players at Jeopardy (a la Deep Blue before). But since they clearly accomplished that feat despite the lack of nuance we mention, they could retrospectively say they acknowledged those aspects but figured they wouldn't make an overall impact.
Perhaps then the question becomes "how can they apply the new information about nuanced thinking deficits to making the machine better overall?" - which seems clutch to application of said machinery in any field, as you stated.
You could safely argue Watson's first Final Jeopardy bet was the extremely safe way to ensure he won overall (across two days), and in that sense it was strategic...however, I would then question why Watson wagered anything at all in that instance. What competing factor brought the number to (an insignificant) $947?
Intriguing.
Slate has an interesting article about Watson and how, if you really want to test AI, no-limit poker is a particularly interesting challenge:
http://www.slate.com/id/2285035/
Good article and good point about poker. That would be truly awesome/scary.
See also this Ken Jennings-penned article (about losing) that Tony sent me:
http://www.slate.com/id/2284721/pagenum/all/#p2