But you don't start out trying to solve the problem in a hilariously inappropriate way. For example, if your boss said, "Hey, sort these 10 billion numbers," you wouldn't reach for simulated annealing with a cost function that penalizes unsorted entries, make random swaps in the data, and tell your boss to come back in 10 years, when it will only probably be finished with an only probably correct answer. That's a categorical waste of resources, not a strategic upping of resources to get a first, but still reasonable, attempt that you can then whittle into something better.
As a machine learning researcher, my opinion is that Watson is more like simulated annealing. It's as if someone asked, "How can we make this thing play Jeopardy! without thinking at all about how it will do the data processing? How large do we have to make it if its processing is as stupid and easy to implement as possible?"
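To make the joke concrete, here is a minimal sketch of what "sorting by simulated annealing" might look like. The function names (`adjacent_disorder`, `anneal_sort`), the geometric cooling schedule, and every parameter value are my own assumptions for illustration, not anything anyone should actually use.

```python
import math
import random


def adjacent_disorder(xs):
    """Cost function: number of adjacent pairs that are out of order."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a > b)


def anneal_sort(xs, steps=20000, t_start=2.0, t_end=0.01, seed=0):
    """Try to sort by random swaps accepted under a Metropolis rule.

    Returns (list, final_cost). A final cost of 0 means the list is sorted;
    anything else means the run gave up with a wrong answer.
    """
    rng = random.Random(seed)
    xs = list(xs)
    cost = adjacent_disorder(xs)
    for step in range(steps):
        if cost == 0:
            break  # sorted; nothing left to do
        # Hypothetical geometric cooling from t_start down toward t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.randrange(len(xs)), rng.randrange(len(xs))
        xs[i], xs[j] = xs[j], xs[i]
        new_cost = adjacent_disorder(xs)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost  # accept the swap
        else:
            xs[i], xs[j] = xs[j], xs[i]  # reject: undo the swap
    return xs, cost


if __name__ == "__main__":
    result, cost = anneal_sort([5, 3, 8, 1, 9, 2, 7])
    print(result, "remaining disorder:", cost)
```

Each step costs a full pass over the data just to re-evaluate the cost, and nothing guarantees the final cost reaches zero, which is exactly the "only probably finished, only probably correct" behavior described above.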
See my other comment for more on this.
Some interesting numbers to contextualize IBM’s Watson:
To put this in perspective, a conservative upper bound on the power drawn by a human being standing still is about 150 W (less than 1/10 of 1% of Watson), and the person just holds the buzzer and operates it with a muscular control system.
Each of the servers generates a maximum of 6,649 BTU/hour. Watson overall would generate about 600,000 BTU/hour and require massive amounts of air conditioning. I don’t have a good estimate for the cost of heat removal, but it would raise Watson’s energy cost significantly.
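As a rough sanity check on these figures, the heat numbers can be converted to watts and compared with the 150 W human bound. This is only a sketch: it assumes the standard conversion of 1 BTU/hour ≈ 0.293 W and treats heat output as a stand-in for power draw, and the variable names are mine.

```python
# Standard unit conversion: 1 BTU/hour is about 0.29307107 watts.
BTU_PER_HOUR_TO_WATTS = 0.29307107

server_btu_per_hour = 6_649      # per-server maximum quoted above
watson_btu_per_hour = 600_000    # whole-system figure quoted above
human_watts = 150                # conservative upper bound for a person

# Assumption: heat output approximates electrical power draw.
watson_watts = watson_btu_per_hour * BTU_PER_HOUR_TO_WATTS
implied_servers = watson_btu_per_hour / server_btu_per_hour
human_fraction = human_watts / watson_watts

print(f"Watson heat output: {watson_watts / 1000:.1f} kW")
print(f"Implied server count: {implied_servers:.1f}")
print(f"Human as a share of Watson: {human_fraction:.4%}")
```

Under those assumptions the system comes out at roughly 176 kW, which implies about 90 servers at the per-server maximum and puts the human at under 0.1% of Watson, before counting any air conditioning.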
I don’t mean to criticize Watson unduly; it certainly is an impressive engineering achievement and has generated a lot of good publicity and public interest in computing. The engineering feat is impressive if for no other reason than that it is the first accomplishment of this scale, and pioneering is always hard… future Watsons will be cheaper, faster, and more effective because of IBM’s great work on this.
But at the same time, the amazing power and storage costs for Watson really kind of water it down for me. I’m not surprised that if you throw power and hardware and memory at a problem, you can use rather straightforward machine learning methods to solve it. I feel similarly about Deep Blue and chess.
A Turing test that would be more impressive to me would be building something like Watson or Deep Blue that is not allowed to consume more power than an average human, and has comparable memory and speed. The reason this would be impressive is that in order to build it, you’d need a way of representing data and reasoning in the system that is about as efficient as the human mind’s. One thing you could not do is simply concatenate an unreasonable number of large feature vectors together and overfit a machine learning model. Since this is an important open problem with lots of implications, we should use funding and publicity to drive research organizations like IBM towards that goal. Maybe building Watson is a first step and now the task is to miniaturize Watson, and in doing so, we’ll be forced to learn about efficient brain architectures along the way.
Note: I gathered the numbers above by looking here and then scouring around for various listings of specific hardware specs. I'm willing to believe some of my numbers might be off, but probably not significantly.