Some thoughts on statistical models in science and Chomsky vs. Norvig
There has been a lot of discussion about whether we will really be able to have a deep understanding of scientific results of the future. Steve Strogatz explains the problem succinctly here.
This debate is somewhat abstract. It is therefore interesting to read about how this controversy is playing out in particular scientific disciplines. One of the best examples I saw recently was an exchange between some of the best minds of this and a past generation. It started with a number of remarks that Noam Chomsky made at the Brains, Minds and Machines symposium at MIT. Here is a transcript of Chomsky’s remarks. He also clarifies his position in this interview.
There is a lot here, but I wanted to address one particular point that Chomsky makes: His position is that statistical models may work very well in engineering applications, but do not (at least typically do not) reveal the underlying principles or rules that govern the universe. Chomsky argues that theories should still govern experiments. As usual, he adopts a fairly extreme position. For instance, he says about Gregor Mendel:
“Yeah, he did the right thing. He let the theory guide the data.”
(As an aside, I think Feynman provided the best response to such arguments: “It does not make any difference how beautiful your guess is. It does not make any difference how smart you are, who made the guess, or what his name is – if it disagrees with experiment it is wrong.”)
However, Chomsky’s position is overall understandable: Science is not just engineering – it should provide insights into the workings of the universe. Or should it?
The answer to Chomsky by Peter Norvig, Director of Research at Google, is very much worth reading. It is a well-thought-out response, and I can’t really do it justice by summarizing it. Read it, along with the paper by Leo Breiman that is discussed in one of the last paragraphs.
I think several distinct questions are being addressed here, and I felt that the discussion did not always keep them clearly apart. Let me point out a couple of things that came to mind:
Are probabilistic models useful? I find the disagreement between Norvig and Chomsky about this specific question a bit puzzling. This is perhaps because they are talking mainly about linguistics, and have particular probabilistic models in mind. Of course probabilistic models are useful: quantum mechanics describes the world in probabilistic terms. Statistical models are also often inspired by certain hypotheses about how the world works. If the data favor one model over the alternatives, that frequently provides evidence that the model captures something about how the universe works.
I think the disagreement is about what Leo Breiman calls “algorithmic models”. Such models typically have thousands of parameters, and can be so complex that we cannot understand why or even how they work, even when they give very good predictions. They are frequently general-purpose methods, not constructed with any particular set of data in mind. It is therefore quite possible that the principles which make them work would not help us understand the rules of the universe. Thus even understanding such models may not tell us much about the world around us.
If we describe the world in terms of such models, are we still doing science? This is where Norvig and Chomsky disagree. I think that this is something of a semantic dispute. I am certain that both would agree that in all of science we should strive to find simple rules that accurately describe the workings of the universe in a language understandable to humans (this language has traditionally been mathematics).
However, there is a possibility that we will never be able to completely understand how the universe works. The Four Color Conjecture was proved 35 years ago with the assistance of computers. We know that the statement is true, but nobody has a clear understanding of why that is so – the proof is simply too complex to be fully comprehended by a human being. What if describing natural phenomena requires models of similar or higher complexity?
I don’t see any reason why the universe should be simple enough for us to understand. If it is not, handing over the business of doing science to computers may be our only option.
Here is another interesting take on this issue.