Today I went to my son’s kindergarten class to tell them about my work. This was a bit of a challenge. Instead of talking specifically about my research, I tried to explain to them what mathematics is all about. The presentation is here. If you are called upon to do something similar, feel free to build on it. It is a bit short, but about right given their attention span. I emphasized applications, since I thought the more abstract ideas of pure math would be hard to get across. I probably underestimated them. It was a fun exercise, and I think they got something out of it.
There has been a lot of controversy about a recent clinical trial of an oxygen treatment for premature babies. However, I found the reporting of the issue very confusing. This article from the AP demonstrates the difficulty of communicating the issues surrounding randomized clinical trials (RCTs). There is a general misunderstanding – and frequently, misrepresentation – of what clinical trials are all about. It is all too easy to get the impression that they are just experiments on humans. The truth is that the evidence gathered from clinical trials is essential in deciding which treatments and medicines work, and which might be harmful. Without them, medicine would not have come nearly as far as it has.
My point here is not to give a full review of the controversy. For a good explanation see here. Rather, I would like to use this as an example to explain some misconceptions about RCTs.
The following statement from the article demonstrates how it is easy to misunderstand the issue:
“… the debate is about one of modern medicine’s dirty little secrets: Doctors frequently prescribe one treatment over another without any evidence to know which option works best. There’s no requirement that they tell their patients when they’re essentially making an educated guess, or that they detail the pros and cons of each choice.”
I fully agree with this statement. But the writer never follows up to explain that this is precisely why we need clinical trials – to provide the evidence that will help decide which option is best.
It is easy to be reminded of the horrors of the past when reading about RCTs (like the Tuskegee syphilis experiment). I am not saying that we live in a wonderful world in which medical researchers always do what is best for their patients – far from it. However, RCTs are the very tools that allow doctors to offer demonstrably better medical care.
If you read the beginning of the article, it remains unclear whether premature babies in the study were hurt (perhaps even on purpose) in order to test new medical approaches. Of course, this would be truly horrific if true. As you read further, the picture becomes more confusing. The article states that
“Oxygen has been a mainstay of treating them [premature babies], but doctors didn’t know just how much to use. Too much causes a kind of blindness called retinopathy of prematurity. Too little can cause neurologic damage, even death. So hospitals used a range of oxygen, with some doctors opting for the high end and some for the low.”
This is exactly the point: Before the study was performed, a range of treatments was prescribed (85%–95% oxygen saturation levels). Doctors knew that oxygen treatment helped. They did their best to guess how much to use. But before the study was performed, they were just guessing which treatment would lead to the best outcome. They did not know whether they could be doing more harm than good by administering too much or too little oxygen. In the absence of evidence, they essentially gambled.
This was a very important issue to resolve, and that is precisely why the trial was performed. Doctors could not have guessed that the higher oxygen levels both reduced mortality and improved outcomes. Now that the answer is known, future generations of premature babies will receive better care.
But would this be ethical if it came at the expense of the babies involved in the study? Of course not! We cannot pay for progress in medicine by knowingly harming patients – indeed, the very thought of it evokes the darkest chapters of medical history.
So the question is whether the babies in the study received the best medical care known at the start of the study. In clinical trials, patients are split into groups that are given different treatments. No treatment can be known in advance to be worse than the other(s) — this is precisely what the trial is designed to resolve. However, if one treatment turns out to be better, then one of the groups will have received better (more effective) medical care. But this will be known only after the study is completed.
This is an essential point: Before the trial is performed, nobody knows for certain which treatment is better. Indeed, babies that did not participate in the study received a range of treatments, according to the best guess of their doctors. Had the trial not been performed, the full range of treatments — including the worse one — would still be administered.
What may be difficult to accept is that sometimes, perhaps more often than we realize, doctors simply do not know what the best treatment is. We laugh at the medieval use of leeches, bloodletting, and remedies meant to balance the humors. But doctors today still often guess about what works (how much is a matter of debate) – and I am not even talking about nutritional supplements, almost all of which are completely unproven, if not known to be harmful.
I think this is where scientists in general – and mathematicians and statisticians in particular – need to better explain, and keep on explaining, why we think certain things are true. Clinical trials offer a way forward in situations where we simply cannot base decisions on experience, but need to look at the data and use statistics.
Returning to the case of the premature babies, the stories are heartbreaking:
“I unknowingly placed my son in harm’s way,” said Sharissa Cook of Attalla, Ala., who wonders if vision problems experienced by her 6-year-old, Dreshan Collins, were caused by the study or from weighing less than 2 pounds at birth. “The only thing a mother wants is for her baby to be well.”
Dagen’s mother, Carrie Pratt, was more blunt with reporters: “Why is omitting information not considered lying?” she said. “We were told they would give her the best care every day.”
I cannot imagine what these families have gone through. But I do believe that what they suffered was not a consequence of a participation in the study. The babies in the study on average had lower mortality, and better outcomes than babies that were not in the study (perhaps due to the “inclusion benefit”). Had they not participated, it is impossible to know what treatment their physician would have chosen. It may have been any of the ones used in the studies, since the entire range of treatments offered was used in practice.
Explaining the need for and the reasoning behind RCTs is not easy. It is far easier to write a story about premature babies harmed by a heartless group of faceless scientists and doctors in white coats. There are many examples of people who do misuse evidence, knowingly or unknowingly, and harm patients by doing so: from the anti-vaccine fantasies of Andrew Wakefield to the cancer therapies of Dr. Burzynski. However, the people behind this clinical trial have done nothing of the sort.
I keep hearing variations of the following comment: “Living organisms generate the equivalent of exabytes (or zettabytes, or whatever) of information per second. We will need to store all this data, and then analyze it to make sense of what is going on.” The first part of the statement is certainly true. A detailed description of the position and state of every molecule, even within a single cell, would take enormous amounts of data. Similarly, recordings of the activity of a population of neurons already generate gigabytes of data per second. As our recording techniques improve, the rate at which data is generated will only increase.
However, I am worried about the second part of the statement. There are a couple of concerns here. If we mindlessly accumulate data, important features may be buried in the mess. As I wrote separately, some have suggested that this is perfectly fine. We just need to feed the whole shebang to some data-crunching supercomputer, and it will tell us what matters and what does not. In this brave new world, scientists would only have to decide which questions need to be answered – machines will collect and interpret the data for us.
However, I doubt that our algorithms are this powerful. For the foreseeable future, we will have to play an active part in analyzing and understanding the data. And this means that blindly collecting all the data we can may not be the best approach.
A second, related question is what the complexity of a satisfactory description of a living organism would be — say of a bacterium, or of the human brain. I would expect that the complexity of the description will dictate how much data we need to develop it fully.
What constitutes a satisfactory description is subjective. Satisfactory can mean that the description gives us a feeling that we understand how the organism functions. A satisfactory description could also give accurate predictions of how an organism behaves, without giving us an understanding of the mechanisms.
I am relatively optimistic that we will be able to develop the second type of model. We already have some computational models of organisms that give very good predictions about their behavior (here is an example by Jae Kyoung Kim and collaborators, and a computational model of a cell about which I wrote before). However, these models are not simple. I doubt that you can really stare at them and gain a deep understanding of how the model, or the organism, ticks.
Perhaps we will be able to develop models of living organisms that both give us accurate predictions, and deep insights into how they function. After all, physicists have given us such descriptions of the physical world. However, I doubt that we will get there just by blindly amassing data.
Here is a recent fascinating paper (original study and a nice comment). Briefly, the study shows that absolute pitch — the ability to identify a note when played on its own — is not really that absolute. In one of the experiments, subjects with absolute pitch listened to a piece of music that was slightly detuned. For listeners with absolute pitch this detuned music established a new reference point. After listening, their internal map of pitches shifted, and they identified notes in accord with the detuned reference they had just heard.
John Lienhard pointed me to his related radio episode. He points out that even people without absolute pitch will sing a familiar song in the original key. However, the experiment above points to how flexible our minds can be. Our memories, and the internal categories we establish are not absolute. They can be shifted to adjust to the environment – and this will happen without us being aware of these internal changes.
Perhaps this is even more impressive than muscle memory. We do many things mechanically and unconsciously. But our unconscious brain is not a dumb robot. It is flexible and self-correcting. We are under the illusion that our ever-changing mind is stable.
Something about our field has been bothering me for a while: Overall, we mathematicians do a relatively poor job of presenting our research to a general audience (here is Doron Zeilberger’s comment on the subject). There are certainly some spectacular expositors in the field. But overall, we could do much better in presenting our work. This is a problem both of training and practice.
I work at the intersection of mathematics and biology. Over the years, I have been on a number of thesis committees for graduate students in both fields. Graduate training in the two disciplines is quite different. Importantly, students of mathematics are not trained as extensively to give presentations, or write in an accessible way. There are a few differences in the way we do things that could be responsible:
Mathematicians spend far less time preparing graduate students to present their research. Most biologists require their students to give talks regularly during lab meetings and at conferences. Many graduate biology programs also require students to give oral progress reports once or twice a year. Although things are changing, this does not seem to be the norm in mathematics.
We also put less emphasis on writing. Publishing one or two peer-reviewed papers is frequently a requirement for a PhD in biology. Moreover, students often have to submit a thesis proposal, or a mock grant proposal, as part of their qualifying exams. The writing is critically reviewed – I have been on committees where students had to rewrite their proposal several times before it was accepted. For many students in mathematics, the thesis is the first, and sometimes only, original piece of scientific writing they will produce.
The reason for the differences may be that good presentations matter much more in biology. Even a mediocre talk will raise eyebrows, and it can kill your chances of getting a job. And a poorly written grant will not be funded, no matter how good the ideas. In biology there is a large overlap between the people who do excellent research and those who give excellent presentations. Less so in mathematics. The cynical reply is that these are simply better salesmen who get more funding, and hence run bigger and more productive labs. But perhaps these are simply the people who view the presentation of their research as an integral part of their work.
I don’t mean to say that we need to emulate biologists in every way. I see plenty of problems in graduate education in biology – graduate students frequently get no programming experience, and little teaching experience. They most certainly do not learn enough math and statistics.
However, given the current situation in academia, only a handful of PhD students will find academic jobs. Mathematics students entering industry will have learned the persistence and concentrated effort necessary to do research. Many will learn how to program. These are invaluable skills. However, the ability to write well and present ideas clearly is also indispensable. Shouldn’t we do a better job of teaching our students these skills?
Things are getting better. For instance, the students in our SIAM chapter at my university (University of Houston) have organized a student paper presentation event. Participation was strong, and I was impressed with the presentations. Paper exchanges, in which students read and comment on each other’s writing, have also been helpful. There are numerous other ways in which we can help our students and postdocs become better communicators. And I think we should consider this an essential part of their training.
I know this issue has been brought up many times, but I just read this excellent post, and wanted to bring it up again.
If you read scientific articles, you have likely encountered p-values many times over. Many people think they understand what a p-value means, but I believe many do not. In science we frequently test hypotheses. We naturally want to determine the probability that a hypothesis is true or false, given the data that was observed. The p-value is frequently interpreted as somehow giving us such a probability. But this is not what the p-value tells you – it only gives you the probability of observing the data you have, or a more extreme sample, under the given hypothesis. If this probability is small, then either you observed a low-probability event, or your hypothesis is wrong.
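To make this concrete, here is a minimal sketch of what a p-value actually computes, using a hypothetical coin-flipping example (the numbers are chosen purely for illustration). We observe 60 heads in 100 flips and ask: assuming the coin is fair, how probable is an outcome at least this extreme?

```python
from math import comb

# Hypothetical data: 60 heads out of 100 flips.
n, k = 100, 60
p_null = 0.5  # the hypothesis H: the coin is fair

def binom_pmf(n, i, p):
    # Probability of exactly i heads in n flips of a p-coin.
    return comb(n, i) * p**i * (1 - p)**(n - i)

# Two-sided p-value: total probability, *under H*, of all outcomes
# at least as far from the expected 50 heads as the observed 60.
p_value = sum(
    binom_pmf(n, i, p_null)
    for i in range(n + 1)
    if abs(i - n * p_null) >= abs(k - n * p_null)
)

# Note what was computed: P(data at least this extreme | coin is fair),
# NOT the probability that the coin is fair given the data.
print(p_value)
```

The sum runs over outcomes under the hypothesis; at no point does the calculation produce a probability *about the hypothesis itself*.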
The main point of this post is to direct you to the following clear discussion of the issue. Although this point has been made so many times, I think it is worth re-emphasizing.
Perhaps I am going out on a limb here, but it seems we naturally tend toward the Bayesian approach. We can compute the probability of the data given a hypothesis, p(D | H). What we would like is the probability of the hypothesis given the data, p(H | D). We want to be able to say: “The data tells me that this hypothesis is very probably true,” or even “The data tells me that the probability that this hypothesis is true is 99%.” But without a Bayesian approach – and, in particular, a prior – you cannot go from p(D | H) to p(H | D). P-values deal with the first, not the second.
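A small numerical sketch shows how different these two probabilities can be. The numbers here are entirely hypothetical, chosen only to illustrate the gap: a hypothesis H with a prior probability of 1%, and data D that is very likely under H but fairly unlikely otherwise.

```python
# Hypothetical inputs (illustration only).
p_H = 0.01           # prior: p(H)
p_D_given_H = 0.99   # likelihood: p(D | H)
p_D_given_notH = 0.05  # p(D | not H)

# Bayes' rule: p(H | D) = p(D | H) p(H) / p(D),
# where p(D) = p(D | H) p(H) + p(D | not H) p(not H).
p_D = p_D_given_H * p_H + p_D_given_notH * (1 - p_H)
p_H_given_D = p_D_given_H * p_H / p_D

print(p_H_given_D)  # about 0.167, even though p(D | H) = 0.99
```

Even though the data is 99% likely under the hypothesis, the hypothesis is only about 17% likely given the data – the prior matters, and p(D | H) alone cannot tell you p(H | D).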
Next time you see a p-value in an article, pay attention to how it is interpreted. I am sure you will find many examples where the interpretation is not quite correct.
I just came from a very interesting lecture about Wiener, cybernetics and the counterculture by Cyrus Mody. So let me continue with some further thoughts on Wiener and cybernetics. One thing that Wiener warned about in Cybernetics is a takeover by machines. Despite what this may bring to mind – after all, cybernetics gave us the word cyborg – I do not think that Wiener believed that one day Skynet would become self-aware and exterminate the human race. He writes
The modern industrial revolution is similarly bound to devalue the human brain at least in its simpler and more routine decisions. Of course, just as the skilled carpenter, the skilled mechanic, the skilled dressmaker have in some degree survived the first industrial revolution, so the skilled scientist and the skilled administrator may survive the second. However, taking the second revolution as accomplished, the average human being of mediocre attainments or less has nothing to sell that is worth anyone’s money to buy.
Thus, Wiener says, our brains now give us the only advantage we still have over machines. But this advantage will not last – and what then?
We are part of many feedback loops involving machines. But of how much value is the human component in these loops? How much longer will it be necessary? Let me give a couple of examples (I am indebted to Evgeny Morozov for some of these).
For instance, if your goal is to maximize profits, you need to minimize the difference between what your audience wants and what you are delivering. In the movie industry that would allow you to avoid another “Waterworld” or “The Adventures of Pluto Nash“. You can avoid such disasters by improving your predictions about what people like. And if you are Netflix, you have the data that allows you to do so. You know not only what movies people streamed, liked and disliked, but also when they paused them and when they skipped ahead. You analyze your data and you find, perhaps, that people want to see Kevin Spacey directed by David Fincher in a remake of the British TV series “House of Cards”. You are guaranteed a success.
Or perhaps you are a punk band that wants to get the biggest possible response from the crowd, i.e. to maximize moshpit intensity. You install sensors in the floor, and correlate dance intensity with the features of the songs playing at that moment. You can then design your songs according to what drives the crowd wild. This is what a band in China called Bear Warrior did. Quoting the singer of the band:
…the data helps us understand how we can improve our performance to make the audience respond to our music like we intend.
The potential problem here is that for centuries we have had feedback loops between the artist and the public. These were, and are, imperfect. We are now making them more efficient, to give the public more of what it wants. But in doing so, are we marginalizing, or even removing, the artist? Could the machines that we put in their place really be creative? This new system could produce sterile solutions, which may satisfy us, or even delight us. But by minimizing the possibility of failure, we may also minimize the possibility of generating something truly new and surprising.
And what happens when humans are taken completely out of the feedback loop? Self-driving cars are in the near future. But as Gary Marcus at NYU asks, do you want to leave the entire decision-making process to a machine? What if you are driving across a narrow bridge with a school bus full of children coming your way? Should your car be allowed to kill you in order to save the children? This may be the moral decision, but should a machine be able to make it? Isaac Asimov thought about this, and came up with some interesting answers.
Wiener himself tried to suggest the changes that are necessary so that all humans remain valued in the future:
The answer, of course, is to have a society based on human values other than buying or selling.
What other values? Wiener does not say.
I do not believe that the singularity is near. But I do believe that, as they did in chess, machines will surpass us in many other ways. As a society we should follow Wiener’s advice, and agree on what we really need to value.