REDMOND, Wash., March 22, 1999 — David Heckerman entered Stanford University medical school in 1980 to learn about the human brain. At the time, he was asking questions like, “What is the nature of human awareness?” and, “Are humans simply fancy computers that can understand and direct their own existence?”
Two decades later, Heckerman still contemplates these questions, but from a different perspective. Rather than questioning whether the brain works like a computer, he now asks whether it’s possible for a computer to emulate the human brain. Can computers be “aware”? Can they offer a level of intelligence that resembles the sophisticated processes of the human brain?
A senior researcher in the Decision Theory and Adaptive Systems (DTAS) Group at Microsoft Research, Heckerman is approaching these questions armed with a background in statistics, medicine and artificial intelligence. His work centers on using data in sophisticated ways to make computers “anticipate” the desires of users so they can more efficiently serve people’s needs.
“People use several phrases to describe what I do, including statistics, machine learning, and data mining, but it’s really all the same,” Heckerman says. “My work centers around learning from data. There’s a sea of data on the Web, in computer databases, everywhere. I want to take that data and gain some insight and knowledge out of it, so we can make smarter decisions.”
A boyish-looking man in his early 40s, Heckerman demonstrates energy and passion when discussing his work. Despite the ambitiousness of his research, he has the unusual gift of explaining his work in the simplest terms. A couple of hours talking to him, and it’s clear that what most people regard as intensely challenging, Heckerman sees as logical and straightforward, even simple.
Heckerman combines data with expert knowledge to make predictions about complex problems. What differentiates his work from that of traditional statisticians is that the predictive models he builds, called Bayesian networks, capture cause-and-effect relationships about the world.
The implications of Heckerman’s research are enormous. Already, his work is helping people eliminate junk mail from their e-mail in-boxes and easily obtain a sophisticated level of computer technical support without placing a phone call. It is also enabling businesses to better target customers by predicting the habits of computer users who browse or shop online. While his research has far-reaching implications for how computers will be used in the future, the underlying goal for all of Heckerman’s research is to build “intelligence” into the computer to make it a far more useful tool than it is today.
“The idea is that when you use your machine, it will form guesses about what you’re trying to do and help you,” Heckerman says. “It will be like having a butler.”
From Medical School to Microsoft
Like any good statistician, Heckerman sees his transition from medical school to Microsoft as a series of logical moves. He first entered medical school thinking he would become a neuroscientist. But as he ventured deeper into the life of a medical student, he began to realize that many of the questions that interested him actually lay in the field of artificial intelligence. At the same time, it disturbed him to watch physicians regularly make diagnoses with little time and sleep.
Witnessing these problems led Heckerman to wonder about the possibility of using probability, and computers that can handle complex statistical computations, to diagnose medical illnesses. He began working toward his Ph.D. in Medical Information Sciences in 1983, and soon discovered the “Bayesian network,” a recent invention of researchers at Stanford and UCLA. Bayesian networks encode an extra piece of information that statisticians usually overlook: information about cause and effect. This extra information makes it easier for humans to understand these models and to build predictive models when both data and expert knowledge are available.
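To make the idea concrete, here is a minimal sketch of the smallest possible Bayesian network: a single cause (a disease) pointing to a single effect (a symptom). The probabilities and the names P_DISEASE, P_SYMPTOM_GIVEN and posterior_disease are hypothetical, chosen only for illustration; they do not come from any system Heckerman built. The sketch shows how knowledge stated in the causal direction (how often the disease produces the symptom) can be turned around with Bayes' rule to answer the diagnostic question (how likely the disease is, given the symptom).

```python
# Hypothetical numbers, chosen only for illustration.
P_DISEASE = 0.01  # prior probability that a patient has the disease

# Causal knowledge: probability of the symptom given the disease state.
P_SYMPTOM_GIVEN = {True: 0.90, False: 0.05}


def posterior_disease(symptom_present: bool) -> float:
    """Return P(disease | symptom observation) using Bayes' rule."""

    def likelihood(disease: bool) -> float:
        # Probability of the observation under a given disease state.
        p = P_SYMPTOM_GIVEN[disease]
        return p if symptom_present else 1.0 - p

    joint_disease = P_DISEASE * likelihood(True)
    joint_healthy = (1.0 - P_DISEASE) * likelihood(False)
    # Normalize so the two possibilities sum to one.
    return joint_disease / (joint_disease + joint_healthy)
```

Under these assumed numbers, observing the symptom raises the probability of disease well above the 1 percent prior, while its absence lowers it. A real diagnostic network would link many diseases and findings, but each link is specified in the same cause-to-effect form.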
Realizing the potential uses of Bayesian networks for medical diagnoses, Heckerman, his colleague Eric Horvitz, and a third partner opened a medical diagnosis company in 1986 called IntelliPath. The team relied upon pathologists’ knowledge of which diseases cause which symptoms to build systems that diagnosed diseases of the heart, brain and lung, as well as other diseases. A year later, Heckerman and Horvitz decided to expand the concept to address any problem requiring diagnosis. Joining with Jack Breese, the two colleagues started a second company called Knowledge Industries to apply Bayesian networks to issues ranging from sleep disorder problems to jet airplane failures.
While operating two companies, Heckerman completed his Ph.D.; the Association for Computing Machinery selected his dissertation as the top dissertation of 1990. He then returned to medical school, completing two years of course work in a single academic year. Heckerman became a professor at UCLA in 1992, giving lectures to students in artificial intelligence and probability theory. Soon after he began working there, he was approached by Nathan Myhrvold, Microsoft’s chief technology officer, who had read his award-winning dissertation on “Probabilistic Similarity Networks” and invited Heckerman, Horvitz, and Breese to join Microsoft’s newly formed research division.
“My initial reaction was, ‘There’s no way,’ ” Heckerman says. “I had really worked hard to get the position at UCLA. But we came up here and were very impressed with the people. And we saw an extraordinary opportunity to have our research used by millions of people.”
Making Computers More Responsive
Six months later, Heckerman, Horvitz and Breese relocated to Redmond, Wash., to form the DTAS group. Using his statistical models, Heckerman saw the potential to put computerized data to better use. In some cases, computers could be used to collect new data that would prove helpful to users. In other cases, computers could be used to analyze mountains of existing data that companies collect, yet don’t put to effective use.
“Companies like Visa and MasterCard collect an enormous amount of data about their customers, but they’re not doing much with it,” Heckerman says. “Right now, they use simple rules to decide whether to approve a transaction. But there’s all this data that’s sitting in their database that they could use to make better decisions. For example, a sudden change in the types of transactions a person is making is a great clue that the later transactions are fraudulent.”
By using data more effectively, Heckerman and the DTAS team at Microsoft Research have already developed a series of breakthroughs that are improving Microsoft products. For example, the team developed the technology behind the popular answer wizard that was incorporated into Office 95 and Office 97. Presented to users in the form of a paper clip, happy face, cat, or other cartoon character, the answer wizard analyzes what users are trying to do and offers them assistance without them having to request it. Customers can also type in questions to receive additional information about the software program they are using.
“How many times have you gotten stumped while using a computer and wanted to ask someone for help?” Heckerman asked. “The answer wizard lets you do just that.”
Heckerman’s group also developed the technology that helps customers use the Microsoft Technical Support Web site to “troubleshoot” problems encountered with Microsoft products. The technology, which also has been incorporated into Windows 98, enables users to pinpoint the solution to their problem by leading them through a series of questions. “If you’re having trouble printing or your fonts look funny, you go to the troubleshooter and describe the problem. It then helps you by asking questions and describing possible fixes for you to try,” Heckerman says.
About two years ago, Heckerman and the DTAS group began work on an “anti-spam filter” to help users filter out junk e-mail from their in-boxes. Heckerman, who says he and Horvitz came up with the concept at about the same time, first thought of the idea when he received his first spam messages. “It was December 1996 and for the first time I got a very strange message,” Heckerman recalls. “It wasn’t addressed to me, had nothing to do with me, and was trying to sell me something. And I thought, ‘Oh no, junk e-mail.’ ”
A month later, Heckerman was receiving five junk mail messages a day, and he and the DTAS group set out to develop the filter. Rather than blocking e-mail from a centrally located server, however, the group decided to build the technology into each user’s software to give them the greatest control over what to filter out.
“Most people would consider e-mail that talks about how you can get credit to be junk,” Heckerman says. “But if you’re a small business, you might find such mail useful. Our anti-spam filter customizes itself to what you consider normal and spam mail.”
The current prototype of the filter scans information about e-mail messages, such as the subject line, the body of the message and the time of day it was sent, for hints that the e-mail is junk e-mail. If it believes the message to be junk mail, it colors the mail by default or, at the user’s option, sends it to a special junk mail folder for review.
“The filter makes a diagnosis much like a physician,” Heckerman says. “It uses all sorts of clues in the message (the words and phrases in it, who sent it, when it was sent, etc.) and then it makes a decision. Unlike most filtering technology, the technology we developed also gives users control over the flow of mail into their mailboxes.”
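The kind of clue-weighing described above can be sketched with a naive Bayes text classifier. This is an assumption made for illustration: the article does not say which model the prototype uses, and the class NaiveBayesFilter, its methods and the labels "junk" and "normal" are hypothetical names. The sketch uses only the words in a message as clues and learns from messages the user has already labeled, which is one simple way a filter could customize itself to each person's idea of junk.

```python
import math
from collections import Counter


def tokenize(text):
    """Split a message into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesFilter:
    """Illustrative per-user junk mail filter; not the actual prototype."""

    LABELS = ("junk", "normal")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one message that the user has labeled 'junk' or 'normal'."""
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def junk_probability(self, text):
        """Estimate P(junk | words in the message)."""
        vocabulary = set(self.word_counts["junk"]) | set(self.word_counts["normal"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in self.LABELS:
            # Smoothed prior: how common this kind of mail is for this user.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Convert log scores back to a normalized probability.
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["junk"] / (weights["junk"] + weights["normal"])
```

Because each user trains a separate filter, the same message about credit offers could score as junk for one person and as normal mail for a small business that has labeled similar messages as wanted. Other clues the article mentions, such as the sender or the time of day, could be added as extra features in the same way.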
Most recently, Heckerman’s group developed technology that will enable Web site owners to offer visitors personalized information by observing previous browsing or shopping habits as well as the patterns of other customers with similar profiles. Microsoft plans to add the technology to Microsoft Commerce Server, the next version of Microsoft Site Server 3.0 Commerce Edition. “Say you own an e-commerce site. When customers drop items into their shopping baskets, our technology recommends other things they might want to buy,” Heckerman says.
While all of these successes have been a collaborative effort, colleagues credit Heckerman for his keen ability to foresee the practical applications for the team’s research. “David has a fairly rare combination of talents: he is mathematically sophisticated and also creates practical applications,” says Breese, an assistant director at Microsoft Research who has known Heckerman for 14 years. “He has excellent intuition and persists until he reaches closure. Words that come to mind are ‘focused’ and ‘productive.’ ”
“He is a focused researcher who often cuts to the key technical issues and challenges with great rapidity,” says Horvitz, Heckerman’s colleague for the past 20 years. “He is a brilliant mathematician with a passion to understand.”
“He has made significant contributions in this area, and has worked very hard to provide other groups with the resulting technology,” says Max Chickering, a researcher in the DTAS group and Heckerman’s student at UCLA. “Despite the fact that he’s consistently involved in several projects simultaneously, David is always eager to find new problems to tackle. He brings infectious intensity to his work.”
Breakthroughs in Learning and Understanding on the Horizon
Aside from his research at Microsoft, Heckerman is beginning to tackle a problem of personal interest. While in medical school, he saw how physicians regularly used patients as guinea pigs in clinical drug trials. He witnessed clinical studies in which physicians gave half the patients what they believed to be a better drug, and half the patients an inferior treatment, and then measured the results. “Well, you do that and you’ve just jeopardized the lives of 50 percent of those people,” he says.
Armed with the power of Bayesian networks, Heckerman wonders about the possibility of learning the causes of disease and the effectiveness of disease treatments solely by observing the patterns of patients rather than by conducting controlled experiments. “Wouldn’t it be great if you could infer causal information without doing any experiments?” Heckerman says. “Offer both drugs, let people choose which drug they want to take, and see what happens. And from that, infer which treatment is better. It sounds like magic, but sometimes, under reasonable assumptions, it can be done.”
Another problem that Heckerman wants to tackle with his methods for discovering cause without experiment has to do with the school choice debate that for years has divided educators. Using data provided by Milwaukee Public Schools, Heckerman hopes to build a Bayesian network to determine whether giving students greater choice over the schools they attend improves education. The University of Wisconsin-Madison posted the data on the Web after the five-year Milwaukee Parental Choice Program was ended and the results deemed inconclusive. Heckerman hopes his statistical methods will yield more definitive results.
“There are many open questions in the fields of sociology and medicine, and I’d like to tackle at least one of these problems,” he explains. “I hope to prove the method works on a key problem so that others will start to use it. A great satisfaction would be to see this technology in routine use by the FDA.”
Regarding his work at Microsoft, Heckerman says the ultimate goal is to make computing a lot easier than it is now. He anticipates a time when people will be able to build advanced intelligence into their machines. He envisions a day when computers will work behind the scenes to accurately predict and take action based upon users’ interests, preferences and desires. And he hopes his work with computers will help unlock some of the mysteries about the nature of consciousness that have intrigued him for the past two decades.
“I don’t set my sights too high,” he jokes. Well, perhaps he does. But if Heckerman’s work so far is any indication, scientists may indeed succeed in transferring a large degree of human intelligence into their machines.