Microsoft Researcher Jim Gray Receives Turing Award for Helping to Transform Databases into Dynamic Tools Used by Millions

“The original question, ‘Can machines think?’ I believe to be too meaningless to deserve discussion. Nevertheless, I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.” Alan M. Turing, 1950

SAN FRANCISCO, May 14, 1999 — Fifty years ago, British mathematician Alan M. Turing predicted that by the turn of the century, computers would be able to “think.” To measure this, he devised a simple test. Put a person and a computer in one room, and a judge in another, and have the judge ask the computer and the person questions using only a keyboard. If 30 percent of the time the judge can’t tell the difference between human and computer, the machine must be somewhat intelligent.

It turns out that Turing’s prediction was overly optimistic. As the new millennium approaches, computer scientists have yet to create a computer that can pass the “Turing Test.” But thanks to the work of Microsoft researcher Jim Gray and his colleagues in the database research field, scientists have made great strides in transforming the computer into a better repository of human knowledge.


Gray is a senior researcher and manager of Microsoft’s Bay Area Research Center. Over the past three decades, his work has made it possible for computers to store and analyze ever larger amounts of information. As a result of his research, databases have evolved from 10-megabyte storage facilities, which were the sole property of large corporations and government institutions, into terabyte-sized systems available to millions of people. His work paved the way for many of the daily transactions we now take for granted, such as withdrawing money from automatic teller machines (ATMs), making airline reservations and purchasing products over the Internet.

This month, the Association for Computing Machinery (ACM) awarded Gray the prestigious A.M. Turing Award for his contributions to information technology. Named after the pioneer whose interest in artificial intelligence Gray shares, the A.M. Turing Award is widely regarded in industry circles as the Nobel Prize of computer science. Gray received the award for his “seminal contributions to database and transaction processing research and technical leadership in system implementation from research prototypes to commercial products.” Gray will be honored at an awards ceremony in New York on May 15.

“I am delighted that the ACM will be presenting the Turing Award, its most prestigious award in computer science, to Dr. Gray,” says Barbara Simons, president of the ACM. “Dr. Gray is highly deserving of the award, having done fundamental research in several key areas, including databases, transaction processing and scaleable computing.”

Sitting in a conference room in his downtown San Francisco office, Gray methodically takes a visitor through a linear account of his career over the past 35 years. A bearded man with wire-rimmed glasses and a staccato voice, Gray punctuates the discussion with humor and a warm laugh. He frequently emphasizes that all of his achievements were a collaborative effort.

“I think for a whole variety of reasons, individuals get singled out for praise,” he says. “But the reality is that I don’t think there’s a single paper on this transaction stuff that I wrote solo. I would come up with some crazy idea, and someone would set me straight. Or someone else would come up with a crazy idea, and I would set them straight. Quite literally, hundreds of people worked on these ideas in many different institutions.”

Dying and Going to Heaven

Gray first became interested in computer science in the 1960s as a mathematics student at the University of California at Berkeley. His interest grew after he enrolled in a numerical analysis class, the only mathematics class at Berkeley that offered access to computers. After completing the class, he met an electrical engineering professor, Mike Harrison, who offered him a job as a research assistant. “I felt like I had died and gone up to heaven when he offered it to me,” Gray recalls.

After a brief stint at Bell Telephone Laboratories, Gray returned to Berkeley to complete his doctoral degree in programming languages, and afterward accepted a two-year IBM postdoctoral position at the university. He then went to IBM’s T.J. Watson Research Lab, where he worked on operating systems research. After Gray had worked there for about a year, his manager urged him to consider changing his career path. “He said, ‘You know Jim, IBM has a lot of operating systems,’” Gray recalls. “‘We don’t really need another operating system. On the other hand, networking seems to be a pretty important area, databases seem to be a pretty important area. Wouldn’t it be a good idea to work in one of these unpopular research areas rather than do what everybody else is doing?’”

The advice seemed reasonable, but in the interim Gray decided to leave IBM and spend three months as a visiting computer science professor at a Romanian university. Upon his return to the U.S., he rejoined IBM in its San Jose, Calif., research lab as a database researcher. There he worked with a group of 15 researchers, among them Ted Codd, the father of “relational databases,” the modern database model in use today.

Making Databases Easier to Use

Many of the concepts that have earned Gray recognition came to fruition during the next two decades. At IBM, Gray focused on the idea that databases should be easier to use. One way to accomplish this was to make them display information more visually. The relational model restructures the way data is presented into simple tables that can be more easily visualized and manipulated. “It allowed people who were not programmers to have a very simple model of how the data is being transformed,” Gray says.

Gray also found a way to transform the database from immutable storage, updated by one person at a time, into a dynamic tool that could be simultaneously accessed and manipulated by thousands of users. “Circa the 1960s, people began to put their databases on disks and allow customers and tellers to change the database on the fly,” Gray says. “The person would enter a request to the database, and the database would be transformed and back would come the response. This dynamic nature of the database posed a new set of problems.”

To make it possible for databases to handle the multiple requests, Gray and colleagues concluded that groups of actions for one user should be processed together as a single “transaction.” Gray and his colleagues based their transaction model on a mathematical theory that made database programs more robust and simple for users. They also defined four properties each transaction should provide, and spent many years developing algorithms and programming computers to ensure databases would automatically perform transactions according to these rules.

The first property that database transactions should provide is atomicity, meaning that all of the actions should either occur as a unit, or none should happen. “For example, if you transferred money from one account to another, you’re taking money out of one account, which is one action, and putting it into another account, which is another action,” Gray says. “But if the computer crashes between the debit and the credit, the credit might not be recorded. So you want both of these things to happen or neither, but not half of them.”
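Gray’s money-transfer example can be sketched in a few lines of code. The following is only an illustration, not the software Gray or his colleagues built: it uses Python’s built-in sqlite3 module as a stand-in database, and the table name, account names and the crash_midway flag are hypothetical, chosen for the example. A simulated crash between the debit and the credit leaves both balances untouched, because the database rolls back the half-finished transaction.

```python
import sqlite3


def transfer(conn, src, dst, amount, crash_midway=False):
    """Move `amount` from src to dst as a single transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if crash_midway:
                # Hypothetical failure between the debit and the credit.
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back


def balances(conn):
    """Return {account name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("checking", 100), ("savings", 0)],
)
conn.commit()

# A crash halfway through: neither the debit nor the credit survives.
transfer(conn, "checking", "savings", 40, crash_midway=True)
after_crash = balances(conn)

# A normal run: both the debit and the credit are recorded.
transfer(conn, "checking", "savings", 40)
after_success = balances(conn)

print(after_crash)
print(after_success)
```

The first printout shows the original balances, the second shows the completed transfer; at no point is the money debited without being credited.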

Gray also decided that transactions should group individual actions in a consistent manner and be durable enough to survive computer crashes once the user hits the “save” button. Finally, transactions need to be isolated, which means that if two or more people make requests at the same time, one user can’t view another user’s incomplete transactions as they are entered into the database.
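Isolation and durability can be illustrated the same way. This is again a minimal sketch that assumes Python’s built-in sqlite3 module as a stand-in for a production database; the file name, table and helper function are hypothetical. One connection plays a user whose transaction is still in progress, and a second connection plays another user reading the same account.

```python
import os
import sqlite3
import tempfile

# A database file on disk, so committed data outlives the connections.
path = os.path.join(tempfile.mkdtemp(), "bank.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES ('checking', 100)")
writer.commit()

reader = sqlite3.connect(path)


def read_balance(conn):
    """Read the checking balance and finish the query completely."""
    rows = conn.execute(
        "SELECT balance FROM accounts WHERE name = 'checking'"
    ).fetchall()
    return rows[0][0]


# The writer changes the balance but has not committed yet.
writer.execute("UPDATE accounts SET balance = 60 WHERE name = 'checking'")

# Isolation: the other user still sees the last committed value.
seen_during = read_balance(reader)

writer.commit()

# Once the transaction commits, the change becomes visible to everyone.
seen_after = read_balance(reader)

# Durability: close everything, reopen the file, and the change is still there.
writer.close()
reader.close()
reopened = sqlite3.connect(path)
seen_reopened = read_balance(reopened)
reopened.close()

print(seen_during, seen_after, seen_reopened)
```

The reader sees the old balance while the update is uncommitted and the new balance afterward, and the new balance is still on disk after every connection has been closed and the file reopened.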

Creating a High Performance Database that Never Fails

In 1980, Gray joined Tandem Computers in Cupertino, Calif. There he concentrated on creating a “fault-tolerant” system and improving system performance. “The thing I focused on was making computers that never failed,” Gray says. “The property that you really want is high availability. You want a system that’s available all the time.”

To make computers available all the time, Gray researched why systems fail. At the time, the best computer systems typically failed for several hours each year. Gray’s goal was to reduce the failure rate to just one second per century.
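The size of that gap is easier to appreciate as arithmetic. The short calculation below is an illustrative sketch: it assumes that “several hours” means four hours of downtime per year, a figure chosen for the example rather than taken from Gray, and the function names are hypothetical. It converts each downtime figure into the fraction of time a system is available and into the rough count of “nines” engineers use to describe availability.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def availability(downtime_seconds, period_seconds):
    """Fraction of the period during which the system is up."""
    return 1.0 - downtime_seconds / period_seconds


def nines(downtime_seconds, period_seconds):
    """Availability expressed as a number of nines, e.g. 99.9% is 3.0."""
    return -math.log10(downtime_seconds / period_seconds)


# Assumption for illustration: "several hours each year" taken as 4 hours.
typical_downtime = 4 * 3600
typical_period = SECONDS_PER_YEAR

# The stated goal: one second of downtime per century.
goal_downtime = 1
goal_period = 100 * SECONDS_PER_YEAR

typical_nines = nines(typical_downtime, typical_period)
goal_nines = nines(goal_downtime, goal_period)

print(f"typical: {availability(typical_downtime, typical_period):.6%}, "
      f"about {typical_nines:.1f} nines")
print(f"goal:    {availability(goal_downtime, goal_period):.10%}, "
      f"about {goal_nines:.1f} nines")
```

Under that assumption, the systems of the day delivered between three and four nines of availability, while the one-second-per-century goal calls for more than nine.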

“At the time, when the ATMs of Bank of America went down, it was front page news,” Gray says. Today, with the advent of the Web, around-the-clock availability has become even more critical, he explains. “At this point, it’s noon somewhere in the world all the time. And if you’re a Web company, you want to be up and running all the time. There’s no midnight shift on the Web. You have to be online all the time.”

To improve performance, Gray met with 25 colleagues from various commercial and university institutions to co-author an article that defined a measurement for computer performance. The measurement evolved to become the standard defined by the Transaction Processing Performance Council, which is widely used by database and hardware companies to record the number of transactions per second that their systems can perform.

Gray also attempted to create a system that could process 1,000 transactions per second. “It was clear that there were going to be banking systems and airline reservation systems that needed to do thousands of transactions per second,” he says.

Unfortunately, Tandem was unable to achieve either goal. “The embarrassing thing is that we still haven’t reached the one second per 100 year goal,” Gray says. “The 1,000 transaction per second goal was first achieved by Oracle, and that really hurt. But it was a race, and they won the race, and we at Tandem lost the race. And more power to the guys at Oracle who did that.”

After working for Tandem for about 10 years, Gray went on to work at Digital Equipment Corp. and then was a visiting scholar at UC Berkeley. He joined Microsoft in 1995, after concluding that it was easier to conduct research in industry than at a university where professors must struggle for federal funding in order to conduct systems research.

“I fundamentally fell in love with Microsoft and Microsoft Research,” he says. “If you’re a researcher in a large corporation and you take your research idea to a development guy, he looks at you as a risk with probably low reward. You’re the drug pusher who says, ‘Try a little of this stuff. You’ll like it.’ But Microsoft has this embrace-and-extend mentality. They hear a new idea, they say, ‘That’s a good idea, but here are five different things we could do to make it better.’ I can’t emphasize too much how different this is from any other place I’ve ever worked.”

During the past four years at Microsoft Research, Gray has managed a small team that focuses on building fault-tolerant computer systems that can be scaled to enormous sizes. “The fundamental idea is that people don’t know in advance how much computing or storage they’re going to need,” he says. “What you’d like to be able to do is just add another disk, another processor, or another communications line and have it automatically join the cluster to deliver more storage, more communication, to scale up without limits.”

Many of Gray’s concepts have been incorporated into Microsoft SQL Server, most recently SQL Server 7.0, which was released last year. Gray and his team also created the TerraServer, a Web site that houses the world’s largest database on the Internet. A collaborative project among Microsoft and several partners, including the Russian and United States governments, TerraServer stores more than a terabyte, or a trillion bytes, of compressed aerial and satellite photos of the earth, accessible from the Internet. The TerraServer Web site has received more than 2 billion hits from online users since it went live in June 1998.

Storing satellite photos of the earth is just one of many uses for databases of this size, Gray says. “If we could solve the copyright problems, the Library of Congress is 25 terabytes in ASCII text. It would be nice to have the Library of Congress online.” Gray also predicts that terabyte-sized databases will one day be used to store film clips and vast amounts of scientific information.

A Visionary for Database Research

Fellow researchers credit Gray with turning the database into a highly flexible and widely-used tool. “If you do online banking today, e-commerce of some kind, if you make complicated travel reservations with airline tickets, hotels or rental cars, all this is based on technology that owes much to Jim,” says Andreas Reuter, Dean of the School of Information Technology at the International University in Germany.

“Before transactions and Jim’s subsequent work, database systems were clumsy to use,” says David Lomet, a senior researcher and manager of the Database Group at Microsoft Research. “Jim’s role was to help create the first prototype relational database system that demonstrated that these systems were not only much more flexible than their predecessors, but could be made to perform more competitively.”

Gray’s colleagues also applaud him for his skill at clearly defining research problems and doing the legwork required to ensure that the best research ideas are implemented. “His attitude of talking to many people from different camps was absolutely crucial in getting this exchange between developers and researchers started,” Reuter says. “He has guided and shepherded many of the successful research groups at universities in the past 20 years or so, and he has carried numerous ideas into the industry and pushed them toward implementation.”

In addition to his work within Microsoft, Gray has participated in several high-profile projects outside of the company. He co-authored a book with Reuter called “Transaction Processing: Concepts and Techniques.” He is editor of the Morgan Kaufmann series on Data Management. He is a member of the National Academy of Engineering, the ACM and the National Research Council’s Computer Science and Telecommunications Board. And he is an active participant on the President’s Information Technology Advisory Committee, which recently recommended that the federal government increase its investment in long-term computer research and reinvigorate university research with Lewis and Clark-style expeditions into the 21st century.

Future Challenges

Noting that the Internet sprang from research conducted decades ago, Gray is concerned that the research pipeline will dry up unless there is a renewed effort to fund university-level research. “What we’re seeing right now is that college students are dropping out and faculty members are leaving universities,” Gray says. “What that means is that there’s less research, fewer teachers and fewer researchers. The pipeline of people and ideas may empty. I’m very concerned about this, as are many other scientists.”

On what goals should long-term research focus? One challenge, Gray says, is to make computers trouble-free appliances that never fail and automatically manage themselves. Right now, computer servers come with several thousand dollars in maintenance costs per year, Gray says. But customers want a system that simply works: a system that always runs, repairs itself automatically and never loses the data it stores.

A second goal, he says, is to turn computers into virtual librarians that can store and summarize massive amounts of data, and present this data to users in a convenient way. “Some day you’re going to have a personal digital assistant that is watching and listening to everything that is going on around you,” Gray predicts. “It’s going to have a few petabytes of information to work with. And you’re going to say, ‘I was talking to this guy, he was somewhere in San Francisco, he had a beard, it was about 30 years ago. Now what did he say?’ And this thing is going to hunt around and come back and give you a clip of me telling you this. It sounds like science fiction, but it’s actually doable.”

A third challenge is building intelligent computers. Passing the Turing Test is a commendable goal, Gray says, and there are many steps researchers should strive for along the way such as improving the ability of computers to recognize speech and written language. “To pass the Turing Test, computers have to be able to read and write and think as well as a person,” Gray says. “I think we’re pretty far away, but there is progress. Turing predicted we would be there today. Perhaps if he were still alive, we might be, but we let him down. I suspect, though, that the kids coming out of school today will make him proud. The next generation will probably pass the Turing Test.”

As for himself, Gray plans to forge ahead with his current research. “I love this stuff,” he says, laughing. “I like to understand things, I like to build things, I like to see things work. I didn’t start out with a desire to become rich or famous. I just wanted to have fun, and I want to keep having fun.”
