Speech Transcript - Rick Rashid, Microsoft Research Valley Road Show

Remarks by Dr. Richard Rashid
Senior Vice President, Microsoft Research
Microsoft Research “Valley Road Show”
April 16, 2003
Mountain View, California

ROY LEVIN: Good afternoon, ladies and gentlemen. My name is Roy Levin. I’m the director of Microsoft Research here in Silicon Valley and it’s my very great pleasure to welcome you all to the Microsoft Research Road Show.

We have a very interesting afternoon for you I think, starting off with a talk by Rick Rashid, Senior Vice President of Microsoft Research and my boss. Rick came to Microsoft in 1991 as director of Microsoft Research, really to get the research organization started, became Vice President in 1994 and Senior Vice President in 2000. And over the course of the past dozen years he has been responsible for many of the key technologies that have come out of Microsoft Research and into Microsoft’s products, including work in operating systems, data compression, digital media, including what became the Windows Media Division, and most recently was responsible for the work that has become the Smart Personal Objects Technology.

Before coming to Microsoft, Rick was a professor at Carnegie Mellon University where he worked on the very well-known Mach operating system, which is still in the marketplace today in various guises.

And so it’s my very great pleasure to welcome Rick Rashid.

(Applause.)

RICK RASHID: Thanks, Roy.

Alright. Well, if this is a road show, I guess I’m a roadie, though I don’t know what that means exactly; never quite figured that out.

What I’m going to do is try to give you a sense of some of the things that we’re doing within the research organization and a sense of the organization itself and then you’ll have a chance later on to go around and see some of the demos and look at a small sample of some of the projects that we have that are going on.

Now, to put everything in perspective, as Roy said, we started Microsoft Research back in 1991 so I was sort of officially employee Number One and we date the beginning of the organization to the day that I officially joined Microsoft.

From that time we’ve just grown and grown and grown. Today we’re about 700 researchers. We span more than one continent now. We have most of our research here in the United States. Our largest facility is in Redmond and that was our first facility, but we also have a group, actually our second lab was actually in San Francisco under Jim Gray, which is a very small team, and then we’ve been growing our team here in Silicon Valley in Mountain View for the last year and a half or so, and that group is now up around 20 or so and continues to grow. So that’s our U.S. contingent.

A little bit more than five years ago, Roger Needham started up our research lab in Cambridge, England. Unfortunately, Roger just recently passed away, but he’s created a great organization there, 75 researchers right on the campus of the University of Cambridge. In fact, our building and the University of Cambridge Computer Laboratory building are right next to each other.

Not quite five years ago, we started our lab in Beijing and that’s now grown to about 150 so it’s a very strong, very rapidly growing team there and they’ve been doing a lot of really excellent work.

So that’s sort of the organization as a whole. Obviously there are a lot of great people. Some of them are here. You’ll have a chance to meet them in the demo room, like Jim Gray, for example, Gordon Bell, Tony Hoare, who works at Microsoft Research at our Cambridge lab, Gary Starkweather, whom many of you know as the inventor of the laser printer. But we also have some tremendous mathematicians as well: Michael Freedman and Laci Lovasz are two examples in the mathematical area. So it’s a good group all around.

We’ve always had the same mission and we’re continuing to have the same mission. I’ve used this slide since 1992 when I first did my talk going around and trying to hire people, to start the lab back in those days, which is first and foremost we’re about moving the state of the art forward. We’re patterned after Carnegie Mellon University. That’s where I came from. That’s where my traditions came from. So that’s the structure of the organization that we have.

Our people are judged on peer-reviewed literature, just like they would be in the university environment. And the goal here is to say you have to move the state of the art forward if you’re going to be of value to a corporation like Microsoft, and that’s what we’re trying to do first and foremost.

When we have good ideas then our secondary mission is to take those and move those into our products as quickly as possible, and that really is a statement that says Microsoft is not going to be here ten years from now unless — unless as a company we continue to innovate and do new things.

If you think about what Microsoft was like when I got to Microsoft in ’91, it’s not the same company anymore by any stretch of the imagination. I think the last time I looked, something like 98 percent of the people who work at the company today came after I did. It gives you a sense of how different the company is. It was a very small company in those days; it’s grown a lot. When I first got to Microsoft the main thing that I thought the company did was DOS. In fact, I even wondered at the time whether it made any sense to do a basic research lab in computer science at a company that did DOS, and I worried about it a little bit, but it’s just a very, very different environment, very different climate.

The world has changed a lot in that period of time and it’s going to change a lot in the next ten years, and that’s really why our organization is here.

As I said, we’re really organized like a university. That’s a model; when people write papers, they publish their papers, nobody reads the papers before they publish them. Typically, just as it would be in the university environment, they don’t finish writing the papers until about an hour before they’re due. (Laughter.) And, in fact, also just like in a university environment they often don’t finish the papers until the day after they’re due and then they wheedle and they cry and they do all the things to get the program chairman to accept the paper anyway. So we’re just like a university.

We’ve built the organization up as a collection of critical mass groups. The approach we’ve taken to building the organization has always been to start with key senior technical people, people like Roy, to start individual areas or labs. Those people then bring in mid-career and junior people and that’s the way we’ve built every research area that we’re in.

But also just as we come out of the university world we’ve also had ties back into the university environment from the very beginning. So about 15 percent of my total budget actually goes back into the universities in the form of research grants, fellowships, support for curricula, lab grants and things of that nature. And then, of course, we have just a huge number of interns and visitors each year. I don’t even have a count anymore; I used to have a count. I know it’s over 150 interns, PhD interns just in Redmond every summer, and worldwide it’s probably over 200. And every day we have three or four talks from outside visitors at Microsoft in Redmond.

In terms of expanding the state of the art we’re out there publishing just like any good university research lab would be, and in many fields now you’ll find a significant number of papers coming from our organization. We’ve probably published more papers over the last six years than any two or three groups combined, but it’s been a tremendous publication machine, if nothing else, and we really believe in participating in the communities that we work in.

But we’re also focused on getting our ideas into products, and this is something that from the very beginning I’ve taken to heart. It sometimes again comes from my heritage at CMU, but it’s something we believe in a lot, which is we want to change the world. We want to get the ideas that we have out in some tangible form and so we’ve had very focused efforts from the very beginning. Even when the lab only had five or ten people we had a program manager whose job was to do technology transfer and we’ve been growing that group and growing that group of professionals as the organization has grown.

I always talk about technology transfer as a full-contact sport. We do everything we can to just get people to interact with each other. That’s where ideas flow, as all of you know from your own environments.

And in many cases we actually have researchers that have specific things that they are pushing like, for example, our data mining teams. They really want to move data mining technology into our products and the way they did that was move themselves into product groups for some period of time until that technology was established and then they came back into the research organization.

A lot of the things that make Microsoft what it is today came out of our research group. The simplest example of the impact that the research group can have on a company like Microsoft, and it’s also a very old example, is to go back to 1995 when we released Windows 95 and Office 95 together. The only reason we were able to do that was because we had developed optimization technology for reducing the working set size of 32-bit code that allowed us to fit Windows and Office, now expanded by a factor of two because they went from 16-bit to 32-bit code, back into the same space that Windows 3.1 fit in. So we were able to allow the company to sim-ship those products on four and eight megabyte machines, which was the standard of that particular era, long before our Office suite competitor was able to. And what that meant was several hundred million dollars of direct income for the company, because being able to sim-ship those products had a huge marketing value to us. But more importantly, it was a huge differentiator in the marketplace that we were able to bring our 32-bit products to market earlier and it was a tremendous advantage to us.

A lot of other technologies that have driven key products, things like the Windows Media audio, came out of our research group. A guy named Rico Malvar drove that. A lot of the key streaming media technology, in fact, the Streaming Media Division originally came out of the research group as a team I started back in 1993. I started the first e-commerce group in the company.

More recently things like Tablet PC and what we call our Smart Personal Objects are fairly direct transitions of people and technologies from the research group into the products.

So let me just use those as examples of some of the things that have happened. So Tablet PC is something now that has just gone into the market. It’s having a fair amount of success. I saw a report yesterday that said that some independent group was saying there’s like 5 percent, they’re anticipating 5 percent of notebooks sold this year will be tablets. That’s a big impact very quickly for a new product.

The original tablet PC design was actually done by Chuck Thacker, one of our researchers. In fact, I remember when Chuck helped to start our research lab in Cambridge, England with Roger. I remember when he was sawing apart Sony laptops and flipping the screen and sitting it on top of the keyboard part and saying, “OK, well, that’s what I want,” and then putting a digitizer in and beginning to play with the software. That was the very beginnings of that effort.

And then as time went on Chuck did the reference design for the Tablet that became the design that various manufacturers built upon. But many of the key technologies that went into the Tablet, the inking technology, the handwriting recognition technology, the note-taking and reading technology came out of the Research group, and not just the one in Redmond but also the one in Beijing and in Cambridge as well.

Most recently, just at CES and in last year’s COMDEX we announced something that was incubated within Research called SPOT, for our Smart Personal Object Technology. And the idea behind SPOT is really in some sense the ubiquitous computing vision. The idea is that you can have smart computing, processing power that’s relatively inexpensive, very low power, that can be put in virtually any device and yet has connectivity out to the outside world.

So what we’ve done with SPOT is effectively created an infrastructure that allows us to bring computing to almost any kind of device. So what we’ve built is effectively a national — I say national but it includes Canada, so U.S. and Canada network built around FM subcarrier because that network allowed us to deploy it very quickly to a broad collection of locations, and, in fact, when the technology first comes online, which will be this fall, we’ll cover more than 80 percent of the U.S. and Canadian population, so it gives you a sense, about 180 million people will be covered by that. So it’s a very easy to deploy technology.

We developed a new way of using the subcarrier to be able to bring much greater bandwidth and much more error resistance to the subcarrier so that we could get more data into these devices.

And then we built an infrastructure, a software infrastructure so that the devices could be continuously updated and easily programmed, so, in fact, we’ve built a version of the common language runtime that actually runs inside this very small device and yet allows us to send software directly to it and update it over time.

Our first device partners are watch manufacturers and the feeling there was that the watch is kind of the ideal place where you could bring timely information and data to an individual, because it’s right there, it’s something you can look at, it’s always sitting right in front of you. So right now I’ve got one of those watches on, one of our prototypes and it’s 58 degrees in San Jose with a cloud and a sun showing; I don’t know what that means exactly. There’s a chance of rain today. It’s supposed to be 63 as a high. Friday and Saturday look better, so that’s good for you guys. Humidity is 53 percent. Sunrise was at 6:31, sunset 6:43. The barometer is at 30.06 and steady. Winds are out of the south-southeast at 6 miles an hour. And the UV index is minimal, so don’t have to worry about suntan today.

But that gives you a sense. When I got off the airplane, I got San Jose information. Back in Seattle, I’ve got Seattle information. And then I’ve got my own calendar on here as well so I know that I’m actually supposed to be giving this talk.

So there’s lots of things that you could get from this but more to the point this isn’t just about watches and getting information to an individual; it’s also about control, the idea that you can now bring computing, the computing experience and extend it to just about any kind of device, and that these don’t have to be sort of funny little devices that all have weird programming mechanisms and don’t fit into an environment. These kinds of devices can fit neatly into a computing ecology and work with each other.

So the way I tend to talk about this is this is a way of bringing computing to anyplace that you need to have personalization where you’d like the devices to know about you, to know about the world around them, to know about each other and where intelligence can be brought to bear to make a task better or simpler.

And this is just another picture to give you a little bit of a sense of what it might look like. This is a Suunto-based device.

And we’re looking at a wide variety of other things. We’ve only announced watches but the technology could be put into again just about anything.

So that’s an example where we’re in research and we started out with some ideas about how we wanted to bring this sort of ubiquitous computing concept out. We looked at ways that we could do it and the kinds of devices that would make sense. We worked to design a chip set and a radio technology that would allow us to do it and that would be low enough power and cheap enough that it could be widely deployed. And we worked with research teams and then eventually with the product teams to bring these concepts together and then went out to partner with National Semiconductor, radio partners like Clear Channel and Entercom and Rogers in Canada and with device manufacturers, in this case Suunto and Citizen and Fossil, to really try to bring this all together and bring something out to market.

Now, that’s an example of technologies that in some sense have already made the transition from the research end of the world into the product end of the world, but there are a lot of things that we’re doing that are much more research focused, and I’m going to highlight a few here. You’ll see some more in the demos next door. Keep in mind that we cover more than 55 different research areas in our organization, so my suggestion is if you really want to find out what we’re doing go to Research.Microsoft.com and you get a chance to see the full breadth of the activity. I’m only going to just cover some quick highlights of various things.

Now, an area that honestly doesn’t get talked about very much in public because it’s kind of boring for a lot of people but for computer scientists I think it’s extremely exciting is the whole question of software engineering, how we can really change the software engineering process. For a company like Microsoft it’s critical because that’s what we do. We write just a ton of software.

And so our whole cost structure is built around what does it take to produce millions of lines of software, how can we do it right, how can we do it better than it’s been done before and how can we integrate many different activities together, many different pieces of software in one environment.

So the challenge really for us is to say how do you build artifacts, software artifacts that are tens, soon to be hundreds of millions of lines of code and do it in a way that does what users want it to do, that doesn’t fall on its face all the time and that you have some sense of whether you can make some positive statements about what the software does and what it doesn’t do out in the field.

And increasingly the marketplace requires much higher reliability as time goes on, because the bar keeps getting moved higher, and it requires change. People really need new things all the time and we need to be able to create new things quickly. And so we need to be able to build software that, in fact, lets us build software.

So the focus in our research efforts, we started about five years ago something called our Programmer Productivity Research Center, headed up by Amitabh Srivastava, and the focus of that center has been how you basically take software and let it analyze and help you develop software; in other words, how you put engineering into software engineering and focus on defect prevention, focus on early detection, focus on the technologies that allow you to track the development process from specification all the way through to deployment and the bugs that you get back and integrate all that information together so you can continually improve the product and improve the information.

And we chose to do this in a way that on the one hand was research oriented, meaning we needed to be able to continue to improve the tools rapidly and try out new ideas and new technologies but we also said that group was going to be the main group providing technology to the whole company so that we would get the feedback from the product groups as they effectively dog-fooded our research, told us what worked and what didn’t, and so we were able to get a feedback loop built up in the process.

So let me give you some quick examples of this. One of the things we recently showed off at the Object Oriented Programming Conference was something called Scout. It’s really test prioritization software. And the idea behind Scout is to think about what it takes to build a product like Windows and test it. The answer is it’s basically a pipeline process. It takes six to eight weeks to run all the tests that need to be run to verify that an instance of a build of Windows is the same as the last instance with respect to those tests. It’s an enormous undertaking. Over 800 people have to be involved in doing that testing. That’s an enormous undertaking.

The problem, of course, is that as you’re developing a new version you’re making changes. Those changes go into the pipe, go into the six to eight-week test period and the worst thing that could happen to you is you discover five or six weeks into that test period that something you did five or six weeks ago broke it and now you’ve got to flush that pipe or go back and figure out what it was that you did that caused the problem and then undo that, but you’ve got six weeks of other changes. And so it’s a serious, serious production problem for us.

What the team that worked on this thing called Scout did was to take the idea and say, look, we’ve run one set of tests (here’s the image of it). If we can map the basic blocks in the code to the tests that touch those basic blocks we can effectively understand what the impact of any change that gets made would be with respect to those tests. And effectively the concept here is to re-prioritize the tests that are performed based on the changes that were made in the code so that when a change gets made you test those things first that are most likely to find the problem in that particular piece of code. Therefore, you’ll find the bug one day into the process, not six weeks into the process ideally, and so you’ll have to flush that pipeline relatively less often.
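The prioritization idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Scout's actual algorithm: the function name `prioritize_tests`, the test names, and the block numbers are all hypothetical, and the greedy "cover the most changed blocks first" ordering is one plausible reading of the description.

```python
def prioritize_tests(coverage, changed_blocks):
    """Order tests so the ones most likely to exercise a change run first.

    coverage: dict mapping a test name to the set of basic-block ids it touched
              in a previous run (hypothetical data, not a real Scout format).
    changed_blocks: set of basic-block ids modified by the new change.
    Returns (ordered_tests, uncovered_changed_blocks).
    """
    changed = set(changed_blocks)
    remaining = set(changed)
    candidates = dict(coverage)
    ordered = []

    # Greedy pass: repeatedly pick the test that touches the most changed
    # blocks not yet touched by an earlier pick (ties broken by name).
    while remaining and candidates:
        best = max(sorted(candidates), key=lambda t: len(candidates[t] & remaining))
        gained = candidates[best] & remaining
        if not gained:
            break
        ordered.append(best)
        remaining -= gained
        del candidates[best]

    # Everything else runs afterwards, tests overlapping the change first.
    tail = sorted(candidates, key=lambda t: (-len(candidates[t] & changed), t))
    return ordered + tail, remaining


coverage = {
    "test_boot": {1, 2, 3},
    "test_net": {3, 4},
    "test_ui": {7, 8},
    "test_fs": {4, 5, 6},
}
order, uncovered = prioritize_tests(coverage, {4, 5, 9})
print(order)
print(uncovered)
```

In this toy run the file-system test goes first because it touches two of the three changed blocks, and block 9 is reported as touched by no existing test, which is itself useful information about a gap in the test suite.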

So that’s one of the things and that’s now gone into use in virtually all the major product teams that we have. Exchange is using it. It’s a critical tool for us in the Windows 2003 Server to help us improve the productivity. The SQL team uses it. And we’ve been able to scale this whole process to literally tens, nearly a hundred million lines of code now in our code base.

And if you’re interested, there’s a paper in ISSTA from Amitabh and one of his colleagues on this.

Another area that we began to prioritize very heavily, really starting at the very end of the Windows 2000 cycle in terms of our research was static code analysis, the idea that you effectively could analyze code to look for many classes of bugs, which historically we only found through our test processes.

And there are a number of tools that we’ve built. The first one we put online was something called PREfix. We acquired a company in that rough timeframe with some great people in it, Jon Pincus and a number of other really top people, and they brought that technology into Microsoft and then we refined it and continued to develop it. We first brought it online at the very tail end of the Windows 2000 process and now it’s become a part of our normal analysis and normal code development process within the company.

And the idea is that it does basically a symbolic execution of the code. There are models for how this code is supposed to behave. And basically it’s a big model checker, operating over the tens of millions of lines of code that we have. And today we use it in all of our main products and it’s become a primary means for us to find a wide variety of different kinds of defects.

Now, the interesting thing about PREfix is that it is a model checker but it is a heuristic model checker, meaning that it can tell you when something is wrong; it can’t tell you that nothing is wrong, so just to keep that in your head.

Now, the most recent iteration of PREfix is something we call PREfast. PREfix was really a tool that ran at the very end of the development process to verify over very large bodies of code whether something was or wasn’t likely to be correct. PREfast is now a tool that ordinary developers use and that many of our product groups are now required to use before they even check in their code. And the idea is to do this kind of model checking instantly against the code as the developer develops it.

PREfast, just like PREfix, can’t say that there isn’t a bug or a problem; it can only say that it’s found some things that it thinks are suspicious or are probably not correct.

Most recently we’ve had some of our teams looking at really a problem that says how can you define properties of programs that you can actually demonstrate are true or false for large bodies of code. And so two different tools have been brought online. One of them, SLAM, is now making the transition into our product group and will probably be something that you’ll see us using extensively with the next release of Windows.

And the idea behind SLAM is that you define a set of properties, a set of rules, mathematical properties of programs that you believe need to be true. They may be statements about the use of memory. They may be statements about the use of locks. They may be statements about the use of APIs. But there’s clear mathematical statements or rules that say this has to be true for this body of code, right, so it could be something like this only touches memory that is allocated. That’s an example of the kind of thing you might be able to say. Or you release a lock that you take; that’s another type of an example.

Now, unlike the PREfix or PREfast tools, which were really built to do symbolic code execution, what SLAM does is it literally takes those rules, takes the original C or C++ or C# program and effectively constructs a new program from the original that has only binary variables and that, with respect to those sets of rules, is identical to the original program. That then can be exhaustively proven with proof techniques and that’s really the idea behind SLAM. And this is a technique now that we’ve been able to use on device drivers and pieces of code that may be hundreds of thousands or even in some cases a million lines of code. I mean, a device driver these days is a very large thing or can be.

And this is really a way again of demonstrating the absence of a defect; meaning, if this procedure works and states that this particular piece of code has this property, then you’re done with respect to that property. It does have that property.
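To make the "you release a lock that you take" rule concrete, here is a toy Python sketch. It is not SLAM and makes no claim about how SLAM is built: it assumes the program has already been reduced to a small control-flow graph whose only state is one binary variable (lock held or not), and the graph encoding and the name `check_lock_rule` are hypothetical. Because the state space is finite, it can be explored exhaustively, so a result of `None` means the rule holds on every path of this abstract program.

```python
from collections import deque


def check_lock_rule(edges, entry, exit_node):
    """Exhaustively explore (node, lock_held) states of a tiny boolean program.

    edges: dict mapping a node to a list of (action, target) pairs, where
           action is "acquire", "release", or "skip" (hypothetical encoding).
    Returns None if the rule holds on every path, else a violation message.
    """
    start = (entry, False)
    seen = {start}
    queue = deque([start])
    while queue:
        node, held = queue.popleft()
        if node == exit_node and held:
            return f"lock still held at exit node {node!r}"
        for action, target in edges.get(node, []):
            if action == "acquire":
                if held:
                    return f"double acquire on edge {node!r} -> {target!r}"
                nxt = (target, True)
            elif action == "release":
                if not held:
                    return f"release without acquire on edge {node!r} -> {target!r}"
                nxt = (target, False)
            else:
                nxt = (target, held)
            if nxt not in seen:  # each abstract state is visited once
                seen.add(nxt)
                queue.append(nxt)
    return None


# A loop that keeps the lock held, then releases it once: the rule holds.
good = {"a": [("acquire", "b")], "b": [("skip", "b"), ("skip", "c")], "c": [("release", "d")]}
# One branch releases early, so the later release happens without the lock.
bad = {"a": [("acquire", "b")], "b": [("skip", "c"), ("release", "c")], "c": [("release", "d")]}

print(check_lock_rule(good, "a", "d"))
print(check_lock_rule(bad, "a", "d"))
```

The point of the sketch is the contrast drawn in the talk: a heuristic tool can only report suspicious paths it happens to find, whereas an exhaustive search over a finite abstraction can say that no violating path exists with respect to the stated rule.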

We’ve sort of moved upscale from that, looking at ways of taking the same approach, the same set of ideas and applying them to something as large as the entire Windows code base. That’s something that some of the other researchers in the PPRC have been working on, and it now allows us to talk about properties of extraordinarily large bodies of code, again, and state categorically whether there is or isn’t that particular defect in the code.

Now, this is not proving programs correct, because there’s not a way to really talk about that in a sensible fashion because no one knows what correct might mean. This is a way of saying with respect to certain properties do those properties hold or not for this particular body of code.

So that’s some of the work we’re doing, just the tip of the iceberg of work that we’re doing in sort of the software engineering area. The key thing there is we’re really trying to change the way people think about building software at Microsoft and ultimately elsewhere and to think of it more as a managed process where from the very beginning your specification and all the information that you collect as you develop the software is kept in a large database, that that database then can be annotated with any changes or modifications that are made over time, that all the testing and test procedures and test results can be kept in that database, results of all the different tests you run against the code can be kept in the database and then when we release a product to the field things like our online crash analysis that we have in Windows XP and Office XP, that data that comes back from the field from users that have problems can be fed back into our system and then we effectively mine that data looking for the bugs that that data indicates exists. So it’s a very sort of integrated, holistic way of thinking of the development process from the beginning to the end.

Now, I also wanted to talk about things we’re doing, you know, how we’re sort of thinking about changing systems and user interfaces and highlight some of the work that’s going on there.

Let me just point out that the world is really changing dramatically. The easiest way to look at this is to see how cheap a terabyte of disk storage is now. At CompUSA you can buy 250 gigabytes of disk, the last time I was there, which was a couple weeks ago, for $400, $399 and I think there was a rebate, but I never send in rebates so I don’t count that.

So that’s where we are. In the next two or three years that’s just going to be a single disk drive and it’s going to cost you $300 or $400 just like a single disk drive does today, a high capacity disk drive.

To put it in perspective, every conversation you could have from the time you were born to the time you die could be in a terabyte of disk storage. And depending on what part of the country you’re from, you might have a lot of space left over, or not. (Laughter.)

Another way of saying the same thing is you put a camera on your head, you could keep half a year of everything you saw while you were awake — I’m assuming eight hours of sleep; that’s what I usually get — in standard VHS video kind of format in that same terabyte of disk.
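As a rough check on that figure, the arithmetic can be run backwards to see what video bitrate it implies. The sketch below assumes a decimal terabyte (10^12 bytes) and a half year of 182.5 days; only the eight hours of sleep comes from the talk.

```python
TERABYTE_BYTES = 10**12          # assumption: decimal terabyte
HOURS_AWAKE_PER_DAY = 24 - 8     # eight hours of sleep, as stated in the talk
DAYS = 365 / 2                   # assumption: half a year is 182.5 days

seconds_recorded = DAYS * HOURS_AWAKE_PER_DAY * 3600
bytes_per_second = TERABYTE_BYTES / seconds_recorded
kilobits_per_second = bytes_per_second * 8 / 1000

print(f"{seconds_recorded / 3600:.0f} hours of video")
print(f"{kilobits_per_second:.0f} kbit/s average bitrate")
```

Under these assumptions the terabyte holds about 2,920 waking hours at an average of roughly 760 kbit/s, so the claim amounts to saying that VHS-like video can be compressed to well under a megabit per second.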

So another way of saying this, the way I like to talk about it is we’re reaching the point where storage is of human scale, right, where we can literally think about keeping the things that people care about.

Now, what does that mean? Well, first off it means we have to think about what kind of computing environment we have to support that. I think you have to move to a much more query-oriented view from a hierarchical view, or another way of saying the same thing is today we have the Dewey Decimal system on our computers, this nice hierarchical structure that says that I think cats and dogs are under animal husbandry, which is under agriculture and related technologies, which is under technology. I don’t get that, but then again I can’t find the law files in my documents either, so that’s a problem and it’s not a good way for people to keep information.

So what we’re looking at is how can we change the relationship between the user and the computer to really take advantage of more natural human constructs and contexts, things like what have I seen. One of the most powerful tools you have is your memory and a lot of times you have just a hint of something associated with something in your memory and you don’t know what it is but you know there’s a relationship there you can grab with it.

So an example of this is one of the projects we’ve built around this idea is something called Stuff I’ve Seen, which is literally a system that keeps track of everything you look at on your computer screen as you’re using it. And the idea is that that’s a context. You could do queries against that context. And there are a lot of things that happen to you as the day goes on that you might want to relate back to the things that you’ve seen.

So in many cases when I look for information I’m looking for something I’ve actually seen before or I may have looked at and I want to get back to it but I just don’t know exactly where it is. So Stuff I’ve Seen is a way of doing that and time is in a sense a uniquely interesting way of thinking about it.

Here is just an example of a screen shot. This is one of the interfaces they’ve been experimenting with where you can put in a query and get all the things that you’ve been looking at recently that relate back to that query.
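The query-against-what-you-have-seen idea can be illustrated with a very small in-memory index. This is a hypothetical sketch, not the Stuff I've Seen implementation: the `SeenItem` and `SeenIndex` names, the fields, and the substring matching are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SeenItem:
    title: str
    kind: str          # e.g. "mail", "web", "document" (illustrative labels)
    text: str
    seen_at: datetime


class SeenIndex:
    """Keeps every item the user looked at and answers keyword queries."""

    def __init__(self):
        self._items = []

    def record(self, item):
        self._items.append(item)

    def query(self, term, since=None):
        """Return matching items, most recently seen first."""
        needle = term.lower()
        hits = [
            item for item in self._items
            if needle in (item.title + " " + item.text).lower()
            and (since is None or item.seen_at >= since)
        ]
        return sorted(hits, key=lambda item: item.seen_at, reverse=True)


index = SeenIndex()
index.record(SeenItem("Budget draft", "document", "Q2 budget for the lab", datetime(2003, 4, 1, 9, 0)))
index.record(SeenItem("Re: budget", "mail", "Comments on the numbers", datetime(2003, 4, 10, 14, 30)))
index.record(SeenItem("Weather", "web", "San Jose forecast", datetime(2003, 4, 15, 8, 0)))

for item in index.query("budget"):
    print(item.seen_at.date(), item.kind, item.title)
```

Ordering results by when they were seen, rather than by where they are stored, reflects the point made in the talk that time is a natural handle for getting back to something you half remember.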

Another way of looking at it is to think about it as things that are landmarks against images you may have looked at, meetings that you may have had, basically different reference points, the weather. There are lots of different reference points that people have in their head associated with the things that they want to find, and again what we’ve been experimenting with in this particular project that’s run by Susan Dumais in our research lab in Redmond is this idea of, now that I have all this knowledge about what I’ve seen, how can I index it, how can I manage it, how can I get access to it, how can I show it to the user in a reasonable way?

More broadly, we’ve been looking at ways of modeling the user interface to sort of fit the way people think about things or the way they think and feel about things that they have, taking advantage of this notion of deep history and association as a structural concept.

And this is just a simple example from a project run by Lili Cheng in our research lab in Redmond called Sapphire, where she’s been looking at sort of what might new kinds of interfaces look like. So if, for example, you’re working on a document and you want to be able to, as you look at something, know, for example, if there’s a name, there’s a person and the person, they’re someplace; in this case you may have access to their schedule. You may know what mail you’ve received from them, what documents you’ve received from them, what meetings you’ve had with them. It may relate back to other documents that you’ve been working on related to that person or related to the document you’re looking at. There may be an implicit workgroup of people that you’ve been working with related to this document, and you may be able to think of things in terms of a calendar of events that are going on.

The point here is to experiment with the notion that says take all that information that today we effectively throw away, but that the computer knows about you when you’re working on something. If I’m creating a document, the computer knows, or could know if it bothered to keep track, where these words came from: were they typed in by you, did you cut and paste them from somewhere else, did you get them in an e-mail message from somewhere, did you get them from a slide, where did this information come from, who did it come from, what were you doing at the time you were working on this document. That’s all information that is there to be mined or had or kept and now that we’ve got a lot of storage we can think about using it as ways of referencing back to the information that we have.

Now, I mentioned part of this, in talking about this and talking about the notion of human information, mining human information, one of the key elements of that is that people typically work in language. Going back to 1997 we’ve had natural language technology built into our products from our research group. This is the original 1997, Office 97, what’s called Text Critiquer that was built into Office. We’ve used the same technology for summarization, Japanese word breaking and so forth. It’s just doing grammar checking in this particular case, just like you learned when you were in grammar school.

But what we’ve been able to do over time is use this technology to develop better and better ways of sort of pulling apart language and using it either for search purposes, for creating databases of information. This is an example of a database that’s been extracted from a dictionary. These are all the relationships in the dictionary to the word “bird”. Here poultry is a bird. A duck is a bird. A quack is caused by a duck and so forth. And no human being typed that in; that was just written down in a dictionary and the software pulled it out automatically by just parsing the information that was there.
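To make the idea of pulling relationships out of dictionary text concrete, here is a small Python sketch. It is only an approximation: the system described relies on real parsing, whereas this uses two regular expressions, and the three definitions are hypothetical entries written for the example.

```python
import re
from collections import defaultdict

# Hypothetical, simplified dictionary entries; a real system would parse
# a real dictionary rather than match two hand-written patterns.
DEFINITIONS = {
    "duck": "a bird that swims and has a broad bill",
    "poultry": "a domestic bird kept for eggs or meat",
    "quack": "the sound made by a duck",
}

# "a [modifier] NOUN that/kept/which ..."  ->  headword is_a NOUN
IS_A = re.compile(r"^an? (?:\w+ )?(\w+) (?:that|kept|which)\b")
# "the sound made by a NOUN"               ->  headword caused_by NOUN
CAUSED_BY = re.compile(r"^the sound made by an? (\w+)")


def extract_relations(definitions):
    """Map each target word to the (headword, relation) pairs pointing at it."""
    relations = defaultdict(list)
    for headword, text in definitions.items():
        match = IS_A.match(text)
        if match:
            relations[match.group(1)].append((headword, "is_a"))
        match = CAUSED_BY.match(text)
        if match:
            relations[match.group(1)].append((headword, "caused_by"))
    return dict(relations)


relations = extract_relations(DEFINITIONS)
print(relations["bird"])  # headwords defined as kinds of bird
print(relations["duck"])  # things a duck causes
```

Nobody types the relationships in; they fall out of the definitions, which is the property the database example relies on.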

We’ve taken that technology, brought it together with statistical techniques, and we’re now able to use that for things other than just search, things like machine translation.

So one of the things we’ve done, for example, is to bring online just this week, in fact, a natural language processing system that we’re using for our knowledge base. The first knowledge base that we’re translating is our Spanish knowledge base and as of this week all 125,000 articles are automatically translated from English into Spanish. Before only a few thousand documents could ever be pulled up in Spanish. They were all hand translated. When we did our beta testing, which we’ve been doing over the last year and a half or so of the technology, what we found was the utility of the computer translated, machine translation document is about the same as for the English document for English speakers. So we’ve been very, very pleased about that result. We’re in the process of moving it to Japanese, German and French, so we’ll start with Japanese first, and German and French soon thereafter, and we’ll be gradually bringing more and more of our languages online.

One thing to understand about this is that you’re not really translating English. When you’re translating Microsoft Knowledge Base articles, they may be written in something that started out as English, but they’re not English. And that’s the critical point about this technology, which is it is a learned technology where we take translated pairs, hundreds of thousands of lines of translated documents, and that builds the structures that we then use for the rest of the translation. That’s how we can take Knowledge Base articles in English and translate them into Knowledge Base articles in Spanish, as opposed to taking Knowledge Base articles in English and translating them into God knows what in Spanish.

So that’s the key bit of magic in this particular piece of work and it’s the first statistical language technology that we know of that’s been brought on this scale to the marketplace.

Another thing that we’ve done with some of our language technology and statistical analysis techniques for language, I showed this before and so some people here have probably seen it, is we started to look at the question of can you actually get answers to questions using the documents that are out in the wild, in this case in the World Wide Web. And so one of our researchers, Eric Brill, has developed a system where he basically takes a question that you asked, uses natural language techniques to effectively rephrase that question as a variety of possible answers, goes out on the Web looking for articles that might have those answers in them using traditional search techniques, and then uses statistical language techniques to effectively take those articles that might have answers, correlate them against each other and say, OK, well there’s probably a lot of redundancy in the information; what is likely to be the real answer to the question.

So a simple example of that is, who is the head of Microsoft Research? I put this in I think as my first question, figured that it would be interesting if it got it wrong. It got it right, so I didn’t fire the guy. (Laughter.) You don’t hear about the projects that didn’t work. (Laughter.) And actually he puts up some of the other possible answers, which are all actually, with the exception of computer laboratory, perfectly reasonable.

When did Abraham Lincoln die? April 15th. Interestingly enough, there are a number of places on the Net that say that he died on April 14th, which is not correct, but if you ask when did Abraham Lincoln get shot, he was shot on the 14th.

Now, the key thing there is that there’s no code in here at all that knows anything about Abraham Lincoln. For that matter it doesn’t even know about shooting. I mean, it doesn’t know anything about semantics at all. All it’s doing is just cross-correlating all the articles that come back that it’s looking at.
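The rephrase-then-vote idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the system shown in the talk: the snippets are hypothetical stand-ins for search results, the `rewrite` and `answer` helpers are invented names, and only one question shape is handled. The code knows nothing about Lincoln; it only counts which candidate follows the rephrased question most often.

```python
import re
from collections import Counter

# Hypothetical snippets standing in for search results; a real system
# would fetch these from a web search engine.
SNIPPETS = [
    "Abraham Lincoln died on April 15, 1865, the morning after he was shot.",
    "President Abraham Lincoln died on April 15 in a boarding house.",
    "Some pages claim Abraham Lincoln died on April 14, the night of the play.",
    "Lincoln was shot at Ford's Theatre; Abraham Lincoln died on April 15.",
]


def rewrite(question):
    """Turn 'When did X die?' into the answer prefix 'X died on'."""
    match = re.match(r"when did (.+) die\?", question, re.IGNORECASE)
    if not match:
        raise ValueError("only 'When did X die?' is handled in this sketch")
    return match.group(1) + " died on"


def answer(question, snippets):
    """Rank candidate answers by how many snippets agree on them."""
    prefix = rewrite(question)
    pattern = re.compile(re.escape(prefix) + r" (\w+ \d+)", re.IGNORECASE)
    votes = Counter()
    for snippet in snippets:
        for candidate in pattern.findall(snippet):
            votes[candidate] += 1
    return votes.most_common()


ranked = answer("When did Abraham Lincoln die?", SNIPPETS)
for candidate, count in ranked:
    print(candidate, count)
```

Redundancy does the work here: a wrong date that appears on a few pages is outvoted by the date most pages agree on.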

It’s really good at factual things. It’s not very good at things that don’t make sense. But, for example, what is the color of grass? It comes back with some reasonable answers, of which green is the best.

Who is that masked man? (Laughter.) Okay, I put that one in. That was another one of my ringers. It got the answer right. (Laughter.)

Once I saw this I actually got excited about it and I put it in my talk. Why is the sky blue? That’s another one I always like to put in. In this case, light.

Now, this isn’t like Ask Jeeves, where somebody has actually written down answers and is looking for questions that you might ask. This again is just going out onto the Web. The thing is, if you aggregate all the knowledge that’s out there and then do analysis against it and mine it, you might be able to find some unique answers that make sense.

Now, you could ask it some silly things: What is the meaning of life, in case you wanted to know what the Internet thought. I think it’s particularly appropriate that a system that’s all about questions comes back and says the meaning of life is a question, but that’s the way it goes.

I was giving a talk. I think it was at our Professional Developers Conference about a little bit over a year ago, a year and a half ago. It was after September 11th. And somebody from the audience, and I was doing that one live because I had a system all set up there for it, and somebody in the audience asked, and so I put this in, and they wanted to know, where is Osama bin Laden hiding, so what does the Internet think. The answer is, in those days, the mountains in Afghanistan. If you ask it today, and actually I did it yesterday just to check, it says Afghanistan is still number one. Pakistan is now number two on that list so it’s changing. Kandahar was on there somewhere, so he might be sitting in Kandahar.

QUESTION: (Off mike.)

RICK RASHID: I tried that yesterday too and the first one that came back was Iraq. The second thing that came back was Syria. The third thing that came back was the United States. (Laughter.) So I can only assume that he’s been captured and that no one is telling us.

It’s all interesting, but the point here isn’t to be just amusing. The point I was really trying to get across is that what we’re trying to do is say how can we mine this vast storehouse of human information. It’s not a database, right; it’s a bunch of texts written in English or Spanish or Chinese or Japanese or whatever. And so you have to come up or start thinking about developing technologies that can do that and it can’t be human beings typing in answers to questions. It has to be something that’s automated or it won’t scale.

I’ll talk about other kinds of data. I don’t have a lot on this, but you’ll get a chance to see some related technology over in our demo booth over there.

One of the things we’re doing is also looking at general media. So, for example, in our most recent Movie Maker product, which was just released, one of the technologies in it is something called Auto Movie Edit, which came out of our research lab in Beijing. And what they do there is they’ve actually got — I’ll say this; it always sounds funny when I say it, but they have a model of human interest in video, meaning they’ve developed a model for what are the boring parts of your home videos and what parts you’d like to actually keep. (Laughter.) It’s not a perfect model; it’s not bad though. And the idea is you can feed your extremely long, boring video that you’ve shot in it, and it will produce something that looks a lot tighter, much shorter I will add, certainly from my own video, and it can set it to music nicely and make the transitions fit with the music and so forth.

But the idea is that it’s really not just sort of cutting things up and smashing them together. They tried to develop a model for what is it about images that make them interesting to a person or not and then apply that in a disciplined way to actual video.

They’re using the same techniques to look at TV kinds of video, especially news stories, things of that sort, what they call their Smart Video Browser. That’s not something we’ve shipped in a product yet, but it’s some technology that they have in the demonstrations.

And something which is in the process of being moved to the product teams right now is work that we have that basically does auto sorting and categorization of photos where you can tell is it an indoor scene — it’s actually very good at this — indoor scenes or outdoor scenes. They’re very characteristic so you can tell which ones are indoor and which ones are outdoor fairly easily. You can tell in many cases who’s actually in the images, how many people are in the images and if you cross-correlate that kind of data with what time you were taking the images and what your calendar was like on those days you can start to actually do a lot of interesting kinds of searching against your photos and your images, and so we’re in the process of again trying to bring data mining techniques to this kind of human information.
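Cross-correlating photo metadata with a calendar can be illustrated with a short Python sketch. Everything here is assumed for the example: the calendar entries, the photo records, and the `scene` and `people` fields, which stand in for whatever an image classifier would actually produce.

```python
from datetime import datetime

# Hypothetical calendar entries: (start, end, title).
CALENDAR = [
    (datetime(2003, 4, 12, 10), datetime(2003, 4, 12, 14), "Birthday party"),
    (datetime(2003, 4, 16, 13), datetime(2003, 4, 16, 17), "Road show"),
]

# Hypothetical photo records; "scene" and "people" stand in for the
# output of an image classifier.
PHOTOS = [
    {"file": "img_001.jpg", "taken": datetime(2003, 4, 12, 11, 5),
     "scene": "indoor", "people": 4},
    {"file": "img_002.jpg", "taken": datetime(2003, 4, 12, 13, 40),
     "scene": "outdoor", "people": 2},
    {"file": "img_003.jpg", "taken": datetime(2003, 4, 16, 15, 0),
     "scene": "indoor", "people": 0},
    {"file": "img_004.jpg", "taken": datetime(2003, 4, 20, 9, 0),
     "scene": "outdoor", "people": 1},
]


def label_photos(photos, calendar):
    """Attach the calendar event (if any) that was under way when each photo was taken."""
    labeled = []
    for photo in photos:
        event = next(
            (title for start, end, title in calendar
             if start <= photo["taken"] <= end),
            None,
        )
        labeled.append({**photo, "event": event})
    return labeled


def search(labeled, **criteria):
    """Return file names of photos matching every given field exactly."""
    return [p["file"] for p in labeled
            if all(p.get(key) == value for key, value in criteria.items())]


labeled = label_photos(PHOTOS, CALENDAR)
print(search(labeled, event="Birthday party", scene="indoor"))
```

Neither source is very informative alone, but joined on time they support queries like "indoor shots from the birthday party".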

Well, I’ve got more stuff. I’m going to stop there though because I promised I was going to leave some time for questions. There’s a lot more things I could talk about. I do recommend going to our Web site. One of the things I won’t get into today but I had slides on it, I just didn’t get there, is that we’re doing some really exciting work on learning technologies, technologies to help students learn and to do distance education, distance learning. So if you go out to our research Web site you can get a lot more information about that and that’s work we’re doing jointly with a lot of people, some of them here in the Bay area, Berkeley, Santa Cruz, Santa Clara, places like that, also at USC, Brown University, Carnegie Mellon and so forth, and of course there’s a very large program going on with MIT.

So with that I will open it up for questions.

QUESTION: I’m very curious about the Web services. Were you involved in the Web services development, and what is the feature and what is the vision of Microsoft with Web services?

RICK RASHID: So I was involved in the beginning of what has eventually become a lot of the Web services work that we’ve been doing and, in fact, my name is one of the names on the Web services security spec. And my interest there has primarily been in sort of overall architecture and security so that was my involvement on that side, working with a lot of the guys that are now in the developer division but originally their offices were right next to mine.

That’s an example where we took a product group, in this case what was at the time codenamed Indigo, and had them really sit in the research group, in the research building right next to myself and Butler Lampson and a number of our key people that had traditionally worked in that distributed systems area, and then it was just a great interaction between us and that product team and eventually they moved into the main line of the product organization.

In terms of where it’s going, I mean, again I could have gone into a lot of detail, but there are some great examples here of some of the work that Jim Gray and Tom Barclay have done where they’ve been using Web services technology to build things like large scale federated databases, working with the astronomy community, and where they’ve been using Web service technology to make available large databases like the TerraServer, and a lot of people, like the USDA, build applications, soil applications, against that kind of data.

So I think it’s got a bright future in terms of where it’s going.

Yeah, another question?

QUESTION: What is the promise of intentional programming and when do you believe we will benefit from it?

RICK RASHID: Well, that’s actually something that was a project that had been run in Research for quite some time under Charles Simonyi. Now, Charles has effectively spun himself out into a separate company really driving that.

The goal behind the intentional programming is really to try to find — in some sense it’s a transformational technology. The idea is to be able to allow people to effectively develop code in the abstractions that make the most sense for them for their particular environment and to be able to then transform that code for other purposes.

So the idea was to capture the programmer’s intent, be able to provide a series of transformations on that and effectively be able to repurpose code from one environment to the other through transformation.

Some of those technologies have made their way into Microsoft products. Certainly a lot of our work on XML that’s now going on within the Office organization, a lot of those people were people that worked with Charles and a lot of those ideas about XML transformation came out of some of the work that he did as part of that project. But the actual intentional programming activity itself, that’s now a company that Charles has that’s a separate company.

Another question?

QUESTION: Where does security fit into the research department in terms of is there like a separate group that has a research focus or is it just interwoven within the whole research department? Because there’s been so much of a priority put on security.

RICK RASHID: The security, I mean, that’s sort of in the area where it touches on a lot of different technologies. So, for example, we have people that are very focused on cryptography and so technologies around cryptography. That obviously has value in one kind of security, privacy, and being able to manage and protect the information. They were some of the key contributors to what is now the- (Trustworthy Computing initiative). In fact, the key members of the research team moved to the product team to move that technology forward.

So that’s certainly one element of it. We have people that look at security from a user interface perspective, for example, looking at how can people provide credentials that are less onerous to them. So one of the projects we have looks at ways that you can take advantage of people’s memory of images and pictures as a way of doing what’s called certification or testing that would be reliable but wouldn’t tax people’s memories quite so much as these nonsense words that we’re supposed to remember today.

We also have people looking at how you can have secure identity, forms of identity that can be carried around and that identify you as an individual and that can be signed by an authority and recognized without having the physical path itself be something that you have to keep private.

We also have people that are looking at — I mentioned sort of code security, right, how can you avoid certain well known classes of flaws that have the property that they allow outsiders to attack systems from the outside. And then there’s a whole set of things going on in networking.

So there’s a wide variety of different things. I mean, I was involved, for example, in our Web services security work. So there’s a wide variety of different things that touch on a lot of different technologies.

The problem with a word like security is that you’d never have a group, I think, that’s just called security, because you wouldn’t be sure what they’d be doing. There are so many things people call security; it’s not just one topic.

Yeah, other questions?

QUESTION: I wanted to ask you what is the nature of the budget for research at Microsoft and also what are the priorities? Also what are the major collaboration type of projects you’ve been doing with outside companies? And also if you are targeting certain verticals or the enterprises? (Laughter.)

RICK RASHID: Okay, I need that software now. (Laughter.) This is stuff I heard.

Let’s see. I think one of the questions was budget. We have 700 people. We spend about as much money as you’d need to spend for 700 people. (Laughter.)

One of the characteristics of the group, in fact, is that we don’t really obsess very much about budget. What I’ve found, again going back to my early CMU experience, is that research groups work best when researchers don’t know anything about budgets. In fact, they’re actually cheaper to run that way because they don’t get into weird counterproductive behavior. (Laughter.)

And so basically what we tell our researchers is that they get what they need, and if they don’t need it they shouldn’t ask for it, and that’s worked reasonably well. And so we tend to look at our research effort in terms of how many people we’ve put against the research task rather than the pure number of dollars.

But in terms of dollars, we’ve said I think publicly before it’s over $200 million, so it’s roughly a quarter of a billion dollar range. But that counts a lot, so I don’t think you should view that as a precise measure. Obviously some percentage of that goes to universities, for example.

Now, let’s see. I lost track of the middle question. The last question was, were we more enterprise focused or more consumer focused, and the answer to that is I think it’s just both. I mean, it’s different groups, right. The groups that are really interested in digital media, they’re probably a bit more focused on the consumer, although there are a lot of scenarios where digital media are relevant to enterprises.

One of our researchers, Anoop Gupta, for example, has now become Vice President of the Real Time Collaboration group that’s now a product organization within the company, and they just acquired PlaceWare, which is a company that was in that business to push that agenda.

So there’s a lot of work that we’re doing sort of broadly that I think is applicable on both sides and a lot of times you don’t know when you’re doing research which side it’s really going to be most applicable to.

And I missed the middle question, I’m sorry.

QUESTION: It seems like we’ve hit a wall on speech to text, everybody has, and it’s somewhere below the threshold where people will use it readily. Is there something coming?

RICK RASHID: Well, first off we haven’t hit a wall. If you look at speech recognition in the Tablet PC, and it’s also coming out in Office 11, it’s significantly better than in the last release of Office, Office XP. And, in fact, broadly if you look at the field, we’ve been getting about 10 or 15 percent improvement historically year after year after year after year.

Now, I think what you’re thinking of is that there are sort of thresholds of utility that you cross. I mean, I think we’ve long since crossed the threshold of utility where speech recognition can be used for things like data mining, because you can mine against spoken material, and people do that today in a number of areas. It can be used certainly in accessibility situations where individuals don’t have other means of communicating with the computer and so speech can be a good way of doing that and many people successfully use it.

The last statistics I saw on this were something like 3 to 5 percent of all Office users actually did use speech on a regular basis. So it is something that gets used by a small group of people.

I agree with you we’ve not crossed the threshold into everybody and it’s not clear we’ll ever cross that threshold. At least in English it’s much faster to type than it is to talk for most people. Dictation is not as good an input mechanism in English as typing. And so for English and some of the other languages that use the Roman alphabet, it’s going to be a better thing to use typing or some form of input than to use speech.

For the Asian languages we’re probably already at the point where dictation through speech and typing are at a par, if not better for speech. One of our speech researchers now runs the speech product group at Microsoft, and the last time he was at Tsinghua the students were challenging him, so he had a little competition and asked them to send their best typist up. He dictated and the best typist typed, and he was able to get his document done faster and corrected faster than the person typing.

Now, in Asian languages they’re already statistical in the way you do input even when you type. It’s not a one-to-one relationship between what you type and what you get, because you’re effectively typing in phonetic spellings of things. So admittedly there it’s not as good a task.

So I think we’re crossing various thresholds. I don’t know when we’re going to get to the point where the computer understands the average person, especially the average person talking over a cell phone. I’m not sure I’m at the point where I understand all my cell phone conversations. I’ll just sit there saying, “OK, yeah, yeah, yeah, sure,” and you close the phone, “I have no idea what that person just said.” (Laughter.) But that’s just life.

I was at this innovation conference that ran last year. There was this session on computers and communication and I think I pointed out there, and I think the entire panel I was with agreed with me, that the killer feature for a wireless phone is the ability to place a phone call whenever you want, to have the phone call last as long as you want, and to have the person at the other end understand what you’re saying through the entire conversation. OK, and if we can get there we’re done and the other stuff, the pictures and stuff are fine.

OK, I think that’s the last one. Thank you very much. Now, before you all stand up and make noises, I’m supposed to now welcome you to go see the demonstrations and things we have nearby and to also point out that we have refreshments outside. So that you don’t all mob the demonstrators some of you should stop and eat. (Laughter.) And with that, I’m all done. Thank you.

(Applause.)
