Remarks by Rick Rashid
Microsoft Valley Speaker Series
“A Look at the Future of Computing”
Dec. 3, 2001
RICK RASHID: Okay, the normal question: Is the mike on? Yep, you can hear me? Okay, good. If I start slowing down a little bit though, I had a cold last week and I started getting laryngitis. So you never know what will happen. And I don’t think I’m supposed to have any fluids at any time. I think they’re afraid of that.
My goal today is to try to give you a sense of what we’re doing in our research group. I know a lot of you are here in the Silicon Valley Microsoft center, so this is a chance to get insight on what’s happening up in Redmond in our basic research group, and for those of you that are visitors give you a sense of what we’re doing at Microsoft in the basic research area.
Now, I came to Microsoft about 10 years ago. In fact, it was September 1st of 1991, which is also the official day that we believe that Microsoft Research got started. I have the advantage that I can claim that it started the day I started, since I was the one who was putting it together. And we’ve been building since then. So it’s starting from me, your typical employee number one. We’ve now grown so that we have over 650 researchers. Actually, it’s more like 680 the last time I looked, in a number of different locations all around the world. We cover more than 40 different research areas. In a lot of ways we’re really much more like a computer science department probably than what most people tend to think of as an industrial research organization, because that’s sort of the philosophy that we started with and that’s the approach that we’ve taken.
We started our group in Redmond and we grew that for about five years before we started to think about the possibility of expanding. And over the last five years we’ve started research labs in Cambridge, England, and they’ll actually be celebrating their fifth anniversary coming up in this next summer; San Francisco, the group under Jim Gray. Roger Needham heads up our research lab in Cambridge, in close collaboration with the University of Cambridge. We started a group in Beijing just about three years ago now, two and a half years ago, and that group has been just growing enormously, and I was just there for a number of meetings and a presentation in our group in Beijing and I’ll talk more about them. And then our most recent group actually is here in the Silicon Valley center. It started about four or five months ago under Roy Levin and we’re starting to grow that group, but it’s still relatively small.
One way to think about what we’re doing, if you know a little bit about me, if you’ve ever heard anything about me, you know I’m a big Star Trek fan, so you’ll occasionally see me put things in Star Trek terms. We’re sort of the Star Trek part of Microsoft, you know, which is great. In fact, if you come and see my office, I’ve got the Star Trek logo on my door and a bunch of Star Trek paraphernalia. And a lot of the work that we’re doing isn’t exactly what you’ll see in the Star Trek movies or TV shows. They have a little bit more latitude in terms of technology. But in terms of things that we’re sort of pointing toward, you can see a little bit of that going.
The two little videos I have on the slide there, we’re showing off some of the work we’ve been doing in MEMS research — this is micro-electromechanical systems. You wouldn’t really think this is something that Microsoft would be doing, but we are. It’s a relatively small group. We do work with a number of hardware companies and hardware manufacturers, and one of the things that we like to do is experiment with new approaches and new ideas as a way of sort of moving the market forward or helping to see the market move forward. We work with display manufacturers. We’ve worked with chip manufacturers of every type and of course the PC manufacturers. And so we’re doing a lot of work in that area, whether it’s in storage, or MEMS for display. These are little machines that are up there to give you a sense of what’s going on.
We’re also doing work in things like virtual collaboration. I’ll actually show you some of the little demos of some of that work, but it’s sort of like a hologram, though not exactly. There are virtual people that interact with you, though not quite the way the Star Trek holodeck does.
We are working on automatic language translation, and, you know, I think this is one of the areas that I’m particularly excited about, because I think the opportunity is there to really make a big difference in people’s lives. Again, I mean, I manage a group of fairly diverse people that speak many languages. A certain amount of language translation would be very helpful when I visit the group in China. But this is an area where we think there’s a lot of opportunity and there’s a lot of interesting technology we can bring to bear on it.
Amusingly enough, we actually have had researchers that have published papers on things like quantum state teleportation. Again, this is a little far from what you might think of Microsoft doing, but we have a very strong group that’s working in the physical physics and the mathematics and really the boundaries of those fields and traditional computer science and complexity theory.
And my typical joke about the quantum state teleportation work is that if Microsoft 20 years, 50 years, 200 years from now, whenever it actually does, is the company that invents the teleporter, that given the sort of general rap we have in the press, the articles that would come out the next day would be, “Microsoft Copies Again.” (Laughter.) You know, it’s just the way things go, right?
We’re doing a lot of work in computer vision technology, so I’ll talk a little bit more about that. I think it’s critical for some of the new user interface work that we’re doing, but it’s also important in things like managing photographs and doing things with images of various types.
And then a huge effort has been going on for a long time in artificial intelligence, and I’ll give you a feeling for all those things as we go along.
This actually has nothing to do with the talk. (Laughter.) I really like this picture though. I actually did an event with Scotty. It was a Microsoft event; that’s why we’re both wearing Microsoft shirts. They were red shirts, but we seemed to have survived the event. And it was a developer event we did in Seattle. He actually lives in the Redmond area, not that far from Microsoft, so it was an easy trip for him. And it was really fun to do the event with him. I think he’s like 80 or 81 now. His wife just had a baby, so he’s a pretty impressive guy. (Laughter.) Again it has nothing to do with the talk.
Now, in terms of some of the people that we have, I just want to give you a sense of stature of the research organization in terms of some of the kinds of people that we have. This is the short list. The long list would be really, really long. But we have three Turing Award winners: Jim Gray, Butler Lampson and Tony Hoare that work in our research organization.
Gary Starkweather is the inventor of the laser printer and he’s been playing a significant role, especially in looking at printing technologies and color and visual systems.
Jim Blinn and Jim Kajiya both have Academy Awards, literal Academy Awards, you know, from the Hollywood Academy for the work they’ve done in computer graphics and computer graphic algorithms.
Michael Freedman is a Fields Medalist in mathematics. Again, the Fields Medal and the Turing Award are basically the Nobel Prizes of mathematics and computer science respectively, just to give you a sense of that.
Laci Lovasz just recently won the Gödel Prize for a top paper in his area of theoretical computer science.
Gordon Bell, of course, many of you know is from this area, the father of the VAX, the minicomputer, and does a lot of work for us in areas of things like tele-presence.
And Dan’l made a reference to the fact that my own work is primarily in the operating system, so I know there are Macintosh people here, so if you’re using Mac OS X and you like it, my code is in there. There are still a lot of my lines of code in that system. Tru64 Unix is also based on the operating system I did when I was at Carnegie Mellon. And if you ever heard the word “NUMA” I made that one up on a bright shiny day in Pittsburgh back in 1983, because I couldn’t otherwise figure out what the name of that particular kind of multi-processor architecture would be.
In terms of the research organization, and I think this is a key point I want to make for especially the people here in the Silicon Valley center, you know, we really have taken a philosophy that’s much more like a computer science department philosophy, much more of a research university kind of philosophy in the things that we’re doing. It’s a very flat structure group. We’re built pretty much the way Carnegie Mellon, which is where I come from, was organized, meaning we tended to build critical mass groups and sort of not just one or two people in each area, but trying to build critical mass groups and make a difference in the areas that we do research.
It’s a very open group. We are very aggressive. We publish the work that we do. And the focus again is to make sure that what we’re doing really is at the state of the art. And one of the ways to measure yourself is by going through peer review literature. And so we peer review our work. We’re out there and we’re very open with the things we do. If you go to our Web site, www.research.microsoft.com, you’ll find not only the research that we’re doing, but you’ll also find the names of all the people that are working on this research.
I’ve had a couple people ask me in the past, you know, is that a problem. I mean, aren’t you worried about headhunters? I said, well so far we don’t lose anybody. This is a great advertisement for people to come to work for us, so it’s worked out well for us.
And we have a lot of visitors. A lot of people come into the research group, over 200 visitors a year. I think now it’s probably more like 250, and a huge number of Ph.D. interns from around the country. In a typical summer we’ll probably have on the order of a hundred to 110 Ph.D. researchers, research students working as interns in Redmond and in some of our other labs. We have a large number of Chinese students working in our labs in Beijing and likewise European students in Cambridge.
So we have a lot of people coming through. We’re always giving talks and presentations. If you go out on the corporate network, on any given day there will probably be two or three different presentations being given by either outside researchers or outside speakers. So it gives you a sense of people are constantly coming through and we’re exchanging ideas.
Another part of what we do, another part of my mission is also to encourage university research, and we work with a lot of universities. About 15 percent of Microsoft’s total basic research budget is devoted to supporting academic research and education, and so that’s an important part of what I do. And I know a number of you have contact with local universities and other organizations. You know, if you ever have a question about that, you can come through me or Doug Leland, who is our director of University Relations, to talk about that. We view that as being an important part of the community that we’re in and the things that we’re trying to support.
Now, when I talk about the mission of our group I always like to phrase it sort of in this way. The first thing I view that we’re trying to accomplish is move the state of the art forward in the areas that we’re doing research. That is the number one priority. Because if we’re not doing that, we’re not really going to be of that much value to Microsoft. I mean, Microsoft can always go out and buy technology from other people. The value of having a basic research organization inside the company is to be able to create things that really can’t be done elsewhere that really are pushing the state of the art and doing it in a Microsoft context, where you can rapidly take that technology and move it into products.
That’s the second point, which is when we do get ideas that make sense and we do prove out what we’re doing, then there’s value in really getting that technology into the products very quickly, and we’ve been working with the product groups to do that. So that’s really our second mission. We have a group of dedicated program managers whose sole job it is to sort of lubricate that technology transfer. I know some of them do work with some of the groups here in the Silicon Valley center, so that’s something that is a part of our mission as well.
Ultimately all this is about making sure that Microsoft is still here 10 years from now. You know, it was funny, when I first started the research group one of the questions that I would most get from people I was trying to hire was, you know, research is a long-term venture; is Microsoft really going to be around in five years? And these days that seems kind of odd. We’re a $25-odd billion a year revenue company with $30-odd billion in the bank and people don’t usually ask me that question much anymore. But it’s just as relevant a question to ask now as it was back then.
Technology is constantly changing. We can see that in the changes that are going on even recently in the economy. So you really have to be constantly renewing your technology base and constantly renewing the approach that you’re taking towards technology if you’re going to continue to be successful. That’s been an important part of what we’ve been trying to do as a company over the years and we view that as a critical component of our research team.
Now, just to give you a sense of what we’re doing in these areas, when I talk about expanding the state of the art, we’re publishing a lot of papers. This last year at PLDI over 30 percent of the papers came from our research group. That’s a stunning number. I don’t think I’ve ever heard of a single organization producing that many papers for a major conference.
I’ll be giving the keynote at the major computer vision conference next week in Hawaii, and I think we have like 20 papers in that conference. It’s just enormous. We published more papers on computer graphics at SIGGRAPH over the last six years than I think any two organizations have done. I mean, just our China research group now is producing about 200 papers a year and really having a big impact on the international community. So there’s a tremendous amount of effort that’s gone on here.
In terms of partnering with academia, I’ve already mentioned most of these points I think, but I think the key thing is that I think there’s a lot of opportunity for us to work with our colleagues in the universities to really help upgrade education in a significant way. So a lot of our efforts working with universities is really focused around how to make technology work better for universities and for the art of teaching.
In terms of technology transfer, there we’ve really been trying to drive the company’s products. And today if you look at the products that we’re shipping it’s hard to find anything that we do as a company that hasn’t been influenced by the research group. The packaged products that we sell are typically now going through optimization processes and testing processes that have been designed and built by the research group. We’re using technology to do scanning for bugs even before we begin the process of testing, and that’s been developed within the research group.
A lot of the key pieces of technology used by our products are also coming from the research organization. I think we have a slide coming up here that talks more about that. So, for example, the fact that Microsoft was able to sim-ship Office 95 with Windows 95, I mean, that was a long time ago, it was five years ago. People may not think of this as being very complicated, but at the time it was viewed as critically important that we could ship Office 95 and Windows 95 on the same machine and be able to run it in 8 megabytes of RAM. The only way we were able to do that was with optimization technology developed in the research group.
And the funny part about that is that we actually started working on that optimization work for 32-bit code about three years beforehand when the product groups really had no interest in it whatsoever. They didn’t care about 32-bit code at the time. And everybody thought, well, and by the way, memory would be so cheap that people would have lots of memory. Well, by 1995 for a variety of reasons memory wasn’t particularly cheap and 8 megabytes was a lot, and so we were able to bring that technology to bear against our products, sim-ship Office 95 with Windows 95 and really beat all of our Office competitors to the marketplace with 32-bit office products by almost a year, like nine months to a year. So it was a critical difference for us, made a huge financial difference in the company.
I mentioned the automated bug detection work that really helped to make Windows 2000 one of our most stable releases ever, and that was a critical part of that.
And a lot of the key technology that’s inside of our products, things like ClearType, if you’ve used that on laptops, MS Audio 4.0, which is a key audio technology in our Windows Media; the collaborative filtering, if you’ve used any of our Commerce Server products you’ve probably seen the collaborative filtering support that’s in there; intelligent search and so forth; these are all technologies that came out of the research group and we’ve moved into the product groups in an aggressive way.
And a lot of the groups that exist today in the company, like, for example, the Windows Media Group, really started as a research team that I put together back in 1994. We spun that group out in 1996. It has grown since then.
And it’s still the case that a lot of the core technology comes from research. All of the data mining facilities in SQL came out of the research organization. I started the first e-commerce group in the company. And so again these are all the technologies that are coming from the research groups.
So that gives you a sense of some of the sort of little bits and pieces of research. What I’m going to do now is give you a sense of the actual technology we’re working on. And I’m putting this in kind of a framework here, and one of the ways I like to think about what we’re doing is we’re really trying to break down a lot of barriers that have traditionally existed in our field, whether it’s the reality barrier, you know, really being able to create things that begin to resemble reality in one form or the other, whether it’s the barriers between people, really allowing people to communicate with each other in a reasonable way, barriers between people and computers, really breaking down that barrier that makes it difficult for people to get their jobs done, barriers between people and information, and that’s really a sense of how do I process and manage all the information that I can now get access to, barriers between computers themselves, how do I break down the barriers when we build large scale distributed systems, and frankly barriers within yourselves.
You know, everybody tends to break themselves into little pieces. There’s what I’m doing in my office. There’s what I’m doing in my car. There’s what I’m doing when I’m giving a talk in Silicon Valley. There’s what I’m doing when I’m at home. And there needs to be a way of bringing those things together, and I’ll talk about technologies that address all these issues, and there are a lot of other things we’re doing too, but there’s only so much I can get done in one talk.
Now, let me put a framework on some of this work. I mean, part of what’s happening over the last few years, and I think you guys here have probably seen this as much as anyone now, is that we’re seeing a tremendous amount of change in computing over a relatively short period of time. And it isn’t just about increasing processor speed, right, so Moore’s law says you’re going to get a factor of two every 18 months, which means a factor of 10 every five years, a factor of 100 every 10 years, and we’ve all gotten kind of used to that. We’re kind of adjusted to that, built that into our business plan, think, okay, well, computers are going to be that much faster; we’ll try to make that happen.
And even in some senses we almost become blasé about exponential change, even though fundamentally exponential change means everything up to now doesn’t count. Okay, that’s the mathematical description of exponential change.
But what’s really happening is that things are happening faster than exponential change, at least faster than Moore’s law. If you look at graphics, for example, in the last three years we’ve seen a factor of a hundred change in real time speed of graphics that you could deliver on a PC or a similar kind of device, and that’s just a dramatic change in a three-year period.
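The arithmetic behind these comparisons can be checked with a short sketch. It assumes an idealized, clean doubling every 18 months for Moore's law and takes the "factor of a hundred in three years" graphics figure at face value; the function names are illustrative only and do not come from the talk.

```python
import math

def growth_factor(months, doubling_months=18.0):
    """Growth multiple after `months`, assuming one doubling every `doubling_months`."""
    return 2.0 ** (months / doubling_months)

def implied_doubling_months(factor, months):
    """Doubling period (in months) implied by growing `factor`-fold over `months`."""
    return months / math.log2(factor)

# Idealized Moore's law: one doubling every 18 months.
moore_3y = growth_factor(36)    # three years
moore_5y = growth_factor(60)    # five years, roughly the quoted factor of 10
moore_10y = growth_factor(120)  # ten years, roughly the quoted factor of 100

# Graphics claim from the talk: a factor of 100 in three years.
graphics_doubling = implied_doubling_months(100, 36)

print(f"Moore's law over 3 years:  {moore_3y:.1f}x")
print(f"Moore's law over 5 years:  {moore_5y:.1f}x")
print(f"Moore's law over 10 years: {moore_10y:.1f}x")
print(f"Graphics doubling period:  {graphics_doubling:.1f} months")
```

Under these assumptions, an 18-month doubling gives only a factor of four over three years, so a hundredfold gain in the same period corresponds to a doubling roughly every five to six months.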
You’re going from — I mean, I just remember the games we used to play back three years ago. When you think of the technology that’s now being deployed, it’s just tremendous.
In the storage area again we’re seeing these enormous changes, a factor of 2 a year really in storage. And we’re really on the very edge now of being able to make terabyte disk drives available to just about anybody, even in a laptop and I’ll talk more about what that means.
And if you think those things sound exciting, you know, the stuff that’s going on in the laboratories right now in terms of the underlying technologies that underpin our field is just stunning. The work that’s going on in nano-scale materials, for example, manufacturing; I don’t know if you’ve seen that there was an article just recently about the single molecule transistor. One of our researchers is working on single molecule storage for solid-state storage.
I mean, there’s just a tremendous amount of work happening actually in a very short period of time where people have discovered a whole range of techniques of basically arranging molecules and managing material that no one really thought of before, and where three or four or five years ago people kind of snickered when they heard about nano-technology. They probably still snicker about the little intelligent machines that are going to go in your body and repair all your cells.
The reality is that the actual progress in nano-technology has been stunning in the last few years. Some of the work that’s going on where things like the organic, like the optical switching networks that people are starting to build that can basically switch light at over a hundred gigahertz in speed, and really these technologies are very easy to manufacture, relatively speaking. So it’s really exciting what’s actually happening there, and whenever somebody says things like condensed matter theory, it always makes me feel a little bit like Star Trek.
But it’s literally the case now that hundred gigahertz networks are pretty much in the can. I mean, people know how to build those. Terahertz wide area networks, wavelength-division multiplexed networks that are terahertz, we’re going to be there. I mean, you can already demonstrate that in the laboratory. Electro-optical backplane interconnect, those are technologies that may have seemed pretty far out a year or two years ago. People are really seriously building those things now.
What does that mean? Well, among other things it means that we can break down barriers. I mentioned the reality barrier. If you saw the original Toy Story movie, the individual frames had something on the order of 2 to 12 million triangles represented in that. The way you represent 3D graphics is through tessellation of a surface, and you tend to think of it as triangles per second of the graphic power. It’s not the only measure of quality, but it’s one that people often cite.
So if you think about that sort of 2 to 12 million triangles per second range, well the Xbox and the current generation PCs with the new graphics cards, they can do that. We’re at that stage already where we can build graphics that are as complex as what you saw in the first Toy Story in real time. And if you’re not sure about that, just go to a toy store and check out the Xbox game console display that’s there. There’s just been an enormous change in what we can do in this short period of time.
If you think about what it’s going to take to get us to sort of the reality barrier, well Alvy Ray Smith, who worked with us for a number of years, and is one of the founders of Pixar, did a little calculation for me once where he was judging that reality he thought was about 80 million triangles per frame. That’s a very back of the envelope calculation, a lot of other issues in it.
But the interesting thing to say is that if you believe that, then we’re not that far away from it. We’re within five years of probably getting to 80 million triangles per frame for a 24-frame per second kind of movie. That’s where we’re going. And so the changes that are happening are dramatic.
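A back-of-the-envelope sketch of that estimate follows. It rests on loud assumptions: that current hardware sustains the 2 to 12 million triangles per second quoted earlier, and that the "factor of a hundred every three years" growth in graphics speed simply continues. The helper name is illustrative and not from the talk.

```python
import math

TRIANGLES_PER_FRAME = 80_000_000  # the rough "reality" figure quoted in the talk
FRAMES_PER_SECOND = 24            # film frame rate

def years_to_reach(target, current, factor=100.0, period_years=3.0):
    """Years needed to grow from `current` to `target`,
    assuming `factor`-fold growth every `period_years` (an assumption)."""
    return period_years * math.log(target / current) / math.log(factor)

# Sustained triangle throughput needed for "reality" at film frame rates.
required_rate = TRIANGLES_PER_FRAME * FRAMES_PER_SECOND

# Assumed current throughput range: 2 to 12 million triangles per second.
years_from_high = years_to_reach(required_rate, 12_000_000)
years_from_low = years_to_reach(required_rate, 2_000_000)

print(f"Required rate: {required_rate:,} triangles per second")
print(f"Years from 12 million/s: {years_from_high:.1f}")
print(f"Years from 2 million/s:  {years_from_low:.1f}")
```

Under these assumptions the required rate is about 1.9 billion triangles per second, and both ends of the assumed starting range reach it in under five years, which is consistent with the estimate in the talk.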
Now, you’ve already seen the Xbox, and I’m not going to bother with an Xbox demo, because you guys have already seen that. I did an Xbox demo for the PDC. But I’ll show you some other examples of what you can use some of this graphic power for.
Now, this is sort of a virtual me, except this is kind of a flat version. And that’s fine, but we can do a lot more now. I’ll just bring up a different kind of me, if I can find myself here. Okay, so this is a nice little 3D version of me.
Now, the way this is done is actually very easy. Some of our researchers working in computer graphics, computer vision came up with a way of basically taking a standard video camera, just what you’d use at home, they do one swipe of your face like this, get about 30 seconds worth of video. They’re able then to map that into a 3D image of you and texture map it and the whole process takes about three minutes. It’s very, very simple. I know last year at one of the events we were sponsoring that we actually set up a demo booth and had people coming through. I think we had a demo booth at ACM, the ACM conference here earlier this year where we just let people create their own faces.
Now, you can see there are a lot of problems with the technology. I mean, obviously this is not as handsome as I actually am. (Laughter.) Now, I’ve asked the team to work on that — (laughter) — and they’re going to put a lot of effort into it.
On the other hand, even with that defect, there are still a lot of things that you can do with it. One of the things that having a virtual version of yourself lets you do is you can do things like make yourself smile. That’s kind of a sickly smile, but it’s better than nothing.
Of course, you can also be very sad — (laughter) — and people really don’t want to see me sad, especially people who work for me. (Laughter.) So they want me to be happy and you can reset me and I’m happier.
Of course, Dan’l probably knows this really well. This is a key feature for Microsoft or really any executive, right, the ability to pretend to be thinking. (Laughter.) That’s kind of a critical feature that everybody needs to have and especially in those meetings where you’re really thinking about what you’re going to do over the weekend.
And, of course, I mentioned I was a Star Trek fan. If you really want to make yourself look like Mr. Spock, you can sort of be a little bit more Spock like. Or if your science fiction interest moved more toward ET you can be ET.
And, of course, one of the problems they have with this scanning thing is they don’t really let you wear your glasses, but you can add the glasses in. You can stick those in later.
Now, that’s one version of me. Let me show you a different approach to a virtual me.
Now here again this is just a flat image of me and our research group in China has come up with a way of taking a picture and they’ve worked with actually I don’t know the person but a supposedly well-known cartoonist from China to come up with a way of sort of automatically generating cartoon versions of you. So there is a cartoon me, and you can sort of see by comparison you get sort of just the hairline and the eyes and the features. And once you have one of these cartoons it’s a different version of reality, of course, but you can also do a similar set of things. There’s the happy me. There’s me just having eaten something. There’s a very sad me, kind of crying. There’s a goofy me. There’s a very mean version. I’m never really mean. I don’t know how I could be that guy, and I don’t know what that’s supposed to be, honestly, and so forth.
And then you can do really amusing things like put somebody else’s voice in your mouth, but here’s —
BILL GATES VOICE: I’m very excited that a real pioneer in machine intelligence and multi-media research —
RICK RASHID: Okay, so you get the point. (Laughter.)
Actually, it’s funny. You know, I showed that three-dimensional version of me to a group of European parliamentarians. They were visiting Microsoft — that was almost two years ago. And somebody raised their hand and said, “Does this mean you can make a politician say anything that you want?” Of course, I didn’t really want to touch that question. I wasn’t going to say anything about it, but somebody else in the back jumped up and they said, “But you can already do that.” (Laughter.) Right, so obviously you don’t necessarily need technology to accomplish these goals.
But again this gives you a sense of where we are in being able to represent different kinds of reality.
Now, that’s kind of visual reality, and again next week at the Vision Conference I’m going to be showing a lot of additional technologies our vision group has created in this area. But we’ve also been working in the auditory or acoustic space. And what I’m going to let you listen to is some work that was done by our speech group, where what they’ve done is they effectively have come up with a way of extracting sort of the salient qualities of somebody’s voice and then being able to animate that based on a set of text and really effectively extracting human prosody to use to create a very lifelike experience.
So what you’re going to listen to are sort of the synthetic and real voices that we’ve captured. And basically the way this is done is to get this thing you have to have a throat mike on for four hours and talking constantly, which again for me wouldn’t be a really big problem but for other people it’s actually somewhat difficult to do. And then we sort of extract the auditory qualities of someone’s voice and then you can then sort of animate it in interesting ways. So I’ll just let you listen to the results.
VOICE: Various automatic trained voice tables. First, synthetic voice. Second, original recording.
COMPUTER VOICE: The meeting had scarcely begun; then it was interrupted.
RICK RASHID: That’s synthetic.
VOICE: The meeting had scarcely begun; then it was interrupted.
RICK RASHID: That’s the real person.
COMPUTER VOICE: He spoke to various of the members.
RICK RASHID: That’s synthetic.
VOICE: He spoke to various of the members.
RICK RASHID: That’s the real person.
COMPUTER VOICE: However, the aircraft, which we have today, are tied to large soft airfields.
RICK RASHID: That’s synthetic.
VOICE: However, the aircraft, which we have today, are tied to large soft airfields.
RICK RASHID: That’s a real person, but he sounds just like the computer. (Laughter.)
Now, I wondered about that, because he’s like a voice actor that we hired, and I’m trying to figure out why the speech team hired somebody who sounded like a computer. (Laughter.) And it occurred to me that well maybe he doesn’t really sound like a computer; maybe this is just his like voice acting voice and so they didn’t really know it until they actually got him in the booth, right. But then I met this guy and he actually just exactly sounds like that. (Laughter.) He sounds just like a computer.
But again part of what you’re hearing there is that we’ve sort of extracted the qualities of someone’s voice, right, but there we’re using human prosody to play it, in a sense. And prosody matters both with music and with voice: it is the lilt of your voice up and down. You can actually use real music to do the same thing and use this technology to effectively create artificial singing. Now, the way you really approach that is you make MIDI music and you basically score it. You put your MIDI score in. You put the words down that go with it. And then you get singing.
Now, by contrast, and I want to be sure to credit the people, the earliest form of artificial singing that I know of was done at AT&T back in 1962, and it was really an incredible feat that they were able to do this with the computing power of the day, but it still sounds really, really bad. I’ll let you hear a little bit of it. It was the inspiration for the little singing piece in 2001.
RICK RASHID: This is using the speech technology.
RICK RASHID: Now, there are several things you learn from that. I mean, first off this is not a technology that in any way threatens Simon and Garfunkel. (Laughter.) You know, we’re not there. But you could easily imagine, again looking a little bit forward to the future, the notion that you may want to capture the voices of people who are famous singers, or whose voices have a particular meaning to you personally or to society, for the purposes of being able to recreate singing or to do new things with their voices after they’ve passed away or after perhaps their voice is no longer good or they’ve lost their voice in some fashion. Just like the sort of 3D image stuff that I showed you earlier, where you could imagine using something like that to capture an image of your baby, your infant, that you could carry with you to later in life.
So this notion of being able to recreate reality in some way is valuable for a variety of reasons, and those are some of them.
Now, I’ve mentioned breaking down other kinds of barriers. We talked about breaking reality barriers. And also I didn’t really talk about the physics, and there are a lot of other areas where we’re able now to do a lot of simulation that we weren’t able to do in the past.
But I’ll just mention breaking down barriers between people and computers. And here things like being able to do handwriting really well, speech really well, computer vision, natural language processing, these are some of the areas that when you think about how do we create user interfaces that actually are oriented towards the user as opposed to being oriented toward the computer that you may want to address.
Now, I know a number of you are very, very familiar with some of the natural language technology we first started shipping in our Office products back in ’97, sort of the little green squiggly underlines, the automatic numbering. Those are all technologies that derived out of our natural language research work, where we’ve really been able to put a broad spectrum parser for English and now for several other languages into our main product.
Now, underneath the covers in Office we’re really actually going in and analyzing the sentence. Now, we only expose what we call (textbook peeking ?) which is the ability to sort of get alternative versions of your text or to tell you that your text isn’t really quite as grammatically correct as it might have been, or at least sounds kind of strange, but we’re actually doing more than that. We’re actually doing analysis at the level of what we call logical form. We’re actually able to do things like find out that in this particular case the deep subject is “Motorola,” that the deep object is “chip” and that the adjective is “diagnostic,” and we’re even able to figure out which meanings of words you’re probably using in the sentence.
And we’re beginning to use that now in a number of our new products and in new ways. One of the things that we’ve been able to do is use our natural language technology to effectively automate the process of extracting information from English text. One of the ways that we do that is effectively take something like a dictionary, we’ll use the natural language parsing to effectively extract the relationships between the words there and we build a semantic network that is actually stored in a relational form. So we build this semantic network of relationships between those words.
And you can get a lot of information out of that. You can find out things like relationships between facts that you might find in a dictionary. Here is a bit of a MindNet that’s been built around the word “bird,” and this comes from a dictionary that was scanned. And you see that you automatically pick up things like a duck is a bird, that a quack is caused by a duck, that a quack is a sound, that a chicken is a bird and so forth. And you just automatically get those relationships; they’re in the dictionary. They’re there and you can use techniques like this to automate the process of getting that information out.
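The idea of holding such a network "in a relational form" can be sketched in a few lines. This is only a minimal illustration, not the system described in the talk: the table name, the column names and the relation labels (`is_a`, `caused_by`) are assumptions made up for the example, and only the four facts come from the slide.

```python
import sqlite3

# Hypothetical schema: one row per (head word, relation, tail word) edge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relations (head TEXT, relation TEXT, tail TEXT)")

# Facts from the "bird" example; the relation labels are illustrative only.
facts = [
    ("duck", "is_a", "bird"),
    ("chicken", "is_a", "bird"),
    ("quack", "is_a", "sound"),
    ("quack", "caused_by", "duck"),
]
conn.executemany("INSERT INTO relations VALUES (?, ?, ?)", facts)


def related(word):
    """Return every edge that touches `word`, in either direction."""
    rows = conn.execute(
        "SELECT head, relation, tail FROM relations "
        "WHERE head = ? OR tail = ? ORDER BY head, relation, tail",
        (word, word),
    )
    return rows.fetchall()


def kinds_of(category):
    """Return the words the network says are a kind of `category`."""
    rows = conn.execute(
        "SELECT head FROM relations WHERE relation = 'is_a' AND tail = ? "
        "ORDER BY head",
        (category,),
    )
    return [head for (head,) in rows]


print(kinds_of("bird"))
print(related("duck"))
```

Keeping each relationship as a plain row means ordinary queries can walk the network, for example from "duck" out to both "bird" and "quack".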
Now, when you think about language and the role language plays, there are lots of things you can do with this. You know, we’ve been using this language technology for things like automatic retrieval of information. In Windows XP, for example, we’re actually using some of the natural language technology in what’s called the Search Companion, which is sort of a question answering part to try to figure out what task you’re working on and direct you to the right person. We’re using it for data mining. But you can also use it — in some senses, sort of the Holy Grail has always been to use it for things like question and answer, you know, being able to ask an intelligent question of someone and then get back something like an intelligent result. And we’ve been doing some work in this area. I just wanted to show you a quick demo of that.
So what I’m going to show you is this is something that one of our researchers, Eric Brill, has been working on. And it’s pretty preliminary work, but what he’s been trying to do is he’s trying to address this issue, which is there’s a huge amount of information on the Internet or in other kinds of sources of information, but this particular one uses the Internet, and he’d like to be able to sort of extract answers to questions by going out to the Internet and getting the data, right, so sort of think of it as Ask Jeeves, but where it’s not people that put in the answers but in fact the net sort of puts the answer in for you.
And the question is how can you sort of figure out, given an English question, what would be the appropriate answer, and so he’s been coming up with ways of effectively using natural language processing to take apart the request, effectively try to recreate what an answer might be, and then go out and look for that and then use statistical techniques and some redundancy information to produce actual answers.
And you can get some fairly straightforward things, like so if you wanted to do something like, “Who is the head of –” and hopefully I’m online because this is actually I’m using a server back in Redmond to do this. So what it’s going to do is it’s going out to get a lot of information and come back and it actually found me. And it’s got some other answers there, too. What it does is it brings up things that it thinks are potential answers based on the information it’s extracted, so it probably found some articles that talked about Nathan Myhrvold back when he was the Chief Technology Officer and relay that back to Microsoft Research. Roger Needham heads up Microsoft Research, Ltd., which is our Cambridge subsidiary, and he is the managing director there and it is the computer laboratory. So you can see where it’s getting a lot of this information. But it is trying to get the sort of essence.
You can ask it sort of factual questions, so, you know, “When did ...” and yell at me if I type something wrong, because that’s how I do it always. So, “When did Abraham Lincoln die?” And again it doesn’t know anything about Abraham Lincoln, right? There’s nothing programmed in this system that knows anything about anything, and it comes back April 15, 1865, you know, which is a pretty good answer. (Laughter.)
If you ask, however, the question of, “When was he shot,” you’ll actually get back a different answer. In fact, you’ll get that April 14, right, so you’ll switch the two.
And, of course, the reason why you have both of them up there is because among other things you’ll find Web sites that have it wrong. (Laughter.)
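The redundancy idea, where the question is rewritten into the shape of an answer and the candidate that most sources agree on wins, can be sketched as follows. This is a toy under stated assumptions and not Eric Brill's system: the snippets are invented stand-ins for retrieved Web text (one is deliberately wrong), the function names are hypothetical, and the rewrite rule handles only a single question form.

```python
import re
from collections import Counter

# Invented stand-ins for text a search engine might return; one snippet
# deliberately carries the wrong date.
SNIPPETS = [
    "Abraham Lincoln died on April 15, 1865 in Washington.",
    "President Abraham Lincoln died on April 15, 1865.",
    "Abraham Lincoln died on April 14, 1865, according to this page.",
]

DATE = r"([A-Z][a-z]+ \d{1,2}, \d{4})"


def rewrite(question):
    """Turn 'When did X die?' into a regex for the expected answer text."""
    match = re.fullmatch(r"When did (.+) die\?", question.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("this sketch only handles 'When did X die?'")
    subject = re.escape(match.group(1))
    return re.compile(subject + r" died on " + DATE)


def answer(question, snippets):
    """Rank candidate answers by how many snippets agree on them."""
    pattern = rewrite(question)
    votes = Counter()
    for text in snippets:
        for candidate in pattern.findall(text):
            votes[candidate] += 1
    return votes.most_common()


print(answer("When did Abraham Lincoln die?", SNIPPETS))
```

Both dates stay in the ranked list, which mirrors why more than one answer shows up on screen: the wrong page still gets a vote, just fewer of them.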
Now, again it’s the very early stage of the technology. I’d say it’s about, I mean, based on our sort of tests that we do that are more scientific, we get about 50 percent of the answers right, and we often get the right answer in one of the top ones. But you could use this both for answering sort of factual things — it’s actually really good at the Who Wants to be a Millionaire at the $100,000 level sort of up — (laughter). It’s actually relatively bad at the common sense questions that they ask for the $100 or $500 or whatever it is, but at the lower levels, because there isn’t really a good answer on the Internet for those things.
But you can ask it some really funny things and get some interesting answers. So if you want to say something like, “Why is the sky …” — I just typed this in one day and I liked the answer so I thought, you know, “Why is the sky blue.” And you wouldn’t think that would necessarily have an answer, but white is as good a reason as any. Certainly it’s probably the best reason you could come up with that’s one word.
But you could also ask it some really weird things like — oh, shopping, that could make some people blue. (Laughter.) Okay, so “Who was that masked man?” And I always like to ask questions like this whenever I try to use a system like this just for fun. And I was actually surprised to discover that it’s pretty much close to right, and let’s see. The Lone Ranger. It’s interesting that Clayton Moore, who’s the guy who played the Lone Ranger comes up there too, so that’s actually pretty good.
Anyway, I don’t know that the technology is good enough to be used for real purposes, but for fun it’s a heck of a lot of fun.
Okay, other areas: I mentioned breaking down barriers between people. There is some exciting work that’s been going on in that, and I’ll show you a couple of different things. One is something called Sideshow, and it’s probably easier to show Sideshow — no, I guess I don’t have it ready, so let me just start it up — than it is to spend a lot of slides on.
The idea behind Sideshow is the notion that when you’re working with people in your workgroup, it’s nice to be able to have sort of constant status information on where people are, what they’re doing, the things that you’re working on, things that are important to you personally and so forth. And it’s going to take a minute to load up everything here.
But here, for example, now this isn’t my normal laptop; it’s just some demo machine, so it doesn’t actually have my mail. But it would normally have my mail. I’d be able to see it to the side. For you guys that are Hotmail, it would have my Hotmail. If I was set up properly it would have my tasks and my appointments. It’s got breaking news. And I can put any kind of ticket I want into this bar and have different kinds of information maintained and updated for it. There’s the weather back in Seattle, and if you put your mouse over it, it gives you what we call an extended tool tip to give you more information on it.
Now, if my wife was online you’d see her picture, the real picture instead of the converted one. If I’m one of these people that just has to buy something from Amazon every day, which I’m sure they would like, this is Amazon’s recommendation for what I should buy. This is the current stock information. We’re up; everybody else is down.
You know, this is data that would be my contacts list, if it were properly set up. Here are the cameras from the bridges over in Seattle, and the traffic conditions in Seattle and so forth.
But you can put all sorts of things there. You can put array databases, so you can monitor your array database. You can monitor individual Web pages. You can monitor all sorts of things. You can author what we call these tickets really easily and put them in the bar.
And again the idea is to be able to create an environment where you can sort of in one quick glance to the side in your peripheral vision you can see things that might be important to you in a particular workgroup environment. And we’re working with a number of product groups to get some of these ideas put in a product.
But the idea again is sort of putting all the information about a person into one place is something that we think is very important.
Another area that we’ve been putting some energy into, by way of helping people work better together, is this: what you’re seeing here is a very simple, very inexpensive camera that we’ve built using multiple, very tiny camera components, and we’re using computer vision technology to stitch those together in real time to create an environment where you can literally see 360 degrees around in a room. So you can prop one of these things on a table in the middle of the room and everybody then can be communicating. You don’t have to worry about where the video conferencing camera is pointed or about steering it to where you want it to be.
In fact, you can see the little squares on people’s heads, where we’re keeping track of people’s heads in the image, too, so that we can do some automatic choreographing of the video. And I’ll just give you a quick view of that.
This is how you might imagine actually using something like this, where you’ve got that 360-degree camera in the room and we’re also able to integrate into this a whiteboard and just using computer vision technology, and we’re able to attach changes to the whiteboard as annotations to the video, so that when you go back and review what happened during the video conference you can actually get that.
Again, we’re using this now to take advantage of it within the research group and experimenting with it, and it’s also something we’re working with the universities on in setting up the virtual classroom using technologies like this.
But again the goal there is to really make the whole process of doing a talk or a lecture or doing a video conference as seamless and easy as possible, you know, and to allow the person watching to have a sense of what’s happening in their environment.
So again this is stuff that we’re doing in the research lab and we’re sort of moving forward with other people.
Another area I mentioned is breaking down the barriers between people and information. Now, I know a number of you are probably familiar with the fact that computer storage is getting bigger all the time, but it’s a funny thing to think about what you can do with that.
You know, think about it: in two, three years’ time we’ll be able to put a terabyte of storage on a single 3.5-inch disk drive that you stick in a PC or even a drive you could stick into a laptop.
Now, to give you a sense of what a terabyte means — and I know for you guys in Hotmail you know what a terabyte is. In fact, you know what more than one terabyte looks like. But for those of you who are not involved with that, a terabyte is you could store every conversation you’ve ever had from the time you were born to the time you die in a terabyte of disk. You know, for the people who talk less, they could actually have some leftover storage — (laughter) — for other things. For the people who talk a lot, you know, they’d just barely get it in.
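A back-of-the-envelope check of the lifetime-of-conversation claim can be sketched as below. Every input is an assumption chosen for illustration and none of the figures come from the talk: a low-bitrate speech codec at 8,000 bits per second, three hours of talking a day, and 80 years.

```python
# All inputs below are assumptions chosen for illustration, not figures
# from the talk.
BITRATE_BITS_PER_SEC = 8_000   # a low-bitrate speech codec
HOURS_TALKING_PER_DAY = 3
YEARS = 80
TERABYTE = 10**12              # decimal terabyte, in bytes


def lifetime_speech_bytes(bitrate, hours_per_day, years):
    """Bytes needed to record `hours_per_day` of speech for `years`."""
    bytes_per_sec = bitrate // 8
    seconds = hours_per_day * 3600 * 365 * years
    return bytes_per_sec * seconds


total = lifetime_speech_bytes(BITRATE_BITS_PER_SEC, HOURS_TALKING_PER_DAY, YEARS)
print(f"{total / TERABYTE:.2f} TB")
```

Under these assumptions the total is roughly a third of a terabyte, so tripling the hours spent talking would still just about fit.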
But the sense here is that we’re getting to a point where you really don’t necessarily have to forget at least what happens to you as a person. And for small businesses, medium-sized businesses, they may not ever have to forget any transaction that they’ve performed.
Now, our first effort to get terabyte databases on the Internet was back when we did the TerraServer. And it doesn’t look quite like that anymore. My pictures are old. The TerraServer guys always complain to me that my pictures are old, but I like my pictures, so I keep using them.
But this is really an effort by us to put a terabyte database on the Web. And the cool part about it was we were able to let people, especially students, look at things like the pyramids and see them online in ways that they had never really been able to do before. So here are the Pyramids of Giza. Here’s the Microsoft campus. This is an old picture. This was from 1990, so it was even before I got to Microsoft. But we’re getting updated images all the time now, and we continue to run the site.
There’s the Space Needle. I like that one. This one was actually in the press recently. It’s an image of what basically could be best described as an airplane burial ground somewhere outside of Tucson. And you could see that there are various airplanes in various stages of disrepair, and I guess they cannibalize them for parts. I don’t know what they do with them exactly, but it’s a very pretty picture. But I think somebody used this recently to embed some other information in for purposes of demonstrating steganography, and so it made it in the press.
This is one of my favorites. We had two sources of data. One was the Russian Space Agency, which basically is old spy satellite data, and the other was the U.S. Geological Survey. So the USGS is just images of the United States. The Russians gave us images of most of the world, including a lot of pretty good images of the United States, not surprisingly.
This is a picture of the Vatican, and it’s actually cool because it’s 1.5-meter resolution, and you can see that there’s a traffic accident on the bridge there. You may not be able to see it from where you are, but you can see it pretty clearly on a PC screen.
Now, you can’t really see the people yelling at each other, you know, but I lived in Italy for a year and I do know there are people yelling at each other there. (Laughter.) If any of you come from Italy you know exactly what I’m talking about.
That’s how the TerraServer looks today. It used to fill up an entire room and it’s gotten smaller and smaller as time has gone on, and the database has gotten bigger. Just to give you a sense of how big it’s really gotten now, I think it’s on the order of 17 terabytes of data total, and we’re continuing to add more data from the U.S. Geological Survey.
We won an award from the U.S. government for innovative use of government information, working with the USGS, and you can see a sense of how many people continue — these are all current numbers — how many people continually use this Web site. It’s really stunning how almost four years later how much the Web site continues to be used.
What’s even more interesting is that after we got the Web site up and people were using it, people kept asking, “Well, can we get at the data, how can we get the information.” And so Tom Barclay from our Bay Area research lab up in San Francisco worked with the Visual Studio .NET team and turned TerraServer into a .NET service. And so that’s now up and running as well, and it supports the whole stack. So if you don’t know what a .NET service is, this slide probably won’t solve that problem, but it gives you a sense of the fact that we’re really trying to provide a data service using the sort of standard SOAP protocols, UDDI for discovery and so forth, and you can access it all through the standard Visual Studio .NET tools. In fact, I think there’s a way in MSDN to just directly use those.
We’ve been making this available for students, and a number of universities are actually teaching courses now using the .NET TerraService. I know at MIT they are, for doing distributed computing using the Visual Studio .NET beta, the academic version of that. But we’ve also made available through the TerraService a lot of other kinds of information, like topographical information, landmark information, map information. It’s now all up there as well. And we’re making available a lot of services around it. We call one the Terra Tile Service, retrieving the imagery, and the tier and doing the projection work, and then what we call the Landmark Service can actually identify a landmark and access more detailed information.
And it’s actually being used in production. The USDA is actually using the TerraService in production now in a soil management system that they’ve deployed. And so they have people in the field actually accessing the information, using it and taking advantage of it and specializing it for their needs.
So it gives you a sense again, and again part of what I wanted to get across here is as we have these enormous databases and we make them available to people, what’s going to be exciting is really making the tools available that let them analyze that information.
There are some follow-on projects, one of the most exciting of which is the National Virtual Observatory, where Jim Gray and Tom are working with Johns Hopkins and the National Science Foundation and the Sloan Foundation. It’s also called the Sloan Digital Sky Survey. They’re in the process of putting up a Web server that basically is the inverse of TerraServer. It’s images of the sky, images of stars, not images of the earth, and it’s already available for use by astronomers. And my understanding is that there have been a number of new astronomical objects that have been discovered as a result of having this huge database online and being able to do data analysis of it.
The last thing I’m going to mention and I’ll run through this pretty quickly is this thing I mentioned of breaking down the barriers of the home and office and the various pieces of your life.
And I’ve got a little video, but I’m going to skip that in the interest of time and talk about a couple of different technologies. One is something called Priorities, and now there’s something called the Notification Platform. Basically these are technologies we’ve created to try to gather huge amounts of data about the things that you’re doing with your PC and the things that you’re doing with your mail and then use that information and learn from you about what you consider important and what you consider unimportant, and effectively help you through the process of organizing and managing that information, whether it’s organizing and managing your mail or organizing and managing your notifications.
So Priorities was all about mail, and the idea is that as mail comes in, you know, we can automatically try to determine how important that piece of mail is likely to be to you. And the way we do that is first off we had a number of regular people who we used as kind of a control group to establish sort of a baseline of what people normally consider important and what people normally don’t consider important. We’re using statistical techniques to analyze that and learning techniques to sort of learn. And then we look at your behavior, how you handle your own mail, and we’re able to sort of understand the levels of importance you place on the mail by how long you look at it, how quickly you delete it, what you do with it and so forth. So we’re able to monitor and use that data to effectively try to learn what you’re doing and to associate a level of importance with it.
So a very high number, close to a hundred, marks the things that people consider important. Things down in the zero to 25 range are considered the junk mail category, and it’s usually pretty good at determining junk mail.
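One very simple way to turn a baseline plus observed behavior into a score from zero to 100 can be sketched as below. This is an assumption-laden stand-in, not the statistical model used in Priorities: it is a smoothed per-sender frequency estimate, and the class name, the baseline of 0.5, the prior weight and the 10-second reading cutoff are all invented for the example. Only the zero-to-25 junk range comes from the talk.

```python
from collections import defaultdict

BASELINE = 0.5        # assumed prior: share of mail people treat as important
PRIOR_WEIGHT = 2      # how many observations the baseline counts for
JUNK_THRESHOLD = 25   # scores at or below this fall in the junk range


class SenderModel:
    """Hypothetical per-sender importance estimate learned from behavior."""

    def __init__(self):
        self.important = defaultdict(int)
        self.total = defaultdict(int)

    def observe(self, sender, seconds_read, deleted_unread):
        """Record one handled message; the 10-second cutoff is arbitrary."""
        self.total[sender] += 1
        if not deleted_unread and seconds_read >= 10:
            self.important[sender] += 1

    def score(self, sender):
        """Return a 0-100 importance score for new mail from `sender`."""
        n = self.total[sender]
        k = self.important[sender]
        return round(100 * (k + PRIOR_WEIGHT * BASELINE) / (n + PRIOR_WEIGHT))


model = SenderModel()
for _ in range(3):
    model.observe("manager", seconds_read=45, deleted_unread=False)
for _ in range(8):
    model.observe("newsletter", seconds_read=0, deleted_unread=True)

print(model.score("manager"), model.score("newsletter"), model.score("stranger"))
```

A sender with no history falls back to the baseline, and each observed message pulls the score toward how that sender's mail has actually been handled.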
We’re also able to do more than that. Because we’re able to use natural language techniques to analyze the mail, we actually know when mail is referring to something like a scheduling event, you know, you want to schedule a meeting with someone and so forth. We know what your schedule is. We know where you are in the org chart. We know who you normally communicate with on a daily basis. And so we can build models of how you’re likely to be paged.
One of the ways we can use those models is actually to automatically generate group information or to mark your schedule. So there’s something called, it’s sort of a time warping where we can actually figure out in this particular case that based on your schedule and on things that you’ve done in the past you’ve got a couple of appointments on your schedule, but the system is calculating that you’re probably going to be gone longer than that, because you usually take a while to get back.
Right, how does it know that? Well, it knows where you are on your schedule. It also knows whether anybody is typing at your computer or not, you know, logged in as you, and it can use that information to keep track of where you are. And so when you come back to your office it can know, okay, now he’s back to the office. On average, that took X amount of time, so I can start building that into my model of you as a person and take advantage of that, and we can use that to schedule OOF messages. We can even tailor those messages for different kinds of people.
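The "on average, that took X amount of time" step can be sketched as a running average of how long after a scheduled end the first keystroke is seen. This is a minimal illustration with hypothetical names, with times expressed as minutes since midnight, and it is not a description of the actual model.

```python
class AwayModel:
    """Running average of how late the user returns after appointments."""

    def __init__(self):
        self.count = 0
        self.mean_delay_min = 0.0

    def record_return(self, scheduled_end_min, first_keystroke_min):
        """Update the average with one observed return to the keyboard."""
        delay = max(0, first_keystroke_min - scheduled_end_min)
        self.count += 1
        # Incremental mean: no need to store every past observation.
        self.mean_delay_min += (delay - self.mean_delay_min) / self.count

    def expected_return(self, scheduled_end_min):
        """Predict when the user is back, given when the meeting ends."""
        return scheduled_end_min + self.mean_delay_min


away = AwayModel()
away.record_return(scheduled_end_min=600, first_keystroke_min=612)
away.record_return(scheduled_end_min=900, first_keystroke_min=918)
away.record_return(scheduled_end_min=840, first_keystroke_min=846)

print(away.expected_return(scheduled_end_min=660))
```

A prediction like this is the kind of figure an out-of-office message could quote instead of the nominal end of the meeting.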
And some of the priorities and scheduling technology is already in Microsoft Mobile Manager. We’ve already made that transition into the product. But we’re continuing to work on this and building what we view as sort of a common notification platform, where we’re again using these statistical techniques to look at the kinds of information sources you have, effectively aggregate that information with knowledge about your notification preferences and the kinds of information passed, and to keep track of where you are, what devices you have access to, where you are in that process and monitor that, and then we could use that to generate notifications in a useful way.
So just to give you a sense of what this looks like, I mean, here’s a little visual of the system keeping track of what’s going on. So this is Eric Horvitz, who’s the guy who drives all this research, and you can see that the system is basically building profiles and trying to calculate what he’s likely to be doing. It’s using visual information. This could be the camera in your room. It actually knows when you’re looking at the screen and you’re looking away from it, when you’re looking at someone else. If you’ve got a microphone it knows that you’re talking or not. When you’re in a conversation, it can judge whether you’re likely to be presenting something or not. It knows whether you’re in a phone conversation and so forth.
So in effect what it’s doing is building a model of you and your behavior. And if you think about your office machine or your laptop, it knows all this stuff, right? It knows what’s happening. It’s not bothering to keep track of it and there’s no software that makes good use of that information. What we’re saying is if we could do that, we think we could deliver a lot of new value to someone in managing their life, in managing the information that comes to them, in controlling what information other people have about what they’re doing. So I just wanted to give you that.
So sort of the last slide. Yes, I think we’re at a new future. I think we’re at an inflection point. And the research organization is really focused on trying to drive to bring a lot of these new technologies out.
I really didn’t talk about a lot of things we’re doing. I didn’t talk about the programming work. I didn’t talk about my own work in operating systems. But I hope I gave you a sense of some of the things that we’re trying to do to really create a new kind of computing environment and a new kind of computing experience for people.
I mean, again our goal in research, and certainly Microsoft’s goal as a company is to create new computing experiences to really give people new value in the work that they’re doing and so this is an opportunity to do that.