Rick Rashid, Senior Vice President, Microsoft Research
Mark Emmert, President, University of Washington
Dan Ling, Corporate Vice President, Microsoft Research
Microsoft Research 15th Anniversary Celebration
Redmond, Wash.
September 26, 2006
KEVIN SCHOFIELD: Good morning. Let’s get started. Tough crowd this early in the morning. Good morning. I’m Kevin Schofield, general manager here at Microsoft Research. I want to thank you all for coming today as we celebrate the 15th Anniversary of Microsoft Research. It’s been quite an adventure for us over 15 years, and I hope to share some of that, a little bit of a retrospective with you this morning.
Bill Gates really wanted to be here in person this morning, was not able to make it because he had to be out of town, but we actually have a few words from Bill on video. So, let’s share those with you right now.
(Video segment.)
One of the things that we’ve learned in Microsoft Research over 15 years is that it’s very hard to capture what goes on in a research lab in a PowerPoint slide deck. So we’re committed to being PowerPoint free today. But we have a great full morning planned for you. A little later this morning, Dan Ling, who is the corporate vice president and head of our Redmond research lab here, is going to bring in some of our researchers and share with us some of the technology investments that we’ve been making for almost as long as Microsoft Research has existed, and you’ll see some of the great payoffs those investments in four different areas have brought us. We’re also going to have an opportunity for open Q&A later on this morning.
We wanted to start off this morning with a conversation, really taking the time to be a little more thoughtful about why Microsoft has a research lab, why that’s important for a software organization like Microsoft, and how we think about ROI for a research lab in a corporate environment like this. And we’re very privileged this morning to have with us President Mark Emmert from the University of Washington, and Dr. Rick Rashid, who is the Senior Vice President and head of Microsoft Research Worldwide.
At this time, I would like to bring them up and turn the floor over to them.
RICK RASHID: We’re having the discussion about which side we should sit on.
MARK EMMERT: And what would be a political statement, and what would not. I’m on your left, you might note.
The format for what we’re going to do today is, we’re going to just chat with each other about this whole enterprise that’s been running now for 15 years, and then we’ll get a chance to hear some of Rick’s thoughts on what’s really been going on, what the future holds, and why this kind of an enterprise makes sense in an organization like Microsoft, because it’s really quite distinctive and very unusual. Let me start there, Rick. Fifteen years ago, you were brought in from the academy where you had a very distinguished career as a computer science professor and researcher to found this brand new enterprise. And at that moment, what were you trying to accomplish 15 years ago, and looking back on those 15 years, how do you think that experiment has gone?
RICK RASHID: You know, it’s interesting, when I was considering taking the job to come to Microsoft, I had a rare opportunity of really being able to get some tremendous advice from one of the great leaders in the field, Allen Newell. Allen was a Turing Award Winner, he was the founder of the computer science department at Carnegie-Mellon, where I was, and he’s been involved in getting RAND going. So, he had a lot of experience in really how do you start, and how do you maintain a great research environment. I had a chance to talk with him. Unfortunately at the time, he was ill, and he knew he was likely to be passing away from cancer, so it was kind of a very poignant time to talk with him as well. But he gave me great advice about what it takes to really make a great research lab. And, in some sense, also the feeling for how difficult it was going to be. He wanted to make sure I understood that. And how fragile such an enterprise can be in terms of creating the right kind of environment, and making sure you have it.
I think the key things that I got from my conversations with Allen were, first and foremost, that you had to bring in great people. It was extremely important that when you’re hiring, you’re hiring really for talent, and not so much for what it is those people actually do. A lot of times people today will ask me, if you have specific research areas you’re trying to hire in, or what is your focus? I think the answer is always, when we have an opportunity to hire great researchers, whether it’s great young people coming out of Ph.D. programs, or established researchers that have really created great careers and have continued to be innovative through those careers, we’ll take that opportunity. It doesn’t matter what areas they’re actually working in, historically that’s always proven to be the best way to create a great research environment.
The other aspect is that you really have to have an environment which is focused on moving the state-of-the-art forward. It is, in some sense, more important to me that Microsoft Research is at the forefront of a research area than it is that we’re doing anything that’s specifically related to Microsoft. If we’re doing great research, then we’ll find a way of making that relevant to Microsoft. I mean, great research has that quality. If we’re not doing great research, there’s really no reason for us to exist. So early on, I wanted to create an environment that had great people, that was open, where we were taking our research and subjecting it to peer review in the same way you would be doing that in a university environment, so that your strongest critic, in some sense, had the opportunity to say whether you were really doing innovative work or not.
And then I did also want to create a research environment that really worked well in the context of a software company like Microsoft. My feeling at the time, and really the reason I took the job, because I could have done research in a university, was that the natural output of a research lab, the ideas, the innovation, the artifacts that get created in research, the algorithms that come out of computer science, was the natural input to a company like Microsoft. That was a perfect input to the product creation process at a company like Microsoft. So that’s what I was hoping to do, and I feel good about what we’ve become.
MARK EMMERT: Good. If you had to go back over these 15 years and point to a few key discoveries and research outcomes that you’re most excited about and proud of, what would those be?
RICK RASHID: Well, there are many points that I could pick that I think were so critical to the way the organization has been shaped. For example, really early on, actually going back to 1992, we were working on technology for optimizing how 32-bit code ran. It was really very mathematical work, very theoretical, but the idea was to come up with new ways of arranging 32-bit code so that when you executed it, it would take up less space in memory. Now, we were really proud of that work. We’re always proud of our own work. And we talked to the product teams at the time, and the reaction we got was kind of interesting; it was very much, well, you guys are really incredibly smart, we really appreciate you, but we don’t actually have that problem, but we really appreciate that you’ve worked on this, and boy, you guys are really smart.
Three years later it turned out that the company really needed that research. We were trying to ship Windows and Office at the same time, and because of trade issues at the time between the United States and Asia, DRAM prices were abnormally high, so the size of memory in computers of that day hadn’t really increased the way people were expecting it to. Of course, 32-bit code took up twice as much space as the old 16-bit code did, and so it was important that we find a way to get that space back. So that technology was critical to allowing us to succeed in that product ship. That was an important period for the company. And I think the fact that we worked so closely with the product teams at that point established a long-term relationship with them that we have built on over time, in terms of technology transfer.
Another point I think was particularly important, to me at least, was going back to 1996, when 20 percent of the papers at SIGGRAPH that year, which is the top graphics conference held every year, had a Microsoft author. In some senses that was almost like a coming-out party for the organization. We’d grown at that point to about 100 researchers, and that got people in the academic community to really sit up and take notice: wow, there is some incredible research going on there. These people have built a tremendous research organization.
I can just remember meeting with Nathan Myhrvold, who was my boss at the time; he was running our advanced technology division, which research was a part of at that time. I went into his office just a few weeks after that conference, and just spontaneously we started smiling and laughing, because we were both thinking the same thing. It was like, wow, we started this enterprise, and five years later it really seems like it’s taken off. It was at that point that we decided to start expanding outside the United States. We’d been able to establish a strong research group here, and now it was something where we could start trying to attract some of the great talent that was outside the United States, as well.
MARK EMMERT: Both you and Bill in the video talk about the essential nature of Microsoft Research in being an open research enterprise. That, of course, is how the academy works, that’s how I’m accustomed to overseeing research activities. But, it does cause us a lot of challenges when we have to go to taxpayers and other funders who say, okay, what’s the return on investment here? What are we all going to get out of this? Now, I think Microsoft is extraordinarily unusual in the fact that you’re one of the few really serious research, as well as R&D enterprises going on out there, and then you’ve got this open characteristic, as well. Why is it critical that you be open? And secondly, closely related to that, given that you’re an open research enterprise, how does Steve Ballmer assess return on investment for what is a very large scale investment in computer science and research, in fact, I think it’s as big as NSF’s computer science budget or something close to it.
RICK RASHID: It’s probably not quite that big.
MARK EMMERT: It’s pretty close.
RICK RASHID: Well, the reason it’s important that we’re open is it’s critical for us to keep doing the best work we possibly can. I want to attract the best researchers, and I want to be sure that they’re doing top quality research. Frankly, the only way you can be sure that you’re doing that is to be open, and to subject yourself to external peer review. I mean, there are many issues that people have with the peer review process, and it’s not always perfect, but it’s the best way that the academic community has been able to come up with to really assess, are you innovating, are you doing new work, are you doing work that’s scientifically significant. And by keeping the environment open, by subjecting ourselves to external criticism and review, it really helps to ensure that we’re really doing the best work.
You can bring a lot of great people together; there are plenty of examples of that historically, where you bring a bunch of great people together, you have a closed environment, and for some short period of time those people can do excellent work, because they’re building on what they know, they’re building on what they’ve done before. But you can’t sustain that without external review. You can’t sustain that without the criticism of your peers, and so that’s why to me it’s important. Frankly, I don’t think you can hire great people unless you’re able to show new people that you’re doing good work. Keeping yourself open allows you to do that.
In terms of return on investment, both Steve Ballmer and Bill Gates have been quoted publicly, which is great for me, saying Microsoft Research is the best stuff that Microsoft makes, and what they’re doing is looking at the historical results. There are virtually no products Microsoft produces today that have not either taken technology from research, come directly out of research, or been built using the tools and technologies we’ve created in research. Microsoft Research has allowed Microsoft to enter into new businesses, to meet new competitors, and to respond to changing business conditions.
So we’re really able to allow Microsoft to be more agile, as it confronts new challenges. And for a company like Microsoft, that’s absolutely critical. In the technology field if you’re not able to change, if you’re not able to adapt, if you’re not able to innovate, you’re not going to be around. One of the things I like to say is, the reason you have Microsoft Research is so Microsoft will still be here 10 or 15 years from now.
MARK EMMERT: You do a good deal of work in collaboration with universities in the U.S. and around the world. The University of Washington has some terrific collaborations with you in a number of fields, and of course you’re doing it in Europe, and now in Asia, as well. Tell us about why you do that in collaboration with universities, and what you hope to gain out of that process. I understand why it’s clearly advantageous for us in the academy; what’s in it for Microsoft?
RICK RASHID: Well, it’s interesting, one of the very first things we did when we started Microsoft Research, really before we hired any researchers, was to create a technical advisory board composed of academics who would advise us on what we were doing, criticize us if we were doing things wrong, and give us better insight into what was happening in the academic world. For me, that connection back to the university environment is super-important. For one thing, we’re not going to do all the research in the world; I don’t care how big Microsoft Research becomes, the center of gravity of the academic community is in the universities, not in corporations.
So it’s important for us to know what’s happening in research in the university environment, to work with universities, to be able to connect with innovative work, like at the University of Washington, where we’ve done several collaborations, so that we can be aware of what’s happening and be part of that innovation as it occurs, and, where it makes sense, to help bring those innovations from the universities into the company, because Microsoft needs to continue to innovate. Again, Microsoft Research is not going to create all of this innovation; we’re going to bring many innovations in from the outside, whether it’s by working with academia or by buying technology from some other company. So that’s important.
Obviously, the lifeblood of a research group is researchers, so for us it’s also important to work with the academic community, because we want to be the place where the best and brightest young Ph.D. students want to go work. We’ve been, I think, particularly successful in continuing to hire some of the best graduate students coming out of Ph.D. programs, and that’s allowed us to grow; that’s really made the difference for us. We actually run what I believe is the largest Ph.D. internship program in computer science in the world. And, again, it’s part of the way we interact with universities, but it’s also part of the way we recruit: we get to show all these young students what it’s like working at Microsoft Research, what kind of environment we have, and why that’s a good place for them when they eventually graduate.
That program, just in the U.S. this year, just in Redmond in fact, had over 250 Ph.D. interns. To put that in perspective, the United States only produces about 1,200 Ph.D.s a year in computer science. So a significant fraction of the cohort of Ph.D. students will have worked at Microsoft Research. Worldwide we have more than 700 interns in our research labs. So it’s a huge program, and again, it’s another way that we work with universities.
So really for us it’s about connecting with the research, working with students, and hiring great students; and frankly, one of our goals has always been to make sure that the university system is healthy, that we do what we can to ensure that computer science departments and some of the top engineering schools in the country really are able to do their best work. A lot of times we’ll actually fund work in new areas that may not yet have gotten the attention of governmental funding agencies, because we can do that. We can be nimble in the way we work with these universities.
MARK EMMERT: Talk a little bit about, if you would please, the international strategy you have with your laboratory. You talked about the huge proportion of Ph.D.s who are interning with you now, and you’ve now established research centers in a number of different locations. What’s that strategy about, and where do you see that going over the next 15 years?
RICK RASHID: It’s interesting: if you go back to the earliest document that describes what Microsoft Research might be like, it was actually a memo written by Nathan Myhrvold to the Microsoft board back in 1990. In that memo he talked about starting a research lab in the United States, and then one in Europe, and then one in Asia. It’s funny, because he never showed me the memo; we never actually talked about it. When I felt we had gotten established in the United States, I first started talking with Nathan, and talking with Bill, about the possibility of starting research labs outside the U.S., and the goal I had was to really capture the talent that those areas of the world have. Not everybody wants to work in the United States.
We have tremendously great researchers coming out of universities in Europe, coming out of universities in Asia, and my feeling was that as long as we were simply U.S.-centric our growth would be limited. There are only so many Ph.D.s produced in the United States every year, and a relatively small fraction of those would be of the quality we would hire. So if we really wanted to grow, and grow with Microsoft, I felt we needed to do that.
We were extremely fortunate in being able to hire Roger Needham, who had really been a pioneer at the University of Cambridge in the computer laboratory there, had run that organization for many years, tremendously respected in Europe, and was a friend of mine. He had come to a point in his life where he was interested in doing new things. We started talking about creating a lab in Europe, in particular in Cambridge, and so we started that lab, I believe it was 1997, and again that’s been growing really well for us.
Our success there led us to start looking at Asia, and we felt that there was a great opportunity in China in particular, because of the many students coming out of the Chinese universities, and the fact that really there was no one else doing basic research in computer science there of the kind that we did. We felt that we could bring in great students and give them an environment where they could be productive and make a difference for Microsoft. So we started that lab in, I believe it was, ’98, and we’ve been growing since then. We then created another research lab in the United States; we have a group in the Bay Area now. And the most recent lab we started was in India, in Bangalore. Again, in each case it’s always about the opportunity that region presents to us to hire great people, and to make a difference.
That said, one thing I’ve discovered is that having these labs in other countries and other geographies has also given us other opportunities. Researchers coming from those environments have different perspectives than those we might hire here in the U.S., coming out of U.S. universities. So it takes us into new research areas that we might not have gone into before, and in many cases it looks at problems that are local to that geography. For example, our lab in India started up a program looking at how technology can improve life in rural communities and can improve education. That’s particularly important in India, and they’ve taken a leadership role there, and it’s just an example of how the researchers in a particular geography bring their own perspective to solving research problems.
MARK EMMERT: Good. So now you’ve collected what as an academic I’d call the best computer science department faculty in the world. You’ve got a dozen or so members of the National Academy of Engineering, yourself included, incredibly distinguished folks; Roger is a great example of that. Where do you want to go with that talent pool in the next 15 years? What’s next on the horizon for Microsoft Research? And let me ask a related question: how do you set the research agenda inside Microsoft Research? Are the individual, I almost said faculty members, researchers, are their research agendas strictly curiosity-driven, or do you try and steer the enterprise in some specific direction?
RICK RASHID: I’ll answer the second question first. There really isn’t a good mechanism by which I could steer the organization.
MARK EMMERT: Now you know how I feel.
RICK RASHID: I don’t think that’s very different than being in the academic world. You hire – in some senses, the best opportunity we have here in research is by the people that we hire, what their interests are, what their talents are, and even there we’re really driven more by the quality of people than we are by specific areas. If I have a chance to hire an incredible person, it may be in an area we were never thinking that we would do research, and suddenly we’re doing research there. So to me I don’t really think about direction.
What we do do is we make sure that our researchers understand what’s happening broadly in the community. So as they think about their research, they’re certainly aware of what the Microsoft product teams are doing, what’s happening in the academic world outside of Microsoft, and what’s happening internationally. Certainly that can have an influence on what people do. But, in some sense, the goal is to have great people who figure out where they’re going. And I must admit, I come into work and I’m not infrequently surprised by some research project that I’ve never heard of before that we’re working on. To me that’s good. That’s what makes my job fun.
In terms of where we’re going to go in the next 15 years, I just hope that we will still have 15 years from now the kind of vibrant environment that we have now. We’ve been incredibly lucky, because of – frankly, because Microsoft believes in doing basic research. It believes in supporting the work that we do. We’ve been very lucky in being able to build a stable research environment, with the same set of goals, the same mission, and not change that for 15 years. If we can continue to do that for the next 15 years, then I think the sky is the limit in terms of what we can accomplish. What technology will be doing then, like I said, I get surprised every morning. So my ability to predict 50 years in advance is probably poor. But, I’m sure it’s going to be exciting, if we continue to move forward the way we have.
MARK EMMERT: So let me put you on the spot just a little bit. As a computer scientist, if you got to pick one or two, or three really interesting problems, intellectual problems in the field, that you’d love to see cracked, that you think would have some really significant impacts on our understanding of computer science in the field overall, what might those be?
RICK RASHID: Well, I think there are several areas I’d point to. I think we’ve only scratched the surface, for example, in the biomedical area, in terms of what computing can do to improve people’s health. I think that –
MARK EMMERT: And control the cost of healthcare.
RICK RASHID: I met with the governor, the governor’s staff, and the cost of healthcare is very high on their list of issues that the state confronts. Internationally every country you go to, everyone is looking at the cost of healthcare, it’s a major part of the concerns that they have about their ability to support their citizens in the future. So I think there’s a huge opportunity to be able to bring computing technology, and the underlying theory into the biomedical area, and have a big impact.
I think another area that is extremely important is the environment. Again, I think that’s an area where technology could play a very significant role. One of the research projects we’ve had here over the last year, for example, is looking at traffic analysis and traffic prediction; actually, we’ve now licensed that technology to an outside startup that is going to bring it to market. Again, the opportunity is that as we collect more and more data about cars and how they’re flowing through our cities, and if we can get that data back to the cars and to the people driving them, there’s an opportunity to positively impact traffic flow, to provide much better data to urban planners as they think about how the roads work, and ultimately to think about new ways to drive the transportation business in a way that’s more energy efficient and respects the environment. I think that’s really important.
And that’s the tip of the iceberg. We’re doing research in other areas too; for example, we’re partnering with the University of Washington on oceanographic studies, Project Neptune, really looking at what’s happening on the deep sea floor, being able to stream live information to scientists so that they can understand what’s happening and, hopefully in the long run, help preserve that environment and improve upon it.
So those are just some of the challenges. I think there have been many others, just augmenting what people can do, how they manage their lives, and supporting it. We’re capable of doing much more now than is being delivered, and I think in 10 or 15 years we could help solve some real issues people have.
So I’m an optimist about technology. A lot of times people worry about, what could happen, where is technology taking us. In my experience, I think it’s allowed us to do so much, it’s allowed us to improve so many people’s lives, and I think the next 15 years will take that even farther.
MARK EMMERT: Agreed. What, then, is the flipside of that: what are the biggest challenges that Microsoft Research is going to face in the next 15 years? Let’s say five; 15 is a long timeframe.
RICK RASHID: Again, I think about it in two different ways. As an organization, a research organization is always challenged to keep the quality of its work high and the quality of its staff high. In some sense, as I said from the very beginning, research enterprises are very fragile. It’s very easy to start to lose your edge, and perhaps lose your best people, if you’re not careful. So I think organizationally it’s important to always keep striving, always keep moving forward, trying to improve, making sure that you’re really subjecting yourself to external criticism, that you’re really doing the best work you possibly can. In some sense, that’s my personal conviction.
Organizationally, I think it’s always a challenge to take great research and get it into practice. I often talk about technology transfer as a full-contact sport. It’s not something that happens by accident as often as we would like. It can happen by accident, but mostly technology transfer is hard work. It’s hard work on the researchers’ part to identify the work that lends itself to transfer and to put it in a form that can be used. It’s hard work on the receiving end, for the product groups that have to figure out how to take advantage of research, or how research can solve problems that they have. We actually have a team of people whose sole job is to help do technology transfer, and it’s hard work on their part, too. They really act as the glue that brings the two sides together. So, again, I view that as an ongoing challenge, and it’s not something you could ever slack on. You have to keep pushing and pushing.
MARK EMMERT: Well, my last question, then, is one that was brought to me by Tyrone Willingham. He asked if you could help with a defensive strategy for Southern Cal?
RICK RASHID: You know, landmines. They’ve got a pretty good team. You know, the University of Washington won this last weekend, so you guys are on a roll.
MARK EMMERT: We’ll see if it’s a roll or just a fluke. But, in closing, you must be enormously proud of what you’ve created. It’s a great anniversary, 15 years of constant discovery, you’ve assembled this great workforce. You’ve been able to contribute an awful lot to the advancement of the field. So I just want to congratulate you and Microsoft on having done a wonderful job, your great partnerships with the academy are terrific. The technology that’s moving into the hands of people is very exciting, and it’s been a delight to get to chat with you.
RICK RASHID: Thank you, Mark.
MARK EMMERT: My pleasure. Thank you. (Applause.)
KEVIN SCHOFIELD: I’d like to thank Mr. Emmert and Rick one more time for that great conversation and sharing their thoughts with us.
While we get set up for the next session, we have a fun video to share with you that sort of explains a little bit about the extent Microsoft researchers will go to sort of get a glimpse of the future. Let’s go ahead and see the video.
(Video segment.)
KEVIN SCHOFIELD: Now I’d like to bring up Dan Ling, who’s the corporate vice president and head of our Redmond lab here in Microsoft Research. He’s going to for about the next 45 minutes or hour or something like that share with you some of the technologies that have been long-term investments for us here in Microsoft Research that have really been seeing big payoff most recently, and tell you a little bit about that, and actually bring up some of our other researchers to share some of the demos with you. So we have a number of stations and he’ll sort of be walking around and sharing that with you. So, Dan, come on up.
DAN LING: Great, thanks Kevin.
Good morning, again, and welcome to our 15th year anniversary event. This is an opportunity for us to take a look at a few demos and chat with some of the researchers. But since we only have about an hour or so, we’re only seeing a very small slice of the work going on at Microsoft Research, but hopefully it’ll be a fun peek into the kinds of things that we’re doing here.
So for the first project: for close to the entire 15 years, we’ve made a very heavy investment in a technology called Bayesian inference, which allows software to reason even in situations of uncertainty, and then to take action based on what it has inferred and on the preferences of the user, to take perhaps an optimal line of action.
And so we’ve used this technology to do a whole variety of different things, but one area that we’ve applied it in is in taking a look at how we can conserve the user’s attention. Our attention is a very scarce resource, it’s certainly not growing according to Moore’s Law, and it’s consumed by almost all software programs as if you had all the attention in the world to devote to it. And so in the cases where you have, for example, tons of communications, you’d like the software to be a little bit more intelligent and to conserve the user’s attention.
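To make the idea concrete, here is a minimal sketch of the expected-utility tradeoff Dan is describing: alert the user only when the expected value of the information outweighs the expected cost of interrupting. The probabilities, utilities, and thresholds below are illustrative assumptions, not the actual models used in this research.

```python
# Illustrative only: decide whether to deliver or defer a notification by
# weighing its expected value against the estimated cost of interruption.

def expected_value_of_alert(urgency_prob: float, value_if_urgent: float) -> float:
    """Expected benefit of alerting now, given P(message is urgent)."""
    return urgency_prob * value_if_urgent

def interruption_cost(busy_prob: float, cost_if_busy: float, cost_if_free: float) -> float:
    """Expected cost of interrupting, given P(user is busy) inferred from sensors."""
    return busy_prob * cost_if_busy + (1.0 - busy_prob) * cost_if_free

def should_alert(urgency_prob: float, busy_prob: float) -> bool:
    # Hypothetical user preferences, expressed in the same utility units.
    value_if_urgent, cost_if_busy, cost_if_free = 10.0, 8.0, 1.0
    return expected_value_of_alert(urgency_prob, value_if_urgent) > \
           interruption_cost(busy_prob, cost_if_busy, cost_if_free)

# Example: a probably-urgent message while the user is probably in a meeting.
print(should_alert(urgency_prob=0.7, busy_prob=0.9))  # False: defer the alert
```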
Let me turn it over to Eric Horvitz to show you a little bit about our work in this area.
ERIC HORVITZ: Thanks, Dan.
So, for about five or six years now we’ve been looking very carefully at models of attention, building statistical models that can actually infer how busy somebody is at any time, as well as what the cost of interrupting that person might be at that time.
This work actually led early on to taking an economic view of attention, in that we have built systems that can trade off a user’s focus of attention against the value of being aware of communications at a variety of times. And that meant not just building sensing systems like you see up here that can track where I’m looking at any time (my laptop is also listening to me right now to understand that I’m having a conversation), but also understanding the value of staying aware: the value of staying aware of urgent information, of being available for phone calls, and so on.
One of our early projects here was called Priorities; it’s still being used around Microsoft, and it led to several products. This system actually learns, by watching you work with e-mail, which e-mail messages are the urgent ones. In this case, a message from (Paul Kosheningly ?) is much more urgent than a message from Rick Rashid today talking about, I think, our anniversary, which is kind of important but not urgent. So, that’s how that system works.
Now, we combine the valuation of urgency with the cost of interruption, and so Priorities actually considers not just the value of looking at e-mail and being aware, but also how costly it is at any time, as well as when you’ll next look at e-mail, for the mobile applications of these tools.
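As a toy illustration of the "learns by watching you work with e-mail" idea, the sketch below trains a tiny naive Bayes classifier on hypothetical examples of messages the user did or did not treat as urgent. The features, training data, and model choice are invented for illustration; the actual Priorities system is far more sophisticated.

```python
# Illustrative toy version of "learning urgency by watching the user":
# messages the user opened quickly are treated as urgent training examples.
from collections import Counter
import math

def train(examples):
    """examples: list of (subject_words, opened_quickly) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for words, urgent in examples:
        counts[urgent].update(words)
        totals[urgent] += 1
    return counts, totals

def urgency_prob(subject_words, model):
    counts, totals = model
    n = sum(totals.values())
    vocab = len(set(counts[True]) | set(counts[False]))
    scores = {}
    for label in (True, False):
        # log prior + log likelihood with add-one smoothing
        score = math.log(totals[label] / n)
        denom = sum(counts[label].values()) + vocab
        for w in subject_words:
            score += math.log((counts[label][w] + 1) / denom)
        scores[label] = score
    # convert log scores to a probability of "urgent"
    m = max(scores.values())
    exp = {k: math.exp(v - m) for k, v in scores.items()}
    return exp[True] / (exp[True] + exp[False])

model = train([
    (["build", "broken", "now"], True),
    (["server", "down"], True),
    (["anniversary", "celebration"], False),
    (["newsletter", "weekly"], False),
])
print(urgency_prob(["build", "down"], model))          # high
print(urgency_prob(["anniversary", "weekly"], model))  # low
```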
These systems led to broader applications of the technology, where we combined telephony with computing, exploring the unification of phone and computer in a way that led to prototypes, one of which, called Bestcom-ET, was used by 12,000 Microsoft employees for several years. That project went on, and the code base evolved into what we call today Communicator, which has been shipping as of several months ago, as well as the Live Communication Server at the back end.
So I’d like just to highlight the core notions here.
DAN LING: Right, that’s great. I mean, this is a great example of a technology investment that we’ve made for many years and has really led to important product impact here at Microsoft.
Another topic that Eric will show us something about is in the area of search. And in this case, what we’ve tried to do is go beyond the actual search tool itself and think about the entire search process, the whole activity that surrounds doing the actual search, the context around the search, and how we can actually engage and stimulate the user’s memory.
ERIC HORVITZ: So, memory like attention is a key pillar of cognitive psychology, and it’s just another example of how we started with the basic science of understanding cognition and then started working with applications and prototypes that have product influence eventually.
So we think about it quite a bit, as do other MSR teams working on search and browsing, both Web search and desktop search; in fact, the Vista desktop search was originally prototyped by our team as something called Stuff I’ve Seen, very much focused on memory and memory landmarks.
What you see here is an application we call Life Browser, and hopefully you’ll see why we call it Life Browser in a few seconds. What you see in the left-hand column is an automatically constructed timeline that goes back 10 years for me. On this timeline the computer has automatically selected thumbnails, crawling over my hard drive to find what it believes will be interesting memory landmarks, as well as video collections. You see here in blue a set of what the system has decided might be important landmarks in my meetings and appointments over time. And on the right, the system is bringing up what it thinks, over a period going back 10 years, are important landmarks in desktop activity.
So notice I have a big slider here, and I can actually grab this slider and add more and more detail, all the way up to where all these columns are pretty detailed. To manage information overload, the idea is that we have something like a mixer you might see on an audio system: here, let me see all my meetings. And you see all that blue, that’s how many meetings are on my calendar, but the system knows which ones are the important ones. Same for file activities: what you see here in red are files that I actually edited on these days. In black, or dark gray getting darker toward black, are things that I just looked at without touching over time. And that’s how that system essentially works.
Now, how does it do that? Well, behind the scenes, we have Bayesian models that are trained by users. In this case, here is my Bayesian network, which is predicting which appointments are likely to be considered landmark appointments. And it turns out for me, it can be attendee atypia, strange people coming to meetings; strange organizer, organizer atypia; notions of location. (Laughter.)
DAN LING: (Off mike).
ERIC HORVITZ: Strange for two years of my appointments means unusual or atypical, right; that’s not necessarily strange people, but people who come very rarely to meetings. And so on.
Thanks, Dan.
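A rough sketch of the "atypia" scoring Eric describes: an appointment is treated as a likely memory landmark when its attendees, organizer, or location are rare in the user’s calendar history. The weights and data below are invented for illustration; the real system learns these relationships with Bayesian models.

```python
# Illustrative only: flag appointments whose attendees, organizer, or location
# are rare in the user's calendar history ("atypia") as candidate landmarks.
from collections import Counter

def rarity(value, history_counts, total):
    """1.0 for never-seen values, approaching 0.0 for very common ones."""
    return 1.0 - history_counts[value] / total

def landmark_score(appt, attendee_counts, organizer_counts, location_counts, total):
    attendee_atypia = max(rarity(a, attendee_counts, total) for a in appt["attendees"])
    organizer_atypia = rarity(appt["organizer"], organizer_counts, total)
    location_atypia = rarity(appt["location"], location_counts, total)
    # Hypothetical weights; a trained Bayesian model would learn these.
    return 0.5 * attendee_atypia + 0.3 * organizer_atypia + 0.2 * location_atypia

history = [
    {"organizer": "me", "attendees": ["alice", "bob"], "location": "bldg 113"},
    {"organizer": "me", "attendees": ["alice"], "location": "bldg 113"},
    {"organizer": "dean", "attendees": ["provost"], "location": "campus"},
]
total = len(history)
attendee_counts = Counter(a for h in history for a in h["attendees"])
organizer_counts = Counter(h["organizer"] for h in history)
location_counts = Counter(h["location"] for h in history)

new_appt = {"organizer": "dean", "attendees": ["provost", "ceo"], "location": "campus"}
print(landmark_score(new_appt, attendee_counts, organizer_counts, location_counts, total))
```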
Notice also that the system understands, uses Bayesian models to actually weave together experiences automatically from my collection. So I was in Venice a few weeks ago, and here’s the system’s view of what that was like. And here is an audio-visual experience here where a learned model can go through my hard drive and give me sort of a little holodeck experience for each thumbnail that I see.
And, of course, there’s also search here as well. So, let’s say I was at CMU a couple of months ago for an oral exam, and I want to find that quickly. The basic idea is we can do a search, and when you do a search, we bring up the matching items and the memory-landmark thumbnails wrap around them. And so those items fill the right-hand column.
DAN LING: This is just research software, although some of the ideas have definitely gone into Microsoft products. So it may take a little while for the results to come up here.
ERIC HORVITZ: Right. But anyway, but the idea is when they do come up, the memory landmarks wrap around the search results in a way that give you a sense for what happened in your life, what happened on your desktop, when these events happened, and when these files were created or these appointments occurred.
So that’s just where we’re headed. We’re actually looking more deeply at this technology for both autobiographical tools, beyond search, for experiential tools where you can sort of have a way of actually probing and experiencing over time different events in your life.
So it started out as a search browser and a search tool. All of a sudden we had something that people said, wow, I’d like to have that for autobiographical kinds of means, or I’d like to have this for even memorializing a set of trips.
So I’ll stop there.
DAN LING: Great, thank you Eric.
So let’s move on to something a little bit different, and this is a project about interaction techniques and how we can interact with computers away from the desktop, or actually, directly on the desktop, away from a keyboard and mouse, away from, let’s say, using a portable device, but actually interacting directly with the flat surface. And Andy Wilson here will say a few more words and then show you a demo.
ANDY WILSON: Right, and so one of the big ideas in ubiquitous computing is the notion that computation will actually be in your environment, in your everyday space, so a little confusion around the notion of a desktop is completely appropriate.
We’ve gone away from sort of the real world, and what I’m going to show you right now is a prototype that we’ve been working on that brings back some of these tangible, real-world aspects into computing, and there are some really interesting advantages to doing that.
Now, my background is in sensing, and so I like computer vision and cameras; sensing is actually a really very flexible modality and technology. And so what we’ve got here is a projector and a camera, and the camera is trained on the very same region where the projector is displaying an image. And so we have this nice collocation, so I can put my hand in front of the ball and the ball bounces off my hand.
DAN LING: But what you can’t see on the screen is where his hand is.
ANDY WILSON: Right. So there will be some examples where that’s a little more visible. This is not a particularly interesting example, but at least it gives you a flavor of some of the minimal kinds of interactions you can do with computer vision.
It’s also a very flexible modality, so I can actually work with real physical objects. These are just little acrylic plastic pieces with laser-printed patterns on them, and as I put them down on the surface, we get these graphical objects. This is not a real application; we’re just running through some of the technical capabilities of the system. So these could be game pieces, obviously, or they could have any sort of functionality the application designer can come up with; we put one down, and maybe it’s a knob for controlling the parameters of some search, or the volume in a media playback application.
Just continuing in that vein, here’s just a regular piece of paper. As I put that piece of paper down on the surface, the system sees that and then projects a video onto that. So as I move the piece of paper around, the video keeps up with that piece of paper.
And so this is an interesting thing to think about like what kinds of things would you use in a real interface where you have real objects?
And so I’m going to show a couple little more vignettes real quickly.
DAN LING: So this is technology that uses a lot of computer vision techniques, which is an area that we got into about 10 years ago at Microsoft Research. And at the time, we really had no idea why computer vision might be an important technical area for us. It’s a good example of some of the things that Rick Rashid talked about earlier. But as we’ve moved into this digital era, we have found more and more ways to apply computer vision.
ANDY WILSON: So one of the capabilities that we’re looking at is this notion that you can have, you can actually put your hands directly on the data that you’re manipulating and really can move beyond the single-cursor model, single-point-cursor model that you find in Windows, for example.
So the idea here is I have this map that’s being drawn off of a Windows Live Local database in real time, and as I put my hands on this data, I can manipulate it rather like it was a real map. As you see there, it’s actually pulling in the tile data, the images, from the server. And so it’s a very natural kind of interaction. It’s a little bit like a real map, but of course you can’t pull a real map in and out like this.
So this is an interesting kind of idea that we’re looking into: what are the kinds of situations where being able to put your hands directly on the data makes sense? And the neat thing here is, if you know how to manipulate a piece of paper on the desk, chances are you can get this to work. Notice that if I have just one hand on the desktop, I can drag the map around; if I have two hands on it, I can rotate and scale it. And so it’s very interesting to look at these different interaction techniques that this sensing affords.
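For the curious, here is a bare-bones sketch of the underlying two-contact manipulation: given where two touch points were and where they are now, recover the scale, rotation, and translation to apply to the map. This is a generic derivation for illustration, not the project’s actual code.

```python
# Illustrative only: derive translate/rotate/scale from two touch points
# moving from (p1, p2) to (q1, q2), as in two-handed map manipulation.
import math

def two_point_transform(p1, p2, q1, q2):
    """Return (scale, rotation_radians, translation) mapping the old pair to the new pair."""
    vx, vy = p2[0] - p1[0], p2[1] - p1[1]          # old vector between contacts
    wx, wy = q2[0] - q1[0], q2[1] - q1[1]          # new vector between contacts
    scale = math.hypot(wx, wy) / math.hypot(vx, vy)
    rotation = math.atan2(wy, wx) - math.atan2(vy, vx)
    # Translation chosen so the old midpoint lands on the new midpoint.
    old_mid = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
    new_mid = ((q1[0] + q2[0]) / 2, (q1[1] + q2[1]) / 2)
    translation = (new_mid[0] - old_mid[0], new_mid[1] - old_mid[1])
    return scale, rotation, translation

# Two hands move apart and twist: the map should zoom in and rotate.
print(two_point_transform((0, 0), (10, 0), (-5, 0), (5, 10)))
```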
And so for the final idea, we’re also looking into different ways to incorporate multiple devices, mobile devices, as they’re placed onto the desktop. What I’m going to do here is I’m going to take a quick picture using this camera phone. This is a Windows Mobile 5 smartphone. And quickly take a picture of you all here.
And so I’m going to place the phone down on the surface, and that’s going to enable this part of the demo. As I place the phone down on the surface, the system notices that there’s a Bluetooth-enabled phone, connects to it, and begins to pull the photos off the camera phone.
So, you can imagine that if I meet with someone face to face, they can put their phone down, I can put my phone down, and we can exchange photos or contact information or what have you. And the idea is that any device that’s placed on the surface is connected by your having placed it there. And then I can manipulate this photo rather like the map. And then when I want to disconnect, I just simply pick up the phone and walk away. So it’s very natural; we don’t have to mess around with Bluetooth pairing processes. And this is actually using a technology called Blue Rendezvous, which is also a separate project in MSR.
DAN LING: So, that’s great, and I understand that this was so exciting that another company has actually licensed this technology.
ANDY WILSON: So another company is licensing, TouchLight, that’s right, which is a related technology to this.
DAN LING: Right. And so this whole approach of using computer vision to enable new kinds of interactions is really enabling some interesting new capabilities in this demo here.
Thanks, Andy!
ANDY WILSON: Thanks.
DAN LING: So let’s continue on to the subject of having a lot of data and information and what you can do about it. So, one thing that you can do about that is to apply some machine intelligence, as Eric showed, and as part of his demo he also sort of showed that there were ways of using some visualization techniques, sort of improving the way the data is displayed so that it can really harness the capability of the human visual system.
And George Robertson here is going to show you a few more examples of that, crafted for very specific audiences, and I think the first set of audiences is actually programmers at Microsoft.
GEORGE ROBERTSON: I want to show you two things that we’ve done to help programmers. This first one is called Code Thumbnails, and it’s designed to help an individual —
DAN LING: I think you have to push a button or something.
GEORGE ROBERTSON: Oh, right.
OK. So this is Code Thumbnails; you’re actually looking at Visual Studio, which is an environment that a lot of programmers use. And what we’ve done is to replace the normal scroll bar for the editing window with this Code Thumbnail, which is a snapshot of the entire file that you’re looking at, shrunk down so that it fits on the screen. And essentially it replaces the scroll bar.
And the other interesting thing about it is that it highlights the methods and classes of interest. So if I want to move, for example, to this method, I just click on it and it will move me to that location. So it gives you a way of very quickly getting to arbitrary parts of this particular file that we’re editing.
Then we take these individual Code Thumbnails and lay them out on a desktop, arranged in whatever way is appropriate for the particular project that you’re working on. And so we’re taking advantage of human spatial memory here, making it easy for you to remember exactly where the relevant files are. And again, if I select one of these and click on it, the editor will move to that file and to that particular location in the file.
Another feature: let me go back to this file and to this particular method. If I do a find in files and look for this particular method name, it’s going to highlight all of the instances within this particular Code Thumbnail. And if we look at the desktop view, you see where all the instances of that particular thing occur and which files they’re in.
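As a small illustration of the thumbnail-as-scrollbar idea, the sketch below maps a click on a shrunken file image back to a source line by simple proportional mapping, and marks method definitions for highlighting. The function names and parsing shortcut are hypothetical, not the actual Visual Studio add-in code.

```python
# Illustrative only: map a click on a code thumbnail back to a source line,
# treating the thumbnail as a proportionally shrunken image of the whole file.

def thumbnail_click_to_line(click_y: int, thumbnail_height: int, total_lines: int) -> int:
    """Return the 1-based source line corresponding to a click at click_y pixels."""
    fraction = min(max(click_y / thumbnail_height, 0.0), 1.0)
    return min(int(fraction * total_lines) + 1, total_lines)

def method_markers(source_lines, thumbnail_height):
    """Pixel positions at which to highlight method definitions in the thumbnail."""
    total = len(source_lines)
    return [
        (int(i / total * thumbnail_height), line.strip())
        for i, line in enumerate(source_lines)
        if line.lstrip().startswith("def ")   # stand-in for real parsing
    ]

source = ["import os", "", "def load():", "    pass", "", "def save():", "    pass"]
print(thumbnail_click_to_line(click_y=300, thumbnail_height=700, total_lines=len(source)))
print(method_markers(source, thumbnail_height=700))
```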
So this tool helps an individual programmer get their work done. The next question is, how do we help a team of programmers get their work done? What we’ve done there is to create a system called Fast Dash. This is for shared team awareness, and it would be a dashboard put up on a display surface in a common location where everyone on the team could see it. It’s showing a lot of information about what’s going on with the particular project being worked on. The check marks here show all of the files that are checked out. The blue items are the files that are currently open and being viewed by someone on the team, and the picture on the right shows you who’s viewing the file. The orange outlines are the files that are actually being edited at the moment, and it even tells you which method is being edited. And then this is a particularly interesting case: the cross-hatched area is a warning sign saying that this particular file is checked out by two people. So that’s a potential conflict that you need to be aware of.
DAN LING: And this could obviously be used for other groups of people rather than programmers.
GEORGE ROBERTSON: Sure.
DAN LING: So collaborating on other things, an advertising campaign, or, you know, some other collaborative effort, right?
GEORGE ROBERTSON: That’s right, absolutely.
So those two systems help individual programmers and groups of programmers.
Another area that we’ve been looking at is helping in business situations. This particular system is called Schema Mapper. The concept was originally developed by the BizTalk product, and the problem they’re trying to solve is that in businesses you typically have schemas that describe sort of industry-standard information, and then you have a custom schema for a particular business or a particular application.
DAN LING: A simple example of that is when you’re importing a database, that usually has a schema that’s different from your own database.
GEORGE ROBERTSON: Right, right. So the problem there is that you need to define the mapping between this generic schema and your particular schema. What BizTalk did was come up with this basic idea: they have one schema, the source schema, on the left, the destination schema inverted on the right, and then a mapping in the middle that shows the relationship between the two.
And the interesting thing about this is that when BizTalk first introduced it five years ago, a number of other companies picked up the idea, and there are now something like eight companies out there that all use the same basic technique.
The problem with it is that customers immediately start working with larger and larger schemas and larger and larger maps, and very quickly get to the point where the mapping is just too confusing; there’s too much information there, a lot of clutter. That’s particularly true with the technique that was being used for tracing through: if I want to find out what this is connected to, I actually have to trace through and move things around, and it’s not very easy to figure out what it’s connected to.
So we have applied a number of visualization techniques, and I’ll just turn them all on so you can see the end result. The end result is that when I select something, it finds all of the connected items, highlights them, and de-emphasizes everything else. It auto-pans the map so that the interesting stuff is in the center, and it also auto-scrolls the tree view so that the interesting stuff is in the center.
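A toy sketch of the highlighting step George describes: given a selected schema node, collect everything connected to it through the mapping links and de-emphasize the rest. The schema fields and data layout are invented for illustration.

```python
# Illustrative only: when a schema node is selected, highlight every node it
# is connected to through the mapping links and de-emphasize everything else.

def connected_nodes(selected, links):
    """links: list of (source_field, destination_field) mapping pairs."""
    highlighted = {selected}
    for src, dst in links:
        if src == selected:
            highlighted.add(dst)
        elif dst == selected:
            highlighted.add(src)
    return highlighted

links = [
    ("Order/CustomerName", "Invoice/BillTo"),
    ("Order/CustomerName", "Invoice/ShipTo"),
    ("Order/Total", "Invoice/Amount"),
]
all_nodes = {n for link in links for n in link}
highlighted = connected_nodes("Order/CustomerName", links)
dimmed = all_nodes - highlighted
print(sorted(highlighted))  # selected node plus its mapped destinations
print(sorted(dimmed))       # everything else gets de-emphasized
```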
So we’ve taken a really difficult problem and turned it into something where the interesting stuff is very salient and very easy to work with.
DAN LING: Great.
GEORGE ROBERTSON: OK, so that’s a business application.
DAN LING: Now, you’ve been thinking about consumers as well.
GEORGE ROBERTSON: Yes. So, two things, two examples for consumers. This first one has to do with how a consumer gets to a large amount of information on a small device like a smartphone. So what we’ve built is this system called (Fathom ?), which stands for Faceted, Thumb-Based Interface.
And basically, the phone display is broken into three major regions. There’s a region here that shows the various facets, which are the metadata associated with whatever data you’re looking at. In this case, what we’re looking at is Yellow Pages information for Seattle and the Eastside. So there’s a hierarchy of metadata that you can explore in this part —
DAN LING: You should give some examples. There’s the category of the place, the location of the place, when they’re open, and so on.
GEORGE ROBERTSON: Yep. And then the region right above that is the current results of your browsing or search activity.
And one interesting thing is that we always show some results, so you’re never seeing an empty screen.
Now, the question is, suppose I wanted to find something like a really nice, say an expensive, Italian restaurant near my home. Because this is a faceted system, there are multiple ways to get the answer to that; there are lots of ways you could find it. If you were using a traditional search system, you would have to type in a bunch of keywords, which, as you all know, is very difficult on a phone. In this system, what I do instead is go down to category, then restaurants, then ethnicity, then Italian, and now I’m going to pin that one, go back up to the top, and go down the distance facet, so distance from home, say ten blocks. Now there are five different restaurants that fit. Now I can order these by price, which is what I was interested in. So Nick’s Italian Cuisine is an expensive Italian restaurant close to home.
DAN LING: That sandwich is probably not an expensive restaurant, even though it says that up there.
GEORGE ROBERTSON: Well, some of the data, we’re using real Yellow Pages data, but the Yellow Pages, you know, doesn’t have price information, so we kind of randomly introduced some price information so that we could develop this concept. The price information is not correct but the rest of it is.
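A compact sketch of the faceted-browsing flow George just walked through: pin facet values, filter on a numeric facet, then sort, all without typing keywords. The listings and fields below are made-up sample data, not the real Yellow Pages feed.

```python
# Illustrative only: faceted browsing instead of keyword search.
# Pin facet values, filter on a numeric facet, then sort, with no typing.

listings = [
    {"name": "Nick's Italian Cuisine", "category": "restaurant", "ethnicity": "Italian",
     "blocks_from_home": 6, "price": 4},
    {"name": "Trattoria Rapida", "category": "restaurant", "ethnicity": "Italian",
     "blocks_from_home": 3, "price": 2},
    {"name": "Sandwich Stop", "category": "restaurant", "ethnicity": "American",
     "blocks_from_home": 1, "price": 1},
]

def browse(items, pinned, max_blocks, sort_by):
    """pinned: dict of facet -> required value, e.g. {'ethnicity': 'Italian'}."""
    results = [
        item for item in items
        if all(item.get(facet) == value for facet, value in pinned.items())
        and item["blocks_from_home"] <= max_blocks
    ]
    return sorted(results, key=lambda item: item[sort_by], reverse=True)

for place in browse(listings, {"category": "restaurant", "ethnicity": "Italian"},
                    max_blocks=10, sort_by="price"):
    print(place["name"], place["price"])
```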
Okay, this same idea can be applied on a larger scale, too, and for that we created a system called Facet Map. This particular data that we’re looking at is actually information about Gordon Bell from the MyLifeBits project.
DAN LING: We should probably explain that a little bit. So, Gordon Bell is a principal researcher at Microsoft Research, and he’s been really interested in collecting almost everything he does in digital form and saving it. So that’s all pictures, all Web sites he’s ever seen, all phone calls he’s ever made, all documents he’s ever written, and so on. So he’s been sort of doing that as an experiment, essentially taking his entire life, digitizing it, and keeping it online.
GEORGE ROBERTSON: Right. And the original interface that was built for that was kind of a traditional text-oriented search engine, and we took the challenge of seeing if we could do a completely visual way of gaining access to this information. So this is again a faceted browsing and searching system. The major facets here are things like dates and people and type of communication, the time spent on a particular item.
So, if we dive in and look at an example: if we focus on the year 2004, what it’ll show on the right is all of the items that have that 2004 attribute, and on the left all of the remaining facets, given that you’re filtering on 2004. So now we can look at pictures within 2004, and then notice that the main facet that’s left is location, so we can go and look at, say, San Francisco pictures in 2004, and now we’ve gotten down to a smaller set. So we were very easily able to get down to some particular thing of interest in a very large database with just a few clicks and no typing at all.
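To illustrate the "remaining facets" behavior, the sketch below recomputes, after a filter is applied, which facet values are still available and how many items sit behind each one. The item records are invented for illustration.

```python
# Illustrative only: after filtering on one facet value (e.g. year == 2004),
# recompute the remaining facet values and the item count behind each one.
from collections import Counter

items = [
    {"year": 2004, "type": "picture", "location": "San Francisco"},
    {"year": 2004, "type": "picture", "location": "Seattle"},
    {"year": 2004, "type": "email", "location": "Seattle"},
    {"year": 2003, "type": "picture", "location": "San Francisco"},
]

def apply_filter(items, facet, value):
    return [item for item in items if item.get(facet) == value]

def remaining_facets(items, exclude):
    """Counts of facet values still present, skipping facets already filtered on."""
    counts = {}
    for item in items:
        for facet, value in item.items():
            if facet not in exclude:
                counts.setdefault(facet, Counter())[value] += 1
    return counts

in_2004 = apply_filter(items, "year", 2004)
print(remaining_facets(in_2004, exclude={"year"}))
# e.g. type -> {picture: 2, email: 1}, location -> {San Francisco: 1, Seattle: 2}
```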
Now, another interesting thing about this particular technique is that it’s completely scalable. It will work in whatever size window I put it in. So if I make it smaller, it will go smaller. The same visual representation and interaction techniques work regardless of the size, and in particular, it works on a really large display. That 18-panel display at the far end of the room has the same information that I’ve been showing you, just shown on a really large display.
DAN LING: But here you’ve been able to be much more explicit about many more of the facets.
GEORGE ROBERTSON: That’s right.
DAN LING: Okay, great, thank you, George.
So for our last station, we’re going to show you some of our work in the computer graphics and computer vision area. As you heard a little bit earlier, some of the computer vision technology was used in Andy’s PlayAnywhere system, but we’ve applied it also to digital media.
So about 10 years ago, Michael Cohen, who does computer graphics, ran into Rick Szeliski, who does computer vision, and they chatted and realized that each field could do only about 90 percent of what they wanted to do. For the computer graphics guys, the problem was how do I create really complex, realistic models that I can then show in a computer graphics system; and for the computer vision guys, it’s really quite the opposite, which is how can we extract a very complicated model from a series of pictures.
And I’ll turn it over to Michael for the rest of the story.
MICHAEL COHEN: Great, thanks, Dan. So as Dan alluded to, we both had problems in that we could solve about 90 percent of the problems, but the key was we couldn’t create models as rich as we find in the real world; in other words, describing everything in all its detail.
So, this launched a whole new area, called image-based rendering. So, for example, here we took this fuzzy lion and asked the computer vision researchers what we would get; we would get a model that might look like this. It would be a pretty good representation, but it didn't have all the little hairs. But at the same time, we could use every pixel of the photographs in combination with this model, using our traditional graphics pipeline, to create new images. And this launched an idea that was called the Lumigraph, which was one of the papers that Rick alluded to that showed up at SIGGRAPH in 1996; this is 10-year-old work.
And this is one of the original Lumigraphs. It’s as if we took a camera, we took a few pictures of this lion, and what we’re able to do then is look at this lion as if we were looking through a window. So all of these images that you see here are being assembled on the fly by picking and choosing pixels from all those different photographs that were taken of this lion.
Now let’s jump forward about seven or eight years, and I’ll show you where that work has led to.
RICK SZELISKI: So we're switching video over to this laptop here.
So, what we’re seeing here is a video called Massive Arabesque. And I’ll start the video playing and tell you what’s happening.
So in this video here we have a local break-dance group called the Massive Monkees, and we're filming them using a series of cameras. So, what you're going to see is the kind of freeze-frame effect that you've maybe seen in the "Matrix" movies, which some people call bullet time. And the way that's traditionally done is you set up a series of maybe 40 or 100 video cameras, and then by switching rapidly between the different video cameras, you get these visual effects. What we did in our tape is we only used seven cameras, so this is a much more manageable setup that you can imagine using commonly in studio situations. And we're using these image-based rendering techniques, which basically build up a 3D model of the scene, and then that allows us to fly a virtual camera around from place to place. So when you saw a few minutes ago the person being shown in sort of multiple views at the same time, that's because we have a 3D model, and we can blend different versions of this 3D model, render them together, and look at them from different points of view.
So it’s a natural evolution of the work that Michael showed you, where we started with a camera, just a regular photo camera, taking multiple shots, and now we’re doing it with videos, so we get the element of time back in. So we think this is a precursor to what in the future we’re going to see in terms of 3D television.
So I’m going to stop the video here, and Matt’s going to come over and talk a little bit more about some of our image stitching work.
MATT UYTTENDAELE: So we tried to give you a little bit of an historical perspective here on this work. OK, so digital photography is huge today, right, and digital cameras are great because they let photographers and people take lots of photos for pretty much free. And the question we asked was, can we apply some of our computer vision techniques to help people organize their photo collections and to make better pictures?
So, here's one application of that. We have a small group of photos here, and some of these photos are just individual snapshots. But for this set of photos here, the photographer's intent was to rotate the camera, because he couldn't capture this whole panoramic scene in one image.
So if we just select all the photos in this directory, and tell our software to help us organize these, let’s see what it does. So I selected everything there, and I’ll just have it automatically organize it.
While it's doing that, I'll show another collection of images where the photographer was changing a setting on the camera. Their intent, again: this was a very bright scene, with lots of detail in the highlights and, simultaneously, detail in the shadows, and they couldn't capture that in one image. So they took multiple exposures. But you don't really want that left as multiple exposures in your directory.
So our software’s gone off and organized this directory for us, and here it’s automatically found that panoramic view for us. And I’ll double-click on that and let’s see what it looks like. It’s taking a second to assemble. And here it’s assembled that automatically for us into the panoramic view that the photographer was intending to capture.
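This is not the MSR stitcher itself, but the same automatic-assembly idea can be sketched with OpenCV's stitching module, which matches features across the selected photos and, when enough of them overlap, warps and blends them into one panorama (the directory name here is a placeholder).

import glob
import cv2

# Load every photo the user selected (placeholder directory).
paths = sorted(glob.glob("photos/*.jpg"))
images = [cv2.imread(p) for p in paths]

# The stitcher finds overlapping features, estimates how the camera was
# rotated between shots, and blends the frames into a single panorama.
stitcher = cv2.Stitcher_create()
status, panorama = stitcher.stitch(images)

if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", panorama)
else:
    print("Not enough overlap to stitch these photos:", status)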
We’ll also look at this scene, which had the multiple exposures in it, and we call this a High Dynamic Range Scene. I can tweak it a little bit here. And now we’ve assembled this into a composite image that has detail both in the highlights and in the shadows.
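The high-dynamic-range composite works along these lines, sketched here with OpenCV rather than the MSR code; the file names and exposure times are made up for illustration.

import cv2
import numpy as np

# Three exposures of the same bright scene (placeholder files and times).
files = ["dark.jpg", "medium.jpg", "bright.jpg"]
times = np.array([1 / 500.0, 1 / 60.0, 1 / 8.0], dtype=np.float32)
images = [cv2.imread(f) for f in files]

# Recover the camera response curve, merge the exposures into one radiance
# map, then tone-map it back to a displayable image that keeps detail in
# both the highlights and the shadows.
response = cv2.createCalibrateDebevec().process(images, times)
hdr = cv2.createMergeDebevec().process(images, times, response)
ldr = cv2.createTonemap(gamma=2.2).process(hdr)
cv2.imwrite("hdr_composite.jpg", np.clip(ldr * 255, 0, 255).astype(np.uint8))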
So we've let people internally play with this software, and they've applied it to years and years of photo collections. This is just a small example. And the fun thing is that it helps you find similarities between photos that you didn't even know were there in your collection, and Rick will show the follow-on idea to this, Photosynth, later, where you're assembling links between photos.
So here’s sort of a collection of images I shot, or two images I shot last year in France that I didn’t even know overlapped with each other, but it automatically found it and created a panorama for us.
So another reason you might want to shoot multiple pictures is something we call a group shot, and it found this group shot for us in our collection. And Michael’s going to describe this to you.
MICHAEL COHEN: So the stitching software that you saw has shown up in Digital Image Pro and in the Expressions suite. We have another tool that you can download for free, which we're just putting up on our Microsoft Research site; it's called Group Shot. And as you saw, two of the images in that directory were these two, which are of our colleague Alex and his wife, Heather. Now, he took these by just holding a camera out at arm's length and taking a couple of pictures of themselves. The first thing that happened is it aligned those images, just like it was doing the stitching for that wide panorama.
But the problem is, of course, if you've ever taken pictures of more than one person, or even one person at a time, is that Heather looks good in this one with her eyes open, and Alex, of course, looks good in this one, and neither picture is really what you want. So let's just focus on Heather first.
We can look at Heather and say, “Well, this one looks good, so I’ll select this one.” And then let’s go back and focus on Alex. And as we focus on Alex, we can see, well, he looks good in the other one, so I’ll select that one. What the system has done is automatically gone in and found exactly the right place to cut between those two images so you get a seamless image that’s good for both of them.
Now if we were really picky, we’d go back and see this woman was walking by and we didn’t really like that. Of course, in the first image she wasn’t there. And now in just three clicks we’ve created the kind of image that we were really hoping to get, but never actually existed in any particular instant in time, but rather this moment of them standing on the beach is the one that you can —
MATT UYTTENDAELE: You can’t believe in photographs anymore.
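The Group Shot tool aligns the frames and then finds the best seam between them automatically with a graph-cut style optimization; a much-simplified sketch of the compositing step, with hypothetical file names and a hand-picked region instead of the automatic cut, looks like this.

import cv2
import numpy as np

# Two aligned frames of the same couple (hypothetical files): Heather's
# eyes are open in shot_a, Alex looks good in shot_b.
frame_a = cv2.imread("shot_a.jpg")
frame_b = cv2.imread("shot_b.jpg")
composite = frame_a.copy()

# Hand-picked rectangle (x, y, width, height) around Alex; the real tool
# finds the cut region automatically from a couple of clicks.
x, y, w, h = 400, 120, 220, 300

# Feather the mask so the pasted region blends in rather than showing a
# hard edge where the two photographs meet.
mask = np.zeros(frame_a.shape[:2], dtype=np.float32)
mask[y:y + h, x:x + w] = 1.0
mask = cv2.GaussianBlur(mask, (51, 51), 0)[..., None]

composite = (frame_b * mask + composite * (1 - mask)).astype(np.uint8)
cv2.imwrite("group_shot.jpg", composite)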
MICHAEL COHEN: So, let me turn this back over to Rick and take this idea to its next step.
RICK SZELISKI: So, one of the things that Matt showed you is that you can create these very nice big panoramas. And I want to focus a little bit now on the viewing experience, because you saw the authoring side.
And what we’re looking at here is one of these big panoramas. How many of you have been up to Deception Falls on the way to Stevens Pass? Do you know that waterfall? It’s a beautiful area. So this is a panorama I took. There’s my dad, because we were there a few years ago, and I took all of these photographs and stitched them using our software. But what we did in addition to that is I also had a video camera, and with the video camera, I basically stitched in the waterfall. So now you have this 360 panorama, which you might be familiar with for doing things like real estate tours, you see that very often, but we’ve actually added the video. There’s the sound. So you sort of get a real immersive sense of being in this three-dimensional environment, and it’s all done using a combination of photography and video like this.
So, this is one kind of immersive tour you can do. And this one I captured basically by taking a camera on a tripod and taking a few dozen photographs and then taking a video.
Now, the next one I want to show you is one where we’re actually going to do this for a home tour, and I’m going to launch this thing here. And what we’re looking at is a home, it happens to actually be the house that Dan was living in a few years ago, and we’re going to do this tour here where basically we’re going to go into the house and see what it looks like.
So again, for those of you who are familiar with 360 tours inside people’s homes, usually you’re just jumping from one room to another. There’s not this sense of movement, of continuous movement. And we basically can move through here. And look at all the details you can see. See, this is a beautiful home, and we can go around and select different rooms to go into, so we can either go left into the guest bedroom or stay straight across this bridge that’s in the middle of the atrium.
And look at all the visual effects. You see the reflection off of the copper floors. If we look this way, we have the reflections off of the fireplace to the right here. So there’s an incredible sense of visual richness that you don’t get just from still photographs. This is like a combination of gaming technology, because you really get to have free movement inside the thing, but it’s based on real-world imagery, so it has an incredible realism that in traditional games is very hard to get because there you typically just have surfaces with textures on them. So there’s an incredible richness here.
Now, in order to do this, we couldn’t take a camera and put it on a tripod and take a shot and move the tripod over. That would have been much too slow. So what we did instead was we built this special 360 camera. That red thing you see there is actually a video camera that looks in six different directions at the same time. You can put it on your head and walk around inside a home or a garden, and then, when you get it back to the lab, we process it and create this kind of a three-dimensional tour.
So this is really exciting for real estate applications, but it turns out it also has applications in documentary journalism, and Matt's going to tell us a little bit about how MSNBC used this.
MATT UYTTENDAELE: So, we did that home tour about three years ago, and developed the camera for that project. And the technology, well, the camera literally sat on my shelf for the last three years. We'd give lots of demos, but we weren't sure where this was going to be applied.
And I met some people from MSNBC who were doing a project called “Rising from Ruin” to document the devastation of Hurricane Katrina on two towns in Mississippi called Waveland and Bay St. Louis. And I showed them the demo of the home tour. And they were trying to tell the story of Hurricane Katrina in these two towns, and a single snapshot doesn’t really tell the story of the devastation, this wide-scale devastation.
So they thought that this technology could help them tell that story. And here I just brought up the “Rising from Ruin” Web site, and we’ll click on some of the footage that we captured down there using this 360 technology.
So here we go. We geo-referenced everything, so that every frame in this 360 video has a GPS tag associated with it, so we can see where we’re driving along in Waveland, Mississippi here. And at any point we can drag around and just see the devastation everywhere.
And we took a rental car, mounted the camera on the roof, and just drove around Waveland and Bay St. Louis for a few days to create this experience. And everywhere you can stop and drag, look around a little bit, with full immersive viewing everywhere inside this tour. It really, I think, helps tell the story a lot better than a single image could.
DAN LING: So this is live on the MSNBC Web site, so any of you can go there.
MATT UYTTENDAELE: I'm streaming off their Web site right now; the URL's up here in the address bar.
So we did a more somber home tour in Waveland. And there’s some audio with this too. So this is a homeowner taking us to his home. And as he describes how the hurricane affected his home, at any point you can stop and look around and see what he’s talking about.
One of the challenges here was the reporter who was doing the story wore the helmet on his head, and as you watch this, you don’t really see the head bobbing motion that he had. So one of the software challenges we had was to make this look like it was shot from a stable platform, even though it was shot from the top of somebody’s head. That’s some of the technology we were able to apply to this application.
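A rough sketch of that stabilization idea, not the MSR 360-video pipeline itself: track features between consecutive frames, estimate the per-frame camera motion, then smooth that motion over time and warp each frame by the difference so the head-bob drops out (the input file name is a placeholder).

import cv2
import numpy as np

cap = cv2.VideoCapture("helmet_footage.mp4")   # placeholder input
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

motions = []   # per-frame (dx, dy, rotation) of the camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good_prev, good_next = pts[status == 1], nxt[status == 1]
    m, _ = cv2.estimateAffinePartial2D(good_prev, good_next)
    motions.append([m[0, 2], m[1, 2], np.arctan2(m[1, 0], m[0, 0])])
    prev_gray = gray

# Accumulate the raw camera trajectory; smoothing it (for example with a
# moving average) and warping each frame by the smoothed-minus-raw motion
# yields video that looks as if it were shot from a stable platform.
trajectory = np.cumsum(motions, axis=0)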
Okay, and I think Mike, Michael is next.
MICHAEL COHEN: Great, thanks.
So, we were thinking about all this stitching software we had, essentially assembling imagery from multiple cameras, and thought, wouldn't it be nice if we could make really, really big pictures, pictures that are so large that we're able to tell a story with a single image simply by looking around in it. So what Matt and I did was we went up on top of a building in Seattle, and we had a rig that looks like this. We now have a new rig —
MATT UYTTENDAELE: It's over here.
MICHAEL COHEN: — you can see over here that we’re using for the future.
And what we were able to do was essentially to scan the world that we were looking at. And it looked something like this. This was taken from this building top on Capitol Hill. The nice thing about this image is that it’s not just a single image, it’s in fact a collection of a lot of things going on in Seattle on that morning in February.
So you can all see the pair of gloves in the image, right? No, you can’t really see it? Of course, there’s some construction sites down here, and so the workers are going to have to wear a pair of gloves as they do their work. And you all saw the owl up on the building, right?
The nice thing about this is we didn’t really find most of these things. We let a lot of people play with this image, and just looking over their shoulders, they kept discovering these various things hidden around Seattle. And it was only recently that we were watching somebody and they started zooming in here. And up on our wall upstairs we have a printout about six feet wide, about three feet high, but even on that one, if you put your nose right up against it, you would not see this airplane, which is probably flying somewhere over Japan, I think. (Laughter.) In fact, if you were to print out the image with the airplane this big, that photograph, that printout would be larger than the entire building that we’re sitting in.
So this is one way to assemble lots of photographs from a single point of view into a big, big picture.
I’m now going to turn it back over to Rick to show some of our latest technology for assembling different —
DAN LING: How many pictures was this, Mike?
MICHAEL COHEN: This was about 800 images. And in the end, what you’re looking at is a 4-gigapixel image, essentially about 4 billion pixels all assembled together into a single image.
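One common way to make an image of that size browsable, as in the zooming Michael demonstrates, is a multi-resolution tile pyramid: each level halves the resolution, and every level is cut into fixed-size tiles the viewer fetches only as you zoom and pan. The sketch below, using Pillow and a made-up file name, is illustrative only; the MSR viewer uses its own format.

import os
from PIL import Image

TILE = 256
Image.MAX_IMAGE_PIXELS = None                  # allow very large images
os.makedirs("tiles", exist_ok=True)

image = Image.open("seattle_gigapixel.tif")    # placeholder stitched image
level = 0
while image.width >= TILE and image.height >= TILE:
    # Cut this resolution level into fixed-size tiles.
    for top in range(0, image.height, TILE):
        for left in range(0, image.width, TILE):
            tile = image.crop((left, top,
                               min(left + TILE, image.width),
                               min(top + TILE, image.height)))
            tile.save(f"tiles/{level}_{left // TILE}_{top // TILE}.png")
    # Halve the resolution for the next, coarser level of the pyramid.
    image = image.resize((image.width // 2, image.height // 2))
    level += 1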
RICK SZELISKI: So, the last project we're going to show you is something called Photosynth. And this actually originated as a joint research project between Microsoft Research and the University of Washington. So, both Michael and I co-advise students at the University of Washington, and one of those students, Noah Snavely, his adviser Steve Seitz, and I developed a system called Photo Tourism, which was shown at SIGGRAPH this past summer.
And the idea there was to take large collections of photographs and actually see if they could be rearranged in 3D. So, a lot of the stuff we've shown you up till now are panoramas taken from the same point of view. With Photosynth you can start with a large collection of photographs, so as I hit this fly-around button, look for the little red triangles. Those are where the photographs were taken. So one of the developers on this project, (Jonathan Dugi ?), was in Rome a few months ago, and he took a few hundred photographs in St. Peter's Square, and all of these photographs are automatically registered by the system. The system figures out where the person was standing for each photograph and builds this kind of sketchy 3D model.
So we’re not trying to build a completely realistic 3D model like you might find in a computer game; what we’re trying to do is basically take the photograph and allow you to navigate the scene by moving from one photo to the next. So if I put my mouse in the middle here, I can say, “OK, let me see what’s to the right of this photograph.” And it basically brings in the next photo and lets you move from one photo to the next. So we can go and do a tour all the way around the side of the square, and you get a very strong sense of three-dimensional movement, but you never take away the beauty of the photograph.
So, just like with the home tour we had previously, we keep the photography around. And this is one of the common themes in this whole field of image-based rendering: letting the photograph, the realism of the real world, shine through. So, we basically go all the way back to the original Basilica here. And now if you want to see a detail, you can go look at this photograph and say, "Okay, I'd like to see a nice detailed shot of the cupola here," and you can do that. And again, anytime you want to, you can fly around and overlook the scene, and we can actually click on one of these little triangles and jump to some other part of the scene.
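The registration step Rick describes is structure from motion. The published Photo Tourism pipeline matches features across hundreds of photographs and refines all the camera positions together with bundle adjustment; its two-view core, recovering where one camera sat relative to another, can be sketched with OpenCV (the file names and the intrinsic matrix K are hypothetical).

import cv2
import numpy as np

# Two snapshots of the same square, taken from different spots (placeholders).
img1 = cv2.imread("stpeters_1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("stpeters_2.jpg", cv2.IMREAD_GRAYSCALE)

# Detect and match local features that appear in both photographs.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# With an assumed camera intrinsic matrix, the essential matrix gives the
# relative rotation and translation direction between the two viewpoints.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 480.0],
              [0.0, 0.0, 1.0]])
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
print("relative rotation:\n", R)
print("relative translation direction:", t.ravel())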
So this is the latest thing coming out of our group. This system, which has actually been built and re-engineered inside Microsoft, inside our Live Labs group, is called Photosynth, and we demoed it at SIGGRAPH; it's going to be going live in a few weeks, so that people can explore these collections on their own. And it's just another example of how research that starts off in the lab in a very academic context can quickly make it into Microsoft products. That's just the way we're set up here.
DAN LING: And, in fact, this is also a great example of a 10-year view of a research program: you can see how these ideas started, where it's all evolved over the past 10 years, and then some ideas for new things that we're planning to do with the technology. That's great.
So, thank you very much for coming today. I think we've tried to show you just a little bit about some of the projects going on at Microsoft Research, tried to give a bit of historical perspective, and pointed out some interesting ways that we've impacted products, licensed our technology, and also worked with universities, as in the case of this Photosynth project.
So, thank you very much, again, for coming. (Applause.)
KEVIN SCHOFIELD: Thanks, Dan, and thanks to all the researchers who helped with that. We’re going to set up now for a Q&A, bring Dan and Rick back up here. While they come on up, we’ve got one more video to share with you. This one won’t be quite as funny as the last one, but it’s exciting anyway.
You know, as Rick mentioned this morning, one of the things that’s very important for us here at Microsoft Research is making sure that the field of computer science is healthy, and those who follow this closely will know that there’s been a significant drop-off in computer science enrollment over the last few years, and that’s obviously something of great concern to us.
And one of the things that we’ve been trying to do is work with some of the association organizations, work with even the high school guidance counselors to try to get better information out there about why this field is so exciting, why we’re all in it, why is it a great career, a great field to be in.
And so actually the last thing I want to show you is a video that we put together, we’re actually sharing with the partners to really help sort of get the message a little better out about why we think this is such a great field to be in.
So let’s go ahead and play the video.
(Video segment.)
KEVIN SCHOFIELD: So there are some technologies in there that we actually showed today, and some technologies that don't exist yet, but that's why this is a great field, that's what research is about, really getting the opportunity to sort of create that future.
Let’s go ahead and bring Dan and Rick up, and open up the floor for questions.
QUESTION: So Microsoft is one of kind of a shrinking number of companies that maintain this kind of in-house research. I just would be curious from you guys why you think more companies aren’t doing this, and why that number has dwindled so much over the years?
KEVIN SCHOFIELD: The question is, Microsoft is among a shrinking set of companies that are actually investing in creating research labs like this, why are we still doing this, why is it an interesting thing for us?
RICK RASHID: I think there are a number of issues. One is that you need to, as a company, take a long-term view of what you’re trying to accomplish. I think Microsoft has historically done that. If you look at the investments we’ve made over the years just in our product teams, we’ve been willing to make substantial investments over many years. It took many releases, and almost a decade before Microsoft Word, for example, was the number one word processor. It took more than a decade, really, to get the original Windows NT to the point where that was a key product for the company. We’ve made long-term investments in areas like IPTV, which are now starting to pay off for us. Long-term investments in computer graphics which are now paying off for us in the work we’re doing in the consumer space with Xbox and Xbox 360, and so forth.
So, I think, as a company, we tend to take a long-term view, and I think when you take a long-term view, you see the value that basic research has in order to keep your company vital, and to keep it agile.
DAN LING: I think now we’re starting to see other companies be interested in research again. I mean, Yahoo’s research lab is a relatively new organization, but clearly they’ve also decided that research is an important thing for them as well.
RICK RASHID: I think, again, there’s a little bit of an ebb and flow in any area, but I think people have looked at what we’ve done, and the success we’ve had, and I think they’re going to look at that as a model for what they might want to do in the future.
KEVIN SCHOFIELD: Other questions?
QUESTION: (Off mike) can you or Dan talk specifically about [Windows] Vista, and what types of things in there represent things that started in your unit?
KEVIN SCHOFIELD: So, the question was, I'm just repeating it. The question is, we talked a lot about tech transfer, for example, this morning; are there any specific examples you want to elaborate on of Microsoft Research technologies going into Windows Vista?
RICK RASHID: I'll just start out. One thing I'm particularly pleased with is the fact that as part of the device developer kit for Windows Vista, we were able to work with the product teams to bring in a software proof tool that came out of Microsoft Research. So this was something that was originally within Research called the Static Driver Verifier, which came out of a project called the SLAM project, and the idea behind this software is that it will take mathematically defined properties and literally take a large body of source code and prove whether those properties are true for that large body of code. So this is the first case I'm familiar with where a proof tool of that type is being used on a large scale to ensure the quality of a particular piece of a large system. So I think that's certainly something I would point to. Really, this whole area of software proof tools is changing the way Microsoft, and I think other people, are now beginning to think about software development. The fact that we can identify specific properties of programs that we really want to be true or false, and then prove that to be the case, rather than test for it, changes the way people think about how they will develop software in the future, and it's changing our internal processes as we think about the tools that we give to our developers.
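The Static Driver Verifier works on real C driver source, with rules written in its own specification language, and it proves the property rather than testing it. As a toy illustration of the shape of those rules, for instance "a lock that is acquired must be released before the routine returns," here is a small checker that runs the same kind of state machine over a recorded trace of calls.

# Toy illustration of an SDV-style temporal rule: every acquire of a lock
# must be matched by a release before the routine returns. SDV proves this
# over driver source code; this sketch merely checks a recorded call trace.
def check_lock_rule(trace):
    held = False
    for call in trace:
        if call == "acquire_lock":
            if held:
                return "violation: lock acquired twice"
            held = True
        elif call == "release_lock":
            if not held:
                return "violation: release without a matching acquire"
            held = False
        elif call == "return":
            if held:
                return "violation: returned while holding the lock"
            return "rule holds on this trace"
    return "rule holds on this trace"

print(check_lock_rule(["acquire_lock", "do_io", "release_lock", "return"]))
print(check_lock_rule(["acquire_lock", "do_io", "return"]))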
DAN LING: It's easy, essentially, to underestimate the value of what Rick just talked about for [Windows] Vista, because when we look at the stability of the operating system, device drivers are a huge factor in that, because they work at very low levels in the operating system. They work with hardware. If device drivers aren't written well, they can add incredible instability. So just from the point of view of making sure that our operating system is really super reliable, we're able to take advanced technology like this and really apply it, not only to the device drivers that we write ourselves; this is a tool that we're actually making available to third parties now, hardware manufacturers and other software developers, to make sure that they're doing a really good job of building really high quality device drivers to work with our operating system.
That's an example of something that's really in the guts of the operating system, and how it was built. There's another really cool example, I think, of something that's kind of just fun. One of the things that Vista is really going to be targeted towards is helping people communicate better. So IP telephony is becoming really important; people are starting to use it more. A lot of people make voice calls using their computer over the Internet. One of the problems with voice calls like that is that very often the microphone that's available is kind of a tinny microphone and doesn't sound very good, and if there's a lot of background noise, that's all captured as well and sent across the line. So one of the things that Vista now supports natively is this whole idea of an array microphone, where you have multiple physical microphones, and the signals from all of those microphones are combined in such a way that you can focus in on the speaker, and actually get much better voice quality and much lower background noise.
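The simplest version of that array-microphone idea is delay-and-sum beamforming: delay each microphone's signal so that sound arriving from the speaker's direction lines up, then average, which reinforces the voice and averages down off-axis noise. Here is a sketch with made-up geometry and a synthetic signal; Vista's actual audio pipeline is considerably more sophisticated.

import numpy as np

SAMPLE_RATE = 16000       # samples per second
MIC_SPACING = 0.05        # metres between adjacent microphones
SPEED_OF_SOUND = 343.0    # metres per second

def delay_and_sum(channels, steer_angle_deg):
    # channels: equal-length 1-D arrays, one per microphone in a line array.
    angle = np.radians(steer_angle_deg)
    out = np.zeros_like(channels[0], dtype=np.float64)
    for i, ch in enumerate(channels):
        # Sound from the steering direction reaches mic i this much later,
        # so shift the channel back before summing.
        delay_s = i * MIC_SPACING * np.sin(angle) / SPEED_OF_SOUND
        out += np.roll(ch, -int(round(delay_s * SAMPLE_RATE)))
    return out / len(channels)

# Four microphones hearing the same synthetic voice plus independent noise;
# averaging the aligned channels keeps the voice and knocks down the noise.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
voice = np.sin(2 * np.pi * 220 * t)
mics = [voice + 0.5 * np.random.randn(len(t)) for _ in range(4)]
enhanced = delay_and_sum(mics, steer_angle_deg=0.0)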
RICK RASHID: I'd point out Windows Media Photo. We announced that a few months ago, and that's a technology that really allows you to do extremely high quality photos, and also reduce the amount of storage that's required to keep the photos. So that's something which I think will be of particular benefit to people who are excited about digital photography. And certainly the other media technologies we've delivered in the past have created things like Windows Media Audio, have led to the Windows Media Video format, and now VC-1, which is the basis for most of the really high quality, high definition video. I think those are, again, examples of how research technology has really impacted our products, and those technologies are obviously in Vista.
DAN LING: And another area, which is, again, more about how the system was built: pretty much since the beginning of Microsoft Research, as Rick mentioned, we've had a group looking at tools and technologies to help our programmers write better software. And we took a group of those people about three years ago and moved them into the Windows organization to really rethink the development processes in Windows, and how the system is actually built and engineered. A lot of that, I think, is really based upon technologies and ideas that grew out of Microsoft Research.
RICK RASHID: We could be going for a long time.
QUESTION: I want to talk a little bit about licensing. Microsoft has been increasing its licensing, and you mentioned TouchLight. How many products have been licensed, how much technology that's come out of Microsoft Research has been licensed in the last few years, and can you highlight a few of the more interesting ones?
KEVIN SCHOFIELD: How much technology has been licensed out of Microsoft Research over the last few years, and what are some highlights?
DAN LING: Well, this is a relatively new program, so this was probably the first full year of licensing that I think we did. So there's not that much history to it. A number of interesting things have already happened. There's the TouchLight licensing that you mentioned. There's another project that we did which involved looking at real-time traffic data, and then making predictions of when traffic jams either will dissipate and roads will clear, or, if the roads are currently clear, when they might become crowded. And this is based, again, on the technology I talked a little bit about, drawing on historical data, time of day, weather information, information about the Mariners game, things like that. That technology was also licensed.
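The licensed system was built on much richer probabilistic models over years of data, but the framing Dan describes, predicting congestion from features like time of day, weather, and whether there's a game on, can be sketched with scikit-learn and invented observations.

from sklearn.ensemble import RandomForestClassifier

# Each row: [hour_of_day, is_raining, mariners_game, currently_congested],
# with a label saying whether the road segment was jammed 30 minutes later.
# All of these observations are invented for illustration.
history = [
    [8, 1, 0, 1], [8, 0, 0, 1], [11, 0, 0, 0], [14, 0, 0, 0],
    [17, 1, 1, 1], [17, 0, 1, 1], [20, 0, 1, 0], [22, 0, 0, 0],
]
jammed_in_30_min = [1, 1, 0, 0, 1, 1, 1, 0]

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(history, jammed_in_30_min)

# Will the segment jam up at 5 p.m., raining, on a game night, currently clear?
print(model.predict([[17, 1, 1, 0]]))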
There was a project that we worked on called Wallop, which was a social networking piece of software that allows people to build multimedia blogs and that kind of thing, and that was licensed by a startup. I think there were several others.
RICK RASHID: I don't have a full list, although you probably could get one if you talk to the IP Ventures team at Microsoft. But I think the key thing from my perspective is that by opening up what we're doing to external licensing, it's allowing us to work more closely with the VC community and giving us new contacts into that community. I was at a VC conference last year, and it was surprising to me how many people came up to me and said, I'm really glad you guys are doing this, this is great. Who can we talk to? I think there's pent-up demand for some of this work, which can do more than just improve Microsoft products; it can potentially create new businesses for other people. And I think that's something of real interest.
I think also it's an opportunity for our researchers. We do a lot of projects, and not everything is necessarily going to become a Microsoft product tomorrow, so there's an opportunity to have another outlet for the research that they're doing, and to be able to have an impact on the rest of the world.
QUESTION: (Off mike) are these things that Microsoft itself is passing on, so they're second-rate products? And also, how do you make sure that it gets out there fast enough? Because there may be a time when Microsoft or the product teams sit on a bit of technology for a while, trying to hedge or decide whether they want to keep it, and whether they want to keep it proprietary and not give it to a competitor. It just seems like you guys are in a difficult position.
KEVIN SCHOFIELD: There are two questions around strategy. One, is there a perception, or do we need to sort of combat a perception, that the technologies we license are technologies that have been passed on by our own product groups? And somewhat related to that, what's the speed, how fast can we actually get these things out there for licensing?
RICK RASHID: I think in a lot of cases the people that we're talking to are interested in doing products that are really outside of Microsoft's space. When you're talking, for example, about traffic analysis, and managing and collecting large amounts of traffic data, that's not intrinsically something that Microsoft would necessarily do. And so I don't think there's a clear conflict there. When you're looking at display technologies, we're not a company that produces displays. There are a bunch of areas that we do research in where we may create a technology or develop a new idea that just isn't really in Microsoft's core businesses. It's also the case that there are technologies we license that are also being used within Microsoft products, so it's really not the case that we're saying, here are things that we're not doing. In many cases, we're working with people who are interested in licensing technologies that we're also using internally, perhaps for a different purpose. Again, I think that's a perfectly fine thing to do. I agree that there's always a bit of a tension whenever you're doing anything with external companies in understanding what exactly you are licensing, what is the IP that's being transferred, what software goes with that, or what pieces of technology may go with that. But that's just part of the business process, and it's no different for us than it would be for a university, in some sense, licensing technology.
QUESTION: (Off mike.)
KEVIN SCHOFIELD: The question is, does the IP Ventures team actually have a specific business goal or quota that they're trying to meet, sort of an incentive to make sure that as a whole we're actually making progress and driving research technologies out into licensing?
RICK RASHID: I think the simple answer is that they were created as part of Microsoft specifically to license technology. So, yes, their goal is to do that.
QUESTION: So they’re very motivated to do that?
RICK RASHID: I don’t actually know if they have quotas or not, that’s not something they would share with me.
QUESTION: (Off mike.)
KEVIN SCHOFIELD: The question is, just coming back to the timing issue again: how do we address the issues, or what concerns do we have, around timing? Is IP Ventures and IP licensing an opportunity for us to move faster on some things, essentially to find another approach so we're not missing the boat on a business opportunity?
RICK RASHID: I think there's always an issue with research. We're working, we're pushing toward the state of the art, we're doing new things. You may or may not get technology transfers exactly on a schedule, right, that's not really how it works. You're looking for opportunity. When we're working with our own product groups, for example, we're always looking at, well, what stage are they at in the development process? In certain stages it's going to be easier for them to take new technologies; in other stages it may not be. They may be particularly interested in moving into an area from a product perspective, or that may not be their focus right now, and it may be later that they're interested.
The same thing is true in the venture capital community. The question of, when this group of investors decides they really want to pursue a particular idea or not. That’s not something we particularly have control over. What we try to do is make sure that our ideas are out there, that there’s an opportunity both for our product groups, and now for outside ventures to take advantage of those. And we’re not really making a secret of it. I mean, as you said, we had many tens of thousands of users on the system going back quite a few years. So there’s no secret about what we’re doing. I think it’s just a question of when the particular opportunity presents itself. I think in the case of the traffic application we were mentioning earlier, that was a case where there was a local company that got really excited about the technology really early, and was able to pick it up and incorporate it within their product.
DAN LING: I would also say that as an organization, I think we can respond quite quickly to any of those opportunities. So when a product group gets really excited about some research work, or there is some opportunity that arises in licensing, we in research can usually respond very quickly to any of those demands. I think Photosynth was a good example of that, where it was shown at SIGGRAPH in August, and it will be available through the online products that we have in just a few months. That's a very, very short cycle.
RICK RASHID: I think it just depends on circumstances. Usually when I give my talks on technology transfer, I say you really have to look at the particular set of circumstances that exist and do your best to take advantage of opportunities when they present themselves, but you can't force technology on people. That doesn't work either.
KEVIN SCHOFIELD: Other questions? Okay. Well, thank you very much. We’re going to have lunch in just a few minutes, set up in the next room over, and an opportunity for you to chat with some of our researchers over lunch. I appreciate all of your time, and particularly those of you who traveled to come here today, and thank you, once again.
RICK RASHID: Thank you. (Applause.)