Remarks by Richard Rashid, Senior Vice President, Microsoft Research
Microsoft Research Faculty Summit
July 29, 2002
DOUGLAS LELAND : Well, good morning, everyone and welcome to the 2002 Faculty Summit. For those who weren’t here last night for the reception, thank you again for joining us and taking time out of your busy summer schedules to spend a couple days with us. I think we have an exciting agenda planned out for you today, tomorrow and for those of you who are staying on with us for the third day.
I’d like to kick off this morning’s proceedings by introducing our first keynote speaker. Our first speaker joined Microsoft in 1991 from Carnegie Mellon University where he was a professor of computer science. He’s well known for his work in network operating systems, and distinguished himself as director of the Mach project. The Mach kernel has influenced a broad number of operating systems and is still in use in a number of commercial organizations and laboratories.
Ladies and gentlemen, it gives me great pleasure to introduce Dr. Richard Rashid, Senior Vice President of Microsoft Research.
(Applause.)
RICK RASHID : Thanks, Doug.
OK, well, this is the second year in a row I get to be Bill’s warm-up act. I don’t know if that means I get to eventually graduate and have my own act, but I’ll have to make do with what I’ve got here.
I’m just going to speak very briefly and give you a sense of some of the things that we’re thinking about doing. Well, last year I had a slide pretty much exactly like this except it had a fewer number of people, smaller number of research labs and a smaller number of areas that we were working on, so we’re continuing to grow and, in fact, over this last year we’ve really continued to grow in a fairly dramatic way.
So just this last October we opened up our building on the Cambridge campus at Cambridge University in England. This is really a major milestone for us. When we started that research lab now about five years ago, we had our plans that we were going to be building a building right on the campus right next to the new computer laboratory and we finally moved in. I was actually out there a couple months ago and it’s a beautiful facility. And the great thing from my perspective is it puts us right on campus and right near the students and the faculty at the university and makes us really part of that environment.
We continue to grow our research lab in China. It is by far the fastest growing group that I’ve got now. And just over the last year they’ve published something like 200 papers in international journals. Last week at SIGGRAPH there were four papers from the China lab, so that’s a very major milestone for them. And we renamed it from MSR China, which really focused on China alone; now it’s MSR Asia, and really the focus of that organization is now on Asia broadly, and they’re working very closely with our subsidiaries in Japan, in Korea, Singapore, Australia and so forth, so they’ve really expanded their efforts in a significant way.
And interestingly enough, for those of you who don’t know, we’re actually an educational institution in China. We have a large number of graduate students working in a post-doctoral degree program there and we’re the only really outside organization, western organization that has the privilege of giving out degrees, and it’s the highest degree in computer science in China. So that’s actually quite interesting, so we’re kind of an academic institution there as well.
We also opened a new lab this last year in Mountain View. And when I first came to Microsoft I remember everybody was really excited, “Well, why aren’t you opening up a research lab in the Bay Area, why are you moving to Redmond?” and all these things. Well, we did a good job of creating a lab in Redmond. We finally thought the time was right to create a research lab in Silicon Valley, partly because of the growth of our campus in Mountain View. We now have a very substantial campus in Mountain View with almost 2,000 employees for the company as a whole and so we thought it made sense to have a research lab down there working with them, and so that lab is moving forward and it’s really focused on distributed systems and theory. And you’ll have a chance to meet some of the people from the lab later on today.
So overall this last year we grew by about 10 percent, which I think makes us one of the very few of the industrial research labs that’s actually grown during the last year.
Looking forward, we’re continuing to plan to grow for another 10 percent in the following year, so our fiscal year really starts basically now in July and ends in June. So during this next 12 months we’re going to be growing by roughly another 10 percent.
Most of that growth is going to be outside of Redmond. We’re going to be almost doubling our lab in Silicon Valley, so they’ll be getting closer and closer to being really a standalone research lab. They’ll be coming up on 25 or 30 researchers by the end of this next year.
We’ll be growing substantially in China. We’ll be growing in Cambridge and we’ll be growing some here in Redmond as well, but most of our growth is going to be outside of the Redmond area.
So we’re going to be continuing to grow and continue to start new initiatives.
One of the areas in which we’ve really gotten a tremendous amount of energy focused over the last six months is this whole area of Trusted Computing. As part of Microsoft’s Trusted Computing initiative we’ve put a lot of energy into bringing our technologies to the rest of the company to really try to improve the security and the quality of the software. And you’ll hear more about that during various presentations today, but I thought I would highlight that.
Another area of focus for us in terms of technology transfer over this next year is going to be really working significantly with the product teams in getting ready for the next really major release of Windows, which for us has been codenamed “Longhorn.” Some of you may have seen that in the press. It has nothing to do with Texas. It’s actually I think a bar between Whistler and Blackcomb in Canada, the two ski resorts, the two ski mountains there, and the last major release of Windows was codenamed Whistler. The one that is supposed to come later is Blackcomb, and the one in between is Longhorn. But we’re working hard with the various product teams to do that.
And I’ll just preface this a little bit: I’ve been working very closely personally with Lili Cheng and various people at her group on a project called Sapphire. You’ll hear more about that later.
But one of the things we’re really trying to do is rethink the whole relationship between the user and the computer and really change the nature of what information is kept and how it’s managed for the user so that at any point as you’re working related documents, related information, queries are maintained and that there’s a deep history of your interaction with documents, your interaction with the computer that’s kept at all times, and you’ll hear more about that later. I think Lili is going to be on stage as part of Jack Breese’s presentation.
Now, another area that we’re going to be expanding and even more rapidly in some senses than our basic research organization is university relations and our university engagement. We’re upping our commitment to universities over the next year by more than 35 percent, so it’s going to be a very significant increase. A lot of that will be in Europe and in Asia, but we’ll also be increasing our university relations activities here in the U.S.
One of the key things we’re going to be doing as part of that is really focusing on how we can work with universities to really improve pedagogy and improve scholarship and really move the state of the art forward in those areas. And I’m not going to talk a lot about that, because you’re going to be hearing a lot about that during the next few days, not just from people at Microsoft, but from a lot of our research partners in academia that are going to be talking about their work and some of the exciting things that are going on.
One thing I did want to talk about in this regard though is something that is sort of a personal area of interest or concern to me, which is how can we think as a field about how we can improve scholarship and improve pedagogy by really taking advantage of what is this sort of emerging change in technology, substantial new storage, much faster network and really this notion of emerging Web standards built around XML, built around self-describing data structures and information.
One of the things that’s always annoyed me to some extent as a researcher and as a professor is the extent to which so much of my community (I’m now speaking of the operating systems world) has really sort of created a lot of isolated islands. I mean, yes, people do send e-mail a lot back and forth, they do share information, but there isn’t the kind of sharing of lower-level experimental data and really ways of thinking about doing that that you might expect. And as a result, a lot of papers will often get published with a lot of information in them but you don’t really have the underlying data, you don’t really have the information that sits down behind that.
So today we’re sort of in this world, and I stole this slide from Jim Gray, who also shares my concerns in this area. We have this world where scientists and scholars are gathering information, they’re analyzing their data, they’re publishing their results, they’re putting that information out. They’re often sharing data files, but there isn’t really a standardized way in which that information is shared. Everybody has their data in their own format. The way they make their data available doesn’t necessarily lend itself to being manipulated or examined by others, and typically applications really operate on a single computer and people build their own and they sort of roll their own ways of analyzing or manipulating data, and there isn’t really the kind of sharing that you’d like to get.
What I think we’re moving to, and you can see a number of communities in the academic world beginning to do this, is a world where people really share information and publications in an online fashion, where they go to self-describing data structures built out of XML, have well-defined schema for sharing that information, and federate their large datasets.
So really what we’re talking about here is fundamentally taking advantage of the fact that you can now buy a terabyte of storage for under $2,000, and that we’re building, in the business world and increasingly in the academic world, sets of Web services around XML, around standardized schema, around self-describing data and information. Many of the things that we talked about in my area of distributed systems back in the 1970s we’re now really seeing have an impact, and I think that has an opportunity to make an impact in the scholarly world as well.
And Digital Rights Management becomes an interesting part of this picture because now we want to not just sort of share information but people would really like in some senses to be able to control who gets the information that they send around. In the early phases of working on a paper, for example, they want certain people to be able to work with them but maybe not necessarily to have that information widely distributed.
So Digital Rights Management becomes an issue for scholars not just for things like sharing music or sharing videos online, but also for being able to control the flow of their scholarship, control the flow of the information during the early phases.
So you’ll hear lots about Web services. I just wanted to put the stack up there.
The key part of this revolution I think is this notion that we can build standardized interfaces now, that the data can be self-describing, that we can write software that effectively looks at the data, discovers its structure and then extracts from it those things that it needs to do its work, and you’ll hear more about global XML architecture later as well. I’ll put up that slide. But again the notion is we build up a stack of protocols much like the sort of IP, UDP, TCP, HTTP kind of stack, but now built around building distributed applications or distributed services where people can build applications that work on their data and that share their information in a fundamental way.
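(Illustration added to the transcript, not part of the spoken remarks: a minimal Python sketch of what it might look like for software to look at self-describing data, discover its structure and extract only what it needs. The XML record layout, the element names and both helper functions are hypothetical, chosen only for this example.)

```python
import xml.etree.ElementTree as ET

# Hypothetical self-describing records; the element names carry the meaning.
SAMPLE = """<observations>
  <observation id="a1">
    <instrument>telescope-1</instrument>
    <magnitude unit="mag">17.2</magnitude>
    <note>hypothetical record</note>
  </observation>
  <observation id="a2">
    <instrument>telescope-2</instrument>
    <magnitude unit="mag">19.8</magnitude>
  </observation>
</observations>"""


def discover_fields(xml_text):
    """Return the sorted set of field names that appear in any record."""
    root = ET.fromstring(xml_text)
    return sorted({child.tag for record in root for child in record})


def extract(xml_text, wanted):
    """Pull only the wanted fields from each record; absent fields become None."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root:
        row = {"id": record.get("id")}
        for name in wanted:
            node = record.find(name)
            row[name] = node.text if node is not None else None
        rows.append(row)
    return rows


fields = discover_fields(SAMPLE)          # structure learned from the data itself
magnitudes = extract(SAMPLE, ["magnitude"])  # consumer takes only what it needs
```

(The consumer here never needs to know about the optional note field; it simply ignores what it does not ask for.)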
And then this notion of sharing large archives of data that people have been collecting. To make this more concrete, one of the early examples of trying to do this was the work that Jim Gray did out of our San Francisco research lab, putting the U.S. Geological Survey and Russian space agency images online in something called the TerraServer.
The TerraServer started out as a set of Web pages that people could just visit, and 60-some-odd million people have done that, and they continue to be one of the more visited sites on the Internet. But over time, and especially in the last 18 months to two years, they’ve moved the TerraServer from being a server to being just a part of a distributed service and really created for it a set of interfaces that you could access that are being used both in educational purposes, they’ve been used in courses at MIT and elsewhere, but also they can be used for business purposes and, in fact, the TerraService is actually used in production now by the USDA really helping people build, create reports for doing soil analysis, for looking at how plots are being laid out in a particular property and things of that sort.
So the idea is that you can now think about building applications acting against this kind of global data, data in disguise, and that brings me to the next example, which is more recent work that Jim and a number of people in the academic world and the National Science Foundation and the Sloan Foundation have all been a part of, and that is really trying to build a national virtual observatory as a set of Web services, really trying to create an online resource that any astronomer at any time will be able to get access to and to find information.
So the goal really here is to create what Jim Gray often refers to as the world’s largest, best telescope that’s always got a great view of the sky, which is really an online resource, a federation of databases taking data from Palomar, from the Sloan Digital Sky Survey and ultimately from many other telescopes and pulling them together, putting the information in a standardized, schematized way, making it available for access using fairly standard database techniques out on the Internet.
So the process that Jim and others have been going through here is to build what they’ve called a Sky Server, which is to get the data initially from the Sloan Digital Sky Survey and Palomar online, make that available, and many, many people at Johns Hopkins, at Caltech and in the NSF and the Sloan Foundation have been involved in that process, and then build a mechanism for doing distributed queries across this large federated data, with all of the sort of standard support for being able to build applications against that data so that you can use things like a SQL-style database to access the information, but you can also do your own programmatic work.
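(Illustration added to the transcript, not part of the spoken remarks: a rough Python sketch of the idea of one query running across a federation of archives that share a schema. It uses in-memory SQLite databases as stand-ins; the table name, columns, archive labels and data are all invented for the example and are not the Sky Server’s actual schema or interface.)

```python
import sqlite3


def make_archive(rows):
    """Create one stand-in archive that follows the shared (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects (name TEXT, ra REAL, dec REAL, magnitude REAL)"
    )
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    return conn


def federated_query(archives, sql, params=()):
    """Run the same SQL against every archive and tag each row with its source."""
    results = []
    for label, conn in archives.items():
        for row in conn.execute(sql, params):
            results.append((label,) + tuple(row))
    return results


# Two made-up archives; a shared schema is what makes one query work on both.
archives = {
    "survey_a": make_archive(
        [("obj-1", 10.0, -5.0, 14.5), ("obj-2", 11.0, -4.0, 21.0)]
    ),
    "survey_b": make_archive([("obj-3", 200.0, 30.0, 16.0)]),
}

# Lower magnitude means brighter, so this asks each archive for its bright objects.
bright = federated_query(
    archives,
    "SELECT name, magnitude FROM objects WHERE magnitude < ? ORDER BY name",
    (18.0,),
)
```

(A real federation would sit behind network services rather than local connections; the point of the sketch is only that a common schema lets the same query be sent everywhere and the answers merged.)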
So I just want to throw up a few examples. I’m not much of a SQL jockey, and I suspect probably very few of you are. I know some of you are actually, so I won’t say none of you are, but probably relatively few of you are. But it gives you a sense of the kinds of things that you can do now against this database.
Here is just a query to look for the top 1 percent rare star-like entities and he put the query down. I stole these slides from Jim, so I clearly don’t know enough SQL to put them together myself. But the point here is that this now is a fairly simple thing that you can say against this kind of a database, and you don’t have to write an elaborate program to do your own data mining.
Here it is: Find asteroids. Well, that’s again a fairly simple statement operating against this large database. And something, which periodically seems to show up in the news as something we should all be worried about, is looking for fast-moving asteroids that are in near-earth orbit or coming near the earth’s orbit and it’s a little bit more complicated bit of SQL to do this but you get the point, which is that this is actually something where now we can think about operating on this data in a standardized way using standardized techniques, whether SQL or some other programming language that might be available, and literally programming against the service of the sky, where the service is not just a single database but a set of federated databases that are built up over time.
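(Illustration added to the transcript, not part of the spoken remarks: the queries on the slides are not reproduced in this transcript, so what follows is only a toy Python and SQL sketch of the general shape of a "find fast-moving objects" query. The detections table, its columns, the threshold and the data are hypothetical.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per detected object with its apparent velocity.
conn.execute(
    "CREATE TABLE detections (obj_id INTEGER, row_velocity REAL, col_velocity REAL)"
)
conn.executemany(
    "INSERT INTO detections VALUES (?, ?, ?)",
    [(1, 0.1, 0.0), (2, 3.0, 4.0), (3, 0.0, 0.2), (4, 6.0, 8.0)],
)


def fast_movers(conn, min_speed):
    """Return ids of objects whose speed is at least min_speed, fastest first.

    Squared speeds are compared so the query does not rely on a SQL sqrt().
    """
    sql = """
        SELECT obj_id,
               row_velocity * row_velocity + col_velocity * col_velocity AS speed_sq
        FROM detections
        WHERE row_velocity * row_velocity + col_velocity * col_velocity >= ?
        ORDER BY speed_sq DESC
    """
    return [row[0] for row in conn.execute(sql, (min_speed * min_speed,))]


movers = fast_movers(conn, 5.0)
```

(The whole selection is a single declarative statement; no custom data-mining program is needed beyond the few lines that send the query.)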
And the point of bringing this up as an example is it’s something that we can begin to think of aspiring to in many different disciplines, as we start to think about how we can take the data that we’re collecting in our experiments, how we can take the raw information we’re collecting as we’re doing scholarship, and think about creating schemas for that information, begin to think about putting that online, making it available in a form that others can directly manipulate with software rather than simply something that they can read or can download and look at in some fashion.
Now, I mentioned the notion of trying to move scholarship forward. Another area that I again have a sort of personal interest in is pedagogy, both from my years in teaching and the fact that I work a lot with people in the university environment.
And here I think again there’s an opportunity to take technology and really have an impact on the way we share the fruits of our teaching materials and the things that we build to do the teaching that we do and to do the educational work that we do.
I remember coming up as a young professor at Carnegie Mellon. I always get this feeling like, “I bet somebody has done this before,” as you’re putting together your lecture materials and you sort of feel like, “Wow, I’m kind of reproducing a lot of work that other people probably have done.” Well, the opportunity I think exists now to again think about creating not online courses per se but, in fact, getting the raw materials in a standardized form so that we can think about creating sort of shared stores of pedagogical material that can be worked with and manipulated, and really also using that as a way of sharing best practices and best information between people.
Obviously you can use technology to extend the reach of the classroom. I’ve stolen these slides fairly shamelessly from Hal Abelson, but I have at least noted that I stole them from him. Basically the goal is to take things like the laboratory and make that remotely accessible. I know many of you are doing that in your universities. MIT has a substantial effort in this area and I think Hal will be talking about it later on, about the iLab project that they have, and they’re extending the reach of iLabs out to a variety of disciplines. It’s not just about computer science, not just about material science but many different forms of engineering and ultimately any area where you may want to be able to make available the equivalent of a laboratory in a remote way, possibly in a safer way than having people actually work in the real lab itself.
Here is just a little slide from the Micro Electronics Lab that they have online and that many students are using at MIT.
Another way of again sharing best practices and sharing work across universities, this is an example of something that’s being done with MIT and a number of other universities where they’re actually doing shared online assessments. These are assessments of writing for again looking at student submissions and being able to allow a number of faculty and students at various universities to analyze and provide assessments of that material in an interesting way. So again this is a pilot project but it’s an example of ways in which we can begin to think about sharing the work that we’re doing in building our environment.
I’ve mentioned I think the notion of sharing pedagogy in a raw form. MIT has the Open Course Initiative. You’ll hear more about that. They also have something called the Open Knowledge Initiative where they’re talking about creating an open-software architecture for sharing a wide variety of different kinds of information.
So to sum up, what do I see as our role at my organization moving forward? Well, we’ve always had the first two bullets there as our sort of key mantras, which is first and foremost our goal is to improve the state of the art in all the areas where we’re doing research. We publish thousands — thousands of papers a year and are really trying to make sure that the work that we’re doing is available, is out there for people to build on and work with and we’re committed to move the state of the art forward in the many different areas that each of us does our own research.
We’re also obviously trying to make Microsoft products better and to create new products and new product categories, and we’re putting a huge amount of energy into tech transfer. And it really has an impact. One of the things that we do each year is something that we call Tech Fest, which is our technology festival, where we get in this building actually we get the entire building, we set it up as kind of a research fair and have people from campus come by for a day — initially it was one day; this last year we just did it for two days, and the response from the Microsoft employees is tremendous. We had over 5,500 Microsoft employees attend the last event and that’s more than 20 percent of Microsoft’s employees in the Puget Sound area showed up for this, so you could see the amount of interest that Microsoft’s employees have in research and in new technologies. It’s just tremendous and it sits across the board.
One of the things we did this last year that was new was we put in a number of lectures, presentations and I think we had 10 different lectures that we did to really go in depth into a number of different topics for people. We kept records, of course, as to who was really attending these things and I think there was something on the order of four people that attended all ten lectures and we were pretty surprised about that. That seemed like a lot. Next year we decided that we’ll actually give out a special award for people that do that. My recommendation is to call it the Iron Butt Award, but I don’t know whether we’ll really go with that one or not.
The last point I think has become an increasingly important part of what we think of as our efforts here in our research organization, which is to work with you, to work with universities, professors, students. We bring a lot of professors into Microsoft Research as visitors and interns, but also to reach out into the academic community to improve education, to improve scholarship, to really create the technologies that are going to be the future of our field.
I don’t know if any of you saw Steve Lohr’s New York Times article just recently. I contributed to that a little bit with a quote or two. But I think this is a time where, especially last week it just seemed like it was a really depressing week. Every day there was some new company seemingly going bankrupt, everybody talking about the vacancy rates in Silicon Valley, which I think are now 26 percent of the buildings there, the downturn in the economy, the stock prices going down and so forth.
And I think sometimes you can lose sight of the fact that our goal as researchers, our goal as educators and professors is really to keep the future coming, right, to really continue to create the technologies and create the excitement that is going to turn that around and really move things back in a positive way. And I think the news over the weekend of the miners being rescued, maybe that’s a symbolic event, maybe things will really turn around. I know the stock market was up for a little bit this morning; I don’t know if it’s still up or not.
But again I think the key thing here is we have to work together to make sure that our field continues to be an important part of people’s lives, that it continues to help people and help our economy and our nation be prosperous, or nations for those of us from all over the world, and to really make sure that the world is a better place.
And so with that, that was my inspirational message for the day.