Craig Mundie, Microsoft Research Faculty Summit 2005

Remarks by Craig Mundie, Senior Vice President and Chief Technical Officer, Advanced Strategies and Policy, Microsoft Corporation
“The Future of Technology” Microsoft Research Faculty Summit 2005
Redmond, Washington, USA
July 19, 2005

CRAIG MUNDIE: Good afternoon, everyone. I have two parts to this presentation. The first one is quite short, and the second one gets a bit more into what I think some of the big issues are that we’re all going to face together. Let me deal with the first one directly.

Many of you, of course, are involved in teaching computer science, and until now, if you wanted to teach anything about the internals of Windows, you really didn’t have much to work with. We’ve been asked for many, many years to make some type of arrangement that would allow more curriculum and more course work to be developed around Windows. And the reasons people have an interest in that range from a purely technological interest in better understanding how the system works to the fact that, frankly, because there’s such wide usage of these products on a global basis, there’s a lot of interest in having more people who are just well trained on these systems. But the proprietary nature of the product, the way in which we have essentially maintained these from a source control point of view in the past, has made it very difficult.

But recognizing that there is probably a win-win situation here if we could find a way to deal with it, we started about two years ago to move beyond this situation, where all you had was a book that a select group of people could produce that would tell you about Windows internals, and to ask, How can we create a way for you to have more direct access?

Four years ago in May, I announced that the company was going to formalize a number of its source-related activities into what we have now called our Shared Source program. And the idea, going back to 2001, is that we would try to create licensed access to our key intellectual properties, including source code, and offer them to different constituent groups in a form consistent with their needs and consistent with our ongoing requirement to control that intellectual property. It’s been very, very successful. We’ve ranged from pretty much complete free access for academic purposes to things like the source of Windows CE, to much more controlled things like what we do in the Government Security Program on a global basis, where we have offered 60 governments complete access to the Windows source code for the purposes of doing security analyses.

In the case of academia, we’re announcing this week that we have expanded this into a new Shared Source program that deals directly with providing access to the toolkits and sources necessary to develop and teach programs around the Windows system. And there are a number of components to this. There is a curriculum resource kit that allows you to teach introductory as well as advanced courses in this area. We did a thing we called Project Oz, which, when taken in conjunction with this curriculum kit, allows people to actually design and extend important components of the operating system, gives them the tools with which to build them, and then lets them run the resultant derivative work for training purposes. And then we’ve got a research kernel that you can use in conjunction with this kit to allow people to do design projects, to build experiments, to instrument them, and to do other things that we think are interesting, both in teaching advanced courses and, in particular, in doing operating system research in new areas.

So we’re quite enthused about this, and we’ll continue to look for more ways to do it. But this is a fairly comprehensive set of materials. It’s not slipping a set of sources over to you and saying, Here, have at it. It’s been quite thoughtfully produced in terms of these curriculum kits. I was reviewing the other day a printout, which is like the Manhattan phone book, of slide decks and other things that you could take and assemble into your own custom materials, to incorporate this as part of an existing course or, in fact, to build completely new courses. So we’ll look forward to getting feedback from you. I think it’s an important step forward in allowing more people to understand what it is that we do, and are planning to do, and also in ultimately enlisting more help in understanding how we can address some of the changes that we all face.

So let me use that as sort of a segue into what I think some of the biggest challenges are that, as an industry and as an academic community, collaboratively, we’re all going to have to come to grips with. I call this talk “Overcoming Concurrency and Complexity Through Composability.” And those three “C” words — none of them are new. We’ve all dealt with complexity in many ways. We all deal with concurrency to some extent in computing today. And there have been a variety of ways that we’ve looked at composition as a way to improve the process of producing software.

Unfortunately, though, my view is that we have never really fundamentally focused on the three C’s as something we had to get a handle on. They are things that we sort of encounter and deal with in an ad hoc way, but we haven’t really focused on developing a fundamental way to address either the problems or the opportunities that are created by concurrency. In particular, Microsoft struggles, I’ll say, with the complexity problem. Our business is to regularly produce some of the world’s largest monolithic applications and operating systems. And because of their size and the capacity of the machines and our ability to use that capacity, these things have gotten really big and complicated.

And to some extent we’ve been able to invest through our research organization and our tools organization to develop internal capabilities that allow us to try to manage that complexity. But if you really go back and look at it, in my opinion, we, and the industry and academia at large, have never through the last decade fundamentally addressed the question of how you make software an engineering discipline akin to many others that have been around for a lot longer, where composition, in a formal sense, is very much a key part of how you can assume you get something that works at the end. Today the complexity has manifested itself in many different types of failure modes. It adds a lot of cost to the testing of these products. And as we have come to grips with problems in the Trustworthy Computing Initiative, for example, like security and privacy, it rears its head over and over again: it becomes very hard to make any kind of attestation about certain aspects of these very, very large programs.

And so I think that as a discipline, we really need to take some fairly radical steps in the decade ahead to make software move to be more of a “true,” I’ll say true in quotes, engineering discipline.

So let me explain why I think that the time has come where we can no longer finesse this problem. It starts with really an understanding of where we’re at in the evolution of the CPU itself, and I’ll start with just the physics of it. Everybody knows and talks about Moore’s Law, and actually if you talk to most people, maybe not in this room, but if you talk to people on the street, and ask them if they thought they knew what Moore’s Law was, they would actually mostly tell you that it’s that thing that says that computers get about twice as fast every 18 months.

Now it turns out Gordon actually never said anything about how fast your computer would get; he actually only talked about how the transistor density would double approximately every 18 months. That’s only part of making the computer system go faster, or have more capability. A corollary has always been that you get the clock rate to go up, too. And a large part of the apparent performance benefit, not capacity benefit per se, but performance benefit that has accrued to software through the years has come from the increasing clock rate coupled to the increasing transistor density. And to some extent, the density has allowed more complexity, and the complexity has been used to do other interesting things in the architectures. But at the end of the day, clock rate has been a key factor.

The unfortunate truth is that the progression of clock rate is largely going to slow down, if not almost stop in its entirety. And, in fact, there’s even a chance that for a single CPU the appearance will be that clock rate goes backwards, or at least just kind of holds where it is. Why is that true? The answer is, for clock rate to go up in CMOS processes, we essentially had to lower the voltage at the same time. You could raise the clock rate, but it produces more heat, and at some point it becomes either too expensive or too inconvenient or impractical to get rid of the heat. So you want to do something at lower heat, so you lower the voltage. Unfortunately, we’ve lowered the voltage about as far as we can go in these processes, down to sort of the electron volt levels. And so, if you want these things to be reliable, if you want a transistor to do the same thing every time, you can’t really keep shrinking it that much more in both geometry and voltage, and so we’re sort of out of gas in terms of our ability to see clock rate go up, because voltage isn’t going to come down.
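
In rough, first-order terms (a standard approximation rather than a figure from the talk), the switching power of a CMOS chip scales with clock frequency and with the square of the supply voltage, which is why the end of voltage scaling caps the clock:

    % First-order CMOS dynamic-power model (illustrative approximation,
    % not from the talk):
    %   \alpha = activity factor, C = switched capacitance,
    %   V = supply voltage, f = clock frequency
    P_{\text{dynamic}} \approx \alpha \, C \, V^{2} \, f
    % Shrinking transistors historically let designers lower V, which paid
    % for raising f within a roughly fixed heat budget. Once V can no longer
    % be reduced and still switch reliably, any further increase in f shows
    % up almost directly as additional heat.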

Unfortunately, we can’t just say, well fine, we’ll just let them get hotter because a lot of the applications that we want to deal with, particularly in the mobile space and the entertainment space, lobby against the idea that we’re going to have more and more exotic cooling systems. The appetite for people all over the planet to have access to this kind of computing at lower and lower prices also lobbies against more and more esoteric cooling capabilities.

I remember in the days when I was in the supercomputer business, Seymour Cray, who was obviously one of the kings of supercomputing when he did the Cray computer, announced that his greatest achievement wasn’t the computer architecture or anything else, it was the cooling system, and the way in which he could remove the heat from the machine. And to some extent that problem hasn’t really changed. If you look at this graph, it’s pretty interesting. Pat Gelsinger, who is a friend of mine and was until recently CTO of Intel, is now running their entire chip business. Back in the spring of ’04 he put this up at a developer forum and pointed out to people that we’ve of late been on an almost exponential increase in the power density of the CPUs, these chipsets. And it’s indexed with the red arrows against how hot other things are in terms of their power density, in watts per square centimeter. And way back around the 486 time in the ’90s, we passed the hot plate. So all you had to do was make these things big enough and you could cook on them.

Then if you keep on going, as you get into the class of Pentiums, we’ve passed nuclear reactors about this year, and we’ll get to rocket nozzles in the next couple of years, and at that point you’re only an order of magnitude away from the surface of the sun. So at some point there’s a real practical limit to how we can cool these puppies. And so we kind of realized something has to give, and what we think is going to give is the assumption that we can just clock them faster and deal with the heat. So if we’re going to have some tonic on heat, we’ve got to have some tonic on voltage.

What does that really mean? It means that CPUs, as we know them, are not going to get faster by dint of clock rate improvement. This basically shows what the implications of that are going to be over time, and how that relates to other phenomena we’ve all been dealing with from a system design and architecture point of view for some years. If you look over the last 15 years or so, from 1990 to the present, the graph on the left shows the clock rate of the CPU plotted as a gap against DRAM access times. And what this means, as you all know, is that memory appears to get farther and farther away, in clock ticks, whenever you take a miss. So what have we done about that? Well, we’ve tried to put more and more sophisticated intermediate memories in. And that gets complicated, and that’s complicated just in one machine. But then when you want to say, I want to have multiple processors, I still want memory coherence, and I want cache coherence, the complexity of the computer system itself, before you even get to the software, is essentially going up at very significant rates.

If you look at the relative improvement, which is the graph on the right, what you can see is that the microprocessor, memory, disk, all of these things have essentially gone up in bandwidth, but of course there’s no corresponding improvement in latency. And, because of that, things just appear to get farther and farther away. And so a great deal of energy has gone into both software systems and hardware systems to try to mask this latency, but at some point that becomes quite challenging as well.
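
To make the latency point concrete, here is a small, hypothetical Go sketch (an illustration added for concreteness, not an example from the talk) contrasting a streaming pass over a large array, which prefetching and bandwidth handle well, with a dependent pointer chase over the same data, where each step must wait out the full memory latency before the next address is even known:

    package main

    import (
        "fmt"
        "math/rand"
        "time"
    )

    func main() {
        const n = 1 << 24 // 16M elements, far larger than any on-chip cache

        // Streaming access: the hardware prefetcher can hide most of the
        // DRAM latency, so this is limited mainly by memory bandwidth.
        data := make([]int64, n)
        for i := range data {
            data[i] = int64(i)
        }
        start := time.Now()
        var sum int64
        for _, v := range data {
            sum += v
        }
        fmt.Printf("sequential sum %d in %v\n", sum, time.Since(start))

        // Dependent pointer chase over a random cycle: the next address is
        // not known until the previous load completes, so every step pays
        // close to the full memory latency and nothing can hide it.
        next := make([]int32, n)
        perm := rand.Perm(n)
        for i := 0; i < n; i++ {
            next[perm[i]] = int32(perm[(i+1)%n])
        }
        start = time.Now()
        idx := int32(0)
        for i := 0; i < n; i++ {
            idx = next[idx]
        }
        fmt.Printf("pointer chase ended at %d in %v\n", idx, time.Since(start))
    }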

And so I think one of the more profound things that happened, happened last fall, on Oct. 14, when this headline hit the press: “Intel Cancels Top Speed Pentium 4 Chip.” In a way, this was a very, very significant event, because Intel had always made their products faster and faster, and they had a road map, and they were actually building this chip. It was actually getting fairly near production. But, in a sense, they kind of realized that they could build it, they could deliver it, it would be a little hotter, but at the margin, in terms of how fast an application appeared to run, it wasn’t that much faster. And so the cost to develop it and the market appeal of it, you could say, were coming to be out of balance. And so, in what I think was a very important shift, Intel said, we’re not going to make any more single-processor machines; we’re going to turn our energies toward making chips that have multiple cores on them. And AMD has done a similar thing. And this year we saw the advent of the dual-core machines from Intel and from AMD.

And so, again, what you start to see is things that used to happen at a very macroscopic level — first it used to be separate machines, then we built the symmetric multiprocessors, and now we’re moving some of that onto single chips — and that trend will accelerate, because Moore’s Law continues to give us a doubling of transistor density, and therefore we can put more stuff there. But to do anything interesting with it, we’re really at the end of the line relative to the traditional computer architectures, or so it seems. Which isn’t to say that there won’t be other clever things that come along, or that there won’t be some revolution that will be created. But it’s pretty clear, from the efforts over the last few decades in scientific computing, and supercomputing writ large, that there certainly is a class of applications that can benefit from parallelism.

And the real challenge for us is to decide whether there is a much larger class of things that could benefit from concurrency if, in fact, we really were motivated to go get it. Many of the things that we do in our lives every day, our sensory systems, the whole world around us, sort of run in parallel all the time — no one is taking locks, or worried about semaphores, it just works. But, unfortunately, the way that we mimic these things in software started with the assumption that hardware was a scarce resource: we multiplex the hell out of it, we have procedural programming systems, and all of these things have conspired to create an end-to-end system that’s really not very friendly toward extracting concurrency out of software systems. And so we now have to figure out what we do about that.
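
As one small, hypothetical sketch of what a more concurrency-friendly style can look like (a Go illustration added for concreteness, not something proposed in the talk), work can be spread across however many cores are present by independent workers that interact only through messages, with no locks or shared mutable state to reason about:

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    // expensive stands in for any CPU-bound piece of work.
    func expensive(n int) int {
        sum := 0
        for i := 0; i < 1_000_000; i++ {
            sum += (n * i) % 7
        }
        return sum
    }

    func main() {
        jobs := make(chan int)
        results := make(chan int)

        // One worker per core: no locks, no shared mutable state; workers
        // interact with the rest of the program only through channels.
        var wg sync.WaitGroup
        for w := 0; w < runtime.NumCPU(); w++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for n := range jobs {
                    results <- expensive(n)
                }
            }()
        }

        // Close the results channel once every worker has finished.
        go func() {
            wg.Wait()
            close(results)
        }()

        // Feed the work in, then collect the answers.
        go func() {
            for n := 0; n < 100; n++ {
                jobs <- n
            }
            close(jobs)
        }()

        total := 0
        for r := range results {
            total += r
        }
        fmt.Println("total:", total)
    }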

So it’s clear that change is coming. You can say people are looking at what the architectural implications are of cutting the design teams loose in the hardware design companies and saying, OK, go get more creative about what you put on there; instead of just trying to make one thing faster and building memory system accelerators around it, let’s try to build some specialty processors. So one example you can already see is the work by Sony, Toshiba and IBM on the Cell architecture, where they basically have built a machine that has one traditional CPU core, the Power architecture, surrounded by eight specialty cores. It has a different memory organization than you’d see in a traditional symmetric multiprocessor system. And they can posit that just through transistor scaling and other things, they’ll be able to put more and more of these elements on an individual chip, and then package more and more of these chips in larger and larger systems.

Again, no more than a year ago, Pat Gelsinger at Intel said, We see a really significant shift in what architectures are going to look like in the future. Fundamentally, the way we began to look at doing things is to move from instruction-level concurrency to multiple cores per die. What that does is essentially move the problem up a level and say, Well, great, now you’ve got a lot of CPUs down there, they’re less expensive, but they don’t really accelerate all the traditional applications. There’s a certain class they do, but there’s a very large class that they don’t.

Clearly this gets cascaded up into what we think the aggregate computer system looks like. This is what IBM, as I understand it, is talking about with Blue Gene/L, where you start with these cell architectures and you essentially scale them up at the chip, the board, the cabinet and the integration of cabinets, and more and more what you’re looking at is an interconnection system that is routed packets, if you will, or message-oriented. And what’s interesting is, the manufacturing phenomenon in the hardware is going to continue to have things essentially collapse down from right to left in this picture, and it won’t be that far away that what today is certainly a board, if not a cabinet, full of CPUs will collapse down to be on a single piece of silicon. And as that cascading happens, you move away from saying, Hey, this is a very, very expensive machine, put in a special place, and used for special problems, to saying that the architectural implications of that will essentially accrue to all of us who want to write standard software to run on desktops, laptops, televisions and other things. That’s where I come to the conclusion that we don’t really have the tools, and we really haven’t thought deeply about how we’re going to move the world’s 10 million professional programmers, let alone the amateurs and the students and everybody else, to thinking about a model that gets something out of this.

Today you could say that under the traditional model of software, and the tools that we’ve used to write it, as long as the clock rate went up, and if it could have continued to go up, it was just a free lunch for software from a performance enhancement point of view. But, unfortunately, since that isn’t going to happen, what we’re going to do is take this big right turn and say, Well, we’re going to have more cores, but we’re not going to have all that clock rate enhancement. Unfortunately, the right turn doesn’t produce the free lunch for traditional software. It doesn’t get faster merely because of the work of the underlying engineering. And I think that the implications of this are not yet largely reflected in what you see happening in the industry, and I would specifically say to you they’re not yet very well reflected in what I see happening in academia, at either the research level or, certainly, the core level of training people for these future work environments.

So we’ve been thinking about this for quite some time. We see paradigms that have emerged when the zoom factor is sort of at the level of the systems we’re building today, at the data center, at the Internet scale. And it forces me, at least, to try to think through the questions that say, well, when things that used to be data-center-class problems collapse down to be one machine, and things that used to be big SMP machine problems collapse down to be single chips or desktops, how do some of the things that we’ve done inform our direction there, and what other things are going to have to be done?

One of the things we’ve looked at is, if you take a very simplistic view of what a computer is, and you say it’s just three things, storage, processing and interconnect, then the challenge for a computer architect and a computer user is, given any particular set of constraints, to design the optimal computer and the optimal application for that computer. One of the things that’s been clear to us, and certainly a benefit to our business, has been that most of the world’s computing capacity accrues at the edge. And as we have moved to interconnect where the Internet became the interconnection, it enabled a new and larger class of problems to be solved that way, and the people who are trying to build, for example, grid computational facilities are just looking at this with a very large zoom factor and saying, We’ll see what happens.

Now, what’s happened at that level is that processing, courtesy of Moore’s Law plus the clock rate improvement, has appeared to be on an exponential track. What we now know is that processing capability will stay on that track, but single-processor performance will fall off of it. Storage (you could essentially call it Shugart’s Law) has also been on an exponentially increasing track, at least as to the capability of rotating magnetic media. We’re also continuing to see very rapid improvements in the capacity of semiconductor-based storage, so that will continue; it gets smaller and cheaper. Interconnection, in terms of bandwidth, is obviously going up too. At the wide-area level, though, it is not the case that it will necessarily get cheaper over time. When you start to deal with the long latencies, the regulatory environment, and the fact that rocket launchers and backhoes don’t follow Moore’s Law, and that you really have to have people dig up the streets, we haven’t seen, and I don’t expect to see, the same kind of exponential improvement or reduction in the cost of bandwidth.

And so at the macro level, that leads you to the conclusion that if you’re going to build these distributed systems, you have to put the data as close as possible to the processing. And a lot of the work that various people have been doing is thinking about how you partition problems, move them out there, and get them done. Now, if you actually ask, how does this evolve? When I say, Look, it’s not the Internet anymore, it’s not even the data center, this whole thing just collapsed down and it’s on your desktop, then, in fact, it isn’t clear what conclusions you would draw. When you’re not out there having to dig up the streets to get the bandwidth, you’re just essentially trenching on a die, and that’s a little easier problem. And so the bandwidths are going to go up, but the apparent cost of connections, at least within a chip, is also going to get dramatically cheaper and dramatically faster at the same time.

So I contend that when you start to make all these changes, many of the things that have been just basic tenets of how we think about computer architecture, how we program these machines, and how we design and organize them into systems are going to be proven incorrect as we start to collapse things in this environment.

So when I think of this concurrency spectrum, I think there are some elements of it that tend to resemble the problems that we’ve now dealt with at the data center and wide-area level, but there is a new class of problems, and opportunities, in what I think of as intranode concurrency, where you’re down within a machine, among the nodes within a machine or the processors within a single chip, and that creates new problems as well as new opportunities. And so we really have to ask how we are going to address them.

The other thing I want to talk about a little bit is complexity. So if you weren’t motivated enough to decide we need a strategy for concurrency by the mere fact that the hardware will demand it, I’ll say that there needs to be a strategy, independently at least, to deal with the complexity problem. Here are a couple of graphs that are kind of interesting. These come from people who studied systems outside of Microsoft, and by our measures they were relatively small systems; the biggest one was 10 million lines. Today we’re regularly producing things that are over 50 million or 60 million lines of code as a single, monolithic system. To some extent, the graphs are scary enough even before you get to that next level.

On the left, what you see is basically a graph of the percentage of products against lines of code, and how they completed. Blue was early, green was on time, yellow delayed, red cancelled. And you can see that as the codes get larger, a high percentage never complete at all, and very, very few of them have a predictable outcome relative to timing. If you look at the graph, only 13 percent finish on time when you get out to the right here.

As you look at the other one, you basically ask, what are the programmers really doing as they work on these big codes? And, of course, you see through a whole variety of effects that as the code gets larger (and here it was only studied out to a million lines of code in a system), people spend less and less of their time actually coding, because of the inefficiencies, you could say, that come from having to be part of a large team: the communications overhead, the requirement for more sophisticated design activities, the need to document it. All these things are parasitic relative to actually writing the code that does the work. And so all of these things conspire against building bigger and bigger systems today, getting them right and on time, and making them understandable.

And so for us this is a big challenge at Microsoft, and we’ve thrown a huge amount of energy at it. One could say it is quite an engineering achievement that we can build and assemble fairly regularly code the size that we do with the number of people involved that we have. But even so, we would certainly love to be more efficient. And even with that level of investment, the complexity does come back to haunt us, for example, as we try to be very, very disciplined about things like security and all the ways in which those problems can be manifest in a system.

This is another motivation to think about a different way to write applications. A few years ago I became quite convinced, and I’ve been trying to move Microsoft gradually to think more about these problems, that concurrency and complexity are two nails, and that some type of, I’ll say, verifiable composability would be one hammer that you could use to pound on both of them at the same time. Today, if you look at the attributes that I want to have in future systems, they are almost exactly the inverse of how we build and offer systems today. I want them to be loosely coupled at every level; today most programs are written as tightly coupled things. I want them to be fully asynchronous; today most things are completely synchronous. I want them to be highly concurrent; today most of them are not concurrent at all, and even at very low levels of concurrency the challenge for the programmer becomes extreme, and our ability to make them reliable becomes very poor. I want them to be composable, so that we can essentially build on them over time, not in the traditional sense of component-oriented programming models, but more in the sense that I want to be able to assemble these things and reason about them not at runtime but at compile time, and we don’t really have the technology in languages or tools today to do that in a wholesale way. We want them to be decentralized, because all this capability is, at every level, being moved out farther and separated from one another. And, perhaps most importantly, we’re going to need them to be resilient. At some level we should start to expect, even within individual machines, that the kinds of failure modes that have been present in the wide-area Internet environment, or even in very large-scale data centers, will begin to be manifest within a single machine.
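
As a minimal, hypothetical sketch of what several of those attributes can look like together (a Go illustration with made-up names, not a model from the talk), the stages below are loosely coupled and fully asynchronous, they interact only through a typed message that acts as their contract, and a composed pipeline is itself a stage, so the composition is checked by the compiler rather than discovered at runtime:

    package main

    import (
        "fmt"
        "strings"
    )

    // Message is the contract between stages: the only thing a stage may
    // assume about its neighbors is the shape of the messages exchanged,
    // and the compiler checks that contract when stages are composed.
    type Message struct{ Text string }

    // A Stage consumes one asynchronous stream and produces another.
    // Stages share no state; they are loosely coupled and run concurrently.
    type Stage func(in <-chan Message) <-chan Message

    // Compose chains stages into a pipeline that is itself a Stage, so
    // composed pieces can be composed again.
    func Compose(stages ...Stage) Stage {
        return func(in <-chan Message) <-chan Message {
            for _, s := range stages {
                in = s(in)
            }
            return in
        }
    }

    // trim and upper are two independent stages, each running in its own
    // goroutine and communicating only through messages.
    func trim(in <-chan Message) <-chan Message {
        out := make(chan Message)
        go func() {
            defer close(out)
            for m := range in {
                out <- Message{Text: strings.TrimSpace(m.Text)}
            }
        }()
        return out
    }

    func upper(in <-chan Message) <-chan Message {
        out := make(chan Message)
        go func() {
            defer close(out)
            for m := range in {
                out <- Message{Text: strings.ToUpper(m.Text)}
            }
        }()
        return out
    }

    func main() {
        pipeline := Compose(trim, upper)

        in := make(chan Message)
        go func() {
            defer close(in)
            in <- Message{Text: "  hello  "}
            in <- Message{Text: " concurrency "}
        }()

        for m := range pipeline(in) {
            fmt.Println(m.Text)
        }
    }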

That then leads to all kinds of new challenges: how do you know when something broke? What’s the equivalent in a single-chip system of “unplugging a blade in the data center and plugging another one in”? And how do you schedule work around that? Our software and systems today largely assume that, at least within one machine, the thing is mostly perfect, that every bit works every time, all the time, or that in some relatively simple ways, like ECC memory, we can mask the failures that tend to occur fairly frequently. But I think the hardware is trending in a direction where there’s likely to be more transient failure, not less. And I think certainly the complexity of the software is such that we’re likely to see more transient failures, not less. And yet, are we really architecting our products in a way to deal with that? I contend that, of course, historically, we haven’t.
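
One small, hypothetical sketch of treating failure as the normal case (again a Go illustration of the general idea, not anything prescribed in the talk): a scheduler hands work to workers that sometimes fail, and simply reassigns any task whose result comes back as an error, rather than assuming every component works every time:

    package main

    import (
        "errors"
        "fmt"
        "math/rand"
    )

    // result pairs a task with its outcome, so the scheduler can tell which
    // piece of work needs to be handed out again.
    type result struct {
        task int
        err  error
    }

    // flakyWorker stands in for a core, node, or service that sometimes
    // suffers a transient failure instead of finishing its task.
    func flakyWorker(tasks <-chan int, results chan<- result) {
        for t := range tasks {
            if rand.Intn(4) == 0 {
                results <- result{task: t, err: errors.New("transient failure")}
                continue
            }
            results <- result{task: t, err: nil}
        }
    }

    func main() {
        tasks := make(chan int, 16)
        results := make(chan result, 16)

        for w := 0; w < 4; w++ {
            go flakyWorker(tasks, results)
        }

        pending := map[int]bool{}
        for t := 0; t < 10; t++ {
            pending[t] = true
            tasks <- t
        }

        // Failure is treated as normal: a failed task is rescheduled onto
        // whichever worker picks it up next, rather than crashing the run.
        for len(pending) > 0 {
            r := <-results
            if r.err != nil {
                tasks <- r.task
                continue
            }
            delete(pending, r.task)
            fmt.Println("completed task", r.task)
        }
        close(tasks)
    }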

In part, we just have never really created the methodology and the tools and the fundamental concepts, nor trained large numbers of people around these things, in order to be able to go out and address them. I mean, every one of these little adjectives tends, I would say today, to give people nightmares. And yet I think they have to be the hallmark of computer systems going forward. The requirements for resilience and scale both come because these things are becoming critical infrastructure for everything. And, as such, the world around us is going to demand that all of us be able to produce these systems with a much greater degree of reliability and resilience than we’ve engineered in the past.

So I contend that these things start with some deep architectural issues in hardware and software, and they then extend all the way up to the question of how we are ultimately going to inform all the programmers on the planet, and train all the new ones who are going to come to be able to deal with the capabilities that are going to be implied in all these new machines.

So I think we have a great set of challenges. Clearly, in the next few years we can keep doing more of the same, but I think that there are some discontinuities in our future. And as an old sort of computer designer and software hacker, I look at these things and I say, these are both the biggest threat to the way we have all historically thought about computing, but similarly they’re also probably the biggest opportunity for all of us to pick ourselves up and move forward to a completely different level. And, it’s sort of the latter view that really motivates me to focus on this.

I also think the two are related, relative to this concurrency problem, because in the systems that we design today I contend there is sort of a missing level of exploited concurrency. A huge effort has been expended to try to get instruction-level parallelism, and we’ve squeezed almost all the blood we can out of that stone. In fact, I think people have realized they’ve almost expended too much energy in the hardware world to try to get that last little bit; it’s pushed them to a level of complexity and transistor utilization that doesn’t yield much for the investment. And so I actually think that we’ll see some simplifications there, the feeling being that if the software wizards could figure out what to do with it, it’s a lot more efficient to give you another whole core than to tell you I’m going to get hyperthreading to work, or get some other low-level instruction synthesis of parallelism.

Then you go up a click and you say, The operating system within a machine is concurrent, but largely only to a limited degree. Even if you look at a Windows machine, relative to the number of apparent processes that are there to run at any given instant, the number that are actually running versus the number of parallel things that really are possible to run is quite different, I think, simply because the granularity of blocking that grows up as you build systems in a traditional way is so large that it tends to obscure a lot of the underlying potential concurrency. And yet trying to get that wrung out of the system, trying to squeeze it out, requires a level of analysis that is sort of the software equivalent of trying to get more instruction-level parallelism out. It isn’t clear that it’s been cost-effective relative to the traditional class of problems.

Then you go up to the level where you say, OK, you application programmers, we’re going to give you a couple of models to deal with: threads and processes and interprocess communication, remote procedure calls, et cetera. But if you look at how that’s sorted itself out, the only place people seem to get really good utility out of that is with very, very large granules. I can have my e-mail system talk to that database system, or I can have my desktop thing talk to an Exchange Server. But where we’ve been able to impose protocols as the way in which these systems talk — because we haven’t had a good way to really say we’ve got rigorous contracts that define these interactions, we can’t reason about the systems as we construct them — I think you end up where people just basically settle on a level of complexity and concurrency that they can manage.

But the reality is, in the real world, I think, that there is a wild amount of concurrency that goes on, that there are many, many problems that probably have more intrinsic parallelism than we exploit, but we don’t find it because of the tools with which we write these applications. And so my personal thesis is that there is a way to focus a lot on how to write programs, create things that are more composable, and, as a by-product, have the ability to extract a lot more concurrency in a world where we’re going to have this kind of hardware in the not-too-distant future.

And so I’ve been very interested in encouraging research at Microsoft in this area. We’ve been examining different ways to put that together. It’s pretty clear to me that there are interesting research directions, but there is no obvious winning answer yet. And yet we’re staring down the barrel of the hardware gun that says, hey, the bullet is going to leave the barrel soon and, therefore, there could be a huge mismatch between what the software world is prepared to deal with and what the hardware world will essentially offer up. And in that is a great deal of opportunity.

So I think this is, in some sense, years away, but relative to developing curriculum, training students, and doing research in computer systems, language systems and tools, four, five, even six to 10 years is not all that long when you realize how much hysteresis is in this system. And so I chose to give this talk to you today in hopes that I could encourage you, if you’re not already pondering these questions, to turn some of your attention to them, and to realize that we’re going to have a critical requirement for students who are well trained in these disciplines. And despite the research budgets of companies like Microsoft, Intel, AMD, IBM and others, I don’t believe that there is a well-understood direction for many of the questions that I’ve posed in this talk, so there really is, perhaps, more demand for real research to be done in conjunction with the industrial efforts than we have had for quite some time. And so I want to encourage you to think about that.

So I’ll close my formal remarks there, and we have another 15 minutes or so for questions, to the extent you want to have a discussion.

QUESTION: I liked your trajectory, and the way you painted this, and sort of running up against a hard wall with hardware, but I want you to put in perspective for me where on your trajectory some of the following things fit. One of them is, if you remember the old computer company Burroughs, which designed a machine …

CRAIG MUNDIE: I worked there.

QUESTION: You remember the design of the 550, which was based on functional combinators, which had a lot of the properties of the things that you’re seeking now, but there wasn’t a conceptual model to talk about what it meant to debug a program running on a combinator machine. So that’s one of the places in this trajectory.

CRAIG MUNDIE: Right.

QUESTION: Another thing is, last month there was a paper in Nature that talked about the demonstration of a single-molecule transistor. So, with sort of a million-fold size reduction, or at least that many orders of magnitude of reduction in power consumption, do you think that that’s going to deflect your trajectory, or do you think we can integrate it along your trajectory?

The last one is, functional and logic programming attempted to build technology that was premature because the sequential machine became too fast. Is it time to revive that kind of paradigm and see how it works to build the composability that you’re looking for?

CRAIG MUNDIE: I guess, kind of in reverse order: yes, I think the time has come again to look at some of these issues.

Relative to your second question, I think the time for commercializable, single-electron transistors is not in the same time horizon as this problem is going to hit the industry. And it also isn’t clear to me that even if you had them that the whole system would essentially be able to be accelerated. I mean, that curve I had that showed the gap between CPU clock rate and DRAM access time just continues to get bigger. It isn’t clear to me even if you shrunk the thing down to single-electron transistors that we wouldn’t continue to see some divergence if you just keep the computer architecture the same. And then, if you say, I’ve got a million times more transistors on a die, well, then you’re going to say, Well, great, what are you going to do with all of them? And if you want to say that they’re all different logic elements, then you’re back to the same question of how do you connect them to the memory system? What is the memory system? And so it isn’t clear to me that any one of those things is going to allow us to escape, and certainly that particular one is not in the planning horizon from a time point of view that correlates to these very real changes that are upon us now.

QUESTION: Could you go back two slides in your presentation, the one before the discussion?

CRAIG MUNDIE: We’ll try.

QUESTION: So, your presentation leading up to this slide, I thought, was the best speech I’ve ever heard about multi-agent systems, which is my research area, as it turns out. So I think what you really need is fundamentally different ways of thinking about software, not just from the standpoint of performance, as you discussed, but also from the standpoint of project management and correctness, which you also mentioned. I think somehow computer science has to break out of this single-node mindset, which has dominated it. In the multi-agent systems community, which has been around for over 20 years now, we have considered autonomy as a starting point. So you think of autonomous computations as key entities, and then what it leads to is — there’s a buzz word that’s lacking here, which is interaction. You drag your attention away from a single computation to interaction among computations.

And when you begin to study interactions as first-class entities, you discover that composability is more natural, at least to conceptualize, than other evolutions of structures. You can think of various levels of dynamism there. So the outcome evolves naturally. Is Microsoft pursuing those lines of thought? Or would you be interested in pursuing them?

CRAIG MUNDIE: Well, I think we’re interested in all these things. I don’t know whether the way I think about it is exactly what you think about it. I mean, I’m happy to have multi-agent thinking as one way to achieve composability.

QUESTION: But making interactions central as opposed to sort of incidental to the enterprise?

CRAIG MUNDIE: You’re right. I mean, I didn’t put that on there, but it turns out that if you make these things loosely coupled and asynchronous, I think you end up with an interaction-centric requirement. So that’s why I say we may talk about or think about these things in slightly different ways. You’ve got a moniker for one approach to that. I was trying to really just talk about attributes of things I think I’d like to see, and there are different ways to try to achieve them. But clearly, producing autonomous work items and then trying to figure out how to assemble them into a working system is an interesting challenge.

At the scale of the Internet, all the things we’ve been doing for the last few years in Web services and service-oriented architectures are basically an interaction-centric way of thinking about decomposition in design. What’s interesting there, in my view, is that we have not yet actually given people a formal model to think about how to get correct programs in that environment. So in a way I think we’ve given people the plumbing, and if you’re a genius, you’ll probably figure out how to do some interesting stuff. But if you’re just an average programmer, I’m actually quite afraid that the problems we used to reserve for the guys who work on the kernels and the device drivers are now going to pop up for everybody. And they are not as well equipped, in either tools, or IQ, or a whole variety of other areas, to deal with those kinds of very complex problems.

QUESTION: I think we need abstractions, and then the tools will follow from those, but I think the abstractions are incomplete right now.

CRAIG MUNDIE: Yes, I agree with that.

OK, over here.

QUESTION: You gave us a great set of challenges. You also pointed out that the hardware folks have produced a lot of quantifiable improvement in hardware over the decades. Just so us software types won’t feel so inferior, can you give us any quantifiable examples of improvements in software that have been made in a similar period of time?

CRAIG MUNDIE: Well, I guess the issue is one of dimensionalization. The short answer is, I don’t have any numbers I can cite for you off the top of my head in any particular dimension of how software has improved. On the other hand, there’s the complexity of the programs; that’s the one I focused on in this talk. I mean, you’ll remember the infamous quotes, even by Bill Gates, about who would ever need more than 640K of memory and what you would do with it. And Ken Olsen basically said, why would anybody need a desktop computer. There are lots of famous people who have looked at this and said, I don’t understand, given the software as I think about it now, how you would ever need that kind of capability, and software has essentially just run rampant over any of those apparent boundaries. So you can use the memory system and the capacities of these systems as one metric for how software has scaled up to deal with that.

But I don’t think we have a way to deal simplistically with the characterization of improvement in software. What I know is that the complexity we’re able to deal with, in just the aggregate lines of code that we integrate into a system and try to qualify relative to its reliability and security and other things, has improved by more decimal orders of magnitude than even the hardware has over that same period of time. So, how to characterize that might be an interesting question to analyze, but I don’t think anybody has focused much on it.

QUESTION: I appreciate your picture of the trajectory, and I’m sure there are many applications for which faster processors and grid computing will be beneficial. But in the tone of this event over the last two days, a lot of the emphasis has been on three directions that focus more on the user’s experience, and I wanted to ask you to comment on some of these issues. People talk about the frustrations of e-mail and spam, about the shift to mobile devices and cell phones, and about the bottlenecks they perceive in user interfaces and screen size. So if we’re going to talk about increased productivity and experience for the user, can you comment on the emphasis on those issues?

CRAIG MUNDIE: Well, I think the solution to many of these things is ultimately related. I always have said to people, I came here 13 years ago and I started all the non-PC things: watches, cars, television, game machines, et cetera. Now, all of these things we started over that period of time are kind of coming to the fore as a big part of the overall computing environment that people live their lives and work in.

During that period of time, we used to say, it takes a lot more software to make it appear simple than it does to make it appear hard. And I think as we move forward to change the modalities of interaction with computer systems, the speech, and vision and other things, it’s pretty clear that having more horsepower and a better way to do it might be a factor. That said, there are clear improvements that can be made in the current environment. You talk about spam, there are clear directions for how to deal technologically with part of the spam problem. But I would contend that in a number of areas, digital rights management being one, spam being another, that for all of these systems to work, you actually have to have an equilibrium among a number of things, some of which are technical, and some of which are not technical.

For example, the public and the mores of the public have to be brought into alignment or awareness about these things; they have to care. I mean, it wasn’t that many years ago that people didn’t use e-mail; then they knew what it was, but spam wasn’t an issue; and now you can talk to almost any politician and they’ll know what it is. So society has to progress in terms of its perception of the need.

The second thing you have to have is choice: people have to offer people a choice, where some things are better than other things. Many of these systems are new enough, or the interactions are at such a scale, that it’s hard to get them to converge on a single system. Then you get into the need to have technological speed bumps that try to get people to do the right thing. And then, finally, you need policy and legal environments that essentially coerce the people who are at the fringe, or the criminal elements, to also do the right thing or face some penalties.

In many of these areas, those things are no longer in equilibrium as we move from sort of the all-physical world into this cyberenvironment. And so my job is more than half policy nowadays, trying to get governments to contemplate these things and have their legal systems and law enforcement activity and other things lined up, because if you overburden one of these things because of a deficiency in another, you don’t end up in a stable state. And so right now, I think on spam there’s been too much of a tendency to want to shift the burden to a purely technological solution when, in fact, we probably need global law enforcement to have some kind of notion of effectiveness.

But then that gets you into all these crazy new problems, like the fact that the world has no global governance system. So you now have networks and computer systems that don’t stop at geographic boundaries, and yet our society globally hasn’t figured out how to deal with problems that can be inflicted on any one place on the planet from any other place on the planet. So I think some of these will be resolved over time, but there is no easy answer to some of them.

As to the last point, it is true that …

QUESTION: (Off mike.)

CRAIG MUNDIE: Yes. We think that they’re critically important. We think, particularly, I believe, in a lot of the emerging markets the first computer in many people’s lives will be their cell phone. The second computer will be their television. And the third computer will be some shared-access traditional notion of a computer. And then, finally, they’ll actually aspire to ownership of a device that supports them that way.

So we view it as a very, very critical thing to create symbiosis between the things we traditionally know, and grew up with first in the rich-world environment, and the things that are going to emerge and get to scale in places where PCs are, for example, not going to be there first. There are many challenges in trying to create that symbiosis. We do, as I guess you mentioned, share the idea that one of the defining characteristics of these different classes of computer systems happens to be the physical screen size. So, we think, OK, what are the interesting screen sizes? Well, there’s the watch screen size, that’s interesting for a certain class of things. There’s the cell phone one. There’s the Pocket PC one, that’s equivalent to the paperback book class of screens. There’s the one that’s a page of paper size. Then you get up to the desktop, the big desktop, the wall-size displays, and the stadiums.

And I think people kind of start with the idea that, well, maybe you can just have the same thing work at all screen sizes, but it just doesn’t work. You just can’t shrink down the icons, or the model of user interaction in which screens are a component. On the other hand, there’s probably a lot more that can be done to try to facilitate moving information that was originally rendered for one mechanism down to be rendered for another. There’s quite a bit of research that’s been done at Microsoft Research Asia around that kind of topic. And I think that’s important not only because the number of access points is increasing through things like cell phones and televisions, but also because, at the same time, the number of Web pages that have been authored under the assumption that the fundamental display environment is a browser on a PC-type screen is also increasing dramatically. Anything we can do to avoid requiring everybody to go back and reauthor their concept of a user interface, or their information, to be presented on these smaller screens would be a good thing. And I think that’s an area where more research should be done in the short term.

OK. Last question.

QUESTION: What are your thoughts on the areas in software engineering where you work with components, different models of computation, concurrency, verifiability, composability, and things like that?

CRAIG MUNDIE: I guess, as I implied, when I think of engineering disciplines, my favorite example is people who build bridges and buildings and things, or even the people in the hardware business. They have evolved to where, layer by layer, the people who do the work are specialists at a particular layer, and they make very strong contracts about what they can assume from one layer to the next.

My view is, software as an engineering discipline has not yet evolved to that state, and everything, more or less, is still too ad hoc. We’re starting to put some mechanisms in place that will allow us, in certain parts of the total software problem, to have a taxonomy, or a way of essentially describing these things. XML and things of that ilk are ways of doing it for a particular problem class. But when you get down to the software itself, I think we still don’t have a fusion of the model by which we write applications and the model by which we can express the contract that we want to have between what we’ve written and what anybody else has written, either next to us or below us, or what we want to advertise to the people above us.

As you go into this service-oriented environment, we’re being forced to do that. But that’s really incipient in the world of software today. And so I think that we have a very big deficiency relative to the formalisms of sort of verifiable composability.

OK, they’re telling me I’m out of time. I appreciate your attention. I hope you enjoyed it, and I’m glad you could come and join us. Take care.