Craig Mundie: Microsoft CEO Summit 2005

Remarks by Craig Mundie, Senior Vice President and Chief Technical Officer, Advanced Strategies and Policy, Microsoft Corporation
“Technologies of the New World of Work” Microsoft CEO Summit 2005
Redmond, Washington, USA
May 20, 2005

CRAIG MUNDIE: Good morning, everyone. In the next 45 minutes or so, I would like to use a series of vignettes to share with you the collective wisdom, judgment, and vision of a lot of people at Microsoft from across the company, ranging from our research people to the people who work in the individual product groups, and to tell you a little bit about how the technologies that are already in many ways quite evident, if you look at them one at a time, could come together in the future to move us to the next world of work.

Bill in his talk yesterday morning talked about how the technologies, as we know them, are already beginning to change the workplace. But we also know that we’re able to put high-performance computing capabilities and more and more sophisticated software into many new types of devices. A number of technologies are improving at once. Cameras are one: today you see that in higher-resolution digital cameras that are getting simpler and less expensive. You can hardly buy a cell phone today, certainly in Asia, that doesn’t have a camera embedded in it, and you see, for example, the movement to high-definition television. Of course, being able to take all these pictures at higher and higher resolution isn’t very interesting if you can’t see them. So another thing that’s changing at quite a rapid rate is display technology, ranging from flat panels to new types of projection.

One of the things that’s quite interesting is that many of the same underlying silicon processes that have been used to make computers faster and cheaper over the years are now essentially the same fabrication technologies being used to create new radio techniques, new display technologies, and new forms of sensing. And all of these things have the potential to change the modalities by which we interface with our computers. Today, we’ve all grown up thinking that the way we do that is with the mouse and the keyboard, the point-and-click type of interface. And while that will become increasingly important in a range of new applications, it will also become more and more natural to address the computer as if you were addressing other people, and to have it respond to you in ways that are quite natural, and not strictly in the form of the traditional display-based interface.

So I’m going to take you through a set of examples that show how these things could come together. All of these are essentially either prototypes or mockups of things that we think could happen, so none of them is directly on a product road map. But, interestingly, we think that almost all of the things I’ll show you this morning could actually come to pass as real products in the marketplace, certainly within a five-year time period. So, while some of this, compared to what we have known in the past, may seem a bit like science fiction, I think, as somebody remarked yesterday, if you go back 10 or 20 years, many of the things that we take for granted today, and which frankly our children grow up thinking should have been an entitlement forever, seemed space age at the time, and they are with us now.

So let’s start by thinking about what your workday might be like when it begins at home. Many of you probably still get a newspaper and think of that as the way you get your morning introduction to what’s happening in the world. Some of you may, in fact, have moved beyond that, or may augment the paper-based interface with your computer. The Internet, the ability to search, to go to Web sites that bring the world’s information to you, these are increasingly powerful mechanisms. Some of you sat with us and talked about blogging yesterday as a completely new way in which communication, as it relates to business, is happening.

But basically, as display technologies get less and less expensive, we’re more and more likely to find them in all different sizes and all different places in the home and in other parts of our lives. So this is actually our flat panel display, a large one; you might think of it as the top half of your refrigerator door in the future, or some other place that would be convenient for you. We’ve even experimented with having these things projected through mirrors, so that when you wake up in the morning and have to shave, you could see parts of this in the corner of the mirror.

But basically you have a touch-screen interface, and what you see here is essentially the integration, for a family, of a whole variety of information. First, it can be the equivalent of the refrigerator magnet: the kids can put things here, they can move them around, you can leave messages, you can have the clock. On the bottom, you can see a visual presentation of an integrated family calendar. It not only shows the people, color-coded here by icons, and the next things each family member is doing; it also shows that we now have the ability to track people. There are lots of obvious privacy issues around this, but the reality is it’s going to happen.

And when you talk to many parents, one of the things they’d love to know, particularly for the younger kids, is where they are and what they’re doing. And so these maps, at some level of granularity, can actually track movement. This happens today through cell phone towers. Bill talked yesterday about a very important aspect of what we think of as presence, the ability for the computer to know where you are. The whole concept of instant messaging really grows out of the idea that computers know, in real time, where you are, and to some extent they increasingly know what you’re doing. So there’s always a tradeoff between the intimacy that the computer has and how that intimate, real-time information becomes valuable. People will have, I think, a great many personal choices, but this just shows one way a family that trusts each other may find it valuable to use that information and present it to each other.

Here, basically what I’ve got is also a video stream, which progresses along and is preselected for the things that I care about in general. I could pick any one of these and advance at my own rate, but here, in fact, I might decide that I see something that’s a troubling event. It may relate to my business, so I can say, let’s track that. And what I’ve done is essentially increase the priority of my interest in things related to what this news clip was about. We have the ability not only to process video, but to translate the spoken word into text. That can then be fed into computer systems, used in automated searches, and we can aggregate lots of information and present it to you in ways that we think might be powerful and interesting.
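To make that tracking step a bit more concrete, here is a minimal sketch in Python of how a transcribed clip could be scored against topics a viewer has chosen to follow. The topic names, keywords, and transcript text are all invented for illustration; this is not Microsoft’s implementation.

```python
from collections import Counter

# Hypothetical topics the viewer has chosen to track, each with a few keywords.
tracked_topics = {
    "product recall": ["recall", "defect", "safety"],
    "competitor launch": ["launch", "announcement", "rival"],
}

def score_transcript(transcript):
    """Count how many tracked keywords appear in a transcribed news clip."""
    words = transcript.lower().split()
    scores = Counter()
    for topic, keywords in tracked_topics.items():
        scores[topic] = sum(words.count(k) for k in keywords)
    return scores

# A clip whose transcript mentions a recall and a defect raises the priority
# of the "product recall" topic selected at breakfast.
print(score_transcript("The recall follows new reports of a safety defect"))
```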

So basically I might finish my breakfast. I’ve looked at these things and decide that I’m going to go on my way. Now, this is actually a Microsoft Smartphone. It is a product that ships today. And what you can see on here is actually exactly the same video as what I was watching live on the screen here a moment ago. We have the ability now as the networks get more powerful, certainly in the local area networks in the home or the office, to download this kind of video into these devices in real time. And so it may not be just for entertainment purposes, it may not be that these things will just be, if you will, the iPod of the future when they all have little hard disks in them. They, in fact, may be the place where things that are very important to you, or that you want to be able to take to review or to show to other people can very rapidly be put into these devices.

This is another interesting device. This is called a Portable Media Center. This one is manufactured by Samsung, and it’s actually shipping today. And here I actually have, again, the same video playing on this device. These can be synched up either by wired or wireless interfaces to the computers in the home. This one presents exactly the same interface that the new Media Center PCs do, and so you have on the tiny screen exactly the same user interface concept that we have on the latest generation of personal computers that drive large high-resolution displays in the home. In fact, this device is one that will be your memento for this conference: when you turn in your tablet today, we’ll give you one of these to take home with you. So we hope that you and your family might find some enjoyment from that.

Basically, this gives me the ability to take with me the things that I care about. So now I leave the home and get to my car. I was troubled by some of what I saw in those news events, so I’ve set my computers to the task of tracking that information and building up more of a portfolio of what I might want to review with the team. But I’ve decided that this is a real problem. In the blog discussion yesterday, the question came up of how much time an institution has to react when the blogs basically decide to attack you about something that’s material. And the answer was, basically, about four hours. So it increasingly becomes important to be able to communicate with people and make decisions, and you might want to begin in your car.

So here we see the extension of real-time collaboration. Bill talked about this as something that certainly will happen in the office, but you might want to extend it into other venues. So, say I want to talk to one of my colleagues. I just say to the car, I need to talk with Kim, it’s urgent. What you really see here is not merely a voice-based command-and-control interface, which many of you may already have in your car to dial the phone or change the track on the CD player; here the car is stepping up in its capability to process your speech, and we, in fact, will do more complete natural language processing and understanding. And so when I say this, what really happens is that I’m kicking off a process whereby my computer is essentially talking to Kim’s computer, and they are going to negotiate the best way, the quickest way, for us to have this urgent conversation.

So how does that actually happen? Well, first, the natural language software will process this fairly simple sentence, and it has to figure out who Kim is. So it understands the context. It understands my contacts. It understands my family. It may, in fact, know what my history of communications tends to be, and therefore people who seem to be near the top of that communication list. And so her identity is essentially determined from that, just like a great assistant would know who Kim was likely to be in the context of an urgent, business-related call.
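Here is a minimal sketch, with invented contact records, of how that resolution step could work: rank the contacts whose names match what was spoken, weighting by communication history and the business context of the request. None of the names or fields here come from an actual product.

```python
# Hypothetical contact records; none of these come from a real directory.
contacts = [
    {"name": "Kim Akers",   "relation": "direct report", "recent_calls": 42},
    {"name": "Kim Ralls",   "relation": "vendor",        "recent_calls": 3},
    {"name": "Kimberly Su", "relation": "family",        "recent_calls": 17},
]

def resolve(spoken_name, context):
    """Pick the most likely contact for a spoken first name, given the call's context."""
    candidates = [c for c in contacts
                  if c["name"].lower().startswith(spoken_name.lower())]
    def score(c):
        s = c["recent_calls"]  # frequent contacts rank higher
        if context == "business" and c["relation"] in ("direct report", "colleague"):
            s += 100           # a business context strongly favors work contacts
        return s
    return max(candidates, key=score)

print(resolve("Kim", "business"))  # -> the direct report, Kim Akers
```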

The fact that I said it was urgent gives the computer the sense that I have to establish a high priority for getting this done. And then it begins to go through the process of trying to figure out, well, where is Kim. One of the things you can see in this graph, denoted by the circle, is that Kim has many different places she might be, in fact, in that green circle, many different ways that I might be able to communicate with her. So the real question is, at any given instant, based on where she is and what she’s doing, what is the best way to communicate with Kim?

So with this presence information, the kind of tracking that was shown there, without me necessarily knowing where she is, the computer can determine the right way to handle this transaction. So we go through a calculation, and we determine both the optimal way to approach Kim and whether to do it now. Because I’m her boss, that’s given some significant priority. But, ultimately, we believe it is in the interest of the individual to make the final decision about whether that should happen.
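A rough sketch of that calculation, again with made-up presence data, might look like this: each channel for reaching Kim carries an interruption cost, the urgency and the caller’s relationship supply a budget, and if nothing is justified the request is deferred. The final say, as noted, stays with her.

```python
# Invented presence data for the ways Kim might be reached right now.
channels = [
    {"device": "desk phone",   "presence": "away",       "interruption_cost": 1},
    {"device": "cell phone",   "presence": "in meeting", "interruption_cost": 3},
    {"device": "IM on tablet", "presence": "in meeting", "interruption_cost": 1},
]

def best_channel(urgency, caller_is_boss):
    """Pick the least disruptive channel whose cost the urgency can justify, else defer."""
    budget = urgency + (2 if caller_is_boss else 0)
    reachable = [c for c in channels if c["presence"] != "away"]
    justified = [c for c in reachable if c["interruption_cost"] <= budget]
    if not justified:
        return None  # defer: schedule the conversation for later instead
    # Kim still sees who is calling and why, and can decline; this only proposes.
    return min(justified, key=lambda c: c["interruption_cost"])

print(best_channel(urgency=3, caller_is_boss=True))  # -> the low-cost IM channel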

Today your cell phone will vibrate or buzz, and all you can do is either ignore it or answer it. But in the future we think what’s more likely to happen is that her phone may ring or buzz in some subtle way. One of the things we think would be quite trivial for her to do now is glance at the phone, realize it’s me and that it’s urgent, and realize that she’s talking to an important client and can’t really stop at the moment. And she can just tap back “15 minutes” and decline the call.

Instead of wondering whether or not Kim knows I’ve got a problem and need to talk to her, in fact, I’ve been acknowledged by the person at the end, and I’ve even had my expectations set for when, in fact, she might be able to come back and answer this question and talk to me. So a lot of the anxiety that you might have in this situation about, I need to talk to her, where is she, can I get her, the computer and the presence information can really facilitate that transaction in as unobtrusive a way as possible.

So I continue on, I get to the office, and I’ll go into my office and see what’s happening in this environment. The first thing I find is that we want new ways to interact with the computer. Identity is very important, security will continue to be important. Most of these new keyboards now come with fingerprint readers in them, and what that gives me is a biometric way of either augmenting a company-issued credential, or in fact, providing a simple way for people to log in.

So just putting my finger on the fingerprint reader, and you can do this at home today, will essentially log you into the computer and validate your identity. So here what we have is what you might think of as a new generation of desktop display surfaces. One of the things we’ve already learned in our studies over the last few years is that if you give people more screen real estate, they get more done. And as more and more of your actions move away from paper and into this real-time environment, it becomes more useful to have more screen capacity.

Neil Gershenfeld yesterday talked about paintable display capabilities. And while that’s a very, very nascent kind of approach, there’s no reason, as he showed, to believe that in the future your desktop itself couldn’t be a display, or the walls around you might be a display. We’re still a ways away from that, so we’re going to use a variety of other technologies, like the large-screen flat panels. To some extent the flat panels you all have on your desktops today just get bigger and bigger, but at some point it becomes quite difficult to fabricate them or integrate them into an environment. What you see here is a display technology based on rear projection.

The technology is, in fact, exactly that which is being produced in higher and higher volume for home high-definition displays. People who don’t want to pay for plasma panels that can hang on the wall, but are willing to tolerate projection systems that are maybe 14 to 18 inches deep, now get tiny little projectors built in that can drive very, very large screens. This will essentially be the next generation of the way that people in their homes and in their businesses will get large-display capabilities, coupled with touch panel technologies like you see in this one.

In fact, the touch screen on this is actually done with two infrared sensing cameras. So there’s not a big sheet of plastic or anything else that has to go around it. We just put two cameras in the bezel around the thing, and we can triangulate because we can sense your finger. And that’s how the touch is, essentially, ascertained. We can do this on larger and larger displays with very, very low cost. So all of these things begin to accumulate and give us new ways of interacting with our computers.
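For those curious about the geometry, here is a minimal sketch of the triangulation idea: each bezel camera sits at a known position and, from where the fingertip appears in its image, contributes a ray; intersecting the two rays yields the touch point. The camera positions and angles below are invented numbers, not the product’s calibration.

```python
import math

def touch_point(cam_a, angle_a, cam_b, angle_b):
    """Intersect two rays (camera position plus angle, in radians) in screen coordinates."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(angle_a), math.sin(angle_a)  # direction of ray A
    dbx, dby = math.cos(angle_b), math.sin(angle_b)  # direction of ray B
    denom = dax * dby - day * dbx                    # zero would mean parallel rays
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return ax + t * dax, ay + t * day

# Cameras in the top corners of a 100 x 60 unit screen, each seeing the
# fingertip 45 degrees below its own horizontal; the rays meet at (50, 10).
print(touch_point((0, 60), math.radians(-45), (100, 60), math.radians(-135)))
```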

So the first thing I might notice is that I’ve got this topic tracker, and this essentially is driven by the selections that I made, for example, on the touch screen in the morning. And on my desktop, where I have iconic presentations of a variety of materials, what I actually see here is a graphical interface where the different things that have been monitored on the topics I selected are presented. The size of each little block represents the amount of traffic or activity from that particular source. The brightness of it indicates the age. So in fact the topics that are the most relevant relate to the one that I selected this morning, and there’s been a lot of traffic.
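As a small illustration of that visual encoding, here is a hypothetical sketch that turns a monitored source into tile attributes: more items means a larger block, and older items mean a dimmer one. The constants are arbitrary.

```python
from datetime import datetime, timedelta

def tile(source, item_count, newest, now):
    """Compute display attributes for one topic-tracker tile (arbitrary constants)."""
    age_hours = (now - newest).total_seconds() / 3600
    return {
        "source": source,
        "area": 10 * item_count,                       # more traffic -> bigger block
        "brightness": max(0.2, 1.0 - age_hours / 24),  # older items -> dimmer block
    }

now = datetime(2005, 5, 20, 9, 0)
print(tile("wire service", item_count=14, newest=now - timedelta(hours=2), now=now))
```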

In fact, below it there might be some type of real-time graph. In this case it might be tracking blog activity related to this topic or our company. So I can see how active the blogosphere is in talking about these kinds of things, and how I am going to address them. So all of these things come together and give me new ways to think about that.

So now basically Kim has finished her 15-minute interaction and has essentially talked to my admin and said, I’m ready to have this conversation. So I answer the call.

So we can begin having a real-time video interaction. Here this is probably happening with a portable video camera that she has embedded in her cell phone, from a place that was convenient for her. It gives me the ability to begin to have a common workspace. Bill talked yesterday about Groove, or Chris Capossela did, as a technology where people can have a shared interaction space. Increasingly, the real-time communications capability will also become a part of interacting in that space. In this case I can tell her, of course, that what I really cared about was this news story that I saw this morning and that we’ve been tracking ever since. So I can now just drag that into this space, and in real time it gives both of us the ability to see what is going on and to have a discussion about it.

I can take notes in this environment by either typing or writing by hand, and I’ve decided that I want to have a meeting about this with Kim and Laura, and while this is going on I can actually ask my admin to arrange that meeting. And in fact, she does. The computers, as they do today, can negotiate when that meeting can take place, and it can be presented immediately in the calendar for me, Laura, and others, in order to facilitate that meeting. So we can essentially go on about my day now, knowing that we’ve taken the first steps in resolving this crisis, and we’ll continue on.

Basically I can work in this environment. I may pick up and take another small tablet-type computer with me, in this case a Pocket PC phone. In fact, the appointment that I had has, of course, been transmitted to this device, too. And as I go around I might get the reminder for that call. It could come, in fact, from these new SPOT watches, which also get these things broadcast to them through the FM radio airwaves that send you your music today; in the background they can now also carry data.

So we have many different ways that we can essentially help you remember what’s supposed to happen. Basically this now reminds me, I’ve been out and about, and it’s time to have the meeting we talked about. In this case, the conference call is the meeting that we’ve chosen as the format, so it’s a virtual meeting. And by clicking and answering it, we can start to have this meeting.

So we start having our conference, and this is a time when it becomes interesting to have something other than a keyboard and mouse as a way to view the information, participate in the call, and interact with these things by pointing and annotating, in a more traditional way, much as we have all been accustomed to with pen and paper.

So this is a tablet computer, much like the ones that all of you are holding today. Because it supports not just handwriting recognition but the ability to draw and annotate, I have an environment that is much more natural to me. These tablets are very important because of the ergonomics of this type of interaction. You can do immersive reading on this because it has the same ergonomics as holding a book. It’s very different from trying to sit and look for a long time at something that’s on the screen in front of you; the physics of that just don’t work for people. It’s why you can read your mail there, but you wouldn’t want to read a novel there. We do believe you could read novels on these devices that you hold in your lap and can move around with to get comfortable.

In this environment I may decide that some of the material I have I want to add into this conference call, too. So I can essentially drag and drop things from here into the videoconference, and now my information essentially will be brought into this conference as well. So many of the things that we used to have to run around and collect, they’re all going to be in this environment, and they can be collected, or aggregated as a function of the context of the call.

Now another thing we have, and will increasingly do, is maintain the provenance of all the data that the machines have. So in this case I can hover over this particular data and ask: who actually created this piece of data, could we get them involved, would they be able to help us? In this case it tells us one of the guys, Thomas Anderson, might be available, and the little icon that you can see here is green. That means that Thomas is actually at a place, and doing something, where he would be happy to have some type of interruption or interaction.

This is essentially how instant messaging works today. So now I can just essentially drag that little icon through a graphical interface over to this screen and create an instant message. So I type to him, You should join us in the meeting. He says, Sure, no problem. Again, I can essentially just drag the little icon over to the workspace, and he is added to this videoconference.

So multiparty video telephony will become de rigueur in the next few years. It won’t require special rooms, it won’t require special equipment, it won’t require special circuits. All of this is essentially being done using standard computers and the Internet connections that all of you already have in your offices, certainly, and even in your homes.

So we can continue this conversation, but essentially my computer reminds me that I have to go to a meeting with a client outside the office. And in this environment not only can I get a reminder, but of course now it knows this meeting isn’t in my office or with my own colleagues here. We can also now detect traffic conditions; in fact, we do this today. On my cell phone here at Microsoft we have a research application that gets real-time feeds from all the traffic sensors in the roadways and the cameras around Seattle, and it processes that along with a lot of contextual information.

And it not only shows you a graphic of what the state of the traffic is, it actually predicts what the traffic is likely to do over the next few hours, based on historical patterns, weather conditions, and whether there’s a Mariners game. I actually use this almost every day when I drive back and forth, so I can make decisions about how much longer I can stay at work, because it helps predict when traffic jams are likely to clear, again based on current conditions and historical patterns.
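A toy sketch of that kind of prediction, with entirely made-up numbers rather than the real sensor feeds, might blend the live reading with the historical pattern for each coming hour and report when the road is expected to approach free flow.

```python
# Hypothetical average speeds (mph) on SR 520 by hour; real feeds come from sensors.
historical_speed = {16: 25, 17: 18, 18: 22, 19: 38, 20: 52}

def predicted_clear_hour(current_hour, current_speed, free_flow=50):
    """Return the first hour at which the blended estimate nears free-flow speed."""
    for hour in range(current_hour, 24):
        # Weight the live reading heavily now, the historical pattern further out.
        live_weight = max(0.0, 1.0 - 0.5 * (hour - current_hour))
        estimate = (live_weight * current_speed
                    + (1 - live_weight) * historical_speed.get(hour, free_flow))
        if estimate >= 0.9 * free_flow:
            return hour
    return 23

print(predicted_clear_hour(current_hour=17, current_speed=12))  # -> 20 (8 p.m.)
```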

So we’re starting to use the computer to do things that even great assistants would have a tough time doing. They could look at what the color of the map is right at this moment, but it would be hard for them to gauge whether, in five minutes or 20 minutes, 520 is likely to get better. We can do all that, and we can feed it not only into a reminder that says there’s a traffic jam down there and suggests an alternate route; the kind of thing your car does today when you tell it there’s a traffic jam and where you want to go can essentially be precomputed, suggested, and presented graphically. And, of course, it will be projected onto your device. So if I look at this again, I might find that not only do I have the reminder that it’s time to go, but I carry with me a suggestion for the optimal route to get there. And by the time I go down and get to the car, I find that the car already knows, too. Basically, I’ll agree to do this, it will log me out of my computer, and I’ll head for my meeting.

So as I go from the office to this external meeting, I might decide to take another device. This is essentially a mockup of a device that we think will come to market in about three years’ time. We call them ultramobile PCs. It’s basically mostly a piece of screen with a computer behind it. It’s only a little more than a quarter of an inch thick and weighs substantially less than a pound, but it is essentially a computer as powerful as the desktop computer you know today. It will obviously depend on the kind of speech and handwriting interfaces that we use today on pocket computers and cell phones, but it gives us more real estate, something that might be convenient in a meeting, and something that might be a bit more inconspicuous or portable to bring into that kind of setting. And so there are many other things that we think can change or facilitate our interactions even in those kinds of meeting environments. Here I’m going to go visit our client, and this is what a business card might look like. Interestingly, Neil alluded to some of these things yesterday. We obviously have to have very, very tiny chips, and the kinds of things being put into radio frequency ID tags could also, for example, be embedded in business cards.

Even before you get to that point, as we develop new camera and vision technologies: today when we look at a business card we can read that it says Microsoft, Craig Mundie, see my title, get my address. Now, is there a way to encode that so it’s machine-readable? Today, if you have contacts, you can exchange them between people with Pocket PCs and Palm Pilots, or other devices. But can’t we make that process even more automatic? Can’t we bring people who aren’t using those technologies, and who still like this type of exchange, into this new world? So this is one way to show what might be possible: the logo here, the round one, has been designed so that the dots in it are actually an encoding of the person’s identity within that company. And what’s interesting is that cameras can recognize that encoding just like a human can recognize the text. I will show you in a minute how that might be useful.
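One plausible way to read such a logo, sketched here with an invented seven-dot code and a hypothetical directory rather than any real encoding scheme, is simply to treat each dot position as a bit and look the resulting number up in the company’s directory.

```python
# Invented directory codes; the real encoding and identities are not specified here.
hypothetical_directory = {
    0b1011010: "craig.mundie@example.com",
    0b0110101: "kim.akers@example.com",
}

def decode_logo(dots):
    """Turn the detected dot pattern (most significant dot first) into a directory lookup."""
    code = 0
    for present in dots:
        code = (code << 1) | int(present)
    return hypothetical_directory.get(code, "unknown contact")

# The camera reports which of the seven dot positions it found filled in.
print(decode_logo([True, False, True, True, False, True, False]))
```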

But one of the interesting things, when I go to this meeting, is that now we have, in theory at least, a completely automated protocol for exchanging business cards. We all know who is in the room because of the same presence-tracking technologies that I talked about earlier. In fact, the granularity with which we can resolve your position in an enterprise, just based on the devices on your person and their interaction with the radio system in the building, is getting down to within meters, and we can certainly understand the relative position of people at even higher resolution. That is all something we do in the lab at Microsoft today. So I can look at this tablet, in this conference room; it can know the physical layout of the room because it can get it from an intranet site, and so it can actually position the people around the table. The little ritual of collecting everyone’s business cards, trying to figure out who was that person and which one is where, and arranging them around the table in front of you: all of that would just show up on the screen of your computer. You would know who they were, they would have essentially exchanged their business cards, and you would be able to go on from there. And so there are many, many things that I think will surprise people and will change your experience.

Another thing that I think will happen quite quickly now addresses one of the great frustrations for any of you who have to do what I do all the time: you go out to visit somebody, and you want to make a presentation, or you want to have a discussion. Today, if you look at a picture of a conference room, it might have a lot of whiteboards you can go up and draw on, and you probably see a conference table with some kind of display controller on it and a giant batch of cables hanging out, where you try to hook up your laptop. That’s always a challenge to get to work right, and it wastes a lot of time. The new radio technologies will actually give us a wireless ability to discover and interact with these high-resolution displays. So it’s pretty easy to say that the whiteboards essentially become projectors like the ones on the back of this desk, and the ability for my computer, and all the other computers in the room, to discover those displays and put things up on them becomes a completely automated process. There are no wires.

In fact, this is a precursor of what will happen not only on the computer but on pretty much all the consumer electronics in your home. If you have a home theater and have ever looked behind it, or had to hook it up yourself, there’s probably a mat of wiring back there. In fact, the biggest problem most people have when they buy these new technologies is just getting them hooked up. The same is true with personal computers as we integrate more and more interfaces into that environment. Even the Universal Serial Bus, the latest simplified technology for hooking things to your PC, still produces a huge array of wires cascading to all the devices. It will be fairly economical in the next couple of years to replace those wires with what we call ultra-wideband radio technology, which gives very low-power, very high-performance interfaces to each individual device within the span of your desk, or of the media equipment in your home. Pretty much every wire except the power cord will be eliminated. And so the ability to hook up complicated systems will fall to the software within them, and not to your ability to read the manuals, put the color-coded adapters into the color-coded holes, and make it all work with great difficulty. So whether it’s in business or at home, people will find new ways of interacting.

In the future, when you have this capability, if you want to go to the whiteboard, you’ll just tap on that ultramobile PC and draw on that surface. And whatever you draw will essentially be projected in real time on the board. If I project a PowerPoint slide, and Sam wants to comment, or tell me I’m crazy, he can take his and draw on my PowerPoint slide while I’m up there talking. And so I think there will be many new ways to facilitate an exchange. And it’s the same with the tablets here: while you’ve been taking your notes on them and writing on them, these don’t happen to be yours, so when you leave today, if you want your notes, we’ll put all of that on a little USB key and you can take it home, plug it into your computer, and get it all back.

The ability to get a PowerPoint presentation of course becomes automatic, too. So just as the business card exchange could be done, you could exchange the documentation, or even the annotated discussions, with somebody. So many of the things you’ve seen people experimenting with in recent years, smart whiteboards, the Xerox-style machines that scan the boards and capture your drawings and pictures, all of that will become implicit in the interaction between you, the computers, and the different display surfaces. Of course, it will take a long time to change society’s infrastructure in this regard, too. And so one of the things that will help us bridge between that world and the world we have today is the ability to use cameras and other types of video and vision-based processing to facilitate a lot of these interactions.

So, let’s assume I’ve finished the meeting, and I’m going to head off to the airport for a different meeting. So I pick up my Pocket PC phone. It will of course remind me that it’s time to get on this flight out of town, and it basically brings together a whole bunch of additional information. One of the things it can also do, as I pick it up and get in the car and head to the airport, is say, OK, all communications will now be routed to this device. So just as was the case with Kim in the earlier vignette, the device that happens to be on your person, or in proximity to you, should be the focal point for any type of urgent interaction, certainly for communications. There may be lots of other cases, and in fact if an algorithm determines that it’s not appropriate to interrupt me now, then the interaction can be scheduled for another device that I end up near later. In fact, it’s scheduled much like a meeting and will present itself to you at an appropriate time and in an appropriate location.
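A minimal sketch of that routing decision, with invented device states, might look like the following: urgent items go to whatever reachable device is on my person, and everything else is deferred and scheduled like a meeting.

```python
# Invented device states for the person carrying a Pocket PC phone.
devices = [
    {"name": "office desktop",  "on_person": False, "reachable": True},
    {"name": "Pocket PC phone", "on_person": True,  "reachable": True},
]

def route(message_urgent):
    """Deliver urgent items to the device in hand; defer everything else."""
    in_hand = [d for d in devices if d["on_person"] and d["reachable"]]
    if message_urgent and in_hand:
        return "deliver now via " + in_hand[0]["name"]
    return "schedule for the next appropriate device and time"

print(route(message_urgent=True))
```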

So another thing it tells me is that I’ve gotten to the airport and sat down in the lounge, and it says, oh, there’s a complimentary e-table available. One of the things that we’ve been playing with is the idea that, as display technologies become less and less expensive, having them available to augment what you could otherwise do on your tiny personal devices becomes an interesting thing to do. Of course, there are lots of potential issues about privacy and other things, but to the extent that simple tasks can be done in this environment, it might be useful to have both a graphical user interface and new ways to do that.

So this table is one that’s been outfitted with what you see exposed here: an infrared illuminator and an infrared camera. The camera basically just watches this table, looking through infrared pictures for shapes that it recognizes. So in this case it has recognized my Pocket PC. It says, Hello Craig; it can figure out who I am because the device presents a credential. And it says, if you want to use this table, log in with your fingerprint. So I do that, and this table has become an auxiliary display for my little pocket computer. It’s important to realize that as computers get more and more powerful, these devices even today have the computing power and storage capacity of the desktop of probably just three or four years ago. So they’re not really weaklings in terms of what they’re able to do. And certainly, when coupled with a network, with the ability to draw information from other places, to bring in selected video, for example, like the clips I decided I wanted to track in the morning, all of that can be held on this device. And if I wanted to watch them here instead of holding up a little screen, this gives me a way to do it.

But this also becomes a place where I can do other simple tasks. So here I’ve got a couple of business cards that I actually brought with me. And, of course, they’re recognizable by the camera now, too. And as the resolution of camera processing and the related software gets better and better, it gives me the ability, for example, to review these things here, and to e-mail these people that I didn’t know well, and maybe never met before. So business cards now can show me a picture of the person, to help me refamiliarize myself with them, or with what they said in the meeting.

In this case, I actually happen to get this business card, and on the back there was a note that said, send me the first-quarter marketing presentation. So recognizing this handwriting is actually no different technologically than recognizing the handwriting on the tablets that are in your laps right now. We can do it in any orientation. We can pretty much do it in any reasonable size. And so I can take that, I can essentially register that on the card, and I can say, all right, I want Andrew to be someone that’s in my contacts, and the act of putting it on the desktop up there may, in fact, initiate a transfer of that into the device. And that’s all I’ve had to do to essentially record the note, get the identity, affirm that I want them to be a buddy or a contact, and essentially continue to move on that way.

So, as I sit here, I get a notice from Thomas Anderson, the guy we dragged into that videoconference earlier: they finished the meeting, they thought about what we should do, and they’ve decided the company should issue a press release with the materials, and they want me to authorize it. So here it comes in as a piece of e-mail. I can review the press release. They really want me to sign it, so I use my biometric identity to apply my signature to the document; it is digitally signed and sent back to them. So there are many, many things that could be done, and here we’re trying to demonstrate that there will be lots of devices and lots of displays in your life. There will be lots of displays in the world. Think of commercial airplanes: in the future the seatback will have a display on it. In that case, instead of only being able to show you whatever the airline has determined a priori, whether it’s the movie they had or the information they wanted to give you, it could be just like this table. It can be an adjunct display, and I have the ability to put different things on it.
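As a simplified illustration of the signing step, and not the actual mechanism in the demo, the document could be hashed and signed with key material released only after the fingerprint match. HMAC is used below only so the example runs; a real system would apply an asymmetric signature tied to the user’s credential.

```python
import hashlib
import hmac

def sign_press_release(document, unlocked_key):
    """Return a hex signature over the document's SHA-256 digest."""
    digest = hashlib.sha256(document).digest()
    return hmac.new(unlocked_key, digest, hashlib.sha256).hexdigest()

# Hypothetical key material, imagined as released only after the fingerprint match.
key = b"released-only-after-fingerprint-match"
print(sign_press_release(b"FOR IMMEDIATE RELEASE ...", key))
```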

So in these vignettes I’ve tried to show you a number of things. In fact, this thing tells me it’s about time to go board the plane. So merely picking up my device logs me off from the table, and I can go on my merry way. In these vignettes my goal was to show you not what will, in fact, actually emerge, but what could emerge. Sometimes it’s hard for those of us who have gotten acclimated to one way of doing business, one way of thinking about computing, to think beyond “I’d never do that.” But talk to young people growing up today, whose baseline for computing in the next few years is things like the Xbox 360 that Bill showed yesterday.

Even when I look at those things, like the basketball players and others in those gaming machines, I don’t know if you’ve had this experience yet, but a lot of times now, even with the last generation, you’d be walking around a trade show, you’d see these big monitors, and I’d glance at them out of the corner of my eye and think I was watching television. The fact of the matter is, I’m not watching television, I’m watching a real-time synthesized image that just looks like television.

So as people grow up acclimated to that, as they think it’s completely natural to have wireless earpieces and pocket computers and refrigerator displays, and desks with big high-resolution monitors, they will acclimate to these things. And as the computers change in the next few years, they will be qualitatively different. These demos this morning are intended to make you think about what the world would be like if I could trust the computer the way I trust a great personal assistant; if the way I interacted with it was very similar; if I could have abbreviated, natural language conversations with the spoken word, or a quick typed message; if I had the ability to move things around iconically that represent huge bodies of information; if I could task these computers to do things which, frankly, even if people could do them, it’s very hard to believe they would do them reliably.

The wonderful thing about computers is that if you task them to do something, they’re happy to do it over and over and over again. If you say, look, I want you to monitor for the next five hours everything that’s happening in the blogs that relates to our company, the way these things scale is fundamentally different from the media we have known in the past. If you went to your PR department and told them to check everything that’s happening in real time in a traditional media environment, a few of them could say, OK, we’ll do it, we’ll get our people on the phone, we’ll watch CNN and a few other TV stations, and we’ll think we know what’s happening. That won’t work in this new world.

The rate at which you can engage a huge number of people on a global basis with instantaneous communications is just something that’s very different, and it has a qualitative effect, not just a quantitative effect. So you need to think about how you’re going to be able to take these technologies and essentially turn them to the tasks that allow you to adapt to those kind of events.

So it is always interesting to me to watch small teams of people who do prototyping, such as the ones who put this together for us this morning. As they go out and interact with their colleagues at Microsoft, they have very fertile imaginations, and I always find it quite remarkable how quickly they’re able to hook these technologies together conceptually to do things that might never have occurred to me, or to any single person. But this idea of collaboration, of course, is a powerful one, and these technologies will allow it to happen more and more broadly.

We’re clearly entering a new world; many of us, again, talked about globalization in other meetings. Bill referenced Tom Friedman’s book “The World Is Flat.” Today we face the challenge of interacting on a scale and across a geography that is very, very different, and certainly on a time scale that is very different from what many of us grew up with. And all these technologies will not just be seen as fanciful or nice; I think that to get maximum effect in that environment, they’ll prove to be essential.

So the idea today is to give you a sense of what the next world of technology might be like, and this isn’t a distant dream. We think these things, in some form, will come to Best Buy, Dell, and other places in the next few years.

So I’ll just close by saying that one of the key ideas here is the display surface. The only really high-bandwidth input mechanism for humans is the visual system. The reason that so many of these things tend to have a display as the focal point of the vignette is that it is a very powerful way to interact with people. Speech is good, but consider the old adage that a picture is worth a thousand words: where did that come from? It came from the fact that you can convey so much more visually, and in fact the human visual system is an incredible processing system for pattern recognition and other things. So we think computers will become more visual, more visually oriented, and as such it gives us these capabilities.

The things that are driving high-definition displays into the home, and flat panels onto every desk, and the fact that they’re built with the same silicon technologies that are making computers so tiny, give us the ability to expect display technologies to improve at this exponential rate for quite a few years. That was not a characteristic of CRTs, the cathode ray tubes we all grew up with and called television sets, or the big CRT-based monitors. Those technologies were not on this exponentially improving curve; despite their steady improvement, it was not at the rate that computing was improving underneath them, and it certainly was not at the rate that you’re going to see for all these new types of projection and flat panel displays.

So we do foresee a world that, you could say, starts with the things I’ve shown you this morning and maybe ultimately ends up with the things Neil showed you yesterday, where the walls of your house, your desktop, or your conference room may in fact be display surfaces, the whole wall, at high resolution. You might be able to go into a meeting and not just put a big PowerPoint slide up there, but do what we were doing on this desk: take individual documents, arrange them on the wall, arrange them on a large desk surface, and read them right there, as if they were paper.

That takes a huge amount of computing power, it takes a huge amount of input-output capacity, and of course it takes a very high-resolution display. But if you can fabricate things at the scale and with the processes that Neil talked about yesterday, it’s not out of the question that in 10 or 20 years’ time that, in fact, may seem normal.

Thanks for your attention. I hope you enjoy the rest of the day. And we’ll see you again.
