Speech Transcript – Gordon Mangione, Professional Developers Conference – 2003

Transcript of Remarks by Gordon Mangione, Corporate Vice President, SQL Server Team, Microsoft Corporation
Microsoft Professional Developers Conference 2003
Los Angeles, California
October 28, 2003

EDITORS’ UPDATE, Aug. 27, 2004: Microsoft has announced it will target broad availability of “Longhorn” in 2006, and make key elements of the Windows WinFX developer platform in “Longhorn” available for Windows XP and Windows Server 2003. See press release.

GORDON MANGIONE: My name’s Gordon Mangione. I’m here to talk to you about two things today: SQL Server “Yukon,” which, clearly, I still have to clear up what it is, and “WinFS,” which is the heart of what we’re doing in “Longhorn” and how we’re going to store information. But before I get started, I really want to thank all of you in the room. It’s amazing what you’ve been able to do on our platform.

You know, I have a couple of customers up here who’ve done some pretty amazing things. Verizon is now storing 17 terabytes of information inside SQL Server, including five terabytes in a single instance. NASDAQ is rolling out SuperMontage, their new trading system for downstream ECM: over 70,000 transactions a second on a standard eight-way machine.

JetBlue has moved all of their frequent flyer plans onto our platform, and just recently upgraded to 64-bit and saw the power of that platform. This is just an example of what some of the folks have been able to do on our platform, and for that, thank you very, very much.

OK, let’s get on to “Yukon.” I’m going to spend the next half an hour drilling down on every single feature in the release. (Laughter.)

Let’s start with Multiple Active Result Sets, or MARS, a feature truly out of this world. In seriousness, the demo team made me do that. In all seriousness, you’re going to drill down on all of these. We have a number of breakouts on “Yukon.” We’re going to get into all of these feature sets. For you, what “Yukon” really is, is a big, big, big release for SQL Server. All of the features, all of the feedback that you’ve given to us — it’s going to be a fantastic release.

It’s been in Beta with about 2,000 people for about six months now. You’ve all got a copy of it. What I ask of you, leaving here, is go get your apps running on top of “Yukon” and give us feedback before Beta 2. At the highest level, every release we do in SQL Server is going to focus on enterprise data management. You know, there’s never enough scalability, there’s never enough reliability, there’s never enough ease of use in the platform, and this is really what the DNA of the team is made up of.

This is what we’re going to concentrate on in every single release. But this release, I really tasked the team to take to heart something I call serviceability. Serviceability is about giving you the tools in the database so that if something goes wrong, you can figure it out immediately. In many ways, I described this to the team as “no repro required.” Now, there’s nothing you hate more than when something goes wrong and you go and look out, and you have to turn on three trace flags and repro the problem in production.

This is all about the database maintaining those stats, making it just a query away so you can find all of the information about what’s going on in the database. But I know the thing you all want to hear about is developer productivity, and that’s really where I’m going to drill down in the next couple of slides and talk about what we’re doing with Visual Studio “Whidbey” and SQL Server “Yukon” to really make them just fantastic together.

And finally, business intelligence is probably the hottest area in the database trends today. It’s all about getting the right information to the right people to make timely decisions for their businesses. I’m sure you’re all getting asked about it. How do I get the right data? How do I get it into the format that my end users want to consume it in? It’s great that all that data goes into the database, but it’s useless unless it comes back out as information that I know what to go and do with. So, we’ll do a bit of a drill down on business intelligence, and there’s a lot of tracks on that today as well.

You know, when you think about .NET development, you go through a phase as a developer. You do designing, you do some coding, you do some tuning, some debugging, some redesigning; if you’re really not lucky, you do some re-architecture. In many cases, that’s the last thing a project leader ever wants to hear, is re-architecture, because that usually means schedules move out. So it seems to be what all developers want to go and do.

And then, you have to go and deploy, and then you start the cycle all over again. And we’ve done a lot of stuff in Visual Studio to really help you manage this cycle. In many ways for the data developer, we haven’t provided the same set of tools. And with “Yukon,” and with Visual Studio “Whidbey,” we’ve really integrated these products deeply together. We’ve taken the .NET Framework and embedded it in everything we’ve done, and we’ve taken all the tools that you expect to have as a .NET developer and made them available for data developers.

Source control management, project management, language of choice, the way you do integrated debugging, cross-language debugging: all of those services you expect as a .NET developer are now part of the database. In many ways, I think about this as a data developer’s nirvana, where these two worlds come together, and you have all the services you need. This is truly a cheesy slide. (Laughter.) So cheesy, I had to put Cheez Whiz, Cheetos and a pizza on the slide. But this is really about bringing these two worlds together and giving you the tools that you need to go and manage your project.

So how do we do this? At the heart of everything we do in “Yukon” is the .NET Framework. You’ve probably heard about writing stored procedures in the language of your choice as part of the .NET Framework, or triggers, or user-defined types, or user-defined functions. But it’s really so much more than that.

You want to write a data transform, you can write it in the language of your choice as part of the .NET Framework. And you can actually get source control and project management for checking in your transformation, something, frankly, that is very, very hard to do in SQL 2000 today.

You want to write a custom resolver for replication, write it in the language of your choice as part of the .NET Framework. You want to write a report or a report renderer, do it in the .NET Framework. In fact, we ourselves use the .NET Framework to develop “Yukon.” Reporting services, which I’m going to introduce in the next slide, was written entirely in C# on top of ASP.NET.

All of our management facilities and all the IDE tools that we built on top of the Visual Studio IDE are built in managed code. We really believe we have to eat our own dog food and develop apps the same way you develop apps so that we know we’ve got this right. One of the big benefits we got out of this is all of the services we have in “Yukon” are exposed as Web services. You can now take a stored procedure inside your database and actually expose it as a Web service.

Now, some of you are probably scratching your head and going, ‘Why would I want to do that?’ You know, a question I get asked a lot is, ‘Gee, Gord, when are you going to port SQL Server to other operating systems?’ Not going to happen. When it comes right down to it, there’s just too much of an opportunity cost, too much I can’t take advantage of in the underlying operating system, too many things that, frankly, if I had to port to 17 different flavors, we wouldn’t go and do.

But the thing that’s critically important is interoperability with everything else you have in your environment. It’s why we’ve done things like invest in all the OLE DB and ODBC drivers to other data sources. It’s why we built the JDBC driver for SQL Server, and it’s why Web services is so critical to everything we do, because it allows you to get access to your data independent of whatever machine you’re trying to connect to.

So not only can you expose the database as Web services, but your analysis services is now exposed as Web services. You can do data transforms from Web services to bring data into your database. You can expose reports as Web services. You can even make calls into the management facilities as part of Web services.

This is our commitment. We know that interoperability is critically important to you, and Web services and the .NET Framework are at the center of everything we’ve done in this release.

The Microsoft business intelligence platform really has three key things. At its center, I just want to keep it simple. Everything you need to build your data management applications as part of the platform, we’ve included in SQL Server. The same way we’ve had analysis services in SQL Server, the same way we put data transformation services in there, we’re going to do exactly the same thing with reporting services.

Reporting services is in Beta 2 today. You can actually get a copy of it in the Microsoft booth, on the show floor. I’d encourage you to go get this, because it’s going to make it dramatically easier for you to build reporting applications in your environment. But the idea here is just keep it simple.

All the tools you need for developing business intelligence applications run inside Visual Studio. And while we’re going to do great stuff on the Web and great stuff with HTML, it’s all about delivering the information in the format that your end users want to read it. Rich e-mail, Word documents, Excel documents are among the main ways we want to deliver this information.

Probably the best way I can describe this is tell you a little bit about a customer and how they’re using this today. MGM Mirage owns 18 hotels in the Las Vegas area. Their smallest hotel has 3,000 rooms — most of them have about 5,000 or 6,000 rooms. MGM Mirage has over 50,000 slot machines. Every single nickel, dime, quarter, Susan B. Anthony silver dollar that goes into those slot machines — transaction against SQL Server.

Literally, millions of transactions a day flow into SQL Server, which is impressive in itself. But frankly, what’s more important is, how do you get that into real information that people can make decisions on? How do you figure out it’s that person over there that we want to go give a surf-n-turf dinner to, to keep them in our casino, as opposed to going next door? And all of that has to happen in real time.

They’ve recently upgraded their system to 64-bit. It took them three hours to upgrade to 64-bit: same protocols, same database format, same execution environment, everything was identical. They literally detached their database on their SAN from their 32-bit machine, attached on their 64-bit machine, did a little bit of testing, three hours of testing (I’m uncertain that they should have cut over as quickly, but they did), and they’re up and running, with no changes to their applications whatsoever.

But reporting services is really the key new feature. We got such great customer demand on this, we felt we had to make it available early. It was originally only going to be a feature of “Yukon,” but the customer demand for it was so great that we decided to bring reporting services to market early. We just released the Beta a couple of weeks ago, and as I mentioned, you can get a copy of it on the show floor.

This is all about being able to create reports from whatever data source you want. Just like analysis services can work against Oracle databases, or IBM mainframes, or any data source, reporting services can do exactly the same. It’s a way that you can define your report once, yet render it in many different formats.

You can render it on the Web in HTML, create rich e-mails, Excel spreadsheets, PDF files, Word documents, and make them available on the Web or, if you want, e-mail them around as part of batch delivery on the back end. It’s an area where, frankly, we’ve had to hold back customers going into production, because they’re so excited about this, and it really makes developing apps so much easier.
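
EDITORS’ NOTE: The “define your report once, render it in many formats” idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Reporting Services code: the report definition, `render_html`, `render_csv` and `render` are hypothetical names, and CSV stands in for a spreadsheet format.

```python
import csv
import html
import io

# Hypothetical report definition: one title, column names, and rows.
REPORT = {
    "title": "Claims by status",
    "columns": ["status", "count"],
    "rows": [["paid", 12], ["rejected", 3]],
}

def render_html(report):
    """Render the shared definition as a small HTML table."""
    head = "".join(f"<th>{html.escape(c)}</th>" for c in report["columns"])
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
        for row in report["rows"]
    )
    title = html.escape(report["title"])
    return f"<h1>{title}</h1><table><tr>{head}</tr>{body}</table>"

def render_csv(report):
    """Render the same definition as CSV text, e.g. for a spreadsheet."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(report["columns"])
    writer.writerows(report["rows"])
    return out.getvalue()

# One definition, several output formats chosen at render time.
RENDERERS = {"html": render_html, "csv": render_csv}

def render(report, fmt):
    return RENDERERS[fmt](report)
```

Adding another output format in this sketch means adding one more entry to `RENDERERS`; the report definition itself does not change.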

So with that, I’d like to bring Tom to give us a demo of “Yukon.”

Tom?

TOM RIZZO: I’m coming.

GORDON MANGIONE: Good morning, Tom.

TOM RIZZO: Good morning, Gord. So Eric Rudder got fat-man — you’ve got fat man standing right here.

GORDON MANGIONE: OK, short jokes are OK. Fat jokes are definitely out of the question.

TOM RIZZO: That’s right. You know, standing next to Gord, I realized why he loves me doing his demos. We must see eye to eye on every single demo here.

GORDON MANGIONE: No, it’s just because you make me look tall, buddy.

TOM RIZZO: That’s right. Oh, it’s killing me up here, it’s killing me. It’s like beauty and the beast. Anyone else calls him beast, I’ll beat him up. So anyway, let’s get to talking about “Yukon.” I know you’re all excited about “Yukon.” There’s three key things we’re doing inside of “Yukon” for you, as a developer.

Now, one of the key things is integration with XML and XML Web services, as Gord talked about. We’re also doing a lot inside of “Yukon” to integrate in with Visual Studio “Whidbey,” the .NET Framework, so that you can write your stored procedures, your user-defined functions, user-defined aggregates, all in the language of your choice, C#, VB. You decide what you want; we’ll make it work.

GORDON MANGIONE: So this is the workbench that brings together all the tools that we used to have as separate tools in SQL 2000.

TOM RIZZO: Exactly. We’re going to show you a scenario where I work at Contoso Health Care. They connect doctors to health care providers, do all of the claims processing, all of the billing. Imagine I’m sitting in my office as a developer, my boss comes in, says, ‘Hey, smarty pants. Mr. Developer. Tom, you went to the PDC, you learned about some great technologies, things like “Yukon,” “Whidbey,” Long John, even Whitehead you learned about.’

And we know every boss gets the code names wrong. So, you know, they say, ‘Show me how I can be more productive, you can build applications that make us more productive as Contoso Health Care.’

GORDON MANGIONE: Great.

TOM RIZZO: So I’m going to go and do that. As you mentioned, this is the new workbench. This is entirely written in C#. It’s the new management interface for “Yukon,” and you’ll see a couple of things here. We wanted to bring all of the disparate experiences you had in SQL Server 2000, in terms of writing your code and managing your servers, into one simple interface.

So directly from here, I can create my T-SQL queries, I can create my analysis services queries. I can even use the new XML data type and XQuery right inside of the same interface.

GORDON MANGIONE: We’d love to get feedback from you on this start page too. We’re trying to get some of the most common tasks that DBAs and developers have to do onto this start page, including connecting to your communities. But actually start to play with it; we’d love to hear your feedback on what should be on this page.

TOM RIZZO: Exactly. And you pointed it right out. Communities are built right in. If you have a question on how to do something in XQuery, T-SQL, whatever, you can go out, look at the newsgroups, and be able to ask those questions and get your answers, and be productive as a developer.

But I need to go, and I have all these different data sources. I need to be able to access those. The first thing I’m going to do is I’m going to write a T-SQL stored procedure. And you’ll see a little bit different user interface here. One of the things you told us, as developers, was that a lot of times it was hard to write all the in parameters, out parameters, set default values. There was all this metadata stuff around your stored procedures.

We wanted to simplify that process and make it much, much more seamless for you. So you’ll see we’ve added in a graphical layer: just point and click to change things. So watch my alter procedure down here. I’m going to go, and maybe I want provider ID to come in and set it as an out parameter, so I just select it there. And maybe I want to change the data type to something else. We have all the different data types, including the XML data type.

I won’t touch that, and maybe I’ll do a default value of 56. When I click outside, suddenly, it’s changed down below.

GORDON MANGIONE: Yeah, we realize you have a big investment in T-SQL. So not only have we done things in the editors to make it easier to work with, we’ve also done things like added structured exception handling to make it easier to debug these applications. And obviously, new data types like XML, that are really at the center of a lot of what we’ve done for developers, are equally available in T-SQL as they are in the new languages.

TOM RIZZO: Exactly. And then, if we scroll down and look at my stored procedure, you can see, hey, that’s a select statement. And then, what is this down here? You’re probably saying to yourself, ‘Hey, Jim Allchin got Don Box, great coder. I’ve got Tom Rizzo — can’t even write T-SQL here. What is this crazy looking thing? Is it Klingon? What is it?’

GORDON MANGIONE: I think the XQuery guys even struggle at times, writing those queries.

TOM RIZZO: Exactly. It’s XQuery, as you pointed out, inside of here. And this is really great, because we have this idea of cross-domain queries inside of “Yukon.” We have a new XML data type. You can store your XML natively inside of SQL Server, get it back out, and then you can even query into that XML data type using the new XQuery standard. So right here, we’re combining relational data and XQuery data all in one single query.

GORDON MANGIONE: You can index it, content index it, just like you would any other data type that’s part of the system.
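
EDITORS’ NOTE: As a rough analogy for combining relational data and an XML query in one statement, here is a minimal Python sketch. It assumes SQLite with XML stored as plain text and Python’s ElementTree path syntax in place of XQuery, so it is not the “Yukon” XML data type; the `providers` table, the sample rows and the `xml_values` function are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE providers (provider_id INTEGER PRIMARY KEY, name TEXT, profile TEXT)"
)
conn.executemany(
    "INSERT INTO providers VALUES (?, ?, ?)",
    [
        (56, "Dr. Example",
         "<profile><specialty>Cardiology</specialty>"
         "<clinic>North</clinic><clinic>East</clinic></profile>"),
        (57, "Dr. Sample",
         "<profile><specialty>Dermatology</specialty>"
         "<clinic>West</clinic></profile>"),
    ],
)

def xml_values(doc, path):
    """Return the text of every element matching an ElementTree path, comma-joined."""
    return ",".join(e.text or "" for e in ET.fromstring(doc).findall(path))

# Register the Python function so SQL statements can call it by name.
conn.create_function("xml_values", 2, xml_values)

# One statement: a relational filter on provider_id plus a path query into the XML column.
row = conn.execute(
    "SELECT name, xml_values(profile, 'clinic') FROM providers WHERE provider_id = ?",
    (56,),
).fetchone()
```

In this sketch the XML is parsed again on every call, and nothing about the stored XML is indexed.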

TOM RIZZO: So now, we’ve got our stored procedure; we’ve created our stored procedure. We need to be able to get out that stored procedure. And you talked about the new Web services integration directly inside of “Yukon.” And here’s an example of how easy it is to create Web services definitions for your stored procedures directly in “Yukon.”

You see here, all we do is create what’s called an endpoint. We set some metadata around here of what security we’ll want, what port we want. And then, right here, we say, here’s the stored procedure that I created in T-SQL earlier — what method I want to call it and how I want to create it. So instead of talking about it, let’s actually run it.

Wow. Good demo, right? It’s “command completed successfully.” That’s always good. But let’s actually show using it now. Now, we could go into Visual Studio and add a Web reference and use our Web service like that, but instead I’m going to show you the new InfoPath client. And InfoPath understands XML data and understands Web services data.

So I’m going to create a simple application for my information workers to be able to get doctor information and be able to set that information from a client that they understand. So I’m going to design a form in InfoPath, and I’m going to say I’m creating a new data source. And obviously, InfoPath supports static XML and databases through ADO and ADO.NET, but we’re going to use that Web service we just created, and then I’m going to tell it I want to receive and submit data.

So you can not only read data, you can actually send data back. And let me type in the URL to my Web service.

GORDON MANGIONE: Now, we’re showing InfoPath to show you how to consume this. But if you have any development tool that knows how to connect up to a Web service, you could now connect directly to “Yukon” to get at that data.

TOM RIZZO: Exactly. It looks at my stored procedures, says hey, there’s two methods defined in your stored procedures. There’s a select one, which I’m going to use to select my data, and then I tell it to insert my data using this other method. And guess what? It looks at the WSDL, looks at the stored procedures, says, hey, Tom, there’s a bunch of complex information in here. If you want to map it to a field in InfoPath, you can do that.

It realizes my XML column is a complex type. I can go through and modify my information here and map it to my XML inside of my InfoPath form. So I’m just going to do that here, and then I’m done. And now, let’s actually quickly create a form and show how easy it is to use Web services, create stored procedures, and use them off of InfoPath.

So the first thing I need to do is I need a query form. This is my query form to enter in my information to run against the database. I’ll just drag and drop the provider IDs, the unique identifier for the doctor. And then, I’m going to go to my data entry view. And if the query is successful, it will return back all the data into my data entry view. I’ll select my data source again. And remember, this is a complex set of data coming back — I’ll just select it all. We’ll be crazy. We do crazy things here.

We’ll select it all; we’ll drag it. Now, normally, if you drop this onto other types of forms packages, it wouldn’t know what to do with it. InfoPath — smart. It says, ‘Hey, that’s a lot of data. What do you want to do with it? How do you want me to lay it out for you?’ I’ll say, ‘InfoPath, lay it out as a section with controls.’

GORDON MANGIONE: So the self-describing nature of XML tells it which fields to use and what controls to put on there.

TOM RIZZO: Exactly, exactly. It realizes the XML can be a repeating section with repeating tables in it. So enough talking about how we designed the form. Let’s actually go and use the form.

So I’m going to go back and preview my form in InfoPath, put in doctor ID 56, and bam.

GORDON MANGIONE: Very cool.

TOM RIZZO: We’ve used our Web service from Yukon that calls our T-SQL stored procedure. (Applause.)
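
EDITORS’ NOTE: The endpoint idea demonstrated above, a named Web method mapped onto a stored procedure together with some port and security settings, can be sketched roughly as follows. This is a hedged Python illustration, not “Yukon” syntax: it assumes JSON messages rather than SOAP, uses a Python function over SQLite as a stand-in for the stored procedure, and the names `ENDPOINT`, `get_provider` and `handle_request` are hypothetical. The `port` and `auth` entries are carried as metadata only and are not enforced here.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE providers (provider_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO providers VALUES (56, 'Dr. Example')")

def get_provider(provider_id):
    """Stand-in for a stored procedure: look up one provider by ID."""
    row = conn.execute(
        "SELECT provider_id, name FROM providers WHERE provider_id = ?",
        (provider_id,),
    ).fetchone()
    return None if row is None else {"provider_id": row[0], "name": row[1]}

# Hypothetical endpoint definition: settings plus a Web-method-to-procedure mapping.
ENDPOINT = {
    "port": 8080,          # assumption: illustrative value, not used by the dispatcher
    "auth": "integrated",  # assumption: illustrative value, not enforced
    "methods": {"GetProvider": get_provider},
}

def handle_request(endpoint, body):
    """Dispatch a JSON request of the form {"method": ..., "params": {...}}."""
    request = json.loads(body)
    procedure = endpoint["methods"].get(request["method"])
    if procedure is None:
        return json.dumps({"error": "unknown method"})
    return json.dumps({"result": procedure(**request.get("params", {}))})

response = handle_request(
    ENDPOINT, '{"method": "GetProvider", "params": {"provider_id": 56}}'
)
```

Any client that can build the request text and parse the reply can call the method, which is the interoperability point made in the remarks.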

GORDON MANGIONE: Wow, fantastic.

TOM RIZZO: But we’re just getting warmed up, just getting warmed up.

I know you’re all sitting out there saying, ‘But what about Whidbey? What about Visual Studio? I’m a Visual Studio guy/girl; I want to see code inside of Visual Studio.’ Well, before we get there, I’m going to make you wait a little bit. We have this portal that we created where the doctors can come up and do a little bit of self-service, and you can see here the boards and credit information. This is going to call a Web service that has event information so that the doctors can go and see what future events are available to meet their board certifications. This is an extra value that we’re adding as Contoso Health Care.

And we are pulling the doctor information using our T-SQL stored procedure. So T-SQL has no idea how to work with Web services. I don’t even know the function we would ever write in T-SQL to actually call out to a Web service and then combine that with SQL data.

GORDON MANGIONE: You’d have to write an extended stored procedure and you’d have to write an awful lot of code.

With Yukon there is no reason to ever write an extended stored procedure again. Please use managed code where you used to use extended stored procedures in the past, because you’re going to get better security, you’re going to run more in a sandbox, you’re going to be running in managed code, you’re not going to have to worry about pointers and everything else spraying in the data. It’s going to give you better security, it’s going to give you a safer environment and you’re going to be able to debug your code in a much, much richer way.

TOM RIZZO: So, since T-SQL doesn’t understand Web services, we’re going to write a user-defined function inside of C#. Now, this could be VB for the VB developers out there. We just decided to use C#. And you can see the first weird thing up here is, what is this “using System.Data.SqlServer”? A lot of you are probably used to SqlClient, but since we have an in-proc provider and the CLR is running in process inside of Yukon, we’ve decided to streamline the operation to make that provider not have to go through network layers or any of those sorts of things. It’s actually very, very fast to access with the CLR and SQL Server running in process together.

GORDON MANGIONE: Really what this does is it allows you to run in the scope of the transaction of the incoming call, as the impersonated user. You don’t have to round-trip to get back into the database. It’s really an optimization. It’s exactly the same method but it’s just an optimization so that you don’t have to round-trip out of the server.

TOM RIZZO: Now as a developer, you probably know ADO.NET. So what we’re going to use is a little ADO.NET code running within the SQL Server process. And then you can see here we’re going to call that T-SQL stored procedure that we wrote earlier. And then we’re going to go through and actually do our Web services call, do a little bit of mathematical functions to create that portal that we showed you earlier.

Now, a lot of you may be saying, ‘OK, C# calling T-SQL in process and SQL Server, hmm, that’s probably not a great experience for debugging.’

Well, I’m going to prove you wrong right here. Let’s just kick it up. We’ve got our break point inside of here. First, you’re going to see it actually compile all of my information and then deploy it and set the properties in SQL Server to automatically register this C# function for us as a user defined function.

So there we go. We hit our debug point. Now let’s step into our code, and you’ll see over here the languages are C# and T-SQL. We’ll execute it. This is a bug in the current build that you have. We have to come down here and double click. Bam, cross-language debugging from C# to T-SQL. Isn’t that cool? (Applause.)

GORDON MANGIONE: Wow.

TOM RIZZO: Now, I’ve put in some watches over here inside of my C# code. I’m going to step into this as well, and we’ll pop back to C# and our objects are filled in and it’s just a seamless experience. We can keep going through and run inside of our function here.

GORDON MANGIONE: Wow, very cool.

TOM RIZZO: So the final piece I wanted to show you is in the portal, we have some other information that we want to get out to our doctors. We want to show them their claims information, whether they’re in process, whether they’ve been paid or whether they’ve actually been rejected. So we’ll go back into our claims report here and this is using the new reporting services that you talked about.

A lot of you generate reports either using ASP.NET or other products. We’ve decided to try and make it as easy as possible for you to build reports using the tool that you use, which is Visual Studio.

So here’s an example of a report, and I can come through and drill into the information as a doctor on the portal and see all the different clinics that I work at and we see where all of my different claims are and then I can even double-click on any of these claims and see whether it’s been received, rejected or is in the middle of being processed.

GORDON MANGIONE: Wow, so I just went into the report designer and created that report. It’s automatically bound to the data, collected based on whatever parameters were inside these hierarchical reports and flows back the appropriate info.

TOM RIZZO: Now, you’re an executive.

GORDON MANGIONE: Keep it simple.

TOM RIZZO: Keep it simple, right, KIS, as they call it.

GORDON MANGIONE: Exactly.

TOM RIZZO: We’re going to go into the report designer. I’m going to switch machines quickly to show you the report design interface. And executives love charts and these doctors love charts as well. They’re all chart people.

So what we’ve done is you can see the designer, the report designer in here. We’ve integrated it directly into Visual Studio .NET so you don’t have to leave your development environment to create very rich reports. We’ve extended it though with report items built right in so you can drag and drop tables, images, even charts.

So I’m going to customize the report by dragging in a chart, and then we’ve even integrated in fields so you can drag and drop your database fields directly into your reports and be able to get very, very productive in generating these reports.

GORDON MANGIONE: So I could save this as part of my source control, project management just inside Visual Studio?

TOM RIZZO: Exactly. So I’m going to take the claim state, drop it into the data field, take my claim state here, drop it into my series. You’ll see the chart will start changing a little bit in the design view. And then I’m going to take the supplier name and put that into my category.

Now, I know you like WYSIWYG graphics. You saw the “Avalon” stuff yesterday. I have nothing as cool as that, but I do have pastel colors for you, since that will make you feel nice and relaxed.

GORDON MANGIONE: Ohh, excellent.

TOM RIZZO: And then I even can do 3-D effects. So I come in, I can do a clustered cylinder, hit OK, and you know what, deploying this report is as easy as developing it. You go right into Visual Studio, you click on the Deploy Solution and down in the output you see we’ll compile it, deploy it out to my server and then I’ve automatically got this new report on my Web site.

So if I go back to my portal page, we refresh our page here, you’ll say, ‘Hey, Tom, that’s not different.’

GORDON MANGIONE: Yeah, it looks the same.

TOM RIZZO: It looks the same, but if I come up here and I click Refresh My Report, bam —

GORDON MANGIONE: Very cool.

TOM RIZZO: — chart’s directly in there, but it gets even better than this. If I drill or I do any of the things I want, we’ll still get all that rich drilldown and all that rich information. But we can export all these reports out to many different formats without you writing a single line of code. It’s built into reporting services. So we support XML, we support TIF, Adobe. We even support Office formats like Excel.

So let me click here and export out our report to Excel, and you’ll see some of the great integration that the reporting services team has done with the Office team to show very, very rich reports. So we’re in Excel. You get all of the collapsible sections that you did inside of the Web page.

GORDON MANGIONE: Wow, very cool. (Applause.) Thanks, Tom.

TOM RIZZO: Thank you, Gordon.

GORDON MANGIONE: That’s just a taste of some of the things you can do. And, of course, we showed you the Web interfaces to get into reporting services. Where I think it really takes off is when you start to do parameterized reports over e-mail. Imagine now you could, on a scheduled basis, send out individualized reports in rich formats like Excel to your end users, driven all off of your back-end data sources.

OK, let’s switch gears. Let’s start talking about the client and “WinFS” and “Longhorn.” You all know about Moore’s Law; 18 months we double the number of transistors on a chip. There’s no abatement coming here. It just seems to keep going and going and going.

What’s remarkable though is disks are literally outpacing Moore’s Law. We triple the size of a hard disk every 18 months. We’re literally a very short time away from the time when each of you have a terabyte hard disk in your local machine. Heaven help us all if “find-first, find-next” is the way we’re going to find data on that box. (Laughter.)

You know, that poor guy talking about his wife being able to find stuff on his machine; I’m not going to be able to find stuff on my machine if I have a terabyte on my local hard disk.
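
EDITORS’ NOTE: To put the doubling-versus-tripling comparison above in numbers, here is a small Python sketch of compound growth. The 18-month period and the factors of two and three come from the remarks; the six-year horizon and the `growth_factor` helper are assumptions chosen for illustration.

```python
def growth_factor(rate_per_period, years, period_months=18):
    """Total growth after `years` if capacity multiplies by `rate_per_period` each period."""
    periods = years * 12 / period_months
    return rate_per_period ** periods

years = 6  # assumption: six years, i.e. four 18-month periods

transistors = growth_factor(2, years)  # doubling each period
disk = growth_factor(3, years)         # tripling each period
gap = disk / transistors               # how far disk capacity pulls ahead
```

Over four periods the doubling curve grows 16-fold while the tripling curve grows 81-fold, so the gap between them is itself about five-fold and keeps widening.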

And then, you couple that with how much data is being born digitally these days; it’s just truly amazing. The University of Berkeley yesterday released a study that said five exabytes of data was born digitally in the year 2002. Now, you’re all out there going, “What the hell is an exabyte and how big is that?” That’s five million terabytes of digital data; 400,000 terabytes of e-mail was exchanged last year, a lot of it get rich quick schemes, trying to sell you stuff, inviting you to parties, but still, 400,000 terabytes.
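
EDITORS’ NOTE: The unit conversion above can be checked with a few lines of Python. The sketch assumes decimal units (one exabyte equals one million terabytes). The ratio at the end is only a sense of scale, since the two figures quoted in the remarks may not measure the same thing.

```python
TB_PER_EB = 10**6  # assumption: decimal units, 1 exabyte = 1,000,000 terabytes

new_data_tb = 5 * TB_PER_EB  # five exabytes expressed in terabytes
email_tb = 400_000           # e-mail volume quoted in the remarks, in terabytes

# E-mail volume as a fraction of the five-exabyte figure.
email_share = email_tb / new_data_tb
```

With these inputs `new_data_tb` is five million, matching the figure given on stage, and the e-mail volume comes to eight percent of it.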

Just managing all that data, and we’re getting richer and richer data types all of the time. Video is becoming more and more commonplace. I just saw an amazing demo when I was over in Italy of someone that had built a data warehouse and taken all of their Nielsen data at the television station, and for the past ten years, you can go and look at all their Nielsen data, click on any point on the chart and stream out from NetShow what was playing on television at that exact instant in time.

We’re going to start to see richer and richer formats coming onto the machine and instead of having to manage thousands of files on our desktop we’re going to be managing millions and millions of files and items on our desktop.

Clearly, the technology that’s served us well for the past 20 years isn’t going to take us forward into the future, and that’s really where “WinFS” comes to play. There are three key tenets that we have in “WinFS.” Help users find their information: get rich metadata associated with their application, get all of the ways that you want to expose your data as elements and items inside the system; give users the ability to relate and developers the opportunity to relate items together so that users can have very, very rich navigation schemas for finding their data; and make the information active, allowing you to do synchronization, allowing you to be able to set up filters and Information Agents to alert you to incoming feeds as they’re coming forward.

Perhaps the best way I can describe this is to think what a PDC would look like once “Longhorn” is pervasive and running on our machines. All of the data that goes into this PDC is born digitally: PowerPoint slides, e-mail invitations, blogs that you participate in, photos that you take of the feature sets and keynotes, Web casts of the keynote. All of this stuff is born digitally. And at some point, you get a CD with a bunch of files on it, and you can go and peruse that data, and that helps once you get the CD immediately. But what if six months from now you want to go back and say, ‘Gee, I listened to this one talk at the PDC and I really want to find the code sample that was used on stage.’ Well, that’s really hard to do today unless you can formulate it as a query using full text to go and look for the data.

Ultimately, what we’d like to be able to say is, go to your calendar. You were at the PDC. You’ve related the PDC to the information you received, the sessions you attended, the blogs that you participated in, the e-mails you sent, the photos you sent, and do that in a very, very rich way on your machine.

You also frankly probably want to get notifications for sessions that you’re interested in, and that’s really where Information Agent comes into play. It can allow you to deal with all of the influx of information and really allow you to take control over how your screen real estate is used. No longer do you have to worry about every app trying to pop up toast or bringing up popup dialogues or sending you information; you are in control of all that data, because all of that data goes through Information Agent.

You know, everyone in this room is going to have to create two reports when they get back to their office. Your boss is going to make you do a trip report and unfortunately your trip reports are probably going to be watered down info of what you actually participated in. You know, with “WinFS” you’d be able to go back and say, ‘Here’s all the material I created, here’s the relationships that I’ve created through that data.’ And then you could share that data with your coworkers, including your own annotations, your own ink markups, your own data that you’ve collected as you participated in the show.

You’re also frankly going to probably have to create an expense report. And today, with expense reports in IT organizations, we’ve driven millions of dollars of cost out of the expense report generation system by having folks go to the Web. Most of you probably now have expense report systems where you go back to the Web, you key in all of your information. Some of you are probably lucky enough to be able to fill out an Excel spreadsheet and then mail it in afterwards, but most IT apps have moved to the Web. And they’ve done so for a very prudent reason: It allows them to integrate in with their workflow, it allows them to do consistency checking, it allows them to enforce their policies when you go to the Web page. That’s very, very hard to do today if you’re downloading an application.

With “WinFS,” you could have an expense report with an application that comes down or even just a customized spreadsheet that allows you to fill out that data. You could synchronize down your policies, the rules. Information Agent could fire to let you know if you’ve exceeded too much on your dinner last week. It could allow you to relate your contacts that you went out to business dinners with and really start to relate a lot of that information together.

But, you know, let’s get down to brass tacks: how have we actually built this thing? You’ve seen this presentation. Jim showed you the full “Longhorn” architecture. I’m really going to drill down on “WinFS” and how we actually built “WinFS.”

First and foremost, we built “WinFS” on top of NTFS. We have 15 years of investment in building streams on NTFS, doing things like byte-range locking, oplocks, streaming support out of the kernel, transmit file, ACL support. There’s no way we’re going to throw all of that out and start over again. Clearly, we had to start at the deepest part of the engine and embed it very, very tightly with NTFS.

At the same time, NTFS gives you the 17 columns of metadata, which you have absolutely no control over, and unless you could put it in the title of the document you’re saving you don’t get a lot of flavor of what you can actually put in the system.

So we had to couple together and really marry tightly together relational engine technologies with NTFS that allows us to create very, very rich metadata associated with streams.

You’ve probably all created an application, and I know I have in previous jobs, where you store a bunch of information inside a stream, and then you create what I affectionately call a “turd file” sitting next to the stream. The demos look great. The demos look fantastic. You go and you deploy this, everything looks great; but then there’s backup, and what do I do when the app crashes, and how do I build it, and you end up building all of these utilities for recreating these turd files when things go wrong.

All of that disappears with “WinFS.” You now have a consistent programming model, a consistent transaction model, everything you’ve come to expect inside a database available for you in this rich, rich file system.

But that really wasn’t enough. In order for applications to truly be able to share data, there has to be a data model that allows you to have concepts like containment, hierarchy, security, relations and that’s really where the data model comes into play. Every single element inside “WinFS” derives from items. Item has concepts like security, it has concepts like relations. You can inherit from items. You can even inherit from things that inherit from items to go and build richer systems, and that’s at the core of the data model, and there’s a great drill down in one of the schema talks on “WinFS” where you’ll be able to dive in and provide us great feedback on the data model.
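
EDITORS’ NOTE: The data model described above can be pictured with a short sketch. This is Python rather than the “WinFS” API, and every name in it (`Item`, `Document`, `Presentation`, `relate`, `related`) is invented for illustration. It only shows items carrying relations and types inheriting from types that themselves inherit from the base item.

```python
class Item:
    """Hypothetical base type: every stored element derives from it."""

    def __init__(self, display_name):
        self.display_name = display_name
        self.relations = []  # (relation name, target item) pairs

    def relate(self, name, target):
        """Record a named relation from this item to another item."""
        self.relations.append((name, target))

    def related(self, name):
        """Return every item reachable through the named relation."""
        return [target for rel, target in self.relations if rel == name]


class Document(Item):
    """A type that inherits from Item and adds its own property."""

    def __init__(self, display_name, author):
        super().__init__(display_name)
        self.author = author


class Presentation(Document):
    """A type that inherits from something that inherits from Item."""

    def __init__(self, display_name, author, slide_count):
        super().__init__(display_name, author)
        self.slide_count = slide_count


session = Item("PDC session")
deck = Presentation("Session slides", author="speaker", slide_count=30)
session.relate("materials", deck)
print([item.display_name for item in session.related("materials")])
```

Because `Presentation` is still an `Item`, anything written against the base type, such as the relation lookup here, keeps working for it.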

We also felt that wasn’t enough, because there’s going to be certain core schemas that you’re just not going to want to be different on the system. Things like documents, people, groups, multimedia all have schemas that are built and available as part of the operating system that your end users can use. But more importantly, you can relate to or you can extend to provide your own data.

You know, as an example, think about contacts. You probably all have 17 different ways that contacts are described on our local hard disk today. We have ways to synchronize with our cell phones. We have things in our CRM applications. We have our e-mail applications with addresses that we go to. And there are some apps that just have to add contacts because they need name fields.

With “WinFS,” all of those applications could derive and extend from contacts in order to be able to incorporate that into your application. So now your CRM app automatically gets something that’s going to synchronize with your cell phone. Automatically, it has other applications pointing to the contacts that are in there because you can go in as developers and extend those concepts.
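
EDITORS’ NOTE: A minimal sketch of the idea in the paragraph above, in Python and with invented names (`Contact`, `CrmContact`, `export_to_phone`). It is not the “WinFS” contact schema. It only illustrates why a routine written against a shared contact type also works for an application’s extended type.

```python
class Contact:
    """Stand-in for a shared, built-in contact type (fields are hypothetical)."""

    def __init__(self, name, phone):
        self.name = name
        self.phone = phone


class CrmContact(Contact):
    """An application-specific extension that adds its own data."""

    def __init__(self, name, phone, account_id):
        super().__init__(name, phone)
        self.account_id = account_id


def export_to_phone(contacts):
    """A sync routine that only knows about the shared Contact fields."""
    return [(contact.name, contact.phone) for contact in contacts]


contacts = [
    Contact("Ann Example", "555-0100"),
    CrmContact("Bob Example", "555-0101", account_id="ACCT-7"),
]
phone_book = export_to_phone(contacts)
print(phone_book)
```

The CRM record travels to the phone without the sync routine knowing anything about account numbers, which is the benefit being described.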

At the same time, we needed a set of services that you can depend on being there and that’s really things like synchronization between “WinFS” systems, but more importantly synchronization from “WinFS” to your existing data sources.

There’s a great opportunity for you to go and build what we call synch adaptors that connect out to other systems and synchronize the data down onto the client. This allows you to interoperate with what you already have, and it gets back to a model where the speed of the local hard disk determines what the user experiences with their system.

Eric talked a little bit about Exchange 2003 and what we’ve done with the recent version of Outlook. That was all about trickle synching in the background down to Outlook and moving the user experience to the speed of the hard disk rather than the speed of how you connect to the server; huge feature, going to be a great time saver.

The thing it really helps you with is you no longer are held at the mercy of what your connection speed is, or whether the network is up, or whether your server is able to be connected to. Those same concepts get drilled down into “WinFS.”

Information Agent. You’re going to see a demo here in a couple of minutes. Information Agent puts you as an end user back in control of the information that comes into your machine. Think of it as inbox rules on steroids. But more importantly, it actually works on the metadata of the items that are stored in the system. The way you define the way Information Agent works is you define schemas that tell you how to interpret inbound events, how to compare it against state that’s on your machine, and trigger custom actions as a result of it.
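
EDITORS’ NOTE: The three steps named above (interpret an inbound event, compare it against state on the machine, trigger a custom action) can be sketched as follows. This is an illustration in Python, not Information Agent’s actual rule format; `make_rule` and the event fields are assumptions of ours.

```python
def make_rule(interpret, condition, action):
    """Bundle the three parts into one callable rule."""

    def run(raw_event, state):
        event = interpret(raw_event)     # interpret the inbound event
        if condition(event, state):      # compare it against local state
            return action(event, state)  # trigger the custom action
        return None                      # rule did not fire

    return run


# Example rule: notify only for e-mail from people in the local contact list.
rule = make_rule(
    interpret=lambda raw: {"sender": raw["from"].lower(), "subject": raw["subject"]},
    condition=lambda event, state: event["sender"] in state["contacts"],
    action=lambda event, state: f"notify: {event['subject']}",
)

state = {"contacts": {"ann@example.com"}}
result_known = rule({"from": "Ann@Example.com", "subject": "Agenda"}, state)
result_unknown = rule({"from": "spam@example.net", "subject": "Get rich"}, state)
print(result_known, result_unknown)
```

The point of the shape is that the rule works on item metadata (here, the sender and the contact list) rather than on the raw message.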

So, what does this all mean for you as a developer? Well, first and foremost, it’s all exposed as part of the “WinFS” System, so you have the object models to go and manipulate these items. You realize you also want to do queries, so there will be SQL access where you can go and do queries against the system as well. And you’ll be able to use XML to get data into and out of the system.

So, as a developer, the first thing you can do is really get in there and extend the Windows type, go and build your applications to build relations against those types, and be able to go in and augment and provide new capabilities. You can also go and build your own types. It’s a great opportunity for you to come in and define new rich ways of looking at your applications.

The other thing we’re encouraging everyone to do, because back-level compatibility is so important, not only do we have this richer file system that’s on there, but we expose it to your existing file APIs through Win32. The goal here is pretty simple. The user productivity apps that work today on top of NTFS work on top of “WinFS.” What we really want to encourage folks to do, though, is build metadata handlers that actually promote information out of those existing apps so that you get a much, much richer experience inside “WinFS.” Information Agent is a great way for you to go and actually provide new functionality to help end users filter their information.

Synchronization Agents to your existing information sources I think is one of the key, key features you’re going to see in “WinFS.” Being able to take a little bit of your company with you when you go on the road by syncing it down into “WinFS” is going to be a great, great end user productivity win.

Finally, as all this data is in there, and you start to think about the experience in the “Longhorn” shell, and what you can do with “Avalon,” there’s just an amazing amount of applications that could get written. One of the questions I always get is, ‘Gord, what’s going to be the core application on top of “WinFS”?’ To be honest, I have no idea at this point. I’m more of a data-head guy and a platform guy. Someone out there in the audience right now is thinking up new, wild and wonderful things that you’re able to do with this system. There’s going to be killer apps that come out of this system, and users are going to demand it, because frankly find first/find next isn’t going to cut it. It’s not the way we’re going to be able to move forward.

So, with that, why don’t we bring out Tom, and let’s do a couple of demos of “WinFS.”

TOM RIZZO: All right. We’re back.

GORDON MANGIONE: Tom.

TOM RIZZO: Let’s bring up the demo machine, can you guys see it? There we go. So, Hillel showed you some great demos yesterday of “WinFS” and the “Longhorn” shell, and how you can use glass and group sort filter. What I’m really going to show you here is how, as developers, “WinFS” really relates to you: how you build custom schemas, how you create relationships programmatically, and then how you use the Information Agent.

Now, there are a couple of key things you have to remember, it’s all about find, relate and act when we’re talking about “WinFS.” You see here a list of all different items contained inside of our “Longhorn” shell. If we wanted to make this list really, really big, we could have added all the candidates for California Governor who ran in the recall election. You were probably one of them. And so, the first thing I’m going to show you here is how we actually extend the schema.

So, we’re going to pop into “Whidbey” here, and we use an XML definition language to extend the schemas inside of “WinFS.” So, if we scroll down, we’re using a legal scenario; we’re using the case builder scenario from yesterday. And you see the first thing is this legal case type. This is not a Windows type. Windows doesn’t know what a legal case is. As a developer, I’ve created and defined this type, and it extends base.item, which is the base item type inside of Windows, and then I can do interesting and new things with this type because I own it.

Now, we could have extended the built-in Windows types like contacts or documents, but we decided to create our own new type inside of here. Now, you’ll see here we’ve added some properties to our type, things like the case number, the case description, common sorts of schema that you would expect inside of a legal scenario.

Let me scroll down a little bit. There’s a bunch of other things in here, and you can look at it at your leisure. One of the other key things is relationships. You talk about how all your data is related, contacts are related to documents which are related to e-mail which are related to meetings, related to notes from those meetings. We believe that as developers you will leverage the capabilities of the built-in relationships inside of “WinFS” and extend those for your own needs in your applications.

So, the first thing you’ll see here is, we have a relationship, and we can even name these relationships in our type, and we’re relating our legal case to the built-in contact data type. We have some other things in here like case precedent, case document, so you can leverage the power of relationships directly from the schema.
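
EDITORS’ NOTE: The schema shown on screen was written in an XML definition language that is not reproduced in this transcript. As a rough stand-in, the same shape can be sketched with Python dataclasses: a custom type extending a base item, a few properties, and named relationships. The class and field names below are our own approximations of what was described, not the actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Stand-in for the base item type that every type extends."""
    display_name: str


@dataclass
class Contact(Item):
    """Stand-in for the built-in contact type."""


@dataclass
class LegalCase(Item):
    """A developer-defined type with its own properties and relationships."""
    case_number: str = ""
    case_description: str = ""
    settlement_amount: int = 0
    # Named relationships; each one points at other items.
    case_contacts: list = field(default_factory=list)
    case_documents: list = field(default_factory=list)
    case_precedents: list = field(default_factory=list)


case = LegalCase("Example v. Sample", case_number="03-001", case_description="Demo case")
case.case_contacts.append(Contact("Ann Example"))
print(case.case_number, [c.display_name for c in case.case_contacts])
```

Naming the relationship in the type is what later lets code refer to it directly, as with the case contacts relationship used in the coding portion of the demo.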

But instead of talking about custom schemas, let’s go actually see how as a developer when you do this you can leverage them in the “Longhorn” shell. So, we’re back to our view. The first thing is, we’re going to switch our view to the detailed view. You’ll see that it will change in terms of what we actually see inside of the UI. There we go. We’ll sort.

And you see now, this looks like just standard stuff, you can see that we have some custom types inside of here, our summons, our declarations. Let’s get a little jiggy with it. Let’s go in, go into our choose details, and the first thing you’ll notice is, I’m going to remove the size, but you’ll see my schema right in the field chooser. So, I can come in and do the case name, and the settlement amount, just click OK, and the “Longhorn” shell will automatically bring that into my view, and then I can group, sort, filter, do whatever I want on my custom schema as a developer.

GORDON MANGIONE: Now, there’s going to be some data fields I’m not going to want to expose to end users, so I’ll be able to mark those as private and they won’t necessarily show up to the end user?

TOM RIZZO: You control all the security for your data. Don’t worry about Windows showing things that you don’t want to be shown.

Now, we’ve added in this custom schema. The thing that we can do then is, we can do interesting things directly inside of “Longhorn.” The first thing I may want to do is filter down. So I’m going to start typing evidence, and we see it starts filtering, filtering, and I get just to all my evidence documents. As a lawyer, I want to quickly find these. But, that shows you how you can search on type. Let’s search on a custom property that we’ve created. So I’m going to start typing in “Tom,” and you see it automatically filters down using the case name instead of anything inside the document name.
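
EDITORS’ NOTE: The two filters in this part of the demo, one on item type and one on a custom property, amount to matching typed text against different fields. A minimal Python sketch, with made-up sample items and a hypothetical `filter_items` helper (the shell’s real matching rules are not described in the talk):

```python
def filter_items(items, text, key):
    """Keep items whose value for `key` contains the typed text, ignoring case."""
    needle = text.lower()
    return [item for item in items if needle in str(item.get(key, "")).lower()]


# Sample data invented for illustration.
items = [
    {"name": "exhibit-a.doc", "type": "Evidence", "case_name": "Tom v. Example"},
    {"name": "notice.doc", "type": "Summons", "case_name": "Tom v. Example"},
    {"name": "photo.jpg", "type": "Evidence", "case_name": "Sample v. Other"},
]

by_type = filter_items(items, "evid", "type")       # typing "evidence"
by_case = filter_items(items, "tom", "case_name")   # typing "Tom"
print([i["name"] for i in by_type], [i["name"] for i in by_case])
```

None of the sample file names contains “tom,” so the second result comes entirely from the custom case name property, which is the behavior being demonstrated.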

GORDON MANGIONE: We can dismiss this if this demo goes well.

TOM RIZZO: Right, if this demo doesn’t go well, that $3.699 million, that’s going to be a lot to pay you. So we saw here how you can quickly use custom schemas, but the gentleman was talking earlier in the video about how his wife likes to find things in drawers, and she likes to search, and those sorts of things, well, as a developer you may want to leverage your custom schemas, but take advantage of some new search capabilities in Windows.

We have this new capability called “natural language search” inside of Windows, so if you don’t like to group sort filter, find your information in that way, you can just type a natural language search.

GORDON MANGIONE: It’s interesting, because a lot of folks say, ‘I can just do this if I build content indexing on top of all my existing data sources,’ and while that may help you with find, it’s very difficult to build applications on top of that architecture. You need the transaction model, you need the synchronization, you need information agents, you need a consistent programming model that isn’t asynchronous across multiple async stores to go and build those apps. And that’s why this is so much more powerful, the find and the searching that we’re going to demonstrate.

TOM RIZZO: Exactly. So let’s type in a natural language search. My mom, she has no idea how to group sort filter, so I’m going to say, show me all summons with settlement amount — which is in our custom schema — around, we don’t know exactly, we’ll say around $1.6 million. So we can come here, and we can even filter it down via our custom schema, but we’re going to live on the edge. We’re going to live on the edge, we’re not going to touch that. Let’s just go, enough talking. So we see here it returns back a document. I open it up, you see the settlement amount was $1.5 million. So the natural language search turns it into a search that was much, much richer than probably we could do by grouping, sorting and filtering.

GORDON MANGIONE: That’s very cool.

TOM RIZZO: So let’s move a little bit from finding your information, and what you can do as a developer to help your end users find their information through your custom schemas, let’s move to actually relating information. So you saw this case builder application, Hillel showed you this, as well. If we go into the Jeff Ho, Sean Dunlap remember yesterday, you hover over things, it reveals relationships in yellow of the things that you hover over. Now, you can drag and drop relationships into this, but you’re all coders, you don’t want to see me drag and drop, you don’t want to see all that stuff. You want to see code. So let’s pop in, hopefully my short little stubby fingers will not fail me here today. We’re going to write some code.

So you’re seeing here the “WinFS” API, it’s a very rich API. You saw a little bit of the item context yesterday that we use inside of “WinFS,” where we can contain all of our operations that we write against the API. We’re finding a folder, which is our case-builder contact folder. We’re going to go now, we’re going to find a settlement amount, the case with the largest settlement amount, because I want to be on that, because I’ll get 33 percent of that settlement amount if it actually wins, without doing any work, and I’m going to add myself to the contact, and then relate that all together. So let’s start writing some code.

GORDON MANGIONE: Wait a minute, you’re going to do that, but presumably if this was a real app there’d be security on there, so you wouldn’t be able to go and do that.

TOM RIZZO: Exactly, but let’s suspend reality for a minute, and let’s actually go and do this.

So I’m going to come in here, and the first thing I’m going to use is the item searcher, and this is built on every “WinFS” type inside of the system, even your custom types. And I’m going to say, let’s use that legal case type that I created, and you know what, built in Intellisense for even your custom types as a developer. So you see all the common methods, all of your properties, all the things that you’ve created in your custom schema, I’m going to do the get searcher method, which will return back to me this searcher, and I just pass in that item context.

Now, the next step I need to do is we’re going to do a set based operation. We’re going to filter down to all of our cases with a settlement amount over a certain amount. So the next thing we’ll do here is we’ll take our searcher, and of course it’s case sensitive, .filters.add, and then we’re going to add in on our custom type here, settlement amount greater than $1 million. Got to count my zeros, got to make sure it’s right, otherwise the code will fail. And then, we go in and the next thing we do, once we get this filter, we may want to sort it in rich and interesting ways. So I’m going to do the sort option, SO, and we’re going to create a new one of these. So we’re going to say sort option, and you know what, “WinFS” is going to say, what do you want to sort on? And I’m going to say sort on my custom type, so I’ll say settlement amount, my custom property, and then I’m just going to use an enumeration built right in to the API here, so I’m going to say system.storage, the default “WinFS” namespace, .sortorder.descending, so that the largest one pops up to the top, and I can add myself to that one.

Now, we’ll go get our legal case, using our searcher method here, so our search objects, I’ll say, hey, go find that legal case, and I’ll say, searcher, do my work for me, find one, and then we’ll pass in our sort options, and we’ll say, we have to type cast this as legal case to return it back. OK. So we’ll find the case with the largest settlement amount.
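
EDITORS’ NOTE: The code typed on stage is not captured in this transcript. The sequence described (get a searcher, add a filter, sort descending, find one) can be approximated in Python as below. The `Searcher` class, its `filters` list and `find_one` method are our own stand-ins, not the “WinFS” API, and the case data is invented.

```python
class Searcher:
    """Hypothetical searcher: collects filters, then sorts and returns one match."""

    def __init__(self, items):
        self._items = list(items)
        self.filters = []  # predicates added by the caller

    def find_one(self, sort_key, descending=False):
        """Return the first item passing every filter, in the requested order."""
        matches = [item for item in self._items if all(f(item) for f in self.filters)]
        if not matches:
            return None
        matches.sort(key=sort_key, reverse=descending)
        return matches[0]


cases = [
    {"name": "Case A", "settlement_amount": 800_000},
    {"name": "Case B", "settlement_amount": 2_500_000},
    {"name": "Case C", "settlement_amount": 1_500_000},
]

searcher = Searcher(cases)
# Set-based filter: settlement amount greater than $1 million.
searcher.filters.append(lambda case: case["settlement_amount"] > 1_000_000)
# Sort descending so the largest settlement comes first, then take one.
largest = searcher.find_one(sort_key=lambda case: case["settlement_amount"], descending=True)
print(largest["name"])
```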

Now I’m going to create a new contact inside of Windows. All we do here is we create a contact ref, and we’ll do it as CR, as new contact ref. You can see I’m really typing quite badly up here. I type with two fingers, so no one follow my example, do what I say, don’t do what I do. And then, we’ll come up here and we’ll say, create a new person object, which is a built in type inside of the system, and we’ll say, new person, and you see all the Intellisense is helping me out here, pass in the folder where I want to save this person, which is case contacts, and then pass in the display name of that actual contact. And then I can go through and take a look at some of the built-in properties on top of the person. So I’ll say, person.primaryname.givenname is Thomas, and then we can do p.primaryname.surname, and look here, I’ve got to make this joke, because it’s so funny.

GORDON MANGIONE: It’s so redundant.

TOM RIZZO: Pronunciation “sir name,” they always get these Italian names wrong, Mr. Mangione, and Mr. Rizzo. So we’ll do the surname inside here, so I select that, equals Rizzo. And then we just set the target of our contact relationship, CR.targetitem is that person P, and then all we have to do is add ourselves with one little method inside of here. So we’ll say LC 1, remember that relationship, we named case contacts back in our schema, it just appears right in Intellisense, add it, and guess what, the final step we need to do is update everything in our context, so it flushes it out from memory to “WinFS.”
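
EDITORS’ NOTE: The steps just described (create a person, point a contact reference at it, add that reference to the case’s named relationship, then update so the changes are flushed from memory) follow a unit-of-work pattern. The Python below is only a sketch of that pattern. `ItemContext`, `Person` and `ContactRef` here are simplified stand-ins written for this note, not the real “WinFS” classes.

```python
class ItemContext:
    """Hypothetical unit of work: changes stay in memory until update() is called."""

    def __init__(self):
        self.store = []     # stands in for the persistent store
        self._pending = []  # changes not yet flushed

    def add(self, item):
        self._pending.append(item)

    def update(self):
        """Flush every pending change to the store."""
        self.store.extend(self._pending)
        self._pending.clear()


class Person:
    def __init__(self, display_name, given_name, surname):
        self.display_name = display_name
        self.given_name = given_name
        self.surname = surname


class ContactRef:
    """A relationship object whose target is another item."""

    def __init__(self, target_item):
        self.target_item = target_item


context = ItemContext()
person = Person("Tom Rizzo", given_name="Thomas", surname="Rizzo")
legal_case = {"name": "Example case", "case_contacts": []}

legal_case["case_contacts"].append(ContactRef(person))  # relate person to case
context.add(person)
stored_before = len(context.store)  # nothing has been flushed yet
context.update()                    # flush from memory to the store
stored_after = len(context.store)
print(stored_before, stored_after)
```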

Now if I’ve done it correctly, it should compile and run. Everyone hold your breath, waiting, tenseness in the room, everyone is hungry, waiting for lunch. There we go, an exciting application. What we’re going to do here is this is my little Win forms app. I’m going to go back to the case builder, bring up my app, and now if this works, you should see Tom Rizzo pop inside of the UI related to the Joplin versus Carson case, because that is the case with the largest settlement amount. So let’s give it a try, wait a couple of seconds, bam.

GORDON MANGIONE: Very cool.

TOM RIZZO: As easy as that.

(Applause.)

But, it doesn’t stop there. It doesn’t stop, I can’t stop, I hover over, you see the relationship there, but if I hover over this you see its relationship, that case’s relationship, as long as I have permission, it automatically relates that all together, so you can traverse relationships between many different items, and I can even go into a rich view, and see this all graphically, all my relationships.

So we talked about relationships, and one of the great things you can do with “WinFS,” once you have all of your data working with “WinFS,” all your relationships in there, is you can have smart agents, programmable agents that you can use to work on that data on your behalf. Now, we’ve worked with Unisys and Motorola, I have a Motorola phone up here. We’re going to launch a custom information agent that we created, so that as a lawyer when my big fish clients call me on the phone, an agent will give them a personalized greeting, and look at my calendar, and tell them the next time I’m free, and so I can give them a buzz back, and give them some great personalized service.

The first thing I need to do is I need to set some preferences inside of my application. My first preference here is, whenever I get an incoming call from a client with an open case, it’s going to look at my custom type that I created, look at the caller ID, relate that to a contact, look at all the cases related to that contact, see if that case is open for that actual contact. The next condition I’m going to set is, if the settlement amount is over a certain size. So here let’s get a little greedy, let’s do $2 million.

GORDON MANGIONE: So this is a great example of, again, using the schema not only to define the metadata, but the rules that you can go and apply against the metadata.

TOM RIZZO: Exactly, as a developer, you can do all of this with “WinFS” and all the great integration that we have. Then finally, we’re going to have a look at some built-in “WinFS” information, which is my calendar. So if my calendar shows me it’s busy, if all these conditions are met, perform an action for me, and that action will be respond by a voice with my next available time. So if I’m free at 5:00, tell the caller I’ll call them back at 5:00, and I could even have it do an SMS message, send a meeting request. I’m just going to have it respond back with my next available time. I’ll say, this is my big dollars clients in there, and then I’m done. That preference is set. Let’s give it a try. So the Motorola smart phone is right up here, we’ll pop it open. Let’s assume I’m a client that’s calling Jeff Hoag, who has an open case with a settlement amount above $2 million, and I’m going to call my lawyer. So let’s just do it.
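
EDITORS’ NOTE: The preference set up in this demo combines three conditions (the caller has an open case, the settlement amount is over a threshold, the calendar shows busy) with one action (respond with the next available time). A hedged Python sketch of that logic follows. The function names, data layout and reply wording are ours, and the real agent’s calendar and telephony integration is not shown.

```python
from datetime import datetime


def next_free_time(meetings, now):
    """Return `now` if free, else the end of the run of meetings covering `now`."""
    free_at = now
    for start, end in sorted(meetings):
        if start <= free_at < end:
            free_at = end
    return free_at


def handle_call(caller, cases, meetings, now, threshold=2_000_000):
    """Apply the three conditions, then build the spoken reply if they all hold."""
    big_open_case = any(
        case["client"] == caller and case["open"] and case["settlement_amount"] > threshold
        for case in cases
    )
    free_at = next_free_time(meetings, now)
    busy = free_at > now
    if big_open_case and busy:
        return f"I will call you back at {free_at:%H:%M}."
    return None  # conditions not met, so the agent takes no action


# Sample data invented for illustration.
now = datetime(2003, 10, 28, 13, 30)
meetings = [(datetime(2003, 10, 28, 13, 0), datetime(2003, 10, 28, 14, 0))]
cases = [{"client": "Jeff", "open": True, "settlement_amount": 2_500_000}]

reply = handle_call("Jeff", cases, meetings, now)
no_reply = handle_call("Someone else", cases, meetings, now)
print(reply, no_reply)
```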

GORDON MANGIONE: So this doesn’t require the smartphone, because you’re integrated in with the backend voice mail system that is then going to communicate.

VOICE: Hi, Jeff, I am unavailable at this time, but I will call you back at 2:00 p.m., when I get out of my meeting.

TOM RIZZO: There you go, text to speech, looked at my calendar, looked at all my schema information, looked at all my relationships, information agent made my clients have a great experience.

GORDON MANGIONE: Very cool.

TOM RIZZO: Thanks, Gordon.

GORDON MANGIONE: That voice mail app just blows me away, that’s the second time I’ve seen it. What’s really going on here is a Web service is calling in, you’re getting it into information agent, it’s then running that rule, going out and figuring out what’s on your calendar, sending text back to the voice mail system, that’s in turn doing text to speech all while the caller is on the line. It’s a pretty amazing application.

OK. To wrap things up, really what I want to see you do as a result of this is go get “Yukon,” start pounding on it, get your existing apps running on top of it, give us feedback about how it’s working, start to kick the tires on “WinFS,” give us feedback on the schemas, the services that are in there, but have a fantastic PDC, my name is Gord Mangione, my e-mail address is [email protected]. If you have any concerns please give me a ring.

Thanks a bunch.