ANNOUNCER: Please welcome Corporate Vice President, Microsoft Corporation, Quentin Clark. (Applause, cheers, music.)
QUENTIN CLARK: Good morning. Thank you. The video shows the tremendous momentum we’re starting to build with the SQL Server 2012 product, which brings me to the first order of business for the day, which is to welcome you to day two of the PASS Summit and to thank you for being part of this incredibly vibrant community.
The momentum of SQL Server 2012 is because of the work that you do. SQL is your product and it’s your community.
For the last two years in this keynote, I’ve spent time stumping, right, talking about SQL Server 2012. Two years ago, starting to give the previews of what was going to be coming, what kind of value we were going to create. And then last year, in fact, I chose 12 favorites to dive into more detail and walk people through what was going to be in the upcoming release of the product.
This year in the keynote, I wanted to show you how this whole thing comes together end to end. We’re going to pick one kind of scenario, one business problem, we’re going to work it all the way through the entire data life cycle.
So, first, let me set that stage just a little bit. Yesterday Ted talked about the opportunity we have together around the changing value of information in all businesses. The opportunity that we have is to reintegrate and rewire the economy around information.
Let me give you some examples. Even yesterday we could see this. If you were paying attention to the post-election stuff that gets talked about, you’ll see that there’s analysis on the analysis. Anybody hear about this? Right? There’s a lot of discussion around the various analytical models that were built around the election predictions and how the winning party, I’ll just leave it at that, was able to better predict — in fact there were models that, almost county by county, had very accurate predictions.
Well, that’s a big data problem, right? You’re looking at multiple signals, pretty large-scale data, it’s not just one data source, there’s some stuff that you’re doing originally, some things are coming from outside. And that’s an example of how information is changing how a business works, in that case the business of running an election.
So lots of examples of this. And the job I do, I have the privilege of getting in front of customers a lot and learning about their work and learning about their challenges and where they’re going.
I was sitting recently with a large hotel chain. They have boutique hotels and large-scale across the world kind of hotels. One of the things they’re doing is they’re starting to experiment with RFID. So they’re putting RFID chips into the hotel keys so that the customer experience is really cool, you just do the proximity thing instead of the magnetic reader.
But what they’re also doing is they’re putting RFID readers throughout the hotel because they’re looking to see who goes to the gym, do people go to breakfast even if it’s not written down on their bill and that kind of stuff. Because they want to look at that behavior inside the hotel, pull that together with their transactional information, movies you rented or room service, that kind of stuff, and get a better profile and understanding of their guests.
And they’re particularly trying to move forward the loyalty programs that they have. But they want to go further; they want to look out to social media. So if you liked them as a hotel on Facebook, they want to reach into your preferences and see what your interest and activities are so they can completely customize and tailor the experience in the hotel.
So they want to allow these scenarios where a guest walks in, gets checked in, and maybe the person checking them in says, “Hey, here’s a running map, it’s going to be a beautiful day in the morning if you want to take a run.” Well, how do you get to that moment? Well, you have to understand the weather, you have to understand this person likes outdoor activities, you have to maybe see some history that they’ve been to the gym so they’re kind of into the health thing.
But you have to do that at scale. You can’t have the college kid who’s working there part time, right, at the front desk, they’re not the ones to be looking through guest by guest and trying to figure out what their interests may be. This has to be done at scale. That’s a big data problem.
Another example is we work with large retail, the large-box retail chains. And the stories about how data warehousing has transformed the operations of retail, those are pretty well known. But now the next generation is coming. They’re looking to do things now more in real time. So they want to change the music playing in the store based on who’s in the store. Again, with RFID tags and the club cards, the same gates you walk through that check whether you’ve stolen something will also see who’s coming in.
And so with an understanding of demographics coming out of their traditional business data warehouse based on purchase history, they can know something about the people that are in the store and they can do something about the music.
Well, that’s a different relationship with their music providers. They don’t have DJs back there, by the way. Right? That all comes as a service as well. So they need to be able to interact with that service in real time based on analytics they’re doing in real time.
They want to take things further. They’re now starting to think about putting digital displays, which are now becoming basically super cheap, putting digital displays all throughout the store. And based on who’s in the store and where they are, they want to run real-time auctions for that digital real estate with product companies.
So if you’re standing there in front of the Colgate and the Crest, that digital display that’s sitting there may be over the aisle right next to it. They want to know if they can get those companies interested in advertising or providing coupons in real time based on who’s standing there. That’s a very different plumbing of the economy.
One last example I’ll give. We work with a large package shipping company — incredibly good SQL Server shop. They actually run all their distribution centers on SQL Server and they’ve gotten a ton of value out of the ease of use and the availability and remote management capabilities of the product.
So they have an incredible automation system and processes around data warehousing and they use all this information, all this historical information to optimize their routes and to optimize their business.
But it’s interesting. I was talking to them at one point, and they revealed to me that they have an interesting side business going which is taking all the package history, you know, how many packages to what addresses, how much did they weigh, that kind of stuff, and doing analytics on it and providing the results of those analytics to financial companies who are managing loans and credit ratings and that kind of stuff. Because it turns out that for a business, especially a business that is in the business of selling goods, the comings and goings of stuff turns out to be a really interesting metric on business health.
And so here’s suddenly this new revenue stream, this new opportunity for them, that’s not really their core business. It’s the digital exhaust of their core business: after they’re done leveraging it to improve their efficiency and operations, they find it’s of value to other companies. So you wouldn’t have imagined that particular relationship, right?
So these are the kinds of things I’m talking about when we talk about these new opportunities and how information, in fact, is going to change our economy, right — there’s another generational change coming, moving from the mainframe era to microprocessors and now into what the cloud provides.
It turns out that all these scenarios require a complete data platform to pull off. You know, if you look out in the world and you look at all the startup community efforts and all this stuff, you’ll find claims. This new visualization tool, that’s a solution to big data. This analytics function, that’s a solution to big data. We have a distribution of a large-scale MapReduce product, that’s big data.
Well, you look at all those stories, and it needs all of these pieces. It’s not enough to have any one of these competencies; you have to have a complete data platform. What we’re going to do today is we’re going to take a scenario, we’re going to walk through that whole platform and talk about the life cycle the data has and how it evolves to produce the insights that ultimately change a business.
It starts with managing the data, and this really means all the data, right? The data that is coming out of manufacturing systems or logs or whatever it is, as well as the traditional transactional stuff. Then you have to take that data and shape it into something that an information worker can understand and get value out of. And then provide it to them in a way, and give them tools, that creates that magnetic, fun, visceral engagement with the information. So that the information worker can listen to the data, can hear what the data is trying to tell them, and that’s a lot of what our efforts that you saw yesterday in Amir’s demo are about: providing that kind of experience.
And then it’s not enough for one person to have that “ah-hah” moment, you need to be able to collaborate, share those stories. And then insights need to be operationalized. It’s one thing to have the “ah-hah” moment once, it’s another thing to ensure that there’s business process backing that up, and also provide the capability of ensuring that those insights are repeatable. As they become critical to the business, they have to be repeatable. You have to have the business continuity kind of solutions in order to continue to make that run.
So in order for me to do this, I need to bring a partner up. I’d like to welcome a program manager on my team, Julie Strauss, to come join me and give me a hand. (Applause, cheers.)
JULIE STRAUSS: Thank you. Good morning.
QUENTIN CLARK: So before we begin, I thought maybe we should set this up a little bit, set the stage a little bit and talk about the scenario. So we have a pretty classic Microsoft demo company — Contoso, Acme, you know, one of those — a movie theater company. OK? So they own a chain of movie theaters across the U.S. and Europe and they’ve been in the practice of understanding capacity needs based on new releases for many, many years. And they’ve actually gotten quite good at it.
But in the last few years, they’ve noticed that they’re not as good at it as they used to be. And somewhere in the business management chain, somebody had this hypothesis: maybe it’s social media starting to change how successful movies are and aren’t as they get released, and we have no idea about that signal. So maybe we should figure out how we can incorporate that signal into our capacity planning for new releases.
So that’s the basic setup. Ultimately, we’re after this “ah-hah” moment that lets us understand whether or not social activity on new releases affects the performance of a movie. And understanding that, then, would affect how it is that we do capacity planning, right?
JULIE STRAUSS: Yes.
QUENTIN CLARK: So the first step of this is going to be down at the bottom of this life cycle, around managing data, because they have today the traditional transaction processing systems, of course, to run their operations, and they also take in a bunch of signals from the industry around overall movie performance worldwide and all that. So they have a traditional business data warehouse that manages all this data for them. But they also need to reach out and get a bunch of social analytics data; in this case we pointed them at taking on Twitter information.
So Julie’s going to put on an IT hat first. Later she’ll put on the IW hat, but let’s start with the IT hat and dive down to the bottom and see how to bring all this information together. OK?
JULIE STRAUSS: Let’s do it.
QUENTIN CLARK: All right, Julie.
JULIE STRAUSS: Are you ready for the journey? (Cheers.) Not really? But, OK, you probably will get there.
So throughout this demo, I’ll be introducing data coming from multiple sources. Up front here in the first arc, I’m going to introduce three of these data sources: an enterprise data warehouse, data sitting in Hadoop, and data sitting in PDW. So let’s take a look at this.
So here I have the data sitting in my enterprise data warehouse. I’m going to zoom in a little bit so you can better see. It basically represents data coming from within my organization and this is the data warehouse that all my business users are using for analyzing business performance. It’s pretty standard relational data warehouse.
QUENTIN CLARK: This is where we draw the current prediction models?
JULIE STRAUSS: Yes.
QUENTIN CLARK: This is the data we use for that.
JULIE STRAUSS: Yes. And this is data we’ve been working with for a while; we’ve filled out the models pretty well. You see it contains information about the movies, about revenue, geographies we’re operating in, access, et cetera. So it’s a pretty standard data warehouse. And the important piece here is to note that all this data is born from within the organization. And it has worked pretty well so far, right?
So that leads me to the second data source, which is data sitting in Hadoop. I want to go up here — we have, to Quentin’s point before, this hunch that social media could impact people’s behavior. So if your friends are tweeting on a particular movie, positive on one and negative on another, there’s a pretty good chance I’ll go to the better one, unless there’s something wrong with me. So that would be my choice.
But we also have this hunch that just looking at that data, and the volume of it, can potentially help us improve performance.
So here you’re looking at my Hadoop cluster. You see the capacity, a little more than 20 terabytes. You see I have ten live nodes. Pretty standard Hadoop.
QUENTIN CLARK: So what do I have inside the Hadoop cluster?
JULIE STRAUSS: What you have inside the Hadoop cluster, we have all the different tweet logs; they’re sitting in JSON files on HDFS, the Hadoop Distributed File System. The interesting point here is, though, thanks to HDInsight, this Hadoop cluster is sitting on Windows Server. It could be sitting in Windows Azure, run as a service, if I wanted to put it in the cloud, but in this instance, it’s sitting on-premises.
And I have another one, I have the MapReduce administration counsel here — console, not counsel. Get used to it, there will be a lot of that. (Laughter.)
So here you see all the completed jobs, and this is basically the jobs I’m doing to extract the information out of all the logs that is relevant for the analysis I want to do, and we’ll come back to this in just a second.
So that is Hadoop. Then the last piece of data that we want to work with is data sitting in PDW. So here’s my management console. You see the Performance Manager — there you go, there’s no performance apparently, just yet. There you go. No performance is good performance. So in my Performance Manager, I have the different queries and I have the state of the appliance, which looks overall pretty healthy, right? And Christian Kleinerman walked you through this yesterday in Ted’s keynote.
So the point is, here is all the data. My PDW appliance — which basically was delivered to my door pre-configured with software, hardware, and networking capabilities, ready to just plug in — is normally used for my heavy-duty data warehouse workloads. But in this example, just to make it a little bit simpler, I have reduced the number of tables that I’m working with.
So if you look in here, I basically have just a select set of the tables in my database — I’m just working with eight of them.
So the next step, what I’m going to do is I’m going to take the data from the Twitter logs. I’m going to merge it with some of the columns from the PDW tables, and I’m going to leverage the PolyBase technology that Ted announced yesterday to generate a new external table.
QUENTIN CLARK: So the important piece, though, with PDW is that we’ve already pulled in sentiment data, right?
JULIE STRAUSS: Yes.
QUENTIN CLARK: So there’s a corpus of information we’ve pulled into PDW that understands the semantics around sentiment. So we can look at various words and understand whether or not they’re favorable or not favorable.
JULIE STRAUSS: Yes.
QUENTIN CLARK: So looking at the Twitter data and looking at the semantics, we can bring those together and say good things, bad things were said about various topics.
JULIE STRAUSS: Yeah. So the table has words like “fantastic, wonderful, gorgeous.” That gets all the positive numbers. Then you have “good, average, OK” kind of the zero to one scale. And then you have words like “horrible, disastrous” and even more profound words that I would never say out loud.
QUENTIN CLARK: Profound.
JULIE STRAUSS: In particular not with Quentin next to me. So you’ll have to use your imagination on those, but they will be a negative score, OK?
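For reference, a word-level sentiment lookup of the kind Julie describes might look roughly like the following T-SQL sketch; the table name, column names, and scores are illustrative assumptions, not taken from the demo.

    -- Hypothetical sketch of a word-level sentiment lookup table.
    CREATE TABLE dbo.SentimentLexicon
    (
        Word            nvarchar(100) NOT NULL PRIMARY KEY,
        SentimentScore  decimal(4,2)  NOT NULL   -- positive = favorable, negative = unfavorable
    );

    INSERT INTO dbo.SentimentLexicon (Word, SentimentScore)
    VALUES (N'fantastic',   4.0),
           (N'wonderful',   3.5),
           (N'good',        1.0),
           (N'average',     0.0),
           (N'horrible',   -3.5),
           (N'disastrous', -4.0);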
So why don’t we start generating the table?
QUENTIN CLARK: Yeah, that’s a good idea.
JULIE STRAUSS: Let’s do some work here.
QUENTIN CLARK: So it’s important to know at this stage we now have three data sources, right? We have one which is our traditional business data warehouse that we’ve had for a long time. We’ve reached out and we’ve gone out into the Internet, we’ve pulled down a sample of Twitter data. And if people have worked with this stuff before, you realize you get a subsample over a time window. And that’s what we’ve done here. We’ve picked the appropriate time window to tell us things about particular movies that we’re interested in looking at and analyzing.
And we’ve also reached out and found this sentiment information to understand the semantics of English relative to what that means in terms of sentiment that a person has in what they write. So we have these three data sources now.
JULIE STRAUSS: Relative to what I’m speaking? Is that what you’re saying?
QUENTIN CLARK: Yes.
JULIE STRAUSS: OK. So let’s have a look. I executed the query. Not very exciting on its own. So let’s walk through and see what is actually happening.
So, first, I’m telling PDW the information — the details about the Hadoop cluster, right? So just connection details, go find your way. The next piece is just defining the file format that I’m going to use for parsing the Twitter logs, the unstructured text.
Next section, as you see here, I am creating a table. I’m actually generating a view on top of the Hadoop data. This is still sitting in Hadoop. Here you see all the columns that I will end up with, but the data is still back in Hadoop; I haven’t moved it yet. And then there’s my MapReduce job — as you saw before, where I had the history of my previous jobs — and what I’m doing there is picking the key words and phrases I want to work with.
So we have selected some movies. We have selected some actors. So we have movies like The Avengers, Hunger Games, The Vow, and don’t you even start making fun of how I pronounce that one. Just so you know, The Vow is super hard to say for a non-American person like me. And I will get it wrong at the same time, so you’ll have something to laugh about later.
QUENTIN CLARK: You’d think a Viking wouldn’t have trouble pronouncing “V.” (Laughter.)
JULIE STRAUSS: It’s hard. We don’t talk, we just do. (Laughter.) So let’s go. (Applause.)
Next section. Sorry, I couldn’t help it. (Laughter.)
Next section. We’re going to create the table. So here what is happening, we’re taking that view that we generated and we’re going to mash it up with the sentiment column that we have sitting in PDW. And then eventually the table is generated, it’s persisted back into PDW, and I’m just showing the result set here.
QUENTIN CLARK: So this is sort of the key PolyBase moment, right? Where I have data sitting in Hadoop that I’ve now defined logical tables for, and I have data sitting in the relational store, in this case PDW, and PolyBase now looks at this and is able to join that data together into one result set.
JULIE STRAUSS: Yeah. So you see the table. I have the phrases, the time stamp, the tweet, and then you have sentiment. So this is basically what came from PDW.
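A rough sense of the T-SQL involved in this step is sketched below. It uses the later, generally available PolyBase syntax rather than the exact PDW 2012 syntax shown on stage, every object name is an assumption for illustration, and the word matching is deliberately simplified.

    -- 1. Tell the relational engine where the Hadoop cluster lives and how
    --    the extracted tweet files are laid out (names are hypothetical).
    CREATE EXTERNAL DATA SOURCE TweetHdfs
    WITH (TYPE = HADOOP, LOCATION = 'hdfs://hadoop-head:8020');

    CREATE EXTERNAL FILE FORMAT TweetTextFormat
    WITH (FORMAT_TYPE = DELIMITEDTEXT,
          FORMAT_OPTIONS (FIELD_TERMINATOR = '\t'));

    -- 2. Define an external table: a logical view over the tweet fields
    --    that still physically live in HDFS.
    CREATE EXTERNAL TABLE dbo.TweetLog
    (
        Phrase     nvarchar(200),
        CreatedAt  nvarchar(50),
        TweetText  nvarchar(400)
    )
    WITH (LOCATION = '/twitter/extracted/',
          DATA_SOURCE = TweetHdfs,
          FILE_FORMAT = TweetTextFormat);

    -- 3. Join the Hadoop-resident tweets to the relational sentiment lexicon
    --    and persist the combined result as a relational table (the demo did
    --    this inside PDW; word-level matching is simplified here).
    SELECT t.Phrase, t.CreatedAt, t.TweetText, s.SentimentScore
    INTO   dbo.MovieTweetsAndSentiment
    FROM   dbo.TweetLog AS t
    JOIN   dbo.SentimentLexicon AS s
           ON t.TweetText LIKE N'%' + s.Word + N'%';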
So with that, we want to make this available for analysis a little bit later on, but I’m going to give the word back to you.
QUENTIN CLARK: Great. Thanks. So what we saw here was the value at the bottom part of this life cycle, right? In the first stage of this life cycle, we have to bring all the data together, whether it’s relational or non-relational data. In this case we have JSON files that are being managed in our Hadoop cluster, but you can imagine any number of different kinds of data structures that are born into that form. And part of the value here of a solution like PolyBase is to allow the data to live in the form that it’s born into in the first place, right?
So if data is born as JSON records, as is Twitter data, or if data is born as log data or logs coming off of manufacturing processes or coming out of shipping systems, that data should be allowed to live in the form it’s natively born into and be managed by the systems that create that data in the first place, the applications that create the data in the first place. And then provide the capability of reaching in, understanding what data’s in there, and pulling it out more holistically in a consistent way.
Also, importantly, we reached out to the world’s data at this stage as well. We went and found a bunch of Twitter feeds, we brought those in, and we went out and found this semantic table that we’re also using. And so for my solution — ultimately I’m trying to get to the analytics of social impact on movie performance — I had to go out and reach for the world’s information, because it turns out I don’t own Facebook as this movie theater company; I had to go reach for it and find all this stuff.
And then the last piece is providing a consistent model and capability for analyzing and pulling the data out. So PolyBase is the very first step in providing that consistent overall experience over all the data that’s under management.
So the next — well, hopefully they can fix that back there. The next step in the life cycle is going to be around discovering and refining data. I’ll click around, maybe they’ll help me get this right. Here we go. Sorry. If that’s our only glitch, we’re doing good, because we’ve got a lot of technology in play here. So she’s hoping that —
JULIE STRAUSS: Feel the pressure.
QUENTIN CLARK: — my glitches were the glitches, not hers.
So our next step is really to refine that data so it can be used very easily by information workers. So, still wearing our IT hat, we’re going to look down on these data sets and shape the data into views that the information worker can very easily and rapidly find and understand how to use. So with that introduction, we’ll take a look at the next step.
JULIE STRAUSS: OK. So I’m in Excel. I’m actually still wearing my IT hat, but my business users really want to work in Excel. So I’m going to go here to make the sources available.
So what I’m going to do, I’m going to connect to my database, in this case PDW. And just connect to it. And what I’m going to do in this arc is really going back, finding that file, and making it available to all the end users. I should hit the controls, that works better, there you go.
QUENTIN CLARK: Zoom it.
JULIE STRAUSS: Yeah. I’ll do that in one second. So here you see the database that I’m connecting to. And I’ll just go in here and use this one and we’ll get to the actual tables. Here is that movie tweet and sentiment table that I generated before. So you saw I had eight tables; now I actually have ten — I forgot to show you, but there it is.
So that is the table that I generated using the PolyBase technology, OK? So I’m going to select this one — whoops, not that one, the right one — and we will go ahead and just use that one. I can just expand this a little bit so you can better see. I’ll start by giving it a proper name. Tweet and sentiment. I’ll try that. Apparently I have been practicing; there’s already one with that name. Tweets and sentiment. That should work. There you go.
So now I could go ahead and just publish this data. However, this came from pure text files, so it’s not necessarily the right shape. So I’m going to reshape it a little bit. If you look at this source table, you will see that it has a mix of uppercase, lowercase, et cetera. When my end users start laying this out in a report, that’s how it would look. And by that time, it’s in a semantic model; they’re not really going to be able to modify it. So it’s going to look a little bit sloppy maybe.
So I’m going to fix that for them. I can just right-click this column. I can choose to transform and capitalize each word. Voila.
QUENTIN CLARK: So this data preparation step is not changing the underlying data — it’s not going back into the JSON files and recapitalizing things. Instead, it’s shaping how this view of the data will be consumed on the information worker side.
JULIE STRAUSS: Yeah. I know they were not really impressed, but it was pretty simple to do. (Laughter.)
So, OK, well, I’ll try harder. So here’s my date column. Date is pretty important when you start building your BI solution, because very often that’s what you want to analyze — revenue over time, right? The way this came in, with an offset, it will land in my model as a text string, a text column. That’s not really going to be useful; it’s going to behave like text, not like a date.
So instead of giving this problem to my information workers, I might as well fix it here, and it will give me one more chance to show you that this is pretty slick.
QUENTIN CLARK: Because in my analysis, what I’m really looking for is the tweet day. Like, what day the tweet was made. And we have to provide that to the information worker in a way that they can just see that and understand what it means.
JULIE STRAUSS: Yeah. So I’m right-clicking the column, and I’m going to split it by the delimiter. I’m going to choose the space, simply because that was how the column appeared. And I’m going to apply this one. What happens now is I have a lot of different columns — I’ve spread this one out. I could delete the original now, but I don’t really know what my end users want to do later on, so I don’t really want to modify it yet. I can keep it there, and I can hide it or remove it later on. But for now, I’m just focusing on the columns that are relevant. So I’m picking the month, the day, and the year. All I have to do now is merge the three — merge columns — and I’m going to use space again, and voila, I have a merged column. You see. (Cheers, applause.)
Now I’m going to call it “Tweet Date” because “Merged” is not going to be very helpful either. So now I’m good to go. When I import this table now, this column will appear as a date column and will be super useful.
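The shaping here happens interactively in the Data Explorer add-in for Excel, but the equivalent transformation is easy to picture in T-SQL. The sketch below is illustrative only: the staging table name and the raw Twitter timestamp format are assumptions.

    -- Hypothetical equivalent of the interactive shaping: capitalize the phrase
    -- and reduce a raw timestamp such as 'Wed Nov 07 18:30:00 +0000 2012'
    -- to a proper date column.
    SELECT
        UPPER(LEFT(Phrase, 1)) + SUBSTRING(Phrase, 2, LEN(Phrase)) AS Phrase,
        CONVERT(date,
                SUBSTRING(CreatedAt, 5, 6) + ' ' + RIGHT(CreatedAt, 4)) AS TweetDate
    FROM dbo.MovieTweetsAndSentiment;   -- table from the earlier illustrative sketch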
I could keep going, but I won’t. So I’ll go ahead and say done. Now I’ve modified the query; it’s in the shape that I think will be beneficial for our end users. I have a preview now in Excel so I can see what I’ve done, and the last thing I want to do is go down here at the end and share a copy.
Now, I choose to include the preview because when my end users later on want to search for these, instead of having to execute each query, they can just mouse over them and see the result without having to run the query. And I’m going to share it and that’s it.
QUENTIN CLARK: OK. So what we’ve shown there is the activity of the IT side taking the raw data sets and shaping them into forms that are much more consumable by the BI tools, effectively. And it’s really about unlocking the value around the information that’s relevant, right? Teeing up that whole end-to-end solution so the information workers have all the right information to go and to find.
The information set there is coming again from the rest of the world, right? So it’s the Twitter data, it’s the sentiments analysis, and it’s all that stuff that PolyBase pulled together and shaped into one relevant form.
The next piece, then, is how it is that information workers are going to be able to reach in, find that information, and visualize and collaborate on insights. This is where we’re leading to the “ah-hah” moment. So now we have a data set that reflects sentiment as it relates to movies that information workers need to be able to find and need to be able to explore in relationship to any existing data they have and have the kinds of experiences that lead them to finding and discovering those insights. So let’s dive into that and see where that takes us.
JULIE STRAUSS: Yes. Let’s go. Actually, before I go and find, I want to just take a brief moment. This is my favorite topic, so bear with me.
So you saw yesterday in Ted’s keynote some of the integration work we’ve done with Excel to integrate the xVelocity Engine and PowerPivot and Power View into Excel. So you saw a lot of good stuff there. I want to just go a little bit behind the scenes.
So what does it really mean? Some of the core issues that some of your users may have had as well: when you work with Excel, you can work with one table at a time, or you can work with multiple tables and go do VLOOKUPs, but that can quickly become a little bit complicated — at least if you don’t do it on a regular basis.
So because of the integration of the xVelocity Engine, Excel now has a whole lot of other capabilities. So let me just give you a quick example.
I’ll go to my data tab. At this point, I may not even have enabled PowerPivot. I’ll go directly to my data tab — you see as well that the set of data sources has been expanded, but obviously I’m going to pick SQL, how could I not? And go next. Now, this looks exactly like what you’re used to seeing in Excel, right? But there’s one exception. This little line enables selection of multiple tables. This doesn’t look like a lot, but I’m telling you, there are a few hours put into this little control.
So I’ll say enable, and voila, now I can select multiple tables and I can, as well, import the relationships that have been defined by IT in my back-end data source. Yeah, you can clap. (Applause.)
QUENTIN CLARK: That’s another example of connecting the IT activity and the IW activity. As IT understands and operationalizes relationships between data sets, they can actually expose that in the very initial experiences that information workers see in Excel.
JULIE STRAUSS: Yeah, what it also means is as a business user, I can start my BI process even earlier without even realizing I’m doing BI. I’m just working with tables — as you saw, I just insert a view — and I’m basically building a semantic model behind the scenes.
I’m sure you all would be excited to see data import, but I’m going to cancel out of this one. Cancel, I thought I did. And I’ll go into PowerPivot instead, because that’s another favorite product of mine.
So in addition to embedding the xVelocity Engine, we’re also shipping PowerPivot and Power View in the Excel box. And that’s also pretty neat. So why would I use the add-in if I can do all this goodness in Excel? Well, the add-in represents a lot of advanced modeling capabilities. So you can do the basic stuff in Excel, right, import tables, create relationships, but you can use the PowerPivot window to further enrich that model with business logic, right? You can create relationships using drag and drop, create KPIs, perspectives, et cetera. And then you can use Power View to further enhance those visualizations.
QUENTIN CLARK: So the model we’re looking at here is my existing model on my existing business data warehouse?
JULIE STRAUSS: Yeah. So the tables you see here you may recognize; that’s what I showed you initially coming from the data warehouse. And these I could have imported from Excel or from in here. It all comes together in one data model, one Excel data model. And most of the work for the SQL team was really focused on the integration, but we did manage to sneak in a few features. So let me show you that, just because I can’t help it.
I have a geography table down here. Let me go to that. We have added a few categories over here. So you see I have a geography table; I have city, state, and country. And I have “suggested” — suggested in brackets means the system suggested the category for me; I can also go and define my own.
QUENTIN CLARK: This is the magic behind Amir’s demos.
JULIE STRAUSS: Yes. So it all is magic and it shows up in miraculous ways. All that miracle work is really powered by the semantics that are worked into the model. It doesn’t mean you have to go and open PowerPivot to make it work; it just means all the magic happens because of the semantics in the model.
QUENTIN CLARK: So the BI tools here in Excel basically can understand the kind of data that you’re looking at and provide metadata on the data that then things like Power View pick up on to know that things can be put out on a map or put over time and that kind of thing. So these hints here end up being important to those experiences.
JULIE STRAUSS: Exactly. But we actually managed to leverage those hints for something else as well. So if you look at this data, honestly, it’s kind of a pathetic table for geography — city, state, and country. I need postal codes, for example. So what am I going to do now? Spend all afternoon going to search for postal codes? No, I’m not going to do that, that’s not what I’m paid to do.
So I’ll go out here and select “suggest related data.” What happens now is the system reads from those defined categories, goes out to the Windows Azure Marketplace, and detects other data sets that, given what I already have, might be relevant for my analysis. That’s pretty neat. (Applause.)
Because even if I had the time, maintaining that data myself just doesn’t make sense. And if I go to IT, they would laugh at me even more than they do to begin with. So they’re not going to do that for me either. So here I can just subscribe to a perfectly well-managed data set.
QUENTIN CLARK: So here we’re on our third example of reaching out for information in the world that can help me at the particular step of the business analytics I’m trying to achieve.
The Windows Azure Marketplace – DataMarket is the hub we’re building for this kind of information exchange and a new market for the value of information. And when you’ve done these steps here, you’re reaching into that marketplace to see what relevant information may be out there to help you.
JULIE STRAUSS: Yes. OK. So let’s go back to the scenario. Now I could go ahead, obviously, insert PivotTable. But since we do have a new, shiny object, we have to show you Power View. So let’s do that.
So now I’m going to go back and dive into the model that I have. Oh, I can’t do that yet because I have forgotten to discover something. I need to go and find the tweet data.
QUENTIN CLARK: That’s right.
JULIE STRAUSS: That’s a good idea.
QUENTIN CLARK: You spent a lot of time curating, in our Data Explorer experiences, this information set out of IT. Now we need to go find it.
JULIE STRAUSS: Yeah. So I have all the data from my data warehouse, now I need to go and find the PolyBase table that we created with the tweets in it.
QUENTIN CLARK: And integrate it into your existing model.
JULIE STRAUSS: Yes. And I don’t know the connection string, I don’t know anything. I don’t even know that my appliance sits in Madison — I know, that’s why I’m sweating. (Laughter.)
Now I can go ahead and search for sentiments and see what happens. Yep, this was the one I created before, and I get the preview, that’s pretty simple. That’s easier than a connection string. (Applause.) Thank you.
So now I can go ahead and choose to use this one. And now it’s going back — and it’s only a query, it’s not really the data that sits there. Now it’s been added to my list of queries, and apparently I’ve been doing this before — there it is. Now I have the option to either add it or reshape it. So if my nice IT person hadn’t reshaped the date and everything else, I could go and do that. I know that is now done, so I can just go and add it to my sheet.
Now, if I had not been using PowerPivot at all and I wasn’t really aware that I’m building a semantic model, I probably just would add this to my Excel sheet. It still goes to the model, but I would just add it to the sheet. I am, however, doing a model. So I don’t want to have it listed in my Excel sheet, I’ll just go and create just a connection. What happens now, it’s pushing this directly into the data model behind the scenes, with or without the PowerPivot add-in enabled.
So let me just go and show that. If I go back into PowerPivot, I can go over here — it should have landed to the right — there’s my query. That is what I just landed. Now I can just use drag and drop to easily create my relationships. That’s pretty neat as well. Then I would go and further enhance that with calculations, et cetera. But like the good cook that I am, I have pre-prepared for you a table coming from the same source.
The reason I have pre-prepared it is that I have added quite a bit of logic where I’m treating the sentiment. I’m saying if it’s above three, it’s positive; if it’s below zero, it’s negative; et cetera. So I’ve used the DAX language to define that business logic for me. And since this is not a session in expression building, I thought I would pre-prepare it for you.
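The demo implements that bucketing as a DAX calculated column; for consistency with the other sketches here, the same idea expressed as a T-SQL CASE looks roughly like this, with the thresholds, bucket names, and table name as assumptions.

    -- Illustrative bucketing of the numeric sentiment score.
    SELECT
        Phrase,
        CreatedAt,
        SentimentScore,
        CASE
            WHEN SentimentScore >= 3 THEN N'Positive'
            WHEN SentimentScore < 0  THEN N'Negative'
            ELSE N'Neutral'
        END AS SentimentBucket
    FROM dbo.MovieTweetsAndSentiment;   -- table from the earlier illustrative sketch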
QUENTIN CLARK: So now I have this moment where the sentiment information that’s coming out of my Hadoop system — merged, through PolyBase, with the language analytics data that’s in PDW and the business information that’s in my original business data warehouse — is being added into my existing model, and I get one aggregated view across that whole set of information. And I did it without connection strings and everything else. I was able to let IT publish the information in a form that was useful, and then the information worker was able to, with a key word search, find that information set and bring it in just like that.
JULIE STRAUSS: Yeah. So now we can go where I wanted to go before, to my shiny object: Power View. So now I’m going to connect, and now you will actually see this. I can just minimize this one. Here, you see the table that I found through PolyBase, right? So the first thing I want to do is have a look at the movies, or at the volumes of the tweets, OK?
So I’m going to pick my phrase. And you’ll see now, these are the phrases that I extracted in my MapReduce job when you saw that query, so these are the same. What happened before that is we scanned through the data to see what the high-volume areas are that we want to look at, and this is basically it.
Next, I want to add the tweets because I want the volume. Yeah, this is volume all right, it’s a lot of words; I can’t really use it for a whole lot. I’m not really interested in every individual tweet. It’s not giving me a whole lot of information, apart from some of it being pretty entertaining.
What I want to do is see the aggregated view. I want to see the volume for each of these. So I’ll just quickly go and change this to a count. Single click and it’s much more useful. I can keep clicking and I’m going to change this into another visualization.
So now we see we can’t really get around Brad Pitt, can we? (Laughter.) He’s everywhere. So you see I have movie actors, three actors, and I have three movies. We did get a whole lot of actor information yesterday, so I think I can skip that today.
QUENTIN CLARK: Yes.
JULIE STRAUSS: And since I am a professional businessperson operating a movie theater, I will focus on these three movies, whose names I’m not going to pronounce.
So there you go. Again, you saw yesterday Amir gave you the grand tour of slicing and dicing, so I can spare you some of that as well. Obviously, I will encourage you to go and play with it yourself, because it is a whole lot of fun to do. But I have prepared four visualizations that basically serve up the insights for you. So let’s take a tour.
The first one here uses the nice feature of popping out information, which you would also use when you sit and present to important people in your company trying to convey your findings — that’s super helpful. So here I’m no longer looking at the volumes, but at the revenue, right? So here I’m pulling the phrases from Twitter, I’m pulling the revenue from my enterprise data warehouse, and then I’m using some of the business logic I created for the timing of this. So the most important thing — you can see here, The Avengers, obviously, has the most revenue, followed by Hunger Games and then the third movie.
Another important point, which is a key finding, is the percentage of the revenue coming in the first week — it’s half of it. So my first week of a movie is essential; that’s half my revenue.
QUENTIN CLARK: Determines its fate basically.
JULIE STRAUSS: Yeah. So we’d better be sharp that week.
QUENTIN CLARK: So that’s all data from my existing business data warehouse.
JULIE STRAUSS: Yes. Tied to the phrases that we pulled from PolyBase.
QUENTIN CLARK: Right.
JULIE STRAUSS: OK? Next one. Now we are looking at what is actually behind the tweets, right? Is it positive or negative? This is really the core value of the work PolyBase did for us, because it put a number on each of the tweets as positive or negative, right? On a scale.
So here I see The Avengers doesn’t have as much tweeting as The Hunger Games relative to revenue, right? And then the other important thing is prior to release — this is taking a snapshot two weeks prior to release, right? And here you see Hunger Games is overwhelmingly positive, and so is The Avengers. The Vow is split half and half, so it’s not really that good. Maybe it’s not just me who doesn’t like that. (Laughter.)
So I could break that down further. We’re going to save that finding for a little bit. But the proportions are important. Down here, it’s just to spell out how badly this one is really doing. It’s overall sentiment, and it looks pretty nice, purple and everything.
This one is an important one where we pull everything together, right? So here the size of the bubble is the size of the revenue — you see The Avengers is doing well. And then on the axes you have the revenue of the first week, the revenue of the last week, and the sentiment number. So you see a clear correlation between the number of tweets and how positive they are, and the revenue in the first week —
QUENTIN CLARK: On overall revenue.
JULIE STRAUSS: On overall revenue.
QUENTIN CLARK: So that’s an important correlation. This is one of our moments of insight where we recognize, gosh, there really is a correlation here between how we can analyze the sentiment information from the social sphere to how the movie actually performed in the theaters.
So now we need to take this and figure out what else it can mean for us.
JULIE STRAUSS: Do something with it.
QUENTIN CLARK: Yes.
JULIE STRAUSS: Not just pretty bubbles. So, as we said at the start, we do have a capacity planning model that we’ve been using for a long time and that used to work well; now we want to see how we can use this.
So let’s go have a look at that capacity planning model, which I should have out here, sitting on SharePoint in the PowerPivot gallery, where I share information with my friends in my department.
So here is my movie theater site. You see I have lots of different workbooks, obviously, that are shared out here. And I can go and open this movie capacity planning sheet. There you go. So what we’re looking at here is my historical data for how we’ve typically done capacity planning. All the grays are my estimations. They can be estimations coming from external sources as well, but they’re the estimations we have typically used to say, based on seats, historical data, et cetera, how we predict each movie will do.
QUENTIN CLARK: Yeah. So, traditionally, there’s an understanding of how the movie producers have done, various actors, what they’ve pulled through, how much advertising has been done, and I have a pretty rich model that does that. And I’ve gotten these predictions.
JULIE STRAUSS: We see here that for all of our movies for this year — for our top sellers, let’s say — we’re walking away from money. Right? We predicted this number of seats, but we could have had this many.
QUENTIN CLARK: Yeah, we had to turn people away at the theater.
JULIE STRAUSS: Yeah, that’s not good. That’s not a way to operate. So what are we going to use this for? We have a new movie coming up?
QUENTIN CLARK: We do. I think there’s a new movie that’s been released last night, right?
JULIE STRAUSS: Tomorrow. Skyfall.
QUENTIN CLARK: Today.
JULIE STRAUSS: Tomorrow.
QUENTIN CLARK: Something like that.
JULIE STRAUSS: Some day this week, a new movie is being released.
QUENTIN CLARK: I’m in European time. (Laughter.)
JULIE STRAUSS: Don’t worry, we’ll get you there. So we have Skyfall, and obviously, it hasn’t been released yet, despite what Quentin believes, so I only have the gray, right? And what I can see from this is we’re expecting Skyfall to do almost as well as Hunger Games, do we think that sounds reasonable?
QUENTIN CLARK: Sure.
JULIE STRAUSS: Sure. OK, we think that.
QUENTIN CLARK: That’s what my model says.
JULIE STRAUSS: So let’s go back and look at the live data. So I’ll go in here. I have, again, pre-prepared little graphics for you here. This is the same view I showed you previously, where I have the pre-release positive and negative sentiment for our three top movies. Now I would like to compare that with data representing Skyfall.
QUENTIN CLARK: So we have in — remember, down in Hadoop, I’ve taken this sample set of tweets, right, and I’ve pulled them in and I’ve done the analytics on them around sentiment. Those tweets represent the two weeks leading up to a movie’s release.
JULIE STRAUSS: Yes.
QUENTIN CLARK: And we have gone out and gotten another sample set — again, the same kind of representative sample set of tweets — for the two weeks leading up to tomorrow — today — sometime — for Skyfall. And so I put that data down in HDFS as well, and I should be able to tap in through the same exact pipeline that I already set up.
JULIE STRAUSS: Yeah. The only change I had to make was to extract Skyfall instead of the other six arguments. So I’ve laid it out. Let’s just change the visualization. Let’s go and have a look here. And voila.
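In other words, once the pipeline exists, pulling the new release in is mostly a matter of re-running the same extraction for a different phrase and regenerating the combined table; something along these lines, with names carried over from the earlier illustrative sketches.

    -- After re-running the keyword extraction for 'Skyfall' and regenerating the
    -- combined table, the new release is queried the same way as the others.
    SELECT Phrase, CreatedAt, TweetText, SentimentScore
    FROM   dbo.MovieTweetsAndSentiment
    WHERE  Phrase = N'Skyfall';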
So the volume of tweets, first of all, is far exceeding the other movies. That’s a hint. It’s probably more popular. But more importantly, if you look at the purple area, this top area, that is proportionally huge.
QUENTIN CLARK: Yeah. Sentiment is super high.
JULIE STRAUSS: Yeah, super-high sentiment. So I think somebody needs to pay attention to this, because if you go back to this model, this gray bar should be not only higher than our prediction for Hunger Games, it likely should exceed what we actually got for Hunger Games. And so now my team has until tomorrow night to find some more seats. So I suggest we publish this up to SharePoint and —
QUENTIN CLARK: Get people looking at it.
JULIE STRAUSS: — ask them to collaborate a little bit and see what they can do.
QUENTIN CLARK: So while we move on from that — you know, we saw this insight. I mean, it’s visceral, you can see it. And part of what the BI tools and the work we do to bring BI into Office and into Excel provide is these immersive experiences that allow the data to speak to you, that allow you to find these things.
Now, there’s a moment here where the tools were just reached into and used, because Excel itself is pervasive. The strategy has been, by using Excel and by using SharePoint, to allow everyone access to these solutions. And this can all be done, of course, on-premises, but throughout Office 365 we’re also enabling the same capabilities. So, again, we have the situation where on-prem or in the cloud, the value is still apparent.
Now, in order to take that realization forward — that the social influence on movies is really changing the shape of revenue — and to build up my model, I’m of course going to need to formalize some things. Even though I could see it very easily in the data, at some point I’m going to need to really build that more formally into the model so I can drive better prediction of capacity.
The other piece of operationalizing, though, has to do with making sure that the results are repeatable. It’s not enough just to have the “ah-hah” moment; it’s not even enough to find a way to plug that into business process. Whether that’s done by development — writing code to run business process operations — or by the information workers themselves, one of the advantages of integrating with SharePoint, of course, is that the workflow features in SharePoint and the alerting we’ve added even on reporting allow the information worker, in a self-service way, to start to operationalize some of these insights.
So for both IT and information workers, we have that capability. But now, back down in IT, the repeatability of these insights is becoming incredibly important. We’re finding situations where these insights are derived out of the BI tools, and then — for various reasons, because of the popularity of certain models and certain ways of looking at data — the underlying systems themselves may not even be prepared to support that.
And so we’re going to shift gears again. Have Julie put back on her IT hat and talk about how we’re providing the capabilities back into the administrator’s world to allow them to embrace the self-service meme and still ensure business continuity.
JULIE STRAUSS: Yes. So what happened was we shared — we published this to SharePoint and asked the team to go and collaborate. What happens over time, though, is that this model has proven to be quite popular — must be because we asked them to use it; how would they not obey?
So what I see now over time is here I’m in the management dashboard of the PowerPivot gallery. So this is my IT tool to gain insight into the activity of all these workbooks. Right? I can see this is running on a SharePoint farm, so I can see the CPU usage, the strain on the server, et cetera. And the key piece here, though, is the workbook activity. Each of these small bubbles, I’m sure you’ve seen this before, represents a workbook. And, obviously, the green, shiny object here is the workbook that represents the analysis that I published, right? So now there’s a lot of activity.
So this went from something I did in my office for the fun of it to something that the whole company is now rallying around and needs to work on.
So what I want to do with this is potentially take it out of the SharePoint farm to make sure I don’t put strain on it. It may not be what that server farm was designed for, right? So I want to take this workbook now, take the model, restore it directly on Analysis Services, deploy it, and have the users who work on this project go directly against that. Right?
So that sounds pretty complex.
QUENTIN CLARK: Yeah. So I have the BI, if you will, for my BI, right? And that lets me in IT understand, hey, I actually have things becoming more business critical, with more and more people using them, and I need to find a way to operationalize that much more deeply and support it much more by the back-end systems, not just in the front end.
JULIE STRAUSS: Yeah. Oh, and talking about BI on BI, another piece of this is that it’s a PowerPivot workbook that lives behind the dashboard. So what I can also go and do is track where the different sources in my workbook are coming from. And I discovered something else not so good before. We’ll come back to that — keep that thought.
But, yes, for now, let’s go and operationalize this one. Take the Excel workbook and make the model available on Analysis Services.
So I’ll go to SSDT and start a new BI project. And all I have to do is select import from PowerPivot. I’ll just connect to my local instance of Analysis Services. I sure hope I will, but we’ll see. What is important to note here is we’ve had this capability for a while; we had it in the previous release as well. But the big difference now — keep in mind — is that the data model is shared between Excel and PowerPivot and Power View. So the initial creation of that model is actually done by Excel.
So here’s the file. I put it out where, obviously, Analysis Services is allowed to get it. And I can just open it. That’s all I have to do, and it will go grab that ABF file that still lives inside Excel and restore it, open it in here, and all the business logic — if I created hierarchies, all my calculations, et cetera — just shows up in here.
QUENTIN CLARK: So here, again, we’ve bridged between the information worker and IT. The information worker, in a self-service way, is able to create these models, but that same structure is consumable by the professional tools here in Visual Studio — SQL Server Data Tools. We’re able to bring that same model in and work on it here, as well as figure out how to deploy and operationalize it in Analysis Services.
JULIE STRAUSS: Yeah. So it really does two things for me. It takes the strain away from my SharePoint farm, but it also allows me to do more professional-level activities. So I can, for example, define role-level security and partitions, and I could do better management of how my data moves in and out. So that’s all good. Now, all I have to do is deploy this to Analysis Services.
But I did mention a little problem. Let’s have a look at that. So when I was out in the management dashboard and I did my analysis, I realized something else. Not all the data was really coming from where we thought it was coming from. We said that all the data was coming from the data warehouse, the enterprise data warehouse that I introduced to begin with.
QUENTIN CLARK: My existing business revenue data?
JULIE STRAUSS: Yes.
QUENTIN CLARK: And that’s not true?
JULIE STRAUSS: That’s not really true, because it turns out that somebody on the team, apparently, has some friends in Europe. So the European subsidiary has an international sales database, a sales system, where they have data about foreign movies. And it turns out that two of these tables actually come from that sales system.
QUENTIN CLARK: A transactional sales system.
JULIE STRAUSS: A transactional sales system.
QUENTIN CLARK: In Europe.
JULIE STRAUSS: In Europe. And so I have to say, the Europeans are not too impressed with us right now because they’re not — their performance isn’t what it was a week ago.
QUENTIN CLARK: Because they’re now seeing this traffic because of the model being used more often, so now they’re seeing the impact on their transactional system?
JULIE STRAUSS: Yeah.
QUENTIN CLARK: So we’ve got to get you on a plane out to Europe?
JULIE STRAUSS: We can do that. No, it’s going to be much easier than that.
So what I have done — you see, here’s the database. International Sales is actually where two of the tables are coming from. The tables are still super useful, so I don’t want to remove them from the model; actually, without them, a lot falls apart. And I could take a backup and restore it, but I want the data to be maintained and live, right?
So what I’m going to do instead: the European subsidiary has an AlwaysOn availability solution already up and running. And you see here, in my availability group, I have two availability replicas. What I’m going to do is create another one — a readable secondary — in the cloud that I can give my U.S. team access to for their analysis. I’ll leave the Europeans alone.
QUENTIN CLARK: And you can do all of this — because we can do it in Azure, because we can do it in the cloud — without going and disturbing the infrastructure that’s currently running there.
JULIE STRAUSS: Exactly. So let’s do that. And I can actually do all of that — I haven’t created an image anywhere, I can do all of this directly from SSMS.
So I’ll go in here, right-click my availability group, and add a new replica. This wizard you probably already know. And I’ll go ahead and connect to my secondary, my existing secondary, and say next. So now what you will see here is I have a new control called “Add Azure Replica.”
QUENTIN CLARK: So this is a standard Always On configuration interface in SSMS that’s now changing as we get infrastructure-as-a-service to GA as part of Azure.
JULIE STRAUSS: Yes. So let’s see what this can do for us. So I’ll go ahead and add my certificate to connect to Windows Azure, and first thing I want to do is I’m going to select the image I want to use.
QUENTIN CLARK: So that’s from the standard Azure image libraries for VMs for infrastructure-as-a-service?
JULIE STRAUSS: Yes. So I’m taking a SQL Server 2012 image. I’ll pick medium, just for the fun of it, a medium-sized VM. And my location: the source system is in Europe, but this replica is really for us in the U.S., so I’m going to pick a datacenter close to us (that one was not close), there you go, West US. I’ll give it a name, I’ll call it PASS VM, I’ll choose the network, and all I have to do now is enter my password and click Next.
And now what you’ll see, eventually, is a new instance showing up here. Here I have my selection point for making this a readable secondary; all I have to do is choose yes. Then we need to synchronize these, so I pick my synchronization option to make sure that the first time we move this we take a full backup of the database. OK, click Next. It just runs through a quick validation and then the usual confirmation page to make sure I’ve done nothing wrong. How could I?
So I’ll click “finish.” And now, essentially, a Windows Azure VM is being provisioned for me in the cloud.
QUENTIN CLARK: Wow. (Applause.)
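Roughly speaking, the availability-group portion of what the wizard automates corresponds to T-SQL along these lines. This is only a sketch; the group, server, endpoint, and database names are hypothetical stand-ins, since the demo does not show them:

    -- On the current primary: add the new Azure VM as an asynchronous, readable secondary
    ALTER AVAILABILITY GROUP [InternationalSalesAG]
    ADD REPLICA ON N'PASSVM'
    WITH (
        ENDPOINT_URL      = N'TCP://PASSVM.cloudapp.net:5022',
        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = MANUAL,
        SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY)
    );

    -- On the new secondary, after restoring a full backup of the database WITH NORECOVERY:
    ALTER AVAILABILITY GROUP [InternationalSalesAG] JOIN;
    ALTER DATABASE [InternationalSales]
    SET HADR AVAILABILITY GROUP = [InternationalSalesAG];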
JULIE STRAUSS: So that’s going to run for a little while, about ten minutes. Oh, that’s a long time for getting up and running, huh? It may be a bit too long for you guys to sit here and watch the screen, so, again, I have pre-prepared a replica for you; this one is actually sitting in the cloud. So how can you know? Well, hopefully I can prove it to you. Maybe not. No, somebody closed it down.
Well, I have —
QUENTIN CLARK: You’ll be able to find it in the Azure portal.
JULIE STRAUSS: I can find an Azure portal. But I don’t have a mouse — that’s kind of useful to have.
QUENTIN CLARK: Let’s take a pause there. You’ll have to just trust us that we’ll be able to bring up the Azure portal at some point and show that. But the value we’re providing here is governance control and business continuity that you can now reach out into the cloud to support. By simply using the infrastructure-as-a-service features that we’re building for Azure, you’re able to provide that secondary in a way that doesn’t impact any existing operations, taking the load off the main transactional system and bringing it onto a secondary that’s running in Azure, as simply as using the existing tools inside SSMS.
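The demo doesn’t show how the U.S. analysts’ connections get pointed at the new secondary. One common way, assuming an availability group listener is in place, is read-only routing, so that clients asking for ApplicationIntent=ReadOnly are redirected automatically. A sketch, with hypothetical replica and listener names:

    -- Advertise the Azure replica as a read-only routing target
    ALTER AVAILABILITY GROUP [InternationalSalesAG]
    MODIFY REPLICA ON N'PASSVM'
    WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://PASSVM.cloudapp.net:1433'));

    -- Tell the European primary to route read-intent connections to it
    ALTER AVAILABILITY GROUP [InternationalSalesAG]
    MODIFY REPLICA ON N'EU-SALES-01'
    WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST = (N'PASSVM')));

    -- Analysts then connect with a read-intent connection string, for example:
    --   Server=tcp:InternationalSalesListener;Database=InternationalSales;ApplicationIntent=ReadOnly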
That operational step, providing IT the insights to know what’s going on, which information sets are being used, how they’re being used, where the data is coming from, provides a kind of governance control over everything.
The scalability aspect, whether it’s taking the model from Excel and operationalizing it in larger-scale, backend Analysis Services systems, or using the AlwaysOn features to provide that kind of protection for the databases, these are all pieces of operationalizing those insights.
Sometimes we don’t talk about it that way, but once you have these insights, you need to be able to repeat them and provide the kind of continuity around those solutions as a whole.
So with that, we’ll go back quickly and look at the portal.
JULIE STRAUSS: We can look at the portal. The portal is still warming up a little bit, but at least there is a portal. (Laughter.) There you go. So, yeah, the image is sitting up there in Azure.
QUENTIN CLARK: OK. So what we’ve shown is the whole life cycle: starting with reaching out and bridging into new data sets while providing a consistent query and modeling experience over that data; shaping that data into a form that’s relevant for information workers; giving your information workers the tools right in Excel and SharePoint to visualize and understand what the data is trying to tell them; collaborating on that through our SharePoint integration; and starting to operationalize the new insights.
And then back in our IT role, taking that and ensuring we have the right kind of business continuity end to end. This is the value of what Microsoft is trying to accomplish with you with our big data efforts, a complete platform solution, not just the piece parts.
And to show that we actually have this whole solution working as a whole, we’re going to do one last little bit.
JULIE STRAUSS: Let’s do it. One last little bit. Oh, by the way, there’s my portal. It’s not fake. There’s nothing fake about this.
So let’s go in here. You know, I’ve kind of been beating on this third movie for a while, and obviously we don’t get along that well. It turns out, though, that it’s not doing as badly as it shows here.
QUENTIN CLARK: Why is that?
JULIE STRAUSS: I was sitting last night playing with T-SQL, and I kind of got carried away, felt all this power, so I massaged the data a little bit. But that’s good, because now I have an opportunity to show you how the data just flows through the system.
So let me go back into PDW and hopefully correct my wrongdoing. I’ll just run a single statement to remove the little minus sign that I accidentally put in front of some of the sentiment scores. Let’s run this. There you go.
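The statement itself isn’t visible on screen; as a sketch, the fix would be a single UPDATE along these lines, where the table, column, and filter names are hypothetical:

    -- Flip the accidentally negated sentiment scores for the affected title back to positive
    UPDATE dbo.FactMovieSentiment
    SET SentimentScore = ABS(SentimentScore)
    WHERE MovieTitle = N'Movie 3'        -- hypothetical filter for the movie in question
      AND SentimentScore < 0;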
So now I go back into Power View and all I should have to do now —
QUENTIN CLARK: Is refresh.
JULIE STRAUSS: — is refresh. So what is important to note here is, first of all, that this is sitting far away, but there you go.
Remember all the steps I went through: publishing, finding, importing, putting the data into PowerPivot and into the data model. In previous versions, you would have expected to have to open things up and refresh along the way, right? Here I don’t even have to open PowerPivot to refresh. This could have been a PivotTable directly in Excel: my data changed, I just right-click, refresh the PivotTable, and the model refreshes automatically. (Applause.) That’s it.
QUENTIN CLARK: Julie, fantastic job. Thank you very much.
JULIE STRAUSS: Thank you. (Applause.)
QUENTIN CLARK: So in the course of that example scenario, we went after new kinds of information and created new kinds of insights, reaching out into the world to find the new information that’s going to change how I run my business.
The entire life cycle of data is important in this. Again, it’s not just about one tool or another; you have to have the complete thing integrated end to end. It starts with managing data, really embracing that there’s new data in the world. Even if you can talk to it in relational terms, thanks to the work we’re doing in PolyBase, you still need to be able to embrace and hold a very diverse set of information, and to take those information sets and figure out what’s going to be of value.
Whether you’re shaping the data or applying analytics, the goal is to discover what’s interesting in the data and useful for the people running the business; to give the business engaging, visceral tools that let them understand what the data is trying to say and how it can help them; and to share those insights across the organization. And then it’s about operationalizing all the activities that BI creates, both from a standpoint of business process and from a standpoint of business continuity, assuring the reliability and resiliency of these new information flows and insights.
So I’d really like to have you leave the room this morning thinking about how your business is going to change. What new information is going to make the difference for your business, and how is the business going to work differently? Information from within your industry, information from outside your industry, and data that you already have but maybe aren’t leveraging fully are all going to start to have an impact on how you reshape your business as part of the economic change going forward, embracing the new value of information.
With that, I’d like to thank you very much, enjoy the rest of the conference. (Applause.)
END