
Microsoft’s AI vision, rooted in research, conversations

Microsoft has been investing in the promise of artificial intelligence for more than 25 years — and this vision is coming to life with new chatbot Zo, Cortana Devices SDK and Skills Kit, and expansion of intelligence tools.

“Across several industry benchmarks, our computer vision algorithms have surpassed others in the industry — even humans,” said Harry Shum, executive vice president of Microsoft’s Artificial Intelligence (AI) and Research group, at a small gathering on AI in San Francisco in December. “But what’s more exciting to me is that our vision progress is showing up in our products like HoloLens and with customers like Uber building apps to use these capabilities.”

When Bill Gates created Microsoft Research in 1991, he had a vision that computers would one day see, hear and understand human beings — and this notion attracted some of the best and brightest minds to the company’s labs.

Last year, Microsoft became the first in the industry to reach parity with humans in speech recognition. There’s also been groundbreaking work with Skype Translator, now available in nine languages and an example of accelerating the pipeline from research to product. With Skype Translator, Microsoft has enabled people to understand each other, in real time, while talking to others in all corners of the world. But what about the dream of face-to-face, real-time translation? Using the company’s new intelligent language and speech recognition capability, Microsoft Translator can now simultaneously translate between groups speaking multiple languages in person, in real time, connecting people and overcoming barriers.

Microsoft has also built perhaps the world’s biggest knowledge graph. Thanks to work in Bing and Office 365, it’s possible to understand billions of entities: people, places and things. We now have the opportunity to connect this “world knowledge” with people’s “work knowledge.”

Microsoft’s vision is bold and broad — to build systems that have true artificial intelligence across agents, applications, services and infrastructure. This vision is also inclusive. Microsoft aims to make AI accessible to all — consumers, businesses, developers — so that everyone can realize its benefits.

“We have always sought to democratize technology. With AI, we will do that in two ways — by infusing into products such as Office 365, while creating a platform on which others can build and innovate,” said Shum, describing himself as a “researcher-turned-product guy.”

That platform includes 25 APIs called Cognitive Services that provide intelligence capabilities such as speech, language, knowledge and search.

Success in this new wave of innovation requires deep partnerships.

“I was here in this very room just a couple of weeks ago with Sam Altman working on the announcement of our collaboration with OpenAI. It’s so great to work with and build on the work of others,” Shum said.

We are now beginning to witness the early days of a transition to the next big platform shift in computing, one that is fueled by advances in AI and built around a behavior that is most natural to humans: conversations. It’s a new era where digital experiences mirror the way people interact with one another and we move from a world where we have to understand computers to a world where they understand us and our intent, and can be proactive.

Meet Zo

There are two sides to conversational AI — the task-completion or productivity side and the emotional side. You need both to truly realize the promise of AI.

Microsoft’s long-term strategy is that agents like Cortana will not only have IQ but also have EQ, and that idea has fueled some groundbreaking work the company is doing with chatbots.

The next chapter in this evolution is Zo.

Zo is a social chatbot, built upon the technology stack that powers Xiaoice and Rinna — successful Microsoft AI chatbots in China and Japan. You can engage with her on Kik now in the same way you would interact with a friend, and in the future Microsoft plans to bring her to other social and conversational channels such as Skype and Facebook Messenger.

Zo is built using the vast social content of the Internet. She learns from human interactions to respond emotionally and intelligently, providing a unique viewpoint, along with manners and emotional expressions. But she also has strong checks and balances in place to protect her from exploitation.

Microsoft’s journey with chatbots started in May 2014 in China with Xiaoice. She has over 40 million users — more than the entire population of California. And she averages 23 conversation turns per session with users — nearly 10 times the industry average. Xiaoice is the first AI chatbot to have a real TV broadcasting job on Dragon TV, one of China’s biggest TV stations in Shanghai, which has a viewer population of more than 800 million. Building on the success of Xiaoice, in July 2015 in Japan, Microsoft launched Rinna. Today, Rinna has had regular conversations with 20 percent of Japan’s population.

Zo has already held conversations with over 100,000 people in the U.S. To date, more than 5,000 users have had over an hour-long conversation with Zo, and she holds Microsoft’s longest continual chatbot conversation: 1,229 turns, lasting 9 hours and 53 minutes.

“It’s a very personal experience,” Shum said. “We’re really moving from a world where we have to understand computers to a world where they will understand us and our intent, from machine-centric to human-centric, from perceptive to cognitive and from rational to emotional.”

Building on the Bot Framework

Microsoft is providing the Bot Framework and all of its tools, cloud services and data so developers and customers can build and experiment together with Microsoft technology — big corporations and small businesses alike.

We have learned how customers are using the breadth of our cloud services to create advanced bots to streamline processes and better serve their customers: the Bank of Kochi in Japan is developing a receptionist bot; Rockwell Automation, a bot to automate production; the Department of Human Services in Australia, a bot to improve customer engagement.

“We talk about the notion of the bot brain, and in many respects it is the software development challenge of this decade,” said Lili Cheng, distinguished engineer and general manager of Microsoft’s FUSE Labs. “Our vision for the Bot Framework and our development offerings are not just to make it easier for people to get started, but also to put these futuristic scenarios within reach.”

More than 67,000 developers are now using Microsoft’s Bot Framework and Cognitive Services. Updates are coming, including new bot connectors for Microsoft Teams and Cortana Bing Location, as well as the new QnA Maker service, which takes the most common questions businesses receive and enables even non-developers to build their own bot to answer them with ease.

“Leveraging the Microsoft Bot Framework tools and harnessing the Microsoft Graph, we are delivering new, innovative scenarios for people in their personal and professional lives,” said Amritansh Raghav, corporate vice president, Skype. “These innovations are showing up across apps, email, chat platforms, mobile devices and connected devices.”

One of the ways customers are looking to take advantage of the opportunity these bots provide is within experiences like Skype and Microsoft Teams. With the general availability of the Skype calling API, Skype now delivers talking bots as well as the tools for partners to build rich media cards that allow users to add video, animated GIFs and audio to these bots.

Partners are important to help build new and engaging customer experiences, and Hipmunk was one of Microsoft’s earliest, launching their bot on Skype last spring.

“We exist to take the agony out of travel,” said Adam Goldstein, co-founder and CEO of Hipmunk. “We see virtual assistants as a natural extension of this — possibly the best way yet to help people plan and book travel easily. Discoverability is important, and Skype’s bot directory makes it easy to find the bots people need and want in order to streamline their lives.”

An intelligent agent for all

When Microsoft thinks about the promise of conversational AI, another important piece is the role of agents, such as Cortana. More than 145 million people across 13 countries are using Cortana today. It is unbound, across platforms and across connected devices.

Everyone deserves to have their own personal assistant, one that helps us cope as we battle to stay on top of everything.

“To deliver on that promise, we need to focus on what we can take off your plate. Half of us send ourselves action items or reminders regularly over email. Many of us keep to-do lists. I used to wallpaper my office with sticky notes,” said Marcus Ash, Microsoft’s partner group program manager. “So, we’re working to take the friction out of staying on top of things.”

Cortana works across mobile platforms, and now in email through the new service, as well as through new skills like the one Expedia has built for travel or the new service Capital One has for banking. The next step in this journey is making Cortana available to all computer and device manufacturers to build smarter, more useful devices on all platforms. That’s where the Cortana Devices SDK comes in.

Microsoft is working with partners across a range of device categories to integrate Cortana into their connected devices. The Devices SDK carries through Cortana skills in productivity, music, home automation and device control.

Soon, Cortana will be present in a new way, in your home, thanks to a partnership with Harman Kardon, maker of premium audio gear.

What’s next …

At Microsoft, we believe breakthrough technology is created by constant experimentation, fearless exploration and a commitment to long-term innovation.

While there’s been a lot of progress, there are still many tough questions around AI yet to be resolved. The early days of other new waves of technology (the Internet, mobile, app economies) brought many growing pains, and AI is no exception.

“We will push the boundaries and we will learn,” Shum said. “We’ll share our learnings with the industry, with you — so we can democratize AI and hopefully accelerate its benefits for our society.”
