…and what tasks can they solve?
Mykhaylo Shmyelov, Director for National Technology Strategies and Policies for 24 Countries in Central and Eastern Europe at Microsoft, talked with the Kazakhstani media outlet Kursiv.kz about cognitive services and how they help governments become genuinely digital.
Mykhaylo, please give a simple definition – what are cognitive technologies?
Let me refer to Deloitte's definition, which was formulated several years ago. Cognitive technologies are products of artificial intelligence. They can perform tasks that previously could only be done by humans. Examples of cognitive technologies include computer vision, machine learning, natural language processing, speech recognition, and robotics.
You need to understand: when we talk about cognitive technologies, we mean the use of the capabilities of artificial intelligence (AI).
AI can be compared to a fundamental scientific discipline, while cognitive technologies are specific tools that use AI to solve particular problems. They can see, hear, understand, and reason like humans. It would therefore be even more accurate to talk about cognitive services, because they offer specific scenarios for using AI and are already quite available now. If you are creating a presentation in PowerPoint, you are already using the power of AI. The program recommends design formats and suggests templates: this is AI working in the cloud, offering you a design based on an analysis of hundreds of thousands of users’ actions. If you interact with your smartphone, then with 99% probability you are using cognitive services, be it biometrics, Cortana, Siri, or any other voice assistant.
“Quite available” means accessibility not only for users but also for developers. Microsoft’s Cognitive Services is a unique set of AI algorithms and application programming interfaces (APIs) that allows developers to add AI capabilities to their websites, applications, or AI agents (the most common of which are chatbots). In other words, Cognitive Services allows companies without data scientists (AI researchers), infrastructure, or a sufficient budget to build cognitive capabilities into their business processes and applications.
You are talking about specific scenarios for using the capabilities of artificial intelligence. What are those scenarios?
Today, we can talk about five main categories of cognitive services from Microsoft. The first category is decision support services. In this category, I would mention the scenario of detecting anomalies and increasing the reliability of any system through the early detection of problems.
For example, such a cognitive service can analyze a customer’s business data. When it sees deviations, it highlights them and passes them to an expert, because they “fall out” of the general rhythm of the data.
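To make the idea of points that “fall out” of the general rhythm concrete, here is a minimal sketch in Python. It is not Microsoft's service or its algorithm; it is a simple statistical stand-in (a modified z-score based on the median), and the function name, the threshold of 3.5, and the sample figures are all assumptions made for illustration.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Return the indices of values that deviate strongly from the rest.

    Illustrative only: uses the modified z-score (median and median
    absolute deviation), not any particular cloud service's method.
    """
    if not values:
        return []
    center = median(values)
    # Median absolute deviation: a robust measure of typical spread.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # All typical points are identical; no spread to compare against.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]

# Hypothetical daily order counts; one day clearly breaks the pattern.
daily_orders = [102, 98, 101, 99, 100, 103, 97, 250, 100, 101]
print(flag_anomalies(daily_orders))  # prints [7]
```

The flagged indices are what would be handed to a human expert for review; the sketch only highlights deviations and does not decide what they mean.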
Content filtering falls into the same category: the detection of potentially offensive images, profanity, undesirable text, adult content, and obscene video content. All of this is available as a cloud service that can be connected to any application under development.
The second category is services that help you extract meaning from unstructured text. One example is immersive reading, which helps readers of all skill levels understand a text using audio and visual cues. It is actively used in education.
This category also includes natural language understanding, which can be built into bot applications and IoT devices. Many companies have switched to electronic communication with customers. I think you have repeatedly seen the pop-up window in online stores: “I am Asel, the manager, how can I help you?”. A person begins to write a question in natural language, but there is no Asel. It is a bot, a cognitive service, which understands what is being written to it: by recognizing unstructured natural language, it works out what you are asking. The same category covers text analysis, the identification of key phrases and entities, and translation – today, we have a corresponding cognitive service that translates from more than 60 languages.
The third category is the integration of speech processing into applications and services. Here, speech recognition (speech to text), speech synthesis (text to speech), and the recognition and simultaneous translation of speech are integrated in real time into the applications that need them. This also includes identifying the speaker – it can be used, for example, at large online conferences to identify the speaker automatically by voice. Or it can be used simply for voice identification.
The fourth broad category is services for content identification and analysis (images, video, scanned text). The first is computer vision, which analyzes an image. A trained mathematical model determines what exactly is depicted, based on a description of key points: whether this is a car or that is a dog. The second subcategory here is custom vision and recognition. Let me explain with an example – a vast number of freight train cars pass along the railway. A camera aimed at a specific place makes it possible to recognize the numbers of passing cars in order to understand where a given car is now, which station it has passed, and who is renting it.
Face recognition – detecting and identifying people and emotions in images – is another such cognitive service. While it has to be trained to recognize specific people, it can recognize emotions regardless of whose photos the model was trained on. This category also includes recognizing and processing printed or scanned forms, for example to process ballots quickly after voting.
And the last subcategory here is the analysis of visual and audio channels and content indexing. This is automatic categorization. That is, such a cognitive service can be pointed at a considerable number of videos. For one video it will say that birds are singing; for another, that it shows military exercises. It can also analyze more deeply, identifying tanks, vehicles, and people, and add this metadata to the video's description. Such services are very important for photo and video banks.
And the fifth category is the smart search built into Microsoft Bing. Search by image, search by sound file, contextual search – all the power of search engines is also a cognitive service, which Microsoft provides as a ready-to-use interface that developers can embed into their systems.
Now, 75% of applications use AI technology and cognitive services in one form or another, and we don’t even notice it. The explosive development of such intelligent applications is due precisely to the fact that developers no longer need to create these services themselves – they take Microsoft cognitive services as raw material, as a ready-made brick. Such a service is always connected and draws on the full power of the Microsoft Azure public cloud.
How can governments use cognitive services, and for what?
In the current situation, at least to support employees working from home and to serve citizens seamlessly. Today, the state’s digital transformation is not about replacing old computers with new ones, or about buying a new Microsoft Office or a new OS. It is about setting up processes so that all routine work is automated. And for this, in more than 90% of cases, it is necessary to use the capabilities of AI, which means cognitive services from the Microsoft cloud, immediately ready for use by our partners and customers. Again, neither business nor government needs to invest in AI-related R&D. This is not their core function. We have been creating such tools since the early 1990s – it was then that Microsoft Research appeared. Its first three research groups were the natural language processing group, the computer vision group, and the computer speech group.
What cases of using Microsoft cognitive services at the state level already exist?
First of all, there are many kinds of e-government tools for interacting with people in natural language – chatbots or conversational robots. This approach allows the government to reduce the cost of call centers significantly.
By the way, during the pandemic, international health organizations, inundated with patient requests, created and deployed 1,230 bots based on our cognitive services so that people could assess for themselves whether they had signs of COVID-19. These apps served 18 million people.
Remember the Hollywood movies in which the secret services find a person in a crowd through their cameras? Such capabilities also exist, although the use of this technology depends on the legal framework in each country.
Among the cases are traffic control systems, and this is not only about cameras that monitor speed and traffic violations. It also includes forecasting congestion in the transport infrastructure, which makes it possible both to control traffic lights and to plan road reconstruction over the long term, based on a visual analysis of the classes of vehicles that use a given road. In the context of the coronavirus epidemic, Microsoft’s cognitive services are also used to analyze and forecast how the pandemic develops.
Is there a universal algorithm for a government's digital transformation journey using cognitive services – in other words, where is the best place to start?
Each country, each government, is looking for its path based on its priorities, tasks, and legislation. And we are always there, ready to share the best practices. Microsoft has participated in many projects in different countries, and most of them are quite successful.
From our own experience, we can say that a state's digitalization very often begins with smart cities. Striking examples here are Barcelona and London – megalopolises that, because of the load they carry, are forced to make their transport systems intelligent. But small cities also use cognitive services, which make it possible, for example, to control lighting effectively – that is, to turn it on not by a timer, but depending on the actual level of illumination, which is recognized by the cognitive service.
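The difference between switching lights by a timer and switching them by the actual level of illumination can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the illumination reading is taken as already available in lux (how a cognitive service would produce it is not covered), and the function name and both thresholds are hypothetical.

```python
def next_lamp_state(measured_lux, lamp_is_on, on_below=20.0, off_above=40.0):
    """Decide whether a street lamp should be on, given measured light.

    Two thresholds (hypothetical values) keep the lamp from flickering
    when the reading hovers around a single cut-off point.
    """
    if measured_lux < on_below:
        return True       # dark enough: switch on
    if measured_lux > off_above:
        return False      # bright enough: switch off
    return lamp_is_on     # in between: keep the current state

# Hypothetical readings over an evening and the following morning.
readings = [80.0, 35.0, 18.0, 30.0, 45.0]
state = False
history = []
for lux in readings:
    state = next_lamp_state(lux, state)
    history.append(state)
print(history)  # prints [False, False, True, True, False]
```

Unlike a timer, the decision follows the measured conditions, so the lamp comes on early on an overcast evening and stays off on a bright one.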
We see that more and more states use cognitive services because they recognize the need. Citizens in the era of the fourth industrial revolution are becoming more demanding about the quality of public services. They want to receive them quickly and efficiently, even during a crisis caused by a pandemic. The government must continue to deliver critical civil services securely (that is, electronically) and simultaneously respond to the crisis. And for this, it is necessary not to expand the bureaucratic apparatus, but to make the existing one more efficient by equipping civil servants with robust cognitive services.