Announcing new tools in Azure AI to help you build more secure and trustworthy generative AI applications

In the rapidly evolving landscape of generative AI, business leaders are trying to strike the right balance between innovation and risk management. Prompt injection attacks have emerged as a significant challenge, where malicious actors try to manipulate an AI system into doing something outside its intended purpose, such as producing harmful content or exfiltrating confidential data. In addition to mitigating these security risks, organizations are also concerned about quality and reliability. They want to ensure that their AI systems are not generating errors or adding information that isn’t substantiated in the application’s data sources, which can erode user trust.

To help customers meet these AI quality and safety challenges, we’re announcing new tools now available or coming soon to Azure AI Studio for generative AI app developers:

  • Prompt Shields to detect and block prompt injection attacks, including a new model for identifying indirect prompt attacks before they impact your model, coming soon and now available in preview in Azure AI Content Safety.
  • Safety evaluations to assess an application’s vulnerability to jailbreak attacks and to generating content risks, now available in preview.
  • Risk and safety monitoring to understand what model inputs, outputs, and end users are triggering content filters to inform mitigations, coming soon, and now available in preview in Azure OpenAI Service.

With these additions, Azure AI continues to provide our customers with innovative technologies to safeguard their applications across the generative AI lifecycle.

Safeguard your LLMs against prompt injection attacks with Prompt Shields 

A colorful illustration of red and blue cubes floating through a purple computer screen.

Prompt injection attacks, both direct attacks, known as jailbreaks, and indirect attacks, are emerging as significant threats to foundation model safety and security. Successful attacks that bypass an AI system’s safety mitigations can have severe consequences, such as personally identifiable information (PII) and intellectual property (IP) leakage.

A colorful illustration of red and blue cubes floating towards a purple computer screen. The red cubes are blocked by a shield in front of the computer screen.

To combat these threats, Microsoft has introduced Prompt Shields to detect suspicious inputs in real time and block them before they reach the foundation model. This proactive approach safeguards the integrity of large language model (LLM) systems and user interactions.

A colorful illustration depicting a user sending a red prompt that says “do anything now” towards a blue computer screen. The prompt is blocked by a blue shield.

Prompt Shield for Jailbreak Attacks: Jailbreak, direct prompt attacks, or user prompt injection attacks, refer to users manipulating prompts to inject harmful inputs into LLMs to distort actions and outputs. An example of a jailbreak command is a ‘DAN’ (Do Anything Now) attack, which can trick the LLM into inappropriate content generation or ignoring system-imposed restrictions. Our Prompt Shield for jailbreak attacks, released this past November as ‘jailbreak risk detection’, detects these attacks by analyzing prompts for malicious instructions and blocks their execution.

 

A colorful illustration of red and blue cubes floating from a yellow folder of files towards a blue rectangular screen. The red cubes are blocked by a blue shield.

Prompt Shield for Indirect Attacks: Indirect prompt injection attacks, although not as well-known as jailbreak attacks, present a unique challenge and threat. In these covert attacks, hackers aim to manipulate AI systems indirectly by altering input data, such as websites, emails, or uploaded documents. This allows hackers to trick the foundation model into performing unauthorized actions without directly tampering with the prompt or LLM. The consequences of which can lead to account takeover, defamatory or harassing content, and other malicious actions. To combat this, we’re introducing a Prompt Shield for indirect attacks, designed to detect and block these hidden attacks to support the security and integrity of your generative AI applications.

Identify LLM Hallucinations with Groundedness detection

A colorful illustration of a blue computer screen filled with lines of blue text and one squiggly line of red text, with a yellow alert symbol above it. Behind the computer screen are two blue databases with an arrow pointing to the computer screen.

‘Hallucinations’ in generative AI refer to instances when a model confidently generates outputs that misalign with common sense or lack grounding data. This issue can manifest in different ways, ranging from minor inaccuracies to starkly false outputs. Identifying hallucinations is crucial for enhancing the quality and trustworthiness of generative AI systems. Today, Microsoft is announcing Groundedness detection, a new feature designed to identify text-based hallucinations. This feature detects ‘ungrounded material’ in text to support the quality of LLM outputs.

 

Steer your application with an effective safety system message

In addition to adding safety systems like Azure AI Content Safety, prompt engineering is one of the most powerful and popular ways to improve the reliability of a generative AI system. Today, Azure AI enables users to ground foundation models on trusted data sources and build system messages that guide the optimal use of that grounding data and overall behavior (do this, not that). At Microsoft, we have found that even small changes to a system message can have a significant impact on an application’s quality and safety. To help customers build effective system messages, we’ll soon provide safety system message templates directly in the Azure AI Studio and Azure OpenAI Service playgrounds by default. Developed by Microsoft Research to mitigate harmful content generation and misuse, these templates can help developers start building high-quality applications in less time.

Evaluate your LLM application for risks and safety 

A colorful illustration of a blue screen with three bar charts, each with text descriptions beneath them. One bar chart is highlighted and shows two short bars in a blue color and one tall bar in a red color.

How do you know if your application and mitigations are working as intended? Today, many organizations lack the resources to stress test their generative AI applications so they can confidently progress from prototype to production. First, it can be challenging to build a high-quality test dataset that reflects a range of new and emerging risks, such as jailbreak attacks. Even with quality data, evaluations can be a complex and manual process, and development teams may find it difficult to interpret the results to inform effective mitigations.

Azure AI Studio provides robust, automated evaluations to help organizations systematically assess and improve their generative AI applications before deploying to production. While we currently support pre-built quality evaluation metrics such as groundedness, relevance, and fluency, today we’re announcing automated evaluations for new risk and safety metrics. These safety evaluations measure an application’s susceptibility to jailbreak attempts and to producing violent, sexual, self-harm-related, and hateful and unfair content. They also provide natural language explanations for evaluation results to help inform appropriate mitigations. Developers can evaluate an application using their own test dataset or simply generate a high-quality test dataset using adversarial prompt templates developed by Microsoft Research. With this capability, Azure AI Studio can also help augment and accelerate manual red-teaming efforts by enabling red teams to generate and automate adversarial prompts at scale.

Monitor your Azure OpenAI Service deployments for risks and safety in production 

A colorful illustration of a blue computer screen with different kinds of charts in a dashboard. A blue line graph is highlighted, with a peak on the y-axis in red.

Monitoring generative AI models in production is an essential part of the AI lifecycle. Today we are pleased to announce risk and safety monitoring in Azure OpenAI Service. Now, developers can visualize the volume, severity, and category of user inputs and model outputs that were blocked by their Azure OpenAI Service content filters and blocklists over time. In addition to content-level monitoring and insights, we are introducing reporting for potential abuse at the user level. Now, enterprise customers have greater visibility into trends where end-users continuously send risky or harmful requests to an Azure OpenAI Service model. If content from a user is flagged as harmful by a customer’s pre-configured content filters or blocklists, the service will use contextual signals to determine whether the user’s behavior qualifies as abuse of the AI system. With these new monitoring capabilities, organizations can better-understand trends in application and user behavior and apply those insights to adjust content filter configurations, blocklists, and overall application design.

Confidently scale the next generation of safe, responsible AI applications

Generative AI can be a force multiplier for every department, company, and industry. Azure AI customers are using this technology to operate more efficiently, improve customer experience, and build new pathways for innovation and growth. At the same time, foundation models introduce new challenges for security and safety that require novel mitigations and continuous learning.

At Microsoft, whether we are working on traditional machine learning or cutting-edge AI technologies, we ground our research, policy, and engineering efforts in our AI principles. We’ve built our Azure AI portfolio to help developers embed critical responsible AI practices directly into the AI development lifecycle. In this way, Azure AI provides a consistent, scalable platform for responsible innovation for our first-party copilots and for the thousands of customers building their own game-changing solutions with Azure AI. We’re excited to continue collaborating with customers and partners on novel ways to mitigate, evaluate, and monitor risks and help every organization realize their goals with generative AI with confidence.

Learn more about today’s announcements

Related Posts

Sieben Wege, wie KI unser Leben einfacher macht

Künstliche Intelligenz (KI) ist mehr als ein Werkzeug für Büroarbeiter*innen oder eine Produktivitätssoftware für Unternehmen. KI kann jederzeit und überall schnelle Antworten geben und dabei helfen, Ideen, Informationen und Inspiration für fast alle Situationen in Beruf und Freizeit zu entwickeln. Eine einfache Möglichkeit, KI zu nutzen, ist Microsoft Copilot auf dem iOS- oder Android-Smartphone zu installieren. Die kostenlose KI-Anwendung fasst Informationen aus dem Internet zusammen, hilft beim Recherchieren, Planen oder Erstellen von Texten oder Bildern auf Grundlage von Anfragen oder sogenannten “Prompts”. Hier sind sieben Tipps, um mit KI mehr aus dem Tag zu machen:  

Sulzer Schmid’s Blade Anomaly Detection AI with Microsoft Azure

Sulzer Schmid stands at the forefront of innovation in the energy service sector. Leveraging cutting-edge technology, the company’s rotor blade inspection process employs autonomous drones to capture repeatable and consistently high-quality images. The cloud-based 3DX™ Blade Platform offers a data driven approach, incorporating AI-enhanced analytics, providing customers with actionable insights to optimize performance of renewable energy assets.

Mit Copilot, Windows und Surface die neue Ära der Arbeit gestalten

Vor einem Jahr haben wir Copilot für Microsoft 365 vorgestellt. Die Daten aus unserer Work-Trend-Index-Studie zeigen, dass unsere Technologie Mitarbeitende produktiver und kreativer macht und manche dadurch bis zu 10 Stunden Arbeitszeit pro Monat sparen.[i] Deshalb arbeiten wir weiter an Innovationen und statten unser gesamtes Produktportfolio mit Copilot-Funktionen aus, einschliesslich all der Anwendungen und Dienste, auf die Unternehmen bauen – von Windows und Microsoft 365 bis hin zu Teams, Edge und mehr.

Advancing the new era of work with Copilot, Windows, and Surface

It’s been one year since we first introduced the world to Copilot for Microsoft 365, and data from our Work Trend Index research shows it’s already making employees more productive and creative, saving some as much as 10 hours per month.1 We’re continuing to innovate, bringing Copilot capabilities to our entire product portfolio, including the applications and services organizations are built on—from Windows and Microsoft 365 to Microsoft Teams, Edge, and more.