Microsoft AI Security Layers Review
As AI tools, models (LLMs), and apps become increasingly common, whether for standard use on different and diverse LLMs or for Cybersecurity benefits, Microsoft’s tools offer several categories. These categories are not yet among the recognized tools of Copilot for Security, but other layers.
While Microsoft Copilot (in general) does many things and actions for Microsoft Office tools, third-party tools, etc, and helps people achieve their goals quickly. The Microsoft AI Security Layers provide benefits and allow you to have visibility, shield, and gain better security across AI workloads and technologies. If I can take something from the field after all the latest announcements and innovations, it is the gaps, conflict, and the lack of understanding of what everyone does when everyone can fit in. Is it possible to combine them and other gaps that can mislead in choosing the correct use of the tools?
The present article will review the Microsoft AI security pillars and the options related to the different technologies, components, and environments, from “DLP-based Used LLM” to “Private LLM”. While this blog post raises many of the AI security features, it still does not cover all of them.
AI Threat Landscape in a Nutshell
A moment before adding any security layer or AI security tool, you MUST know the threat landscape. The threat landscape in AI includes a few levels and types. Knowing them is crucial to understanding which security tool and coverage is needed.
| The sentence “You Can’t Protect What You Don’t Know” is also valid in this scenario and the many use cases of AI security. |
Understanding the landscape of LLMs is crucial for leveraging their capabilities effectively and security. Understanding the LLM layers helps with the security approach, whether it’s the attack surface, where to put the security efforts, risks, threats, etc. When you consider working with LLM for the users, the “Browser Agent” will be the answer from a security perspective. The security approach will be totally different if it’s for Private LLM (Build and API). What could be the risks for each type?
- Private (Build LLMs): Risks: Custom-built LLMs for organizational use may face threats such as data poisoning, unauthorized access, etc.
- Developer (Used LLMs): Risks: Developer tools like Dev AI tools can introduce vulnerabilities if not properly secured, including code injection and inadvertent exposure of sensitive data.
- Native Client (Used LLMs): Risks: Integrated solutions like Microsoft 365 Copilot may be susceptible to supply chain attacks, data leakage, unauthorized access, etc.
Understanding these categories helps you select the right security policies and tools for your specific needs, whether for public use, custom organizational solutions, developer assistance, or native client integration.

Understanding the AI Threat Landscape. AI has revolutionized various industries, offering unprecedented text, image, and speech generation capabilities. The accompanying diagram highlights the multifaceted threat landscape associated with AI. Critical Threats in the AI Ecosystem User Interaction Risks:
User Interaction Risks:
- Direct Prompt Injection (UPIA): Malicious inputs can manipulate AI responses.
- Data Leakage: Sensitive data might be unintentionally exposed.
- Unauthorized Access/Oversharing: Inadequate access controls can lead to data breaches.
- Hallucination: AI generates inaccurate or fabricated information.
- Overreliance: Users may overly depend on AI outputs without validation.
- Denial of Service (DoS): AI systems can be targeted to degrade service availability.
- Wallet (GPU Abuse): High computational costs due to malicious usage.
AI Application Risks:
- Data Poisoning: Compromising the training data to skew AI outputs. Indirect Prompt Injection (XPIA): Manipulation through indirect data sources.
- Orchestration Vulnerability: Weaknesses in integrating AI services can be exploited.
- Supply Chain Risks: Dependencies on third-party components can introduce vulnerabilities.
AI Model Risks:
- Insecure Plugins/Skills Design: Poorly designed integrations can be exploited.
- Jailbreak: Techniques to bypass AI safety mechanisms.
- Model Theft: Unauthorized access to proprietary models.
- Data Poisoning: Similar to application risks, specifically targeting model training phases.
- Model Vulnerabilities: Intrinsic weaknesses in AI models that can be exploited.
Understanding these risks is crucial for secure and ethical AI deployment as we harness the power of AI.

More AI Security updates on my LinkedIn profile.
Microsoft AI Security Pillars
The Microsoft Microsoft AI Security Layers are divided into three primary categories:
- Defender for Cloud
- Azure AI with AI Safety
- Microsoft Purview with AI Hub
Each category provides a complete solution for its environment and components. If you can combine them, you have a big umbrella for AI security.
The following image shows the primary pillars of Microsoft AI security at a high level with their technologies.

This image was taken from the recent conference – Microsoft Build 2024.
Defender for Cloud
Microsoft Defender for Cloud provides a unique approach to AI security. It starts with the AI-SPM, Threat Protection for AI workloads, and then the integration with Defender XDR, Prompt Shield, and others. The goal is to secure AI applications from code to runtime, providing visibility and detection for LLM and AI workloads across cloud vendors.
AI-SPM – Posture Management
As organizations embrace AI platforms and usage, many accelerate adoption with pre-built AI applications. In contrast, others develop AI applications in-house, tailored to their unique use cases, security controls, and compliance requirements. With all the new components of AI workloads, such as models, SDKs, training, and ground data, the visibility into understanding these new components’ configurations and associated risks is more important than ever.
The Defender CSPM in Microsoft Defender for Cloud provides AI security posture management capabilities that secure enterprise-built, multi, or hybrid cloud (currently Azure and AWS) AI applications throughout the entire application lifecycle. Defender for Cloud reduces risk to cross-cloud AI workloads by:
- Discovering AI-BOM includes application components, data, and AI artifacts from code to the cloud.
- Strengthening AI application security posture with built-in recommendations and exploring and remediating security risks.
- Using the attack path analysis to identify and remediate risks.
The Microsoft Defender for Cloud discovers AI workloads and identifies details of your organization’s AI BOM. This visibility allows you to identify and address vulnerabilities and protect AI applications from threats. Defenders for Cloud automatically and continuously discover deployed AI workloads across the following services:
- Azure OpenAI Service
- Azure ML
- Amazon Bedrock
Defender for Cloud can also discover vulnerabilities within AI library dependencies such as TensorFlow, PyTorch, and Langchain by scanning source code for IaC misconfigurations and container images for vulnerabilities. Regularly updating or patching the libraries can prevent exploits, protect AI applications, and maintain their integrity.
With the AI Security Posture Management (AI-SPM) capabilities in Microsoft Defender for Cloud, security can continuously discover and inventory AI components across Azure OpenAI Service, Azure ML, and Amazon Bedrock—including models, SDKs, and data—as well as sensitive data used in grounding, training, and fine-tuning LLMs. Admins can find vulnerabilities, identify exploitable attack paths, and quickly remediate risks to get ahead of active threats.
By mapping out AI workloads and synthesizing security signals such as identity, data security, and internet exposure, Defender for Cloud will continuously surface contextualized security issues and exploitable attack paths and suggest risk-based security recommendations tailored to prioritize critical gaps across your AI workloads.

The Microsoft Defender for Cloud and its recommendations and alerts surfaced directly on the resource page of Azure OpenAI in the Azure portal, aiming to meet resource owners directly.

The AI-SPM posture capabilities in Defender CSPM discover AI artifacts and assets by scanning code repositories for IaC misconfigurations and container images for vulnerabilities. With this, security teams have complete visibility of their AI stack from code to cloud to detect and fix vulnerabilities and misconfigurations before deployment.
Threat Protection for AI Workloads
Threat protection for AI workloads in Microsoft Defender for Cloud continually identifies real-time threats to AI applications and assists in responding to security issues in these applications.
The threat protection offering leverages a native integration of Azure OpenAI Service, Azure AI Content Safety prompt shields, and Microsoft threat intelligence to deliver contextual and actionable security alerts. Threat protection for AI workloads allows security teams to monitor their Azure OpenAI-powered applications in runtime for malicious activity associated with direct and indirect prompt injection attacks, sensitive data leaks, data poisoning, wallet abuse, or denial of service attacks.
AI applications are commonly grounded in organizational data. If sensitive data is held in the same data store, it can accidentally be shared or solicited via the application. The alert below shows an attempt to exfiltrate sensitive data using direct prompt injection during an Azure OpenAI model deployment. By leveraging the evidence provided, SOC teams can investigate the alert, assess the impact, and take precautionary steps to limit user’s access to the application or remove the sensitive data from the grounding data source.

Attacks like Jailbreak, for example, aim to alter the model’s designed purpose, making the application susceptible to data breaches and denial-of-service attacks. With Defender for Cloud, SOC analysts will be alerted to blocked prompt injection attempts with context and evidence of the IP and activity, with action steps to follow. The alert also includes recommendations to prevent future attacks on the affected resources and strengthen the security posture of the AI application.
A scenario – Possible prompt induced credential theft attempts through your Azure Open AI model deployment were detected
When I performed many attack scenarios against Azure AI and Amazon BedRock, the Microsoft Defender for Cloud detected all of them and provided rich information for the incident, alerts, evidence, and valuable information to investigate the attack.
The scenario below, “Possible prompt-induced credential theft attempts,” is one of them, and the images below describe part of the attack. The Microsoft Defender for Cloud presented all the required information: the attack, detection, alerts, evidence, actions, correlation with other actions or attacks, and more.
First is the Defender for Cloud blade, which shows alerts and detection with the known description of Defender for Cloud.

The alert provides general information and detailed information.

The evidence shows the action was made by the attack, including specific commands and detailed actions.

Defender for Cloud has built-in integrations into Microsoft Defender XDR. It shows all the alerts, including those related to AI attacks. This alert includes all the known information: attack story, evidence, response, etc. It also provides a correlation to other alerts.


| TIP: Defender for Cloud’s AI threat protection integrates with Azure AI Content Safety Prompt Shields and Microsoft’s threat intelligence signals |
Defender for Cloud has built-in integrations into Microsoft Defender XDR, so security teams can view the new security alerts related to AI workloads using the Defender XDR portal. This gives more context to those alerts and allows correlations across cloud resources, devices, and identity alerts. Security teams can also use Defender XDR to understand the attack story and related malicious activities associated with their AI applications by exploring correlations of alerts and incidents.
Microsoft Purview with AI Hub
Microsoft Purview AI Hub provides easy-to-use graphical tools and reports to quickly gain insights into AI use within your organization. One-click policies help you protect your data and comply with regulatory requirements. Using the AI Hub in conjunction with other Microsoft Purview capabilities to strengthen your data security and compliance for Microsoft Copilot for Microsoft 365:
- Sensitivity labels
- Data classification
- Communication compliance
- Auditing
- Content search
- eDiscovery
- Retention and deletion
- And more
Microsoft Purview is the only solution that has its information protection capabilities built into Microsoft Copilot for Microsoft 365, helping strengthen the data security for Microsoft Copilot for Microsoft 365. Microsoft Copilot for Microsoft 365 is built on Microsoft’s comprehensive approach to security, compliance, privacy, and responsible AI—so it is enterprise-ready. With Microsoft Purview, you can get additional data security capabilities such as sensitivity label citation and inheritance.
Microsoft Copilot for Microsoft 365 understands and honors sensitivity labels from Microsoft Purview and the permissions that come with the labels, regardless of whether the documents were labeled manually or automatically. With this integration, Copilot conversations and responses automatically inherit the label from reference files and ensure they are applied to the AI-generated outputs. As these are the same sensitivity labels that other Microsoft Purview solutions are aware of, organizations can instantly benefit from Purview Data Loss Prevention, Insider Risk Management, and Adaptive Protection on these labeled documents.
The Microsoft Purview with AI Hub can cover many aspects of Microsoft 365, including the Microsoft Copilot for Microsoft 365 and other areas such as OneDrive for Business, SharePoint Online, files, and more.

Data Security for AI
Microsoft Copilot and the AI applications operate using an ‘on-behalf-of permission model.’ This means that the AI applications can access (or have access to) data that the user has permission to access. If a user unintentionally has access to sensitive data, AI applications may unintentionally expose it. Addressing the issue is crucial to preventing such occurrences. The approach is to ensure that users are granted permission only for the files within the scope of their responsibilities within the organization. Implementing appropriate file classification and labeling can significantly mitigate these risks.
What could be the mitigation? Below are the primary features and options.
Sensitivity labels in Copilot 365
How Copilot works with sensitivity labels is critical to improving data protection in Microsoft 365. Copilot can:
- Recognize and use the labels during user interactions, helping to keep labeled data secure and compliant.
- Respect the encryption specified by the labels, checking if users have the correct permissions before accessing labeled data.
- Identify and apply the appropriate labels to the content that it generates based on the data source and the user’s preference.
Use sensitivity labels with Copilot for Microsoft 365
Combining sensitivity labels with Microsoft 365 Copilot brings benefits in both security and productivity:
- Automated label inheritance: Copilot automatically adopts the sensitivity labels of its source files. When Copilot creates new content from these files, it inherits their labels and protection settings, keeping data security consistent in new documents.
- Data security: Copilot follows the protection settings of sensitivity labels, like encryption. Data security is upheld, even when using AI features to handle or analyze sensitive information.
- Compliance: Copilot manages sensitive data according to the organization’s security protocols and compliance standards.
DLP for Data Exposure
An essential feature of endpoint DLP is its capability to block the pasting of sensitive data into specific websites and applications. Organizations can configure DLP policies for enhanced data security. These policies can prevent users from copying and pasting. For example, they can restrict the transfer of personal data from internal databases or documents. Endpoint DLP prevents data pasting on various platforms, including:
- AI websites: These sites can process and store the data you input, potentially leading to unintentional data retention or exposure. By blocking data pasting on these platforms, endpoint DLP can help safeguard against inadvertently sharing sensitive data with external AI tools, which might not align with your organization’s data protection policies.
- Personal email accounts: These accounts might not have the same level of encryption and authentication as your work email and might be vulnerable to hacking or phishing.
- Social media sites: These sites might expose your data to the public or to third parties who might misuse it for advertising or other purposes.
By blocking data pasting on these platforms, endpoint DLP can help in the following scenarios:
- Prevent data exposure: Avoid accidentally or intentionally sharing sensitive data with unauthorized parties or platforms.
- Comply with data protection regulations: You can follow the rules and standards for your organization and industry regarding data security and privacy.
- Enhance data security: You can reduce the risk of data breaches, leaks, or losses that might harm your organization or customers.
Endpoint DLP allows administrators to group sensitive domains or websites and apply different restrictions to each group. For instance, suppose you have a document that contains confidential customer information, like names, addresses, and phone numbers. You can copy and paste this information into your work email or SharePoint site, which is protected by encryption and authentication.
However, the situation changes if you try to paste this information into a personal email account, such as Hotmail. The same applies if you attempt to paste it into an AI tool. In these cases, endpoint DLP intervenes. It blocks the action immediately. Additionally, it displays a warning message. The message might read: your organization’s data policy prohibits this action. Please contact your administrator for more information.
Insider Risk Management – Detect browsing AI Sites
An indicator within Microsoft Purview is a specific metric or signal the system uses to monitor and evaluate activities for potential risks. The browsed-to-AI sites indicator is integrated into the Risky browser usage policy template to track the specific usage of AI websites within an organization.
Imagine an employee frequently visiting an AI site that offers advanced data modeling tools. While these tools can benefit their work, the site also hosts forums with sensitive content. The browsed to AI sites indicator:
- Flags employee visits to AI sites.
- Allows review of these visits against internal policies.
- Helps determine the purpose of site visits.
Protection Beyond Copilot 365
A broad scope of AI applications can be used daily, posing varying risks to your organization and data. And, with how quickly users want to use AI applications, training them to better manage sensitive data can slow adoption and productivity. Research shows that 11% of all data in ChatGPT is confidential, making it critical that organizations have controls to prevent users from sending sensitive data to AI applications. Microsoft Purview extends protection beyond Copilot for Microsoft 365 – in hundreds of AI applications such as ChatGPT, Gemini, etc.
The Insider Risk Management for browsing AI sites can be used as an indicator to gain visibility into AI site usage, including the types of AI sites visited the frequency with which these sites are being used, and the types of users visiting them. With this new capability, organizations can proactively detect the potential risks associated with AI usage and take action to mitigate them.
This list of AI sites, powered by Netstar, is automatically updated as new sites are added or become more popular. User information is pseudonymized by default, and strong privacy controls protect end-user trust. Learn more about our Insider Risk Announcements in this blog.
The Microsoft Purview DLP can prevent users from pasting sensitive data in AI prompts when accessed through supported web browsers. You can also use Adaptive Protection to make these policies dynamic such that elevated-risk users are prevented from interacting with sensitive data in AI prompts while low-risk users can maintain productivity.
| Note: This feature is based on classified and known data and cannot replace a solution like the “Native AI Browser agent.” |
AI HUB
AI Hub is a recent addition to Microsoft Purview, offering insights into the organization’s AI interactions with data. It presents statistical data through graphical representations, illustrating various aspects such as sensitive data shared with AI (based on labels and classifications), risky usage of AI applications, unethical usage (e.g., prompts that deviate from content safety principles), and compliance with AI regulations.

| TIP: AI Hub uses Azure AI Content Safety to moderate the content of the user’s query and the response generated by our LLM (ChatGPT). |
Protection for Sensitive Information in AI Prompts and Responses
With Microsoft Purview and Defender XDR, you can block apps that pose a risk to your employees and protect sensitive data as they interact with those applications—both in AI prompts and responses. This ensures that sensitive data does not get into the adversary’s hands.
Microsoft Purview is a solution that has its information protection capabilities built into Microsoft Copilot for Microsoft 365, helping strengthen the data security for Microsoft Copilot for Microsoft 365. Microsoft Copilot for Microsoft 365 is built on Microsoft’s comprehensive approach to security, compliance, privacy, and responsible AI. With Microsoft Purview, customers can get additional data security capabilities such as sensitivity label citation and inheritance.
Microsoft Copilot for Microsoft 365 understands and honors sensitivity labels from Microsoft Purview and the associated permissions, regardless of whether the documents were labeled manually or automatically. With this integration, Copilot conversations and responses automatically inherit the label from reference files and ensure it is applied to the AI-generated outputs.
As these are the same sensitivity labels that other Microsoft Purview solutions are aware of, organizations can instantly benefit from Purview Data Loss Prevention, Insider Risk Management, and Adaptive Protection on these labeled documents. Some example scenarios are: When users reference a labeled file in a Copilot prompt or conversation, they can clearly see the document’s sensitivity label. This visual cue informs the user that Copilot is interacting with a sensitive document and that they should adhere to their organization’s data security policies.

When users reference a labeled document in a conversation, the Copilot responses inherit the sensitivity label from the referenced document. Similarly, if a user asks Copilot to create new content based on a labeled document, Copilot-created content automatically inherits the sensitivity label, along with all its protection, from the referenced file. When a user references multiple documents with different sensitivity labels, the Copilot conversation or the generated content inherits the most protective sensitivity label.

More information on Securing data in an AI-first world with Microsoft Purview
Azure AI Content Safety
Azure AI Content Safety detects harmful user-generated and AI-generated content in applications and services. It includes text and image APIs that allow you to detect harmful material. We also have an interactive Content Safety Studio that will enable you to view, explore, and try out sample codes to detect harmful content across different modalities.
Where it’s used? The following are a few scenarios in which a software developer or team would require a content moderation service:
- User prompts submitted to AI service.
- Content produced by AI models.
- Social messaging platforms that moderate images and text added by their users.
The moderator works for both text and image content. It can detect adult content, racy content, offensive content, and more. The service can moderate content from various sources, such as social media, public-facing communication tools, and enterprise applications.
- Language models analyze multilingual text, in short and long form, with an understanding of context and semantics.
- Vision models perform image recognition and detect objects in images using state-of-the-art Florence technology.
- AI content classifiers identify sexual, violent, hate, and self-harm content with high levels of granularity.
- Content moderation severity scores indicate the low to high scale content risk level.
Azure AI Content Safety Studio is an online tool designed to handle potentially offensive, risky, or undesirable content using cutting-edge content moderation ML models. It provides templates and customized workflows, enabling users to choose and build their own content moderation system. Users can upload or try their content using the provided sample content.
Different types of analysis are available from this service. The following table describes the currently available APIs.
| Feature | Functionality |
|---|---|
| Prompt Shields | Scans text for the risk of a User input attack on a Large Language Model. |
| Groundedness detection | Detects whether the text responses of large language models (LLMs) are grounded in the source materials provided by the users. |
| Protected material text detection | Scans AI-generated text for known text content (for example, song lyrics, articles, recipes, selected web content). |
| Custom categories API | It lets you create and train custom content categories and scan text for matches. |
| Custom categories (Rapid) API | Lets you define emerging harmful content patterns and scan text and images for matches. |
| Analyze text API | Scans text for sexual content, violence, hate, and self-harm with multi-severity levels. |
| Analyze image API | Scans images for sexual content, violence, hate, and self-harm with multi-severity levels. |
Prompt Shields
AI models can be exploited by malicious actors. To mitigate these risks, we integrate safety mechanisms to restrict the behavior of LLMs within a safe operational scope. However, despite these safeguards, LLMs can still be vulnerable to adversarial inputs bypassing integrated safety protocols. Prompt Shields is a unified API that analyzes LLM inputs and detects User Prompt attacks and Document attacks, two standard adversarial inputs.
Types of input attacks
The two types of input attacks that Prompt Shields detect are described in this table.
| Type | Attacker | Entry point | Method | Objective/impact | Resulting behavior |
|---|---|---|---|---|---|
| User Prompt attacks | User | User prompts | Ignoring system prompts/RLHF training | Altering intended LLM behavior | Performing restricted actions against training |
| Document attacks | Third-party | Third-party content (documents, emails) | Misinterpreting third-party content | Gaining unauthorized access or control | Executing unintended commands or actions |
Groundedness detection
The Groundedness Detection API detects whether the text responses of large language models (LLMs) are grounded in the source materials provided by the users. Ungroundedness refers to instances where the LLMs produce non-factual or inaccurate information from what was present in the source materials.
Key terms
- Retrieval Augmented Generation (RAG)
- Groundedness and Ungroundedness in LLMs
Groundedness detection features
- Domain Selection: Users can choose an established domain to ensure more tailored detection that aligns with the specific needs of their field. Currently, the available domains are
MEDICALandGENERIC. - Task Specification: This feature lets you select the task you’re doing, such as QnA and Summarization, with adjustable settings according to the task type.
- Speed vs. Interpretability: Two modes trade-off speed with result interpretability.
- Non-Reasoning mode: Offers fast detection capability and is easy to embed into online applications.
- Reasoning mode: Offers detailed explanations for detected ungrounded segments; better for understanding and mitigation.
Protected material detection
The Protected material text API flags known text content (for example, song lyrics, articles, recipes, and selected web content) that might be output by large language models. This guide details the kind of content the protected material API detects.
Custom Categories API
Azure AI Content Safety lets you create and manage content moderation categories for enhanced moderation and filtering that matches your specific policies or use cases.
The Custom Categories (standard) API enables customers to define categories specific to their needs, provide sample data, train a custom machine learning model, and use it to classify new content according to the learned categories.
This is the standard workflow for customizing machine learning models. The model can reach very good performance levels depending on the training data quality, but it can take several hours to train. This implementation works on text content, not image content.
Custom categories (Rapid) API
The Custom Categories (Rapid) API is designed to be quicker and more flexible than the standard method. It’s meant to identify, analyze, contain, eradicate, and recover from cyber incidents involving inappropriate or harmful content on online platforms.
An incident may involve a set of emerging content patterns (text, image, or other modalities) that violate Microsoft community guidelines or the customers’ policies and expectations. These incidents must be mitigated quickly and accurately to avoid potential live site issues or harm to users and communities. This implementation works on text content and image content.
Useful info
Microsoft Purview data security and compliance protections for Gen AI apps
Manage AI data security challenges with Microsoft Purview
Configure DLP policies for Copilots