The promise of corporate Artificial Intelligence is simple to explain but difficult to execute: helping teams work faster without sensitive data escaping the company’s perimeter. This week, Microsoft had to admit that this line was blurred for several weeks because of a bug in Microsoft 365 Copilot, the assistant integrated into Outlook and the rest of the suite. According to the company, the flaw caused Copilot to read and summarize emails labeled as “confidential”, even when organizations had enabled Data Loss Prevention (DLP) policies specifically designed to prevent this.
The incident, which Microsoft tracks internally under the identifier CW1226324, was detected on January 21 and affected a specific feature: the Work tab of Copilot Chat. The abnormal behavior involved the system “collecting” and processing messages in Sent Items and Drafts, including those carrying confidentiality labels meant to impose restrictions on automated tools and reinforce compliance.
What exactly went wrong: when the label exists but the barrier doesn’t apply
In Microsoft 365 environments, confidentiality labels (sensitivity labels) are not just a “visual notice.” They are part of Microsoft Purview and are used to classify information. Based on the sensitivity level, they can apply protections that travel with the content (such as access limits or usage restrictions).
Adding to this layer is DLP, which in many companies acts as the last line of defense: policies that prevent sensitive information (documents, emails, identifiers, etc.) from being processed or shared in unauthorized contexts. In the case of Copilot and Copilot Chat, Microsoft offers specific policy locations to restrict processing when content has certain labels or matches protection rules.
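To picture how these two layers are meant to interact, here is a minimal Python sketch of a label-aware gate sitting in front of an assistant. It is purely illustrative: the label names, the MailItem structure, and the filtering function are hypothetical simplifications, not Microsoft’s implementation of DLP or Copilot.

```python
from dataclasses import dataclass

# Illustrative only: a simplified stand-in for the kind of check a DLP layer
# performs before an assistant is allowed to read an item. Label names and
# data structures are hypothetical.
BLOCKED_LABELS = {"Confidential", "Highly Confidential"}

@dataclass
class MailItem:
    folder: str              # e.g. "Inbox", "Sent Items", "Drafts"
    sensitivity_label: str   # label applied through Microsoft Purview
    body: str

def allowed_for_assistant(item: MailItem) -> bool:
    """Return True only if no restrictive label applies to the item."""
    return item.sensitivity_label not in BLOCKED_LABELS

def items_for_copilot(items: list[MailItem]) -> list[MailItem]:
    # Every item must pass the label check, regardless of the folder it
    # lives in. The reported bug amounted to content from Sent Items and
    # Drafts reaching the assistant without a filter of this kind applying.
    return [item for item in items if allowed_for_assistant(item)]
```

The point of the sketch is that the gate sits in one place and applies to every folder; the moment one code path bypasses it, the label stops being a barrier and becomes a notation.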
The problem, according to Microsoft, was a coding error that allowed Copilot Chat to include items from Sent Items and Drafts even when confidentiality labels were present and a DLP policy was configured. The practical consequence is serious: if an employee asked the chat about a topic, the assistant could summarize content that, by design, should have been kept out of automation’s reach.
An incident amid the rapid expansion of Copilot Chat
The timing of the incident is also significant. Microsoft began deploying Copilot Chat (Microsoft 365 Copilot Chat) in applications like Word, Excel, PowerPoint, Outlook, and OneNote for paid enterprise customers in September 2025, accelerating AI integration into daily workflows.
This broad rollout has turned Copilot into a “new employee” interacting with internal content. Any control failure — even if limited in scope — becomes particularly sensitive for regulated sectors (healthcare, legal, finance), teams working with NDAs, HR departments, or areas handling strategic data (business plans, mergers, security incidents). It’s not just a technical issue; it’s a matter of trust.
What Microsoft says about the scope and status of the fix
Microsoft has categorized the incident as a security advisory, a label typically used for issues that, according to the company, tend to have limited or confined impact. Still, it hasn’t specified how many organizations or users were affected and warned that the scope could change as the investigation progresses.
The company also indicated that it began deploying a fix in early February and, as of the most recent reports in specialized media, was still monitoring the rollout and reaching out to a subset of users to confirm that the fix was working as intended.
Another concern for compliance officers is that some information about the incident circulated via customer service notifications, heightening the sense that the market is becoming accustomed to “AI incidents” being handled as operational issues when they can, in fact, affect policies many companies consider non-negotiable.
Lessons for the market: security cannot rely on a single barrier
A broader takeaway from this event is that organizations are building their AI governance on a combination of labeling, DLP policies, and access controls. If any of these elements fails because of an implementation slip, security stops being a system and becomes an expectation.
In the corporate landscape, this doesn’t mean abandoning automation but reinforces a key idea: Artificial Intelligence cannot be added as a “magical layer” without accepting that incidents will happen and that the response must be quick, measurable, and auditable. In practice, many companies treat these episodes as they would a behavior change in a critical system: checking service status, validating policies, documenting internal tests, and monitoring compliance after each update.
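As a rough illustration of the “checking service status” step, the sketch below pulls the current service-health snapshot from Microsoft Graph (GET /admin/serviceAnnouncement/healthOverviews). The tenant, client ID, and secret are placeholders, and the app registration is assumed to hold the ServiceHealth.Read.All application permission; treat it as a starting point, not a finished monitoring job.

```python
import msal
import requests

# Placeholders: replace with real tenant and app-registration values.
TENANT_ID = "<tenant-id>"
CLIENT_ID = "<app-client-id>"
CLIENT_SECRET = "<app-secret>"

# Acquire an app-only token for Microsoft Graph.
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
if "access_token" not in token:
    raise RuntimeError(token.get("error_description", "token acquisition failed"))

# Current health state per workload (requires ServiceHealth.Read.All).
resp = requests.get(
    "https://graph.microsoft.com/v1.0/admin/serviceAnnouncement/healthOverviews",
    headers={"Authorization": f"Bearer {token['access_token']}"},
    timeout=30,
)
resp.raise_for_status()

for overview in resp.json().get("value", []):
    # 'service' is the workload name and 'status' its current health state;
    # the exact name under which Copilot appears can vary by tenant.
    print(f"{overview['service']}: {overview['status']}")
```

Validating DLP policies and documenting internal tests remain portal or process tasks; the snapshot above only covers the service-status part of that routine.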
As productivity and automation continue to drive progress, what’s at stake is not just whether Copilot summarizes an email correctly. It’s whether the rules a company defines as “confidential” still mean the same thing once Artificial Intelligence comes into play.
Frequently Asked Questions
What is a DLP policy in Microsoft 365, and how does it work with Copilot?
DLP (Data Loss Prevention) comprises policies within Microsoft Purview that help prevent sensitive information from being used or shared improperly. In Copilot, policies can be configured to restrict the assistant from processing certain content, such as emails or files with specific labels.
Which Outlook folders were impacted by the Copilot failure?
According to information released, the behavior affected emails located in Sent Items and Drafts, which could be processed by Copilot Chat despite having confidentiality labels.
How can a company find out if it was affected by CW1226324?
Microsoft handled it as a service advisory. In enterprise environments, the usual step is to review the Microsoft 365 Admin Center and the service health notices related to Copilot to see whether the tenant was impacted.
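For a more targeted check, an administrator (or a script with the right Graph permissions) can look for the advisory identifier itself. A hedged sketch follows, assuming the advisory surfaces through the Graph service communications API and that the app registration holds ServiceHealth.Read.All and ServiceMessage.Read.All; whether an item of this kind lands under /issues or /messages can vary, so both are queried.

```python
import msal
import requests

ADVISORY_ID = "CW1226324"  # identifier reported for this incident

# Placeholders: replace with real tenant and app-registration values.
TENANT_ID = "<tenant-id>"
CLIENT_ID = "<app-client-id>"
CLIENT_SECRET = "<app-secret>"

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])
if "access_token" not in token:
    raise RuntimeError(token.get("error_description", "token acquisition failed"))
headers = {"Authorization": f"Bearer {token['access_token']}"}

graph = "https://graph.microsoft.com/v1.0/admin/serviceAnnouncement"
for endpoint in ("issues", "messages"):
    resp = requests.get(f"{graph}/{endpoint}", headers=headers, timeout=30)
    resp.raise_for_status()
    # Match the advisory by its identifier in the first page of results.
    matches = [m for m in resp.json().get("value", []) if m.get("id") == ADVISORY_ID]
    for m in matches:
        print(f"Found in {endpoint}: {m['id']} - {m.get('title', '')}")
```

Both endpoints are paginated, so a production script would follow @odata.nextLink (or filter server-side) rather than scanning only the first page of results.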
What’s the difference between “confidentiality label” and “DLP” in Microsoft Purview?
A label classifies content and may apply protections to it (for example, restrictions based on sensitivity). DLP enforces rules that prevent unauthorized use or transfer of data. In AI scenarios, the two often work together to keep content with sensitive labels out of responses or summaries.