{"id":1155945,"date":"2025-11-25T09:00:00","date_gmt":"2025-11-25T17:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=1155945"},"modified":"2025-11-24T13:47:29","modified_gmt":"2025-11-24T21:47:29","slug":"reducing-privacy-leaks-in-ai-two-approaches-to-contextual-integrity","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/reducing-privacy-leaks-in-ai-two-approaches-to-contextual-integrity\/","title":{"rendered":"Reducing Privacy\u00a0leaks in AI: Two approaches to contextual integrity\u00a0"},"content":{"rendered":"\n
\"Four<\/figure>\n\n\n\n

As AI agents become more autonomous in handling tasks for users, it’s crucial they adhere to contextual norms around what information to share\u2014and what to keep private. The theory of contextual integrity frames privacy as the appropriateness of information flow within specific social contexts. Applied to AI agents, it means that what they share should fit the situation: who\u2019s involved, what the information is, and why it\u2019s being shared.<\/p>\n\n\n\n

For example, an AI assistant booking a medical appointment should share the patient\u2019s name and relevant history but not unnecessary details of their insurance coverage. Similarly, an AI assistant with access to a user\u2019s calendar and email should use available times and preferred restaurants when making lunch reservations. But it should not reveal personal emails or details about other appointments while looking for suitable times, making reservations, or sending invitations. Operating within these contextual boundaries is key to maintaining user trust.<\/p>\n\n\n\n

However, today\u2019s large language models (LLMs) often lack this contextual awareness and can disclose sensitive information even without a malicious prompt. This underscores a broader challenge: AI systems need stronger mechanisms to determine what information is appropriate to include when processing a given task, and when it is appropriate to share it.<\/p>\n\n\n\n

Researchers at Microsoft are working to give AI systems contextual integrity so that they manage information in ways that align with expectations given the scenario at hand. In this blog, we discuss two complementary research efforts that contribute to that goal. Each tackles contextual integrity from a different angle, but both aim to build directly into AI systems a greater sensitivity to information-sharing norms.<\/p>\n\n\n\n

Privacy in Action: Towards Realistic Privacy Mitigation and Evaluation for LLM-Powered Agents<\/a>, accepted at EMNLP 2025, introduces PrivacyChecker<\/a>, a lightweight module that can be integrated into agents to make them more sensitive to contextual integrity. It also enables a new evaluation approach that transforms static privacy benchmarks into dynamic environments, revealing substantially higher privacy risks in real-world agent interactions. Contextual Integrity in LLMs via Reasoning and Reinforcement Learning<\/a>, accepted at NeurIPS 2025<\/a>, takes a different approach, treating contextual integrity as a problem that requires careful reasoning about the context, the information, and the parties involved in order to enforce privacy norms.<\/p>\n\n\n\n\t


Privacy in Action: Realistic mitigation and evaluation for agentic LLMs<\/h2>\n\n\n\n

Within a single prompt, PrivacyChecker extracts information flows (sender, recipient, subject, attribute, transmission principle), classifies each flow (allow\/withhold plus rationale), and applies optional policy guidelines (e.g., \u201ckeep phone number private\u201d) (Figure 1). It is model-agnostic and doesn\u2019t require retraining. On the static PrivacyLens<\/a> benchmark, PrivacyChecker was shown to reduce information leakage from 33.06% to 8.32% on GPT-4o and from 36.08% to 7.30% on DeepSeek-R1, while preserving the system\u2019s ability to complete its assigned task.<\/p>\n\n\n\n

\"The
Figure 1. (a) Agent workflow with a privacy-enhanced prompt. (b) Overview of the PrivacyChecker pipeline. PrivacyChecker enforces privacy awareness in the LLM agent at inference time through information flow extraction, a privacy judgment (i.e., a classification) per flow, and optional privacy guidelines, all within a single prompt.<\/figcaption><\/figure>\n\n\n\n
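To make the pipeline concrete, here is a minimal sketch of how such a single-prompt check could be wired in front of an agent\u2019s pending action. The function and prompt names (privacy_check, CHECK_PROMPT) are hypothetical illustrations rather than the released PrivacyChecker code, and the sketch assumes the agent\u2019s underlying LLM is exposed as a simple callable that maps a prompt to a completion.<\/p>\n\n\n\n
<pre class=\"wp-block-code\"><code>
# Minimal sketch of a PrivacyChecker-style inference-time check (hypothetical
# names, not the released module). A single prompt asks the LLM to extract the
# information flows in a pending action, judge each one, and honor any
# optional user-provided privacy guidelines.
import json

CHECK_PROMPT = '''You are a privacy checker for an AI agent.
Given the agent's pending action and the optional privacy guidelines below,
list every information flow as a JSON object with the fields:
sender, recipient, subject, attribute, transmission_principle,
judgment ('allow' or 'withhold'), and rationale.

Privacy guidelines: {guidelines}
Pending action: {action}

Return only a JSON list.'''

def privacy_check(llm, action_text, guidelines='none'):
    '''Extract flows from the pending action, judge each one, and report
    which flows the agent should withhold before acting.'''
    response = llm(CHECK_PROMPT.format(guidelines=guidelines, action=action_text))
    flows = json.loads(response)          # assumes the model returns valid JSON
    blocked = [f for f in flows if f['judgment'] == 'withhold']
    return flows, blocked

# Example usage (my_llm is any callable that returns a completion string):
# flows, blocked = privacy_check(my_llm, draft_email,
#                                guidelines='keep phone number private')
# if blocked:
#     pass  # revise or redact the action before the agent sends it
<\/code><\/pre>\n\n\n\n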

PrivacyChecker integrates into agent systems in three ways: <\/p>\n\n\n\n