Knowledge in Microsoft Copilot Studio

What’s Under the Hood: RAG in Action for your enterprise knowledge.

Microsoft Copilot Studio enables the integration of various enterprise systems into your agents, with built-in security, governance, and unified validation experiences. It supports productivity data from Office applications, line-of-business data from sources such as Dataverse (Dynamics 365), Salesforce, and ServiceNow, as well as local files. For more detail on supported knowledge sources and their configuration, read the overview of different knowledge sources supported by Copilot Studio.

Knowledge in Copilot Studio

Enterprise data is used to ground your agents in your organization’s knowledge so that their responses are accurate and relevant. This process is known as Retrieval Augmented Generation (RAG). This document provides an in-depth look at how data from your enterprise systems is transformed into knowledge through RAG for the following data sources:

  • Files – using the built-in semantic index of Copilot Studio
  • Enterprise Systems (Dataverse tables, Azure SQL, Azure AI Search index, Salesforce, ServiceNow, and more) – through 1,400+ Power Platform connectors, or by leveraging the Microsoft 365 semantic index created by Microsoft Graph connections

Files

Uploading files as knowledge helps makers enrich their agents with additional data, augmenting the LLM’s knowledge and grounding the agent in specific information provided by the maker. Makers can upload a variety of files (see types and limits here), which are semantically indexed as vector embeddings and then used as knowledge for agents. This knowledge can then be shared with both authenticated and unauthenticated users of the agent.

Unstructured Knowledge Sources

To improve the agent’s responses, uploaded files are chunked into smaller pieces for faster processing and vector-indexed so that they can be semantically matched against the user query. The chunks are stored automatically in a built-in store for the agent. Indexing time depends on the file size. When a user queries the agent, the Copilot Studio orchestrator retrieves the chunks most relevant to the query, and the LLM then summarizes the top chunks.
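The chunk, index, and retrieve steps described above can be sketched in a few lines. This is a toy illustration only: it splits on sentences and uses term counts in place of learned vector embeddings, and the function names and sample text are hypothetical. It does not reflect how Copilot Studio chunks or embeds files internally.

```python
import math
import re
from collections import Counter


def chunk_text(text, sentences_per_chunk=1):
    """Split text into chunks of N sentences (assumed chunking strategy)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text):
    """Toy stand-in for a vector embedding: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


document = (
    "Expense reports must be submitted within 30 days of travel. "
    "Receipts are required for any purchase above 25 dollars. "
    "Vacation requests are approved by the direct manager. "
    "Unused vacation days expire at the end of the fiscal year."
)
chunks = chunk_text(document)
best = top_chunks("When do vacation days expire?", chunks, k=1)
print(best[0])
```

In a real pipeline the top chunks would be passed to the LLM as grounding context for the summarized answer; here they are simply printed.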

Enterprise Systems

Ensuring secure access to knowledge sources is critical for managing and harnessing enterprise data effectively. Data in enterprise systems is accessed in the context of the end user, who always sees the current data in the tables, based on the security roles assigned to them. When any application, user, or the agent itself modifies the underlying data in these tables, the change is reflected in real time in the agent’s next query.
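As a rough mental model of access "in the context of the end user", the sketch below filters records by table, row, and column before anything reaches the agent. The `User` class, the ownership rule, and the sample records are all hypothetical; this is not the Dataverse security model, only an illustration of the idea that the same question can yield different data for different users.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    readable_tables: set = field(default_factory=set)  # table-level access
    hidden_columns: set = field(default_factory=set)   # column-level restriction


def visible_rows(user, table, rows):
    """Return only the rows and columns this user may see (toy rules)."""
    if table not in user.readable_tables:
        return []
    return [
        {k: v for k, v in row.items() if k not in user.hidden_columns}
        for row in rows
        if row["owner"] == user.name  # row-level rule: owners see their own rows
    ]


accounts = [
    {"owner": "ana", "account": "Contoso", "credit_limit": 5000},
    {"owner": "ben", "account": "Fabrikam", "credit_limit": 900},
]

ana = User("ana", readable_tables={"accounts"}, hidden_columns={"credit_limit"})
guest = User("guest")

print(visible_rows(ana, "accounts", accounts))
print(visible_rows(guest, "accounts", accounts))
```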

Enterprise sources such as Dataverse tables contain data from Dynamics 365 business applications (for example, Sales, Marketing, and Finance) as well as custom business data useful for LOB agents. Makers can use existing Dataverse tables or create new tables via Data Workspace to provide knowledge for agents. New tables can be populated from Excel files, SharePoint lists, and external systems using Dataflows built on Power Query.

When Dataverse is used as knowledge, the user’s query is translated into a runtime query against Dataverse. The data and metadata in these tables are semantically indexed, and vector embeddings help link the objects in the user query to schema elements and annotate values. These annotations, along with the synonyms and glossaries that the maker provides in Microsoft Copilot Studio when adding Dataverse tables, determine which columns are used when generating a Power Fx query from the natural language. Because synonyms and glossaries give the models additional business and organizational context, they play a big role in the quality of the agent’s responses. At runtime, the rich Dataverse security model is enforced, including table, row, and column security, so end users see only the records they have access to. The entire end-to-end process is illustrated below.

Copilot Studio to Dataverse
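To make the role of synonyms concrete, here is a minimal sketch of mapping business vocabulary in a question to table columns. It assumes a plain dictionary lookup, whereas the text above describes vector embeddings and Power Fx generation; the column names, the synonym format, and `resolve_columns` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical columns of an "Account" table; names are illustrative only.
COLUMNS = {"name", "annualrevenue", "address1_city"}

# Maker-provided synonyms mapping business terms to columns (assumed format).
SYNONYMS = {
    "revenue": "annualrevenue",
    "turnover": "annualrevenue",
    "city": "address1_city",
    "location": "address1_city",
    "customer": "name",
}


def resolve_columns(question, synonyms):
    """Return the columns a question refers to, in order of first mention."""
    found = []
    for token in re.findall(r"[a-z0-9_]+", question.lower()):
        column = token if token in COLUMNS else synonyms.get(token)
        if column and column not in found:
            found.append(column)
    return found


question = "What is the turnover and location of each customer?"
print(resolve_columns(question, SYNONYMS))  # columns found with synonyms
print(resolve_columns(question, {}))        # nothing resolves without them
```

Without the synonym table, none of the terms in the sample question match a column name, which is the gap that maker-provided synonyms and glossaries are meant to close.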

Other Enterprise sources

For other enterprise systems such as Salesforce, ServiceNow, Zendesk, and Azure SQL Server, a real-time RAG approach is used. In this approach, data from these enterprise sources is never ingested or indexed. Instead, when sources are selected, Copilot Studio indexes only metadata, such as table and column names, and uses it to generate a real-time query. In this model, customers also do not have to add another layer of security within Copilot Studio. Connectivity is established using low-code Power Platform connectors. When end users use the agent, they are prompted to sign in to the external system, which ensures that they have the appropriate permissions and establishes a secure link to the knowledge needed, in real time, to provide answers.
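The metadata-only idea can be sketched as follows: the query builder knows table and column names but holds no row data, and rows are read from the source only when the query runs. An in-memory SQLite database stands in for the external system, and the `tickets` table, its columns, and `build_query` are hypothetical. Real connectors and the queries Copilot Studio generates will differ.

```python
import sqlite3

# Metadata only: table and column names (hypothetical), no row data.
METADATA = {"tickets": ["id", "status", "owner"]}


def build_query(table, columns, filter_column, filter_value):
    """Build a parameterized SELECT using only names found in the metadata."""
    known = METADATA.get(table)
    if known is None:
        raise ValueError(f"unknown table: {table}")
    for name in [*columns, filter_column]:
        if name not in known:
            raise ValueError(f"unknown column: {name}")
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {filter_column} = ?"
    return sql, (filter_value,)


# Stand-in for the external system; the data lives only here.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE tickets (id INTEGER, status TEXT, owner TEXT)")
source.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(1, "open", "ana"), (2, "closed", "ben")],
)

sql, params = build_query("tickets", ["id", "owner"], "status", "open")
rows_before = source.execute(sql, params).fetchall()

# A change made in the source is visible to the very next query.
source.execute("UPDATE tickets SET status = 'open' WHERE id = 2")
rows_after = source.execute(sql, params).fetchall()

print(sql)
print(rows_before, rows_after)
```

Validating identifiers against the metadata and passing values as parameters keeps the generated query restricted to known tables and columns.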

Security

When developing agents that are grounded in diverse knowledge sources, data security becomes a shared responsibility between makers and administrators.

When using Copilot Studio to build agents, makers have the tooling available to implement robust security measures and adhere to best practices, creating intelligent solutions that are both powerful and secure while meeting the highest compliance standards. Adopting secure-by-default practices is essential; makers should ensure that agents require authentication so that access is restricted to authorized users. Additionally, minimizing data exposure by connecting agents only to the necessary knowledge sources helps reduce risks and limits the overall attack surface. Implementing role-based access controls ensures that users receive only the permissions they require, preventing inadvertent overprivileged access to sensitive data. Continuous monitoring of access and usage aids in the prompt detection of any unusual or unauthorized activities, and regular security audits are vital to gain insight into discrepancies and anomalies that could indicate business concerns or suspected security violations. Before deploying to a production environment, makers must conduct comprehensive compliance checks to ensure alignment with organizational policies and regulatory standards.

Managed Security

Administrators play an equally crucial role in governing knowledge sources by leveraging managed security capabilities. These controls help protect data from threats, regulate access, prevent data exfiltration, and support custom encryption policies that reflect the organization’s requirements. By applying tailored data policies to systems such as SharePoint, public websites, or document repositories, administrators can ensure that connectors and API endpoints are used securely by developers building agents. Monitoring and analytics in the Power Platform admin center provide detailed insights into how resources such as Dataverse are accessed and utilized. Features like IP firewalls, IP cookie binding, and managed identities for Dataverse plug-ins further help ensure that only authorized users can access critical organizational resources. Moreover, robust compliance measures, including Customer Lockbox and comprehensive auditing support for activities performed by makers, users, and administrators, give you insight into organizational data events and streamline compliance with industry regulations.

We are continuously raising the bar on security and governance around data and knowledge that is being used by agents. To learn more about these features, see IT Governance Controls for Your Copilot agents – Microsoft Power Platform Blog 

Conclusion

Copilot Studio provides a robust and secure platform for building custom agents, using various RAG techniques for your enterprise systems. If you have not yet created an agent using Microsoft Copilot Studio, why wait? Start building your agents using Copilot Studio today, and experience how your enterprise data can be securely converted to knowledge for agents with a seamless, no-code experience. Your data security is our top priority, allowing you to focus on creating exceptional, intelligent agents that meet your enterprise needs.

Learn more about Microsoft Copilot Studio + Microsoft Dataverse: