Understand Microsoft Copilot security concerns
Microsoft Copilot raises security concerns around unauthorized or unintentional data access. Prevent leaks with vigilant oversight and comprehensive user access reviews.
Microsoft Copilot can improve end-user productivity, but it also has the potential to create security and data privacy issues.
Copilot streamlines workflows in Microsoft 365 applications. By accessing company data, it can automate repetitive tasks, generate new content and ideas, summarize reports and improve communication.
Productivity benefits depend on the data Copilot can access. But security and data privacy issues can arise if Copilot uses data that it shouldn't have access to. Understanding and mitigating various Copilot security concerns requires a high-level understanding of how Copilot for Microsoft 365 works.
How Copilot accesses company data
Like other AI chatbots, such as ChatGPT, Copilot responds to user prompts. Users enter prompts within Microsoft Office applications, such as Microsoft Word or Excel, or within the Microsoft 365 web portal.
When a user enters a request into the prompt, Copilot uses a technique called grounding to improve the quality of the response it generates. The grounding process expands the user's prompt -- though this expansion is not visible to the end user -- using Microsoft Graph and the Microsoft Semantic Index. These components rewrite the user's prompt to include keywords and data references that are most likely to generate the best results.
After modifying the prompt, Copilot sends it to a large language model (LLM). The LLM uses natural language processing to interpret the modified prompt and enables Copilot to converse with the user in written natural language.
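To make the idea concrete, here is a minimal Python sketch of prompt grounding. It is purely illustrative: the toy corpus, the keyword matching and the ground_prompt function are stand-ins for the retrieval that Microsoft Graph and the Semantic Index actually perform, not Microsoft's implementation.

```python
# Illustrative sketch only -- not Microsoft's implementation. The toy corpus
# and keyword matching stand in for Microsoft Graph and Semantic Index retrieval.
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    snippet: str

# A tiny stand-in for the tenant data that Graph and the Semantic Index would search.
CORPUS = [
    Document("Q3 sales report", "Q3 revenue grew 12% quarter over quarter."),
    Document("Launch checklist", "Tasks remaining before the October product launch."),
]

def ground_prompt(user_prompt: str) -> str:
    """Expand a raw prompt with related organizational context before it reaches the LLM."""
    words = {w.lower() for w in user_prompt.split()}
    hits = [d for d in CORPUS if words & set(d.snippet.lower().split())]
    context = "\n".join(f"- {d.title}: {d.snippet}" for d in hits)
    # The grounded prompt, not the raw prompt alone, is what gets submitted to the LLM.
    return f"{user_prompt}\n\nRelevant organizational context:\n{context}"

print(ground_prompt("Summarize Q3 revenue growth"))
```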
The LLM formulates a response to the end user's prompt based on the available data. That data can include internet data if organization policies allow Copilot to use it, but the response usually pulls from Microsoft 365 data. For example, a user can ask Copilot to summarize the document they currently have open, and the LLM can formulate a response based on that document. If the user asks a more complex question that is not specific to one document, Copilot will likely pull data from multiple documents.
Copilot respects any data access controls the organization currently has in place. If a user does not have access to a particular document, Copilot should not reference that document when formulating a response.
Before the LLM sends a response to the user, Copilot performs post-processing checks to review security, privacy and compliance. Depending on the outcome, Copilot either displays the response to the user or regenerates it. The response is displayed only when it adheres to security, privacy and compliance requirements.
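The following Python sketch, again purely conceptual, illustrates the two safeguards just described: retrieval that only considers documents the requesting user can already read, and a post-processing check that withholds a draft response that fails a compliance test. The allowed_users sets and the compliance rule are invented for the example; Copilot's real enforcement happens inside the Microsoft 365 service.

```python
# Conceptual sketch only -- Copilot's real enforcement happens inside the
# Microsoft 365 service. The allowed_users sets and the compliance rule
# below are invented for illustration.
from dataclasses import dataclass

@dataclass
class Document:
    title: str
    text: str
    allowed_users: set  # users who already have access to the document

CORPUS = [
    Document("Q3 sales report", "Q3 revenue grew 12%.", {"alice", "bob"}),
    Document("M&A plan", "Confidential acquisition shortlist.", {"dana"}),
]

def retrieve(user: str) -> list:
    """Only documents the requesting user can already read become candidates."""
    return [d for d in CORPUS if user in d.allowed_users]

def passes_post_processing(response: str) -> bool:
    """Toy compliance check: block drafts that echo confidential markers."""
    return "confidential" not in response.lower()

def answer(user: str, prompt: str) -> str:
    sources = retrieve(user)
    draft = f"Answer to '{prompt}' based on: " + "; ".join(d.text for d in sources)
    # The real service would regenerate a failing draft rather than simply block it.
    return draft if passes_post_processing(draft) else "[response withheld and regenerated]"

print(answer("alice", "Summarize our sales performance"))  # draws only on Alice's documents
print(answer("dana", "What are we acquiring?"))            # toy check withholds the draft
```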
How Copilot threatens data privacy and security
Copilot can create data security or privacy concerns despite current safeguards.
The first potential issue is users having access to data that they shouldn't. The problem tends to be more common in larger organizations. As a user gets promoted or switches departments, they might retain previous access permissions that they no longer need.
It's possible that a user might not even realize they still have access to the data associated with their former role, but Copilot will. Copilot uses any data that is available to it, even if it's a resource that the user should not have access to.
A second concern is Copilot referencing data that users can legitimately access but that it shouldn't use. For example, an organization might not want Copilot to formulate responses based on documents containing confidential or sensitive information, such as plans for mergers or acquisitions that have not been made public or data pertaining to future product launches.
An organization's data stays within its own Microsoft 365 tenant. Microsoft does not use an organization's data for the purpose of training Copilot. Even so, it's best to prevent Copilot from accessing the most sensitive data.
Even if a user has legitimate access to sensitive data, letting that user work with it through Copilot can still be harmful. Users who create and share Copilot-generated documents might not take the time to review them and could accidentally leak sensitive data.
Mitigate the security risks
Before adopting Copilot, organizations should conduct a thorough access control review to determine who has access to what data. Security best practices stipulate that organizations should practice least user access (LUA). Normally, organizations adopt LUA in response to compliance requirements or as a way to limit the damage of a potential ransomware infection -- ransomware cannot encrypt anything that the user who triggered the infection does not have access to. In the case of a Copilot deployment, adopting LUA principles is the best way to ensure Copilot does not expose end users to any data that they should not have access to.
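As a starting point for such a review, the following Python sketch pages through a tenant's users and prints each user's group memberships via the Microsoft Graph REST API, so reviewers can spot access left over from previous roles. It assumes an Azure AD app registration with application permissions such as User.Read.All and Directory.Read.All, and an access token already acquired and exported as the GRAPH_TOKEN environment variable; adapt it to your own authentication flow and reporting format.

```python
# Minimal access review sketch: list every user's group memberships so a
# reviewer can flag stale access before rolling out Copilot. Assumes a Graph
# access token is available in the GRAPH_TOKEN environment variable.
import os
import requests

GRAPH = "https://graph.microsoft.com/v1.0"
HEADERS = {"Authorization": f"Bearer {os.environ['GRAPH_TOKEN']}"}

def list_users():
    """Page through all users in the tenant."""
    url = f"{GRAPH}/users?$select=id,displayName,department"
    while url:
        page = requests.get(url, headers=HEADERS).json()
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")

def group_memberships(user_id: str) -> list:
    """Return the display names of the groups and roles the user belongs to."""
    resp = requests.get(f"{GRAPH}/users/{user_id}/memberOf", headers=HEADERS).json()
    return [g.get("displayName", "") for g in resp.get("value", [])]

# Dump each user's memberships for manual review.
for user in list_users():
    groups = group_memberships(user["id"])
    print(f"{user['displayName']} ({user.get('department')}): {', '.join(groups)}")
```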
Restricting Copilot from accessing sensitive data can be a tricky process. Microsoft recommends applying sensitivity labels through Microsoft Purview. Configure the sensitivity labels to encrypt sensitive data and ensure users do not receive the Copy and Extract Content (EXTRACT) permission. Withholding EXTRACT prevents users from copying sensitive documents and blocks Copilot from referencing them.
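The snippet below is a conceptual illustration of that effect, not something admins would write -- Purview and the Microsoft 365 service enforce label rights. It simply shows that a user can hold VIEW rights on a labeled document while lacking EXTRACT, and that without EXTRACT the document is off limits as a Copilot source.

```python
# Conceptual illustration only -- label rights are enforced by Microsoft Purview
# and the Microsoft 365 service, not by customer code.
from dataclasses import dataclass

@dataclass
class LabeledDocument:
    title: str
    usage_rights: set  # rights the sensitivity label grants to the current user

def copilot_can_reference(doc: LabeledDocument) -> bool:
    """Without the EXTRACT right, the document cannot be used as a Copilot source."""
    return "EXTRACT" in doc.usage_rights

docs = [
    LabeledDocument("Public roadmap.docx", {"VIEW", "EXTRACT"}),
    LabeledDocument("Acquisition plan.docx", {"VIEW"}),  # readable, but no EXTRACT
]

for d in docs:
    status = "can" if copilot_can_reference(d) else "cannot"
    print(f"Copilot {status} reference {d.title}")
```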
Brien Posey is a 22-time Microsoft MVP and a commercial astronaut candidate. In his more than 30 years in IT, he has served as a lead network engineer for the U.S. Department of Defense and a network administrator for some of the largest insurance companies in America.