Search for sensitive content in SharePoint and OneDrive documents

Wesley Holley is a program manager on the Office 365 team and Shobhit Sahay is the technical product manager on the Office 365 team.

Responsible organizations today use a variety of controls and policies to keep their data safe and secure. These controls become even more crucial if the data involved is sensitive information, which can range from industry-wide data (such as credit card numbers, Social Security numbers, or customer information) to proprietary information (such as patents or confidential documents). Protecting this sensitive data is important because it enables organizations to comply with industry, government, and other regulations.

Office 365 already provides these capabilities for email through data loss prevention (DLP) in Exchange, Outlook, and OWA, along with a series of built-in sensitive information types that you can use for your searches. We’re pleased to announce that we are taking our first steps toward DLP in SharePoint and OneDrive, allowing you to use the same sensitive information types to search documents and sites across your organization.

With this new capability you’ll be able to:

  • Search for sensitive content across SharePoint Online and OneDrive for Business.
  • Leverage 51 built-in sensitive information types (credit cards, passport numbers, Social Security numbers, and more).
  • Identify offending documents, export a report, and adjust accordingly.

Let’s take a look at how this new capability can help you.

Search for sensitive content across SharePoint Online and OneDrive for Business

DLP for SharePoint Online and OneDrive for Business is now built into your existing Enterprise Search. It allows you to search for sensitive content in your existing eDiscovery Center, keeping content in place and enabling you to search in real time.

Query search results display sensitive content in SharePoint Online and OneDrive for Business from the eDiscovery Center in SharePoint Online.

Your compliance officer can enter simple or complex queries and program Search to crawl a variety of sources, including team sites and users’ OneDrive for Business folders. Once the query is run, the results appear under the SharePoint tab, where you can review them in place. You can simply adjust the query, adding indexed properties such as “author” or “date” to fine-tune your results. It is important to note that permission to use the eDiscovery Center is role protected to ensure that the right people—not everyone in your organization—have access to run these queries and review sensitive content.

Leverage 51 built-in sensitive information types

Office 365 provides a wide range of sensitive information types from different industry segments and geographies, such as credit card numbers, Social Security numbers (SSNs), bank account numbers, and other types, many of which you may already be using to search for sensitive content in email. These sensitive information types are detected based on pattern matching and are easy to set up. You will now be able to extend these same sensitive information types to search across SharePoint Online and OneDrive for Business by creating simple queries, as shown below.

To search for U.S. Social Security numbers:
SensitiveType="U.S. Social Security Number (SSN)"

To search for U.S. Social Security numbers or Spain National IDs:
SensitiveType="U.S. Social Security Number (SSN)" OR SensitiveType="Spain National ID"

To search for five or more U.S. Social Security numbers:
SensitiveType="U.S. Social Security Number (SSN)|5.."
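
If you assemble these queries in a script before pasting them into the eDiscovery Center, the three example queries above can be generated with a few lines of Python. This is a minimal sketch, not an official tool: the helper names sensitive_type_clause and any_of are hypothetical, the use of straight double quotes is an assumption, and the only syntax used is what the examples above show (the SensitiveType= form, OR, and the "|N.." suffix for "N or more").

```python
from typing import Optional


def sensitive_type_clause(name: str, min_count: Optional[int] = None) -> str:
    """Build one SensitiveType clause in the form used by the example queries.

    name      -- display name of a built-in sensitive information type
    min_count -- if given, appends "|N.." to mean "N or more" matches
    (Hypothetical helper; it only formats a string and calls no service.)
    """
    if '"' in name:
        raise ValueError("type name must not contain a double quote")
    if min_count is None:
        return f'SensitiveType="{name}"'
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    return f'SensitiveType="{name}|{min_count}.."'


def any_of(*clauses: str) -> str:
    """Join clauses with OR, as in the second example query."""
    if not clauses:
        raise ValueError("at least one clause is required")
    return " OR ".join(clauses)


if __name__ == "__main__":
    ssn = "U.S. Social Security Number (SSN)"
    print(sensitive_type_clause(ssn))
    print(any_of(sensitive_type_clause(ssn),
                 sensitive_type_clause("Spain National ID")))
    print(sensitive_type_clause(ssn, min_count=5))
```

Running the script prints the three query strings, which you can then paste into the query box. Building the strings in one place helps keep type names spelled exactly as the built-in list spells them.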

Learn more about all 51 built-in sensitive information types.

You can use one or more of the 51 built-in sensitive information types. Here you see a query using both the Credit Card Number and U.S. Social Security Number (SSN) sensitive information types. 

Identify offending documents, export a report, and adjust accordingly

You will now be able to review possible offending documents inline, in real time, right from the eDiscovery Center. You will also be able to export the list of documents for further review and then, based on your review of the results, take manual actions such as adjusting sharing permissions, removing data on shared sites, and more.

Exporting the results is easy: you can download a copy of the files yourself, along with a report of the query results. You can also save the query and then turn your attention to investigating the results. Once you have saved the query, you can inspect the documents, check for false positives, and further hone or expand the query if needed. The report lists the location of each result, and if you download a copy of the files, the download reproduces the original file structure from SharePoint, so all those paths are preserved.
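
When you check downloaded files for false positives, a small script can help with triage. The sketch below is one possible approach for credit card numbers, under stated assumptions: it is not the detection logic Office 365 uses, the CANDIDATE pattern and the function names are hypothetical, and it relies only on the Luhn checksum that payment card numbers are designed to satisfy. Digit runs that fail the checksum are good candidates for false positives.

```python
import re
from typing import List

# Hypothetical candidate pattern: 13 to 16 digits, optionally separated
# by single spaces or hyphens. Real card formats vary; adjust as needed.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def likely_card_numbers(text: str) -> List[str]:
    """Return candidate digit runs in text that pass the Luhn check."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    sample = "Order ref 1234 5678 9012 3456, test card 4111 1111 1111 1111."
    print(likely_card_numbers(sample))
```

In the sample text, only the well-known test number 4111 1111 1111 1111 passes the checksum; the order reference does not. A passing checksum does not prove a number is a real card, so treat the output as a shortlist for manual review.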

Hovering on a result displays more about the discovered, in-place document(s), which you can then open for further review, or you can export the entire result set.

To sum it all up

Searching for sensitive content in SharePoint and OneDrive is now available worldwide for your use in your Office 365 environment. With this new capability, you can be better informed about what and where sensitive documents exist in SharePoint Online and OneDrive for Business. And having this information will help you work better with content owners to ensure protection of sensitive data.

Later this year we’ll introduce additional capabilities that go beyond simply discovering and reviewing sensitive content. These future capabilities will allow you to create policies that automatically detect sensitive content and apply protection based on your organization’s needs, for example by deleting content or quarantining it until it has been reviewed. You’ll also see us unify the entire compliance experience for you across Office 365, providing you with consolidated, built-in security controls at your fingertips. You can find out more about these future investments by watching Overview of Compliance in SharePoint Online and Office 365, the SharePoint Conference 2014 session in which we shared them.

To learn more, please read the article “Use DLP in SharePoint Online to identify sensitive data stored on sites” on TechNet.

Thanks,

Wesley and Shobhit

Frequently Asked Questions

Q. I’m not seeing any sensitive content for eDiscovery queries. What can I do to change this?

A. Documents need to be scanned before they will show up in query results. Our scan process is built into the SharePoint Online Search crawler, so as you upload or edit documents, they will be scanned, or rescanned if a version of the document already exists. This article provides a comprehensive outline of content discovery and search in SharePoint. Many of the topics in the article may be of interest to you, but the sections about crawls and managing the search schema may be of particular use for this topic.

Q. Why do my DLP queries have no results in Exchange?

A. We are prioritizing bringing the ability to perform DLP queries on data in SharePoint Online first, based on customer feedback, and will be considering Exchange for future updates.

Q. What are the different file types that can be detected with this capability?

A. You can find a list of the file types that we support in SharePoint here.

Q. Will this scan the OneDrive for Business documents as well?

A. Yes, the eDiscovery Center will get search results from Site Collections and any OneDrive where you have granted it access. This is covered in more depth in our documentation.

Q. Are these features available for on-premises SharePoint releases?

A. While today we are bringing this capability to our SharePoint Online customers, based on customer feedback we will be considering this capability for future SharePoint on-premises releases as well.

Q. What are the future plans for DLP in SharePoint?

A. DLP in SharePoint is an ongoing investment, and with this new capability we are laying the groundwork for more improvements. Later this year we will introduce additional capabilities that go beyond simply discovering and reviewing sensitive content. These new capabilities will allow you to create policies that automatically detect sensitive content and apply protection based on your organization’s needs.

Q. I would like to participate in the early adopters plan for DLP in SharePoint, especially for the features coming up later this year. What can I do to participate?

A. The best way to try the new features in Office 365 is via the Office 365 First Release program. If you would like to get access earlier, please work with your account team to get an even earlier preview and provide your valuable feedback!