More and more organizations are recognizing the power and value of data classification. By accurately classifying and labeling the information you store, you can:
- Strengthen security by identifying sensitive data across the enterprise and improving your controls around it.
- Ensure compliance with regulations such as CCPA and GDPR, and enable quick compliance with new regulations as they are added.
- Enforce privacy, retention and confidentiality policies.
- Reduce storage and backup expenses by spotting and removing duplicate data.
- Enhance business processes by making searches faster and more effective.
- Improve database architecture and security platform functionality.
If your organization is like most, you now rely on cloud platforms like SharePoint Online, OneDrive and Exchange Online, and you need to know exactly what types of data are being stored there so you can ensure sensitive content is properly protected. To help, this article explains how data classification works in Microsoft 365 and how to overcome its key limitations.
How Data Classification Works: Overview
The Microsoft 365 data classification process involves the following core steps:
- Creating and publishing labels: Admins create sensitivity labels and configure their settings. They publish the labels internally, along with a policy that details how they should be used.
- Applying labels to content: As business users work with email or documents, they can apply labels provided by administrators to classify the content. Admins can also specify conditions under which a particular label will be assigned to content automatically.
- Label use by M365: Microsoft 365 enforces protection settings on email messages and documents based on the labels applied to them. Specifically, the following Microsoft 365 workloads recognize sensitivity labels:
  - Office apps
  - Microsoft 365 Groups
  - SharePoint, OneDrive and Microsoft Teams
- Label use by other systems: Internal processes and third-party solutions can also use the labels. For example, your data loss prevention tool can warn or block users who try to share content with a specific sensitivity label, and your data retention policy can use the labels to determine when files should be deleted.
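To make the last step more concrete, here is a minimal Python sketch of how a downstream tool might turn a label into a sharing decision. It is a toy model, not Microsoft's or any vendor's implementation: the `Document` class, the `sharing_decision` function, the priority values and the warn/block rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical label names and priorities, for illustration only.
LABEL_PRIORITY = {"Public": 0, "Internal": 1, "Confidential": 2}


@dataclass
class Document:
    name: str
    label: Optional[str] = None  # None means the content is unlabeled


def sharing_decision(doc: Document, external: bool) -> str:
    """Return 'allow', 'warn' or 'block' for a sharing attempt.

    Assumed policy: internal sharing is always allowed; external sharing
    is blocked for Confidential content and triggers a warning for
    Internal or unlabeled content.
    """
    if not external:
        return "allow"
    if doc.label is None:
        return "warn"
    level = LABEL_PRIORITY[doc.label]
    if level >= LABEL_PRIORITY["Confidential"]:
        return "block"
    if level >= LABEL_PRIORITY["Internal"]:
        return "warn"
    return "allow"
```

The point of the sketch is that the label, not the file's location or name, drives the decision, so the same rule keeps working when the file is moved.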
Creating and Publishing Microsoft 365 Sensitivity Labels
Classification labels in Microsoft 365 are essentially customizable stamps attached to documents and emails in the Microsoft cloud. They are stored in the file’s metadata, so a label applied to content created in a Microsoft Office application, for example, remains intact even if the file is moved.
To create a label, open the Compliance Center, go to Classification > Sensitivity Labels and take the steps below. For more complete details, see this Microsoft page.
- Click Create a Label and accept the warning.
- Enter a name, tooltip and description for the label. Click Next.
- Configure the desired protection settings, expiration details, permissions, offline access options, and other options. For example, you can add encryption or watermarks to certain content, allow only certain users to open confidential documents, and prevent certain content from being forwarded via email.
- If you choose to turn on auto labeling, specify the conditions under which content will automatically be assigned the label you are creating. For example, you might add the label if the content includes a passport number or Social Security number. See the discussion below for more details on automatic and manual labeling.
- Review your settings and click Create.
- Repeat the previous steps for each additional label you choose to create for your organization.
- If desired, assign policies to your label. For example, you might want to ensure that specific sensitive information types (such as credit card numbers) are always labeled as Confidential. By default, labels appear in Microsoft 365 in this order: Confidential, Internal, and Public. You can change that order.
- If desired, add a sub-label to your label.
- Navigate to Publish Labels.
- Select the labels you want to publish and click Add > Done > Next.
- Review your settings, then select Publish.
- Create level priorities and sub-label groups as needed.
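The auto-labeling conditions mentioned in the steps above can be pictured with a short Python sketch. It is only an approximation of the idea: the regular expressions, the Luhn checksum and the `auto_label` function are assumptions chosen for illustration and do not reproduce how Microsoft 365 defines its sensitive information types.

```python
import re
from typing import Optional

# Simplified patterns, assumed for this sketch only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Check a candidate card number with the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def auto_label(text: str) -> Optional[str]:
    """Return 'Confidential' if the text meets a condition, else None."""
    if SSN_PATTERN.search(text):
        return "Confidential"
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            return "Confidential"
    return None
```

In this sketch, a digit sequence that merely looks like a card number is ignored unless it also passes the checksum, which is one simple way to reduce false matches.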
To use sensitivity labels in SharePoint and OneDrive, you need to opt in. You can do that using either the Compliance Center or PowerShell.
Microsoft 365 Compliance Center
Take the following steps:
- Sign in using an account with global administrator permissions.
- Navigate to Solutions > Information Protection. (If you don’t immediately see this option, first select Show all.)
- If you see a message to turn on the ability to process content in Office online files, select Turn on now. The data classification process will start immediately.
PowerShell
Although PowerShell is not usually the preferred way to enable sensitivity labels, sometimes it is the only option. For example, if you use Microsoft 365 Multi-Geo and want to enable sensitivity labels, you need to use PowerShell.
Here are the basic steps to take. For the full set of steps that may be required, see this Microsoft page.
- Connect to SharePoint using a work or school account that has global administrator or SharePoint admin privileges in Microsoft 365.
- Run the following command:
Set-SPOTenant -EnableAIPIntegration $true
Applying Labels to Content
As noted earlier, users can apply labels to content manually, and you can also set up conditions under which a label is applied automatically.
From the document settings menu, users can manually apply labels to individual files or to an entire document library or email folder.
Automatic labeling works in two ways:
- Client-side labeling: When users work with documents or emails in Word, Excel, PowerPoint or Outlook, the system will examine the content and recommend labels, which the user can accept or reject.
- Service-side labeling: When content is already saved in SharePoint or OneDrive or sent or received by Exchange, the labels are applied automatically, since the user is not interacting with the content.
For complete details, see this Microsoft page.
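The difference between a recommended label and an automatically applied one can be shown in a few lines of Python. This is a hedged sketch rather than a description of Microsoft 365 internals: the `Item` class, the `suggested_label` rule and the choice to leave already-labeled items untouched are all assumptions of the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Simplified Social Security number pattern, assumed for this sketch.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class Item:
    text: str
    label: Optional[str] = None


def suggested_label(text: str) -> Optional[str]:
    """Hypothetical condition: SSN-like content suggests 'Confidential'."""
    return "Confidential" if SSN_PATTERN.search(text) else None


def client_side(item: Item, user_accepts: Callable[[str], bool]) -> Item:
    """Recommend a label; apply it only if the user accepts."""
    suggestion = suggested_label(item.text)
    if suggestion and item.label is None and user_accepts(suggestion):
        item.label = suggestion
    return item


def service_side(item: Item) -> Item:
    """Apply the label directly, since no user is in the loop."""
    suggestion = suggested_label(item.text)
    if suggestion and item.label is None:
        item.label = suggestion
    return item
```

Here `user_accepts` stands in for the prompt a user would see in an Office app. On the service side there is no such callback, so the outcome depends entirely on the quality of the condition.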
Challenges with Microsoft 365 Data Classification
Document and email classification in Microsoft 365 adds useful structure and helps users make sense of data on their networks. However, it comes with several notable challenges.
Manual Labeling Issues
- Risk of errors — In a perfect world, users always adhere to the organization’s manual labeling process requirements. However, in the real world, organizations can’t assume employees will always assign labels correctly; human error will always be part of the cost of doing business.
- Risk of failing to label legacy content — Manual labeling is a cumbersome and time-consuming process, so the huge volume of content already stored in your systems will likely never get classified.
Automatic Labeling Issues
- Lack of precision — Classifying content based on keywords often yields poor results. Organizations report both false positives (documents that get a label they do not merit) and false negatives (files that fail to get a label they should be assigned).
- Limited applicability — Automatic labeling in Microsoft 365 has specific prerequisites and isn’t supported in older versions of Microsoft products. Also, this functionality is available only for specific subscription plans, such as Microsoft 365 E5 and Azure Information Protection Premium P2. Moreover, you can apply labels to Office files only.
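The precision problem is easy to reproduce with a toy keyword classifier. The Python sketch below uses invented sample documents and an invented keyword list, so the numbers say nothing about any real product; it only shows how false positives and false negatives arise and how they can be measured.

```python
from typing import Dict, List, Tuple

# Hypothetical keywords that should mark a document as sensitive.
KEYWORDS = ("confidential", "salary")

# (text, is_actually_sensitive) pairs, invented for illustration.
SAMPLES: List[Tuple[str, bool]] = [
    ("Salary bands for 2024 by grade", True),
    ("This notice is not confidential; share freely", False),
    ("Compensation details for the executive team", True),
    ("Lunch menu for Friday", False),
]


def keyword_classifier(text: str) -> bool:
    """Flag the text as sensitive if any keyword appears in it."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def evaluate(samples: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Count outcomes and derive precision and recall."""
    tp = fp = fn = tn = 0
    for text, actual in samples:
        predicted = keyword_classifier(text)
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1  # label assigned but not merited
        elif not predicted and actual:
            fn += 1  # label merited but not assigned
        else:
            tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}
```

On these four samples, the second document is a false positive (it contains the word "confidential" but is not sensitive) and the third is a false negative (it is sensitive but uses none of the keywords), so `evaluate(SAMPLES)` reports a precision and recall of 0.5 each.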
An Alternative to Microsoft 365 Data Classification
To reduce the effort and cost involved in classifying your data and get more accurate results, consider investing in a third-party solution. For example, Netwrix Data Classification delivers:
- More accurate classification results, thanks to advanced techniques, such as compound term search and stemming
- Support for both structured and unstructured data
- Support for both on-premises and cloud data repositories
- Predefined taxonomies aligned with specific regulations and data types
- Custom taxonomies to handle company-specific content
- Support for a variety of file types, not just Office documents
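To give a rough sense of what stemming and compound term search mean, here is a small Python sketch. It is not Netwrix's implementation: the suffix rules are a deliberately crude stand-in for a real stemmer, and the `stem`, `stems` and `matches_compound` functions are hypothetical names used only for this example.

```python
import re
from typing import List

# Crude suffix rules, assumed for this sketch; real stemmers are far richer.
SUFFIX_RULES = (("ies", "y"), ("ing", ""), ("ed", ""), ("es", ""), ("s", ""))


def stem(word: str) -> str:
    """Strip one common suffix, keeping at least three leading letters."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + replacement
    return word


def stems(text: str) -> List[str]:
    """Lowercase the text, split it into words and stem each word."""
    return [stem(word) for word in re.findall(r"[a-z]+", text.lower())]


def matches_compound(text: str, phrase: str) -> bool:
    """True if the stemmed phrase occurs as consecutive stems in the text."""
    haystack, needle = stems(text), stems(phrase)
    if not needle:
        return False
    width = len(needle)
    return any(haystack[i:i + width] == needle
               for i in range(len(haystack) - width + 1))
```

With these rules, the phrase "salary review" matches the text "All salaries reviewed in March", which a plain substring search would miss, but it does not match "Review the salary table", where the same words appear in a different order.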