Data classification is a vital component of any information security and compliance program, especially if your organization stores large volumes of information. It’s impossible to maintain proper control if you don’t know what information you have and where it resides, and you can’t ensure the highest level of protection for your most critical assets if you don’t classify data according to its level of sensitivity and value.
The first step is to develop a data classification policy to define sensitive data and establish rules for its protection. Although this document is the basis for ensuring that sensitive data is handled appropriately, many data classification policies fall short for various reasons, such as the following:
- The policy uses complex language that is difficult for employees to understand and follow, leaving them with more questions than answers.
- It is not supported by training and does not fit into the organization’s workflows.
- Its goals are too ambitious and difficult to achieve.
- It fails to outline the responsibilities of all parties.
- It does not convince employees of the importance of data classification.
- It is written once and never reviewed or refined. As organizations evolve and business needs change, a data classification policy can become irrelevant within a few years.
In this blog post, I explain what a data classification policy is and the steps you need to take to create and implement one. Along the way, I share best practices for ensuring your policy will actually help you better understand your sensitive data, allocate responsibilities properly and make better decisions about information security.
What is a data classification policy?
Generally, a data classification policy is a document that includes a classification framework, a list of responsibilities for identifying sensitive data and descriptions of the various data classification levels.
Clark University specifies that a data classification standard helps organizations secure information from risks like unauthorized disclosure and access.
Note that the information classification policy should not include requirements for how the data must be handled. Rather, you should develop a separate document that defines the requirements for protecting each class of information. (You might also have specific information-handling documents if certain departments in your organization have unique needs.)
What does a good data classification policy look like?
Policies may differ in their aims and overall structure, but a good data classification standard will meet the following criteria:
- The classification criteria have to be straightforward to avoid ambiguity, but generic enough to apply to different assets in various contexts.
- It should be clear and written in simple language.
- It should fit in with the organization’s business.
- It should be just a few pages long and have no more than three or four classification levels.
- It should name a point of contact for any edge cases and unclear situations your employees may face. Depending on the organization’s structure, this can be a Security and Risk Manager, a Data Protection Officer, a Compliance Committee or any other relevant person.
- Finally, a good policy should contain a review schedule. Usually an annual review is enough, unless there are any external events, like new regulations coming into effect.
What are the steps to build an effective information classification policy?
To build a viable policy that provides a strong foundation for your data classification project, you need to take these seven steps:
Step #1. Seek help from C-level executives. Before you start building a data classification policy, seek the support of someone at the C-level or on the executive board who understands the importance of classification and the risks associated with data. You can then rely on this support to work closely with business stakeholders on the next steps.
Step #2. Define the purpose of sensitive data classification. You need to articulate why you need a data classification policy. Depending on your organization’s structure, business processes and other factors, your goal for creating a data classification policy might be one or more of the following:
- To map data protection levels to the organization’s needs, budgets and resource constraints.
- To mitigate the risks associated with unauthorized disclosure and access.
- To comply with standards and regulations that require you to classify information (e.g., ISO 27001), retrieve specific information in a set timeframe (e.g., GDPR), or store information only in specific locations with limited access (e.g., PCI DSS).
Step #3. Define the policy scope. Before you classify data by sensitivity, define the scope of your policy based on the information resources in your possession. Data can take different forms and be stored on various types of media: electronic documents, databases and other information systems, paper documents, storage media like USB drives and memory cards, emails, and much more.
Step #4. Outline responsibilities. Determine who will be responsible for maintaining proper data classification protocols for each piece of data. The policy might briefly describe the roles of data owners, data stewards and data users, as well as employees/staff, management, information owners, the legal department, records management and the compliance department.
Step #5. Define levels of data sensitivity. Specify the levels of data classification for all of your data sources, providing definitions and examples for each level. There is no strict standard for a data classification table; you need to develop your own, in accordance with the complexity of your IT environment, industry requirements and other factors. However, it is highly recommended to keep the number of data classification levels to a minimum (no more than four), because a more complex scheme is extremely difficult to put into practice. Generally, these levels are similar to the following:
- Level 1: Highly sensitive corporate or customer data
- Level 2: Sensitive internal data
- Level 3: Internal data that is not meant for public disclosure
- Level 4: Data that can be disclosed to the public
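One way to make the four levels above usable in scripts and tools is to encode them as an enumeration. The following is a minimal sketch, not a standard: the names `SensitivityLevel` and `most_restrictive` are hypothetical, and the rule that mixed content takes the strictest label is an assumption your own policy would need to confirm.

```python
from enum import IntEnum


class SensitivityLevel(IntEnum):
    """Four-level scheme from the list above; 1 is the most sensitive."""
    HIGHLY_SENSITIVE = 1  # highly sensitive corporate or customer data
    SENSITIVE = 2         # sensitive internal data
    INTERNAL = 3          # internal data not meant for public disclosure
    PUBLIC = 4            # data that can be disclosed to the public


def most_restrictive(levels):
    """Return the most sensitive of several levels (lowest number wins).

    Assumption: an asset that combines data of different levels
    inherits the strictest one.
    """
    return min(levels)


# A report that mixes public figures with customer records.
report_level = most_restrictive(
    [SensitivityLevel.PUBLIC, SensitivityLevel.HIGHLY_SENSITIVE]
)
print(report_level.name)
```

Using integer values that match the level numbers keeps the code aligned with the wording of the policy, so employees and tools refer to the same "Level 1" through "Level 4".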
Step #6. Develop handling guidelines to ensure data security. Once you’ve nailed down your data classification levels, the next step is to develop a set of activities and rules that define how to protect each type of asset depending on its level of confidentiality. As noted earlier, this data handling table should be separate from your data classification policy; that way you minimize the complexity of the approval process every time you need to produce a new version of the handling guidelines because there’s a change in classification guidance or protection requirements.
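To illustrate why keeping the handling table separate is convenient, here is a small sketch in which the handling rules live in their own lookup structure, keyed by level number. Every rule shown (encryption, sharing, access groups) is a made-up example rather than a recommendation, and the names `HandlingRule`, `HANDLING_GUIDELINES` and `rule_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRule:
    encrypt_at_rest: bool
    external_sharing_allowed: bool
    access: str


# Hypothetical handling table, keyed by classification level (1 = most sensitive).
# It can be revised without touching the definitions of the levels themselves.
HANDLING_GUIDELINES = {
    1: HandlingRule(True, False, "named individuals only"),
    2: HandlingRule(True, False, "department members"),
    3: HandlingRule(False, False, "all employees"),
    4: HandlingRule(False, True, "anyone"),
}


def rule_for(level: int) -> HandlingRule:
    """Look up the handling rule for a classification level."""
    try:
        return HANDLING_GUIDELINES[level]
    except KeyError:
        raise ValueError(f"unknown classification level: {level}") from None


print(rule_for(1).access)
```

Because the table is plain data, a new version of the guidelines only changes the entries in `HANDLING_GUIDELINES`; the classification levels and anything built on them stay the same.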
Step #7. Review and refine. Finally, remember that a policy is not something you write once and forget. You need to periodically review your data classification policy and handling guidelines and make changes if necessary.
Examples of data classification in government and commercial sectors
As you start to create your data classification table, you can use examples from well-known organizations in the government and commercial sectors as guides.
Example 1: Government data classification levels
The typical government data classification scheme used by federal, state and local governments assigns no more than three levels of sensitivity: top secret, secret and public data. However, for organizations with extremely complex structures, a scheme using just three government data classification levels is insufficient. NATO’s Security Indoctrination document, for example, shows that data can be classified into six levels if necessary:
- Cosmic Top Secret
- NATO Secret
- NATO Confidential
- NATO Restricted
- NATO Unclassified (copyright)
- Non-sensitive information releasable to the public
Example 2: Commercial data classification levels
Typically, organizations that store and process commercial data use four levels of data classification: three confidential levels (secret, confidential, business use only) and one public level. The same rule applies to medical and educational organizations; for example, Boston University defines the following categories for all the academic, administrative and other data it stores and processes:
- Restricted use data — Loss or modification of this information could have a catastrophic effect on the organization’s operations, assets and individuals. This category includes information that the university has a legal or regulatory obligation to safeguard in the most stringent manner. Examples include personally identifiable information (PII) covered under Massachusetts law (e.g., Social Security numbers, financial account numbers and credit card details); protected health information (PHI) covered by the Health Insurance Portability and Accountability Act (HIPAA); unencrypted passwords and keys; and criminal background data collected by application
- Confidential data — Loss of this information may “adversely affect individuals or the business of Boston University.” This category includes information covered by regulations like the Gramm-Leach-Bliley Act (GLBA) and FERPA (which requires protection of the records of current and former students); personally identifiable information (PII) that is not restricted data and employment
- Internal data — This includes any information that is potentially sensitive and is not intended to be shared with the public, such as memos, correspondence and contact lists.
- Public data — This is information that may be disclosed to any person regardless of their affiliation with the university, such as press releases, directory information that is not subject to the Family Educational Rights and Privacy Act (FERPA), course and educational catalogs, and application
What is next?
Once you have written a good data classification policy and specified handling guidelines, you need to make sure the policy really works and that you have the necessary controls to ensure information security. The next step is tagging your data based on its sensitivity level and ownership. To launch this process, you will have to decide whether to do it manually or opt for an automated data classification solution.
Each method has its benefits and challenges. If you choose to classify data manually, you will have to give employees extra training to improve the consistency of their work. If you choose an automated system, you will have to seek financial support from senior management and allocate responsibilities for managing the software.
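To give a rough idea of what automated tagging involves, the sketch below suggests a level for a piece of text using a few pattern rules. It is a toy illustration only and does not reflect how any particular product works: the rules, the `suggest_level` function and the default level are all assumptions, and the pattern for Social Security numbers will match any digits in that shape.

```python
import re

# Hypothetical rules as (level, pattern) pairs, checked from most to least sensitive.
RULES = [
    (1, re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),  # shaped like a US Social Security number
    (2, re.compile(r"\bconfidential\b", re.IGNORECASE)),
    (3, re.compile(r"\binternal use only\b", re.IGNORECASE)),
]

# Assumption: text that matches no rule is treated as internal, not public,
# so nothing becomes public without a person deciding so.
DEFAULT_LEVEL = 3


def suggest_level(text: str) -> int:
    """Return the level of the first rule that matches, or the default."""
    for level, pattern in RULES:
        if pattern.search(text):
            return level
    return DEFAULT_LEVEL


print(suggest_level("Employee SSN: 123-45-6789"))
print(suggest_level("Confidential: Q3 pricing plan"))
```

Even with automation, a suggestion like this would normally be reviewed by the data owner, since simple patterns produce both false positives and misses.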