Understanding Personally Identifiable Information (PII)
Personally identifiable information (PII) is any data that can be used on its own or in combination with other information to identify, contact, or locate an individual. Examples of PII include full name, Social Security number, passport number, driver’s license number, and biometric data.
The term “personally identifiable information” gained prominence in the United States in the 1970s, as governments and organizations recognized the risks of unauthorized data use. One of the landmark moments was the 1973 report by the US Department of Health, Education, and Welfare, which outlined principles for fair information practices (FIPPs). This laid the foundation for data privacy regulation and the conceptual framework for what became known as PII.
As technology evolved, the scope of PII expanded:
- 1980 — The OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data formally addressed PII in an international context.
- 2000s — The rise of the internet and e-commerce highlighted the risks of online data exposure.
- 2010s and beyond — Regulations such as the General Data Protection Regulation (GDPR) in the EU and the California Consumer Privacy Act (CCPA) in the US redefined and broadened the understanding of what constitutes PII.
- Today — PII encompasses not just static identifiers but also dynamic data like online behavior, geolocation and device identifiers.
PII is critically important as it underpins privacy, security and trust in an increasingly connected world. The widespread collection and processing of personal data by businesses, governments and online platforms heighten the risks of identity theft, fraud and data breaches. Strong protection of PII is essential not only for regulatory compliance with laws like GDPR and CCPA but also for maintaining customer trust and safeguarding individual rights in an era of rapid technological advancement.
What Is Considered Personally Identifiable Information?
Information qualifies as PII when it meets either of the following conditions:
- Unique identification of an individual — The information can directly pinpoint an individual, as would be true for a Social Security number or email address.
- Linkability — The information, when combined with other accessible data, can reasonably identify an individual, such as combining zip code, birth date, and gender.
Context is critical when assessing whether information is PII:
- Data aggregation — Information that is not PII in isolation may become PII when combined with other datasets. For example, an IP address alone may not be PII, but when associated with a user account, it becomes PII.
- Purpose and access — Who has access to the data and how they intend to use it plays a role. For example, the location data of a person in a sensitive profession (such as law enforcement) may be considered PII.
- De-identification and re-identification risk — Even de-identified data can sometimes be re-identified using additional data sources, thus potentially regaining its status as PII.
Types of Personally Identifiable Information
PII can be broadly classified into two main categories based on the degree of identifiability: direct identifiers and indirect (or quasi-) identifiers.
Direct Identifiers
These are pieces of information that can directly and uniquely identify an individual without the need for additional data. Direct identifiers include:
- Full legal name, such as John A. Smith
- Social Security number (SSN)
- Passport number
- Driver’s license or state ID number
- Personal phone number
- Personal email address
- Biometric data (e.g., fingerprints, retina scans, facial recognition data)
- Credit card or bank account numbers
- Full-face photographs
Indirect (Quasi-) Identifiers
These pieces of information do not directly identify a person but can do so when combined with other data. Here are some common types of indirect identifiers with examples:
- Demographic info: Date of birth, gender, ethnicity, age or marital status
- Geographic info: Postal address, zip code, city of birth or travel history
- Educational history: School name or degree attained
- Employment data: Job title, company, office location
- Device information: IP address, MAC address, device ID (depending on context and jurisdiction)
- Behavioral info: Purchase history, browsing habits
Emerging Types of PII
With advances in technology, newer types of information are increasingly considered PII:
- Digital identifiers: Advertising IDs, MAC address
- Behavioral data: Browsing history, search queries
- Genetic information: DNA profiles
Sensitive vs. Non-Sensitive PII
PII can be sensitive or non-sensitive.
Sensitive PII
Sensitive PII is information that, if disclosed or compromised, could cause significant harm, embarrassment, financial loss or legal liability to an individual. Examples include:
- Social Security numbers
- Passport and driver’s license numbers
- Bank account and credit card details
- Medical records (also considered PHI), such as a diagnosis of diabetes in a healthcare system
- Legal records (criminal history, immigration status)
- Biometric data (fingerprints, retina scans)
- Full-face photographs (in certain contexts)
- Location data (GPS coordinates, real-time tracking)
Non-Sensitive PII
Non-sensitive PII is information that is generally publicly available or unlikely to cause serious harm if disclosed. Examples are:
- Full name (without other identifiers)
- Business contact information (work phone, company email)
- Job title or company affiliation
- General demographic data (race, gender when not combined with other identifiers)
Non-sensitive PII has a low standalone risk but can contribute to risk when combined with other data sets (data aggregation).
How Organizations Treat Sensitive PII and Non-Sensitive PII Differently
Sensitive PII handling:
- Strict access controls and encryption (both at rest and in transit)
- Mandatory user authentication and monitoring
- Data minimization principles — collect and store only what is necessary
- Regulatory compliance under GDPR, HIPAA, CCPA
- Data breach notification requirements

Non-sensitive PII handling:
- Basic privacy measures (e.g., secure storage, access logging)
- Often excluded from breach notification laws if not linked to sensitive data
- May still be protected by internal company policies to prevent misuse
What Is Not PII?
Not all data is considered PII. Non-PII refers to information that cannot be used on its own to identify a specific individual. However, in some cases, non-PII can become PII when combined with other data points.
Personal Data vs. PII
- PII is data that can uniquely identify an individual. This term is used mainly in US legal and cybersecurity contexts (e.g., NIST, HIPAA).
- Personal data is broader than PII. Under laws like GDPR, it includes any information related to an identifiable person, even indirectly (for example, device ID, IP address).
Examples of Non-PII
The following are typically not PII — unless they are linked to other identifying data:
- Browser type or device model
- Zip code (if not specific enough to identify a household)
- Age range
- Gender
- Anonymous survey responses
- Anonymized data
- Aggregated analytics data
- Randomly generated user IDs that are not traceable back to a person
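To make the last example concrete, here is a minimal Python sketch (the record shown is purely hypothetical) of an opaque, randomly generated user ID and how linking it to contact details changes its status:

```python
import uuid

# A random (version 4) UUID is not derived from any personal attribute,
# so on its own it is not PII.
user_id = str(uuid.uuid4())

# The moment that ID is stored alongside a name or email address, it becomes
# a link to an identifiable person and should be handled as PII.
profile = {"user_id": user_id, "email": "jane@example.com"}  # hypothetical record
print(profile)
```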
When Non-PII Becomes PII
Non-PII can turn into PII when:
- It is linked with other data. For example, zip code + birth date + gender can often uniquely identify an individual (see the sketch after this list).
- Re-identification is possible. If anonymized data can be de-anonymized using external information (like social media or public records), it is no longer non-PII.
- Data correlation occurs. Behavioral data (like clickstreams) may seem anonymous but can be analyzed to identify specific users.
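The first scenario above is easy to demonstrate. The following Python sketch uses invented records to show how a "de-identified" dataset can be re-identified by joining it with an identified one on zip code, birth date and gender:

```python
# "De-identified" records: names removed, quasi-identifiers kept.
medical_rows = [
    {"zip": "02138", "dob": "1945-07-31", "gender": "F", "diagnosis": "hypertension"},
    {"zip": "60614", "dob": "1982-03-02", "gender": "M", "diagnosis": "asthma"},
]

# A separate, identified dataset (for example, a public voter roll) with the same fields.
voter_rows = [
    {"name": "Jane Smith", "zip": "02138", "dob": "1945-07-31", "gender": "F"},
    {"name": "John Doe", "zip": "60614", "dob": "1982-03-02", "gender": "M"},
]

# Index the identified dataset by the quasi-identifier triple...
by_quasi_id = {(r["zip"], r["dob"], r["gender"]): r["name"] for r in voter_rows}

# ...then re-identify the "anonymous" rows with a simple lookup.
for row in medical_rows:
    name = by_quasi_id.get((row["zip"], row["dob"], row["gender"]))
    if name:
        print(f'{name} -> {row["diagnosis"]}')
```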
How PII Is Collected and Stored
Organizations must manage PII responsibly to protect individuals and comply with regulations like GDPR, CCPA and HIPAA.
PII Collection Methods
Online collection:
- Web forms: Contact forms, sign-up pages, checkout processes
- Cookies and trackers: Browser and behavioral data
- Surveys and polls: Email address or phone number
- Mobile apps: App permissions may grant access to contacts, location, etc.
- Social media platforms: Profiles, interactions and uploaded content

Offline (physical) collection:
- Paper forms: Applications, contracts, visitor logs
- Face-to-face interactions: Interviews, customer service interactions
- Surveillance: CCTV footage (may include biometrics)

Biometric data collection:
- Facial recognition systems
- Fingerprint scanners
- Voice recognition
- Iris/retina scans
Where PII Is Stored
Most modern organizations store PII in cloud environments, such as AWS, Microsoft Azure and Google Cloud (public, private or hybrid). PII is also often stored in SaaS platforms (CRMs, HRIS, ERPs).
However, some enterprises still use local servers and databases for sensitive workloads, as well as hybrid systems that combine local and cloud storage.
Regardless of location, PII must be encrypted (at rest and in transit), backed up securely, and audited regularly.
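As an illustration of what field-level encryption at rest can look like, the sketch below uses the Fernet primitive from the Python cryptography package (one common choice; a cloud key management service or transparent database encryption would serve the same goal):

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Generate the key once and keep it in a secrets manager or KMS,
# never in the same store as the data it protects.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt a PII field before it is written to storage.
ciphertext = fernet.encrypt(b"123-45-6789")

# Decrypt only in code paths that are authorized to see the plaintext.
assert fernet.decrypt(ciphertext) == b"123-45-6789"
```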
Consent and Transparency in PII Collection
User Consent
Laws like GDPR and CCPA require explicit, informed consent before collecting personal data. This means:
- Users must opt-in (not pre-checked boxes).
- Consent must be freely given, specific, informed and unambiguous.
- Consent withdrawal must be just as easy.
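One way to make these requirements auditable is to record consent as structured data rather than a single checkbox value. A minimal sketch follows; the field names are illustrative, not mandated by any law:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    user_id: str
    purpose: str                      # specific purpose, e.g. "marketing emails"
    granted_at: datetime
    withdrawn_at: Optional[datetime] = None

    def withdraw(self) -> None:
        # Withdrawal is a single, timestamped action -- as easy as granting.
        self.withdrawn_at = datetime.now(timezone.utc)

    @property
    def active(self) -> bool:
        return self.withdrawn_at is None

consent = ConsentRecord("user-42", "marketing emails", datetime.now(timezone.utc))
consent.withdraw()
print(consent.active)  # False
```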
Transparency Requirements
Organizations must provide:
- Clear privacy policies on what data is collected, how it is used, how long it will be retained and any third-party sharing
- Notices at point of collection, such as cookie banners or inline disclaimers
- Easy ways for users to view, correct and delete their data
How PII Gets Compromised
PII is a prime target for cybercriminals because it can be used for financial fraud, identity theft and corporate espionage. Understanding the common ways PII is compromised helps organizations and individuals better protect it.
Common Threat Vectors
Attackers often gain access to PII using:
- Phishing — Fake emails or websites trick users into revealing login credentials or personal details. They are often disguised as legitimate communication from banks, employers, or service providers.
- Malware — Malicious software (e.g., keyloggers, spyware, ransomware) installed on a device can capture keystrokes, exfiltrate stored credentials or files, and open backdoors for future access.
- Social engineering — Attackers manipulate people by impersonating IT staff, using information from social media to gain trust, and convincing employees to grant system access or bypass protocols.
- Weak security practices — Adversaries can also gain access to PII by exploiting poor password hygiene, unpatched or outdated software, insecure APIs or third-party services, and misconfigured cloud storage (for example, open S3 buckets).
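For the last item, a basic check for S3 buckets with missing or incomplete public-access blocking can be scripted with boto3. This is a sketch that assumes AWS credentials are already configured, and it is not a substitute for a full cloud security posture review:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        config = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        if not all(config.values()):
            print(f"{name}: public access block only partially enabled: {config}")
    except ClientError as err:
        if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            print(f"{name}: no public access block configured -- review this bucket")
        else:
            raise
```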
Data Breaches & Real-World Incidents
Data breaches occur when unauthorized individuals gain access to PII. The consequences can be severe for both the organization that was breached and the individuals whose data was compromised.
Here are a few high-profile incidents:
- Equifax (2017) — This attack affected 147 million people, exposing their Social Security numbers, birth dates and more.
- Facebook (2019) — 540 million user records collected by third-party apps were exposed in unsecured cloud databases.
- Marriott (2018–2020) — Hackers stole data from 500 million guests, including passport numbers and booking details.
How Attackers Exploit Stolen PII
Stolen PII has immense black-market value and can be exploited in multiple ways. Even a small amount of PII (such as name, date of birth or Social Security number) is often enough to commit these acts.
Identity Theft & Fraud
Stolen PII enables attackers to impersonate victims and perform fraudulent actions such as:
- Opening credit cards or bank accounts
- Taking out loans
- Making unauthorized purchases
- Filing false tax returns to collect refunds
Credential Stuffing & Account Takeover
If the stolen PII includes usernames and passwords, attackers can run automated scripts to test these credentials across multiple platforms (banks, email, e-commerce). Due to password reuse, this often results in full account takeovers, granting access to sensitive data and stored funds.
Spear Phishing and Social Engineering
Having detailed PII about an individual enables attackers to target them more effectively in the future. With accurate details about things like the person’s employer or recent purchases, they can pretend to be a trusted contact or service provider and trick the victim into revealing more information or making financial transactions.
Medical and Insurance Fraud
Stolen health records or insurance information can be used to:
- File false medical claims
- Receive unauthorized treatments or prescriptions
- Steal someone’s medical identity, which can lead to dangerous errors in patient care
Blackmail or Extortion
Highly sensitive PII (such as private messages or health history) may be used to threaten public exposure unless a ransom is paid, as well as to target individuals in reputation-damaging campaigns.
Dark Web Sales and Data Resale
Sometimes, attackers sell the PII instead of using it themselves. “Fullz” packages (full identity profiles) can be sold for high prices, and health records, passports and credit card data fetch premium value on dark web marketplaces.
Long-Term Impact
Unlike a stolen credit card, PII cannot always be easily replaced. Victims may face:
- Years of monitoring and legal disputes
- Repeated targeting due to data circulation among criminals
- Difficulty securing credit or jobs due to compromised records
Regulations Governing Personally Identifiable Information
Let’s have a quick look at the regulations governing PII, covering US laws, international frameworks, and differences in global definitions and protections.
US Laws
Privacy Act of 1974
- Scope: Applies to federal government agencies (but not to private sector entities)
- Key provisions: Restricts the collection, use and dissemination of personal information by federal agencies; grants individuals the right to access and correct their personal records

California Consumer Privacy Act (CCPA)
- Scope: Applies to for-profit entities that meet certain thresholds (such as revenue, data volume)
- Key provisions: Gives California residents the right to know what personal data is collected, request deletion, and opt out of data sales; requires businesses to disclose data practices
- Expansion: Strengthened by the California Privacy Rights Act (CPRA), effective 2023

Health Insurance Portability and Accountability Act (HIPAA)
- Scope: Health care providers, insurers and their business associates
- Key provisions: Protects medical records and other health-related PII; enforces safeguards for data storage, access and transmission; includes breach notification requirements

Gramm-Leach-Bliley Act (GLBA)
- Scope: Financial institutions
- Protects: Financial and personal consumer information
- Requires: Privacy notices, data security policies

Family Educational Rights and Privacy Act (FERPA)
- Scope: Educational institutions
- Protects: Student education records and PII
International Frameworks
General Data Protection Regulation (GDPR)
- Scope: Applies to all entities that process the data of EU residents, regardless of the entity’s location
- Key provisions: Strong emphasis on consent, transparency and purpose limitation; rights include access, rectification, erasure (the right to be forgotten) and data portability; requires Data Protection Officers and Data Processing Agreements; severe penalties for non-compliance (up to €20 million or 4% of global revenue)

Australian Privacy Act (1988)
- Scope: Applies to most Australian businesses with revenue over AUD $3 million, with some exceptions
- Key provisions: Includes 13 Australian Privacy Principles (APPs); governs collection, use, storage and disclosure of personal data; grants rights to access and correct data
Differences in PII Definitions and Protections by Country or Region
| Aspect | United States | European Union | Australia |
| --- | --- | --- | --- |
| Definition of PII | Narrower; often context-specific (e.g., SSN, email) | Broad; any data that can identify a person, directly or indirectly | Similar to GDPR; includes any info about an identified or identifiable individual |
| Consent requirement | Often implicit; varies by law | Explicit, informed, freely given consent is critical | Requires consent in many cases, but often less stringent than GDPR |
| Data subject rights | Varies widely by sector/law | Extensive and uniform across member states | Includes access, correction, limited deletion rights |
| Cross-border transfer restrictions | Sector-specific (e.g., HIPAA data must remain in the US) | Transfers outside the EU require adequate safeguards | Transfers allowed with adequate protection or consent |
| Enforcement | Sector-specific agencies (FTC, OCR, etc.) | Centralized Data Protection Authorities | Office of the Australian Information Commissioner (OAIC) |
Protecting Personally Identifiable Information
Protecting PII is crucial in today’s digital age due to increasing threats from data breaches, identity theft and cyberattacks.
Best Practices for Individuals
Use strong, unique passwords.
- Pick long passwords; use passphrases when possible (see the passphrase sketch below).
- Create complex passwords with a mix of letters, numbers and symbols.
- Never use the same password for multiple platforms.

Enable multifactor authentication (MFA).
- MFA adds a second layer of security beyond just a password. Common types include SMS codes, authenticator apps and biometrics.

Encrypt personal data.
- Use full-disk encryption on devices (e.g., BitLocker, FileVault).
- Encrypt sensitive files and communications (e.g., with PGP or end-to-end encrypted messaging apps).

Be wary of phishing and social engineering.
- Do not click unknown links or open suspicious attachments.
- Verify email sender details and URLs before responding.

Secure devices.
- Keep operating systems, browsers and antivirus software up to date.
- Use screen locks and auto-timeouts on mobile and desktop devices.
- Disable Bluetooth and location sharing when not needed.
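To illustrate the passphrase advice, Python's standard secrets module can assemble one from randomly chosen words. The short word list below is only a stand-in; a real generator would draw from a diceware-style list of several thousand words:

```python
import secrets

# Stand-in word list; use a large diceware-style list in practice for enough entropy.
WORDS = ["correct", "horse", "battery", "staple", "orbit", "maple",
         "quartz", "violet", "anchor", "breeze", "cobalt", "dune"]

def passphrase(num_words: int = 5) -> str:
    # secrets.choice draws from a cryptographically secure random source.
    return "-".join(secrets.choice(WORDS) for _ in range(num_words))

print(passphrase())  # e.g. 'maple-orbit-anchor-quartz-dune'
```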
Corporate Data Handling Protocols
Data classification and access control
- Classify data based on sensitivity (e.g., public, internal, confidential, restricted).
- Grant access based on the principle of least privilege.

PII data mapping and inventory
- Maintain an up-to-date inventory of where PII is stored, processed and transmitted.
- This helps ensure compliance with regulations like GDPR and CCPA.

Employee training and awareness
- Conduct regular training on data privacy, phishing and secure handling of PII.
- Include role-specific modules for HR, IT and customer service teams.

Encryption in transit and at rest
- Use TLS/SSL for secure data transmission.
- Encrypt databases and storage systems to protect data at rest.

Audit trails and monitoring
- Log access to sensitive data and monitor for anomalies.
- Use security information and event management (SIEM) systems for real-time alerts.

Data minimization and retention policies
- Collect only necessary data.
- Regularly review and delete outdated or unnecessary PII (see the retention-sweep sketch below).
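A retention policy only helps if something enforces it. The sketch below shows the idea of a periodic retention sweep; the in-memory SQLite database, table and column names are hypothetical stand-ins for a real system of record:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 365  # hypothetical policy: keep customer PII for one year

conn = sqlite3.connect(":memory:")  # stand-in for the real CRM database
conn.execute("CREATE TABLE customer_pii (email TEXT, last_activity TEXT)")
conn.execute(
    "INSERT INTO customer_pii VALUES (?, ?)",
    ("old@example.com", "2019-01-01T00:00:00+00:00"),
)

cutoff = (datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)).isoformat()

# Purge records whose last activity predates the retention window.
deleted = conn.execute(
    "DELETE FROM customer_pii WHERE last_activity < ?", (cutoff,)
).rowcount
conn.commit()
print(f"Purged {deleted} expired PII record(s)")
```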
Reducing the Attack Surface
Minimize data collection.
- Avoid collecting unnecessary PII; assess the risk vs. value of each data point.
- Anonymize or pseudonymize data where possible (see the pseudonymization sketch below).

Limit data sharing.
- Avoid sharing sensitive data across multiple systems or third parties without proper contracts and encryption.
- Use tokenization or masked identifiers where practical.

Patch and update systems.
- Keep software, firmware and security tools current to fix known vulnerabilities.

Use network segmentation.
- Separate networks for different business functions (such as finance vs. operations) to limit the spread of a breach.

Adopt a Zero Trust security model.
- Authenticate and authorize every user, device and app regardless of location.
- Assume breach and continuously verify trust.
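To illustrate the pseudonymization item above, a keyed hash replaces a direct identifier with a stable token that is meaningless without the key. This is a sketch; key storage, rotation and the legal status of the output (pseudonymized data is still personal data under GDPR) all need separate attention:

```python
import hashlib
import hmac
import secrets

# The key belongs in a secrets manager; whoever holds it can re-link tokens to people.
PSEUDONYM_KEY = secrets.token_bytes(32)

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier (e.g., an email address) with a stable token."""
    return hmac.new(PSEUDONYM_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

token = pseudonymize("jane.smith@example.com")
print(token)  # same input and key always yield the same token, enabling joins without exposing the email
```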
Consequences of Mishandling PII
Mishandling PII can lead to serious consequences for both organizations and individuals.
Consequences for Organizations that Mishandle PII
Fines and Regulatory Sanctions
Organizations that fail to comply with data protection laws may face substantial fines. Here are some examples:
- GDPR (EU) — Fines up to €20 million or 4% of global annual turnover (whichever is higher). Examples include British Airways (£20M fine) and Marriott International (£18.4M fine).
- CCPA (California) — Civil penalties of up to $2,500 per unintentional violation and $7,500 per intentional violation. Private right of action allows consumers to sue for data breaches.
- HIPAA (US healthcare sector) — Fines range from $100 to $50,000 per violation, up to $1.5 million annually per violation type.
Class Action Lawsuits and Settlements
Victims of data breaches often initiate class actions, where settlements can cost millions in payouts, legal fees and operational changes.
Remediation Costs
Post-breach expenses include incident response, forensics, public relations, customer notification and credit monitoring. According to the IBM Cost of a Data Breach Report 2023, the average cost of a breach is $4.45 million.
Reputation Damage and Loss of Customer Trust
- Brand erosion — News of a breach spreads quickly, damaging the company’s public image. In such a situation, loss of goodwill can take years to rebuild or simply be irreversible.
- Customer attrition — Customers may switch to competitors perceived as more secure. Businesses in industries like finance, healthcare, or e-commerce are especially vulnerable to churn.
- Loss of competitive advantage — Breached organizations may lose trade secrets or proprietary information, or face disrupted operations.
- Decline in market value — Publicly traded companies often experience stock price drops following breach announcements. Shareholder lawsuits may follow, especially if negligence is involved.
Consequences for Individuals Whose PII Is Mishandled
- Identity theft — Criminals can use stolen PII to open bank accounts, take out loans or file fraudulent tax returns.
- Financial fraud — Stolen credit card and banking details can result in unauthorized transactions, loss of funds and damaged credit scores.
- Emotional and psychological distress — Victims may suffer from anxiety, stress and fear of further victimization. Moreover, rebuilding identity can be a lengthy, frustrating and intrusive process.
- Targeted scams and phishing — Leaked information can be used in highly personalized scams that are difficult to detect.
- Employment or social impact — Exposure of sensitive personal details can lead to professional embarrassment or social stigmatization.
Case Studies: Major PII Breaches
Equifax Breach (2017)
Equifax, one of the largest credit reporting agencies in the US, suffered a major cybersecurity incident that exposed the sensitive personal information of some 147.9 million people in the US — nearly half of the country’s population. It also affected people in Canada and the United Kingdom. The incident stands out as one of the largest breaches of sensitive data in history.
The breach occurred between May 13 and July 30, 2017. It went undetected for 76 days, allowing attackers to exfiltrate massive amounts of data. Equifax publicly disclosed it on September 7, 2017. The attackers exploited a vulnerability (CVE-2017-5638) in Apache Struts, an open-source web application framework. This vulnerability had been publicly disclosed and patched in March 2017, but Equifax did not apply the patch in time. Later reports indicated that some systems lacked proper encryption and security protocols.
Compromised data included names, Social Security numbers, birth dates, addresses, credit card information (for around 209,000 people), and dispute documents containing personal information (for 182,000 individuals). Overall, the breach posed a significant risk of identity theft and fraud.
Reputational damage aside, Equifax incurred $1.4 billion in costs related to the breach, including legal settlements, customer support and security improvements. The settlement also included free credit monitoring and identity theft protection for affected individuals. Equifax had to invest in overhauling its cybersecurity infrastructure, implementing stronger encryption, multifactor authentication and real-time threat monitoring. On the legal front, Equifax faced several lawsuits and investigations from regulators and private entities.
Facebook-Cambridge Analytica Scandal
The Facebook–Cambridge Analytica scandal came to light in early 2018, when it was revealed that the personal data of millions of Facebook users had been harvested without consent and used for political advertising purposes. The core of the issue revolved around a third-party app that collected data through a seemingly innocuous personality quiz. While only about 270,000 users directly interacted with the app, Facebook’s API at the time allowed the app to access not just their data but also the data of their friends — ultimately compromising the information of over 87 million users. This data was then passed on to Cambridge Analytica, a political consulting firm that used the profiles to build psychographic models and target users with highly personalized political messages, particularly during the 2016 US presidential election and the Brexit campaign.
Facebook faced intense scrutiny from governments and regulators worldwide, leading to CEO Mark Zuckerberg testifying before the US Congress and the European Parliament. The company’s stock value dropped significantly and its reputation suffered substantial damage. Regulatory consequences followed, including a $5 billion fine from the US Federal Trade Commission (FTC) in 2019 — the largest ever imposed for a privacy violation at that time.
Lessons Learned from High-Profile Incidents
Here are the key lessons learned from high-profile data privacy incidents such as the Equifax breach and the Facebook–Cambridge Analytica scandal:
- Prioritize data security and patch management. Organizations must maintain an up-to-date vulnerability management program and automate patch deployment wherever possible.
- Transparent and timely communication is crucial. Having a well-prepared incident response and communication plan helps maintain trust and ensures regulatory compliance during crises.
- Know what data you have — and why. Adopt data minimization and retention principles. Collect only what’s necessary, and clearly define the purpose and legal basis for doing so.
- Strengthen access controls and internal oversight. Implement role-based access controls (RBAC), enforce the principle of least privilege and monitor data access in real time.
- User consent must be informed and granular. Ensure clear, specific opt-in consent for data collection and sharing. Avoid using obscure terms buried in privacy policies.
- Regulatory compliance isn’t optional. Violations of GDPR, CCPA and other data protection laws can lead to massive fines and lawsuits. Treat privacy compliance as a core business function, not a checkbox exercise. Engage privacy officers, legal counsel and IT in data strategy development.
- Regularly test and audit systems. Many organizations discover weaknesses only after an incident occurs. It is a best practice to conduct routine security audits, penetration tests and risk assessments to identify vulnerabilities before attackers do.
- Trust is fragile and must be earned continuously. Once lost, trust — whether from customers, partners or regulators — is extremely hard to regain. Demonstrate privacy by design, transparency and ethical data use to build long-term credibility.
- Invest in privacy education and culture. Human error remains a significant factor in data mishandling. Take steps to foster a privacy-first culture through regular training, awareness programs and leadership engagement.
PII in the Workplace and Vendor Agreements
Here’s a detailed guide on managing PII in the workplace and in vendor agreements, with emphasis on identifying PII during third-party negotiations, employee responsibilities and vendor risk management.
Identifying PII During Third-Party Negotiations
Clear PII definition
- Clearly specify what qualifies as PII in your organization’s context (for example, names, emails, national IDs, biometric data).
- Reference regulatory definitions (GDPR, CCPA, etc.) when drafting or reviewing contracts.

Data flow mapping
- Document how PII will be collected, processed, transmitted and stored by third parties.
- Identify who the data controller and data processor are, along with their responsibilities.

Key contractual clauses to include
- Data use limitations — Ensure the vendor cannot use PII beyond the specified purpose.
- Data retention and deletion — Define how long data is retained and how it should be securely deleted.
- Breach notification — Require vendors to notify your organization within a specific time frame in case of a data breach.
- Audit rights — Allow for periodic privacy and security audits.

Standard agreements
- Use data processing agreements (DPAs) and service level agreements (SLAs) to codify responsibilities.
- For international transfers, incorporate standard contractual clauses (SCCs) or use vendors certified under a framework like Privacy Shield (historical), the UK IDTA or Binding Corporate Rules.
Vendor Risk Assessment and Compliance Protocols
Perform a pre-contract risk assessment.
- Use questionnaires or risk assessment templates to evaluate security certifications (e.g., ISO 27001, SOC 2), data encryption practices, breach history and sub-processor management.

Perform due diligence.
- Check vendor reputation, financial health and regulatory history.
- Confirm alignment with frameworks like GDPR, CCPA, HIPAA or PCI DSS, depending on industry.

Monitor ongoing compliance.
- Require annual compliance reports or certifications.
- Use third-party risk monitoring services if managing many vendors.

Include exit and termination provisions.
- Contracts should specify data return or destruction procedures when ending the relationship.
- Ensure vendors do not retain any PII beyond contract termination unless legally required.
PII vs. PHI: What’s the Difference?
PII and protected health information (PHI) are both foundational concepts in data privacy and security. The following will help you distinguish between the two.
Personally Identifiable Information
PII is any information that can be used to identify, contact or locate a specific individual, either directly or indirectly, such as full name, Social Security number, email address, phone number, passport number or driver’s license number. PII applies broadly across sectors, such as finance, education and retail.
Protected Health Information
PHI is a subset of PII that specifically relates to an individual’s health status, medical care or payment for health services. PHI is defined and regulated under HIPAA in the United States. Examples of PHI include:
- Medical records
- Lab results
- Insurance information
- Doctor’s notes
- Appointment schedules
PII and PHI Overlap
PHI is a subset of PII, but not all PII is PHI. For example, a name is PII, but a name combined with a medical diagnosis or insurance number becomes PHI.
Legal Distinctions
PII (general)
- Regulated by: Privacy Act of 1974, CCPA, GDPR, FERPA and various state-level laws
- Breach consequences: Vary depending on sector and jurisdiction
- Use cases: Customer accounts, marketing, banking, education, etc.

PHI (under HIPAA)
- Regulated by: HIPAA Privacy Rule and HIPAA Security Rule, enforced by the US Department of Health and Human Services (HHS)
- Applies to: Covered entities (healthcare providers, insurers) and their business associates (e.g., billing companies, cloud providers handling PHI)
- Requirements include: Safeguards for electronic PHI (ePHI), patient consent for use/disclosure and breach notification rules
- Penalties: Can reach up to $1.5 million per year per violation type, plus criminal liability in serious cases
Use Cases Where Both Types Apply
- A telemedicine app that stores user names and email addresses (PII) as well as appointment records and prescriptions (PHI)
- A health insurance portal that collects login credentials and contact info (PII) as well as claim history and treatment details (PHI)
- An employee wellness program that gathers demographic data for incentive tracking (PII) along with health risk assessments and lab results (PHI)
- Research studies and clinical trials that use consent forms and contact info (PII) as well as health data from patient interviews and testing (PHI)
Future of PII: Trends and Emerging Challenges
Here are some trends and challenges regarding PII to watch for.
Impact of AI and Big Data
AI and big data analytics are transforming PII management but also amplifying risks. Key concerns include:
- Exploding data volumes — AI systems feed on massive datasets, often containing sensitive personal data. Even anonymized data can be re-identified with enough auxiliary information.
- Profiling & inferences — AI can infer sensitive information (for example, political views, health status) from seemingly innocuous data, creating new categories of “derived PII.”
- Automated decision-making — AI models influence important decisions in areas like employment, lending and healthcare. Concerns include fairness and explainability.
- Data minimization challenges — AI thrives on large datasets, which contradicts the principle of collecting only necessary data.
The core challenges will be balancing innovation with privacy and ensuring proper oversight of AI-driven decisions involving PII.
Rise of Biometric Identifiers
Biometric data — facial recognition, iris scans, fingerprints, voice prints, DNA — is increasingly used for authentication, surveillance and personalization. For example, this data is being used in smartphone security, airport clearance, workplace access and healthcare.
Risks include:
- Irrevocability — Unlike passwords, you can’t change your face or fingerprints if they are compromised.
- Mass surveillance — Government and corporate use of facial recognition has triggered civil liberties concerns.
- Data breaches — Biometric databases are prime targets due to the high value and permanence of the data.
A key challenge will be creating legal and technical safeguards for the ethical use and storage of biometrics.
Calls for Unified Global Privacy Frameworks
Many countries (as well as individual US states) are developing their own privacy laws, which increases complexity:
- Inconsistent laws hinder international data transfers and global business.
- Multinational companies face costly, redundant compliance efforts.
Efforts are underway to draft frameworks that address these concerns. They include:
- OECD and ISO efforts toward standardization
- Proposed UN conventions on digital privacy
- Discussions around a “Global Privacy Accord” or international data protection treaty
The challenge will be aligning privacy rights and enforcement mechanisms globally while respecting national sovereignty and cultural differences.
Looking Ahead
Key Emerging Trends
- Privacy by design — Integrating privacy into product development from the ground up
- Zero-Trust architectures — Moving beyond perimeter security to identity-centric models
- Synthetic data — Using AI-generated datasets to train models without using real PII
- Decentralized identity (DID) — Empowering users with control over their identity using blockchain or similar tech
Key Challenges
- Maintaining trust in a hyper-connected world
- Balancing innovation with individual rights
- Preventing surveillance capitalism and digital authoritarianism
About Netwrix Data Classification
Identifying and safeguarding PII is essential to avoid costly breaches and ensure compliance with strict data security regulations like GDPR and HIPAA. However, PII is often dispersed across various data environments, making manual identification both difficult and error-prone.
Netwrix Data Classification empowers organizations to accurately identify, categorize and secure sensitive data, including PII. This process helps them reduce data-related risks, ensure compliance and improve operational efficiency. Key capabilities related to PII include:
- Automated discovery — Uses predefined and customizable rules to detect various types of PII across on-premises file shares, SharePoint, cloud storage and more (the toy scanner sketched after this list illustrates the rule-based idea).
- Data categorization — Tags files based on sensitivity and content type (for example, financial data, health data), which helps prioritize protection efforts based on risk.
- Regulatory mapping — Supports compliance with regulations like GDPR, CCPA and HIPAA by mapping PII to legal requirements and streamlining audits and reporting.
- Risk mitigation — Identifies overexposed or mismanaged PII (for example, data stored in public folders or shared widely), and supports remediation by integrating with Netwrix Auditor and other DLP/IRM solutions.
- Processing of data subject access requests (DSARs) — Accelerates DSAR handling by quickly locating all PII associated with an individual across repositories.
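To picture what rule-based discovery means in practice, here is a deliberately simplified scanner that walks text files and flags two PII patterns. It is illustrative only and does not reflect how Netwrix Data Classification is implemented; production tools add validation, keyword context and many more rules to reduce false positives:

```python
import re
from pathlib import Path

# Toy detection rules for two common PII types.
PII_PATTERNS = {
    "US SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "Email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def scan(root: str) -> None:
    for path in Path(root).rglob("*.txt"):
        text = path.read_text(errors="ignore")
        for label, pattern in PII_PATTERNS.items():
            hits = pattern.findall(text)
            if hits:
                print(f"{path}: {len(hits)} possible {label} value(s)")

scan(".")  # point this at an exported file share or repository to get a rough inventory
```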
FAQ
What is the best definition of personally identifiable information (PII)?
PII is data that can reveal an individual’s identity. The following definition is widely accepted by privacy frameworks such as the US National Institute of Standards and Technology (NIST) and international regulations like GDPR and CCPA, though the specific terminology may vary slightly:
Any information that can be used to identify, contact, or locate a single individual, either directly or indirectly.
What are some examples of PII?
Some common examples of PII are:
- Full Name: Jane Smith, Ahmed Khan
- Government IDs: Social Security Number (SSN), passport number, driver’s license scan
- Contact details: Email address, phone number, mailing address
- Biometric data: Fingerprints, facial recognition, retina scans
- Financial info: Credit card number, bank account details
- Other identification numbers: National ID, taxpayer ID number, student ID
What are the two types of PII?
- Direct identifiers can uniquely identify an individual on their own. Examples include full name, passport number, phone number and credit card number.
- Indirect identifiers do not identify an individual when used alone but can when combined with other data. Examples of indirect identifiers include demographic info, employment data and device info.
See the Types of Personally Identifiable Information section for details.
What laws protect PII?
Regulations that protect PII vary by country and industry, but they all share the goal of ensuring that individuals have control over their personal data, and that organizations handle PII responsibly. Here are some of the most well-known laws.
- United States: HIPAA (Health Insurance Portability and Accountability Act), GLBA (Gramm-Leach-Bliley Act), FERPA (Family Educational Rights and Privacy Act), CCPA/CPRA (California Consumer Privacy Act / California Privacy Rights Act)
- Canada: Personal Information Protection and Electronic Documents Act (PIPEDA)
- European Union: General Data Protection Regulation (GDPR)
- Singapore: Personal Data Protection Act (PDPA)
See the Regulations Governing Personally Identifiable Information section for details.
Can non-sensitive data become PII?
Yes, non-sensitive data can become PII when it is combined with other data in a way that allows an individual to be identified, located or contacted. For example, a zip code alone is not PII, but combining it with date of birth and gender could uniquely identify a person.