PII vs PHI: What's the Difference?

A clear comparison of Personally Identifiable Information and Protected Health Information — definitions, legal frameworks, overlap, and use cases

Working with health documents? Use our free redaction tool to remove PII and PHI-related identifiers before using AI tools — 100% local, nothing uploaded.

Definitions at a Glance

PII

Personally Identifiable Information

Any data that can identify a specific individual — directly or in combination with other data. Includes names, email addresses, national IDs, financial account numbers, and IP addresses.

PII is the broader term, used across multiple legal frameworks and industries.

GDPR · CCPA · HIPAA · FTC · PIPEDA

PHI

Protected Health Information

A subset of PII specific to the US healthcare sector. PHI is any individually identifiable health information created, received, maintained, or transmitted by a HIPAA-covered entity or business associate.

PHI must relate to a person's past, present, or future health condition, care, or payment for care.

HIPAA (US only)

The key relationship: all PHI is PII, but not all PII is PHI. A person's email address is PII. Their email address combined with a diagnosis and stored in a hospital system is PHI. The same data in a retail loyalty programme is just PII — not PHI, because the holder is not a HIPAA-covered entity.

PII vs PHI — Full Comparison Table

Attribute | PII | PHI
Full name | ✓ Always PII | ✓ PHI when linked to health data
Email address | ✓ Always PII | ✓ PHI when in healthcare context
Phone number | ✓ Always PII | ✓ PHI when in healthcare context
Postal / home address | ✓ Always PII | ✓ PHI when in healthcare context
Date of birth | ✓ Always PII | ✓ PHI (one of 18 Safe Harbor identifiers)
National ID / SSN | ✓ Always PII | ✓ PHI when in healthcare records
Medical diagnosis | ✓ PII (GDPR special category) | ✓ PHI when identifiable and held by a covered entity
Prescription details | ✓ PII (GDPR special category) | ✓ PHI when identifiable and held by a covered entity
Insurance policy number | ✓ PII | ✓ PHI (Safe Harbor identifier)
Lab / test results | ✓ PII (GDPR special category) | ✓ PHI when identifiable and held by a covered entity
Biometric data | ✓ PII (GDPR special category) | ✓ PHI when in healthcare records
IP address | ✓ PII under GDPR | ✓ PHI (Safe Harbor identifier)
Credit card number | ✓ Always PII | ✗ Not PHI (financial data, unless linked to health billing)
IBAN / bank account | ✓ Always PII | ✗ Not PHI (unless linked to health billing)
Job title / employer | ~ Context-dependent PII | ✗ Not PHI
Vehicle registration | ✓ PII | ✓ PHI (Safe Harbor identifier)
Dates of care / admission | ~ PII if combined with identity | ✓ PHI when identifiable and held by a covered entity
Geographic data (sub-state) | ✓ PII if specific enough | ✓ PHI (Safe Harbor identifier)
Political opinion / beliefs | ✓ PII (GDPR special category) | ✗ Not PHI
Race / ethnic origin | ✓ PII (GDPR special category) | ~ PHI if in health record context

Where PII and PHI Overlap

The two categories share significant common ground — especially for organisations operating in healthcare.

PII only
  • Credit card numbers
  • IBANs / bank accounts
  • Political opinions
  • Religious beliefs
  • Social media handles
  • Cookie IDs
  • VAT / Tax ID numbers
Both PII & PHI
  • Full name
  • Email address
  • Phone number
  • Address
  • Date of birth
  • National ID / SSN
  • IP address
  • Biometric data
  • Diagnosis
  • Prescriptions
  • Insurance number
  • Vehicle registration
PHI focus
  • Dates of admission
  • Discharge dates
  • Treating clinician
  • Health plan data
  • Medical record numbers
  • Certificate / licence numbers
  • Geographic substate data

The central overlap is substantial. An identifier that sits in a health record held by a covered entity is both PII and PHI. If you operate across jurisdictions, that data must meet the requirements of both HIPAA and GDPR (or other applicable law).

Legal Frameworks — GDPR vs HIPAA

GDPR (EU)

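  • Applies to personal data of individuals in the EU
  • Health data = special category (Article 9)
  • Applies to any organisation worldwide that offers goods or services to, or monitors, individuals in the EU
  • Requires a legal basis; health data also needs an Article 9 condition such as explicit consent
  • Fines: up to €20M or 4% of global annual turnover, whichever is higher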
  • Enforced by national Data Protection Authorities
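
As a rough illustration of the fine ceiling listed above, the cap for the most serious infringements (GDPR Article 83(5)) is the higher of the two figures. The function name and the example turnovers below are made up for this sketch; the result is a ceiling, not a prediction of an actual fine.

```python
def gdpr_max_fine_eur(global_annual_turnover_eur: float) -> float:
    """Upper limit under GDPR Article 83(5): the higher of EUR 20 million
    or 4% of total worldwide annual turnover."""
    flat_cap = 20_000_000
    turnover_cap = global_annual_turnover_eur * 4 / 100
    return max(flat_cap, turnover_cap)


# Hypothetical companies, chosen only to show both branches.
small = gdpr_max_fine_eur(50_000_000)      # 4% is 2M, so the flat cap applies
large = gdpr_max_fine_eur(2_000_000_000)   # 4% is 80M, above the flat cap
print(small, large)
```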

HIPAA (US)

  • Applies only to healthcare entities & business associates
  • PHI = individually identifiable health information; Safe Harbor lists 18 identifiers to remove
  • US-only jurisdiction (but data subject may be anywhere)
  • Requires minimum necessary standard for disclosure
  • Fines: up to $1.9M per violation category per year
  • Enforced by HHS Office for Civil Rights

💡 Operating in both the US and EU? You need to satisfy both frameworks. GDPR's health data rules (Article 9) and HIPAA's Safe Harbor method address the same underlying risk: protecting individuals from harm caused by health data disclosure. Meeting both requires removing all 18 HIPAA identifiers and all GDPR special category data before sharing with AI tools or third parties.

The 18 HIPAA Safe Harbor Identifiers

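Under HIPAA's Safe Harbor de-identification method (45 CFR § 164.514(b)(2)), all 18 of the following must be removed, and the covered entity must have no actual knowledge that the remaining information could identify the individual, for health data to count as de-identified and fall outside HIPAA's protections.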

Names
Geographic data (sub-state)
All dates (except year)
Phone numbers
Fax numbers
Email addresses
Social Security Numbers
Medical record numbers
Health plan beneficiary numbers
Account numbers
Certificate / licence numbers
Vehicle identifiers / serial numbers
Device identifiers / serial numbers
Web URLs
IP addresses
Biometric identifiers
Full-face photographs
Any other unique identifying number or code

PrivacyPromptAI automatically detects names, dates, phone numbers, email addresses, SSNs, account numbers, vehicle identifiers, IP addresses, and biometric data, which covers nine of the 18 Safe Harbor categories. For healthcare documents, always verify the output manually against the full list before using AI tools.
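
To make pattern-based detection of a few text identifiers concrete, here is a minimal sketch using regular expressions. It is not PrivacyPromptAI's implementation; the patterns, labels, and sample text are assumptions chosen for illustration, they cover only simple US-style formats, and names are left out entirely because they cannot be found reliably with regular expressions.

```python
import re

# Illustrative patterns only: simple US-style formats, far from exhaustive.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Contact jane.doe@example.org or 555-123-4567; SSN 123-45-6789, host 192.168.0.10."
print(redact(sample))
# Contact [EMAIL] or [PHONE]; SSN [SSN], host [IP].
```

A pattern list like this misses anything written in an unexpected format, which is why the manual check against the full list still matters.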

Real-World Use Cases

How the PII/PHI distinction plays out across different professional contexts.

PII + PHI

🏥 Hospital using ChatGPT to summarise patient notes

Patient notes contain both standard PII (name, DOB, address) and PHI (diagnosis, treatment dates, prescriptions). HIPAA applies, and so does GDPR (Article 9) if any patients are in the EU. All 18 Safe Harbor identifiers plus GDPR special category data must be removed before using any AI tool.

PII only

📄 HR team using Claude to draft an employment contract

Employment contracts contain PII (name, address, DOB, bank details, salary). No PHI is involved — the holder is not a healthcare entity. GDPR applies; HIPAA does not. Remove all personal identifiers with PrivacyPromptAI before pasting into Claude.

PHI focus

💊 Pharmacy analysing prescription data with Gemini

Prescription records held by a pharmacy are PHI under HIPAA, because the pharmacy is a covered entity. All 18 Safe Harbor identifiers must be removed. GDPR also applies if any EU patients are involved, in which case the data must be de-identified against both standards before using Gemini or any cloud AI.

PII only

📊 Marketing team using Copilot to analyse customer data

Customer lists include names, emails, and purchase history — all PII under GDPR and CCPA. No health data is involved, so HIPAA does not apply. Redact the customer identifiers with PrivacyPromptAI before using Copilot to analyse trends.

PII + PHI

🧬 Research lab sharing genomic data with an AI analysis tool

Genomic data is both GDPR special category data (genetic data, Article 9(1)) and PHI under HIPAA (if the lab is a covered entity or business associate). Both frameworks require an explicit legal basis. Pseudonymisation alone is rarely sufficient for genetic data.

PII only

⚖️ Law firm using AI to review contracts

Legal documents contain PII (client names, addresses, IDs, financial terms). Unless the firm handles medical records, PHI rules do not apply. GDPR and attorney-client confidentiality obligations apply. Use PrivacyPromptAI to redact before AI-assisted contract review.

Remove PII and PHI-Related Identifiers — Free

PrivacyPromptAI detects names, emails, dates, SSNs, phone numbers, IBANs, IP addresses, and more — covering the majority of HIPAA Safe Harbor identifiers. 100% local. No signup. GDPR-compliant.


Frequently Asked Questions — PII vs PHI

To tell whether data is PHI, ask two questions: (1) Does the information relate to a person's past, present, or future health condition, healthcare, or payment for healthcare? (2) Is it held or transmitted by a HIPAA-covered entity (hospital, clinic, insurer, pharmacy) or one of its business associates? If both answers are yes, it is PHI. If the information is health-related but held by a non-healthcare entity (e.g. a wellness app or HR department), HIPAA may not apply, but GDPR's special category rules likely do.
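
The two-question test above can be sketched as a small decision function. The names are hypothetical, and the sketch assumes the data is already individually identifiable; it illustrates the logic and is not legal advice.

```python
from enum import Enum


class Classification(Enum):
    PHI = "PHI (HIPAA applies)"
    HEALTH_DATA_OUTSIDE_HIPAA = "health data outside HIPAA (GDPR Article 9 may apply)"
    NOT_HEALTH_DATA = "not health data (may still be PII)"


def classify(relates_to_health: bool, held_by_covered_entity: bool) -> Classification:
    """Apply the two questions in order: health-related first, then the holder."""
    if not relates_to_health:
        return Classification.NOT_HEALTH_DATA
    if held_by_covered_entity:
        return Classification.PHI
    return Classification.HEALTH_DATA_OUTSIDE_HIPAA


print(classify(True, True).value)    # hospital record
print(classify(True, False).value)   # wellness app or HR file
print(classify(False, False).value)  # retail loyalty programme
```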
GDPR health data and HIPAA PHI overlap significantly but are not identical. GDPR's definition of data concerning health (Article 4(15)) is broader in one respect: it covers any data relating to physical or mental health, regardless of who holds it. HIPAA's PHI is narrower in one direction (it applies only to covered entities and their business associates) but broader in another (it includes administrative data such as billing and appointment scheduling). If you hold data that qualifies as both, you must meet the requirements of both frameworks at once.
Under HIPAA's Safe Harbor method, data from which all 18 identifiers have been removed, and where the covered entity has no actual knowledge that the remaining information could identify an individual, is no longer PHI and is not subject to HIPAA restrictions. You can then use it with AI tools without triggering HIPAA obligations. However, GDPR's anonymisation standard is stricter: data is anonymous only if the individual cannot be identified by any means reasonably likely to be used (Recital 26). Always apply both tests before sharing with AI. For storing de-identified records securely afterwards, pCloud offers client-side encrypted Swiss cloud storage.
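
Some Safe Harbor identifiers are generalised rather than deleted outright: dates may keep the year, ages over 89 are grouped into a single category, and the first three ZIP digits may stay only when the combined area holds more than 20,000 people. The sketch below illustrates those three rules. The function names are hypothetical, the set of populous prefixes is a made-up input that would have to come from census data, and the sketch does not handle birth years that would themselves reveal an age over 89.

```python
from datetime import date


def generalise_date(d: date) -> str:
    """Keep only the year of a date."""
    return str(d.year)


def generalise_age(age: int) -> str:
    """Group ages over 89 into a single '90+' category."""
    return "90+" if age >= 90 else str(age)


def generalise_zip(zip_code: str, populous_prefixes: set[str]) -> str:
    """Keep the first three ZIP digits only for sufficiently populous areas.

    `populous_prefixes` is a hypothetical input: the three-digit prefixes
    whose combined area holds more than 20,000 people.
    """
    prefix = zip_code[:3]
    return prefix if prefix in populous_prefixes else "000"


allowed = {"100"}  # made-up set for the example
print(generalise_date(date(2023, 3, 14)))
print(generalise_age(93))
print(generalise_zip("10027", allowed))
print(generalise_zip("05602", allowed))
```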
PrivacyPromptAI detects the identifiable components of PHI (names, email addresses, phone numbers, dates, SSNs, account numbers, IP addresses, vehicle identifiers, and biometric references), which correspond to nine of HIPAA's 18 Safe Harbor identifier categories. It does not detect clinical content such as diagnoses or prescriptions by text meaning. For healthcare documents, the tool handles the identity layer; a clinical reviewer should verify the health content layer.
The 18 HIPAA Safe Harbor identifiers (45 CFR § 164.514(b)) that must be removed to de-identify PHI are: (1) Names, (2) Geographic data smaller than a state, (3) Dates other than year, (4) Telephone numbers, (5) Fax numbers, (6) Email addresses, (7) Social Security Numbers, (8) Medical record numbers, (9) Health plan beneficiary numbers, (10) Account numbers, (11) Certificate/licence numbers, (12) Vehicle identifiers, (13) Device identifiers, (14) Web URLs, (15) IP addresses, (16) Biometric identifiers, (17) Full-face photographs, (18) Any other unique identifying number or code. PrivacyPromptAI detects the text-based identifiers in this list automatically.
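
One way to plan the manual check is to list which of the 18 categories a given detector does not cover. In the sketch below, `AUTO_DETECTED` is an assumption: it maps the detected types this article names onto the list, and it is not a product specification.

```python
SAFE_HARBOR_CATEGORIES = [
    "Names",
    "Geographic data smaller than a state",
    "Dates other than year",
    "Telephone numbers",
    "Fax numbers",
    "Email addresses",
    "Social Security Numbers",
    "Medical record numbers",
    "Health plan beneficiary numbers",
    "Account numbers",
    "Certificate/licence numbers",
    "Vehicle identifiers",
    "Device identifiers",
    "Web URLs",
    "IP addresses",
    "Biometric identifiers",
    "Full-face photographs",
    "Any other unique identifying number or code",
]

# Assumption: the detected types named in this article, mapped to the list.
AUTO_DETECTED = {
    "Names", "Dates other than year", "Telephone numbers", "Email addresses",
    "Social Security Numbers", "Account numbers", "Vehicle identifiers",
    "IP addresses", "Biometric identifiers",
}


def needs_manual_review(detected: set[str]) -> list[str]:
    """Return the categories no detector covers, in list order."""
    return [c for c in SAFE_HARBOR_CATEGORIES if c not in detected]


remaining = needs_manual_review(AUTO_DETECTED)
print(len(remaining))
for category in remaining:
    print("-", category)
```

Under that assumed mapping, nine categories remain for manual review, including medical record numbers, health plan beneficiary numbers, and full-face photographs.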
A diagnosis or billing code is PHI if it is held by a HIPAA-covered entity and can be linked, directly or indirectly, to a specific individual. An ICD-10 code or CPT billing code alone does not name a person, but in the context of a patient record where other identifiers are present, it is part of PHI. Similarly under GDPR, a diagnosis code linked to any other data that identifies the individual is special category health data (Article 9). When sharing clinical or billing documents with AI tools, remove the patient identifiers that give the diagnosis codes their context, and also remove any rare condition codes that could re-identify a patient in a small dataset.
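
To spot rare condition codes before sharing a dataset, one simple approach is to count how often each code appears and flag those below a threshold. The function name, the threshold of 5, and the sample codes below are assumptions for illustration; the right threshold depends on the dataset and your own risk assessment.

```python
from collections import Counter


def rare_codes(codes: list[str], k: int = 5) -> set[str]:
    """Return the codes that appear fewer than k times in the dataset."""
    counts = Counter(codes)
    return {code for code, n in counts.items() if n < k}


# Made-up dataset: two common codes and one that appears only once.
dataset = ["E11.9"] * 6 + ["I10"] * 5 + ["E75.22"]
print(rare_codes(dataset))
```

Records carrying a flagged code can then be suppressed or grouped into a broader category before the data leaves your systems.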
HIPAA's Security Rule lists encryption as an "addressable" (not required) safeguard for electronic PHI (ePHI), but this does not mean it is optional in practice. If you determine that encryption is not reasonable and appropriate for your situation, you must document why and implement an equivalent alternative. In reality, most regulators and auditors expect encryption for ePHI at rest and in transit. When it comes to AI tools, encryption is a secondary question: once PHI enters an AI system, the data processor relationship and consent issues arise regardless of transport security. The correct approach is de-identification via Safe Harbor removal before any AI processing. When transmitting ePHI across networks, NordVPN provides an audited, no-logs encrypted connection. For the legal side, Termly can generate a GDPR/HIPAA-ready privacy policy.