The Data You Think Is Protected Isn't
Understanding PHI, HIPAA, and AI—what makes this different from every previous healthcare technology.
Why does AI create genuinely new privacy challenges—not just faster versions of old problems—and how do you recognize the exposures that matter in clinical workflows?
In September 2024, a physician who had left an Ontario hospital more than a year earlier installed Otter.ai, an AI transcription tool, on his personal device. He was still on the invite list for the hospital's weekly hepatology rounds. When the next meeting started, the Otter bot joined automatically, recorded physicians discussing seven patients by name, and emailed a transcript to 65 people, including 12 who no longer worked at the hospital.
No one in the meeting knew the bot was there. Patient names, diagnoses, and treatment details were now sitting in inboxes of people who had no business seeing them—and on Otter's servers, where the company's privacy policy allows use of recordings to train their AI models.
This incident captures why AI is genuinely different from previous healthcare technology. It's not just a faster fax machine or a better database. AI tools can act autonomously, join meetings without invitation, record without visible indication, and transmit data to external servers in ways that bypass every traditional safeguard. The regulatory framework—HIPAA, designed in 1996 for paper charts and fax machines—wasn't built for software that learns from every input and can take actions on its own.
This module will help you understand where the real risks are. We'll cover the basics of HIPAA and PHI, but the goal isn't comprehensive compliance training—it's developing intuition for the edge cases that matter when AI enters clinical workflows. You'll learn to recognize three categories of PHI exposure that escalate in subtlety: direct PHI (the obvious identifiers), indirect PHI (data that becomes identifying when combined), and shadow AI (the untracked tools that create compliance blind spots—like that Otter bot).
What Makes AI Different
Traditional healthcare IT systems—EHRs, billing systems, scheduling software—are essentially sophisticated databases. They store what you put in, retrieve what you ask for, and follow deterministic rules. If patient data leaks, it's usually because of a configuration error, a hack, or a human mistake. The system itself doesn't learn, doesn't act autonomously, and doesn't transmit data unless explicitly programmed to.
AI systems are fundamentally different in ways that matter for privacy: they learn from what you put in, can act without an explicit command, and send data to external servers by design. One difference compounds all the others.
The Velocity Problem
New AI tools appear weekly. Employees adopt them because they genuinely help—who wouldn't want to cut documentation time in half? But each tool potentially creates a new data flow, a new vendor relationship, and a new compliance gap. Traditional IT governance, with its months-long procurement cycles, can't keep pace. By the time a tool is formally evaluated, half the staff may already be using it.
HIPAA Fundamentals for AI
Who HIPAA Actually Covers (and Who It Doesn't)
HIPAA is an entity-based framework, not a data-based one. This distinction is critical. The law doesn't protect "health data"—it protects health data held by specific types of organizations.
| Category | Examples | HIPAA Coverage |
|---|---|---|
| Covered Entities | Health plans, clearinghouses, providers who transmit electronically | Yes |
| Business Associates | AI vendors with signed BAA | Yes (via BAA) |
| Consumer Apps | Health apps, fitness trackers, wearables | No |
| AI Without BAA | ChatGPT (consumer), most free AI tools | No |
The gap this creates: When a patient enters symptoms into a consumer health app, or a clinician pastes notes into ChatGPT without a BAA, that data isn't protected by HIPAA. The FTC has stepped into some of this gap with the Health Breach Notification Rule, but enforcement is inconsistent and the protections are narrower.
The Three HIPAA Rules That Matter for AI
Privacy Rule: Governs how PHI can be used and disclosed. PHI can generally only be used for treatment, payment, and healthcare operations without explicit patient authorization. Using PHI to train commercial AI models typically requires either authorization or de-identification.
Security Rule: Requires administrative, physical, and technical safeguards for electronic PHI. In January 2025, HHS proposed the first major update in 20 years, mandating encryption, multi-factor authentication, and 72-hour disaster recovery. AI systems processing ePHI will face these enhanced standards.
Breach Notification Rule: Requires notification within 60 days of discovering a breach of unsecured PHI. From 2018 to 2023, large breaches increased 102% and the number of individuals affected increased 1,002%. An AI system that inadvertently exposes PHI triggers these requirements.
Direct PHI in AI Systems
Direct PHI includes the 18 categories of identifiers that HIPAA's Safe Harbor de-identification method requires you to remove. These are the obvious markers that link data to individuals.
The 18 Safe Harbor Identifiers
| Identifier Category | AI-Specific Considerations |
|---|---|
| Names | May appear in transcription, NLP outputs, training data |
| Geographic data smaller than state | ZIP codes in combined datasets; geolocation in app data |
| Dates (except year) related to individual | Visit timestamps, DOB, admission/discharge dates |
| Phone/fax numbers, email addresses | Contact info in scheduling data, patient portals |
| SSN, medical record numbers | Often embedded in EHR exports used for training |
| Device identifiers, IP addresses | Logged by AI systems, app analytics, telehealth platforms |
| URLs, biometric identifiers, photos | CT/MRI reconstructions, voice prints, facial images |
| Any unique identifying number/code | Patient IDs, encounter numbers, prescription IDs |
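To make the table concrete, here is a minimal sketch of pattern-based scrubbing for a few of these categories. It is illustrative only, and the patterns and function name are assumptions: regular expressions catch well-formatted identifiers such as SSNs, phone numbers, and emails, but miss names, street addresses, and identifiers written out in prose, which is one reason pattern matching alone does not satisfy Safe Harbor.

```python
import re

# Illustrative patterns for a few of the 18 categories (hypothetical formats).
# This alone is NOT sufficient for HIPAA de-identification: names, street
# addresses, and identifiers embedded in narrative text routinely slip through.
PATTERNS = {
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "mrn":   re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),  # assumes a local MRN format
    "date":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}

def scrub_direct_identifiers(text: str) -> str:
    """Replace pattern-matchable direct identifiers with category tags."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

note = "Pt Jane Doe, MRN# 4827164, DOB 03/14/1961, cell 416-555-0134, jdoe@example.com."
print(scrub_direct_identifiers(note))
# -> Pt Jane Doe, [MRN], DOB [DATE], cell [PHONE], [EMAIL].
```

Notice that the patient's name sails through untouched. That failure mode is exactly why dedicated de-identification tooling, and Expert Determination where appropriate, exists.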
Where Direct PHI Enters AI Systems
Training Data: When AI models are trained on clinical data, PHI may be embedded in the model weights themselves. This creates a form of data persistence that's difficult to audit and impossible to fully "delete." Model inversion attacks have demonstrated the ability to extract training data from some models.
Inference-Time Inputs: When clinicians paste patient notes into AI tools, that data is transmitted to external servers. Even if the vendor promises not to use it for training, it may be logged, cached, or retained for abuse monitoring. Most consumer AI tools retain data for 30+ days.
Generated Outputs: AI-generated clinical notes, summaries, and recommendations become PHI themselves. If an ambient scribe generates a SOAP note, that output requires the same protections as a manually-written note.
The BAA Requirement
Before any AI tool touches PHI, a Business Associate Agreement must be in place. The BAA establishes the vendor as a business associate, binding them to HIPAA requirements including safeguards, use restrictions, and breach notification.
A vendor's claim to be "HIPAA compliant" means nothing without a signed BAA. The FTC has taken enforcement actions against companies making false HIPAA compliance claims (GoodRx, BetterHelp). Always verify with documentation.
Current BAA Availability by Platform
- OpenAI API: BAA available for zero-retention endpoints (contact baa@openai.com)
- ChatGPT (consumer): No BAA available. Cannot be used with PHI.
- ChatGPT Enterprise/Edu: BAA available through sales-managed accounts
- Anthropic API: BAA available only for HIPAA-eligible services with zero data retention; does not cover Claude.ai, Workbench, or beta features
- Claude (consumer/Pro/Team): No BAA available. Cannot be used with PHI.
- Azure OpenAI: BAA included in Microsoft Product Terms by default
- Google Vertex AI: BAA available for healthcare customers
- Google Workspace (Gemini): BAA available via Admin console; requires configuration. Note: NotebookLM is NOT covered.
- AWS Bedrock: HIPAA-eligible service; BAA available through AWS Business Associate Addendum
- AWS HealthScribe: HIPAA-eligible service with BAA coverage
- OpenEvidence: HIPAA compliant; free BAA available for covered entities
- Glass Health: HIPAA-compliant enterprise offering; contact for BAA
- Purpose-built scribes (Freed, Suki, Abridge, etc.): Generally offer BAAs; verify before deployment
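One way to make a list like this operational is a default-deny allow-list keyed on documented BAA status. The sketch below is illustrative; the entries are a snapshot of the list above and will go stale, so the authoritative source should always be your organization's signed agreements, not a hard-coded table.

```python
# Illustrative allow-list keyed on documented BAA status (snapshot of the
# list above; verify against your organization's signed agreements).
BAA_STATUS = {
    "chatgpt-consumer": False,
    "chatgpt-enterprise": True,   # via sales-managed account
    "openai-api": True,           # zero-retention endpoints under BAA
    "claude-consumer": False,
    "anthropic-api": True,        # HIPAA-eligible services with zero retention only
    "azure-openai": True,
    "google-vertex-ai": True,
    "notebooklm": False,          # not covered by the Workspace BAA
}

def phi_use_permitted(tool: str) -> bool:
    """Default-deny: an unknown tool is treated as having no BAA."""
    return BAA_STATUS.get(tool.lower(), False)

for tool in ("chatgpt-consumer", "azure-openai", "some-new-scribe"):
    verdict = "may handle PHI (verify BAA scope first)" if phi_use_permitted(tool) else "no PHI"
    print(f"{tool}: {verdict}")
```

The default-deny lookup is the important design choice: a tool nobody has vetted is treated the same as one that has been vetted and rejected.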
Indirect PHI and Re-identification Risk
This is where the regulatory framework shows its age. HIPAA's Safe Harbor method was designed before modern machine learning, before data brokers, before the explosion of auxiliary datasets that make re-identification increasingly feasible.
The Mosaic Effect
The mosaic effect describes how individually benign data points become identifying when combined. In work published between 1997 and 2000, researcher Latanya Sweeney demonstrated that 87% of Americans could be uniquely identified using just three data points: ZIP code, birth date, and gender. She famously re-identified Massachusetts Governor William Weld's medical records from "anonymized" hospital data by linking it to publicly available voter rolls.
For AI systems, this creates a fundamental tension. Machine learning thrives on rich, detailed data. De-identification that's sufficient to prevent re-identification often strips the clinical utility that makes the data valuable for training or analysis.
How Many Data Points Does It Take?
The research on re-identification is sobering. Multiple studies have consistently shown that 3-5 indirect identifiers are typically sufficient to re-identify individuals from medical records, especially when combined with publicly available datasets.
- Sweeney (1997, 2000): Demonstrated that 87% of the U.S. population can be uniquely identified by just 3 variables—ZIP code, birth date, and gender. Using only publicly available voter registration data, she re-identified the Massachusetts Governor's medical records.
- Golle (2006): Found that combining gender, ZIP code, and birth date uniquely identifies 63% of the U.S. population. Adding a fourth variable (like race or marital status) increases this substantially.
- Narayanan & Shmatikov (2008): Re-identified Netflix users by combining just 2-8 movie ratings with timestamps against public IMDB reviews. The same technique applies to healthcare—sparse data points combined with auxiliary information.
- El Emam et al. (2011): Systematic review of re-identification attacks found that records with as few as 3 quasi-identifiers were vulnerable, with success rates ranging from 10-35% depending on the external dataset used.
- Rocher et al. (2019): Using machine learning on a dataset with just 15 demographic attributes, correctly re-identified 99.98% of Americans. Even incomplete datasets with fewer attributes achieved 83% accuracy with only 3-4 data points.
Individual data points look safe. Combined, they form a fingerprint. This is why HIPAA's Safe Harbor method requires both removing all 18 direct identifiers and confirming that no actual knowledge exists that the remaining information could identify individuals.
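A quick way to see this fingerprint effect in your own data is to count how many records are unique on a handful of quasi-identifiers. The sketch below is a minimal illustration using pandas, with hypothetical columns and toy values; a real assessment would use formal k-anonymity analysis or Expert Determination.

```python
import pandas as pd

# Toy extract with three quasi-identifiers and no direct identifiers
# (hypothetical values; ZIPs kept as strings to preserve leading zeros).
df = pd.DataFrame({
    "zip":    ["02139", "02141", "10027", "10025", "60615"],
    "dob":    ["1961-03-14", "1961-09-02", "1987-07-30", "1987-02-11", "1955-11-30"],
    "gender": ["F", "F", "M", "M", "F"],
})

def uniqueness_report(frame: pd.DataFrame, quasi_ids: list[str]) -> None:
    """Report the smallest group size (k) and how many quasi-identifier
    combinations correspond to exactly one record."""
    sizes = frame.groupby(quasi_ids).size()
    unique = int((sizes == 1).sum())
    print(f"{quasi_ids}: k = {sizes.min()}, {unique} of {len(sizes)} combinations are unique")

uniqueness_report(df, ["zip", "dob", "gender"])        # every record is unique: k = 1

# Generalizing (year of birth only, 3-digit ZIP prefix as in Safe Harbor)
# trades analytic detail for safety: fewer records remain unique.
df["birth_year"] = df["dob"].str[:4]
df["zip3"] = df["zip"].str[:3]
uniqueness_report(df, ["zip3", "birth_year", "gender"])
```

Generalization shrinks the number of unique records, but at the cost of exactly the detail that makes the data clinically useful, which is the tension described above.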
The implication for AI is clear: when you paste a clinical scenario into ChatGPT or any consumer AI tool, the combination of age, gender, diagnosis, medications, and timeline may be sufficient to identify your patient—even if you've removed their name.
Quasi-Identifiers in Healthcare Data
Even after removing the 18 Safe Harbor identifiers, healthcare data contains numerous quasi-identifiers that can enable re-identification when combined with external data sources:
- Rare diagnoses: A unique combination of conditions may identify a patient within a population
- Treatment patterns: Specific medication sequences or surgical combinations
- Temporal patterns: Visit frequency, hospitalization duration, time between events
- Provider relationships: Combination of specialists seen
- Demographic combinations: Age range + gender + ethnicity + approximate location
- Free-text narratives: Clinical notes often contain identifying details even after automated scrubbing with named-entity recognition (NER)
AI-Specific Re-identification Risks
Image Reconstruction: CT and MRI scans contain sufficient geometric information to reconstruct facial features. A "de-identified" head CT can potentially be matched to a photograph using 3D reconstruction techniques. AI dramatically improves the feasibility of such attacks.
Voice Prints: Voice recordings—including those captured by ambient scribes—are explicitly listed as biometric identifiers under HIPAA. Speaker identification models can match voices across recordings, potentially linking "anonymized" research data to identified individuals.
Training Data Extraction: Model inversion and membership inference attacks can potentially extract or verify the presence of specific records in training data. This transforms the model itself into a form of data storage that may be subject to HIPAA requirements.
The Safe Harbor List Is Outdated
The 18 Safe Harbor identifiers were fixed when the Privacy Rule was written and haven't been meaningfully updated since. They don't account for:
- Social media: Platforms that didn't exist and now contain vast amounts of personally identifying information
- Commercial data brokers: Companies that aggregate consumer data from thousands of sources
- Genetic data: DNA sequences that are uniquely identifying and increasingly available through consumer testing
- Location history: Smartphone data that can uniquely identify individuals through movement patterns
- Wearable data: Health metrics from fitness trackers that create unique biometric signatures
- AI capabilities: Machine learning tools that can find patterns across datasets at scale impossible in 1996
Shadow AI and Stealth PHI Exposure
Shadow AI is the healthcare equivalent of shadow IT: employees using AI tools without organizational approval, oversight, or integration into compliance frameworks. It's the fastest-growing and least-controlled category of PHI exposure.
The Scale of the Problem
Research indicates that nearly 95% of healthcare organizations believe their staff are already using generative AI in email or content workflows, and 62% of leaders have directly observed employees using unsanctioned tools. Yet a quarter of organizations have not formally approved any AI use—meaning staff are acting without oversight, outside compliance frameworks, and without BAAs in place.
Shadow AI incidents account for 20% of AI-related security breaches—7 percentage points higher than incidents involving sanctioned AI.
Want proof that AI is a work tool, not a toy? Similarweb data shows ChatGPT weekday usage is 50-60% higher than weekend usage. The pattern is so consistent it creates a "sawtooth" graph—traffic spikes Monday through Friday, then drops every Saturday and Sunday.
This mirrors what healthcare organizations are seeing: clinicians use AI during working hours, for work tasks, to solve work problems. They're not browsing ChatGPT for fun—they're using it to write notes, draft letters, and look up information. Which means every weekday, AI tools are processing work content. The question is whether that content includes PHI, and whether the tools have BAAs.
Anatomy of the Ontario Hospital Breach
Let's return to the Otter.ai incident, because it illustrates almost every shadow AI failure mode:
| Failure | What Happened |
|---|---|
| Personal device, work data | Physician installed Otter on personal device using personal email still on meeting invite list |
| No offboarding process | Physician left in June 2023 but remained on meeting invites until breach in September 2024—over 15 months |
| Autonomous AI action | Otter's "notetaker bot" joined the meeting automatically based on calendar invite. No one clicked anything. |
| No visibility | Participants didn't notice the bot until emails went out. PHI already transmitted to Otter's servers. |
| Incomplete remediation | Of 65 recipients, only 53 confirmed deletion. Data remained on Otter's servers for potential model training. |
Common Shadow AI Scenarios
Clinical Documentation: A physician pastes patient notes into ChatGPT to generate a summary or draft a letter. They've just transmitted PHI to a platform without a BAA, potentially violating HIPAA.
Administrative Tasks: A clinic manager uploads patient scheduling data to an AI tool for analysis. A billing specialist uses AI to help with denial appeals. Each creates an untracked data flow to unvetted platforms.
The Productivity Trap: A physician discovers ChatGPT can generate patient summaries in seconds. They start with de-identified summaries, then gradually include more context, then patient names "just this once" when running late. By the time anyone notices, months' worth of PHI has been transmitted.
Why Shadow AI Happens
- Productivity pressure: Clinicians are drowning in documentation. AI tools offer genuine time savings.
- Approval friction: Sanctioned AI tools require lengthy procurement. Free tools are available immediately.
- Awareness gaps: Many users don't understand that pasting text into an AI tool constitutes data transmission to an external server.
- Tool limitations: Approved tools may not meet user needs, pushing staff to seek alternatives.
- Embedded AI: AI features are increasingly embedded in approved tools (CRM systems, email clients) without separate vetting.
Governance Approaches
Industry experts advise against blanket bans—they don't work and push usage further underground. Instead:
- Provide alternatives: Deploy enterprise AI tools with BAAs, security controls, and logging. Make the approved path as convenient as the shadow path.
- Create fast-track approval: Establish a streamlined process for evaluating new AI tools. Reduce the friction that drives shadow usage.
- Educate continuously: Help staff understand why the restrictions exist and what's at stake. Focus on the "why" rather than just the "don't."
- Monitor adaptively: Use tools that can detect AI usage (a minimal detection sketch follows this list). Treat detections as opportunities for guidance, not punishment.
- Listen to shadow users: Shadow AI reveals unmet needs. Use it as free market research—what are people trying to accomplish?
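As one concrete example of adaptive monitoring, the sketch below scans web-proxy log lines for traffic to well-known consumer AI endpoints. The log format, field positions, and domain list are all assumptions for illustration; real deployments would lean on your proxy or CASB vendor's reporting, and detections should feed coaching conversations rather than automatic discipline.

```python
from collections import Counter

# Hypothetical list of domains for consumer AI tools without a BAA in your
# environment; illustrative only, extend or trim to match reality.
WATCHED_DOMAINS = ("chatgpt.com", "chat.openai.com", "claude.ai", "otter.ai", "gemini.google.com")

def scan_proxy_log(lines):
    """Count requests per (user, AI domain). Assumes whitespace-delimited
    lines of the form: <timestamp> <user> <destination-host> <bytes>."""
    hits = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        user, host = parts[1], parts[2]
        if any(host == d or host.endswith("." + d) for d in WATCHED_DOMAINS):
            hits[(user, host)] += 1
    return hits

sample_log = [
    "2025-03-03T09:12:44 jsmith chatgpt.com 48213",
    "2025-03-03T09:13:02 jsmith chatgpt.com 9120",
    "2025-03-03T10:41:19 mlee otter.ai 220845",
    "2025-03-03T11:05:53 rpatel intranet.hospital.local 1022",
]

for (user, host), count in scan_proxy_log(sample_log).most_common():
    print(f"{user} -> {host}: {count} request(s); follow up with guidance, not blame")
```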
What You Should Actually Do
Before Deploying Any AI System
- Map the data flows: Where does PHI enter the system? Where is it stored? Where does it go? Who has access? (A minimal inventory sketch follows this checklist.)
- Verify BAA status: Is a Business Associate Agreement in place? What does it actually cover? Does it address AI-specific risks?
- Assess minimum necessary: Does the AI need all the data being provided? Can inputs be limited to what's actually required?
- Evaluate de-identification: If using de-identified data, which method was used? Has re-identification risk been assessed in light of AI capabilities?
- Review data retention: How long does the vendor retain inputs? Are they used for model training? What happens to audit logs?
- Plan for breach: If PHI is exposed through this system, what's the notification plan? Who's responsible for detection?
- Document everything: Risk assessments, vendor evaluations, configuration decisions, training records. The documentation is your defense.
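One lightweight way to make the first two checklist items auditable is to keep a structured inventory of every AI data flow. The dataclass below is a minimal sketch with hypothetical fields and an invented example vendor; most organizations would track this in a GRC or vendor-management system rather than in code.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AIDataFlow:
    """One row in an AI data-flow inventory (illustrative fields only)."""
    tool: str
    vendor: str
    phi_categories: list[str]      # e.g. names, MRNs, free-text notes, audio
    entry_point: str               # where PHI enters: EHR export, dictation, paste
    storage_location: str          # region / jurisdiction
    baa_signed: bool
    baa_review_date: date | None = None
    used_for_model_training: bool = False
    retention_days: int | None = None

    def deployment_blockers(self) -> list[str]:
        """Open issues that should block go-live until resolved."""
        blockers = []
        if not self.baa_signed:
            blockers.append("no signed BAA")
        if self.used_for_model_training:
            blockers.append("inputs used for vendor model training")
        if self.retention_days is None:
            blockers.append("retention period not documented")
        return blockers

scribe = AIDataFlow(
    tool="Ambient scribe (example)", vendor="ExampleVendor",
    phi_categories=["audio", "names", "diagnoses"],
    entry_point="exam-room dictation", storage_location="US-East",
    baa_signed=True, baa_review_date=date(2025, 1, 15), retention_days=30,
)
print(scribe.deployment_blockers() or "no blockers recorded")
```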
Vendor Evaluation Questions
When evaluating AI vendors for healthcare use, ask:
- Will you sign our BAA (or provide your standard BAA for review)?
- Where is data processed and stored? Which jurisdictions?
- Is any data used to train or improve your models? How can we opt out?
- What encryption is used in transit and at rest?
- How long is data retained? Can we specify retention limits?
- What audit logging is available? Who can access logs?
- What's your incident response process? What's the notification timeline?
- Have you completed SOC 2 Type II? Can we review the report?
We're operating in a gap between what AI can do and what regulators have addressed. HIPAA wasn't written for neural networks that learn from every input. The safest approach is conservative: treat AI systems that touch patient data as high-risk by default, require BAAs and security verification before deployment, and build governance structures that can adapt as both capabilities and regulations evolve.
Upload this module's readings to a NotebookLM notebook and explore:
- Ask it to summarize the key differences between Safe Harbor and Expert Determination de-identification
- Generate a checklist for evaluating AI vendor compliance
- Create an Audio Overview comparing the Ontario breach to other shadow AI scenarios
Remember: NotebookLM itself is grounded to your uploaded sources—a practical example of how constrained AI tools can be safer for sensitive content.
Readings
Reflection Questions
- Think about AI tools you or your colleagues currently use. Do any of them involve patient data? Is there a BAA in place?
- If you discovered a colleague was using ChatGPT with patient notes to save time on documentation, how would you approach that conversation?
- Consider the mosaic effect: what combinations of data in your organization might allow re-identification even after Safe Harbor de-identification?
- How would you design an AI governance process that reduces shadow AI while supporting clinician productivity?
Learning Objectives
- Explain why AI creates genuinely new privacy challenges compared to traditional healthcare IT
- Identify who HIPAA covers and recognize the coverage gaps that affect AI tools
- List the 18 Safe Harbor identifiers and explain their limitations in the AI era
- Describe the mosaic effect and quasi-identifiers that enable re-identification
- Recognize shadow AI patterns and explain why blanket bans are ineffective
- Apply a practical checklist for evaluating AI vendors before PHI exposure