Protected health information

Protected health information (PHI) under U.S. law is any information about health status, provision of health care, or payment for health care that is created or collected by a Covered Entity (or a Business Associate of a Covered Entity), and can be linked to a specific individual. This is interpreted rather broadly and includes any part of a patient's medical record or payment history.

PHI in a dataset is often de-identified rather than fully anonymized. Before sharing a dataset publicly, researchers locate and remove individually identifiable PHI to preserve the privacy of research participants.

PHI exists in many forms. Historically, the most common has been physical storage in the form of paper-based personal health records (PHR). PHI is also held in electronic health records and can be collected through wearable technology and mobile applications. In recent years, concerns about the safety and privacy of PHI have grown.

United States
Under the U.S. Health Insurance Portability and Accountability Act (HIPAA), health information that can be linked to an individual through any of the following 18 identifiers must be treated with special care:


 * 1) Names
 * 2) All geographical identifiers smaller than a state, except for the initial three digits of a ZIP code if, according to the current publicly available data from the U.S. Bureau of the Census, the geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; the initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people are changed to 000
 * 3) Dates (other than year) directly related to an individual
 * 4) Phone numbers
 * 5) Fax numbers
 * 6) Email addresses
 * 7) Social Security numbers
 * 8) Medical record numbers
 * 9) Health insurance beneficiary numbers
 * 10) Account numbers
 * 11) Certificate/license numbers
 * 12) Vehicle identifiers and serial numbers, including license plate numbers
 * 13) Device identifiers and serial numbers
 * 14) Web Uniform Resource Locators (URLs)
 * 15) Internet Protocol (IP) address numbers
 * 16) Biometric identifiers, including finger, retinal and voice prints
 * 17) Full face photographic images and any comparable images
 * 18) Any other unique identifying number, characteristic, or code except the unique code assigned by the investigator to code the data
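
Items 2 and 3 in the list describe generalizing a value rather than deleting it outright, which can be illustrated with a short Python sketch. The function names and the `SMALL_ZIP3` set are hypothetical and exist only for this illustration; a real implementation would need to build that set from current U.S. Census Bureau data.

```python
from datetime import date

# Illustrative placeholder values only: three-digit ZIP prefixes whose combined
# population is 20,000 or fewer. A real list must be built from current
# U.S. Census Bureau data.
SMALL_ZIP3 = frozenset({"036", "059"})


def generalize_zip(zip_code: str, small_zip3: frozenset = SMALL_ZIP3) -> str:
    """Reduce a ZIP code to its first three digits, or "000" for sparse prefixes."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 3:
        # Assumption of this sketch: malformed input is fully suppressed.
        return "000"
    prefix = digits[:3]
    return "000" if prefix in small_zip3 else prefix


def generalize_date(d: date) -> int:
    """Keep only the year of a date related to an individual."""
    return d.year


print(generalize_zip("90210"))            # 902
print(generalize_zip("03601"))            # 000 (prefix is in SMALL_ZIP3)
print(generalize_date(date(1984, 7, 9)))  # 1984
```

The sketch covers only the ZIP code and date rules; the remaining identifiers in the list would have to be removed or handled separately.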

HIPAA Privacy Rule
The HIPAA Privacy Rule addresses the privacy and security aspects of PHI. It has three main purposes:


 * 1. To protect and enhance the rights of consumers by providing them access to their health information and controlling the inappropriate use of that information;
 * 2. To improve the quality of health care in the United States by restoring trust in the health care system among consumers, health care professionals, and the multitude of organizations and individuals committed to the delivery of care; and
 * 3. To improve the efficiency and effectiveness of health care delivery by creating a national framework for health privacy protection that builds on efforts by states, health systems, and individual organizations and individuals.

LabMD, Inc. v. Federal Trade Commission
In LabMD, Inc. v. Federal Trade Commission (FTC), the FTC filed a complaint against medical testing laboratory LabMD, Inc., alleging that the company failed to reasonably protect the security of consumers' personal data, including medical information. The FTC alleged that in two separate incidents, LabMD collectively exposed the personal information of approximately 10,000 consumers. In 2018, the U.S. Court of Appeals for the Eleventh Circuit vacated the FTC's cease-and-desist order, stating that the order would "mandate a complete overhaul of LabMD's data-security program and says little about how this is to be accomplished."

De-identification versus anonymization
Anonymization is a process in which PHI elements are eliminated or manipulated so that the original data set cannot be recovered. This involves removing all identifying data to create unlinkable data. De-identification under the HIPAA Privacy Rule occurs when data has been stripped of common identifiers by one of two methods:
 * 1. Removing the 18 specific identifiers listed above (Safe Harbor Method)
 * 2. Having an experienced statistical expert validate and document that the statistical risk of re-identification is very small (Statistical Method).

De-identified data is coded, with a link to the original, fully identified data set kept by an honest broker. Because these links exist, coded de-identified data is considered indirectly identifiable rather than anonymized. Coded de-identified data is not protected by the HIPAA Privacy Rule, but is protected under the Common Rule. The purpose of de-identification and anonymization is to allow health care data to be used at larger scale for research. Universities, government agencies, and private health care entities use such data for research, development, and marketing purposes.
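
The coded approach can be sketched in Python, assuming each record is a plain dictionary. The field names, the `IDENTIFIER_FIELDS` set, and the `code_records` function are hypothetical, and the set covers only a few of the 18 identifiers. The link table returned here stands in for what an honest broker would hold separately from the shared data.

```python
import secrets

# Hypothetical field names covering only a few of the 18 identifiers.
IDENTIFIER_FIELDS = frozenset(
    {"name", "phone", "email", "ssn", "medical_record_number"}
)


def code_records(records):
    """Split records into shareable coded records and a separate link table."""
    coded_records = []
    link_table = {}  # study code -> removed identifiers, kept by the honest broker
    for record in records:
        # The code is random, so it cannot be derived from the record itself.
        study_code = secrets.token_hex(8)
        while study_code in link_table:  # regenerate on the unlikely collision
            study_code = secrets.token_hex(8)
        link_table[study_code] = {
            k: v for k, v in record.items() if k in IDENTIFIER_FIELDS
        }
        shared = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
        shared["study_code"] = study_code
        coded_records.append(shared)
    return coded_records, link_table


records = [{"name": "Jane Doe", "ssn": "000-00-0000", "diagnosis": "asthma"}]
coded, links = code_records(records)
print(coded[0]["diagnosis"])   # asthma
print("name" in coded[0])      # False
```

Only `coded` would be shared with researchers. As long as `links` exists, the data remains indirectly identifiable, which is the distinction the text draws between coded and anonymized data.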

Covered Entities

In general, U.S. law governing PHI applies to data collected in the course of providing and paying for health care. Privacy and security regulations govern how healthcare professionals, hospitals, health insurers, and other Covered Entities use and protect the data they collect. The source of the data is as relevant as the data itself when determining if information is PHI under U.S. law. For example, sharing information about someone on the street with an obvious medical condition such as an amputation is not restricted by U.S. law. However, obtaining information about the amputation exclusively from a protected source, such as from an electronic medical record, would breach HIPAA regulations.

Business Associates

Covered Entities often use third parties to provide certain health and business services. If they need to share PHI with those third parties, it is the responsibility of the Covered Entity to put in place a Business Associate Agreement that holds the third party to the same standards of privacy and confidentiality as the Covered Entity.

Protected health information storage
Protected health information can be stored in many different forms. HIPAA imposes many requirements and limitations on how PHI can be stored.

Physical storage
Until recently, physical storage was the most common method of storing PHI. Physical safeguards for PHI include storing paper records in locked cabinets and controlling access to the records. Access to physical storage of PHI may require a security authority, PIN pad, or identification card.

Electronic records
Much PHI is stored in electronic health records (EHR). Cloud computing and other services allow healthcare providers to store vast amounts of data for easy access. For example, Kaiser Permanente has over 9 million members and stores anywhere from 25 to 44 petabytes of data. In Australia, over 90% of healthcare institutions have implemented EHRs in an attempt to improve efficiency. E-health architectures can be public, private, hybrid, or community, depending on the data stored. Healthcare providers often store their data on a vast network of remote servers, which can leave it susceptible to privacy breaches. According to one study, the U.S. could save $81 billion annually by switching to a universal EHR.

Wearable technology
Wearable technology that collects health information often comes in the form of smartwatches, ECG monitors, blood pressure monitors, and biosensors. The market has grown rapidly, with 102.4 million units shipped in 2016, up 25% from the 81.9 million units shipped in 2015. According to Insider Intelligence research, the number of health and fitness app users will remain over 84 million through 2022. Health and fitness tracking capabilities are a target for companies producing wearable technology. Privacy concerns for consumers arise when these technology companies are not considered covered entities or business associates under HIPAA, or when the health information collected is not PHI.

Mobile applications
Mobile applications have proven essential, especially for the elderly or disabled. The adoption of mobile healthcare is said to depend on factors such as patient behavior, subjective norm, personal innovativeness, perceived behavioral control, and behavioral intention. User reviews of a mobile application that stores PHI can give some indication of its legitimacy.

Patient Privacy
In one study, 14 patients were asked for their opinions on privacy concerns and healthcare perceptions. The researchers found that all participants agreed on the importance of healthcare privacy, but participants demonstrated only a vague understanding of their legislated patient privacy rights. Opinions differed on who should be responsible for protecting health information: some participants thought it was their own responsibility, while others thought the government was responsible. Consent was rarely brought up in the discussion.

Because patient privacy is the reason for regulations on PHI, consumer data for analysis can be extremely difficult to come by. Luca Bonomi and Xiaoqian Jiang developed a technique for temporal record linkage using non-protected health information. Standard linkage processes cannot incorporate the temporal dimension of records, which makes them ineffective in this setting. Bonomi and Jiang propose using a patient's non-protected health information to match records and establish patterns. This approach allows patient records to be linked using non-PHI data, giving doctors patterns and a better idea of important diagnoses.

Common Forms of Cybersecurity Attacks on PHI

 * 1) Phishing
 * 2) Eavesdropping
 * 3) Brute-force attacks
 * 4) Selective forwarding
 * 5) Sinkhole threats
 * 6) Sybil attacks
 * 7) Location threats
 * 8) Internal attacks

Attacks on PHI
From 2005 to 2019, a total of 249.09 million individuals were affected by healthcare data breaches. According to an IBM report, the average cost of a data breach in 2019 was $3.92 million across industries, while a breach in the healthcare industry typically cost $6.45 million. In the U.S., the average cost of a healthcare data breach (average breach size 25,575 records) is higher still, at $15 million.

In 2017, healthcare compliance analytics platform Protenus stated that 477 healthcare breaches were reported to the U.S. Department of Health and Human Services (HHS). Of these, 407 together affected 5.579 million patient records.

The 2018 Verizon Protected Health Information Data Breach Report (PHIDBR) examined 1,368 incidents across 27 countries, detailing that healthcare breaches mainly targeted patients, their identities, health histories, and treatment plans. According to breach data reported under HIPAA, 255.18 million people were affected by 3,051 healthcare data breach incidents from 2010 to 2019.

Health-related fraud is estimated to cost the U.S. nearly $80 billion annually. Healthcare remains the industry most targeted by data breaches and the one in which they are most costly. Healthcare companies have been criticized for failing to adapt and to prioritize data security. One reason is the leeway and minimal penalties afforded to those that fail to comply with the HIPAA Security Rule. Another is the limited competition and stable customer base within the healthcare industry. Researchers are searching for more secure ways to protect PHI.

Ethical Concerns
In the case of PHI, there are ethical concerns regarding how healthcare personnel treat information on a daily basis. HIPAA was signed into law in 1996 under the Clinton administration, and the HIPAA Privacy Rule issued under it limits a physician's ability to arbitrarily disclose patients' personal medical records.

As health artificial intelligence (AI) applications are expected to save U.S. healthcare over $150 billion annually, researchers are studying the risks of potential PHI leaks. Currently, 21% of U.S. consumers, or 57 million people, use a quantified self health and fitness tracking (QSHFT) application. In a study conducted by Nancy Brinson and Danielle Rutherford, over 90% of consumers were comfortable with the opportunity to share data with a healthcare provider. However, Brinson and Rutherford claim that consumers fail to make privacy a priority when they choose to share this information. To combat misuse of PHI on mobile healthcare platforms, Brinson and Rutherford suggest the creation of a policy rating system for consumers. A rating system monitored by the Federal Trade Commission would give consumers a centralized way to evaluate data collection methods among mobile health providers.

In 2019, the U.S. Department of Health and Human Services Office for Civil Rights (OCR) promised to enforce patients' right of access under HIPAA through its Right of Access Initiative. There have already been two settlements with the OCR under the Right of Access Initiative, after companies failed to provide patients with their medical records.