Genetic privacy

Genetic privacy is the concept of personal privacy concerning the storage, repurposing, provision to third parties, and display of one's genetic information. The concept also encompasses privacy regarding the ability to identify specific individuals by their genetic sequence, and the potential to learn specific characteristics of a person from portions of their genetic information, such as their propensity for specific diseases or their immediate or distant ancestry.

With the public release of genome sequence information of participants in large-scale research studies, questions regarding participant privacy have been raised. In some cases, it has been shown that it is possible to identify previously anonymous participants from large-scale genetic studies that released gene sequence information.

Genetic privacy concerns also arise in the context of criminal law because the government can sometimes overcome criminal suspects' genetic privacy interests and obtain their DNA sample. Due to the shared nature of genetic information between family members, this raises privacy concerns of relatives as well.

As concerns and issues of genetic privacy are raised, regulations and policies have been developed in the United States both at a federal and state level.

Significance of genetic information
In the majority of cases, an individual's genetic sequence is considered unique to that individual. One notable exception to this rule in humans is the case of identical twins, who have nearly identical genome sequences at birth. In the remainder of cases, one's genetic fingerprint is considered specific to a particular person and is regularly used to identify individuals when establishing innocence or guilt in legal proceedings via DNA profiling. Specific gene variants in one's genetic code, known as alleles, have been shown to have strong predictive effects on the occurrence of diseases, such as mutations in the BRCA1 and BRCA2 genes in breast cancer and ovarian cancer, or in the PSEN1, PSEN2, and APP genes in early-onset Alzheimer's disease. Additionally, gene sequences are passed down with a regular pattern of inheritance between generations, and can therefore reveal one's ancestry via genealogical DNA testing. With knowledge of the sequences of one's biological relatives, traits can also be compared to determine relationships between individuals, or the lack thereof, as is often done in DNA paternity testing. As such, one's genetic code can be used to infer many characteristics about an individual, including many potentially sensitive subjects such as:


 * Parentage / Non-paternity
 * Consanguinity
 * Adoptive Status
 * Ancestry
 * Propensity for Disease
 * Predicted Physical Characteristics
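
The relationship inference described above, as used in DNA paternity testing, can be illustrated with a minimal sketch. It assumes a deliberately simplified rule: at every tested locus a child must share at least one allele with a biological parent, ignoring mutations, genotyping errors, and the statistical weighting that real tests apply. The function name and all genotype values are invented for illustration; only the locus names correspond to real markers.

```python
# Hypothetical STR genotypes: locus name -> pair of allele repeat counts.
# The values below are invented for illustration.
Genotype = dict[str, tuple[int, int]]


def inconsistent_loci(child: Genotype, alleged_parent: Genotype) -> list[str]:
    """Return the loci at which the child shares no allele with the alleged parent."""
    mismatches = []
    for locus in sorted(child.keys() & alleged_parent.keys()):
        if not set(child[locus]) & set(alleged_parent[locus]):
            mismatches.append(locus)
    return mismatches


child = {"D8S1179": (12, 14), "TH01": (6, 9), "FGA": (21, 24)}
candidate_a = {"D8S1179": (14, 15), "TH01": (9, 9), "FGA": (20, 21)}
candidate_b = {"D8S1179": (10, 11), "TH01": (7, 8), "FGA": (21, 22)}

print(inconsistent_loci(child, candidate_a))  # [] -> not excluded by this check
print(inconsistent_loci(child, candidate_b))  # ['D8S1179', 'TH01'] -> inconsistent at two loci
```

An empty result does not prove parentage; it only means the simplified check found no exclusion at the loci compared.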

Sources of genetic information
Common specimen types for direct-to-consumer genetic testing are cheek swabs and saliva samples. One of the most popular reasons for at-home genetic testing is to obtain information on an individual's ancestry via genealogical DNA testing, which is offered by many companies such as 23andMe, AncestryDNA, Family Tree DNA, and MyHeritage. Other tests provide consumers with information on genes that influence the risk of specific diseases, such as late-onset Alzheimer's disease or celiac disease.

Privacy Breaches
Studies have shown that genomic data is not immune to adversarial attacks. A study conducted in 2013 revealed vulnerabilities in the security of public databases that contain genetic data. As a result, research subjects could sometimes be identified by their DNA alone. Although reports of premeditated breaches outside of experimental research are disputed, researchers suggest the vulnerability is still important to study.

While accessible genomic data has been pivotal in advancing biomedical research, it also increases the possibility of exposing sensitive information. A common practice in genomic medicine to protect patient anonymity is to remove patient identifiers. However, de-identified data is not afforded the same protections as the research subjects themselves. Furthermore, there is an increasing ability to re-identify patients and their genetic relatives from their genetic data.

One study demonstrated re-identification by piecing together genomic data from short tandem repeats (e.g. CODIS), SNP allele frequencies (e.g. ancestry testing), and whole-genome sequencing. Its authors also hypothesized that a patient's genetic information, ancestry testing, and social media could be combined to identify relatives. Other studies have echoed the risks of linking genomic information with public data, such as social media, voter registries, web searches, and personal demographics, or with controlled data, such as personal medical records.

There is also controversy regarding the responsibility a DNA testing company has to ensure that leaks and breaches do not happen. Determining who legally owns the genomic data, the company or the individual, is of legal concern. There have been published examples of personal genome information being exploited, as well as indirect identification of family members. Additional privacy concerns, such as genetic discrimination, loss of anonymity, and psychological impacts, have been increasingly pointed out by the academic community as well as government agencies.

Law Enforcement
For criminal justice and privacy advocates, the use of genetic information to identify suspects in criminal investigations is worrisome under the Fourth Amendment to the United States Constitution, especially when an indirect genetic link connects an individual to crime scene evidence. Since 2018, law enforcement officials have been harnessing the power of genetic data to revisit cold cases with DNA evidence. Suspects discovered through this process are not directly identified by the input of their DNA into established criminal databases, like CODIS. Instead, suspects are identified through familial genetic sleuthing: law enforcement submits crime scene DNA evidence to genetic database services that link users whose DNA similarity indicates a family connection. Officers can then track the newly identified suspect in person, waiting to collect discarded trash that might carry DNA in order to confirm the match.

Despite the privacy concerns of suspects and their relatives, this procedure is likely to survive Fourth Amendment scrutiny. Much like donors of biological samples in cases of genetic research, criminal suspects do not retain property rights in abandoned waste; they can no longer assert an expectation of privacy in the discarded DNA used to confirm law enforcement suspicions, thereby eliminating their Fourth Amendment protection in that DNA. Additionally, the genetic privacy of relatives is likely irrelevant under current caselaw since Fourth Amendment protection is “personal” to criminal defendants.

Psychological Impact
In a systematic review of perspectives toward genetic privacy, researchers highlight some of the concerns individuals hold regarding their genetic information, such as the potential dangers and effects on themselves and family members. Academics note that participating in biomedical research or genetic testing has implications beyond the participant; it can also reveal information about genetic relatives. The review also found that people expressed concerns about which body controls their information and whether their genetic information could be used against them.

Additionally, the American Society of Human Genetics has expressed concerns about genetic tests in children. It suggests that testing could lead to negative consequences for the child. For example, if a child's likelihood of adoption were influenced by genetic testing, the child might suffer from self-esteem issues. A child's well-being might also suffer due to paternity testing or custody battles that require this type of information.

Regulations
When access to genetic information is regulated, insurance companies and employers can be prevented from obtaining such data. This could avoid discrimination, which often leaves an individual whose information has been breached without a job or without insurance.

Federal Regulation
In the United States, biomedical research involving human subjects is governed by a baseline standard of ethics known as the Common Rule, which aims to protect a subject's privacy by requiring "identifiers" such as name or address to be removed from collected data. A 2012 report by the Presidential Commission for the Study of Bioethical Issues stated, however, that "what constitutes 'identifiable' and 'de-identified' data is fluid and that evolving technologies and the increasing accessibility of data could allow de-identified data to become re-identified". In fact, research has already shown that it is "possible to discover a study participant's identity by cross-referencing research data about him and his DNA sequence … [with] genetic genealogy and public-records databases". This has led to calls for policy-makers to establish consistent guidelines and best practices for the accessibility and usage of individual genomic data collected by researchers.

Privacy protections for genetic research participants were strengthened by provisions of the 21st Century Cures Act (H.R.34), passed on 7 December 2016, for which the American Society of Human Genetics (ASHG) commended Congress, Senator Warren, and Senator Enzi.

The Genetic Information Nondiscrimination Act of 2008 (GINA) protects the genetic privacy of the public, including research participants. GINA makes it illegal for health insurers or employers to request or require genetic information of an individual or of family members, and further prohibits the discriminatory use of such information. This protection does not extend to other forms of insurance such as life insurance.

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) also provides some genetic privacy protections. HIPAA defines health information to include genetic information, which restricts with whom health providers can share the information.

State Regulation
Three kinds of laws are frequently associated with genetic privacy: those relating to informed consent and property rights, those preventing insurance discrimination, and those prohibiting employment discrimination. According to the National Human Genome Research Institute, forty-one states have enacted genetic privacy laws as of January 2020. However, those privacy laws vary in the scope of protection offered; while some laws "apply broadly to any person" others apply "narrowly to certain entities such as insurers, employers, or researchers."

Arizona, for example, falls in the former category and offers broad protection. Currently, Arizona's genetic privacy statutes focus on the need for informed consent to create, store, or release genetic testing results, but a pending bill would amend the state genetic privacy law framework to grant exclusive property rights in genetic information derived from genetic testing to all persons tested. In expanding privacy rights by including property rights, the bill would grant persons who undergo genetic testing greater control over their genetic information. Arizona also prohibits insurance and employment discrimination on the basis of genetic testing results.

New York State also has strong legislative measures protecting individuals from genetic discrimination. Section 79-I of the New York Civil Rights Law places strict restrictions on the usage of genetic data. The statute also outlines the proper conditions for consenting to genetic data collection or usage.

California similarly offers a broad range of protection for genetic privacy, but it stops short of granting individuals property rights in their genetic information. While currently enacted legislation focuses on prohibiting genetic discrimination in employment and insurance, a piece of pending legislation would extend genetic privacy rights to provide individuals with greater control over genetic information obtained through direct-to-consumer testing services like 23andMe.

Florida passed House Bill 1189, a DNA privacy law that prohibits insurers from using genetic data, in July 2020.

On the other hand, Mississippi offers few genetic privacy protections beyond those required by the federal government. In the Mississippi Employment Fairness Act, the legislature recognized the applicability of the Genetic Information Nondiscrimination Act, which "prohibit[s] discrimination on the basis of genetic information with respect to health insurance and employment."

Other
To balance data sharing with the need to protect the privacy of research subjects, geneticists are considering moving more data behind controlled-access barriers, authorizing trusted users to access the data from many studies rather than "having to obtain it piecemeal from different studies".

In October 2005, IBM became the world's first major corporation to establish a genetics privacy policy. Its policy prohibits using employees' genetic information in employment decisions.

Breaching techniques
According to a 2014 study by Yaniv Erlich and Arvind Narayanan, genetic privacy breaching techniques fall into three categories:

Identity Tracing

 * Here the aim is to link an unknown genome to the concealed identity of the data originator by accumulating quasi-identifiers (residual pieces of information that are embedded in the dataset) and gradually narrowing down the possible individuals that match the combination of these quasi-identifiers.
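
The narrowing process described for identity tracing can be sketched as successive filters over a candidate population. This is only a toy illustration: the `Person` record, its fields, the `narrow` helper, and every value below are hypothetical, and a real attack would rely on far noisier clues and much larger populations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    birth_year: int
    state: str
    surname: str


# Invented candidate population; none of these records are real.
POPULATION = [
    Person("A", 1960, "UT", "Smith"),
    Person("B", 1960, "UT", "Jones"),
    Person("C", 1975, "UT", "Smith"),
    Person("D", 1960, "CA", "Smith"),
    Person("E", 1982, "NY", "Lee"),
]


def narrow(candidates, **quasi_identifiers):
    """Keep only the candidates that match every supplied quasi-identifier."""
    return [
        p for p in candidates
        if all(getattr(p, field) == value for field, value in quasi_identifiers.items())
    ]


# Each clue is weak on its own; the combination is what singles someone out.
clues = [{"state": "UT"}, {"birth_year": 1960}, {"surname": "Smith"}]
candidates = POPULATION
sizes = [len(candidates)]
for clue in clues:
    candidates = narrow(candidates, **clue)
    sizes.append(len(candidates))

print(sizes)                         # [5, 3, 2, 1]
print([p.name for p in candidates])  # ['A']
```

The shrinking list of candidate counts shows why removing explicit identifiers alone may not be enough: the residual fields can jointly act as an identifier.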

Attribute Disclosure Attacks via DNA (ADAD)

 * Here the adversary already has access to the identified DNA sample of the target and to a database that links DNA-derived data to sensitive attributes without explicit identifiers, for example a public database of the genetic study of drug abuse. The ADAD techniques match the DNA data and associate the identity of the target with the sensitive attribute.
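
The matching step of an attribute disclosure attack can be sketched as a genotype comparison between the target's known DNA and the records of a de-identified study. The sketch below assumes genotypes are stored as counts of the alternate allele (0, 1, or 2) per SNP; the SNP names, record layout, function names, and threshold are all hypothetical, and with so few markers a real comparison would produce many chance matches.

```python
def concordance(a: dict[str, int], b: dict[str, int]) -> float:
    """Fraction of shared SNPs at which two genotypes agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[snp] == b[snp] for snp in shared) / len(shared)


def disclosed_attribute(target, records, threshold=1.0):
    """Return the attribute of the best-matching record, or None if no record
    reaches the concordance threshold."""
    if not records:
        return None
    best = max(records, key=lambda r: concordance(target, r["genotype"]))
    if concordance(target, best["genotype"]) >= threshold:
        return best["attribute"]
    return None


# Invented study records: no names, only genotypes and a sensitive attribute.
study_records = [
    {"genotype": {"snp1": 0, "snp2": 1, "snp3": 2, "snp4": 0}, "attribute": "case"},
    {"genotype": {"snp1": 1, "snp2": 1, "snp3": 0, "snp4": 2}, "attribute": "control"},
]

# The adversary's identified sample of the target (also invented).
target_genotype = {"snp1": 1, "snp2": 1, "snp3": 0, "snp4": 2}

print(disclosed_attribute(target_genotype, study_records))  # control
```

Because the target's identity is already known to the adversary, a successful match attaches the study's sensitive attribute to a named person even though the study itself contains no names.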

Completion Techniques

 * Here the adversary also knows the identity of a genomic dataset but has access only to a sanitized version without sensitive loci. The aim here is to expose the sensitive loci that are not part of the original data.

However, more recent studies have indicated new avenues for breaching genetic privacy:

Phenotype Inferences

 * Here, the goal is to use readily available phenotype information about an individual, such as physical features (or some combination thereof), to make genetic inferences. As genetic databases grow at unprecedented rates, providing larger and more comprehensive aggregates, the ability to make inferences with more probabilistic certainty greatly increases. Furthermore, the scope of potential inferences grows with expanding datasets.

Safeguards
According to a 2022 study by Zhiyu Wan et al., safeguards for genetic privacy fall into two categories:

Legal Safeguards

 * Legal safeguards include the Genetic Information Nondiscrimination Act of 2008, the Health Insurance Portability and Accountability Act of 1996, the Common Rule, the US National Institutes of Health (NIH) data sharing policy, the European Union’s General Data Protection Regulation (GDPR), US state privacy laws (e.g., California Consumer Privacy Act, California Privacy Rights Act, or Virginia Consumer Data Protection Act), self-regulation (e.g., data use agreements, privacy policies, or terms of service), and informed consent.

Technical Safeguards

 * Technical safeguards include cryptographic tools, access control, and data perturbation approaches. Specifically, cryptographic approaches include homomorphic encryption, secure multiparty computation, trusted execution environments, and blockchain, whereas data perturbation approaches include k-anonymity, Beacon services, differential privacy, and synthetic data generation.
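
As one illustration of the data perturbation approaches listed above, differential privacy is commonly realized by adding calibrated random noise to query results. The following is a minimal sketch of the Laplace mechanism applied to a counting query (for example, how many participants carry a given variant). It assumes each participant changes the count by at most 1; the function names, the count, and the privacy parameter are invented for illustration, and this is not presented as how any particular genomic database works.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random,
                  sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)   # fixed seed only to make the sketch repeatable
carriers = 42            # invented number of variant carriers in a cohort

# A single release is noisy; a smaller epsilon means more noise and more privacy.
release = private_count(carriers, epsilon=0.5, rng=rng)

# Averaging many releases shows that the noise is centred on the true count,
# which is also why repeated queries must be budgeted in practice.
noisy = [private_count(carriers, epsilon=0.5, rng=rng) for _ in range(10_000)]
mean_release = sum(noisy) / len(noisy)
```

The design trade-off is between accuracy and privacy: a larger `epsilon` gives answers closer to the true count but reveals more about whether any one individual is in the dataset.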