The Most Personal Information Possible: Protecting Genetic and Biometric Data

Companies, governments, and other organizations in the digital age collect massive amounts of personal information from the public. Names, addresses, personal communications, demographics--just about every personally identifying piece of information is collected and stored in a database to create a personal profile of the consumer. Despite the increasing prevalence of consumer data protection laws, genetic and biometric data may fall into gaps in regulation. Some laws, like the Colorado Privacy Act, include genetic and biometric data as protected sensitive data, but not all do. As commercial data collection expands, the collection of genetic and biometric data will become an ever-greater privacy concern.

What is genetic data and why does it need to be protected?

Genetic data, or genomic data, is information about an individual's genes. This includes the basic nucleotide (GCAT) sequence and the genetic encoding of certain traits (like gene sequences common to people of a certain geographic origin, or a sequence that contributes to a genetic health condition).

Genomic data collection is usually performed in a health context. DNA samples may be collected for biomedical research into genetic conditions or for their medical diagnosis. Genetic information is also used in criminal investigation, both to identify suspects and to exonerate those who have been falsely convicted.

DNA is the most identifying of all personal features; only identical twins share the same genetic sequence. This means a DNA sample is nearly impossible to anonymize completely. Even if the name is removed from a sample, the sample can be cross-referenced with other samples to identify the subject or their relatives. Genetic information can also reveal private details that may result in discrimination, such as ethnicity, a tendency towards certain health conditions, or genes that don't match the subject's gender.
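The cross-referencing risk described above can be illustrated with a toy sketch in Python. This is not how real genetic matching works; actual systems compare very large numbers of markers and use statistical kinship models. The marker IDs, genotypes, names, and the functions `shared_fraction` and `best_match` are all invented for illustration.

```python
# Toy illustration only: real genotype matching uses far more markers
# and statistical kinship models, not this simple overlap score.

def shared_fraction(profile_a, profile_b):
    """Return the fraction of commonly typed markers where the two profiles agree."""
    common = set(profile_a) & set(profile_b)
    if not common:
        return 0.0
    matches = sum(1 for marker in common if profile_a[marker] == profile_b[marker])
    return matches / len(common)


def best_match(anonymous_profile, reference_db):
    """Return (name, score) for the named profile most similar to the anonymous one."""
    scored = [
        (name, shared_fraction(anonymous_profile, profile))
        for name, profile in reference_db.items()
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical marker IDs and genotypes, invented for this example.
reference_db = {
    "Alice": {"m1": "AA", "m2": "AG", "m3": "CC", "m4": "TT"},
    "Bob": {"m1": "AG", "m2": "GG", "m3": "CT", "m4": "TT"},
}

# A sample with the name removed still carries its genotypes.
anonymous_sample = {"m1": "AA", "m2": "AG", "m3": "CC", "m4": "TT"}

name, score = best_match(anonymous_sample, reference_db)
print(name, score)
```

Even in this simplified form, stripping the name does nothing to stop the match, because the identifying information is the genetic data itself. A partial match against a relative's profile would point to the subject's family in a similar way.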

In the US, legal protections for genetic privacy are mixed. HIPAA protects patients from having gene samples taken for diagnosis released without consent or anonymization, but as stated above, DNA is almost impossible to anonymize. The NIH handles this concern by only allowing certain certified, trusted researchers to access its database. Thirty-three of the 50 states have some form of genomic privacy law, but these laws vary in their protections, and there is no comprehensive federal genomic privacy law. The main gap in genetic data privacy is in consumer genetic testing.

Consumer DNA testing services such as 23andMe or Ancestry, which test customers' DNA to determine genetic ancestry or predisposition to health risks, are almost entirely unregulated. Many such companies have strong privacy and consent policies, but there is no federal law requiring them to enact such policies. State privacy laws protect genetic data as sensitive information, but at the time of writing only five states have passed a comprehensive privacy law: California, Connecticut, Colorado, Utah, and Virginia. Without a strong privacy policy or regulatory control, consumer DNA testing companies can provide their customers' genetic data to third parties without consent--including law enforcement.

Surreptitious Testing

Beyond the risk of companies providing customer genetic data to third parties without consent, many data privacy frameworks ignore the threat of surreptitious DNA testing. Surreptitious testing is the testing of DNA samples without either the donor's consent or a law enforcement justification. An individual could collect someone else's DNA from a blood stain or a licked envelope and send it to be tested without the donor's knowledge.

Not all US state privacy laws prohibit this activity; some only forbid disclosure by the testing company without consent. Analyzing genetic data without donor consent falls into a legal gray area that depends on what the test is for, how the results will be used, and where the testing is done. Without a consistent law, certain jurisdictions still allow the collection and processing of genetic data without consent.

What is biometric data and why does it need to be protected?

Biometric data is data about the subject's body, in some cases defined to include genetic information. Here, it means physically identifying personal features, such as face and fingerprints. Biometric data is collected for identification purposes, such as for unlocking devices.

In the US, only Illinois, Texas, and Washington have biometric privacy laws. The California Consumer Privacy Act (CCPA) has some biometric protections, but they are less stringent than those of Illinois' Biometric Information Privacy Act (BIPA). Biometric protection laws are gaining traction, however. In 2022, California, Kentucky, Maine, Maryland, Massachusetts, Missouri, and New York all introduced biometric privacy legislation modeled after BIPA.

Europe's GDPR specifies biometric data as a type of sensitive data warranting higher protection. Under the GDPR, sensitive data can only be processed with the consent of the subject or for certain allowed purposes. The GDPR's definition of biometric data includes any personal data derived from technical processing of a person's physical, physiological, or behavioral characteristics in order to identify that person. This broad definition includes the commonly used facial and fingerprint ID systems, as well as systems that have yet to be developed or become widespread, such as analysis of motion, habits, or personality to identify a person.

Beyond those requirements, the GDPR obliges data collectors to perform a data protection impact assessment (often called a "privacy impact assessment") when processing is likely to pose a high risk to the rights and freedoms of the subjects. Collection of biometric data is considered to be one of those risky activities. Biometrics are intimate and immutable features of the body--when describing any sort of identifying data, one of the most common similes is to call it "like a fingerprint." They are often used to secure highly valuable things, such as access to devices. Beyond security concerns, it is unsettling to consider that data collectors hold a detailed record of our physical features. As such, biometric data warrants increased regulation, especially when processed at scale for large populations.

Conclusion

Genetic and biometric data are important for innovation in health and security, but they are extremely sensitive categories of information because they are highly identifying and closely tied to our own bodies. As such, they require a legal regime of protection that both allows for scientific and technological innovation and treats the legitimate privacy concerns around this data with respect. More and more jurisdictions are becoming aware of this necessity, but legal protections are still limited. In the absence of regulation, it is important for organizations to treat genetic and biometric data with the care it deserves, and for consumers to be cautious when asked to provide such sensitive personal data.

About Ardent Privacy

Ardent Privacy is an "Enterprise Data Privacy Technology" solutions provider based in the Maryland/DC region of the United States and Pune, India. Ardent harnesses the power of AI to help companies with data discovery and automated compliance with DPB (India), RBI Security Guidelines, GDPR (EU), CCPA/CPRA (California), and other global regulations by taking a data-driven approach. Ardent Privacy's solution uses machine learning and artificial intelligence to identify, inventory, map, minimize, and securely delete data in enterprises to reduce legal and financial liability.

For more information, visit https://ardentprivacy.ai/.

Ardent Privacy articles should not be considered legal advice on data privacy regulations or any other specific facts or circumstances.