Combined DNA Index System



The Combined DNA Index System (CODIS) is the United States national DNA database created and maintained by the Federal Bureau of Investigation. CODIS consists of three levels of information: Local DNA Index Systems (LDIS), where DNA profiles originate; State DNA Index Systems (SDIS), which allow laboratories within a state to share information; and the National DNA Index System (NDIS), which allows states to compare DNA information with one another.

The CODIS software contains several databases, depending on the type of information being searched against. Examples include databases for missing persons, convicted offenders, and forensic samples collected from crime scenes. Each state, and the federal system, has different laws for the collection, upload, and analysis of the information contained within its database. However, for privacy reasons, the CODIS database does not contain any personal identifying information, such as the name associated with the DNA profile. The uploading agency is notified of any hits to its samples and is tasked with disseminating personal information pursuant to its laws.

Establishment
The creation of a national DNA database within the U.S. was first mentioned by the Technical Working Group on DNA Analysis Methods (TWGDAM) in 1989. The FBI's strategic goal was to maximize the voluntary participation of states and avoid what had happened several years earlier, when eight western states, frustrated with the progress of creating a national Automated Fingerprint Identification System (AFIS) network, formed the Western Identification Network (WIN). The FBI's strategy to discourage states from creating systems that competed with CODIS was to develop DNA databasing software and provide it free of charge to state and local crime laboratories. This strategic decision, to provide software free of charge for the purpose of gaining market share, was innovative at that time and predated the browser wars. In 1990, the FBI began a pilot DNA databasing program with 14 state and local laboratories.

In 1994, Congress passed the DNA Identification Act, which authorized the FBI to create a national DNA database of convicted offenders as well as separate databases for missing persons and forensic samples collected from crime scenes. (Some in the Bureau believed the Act was not required to establish a national DNA database because the FBI's Criminal Justice Information Services Division was already using similar authorities to provide data-sharing solutions to federal, state, local, and tribal law enforcement agencies.) The DNA Identification Act also required that laboratories participating in the CODIS program maintain accreditation from an independent nonprofit organization that is actively involved in the forensic fields, and that scientists processing DNA samples for submission into CODIS maintain proficiency and be routinely tested to ensure the quality of the profiles being uploaded into the database. The national level of CODIS (NDIS) was implemented in October 1998. Today, all 50 states, the District of Columbia, federal law enforcement, the Army Laboratory, and Puerto Rico participate in the national sharing of DNA profiles.

Database structure
The CODIS database contains several different indexes for the storage of DNA profile information. Three indexes exist to assist criminal investigations: the offender index, which contains DNA profiles of those convicted of crimes; the arrestee index, which contains profiles of those arrested for crimes pursuant to the laws of the particular state; and the forensic index, which contains profiles collected from crime scenes. Additional indexes, such as the unidentified human remains index, the missing persons index, and the biological relatives of missing persons index, are used to assist in identifying missing persons. Specialty indexes also exist for specimens that do not fall into the other categories. These include the staff index, for profiles of employees who work with the samples, and the multi-allelic offender index, for single-source samples that have three or more alleles at two or more loci.
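
The multi-allelic criterion above (three or more alleles at two or more loci) is a simple counting rule. The following is a minimal sketch of that rule only; the data layout, the function name, and the sample values are invented for illustration and do not reflect how the CODIS software stores or classifies profiles.

```python
from typing import Dict, Set

# Hypothetical representation: locus name -> set of distinct alleles
# (repeat counts) observed at that locus in a single-source sample.
ObservedAlleles = Dict[str, Set[int]]


def is_multi_allelic(sample: ObservedAlleles) -> bool:
    """Return True if three or more alleles appear at two or more loci."""
    loci_with_extra = [
        locus for locus, alleles in sample.items() if len(alleles) >= 3
    ]
    return len(loci_with_extra) >= 2


# Invented example values, not real case data.
typical_sample = {"D3S1358": {15, 17}, "TH01": {6, 9}, "TPOX": {8}}
unusual_sample = {"D3S1358": {15, 16, 17}, "TH01": {6, 7, 9}, "TPOX": {8}}

print(is_multi_allelic(typical_sample))
print(is_multi_allelic(unusual_sample))
```

A sample with three alleles at only one locus would not meet the rule as stated, since two such loci are required.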

Non-criminal indexes
While CODIS is generally used for linking crimes to other crimes and potentially to suspects, there are non-criminal portions of the database, such as the missing person indexes. The National Missing Person DNA Database, also known as CODIS(mp), is maintained by the FBI at the NDIS level of CODIS, allowing all states to share information with one another. Created in 2000 using the existing CODIS infrastructure, this section of the database is designed to help identify human remains by collecting and storing DNA information on missing individuals or their relatives. Unidentified remains are processed for DNA by the University of North Texas Center for Human Identification, which is funded by the National Institute of Justice. Nuclear, Y-STR (for males only), and mitochondrial analysis can be performed on both unknown remains and known relatives in order to maximize the chance of identifying remains.

Statistics
According to the most recent figures cited, NDIS contained more than 14 million offender profiles, more than four million arrestee profiles, and more than one million forensic profiles. The effectiveness of CODIS is measured by the number of investigations aided through database hits. By the same count, CODIS had aided more than 520,000 investigations and produced more than 530,000 hits. Each state has its own SDIS database and can set its own inclusionary standards, which can be less strict than those at the national level. For this reason, a number of profiles that are present in state-level databases are not in the national database and are not routinely searched across state lines.

Scientific basis
The bulk of identifications using CODIS rely on short tandem repeats (STRs) that are scattered throughout the human genome and on statistics that are used to calculate the rarity of a specific profile in the population. STRs are a type of copy-number variation and comprise a sequence of nucleotide base pairs that is repeated over and over again. At each location tested during DNA analysis, also known as a locus (plural loci), a person has two sets of repeats, one from the father and one from the mother. Each set is measured and the number of repeat copies is recorded. If both copies, one inherited from each parent, contain the same number of repeats at a locus, the person is said to be homozygous at that locus. If the repeat numbers differ, the person is said to be heterozygous. Each distinct variant at a locus is an allele. This repeat determination is performed across a number of loci, and the resulting repeat values make up the DNA profile that is uploaded to CODIS. As of January 1, 2017, upload of known offender profiles to the national level requires 20 loci.

Alternatively, CODIS allows for the upload of mitochondrial DNA (mtDNA) information into the missing persons indexes. Since mtDNA is passed down from mother to offspring, it can be used to link remains to still-living relatives who have the same mtDNA.
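
An STR profile as described in this section can be pictured as a mapping from each locus to a pair of repeat counts. The sketch below is illustrative only: the type name, the function names, and the repeat values are invented, and this is not the CODIS data format.

```python
from typing import Dict, Tuple

# Hypothetical representation: locus name -> the two repeat counts
# observed there, one inherited from each parent.
Profile = Dict[str, Tuple[int, int]]

# Invented repeat values for three of the core loci.
example_profile: Profile = {
    "D3S1358": (15, 17),
    "TH01": (9, 9),
    "TPOX": (8, 8),
}


def zygosity(alleles: Tuple[int, int]) -> str:
    """Classify a locus by whether both repeat counts are the same."""
    first, second = alleles
    return "homozygous" if first == second else "heterozygous"


def summarize(profile: Profile) -> Dict[str, str]:
    """Return the zygosity of every locus in the profile."""
    return {locus: zygosity(alleles) for locus, alleles in profile.items()}


print(summarize(example_profile))
```

In this invented profile, D3S1358 is heterozygous because the two repeat counts differ, while TH01 and TPOX are homozygous.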

Loci


Prior to January 1, 2017, the national level of CODIS required that known offender profiles have a set of 13 loci called the "CODIS core". Since then, the requirement has expanded to include seven additional loci. Partial profiles are also allowed in CODIS in separate indexes and are common in crime scene samples that are degraded or are mixtures of multiple individuals. Upload of these profiles to the national level of CODIS requires at least eight of the core loci to be present as well as a profile rarity of 1 in 10 million (calculated using population statistics).
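
The two upload conditions for partial profiles, a minimum number of core loci and a minimum rarity, can be sketched as follows. This is a simplified illustration under stated assumptions: rarity is estimated with the basic product rule (p squared for a homozygous locus, 2pq for a heterozygous one, multiplied across loci), the allele frequencies are invented, and "core loci" is taken to mean the original 13. Actual casework statistics use published population data and further corrections that are not shown here.

```python
from math import prod
from typing import Dict, Tuple

Profile = Dict[str, Tuple[int, int]]
AlleleFreqs = Dict[str, Dict[int, float]]  # locus -> allele -> frequency

# Assumption: the eight-locus minimum is counted against the original 13.
ORIGINAL_CORE = {
    "CSF1PO", "D3S1358", "D5S818", "D7S820", "D8S1179", "D13S317",
    "D16S539", "D18S51", "D21S11", "FGA", "TH01", "TPOX", "vWA",
}


def genotype_frequency(a: int, b: int, freqs: Dict[int, float]) -> float:
    """Basic product-rule estimate: p*p if homozygous, 2*p*q otherwise."""
    p, q = freqs[a], freqs[b]
    return p * p if a == b else 2 * p * q


def random_match_probability(profile: Profile, freqs: AlleleFreqs) -> float:
    """Multiply the per-locus genotype frequencies across all loci."""
    return prod(
        genotype_frequency(a, b, freqs[locus])
        for locus, (a, b) in profile.items()
    )


def eligible_for_upload(profile: Profile, freqs: AlleleFreqs,
                        min_core: int = 8, max_rmp: float = 1e-7) -> bool:
    """Check both conditions: enough core loci and rarity of 1 in 10 million."""
    core_present = sum(1 for locus in profile if locus in ORIGINAL_CORE)
    return (core_present >= min_core
            and random_match_probability(profile, freqs) <= max_rmp)


# Invented partial profile: eight core loci, all heterozygous (10, 11),
# with made-up allele frequencies of 0.1 and 0.2 at every locus.
partial_loci = sorted(ORIGINAL_CORE)[:8]
partial_profile = {locus: (10, 11) for locus in partial_loci}
invented_freqs = {locus: {10: 0.1, 11: 0.2} for locus in partial_loci}

print(random_match_probability(partial_profile, invented_freqs))
print(eligible_for_upload(partial_profile, invented_freqs))
```

With these invented frequencies each locus contributes 2 × 0.1 × 0.2 = 0.04, so eight loci give roughly 6.6 in a trillion, well below the 1 in 10 million threshold. The same eight loci with very common alleles would fail the rarity test even though the locus count is met.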

Loci that fall within a gene are named after the gene. For example, TPOX is named after the human thyroid peroxidase gene. Loci that do not fall within genes are given a standard naming scheme for uniformity. These loci are named D + the chromosome the locus is on + S + the order in which the location on that chromosome was described. For example, D3S1358 is on the third chromosome and is the 1358th location described. The CODIS core loci are listed below; loci marked with an asterisk were added to the core in January 2017.

• CSF1PO

• D3S1358

• D5S818

• D7S820

• D8S1179

• D13S317

• D16S539

• D18S51

• D21S11

• FGA

• TH01

• TPOX

• vWA

• D1S1656*

• D2S441*

• D2S1338*

• D10S1248*

• D12S391*

• D19S433*

• D22S1045*
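
The D-number naming scheme described above is regular enough to parse mechanically. The sketch below applies it to the 20 loci listed; the function and variable names are invented for illustration, and names that do not follow the scheme (such as TPOX) are simply reported as not matching.

```python
import re
from typing import Optional, Tuple

# The 20 core loci as listed above; the last seven were added in 2017.
CODIS_CORE = [
    "CSF1PO", "D3S1358", "D5S818", "D7S820", "D8S1179", "D13S317",
    "D16S539", "D18S51", "D21S11", "FGA", "TH01", "TPOX", "vWA",
    "D1S1656", "D2S441", "D2S1338", "D10S1248", "D12S391", "D19S433",
    "D22S1045",
]

# "D" + chromosome number + "S" + order in which the location was described.
_D_NAME = re.compile(r"D(\d+)S(\d+)")


def parse_locus_name(name: str) -> Optional[Tuple[int, int]]:
    """Return (chromosome, order described), or None for other names."""
    match = _D_NAME.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


for locus in CODIS_CORE:
    print(locus, parse_locus_name(locus))
```

For D3S1358 the function returns chromosome 3 and position 1358, matching the worked example in the text. The pattern only handles numbered chromosomes, which is sufficient for the loci listed here.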

The loci used in CODIS were chosen because they are in regions of noncoding DNA, sections that do not code for proteins. These sections should not give investigators any additional information about the person, such as hair color, eye color, or race. However, advances in the understanding of genetic markers and ancestry have indicated that the CODIS loci may contain phenotypic information.

International use
While the U.S. database is not directly connected to that of any other country, the underlying CODIS software is used by other agencies around the world. The CODIS software is used by 90 international laboratories in 50 countries. International police agencies that want to search the U.S. database can submit a request to the FBI for review. If the request is reasonable and the profile being searched would meet the inclusionary standards for a U.S. profile, such as the number of loci, the profile can be searched at the national level or forwarded to any states where there is reasonable suspicion that it may be present in that level of the database.

Arrestee collection
The original purpose of the CODIS database was to build upon the sex offender registry through the DNA collection of convicted sex offenders. Over time, that purpose has expanded. Currently, all 50 states collect DNA from those convicted of felonies. A number of states also collect samples from juveniles as well as from those who are arrested for, but not yet convicted of, a crime. Even in states that limit the DNA retained in the state database to those convicted of a crime, local databases, such as that of the forensic laboratory operated by New York City's Office of Chief Medical Examiner, may collect DNA samples of arrestees who have not been convicted. The collection of arrestee samples raised constitutional issues, specifically under the Fourth Amendment's prohibition of unreasonable search and seizure. It was argued that collecting DNA from those who had not been convicted of a crime, without an explicit order to collect, constituted a warrantless search and was therefore unlawful. In 2013, the United States Supreme Court ruled in Maryland v. King that the collection of DNA from those arrested for a crime, but not yet convicted, is part of the police booking procedure and is reasonable when that collection is used for identification purposes.

Familial searching
The inheritance pattern of some DNA means that close relatives share a higher percentage of alleles with each other than with other, random members of society. This allows for the searching of close matches within CODIS when an exact match is not found. By focusing on close matches, investigators can potentially find a close relative whose profile is in CODIS, narrowing their search to one specific family. Familial searching has led to several convictions after the exhaustion of all other leads, including that of the Grim Sleeper serial killer. This practice also raised Fourth Amendment challenges, as the individual who ends up being charged with a crime was implicated only because someone else's DNA was in the CODIS database. Twelve states have approved the use of familial searching in CODIS.
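
The idea that relatives share more alleles than unrelated people can be illustrated by counting shared alleles locus by locus. This is only a sketch of the allele-sharing intuition with invented profiles and invented function names; it is not the matching procedure used by CODIS or by any state's familial search program.

```python
from collections import Counter
from typing import Dict, Tuple

Profile = Dict[str, Tuple[int, int]]


def shared_alleles(a: Tuple[int, int], b: Tuple[int, int]) -> int:
    """Count alleles in common at one locus (0, 1, or 2)."""
    # Multiset intersection, so (8, 8) against (8, 11) counts as one shared.
    return sum((Counter(a) & Counter(b)).values())


def sharing_summary(p1: Profile, p2: Profile) -> Tuple[int, int]:
    """Return (alleles shared, maximum possible) over the loci both have."""
    common_loci = p1.keys() & p2.keys()
    total = sum(shared_alleles(p1[locus], p2[locus]) for locus in common_loci)
    return total, 2 * len(common_loci)


# Invented profiles over three loci.
crime_scene = {"D3S1358": (15, 17), "TH01": (6, 9), "TPOX": (8, 8)}
possible_relative = {"D3S1358": (15, 16), "TH01": (9, 9), "TPOX": (8, 11)}
unrelated_person = {"D3S1358": (14, 16), "TH01": (7, 8), "TPOX": (8, 9)}

print(sharing_summary(crime_scene, possible_relative))
print(sharing_summary(crime_scene, unrelated_person))
```

In this invented data the possible relative shares one allele at every locus, the pattern expected between a parent and child, while the unrelated person shares an allele at only one locus. Three loci are far too few to draw any real conclusion; the example only shows the kind of comparison involved.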