The Cost of Making Crime Not Pay: Obama, CODIS and Forensic DNA

Earlier this month President Barack Obama appeared on the television show “America’s Most Wanted” to discuss the creation of a national forensic DNA database. In his interview with AMW host John Walsh, President Obama expressed his strong support for a number of law enforcement initiatives, including a proposal to expand the compulsory DNA sampling of individuals arrested and charged with certain crimes.

In this post we’ll take a look at the current system of forensic DNA profiling, starting with the Combined DNA Index System (CODIS), which is the FBI program that oversees DNA profile databanking in the United States. It comprises databases at the local, state and national levels, with the National DNA Index System (NDIS) the crown jewel. The CODIS program operates as a powerful law enforcement tool but, in the eyes of some – including President Obama – it is not yet powerful enough. But even the existing CODIS collection, with its nearly eight million DNA profiles, poses a number of interesting ethical, legal and social issues.

In a follow-up post we’ll take a look at President Obama’s push to expand CODIS to include mandatory arrestee profiling, as well as other developments in forensic DNA policy, such as the provocative New York Times editorial earlier this week calling for a nationwide database that would keep the DNA profile of every American – from hardened convicts to newborn babies – on file for law enforcement use. Our follow-up will explore the question of whose DNA profiles should be included in the CODIS system and used for law enforcement purposes, as well as the issues raised by expanding the scope of CODIS as proposed by President Obama and others.

Today, however, we begin the Genomics Law Report’s coverage of the issues surrounding the collection and use of forensic DNA with an introduction to CODIS and an examination of two aspects of the program that are the subject of ongoing controversy: (1) the use of partial DNA matches as an investigative tool and (2) the recent efforts by scientific researchers to gain access to the profiles maintained in the NDIS in order to evaluate the accuracy of the DNA matches it produces.

CODIS: Quick and Dirty. Under CODIS, each state maintains a State DNA Index System (SDIS) at one of its laboratories. An SDIS comprises at least two indices: the Offender Index and the Forensic Index. DNA profiles are constructed from samples taken from people convicted of certain crimes (as defined by each individual state, but usually encompassing, at a minimum, all felonies) and are housed in the Offender Index of the corresponding SDIS. Similar DNA profiles are constructed from samples collected at crime scenes, whose contributors are unknown, and are housed in the Forensic Index. A state may also maintain additional indices in its SDIS (for example, an Arrestee Index of the type favored by President Obama is currently maintained by eighteen states, as well as by the FBI).

The NDIS includes the same three indices for use in criminal investigations: the Offender Index, the Forensic Index, and the Arrestee Index.1 Individual states may forward profiles from the corresponding indices of their respective SDIS to the NDIS, accompanied by an identifying number indicating the state and laboratory that supplied the profile. As a result, the information maintained in the NDIS is de-identified: that is, it is linked only indirectly (through the coded identifier) to the contributor’s name or other personally identifying information.

The FBI utilizes the CODIS software to perform three primary searches in the NDIS on a weekly basis: it compares the profiles in the Offender Index to those in the Forensic Index; those in the Arrestee Index to those in the Forensic Index; and those in the Forensic Index to the rest of those in the Forensic Index.2 A match from either of the first two searches would link a known individual to a crime scene—which is a potentially powerful piece of evidence in a criminal investigation—while a match from the third search could tie together two previously unconnected crimes, allowing the investigating authorities to collaborate in developing investigative leads. If CODIS identifies that two profiles are identical, the states that submitted the profiles in question are notified of the match. At that point, the investigating authorities from the two states must contact each other directly to proceed.3
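The three searches described above can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not the CODIS software itself: the profile layout (locus name mapped to a pair of allele values), the coded identifiers and the function names are all assumptions made for this example, and real profiles use many more loci.

```python
from itertools import combinations

def identical(p, q):
    """True when both profiles carry the same allele pair at every locus."""
    return p.keys() == q.keys() and all(sorted(p[l]) == sorted(q[l]) for l in p)

def weekly_searches(offender, arrestee, forensic):
    """Run the three comparisons and return a list of (search, id_a, id_b) hits."""
    hits = []
    # Searches 1 and 2: known individuals against crime scene profiles.
    for label, index in (("offender", offender), ("arrestee", arrestee)):
        for known_id, known in index.items():
            for scene_id, scene in forensic.items():
                if identical(known, scene):
                    hits.append((label, known_id, scene_id))
    # Search 3: crime scene profiles against each other.
    for (id_a, prof_a), (id_b, prof_b) in combinations(forensic.items(), 2):
        if identical(prof_a, prof_b):
            hits.append(("forensic", id_a, id_b))
    return hits

# Hypothetical two-locus profiles keyed by made-up coded identifiers.
offender = {"OFF-1": {"TH01": (6, 9.3), "D8S1179": (12, 14)}}
arrestee = {"ARR-1": {"TH01": (7, 8), "D8S1179": (10, 13)}}
forensic = {
    "SCENE-A": {"TH01": (9.3, 6), "D8S1179": (14, 12)},
    "SCENE-B": {"TH01": (7, 8), "D8S1179": (10, 13)},
    "SCENE-C": {"TH01": (8, 7), "D8S1179": (13, 10)},
}
hits = weekly_searches(offender, arrestee, forensic)
```

In this toy data, the offender profile hits one crime scene, the arrestee profile hits two, and the forensic-to-forensic search links those two scenes to each other. Each hit carries only coded identifiers, mirroring the de-identified design in which the submitting laboratories must be contacted to learn who is behind a profile.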

The Use of Partial Matches. An emerging issue is whether the FBI should notify the states when two profiles are not identical, but are extremely similar. For example, if two profiles match exactly at many loci and share at least one allele at every locus, this “moderate stringency” match is unlikely to occur between unrelated individuals. A more likely explanation is that the contributors of the two samples are related. (Exactly how likely is an area of disputed scientific research, and one of the main reasons disclosure of the NDIS profiles is sought, as discussed below.) Thus, if an Offender or Arrestee profile demonstrates a moderate stringency match with a Forensic profile, it is possible that a relative of the offender or arrestee was the contributor of the forensic sample, creating a connection that law enforcement agencies are understandably interested in exploring.
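A rough sketch of the “moderate stringency” idea follows. It assumes both profiles are typed at the same loci, and the `min_exact` threshold is a hypothetical stand-in for “many loci”; the actual criteria applied by CODIS or by any state laboratory are not given here.

```python
def compare_locus(a, b):
    """Classify one locus as 'exact', 'shared' (one allele in common) or 'none'."""
    if sorted(a) == sorted(b):
        return "exact"
    if set(a) & set(b):
        return "shared"
    return "none"

def classify(p, q, min_exact=1):
    """Label a pair of profiles as a full match, a partial match or no match.

    min_exact is an assumed threshold for how many loci must match exactly
    before a pair sharing an allele at every locus counts as a partial match.
    """
    results = [compare_locus(p[locus], q[locus]) for locus in p]
    if all(r == "exact" for r in results):
        return "full match"
    if "none" not in results and results.count("exact") >= min_exact:
        return "partial match"
    return "no match"

# Hypothetical three-locus profiles.
crime_scene = {"TH01": (6, 9.3), "D8S1179": (12, 14), "D21S11": (29, 30)}
offender = {"TH01": (6, 9.3), "D8S1179": (12, 13), "D21S11": (30, 31)}
stranger = {"TH01": (7, 8), "D8S1179": (12, 14), "D21S11": (29, 30)}

print(classify(crime_scene, offender))     # partial match
print(classify(crime_scene, stranger))     # no match
print(classify(crime_scene, crime_scene))  # full match
```

Note that the second pair agrees exactly at two of three loci yet is rejected, because it shares nothing at the third. The requirement of a shared allele at every locus is what makes this kind of match suggestive of a close relative, who would be expected to share alleles throughout the profile.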

Currently, a number of states allow or have allowed partial matches to serve as the basis for law enforcement investigations of known offenders’ relatives, but the exact policies of the states can be difficult to discern. According to one survey conducted in November 2009, at least 15 states allow DNA analysts to inform law enforcement of partial matches, though at least 10 of those states require that the partial match be discovered unintentionally.4

Meanwhile, in Britain, the use of partial matches is widespread. It famously led to the capture of the “shoe rapist,” a serial rapist in the city of Rotherham who escaped capture for more than 20 years. After his sister’s DNA profile—which was included in the national database because of a DUI arrest—returned a partial match to evidence taken from the scene of one of his attacks, police solved the crime within eight hours.5

At the national level in the United States, the Scientific Working Group on DNA Analysis Methods Ad Hoc Committee on Partial Matches made recommendations to the FBI director about the use of partial matches in an October 2009 report. Despite expressing concern about the frequency of partial matches between profiles contributed by unrelated individuals, the committee ultimately concluded that the informed and practiced use of partial matches is beneficial.

Issues Associated with Partial Matching. The practice of using partial matches as investigatory leads is one that raises difficult issues on both sides. Law enforcement officials consider partial matching an important investigative tool, while critics argue that it can unduly subject innocent individuals to intrusive investigations through no fault of their own. Convicted offenders, by virtue of committing a qualifying crime, have arguably surrendered a portion of their privacy rights and may be justifiably included in a DNA database such as the Offender Index. To a somewhat lesser extent, arrestees have arguably surrendered a portion of their privacy rights as well.

The relatives of these offenders and arrestees, on the other hand, have done nothing to justify disparate treatment as compared to any other law-abiding citizen. Nonetheless, a partial match between a sample from an unknown contributor in the Forensic Index and a prior offender or arrestee whose DNA is contained in the applicable CODIS index, if provided to investigators, will implicate potentially innocent relatives of the offender or arrestee as suspects. Accordingly, these individuals then face the specter of “guilt by genetic association”; they are suspects in investigations because of the prior actions of their genetic relatives.

Using partial matches to identify a guilty individual would be less controversial if they reliably indicated a genetic relationship between the two contributors. While it is known that the samples in any partial match come from two separate individuals, in many partial matches those two individuals are not related at all. In these cases, a profile from the Forensic Index partially matches a profile in the Offender or Arrestee Index not because of a genetic relationship between the two individuals, but simply because of chance.
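The arithmetic below shows why chance matches become harder to dismiss as a database grows. The per-comparison probability used is purely hypothetical (the true figure is exactly what is in dispute), and comparisons are assumed to be independent; only the database size of roughly eight million comes from the figures cited in this post.

```python
def expected_chance_hits(database_size, p_chance):
    """Expected number of coincidental partial matches for one forensic profile."""
    return database_size * p_chance

def prob_at_least_one(database_size, p_chance):
    """Probability of one or more coincidental hits, assuming independent comparisons."""
    return 1.0 - (1.0 - p_chance) ** database_size

database_size = 8_000_000  # approximate NDIS size cited in this post
p_chance = 1e-7            # hypothetical chance of a coincidental partial match

expected = expected_chance_hits(database_size, p_chance)
at_least_one = prob_at_least_one(database_size, p_chance)
```

Under these assumed numbers, a single crime scene profile would be expected to produce about 0.8 coincidental partial matches, with a slightly better than even chance of at least one. A match that is very unlikely in any single comparison can still be unremarkable when millions of comparisons are run.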

Some commentators are particularly concerned that the current overrepresentation of minorities in the criminal justice system will, in turn, produce a disproportionate number of partial matches that implicate minorities as suspects. Maryland attorney Stephen B. Mercer put the concern plainly: “What you’re gonna end up seeing is nearly the majority of the African American population being under genetic surveillance. If you do the math, that’s where you end up.”

Access, Accuracy and the NDIS. Another concern surrounding the use of partial matches is the rate of false positives in forensic DNA matching. False positives occur when two profiles are described as a statistical match when, in fact, they originate from genetically unrelated sources. False positives take two forms. The first occurs when two samples are declared to be a perfect match (because they share the same snippets of genetic code at a sufficient number of locations, they are determined to have been provided by the same individual) but were in reality supplied by two separate (but possibly genetically related) individuals. The second was described above in the context of partial matching: a partial match is reported, and the two samples are believed to come from genetic relatives, when in fact they come from individuals who are not (closely) genetically related.

Both complete matches and partial matches are determined based upon the frequency with which alleles appear at certain forensic loci within a given population. The measured allelic frequencies dictate the number of positions at which two DNA samples must be identical in order to declare a match or partial match. Thus, determining these frequencies correctly is of paramount importance to avoiding false positives and ensuring the accuracy of both direct and partial matches.
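To see why the measured allele frequencies matter so much, consider a common textbook way of estimating how rare a profile is: compute the expected frequency of the genotype at each locus and multiply across loci. The sketch below uses that approach under two stated assumptions (Hardy-Weinberg proportions within a locus and independence between loci) and entirely hypothetical allele frequencies. It is not presented as the method the FBI or any laboratory actually uses.

```python
from math import prod

def genotype_frequency(alleles, freqs):
    """Expected genotype frequency at one locus under Hardy-Weinberg proportions."""
    a, b = alleles
    if a == b:
        return freqs[a] ** 2          # homozygote: p squared
    return 2 * freqs[a] * freqs[b]    # heterozygote: 2pq

def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies, assuming loci are independent."""
    return prod(genotype_frequency(profile[l], allele_freqs[l]) for l in profile)

# Hypothetical two-locus profile and hypothetical population frequencies.
profile = {"TH01": (6, 9.3), "D8S1179": (12, 12)}
allele_freqs = {"TH01": {6: 0.2, 9.3: 0.3}, "D8S1179": {12: 0.10}}
rmp = random_match_probability(profile, allele_freqs)

# Same profile, but suppose allele 12 is actually somewhat more common.
revised_freqs = {"TH01": {6: 0.2, 9.3: 0.3}, "D8S1179": {12: 0.15}}
revised = random_match_probability(profile, revised_freqs)
```

With these made-up inputs, raising a single allele frequency from 0.10 to 0.15 makes the profile appear about 2.25 times more common. Across a full set of loci, small errors of this kind compound, which is one way to understand why the size and representativeness of the underlying frequency data is contested.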

Important factors in evaluating the discriminatory power of DNA profile matches—including estimates of allele frequencies and loci independence—have to date been determined through the use of relatively small collections of profiles.6 Recently, a number of researchers have questioned the accuracy of the data used to construct profile matches and have mounted a campaign to obtain access to the nearly eight million DNA profiles currently stored in the NDIS.

Advocates of opening access to the NDIS argue that the much larger NDIS dataset would enable researchers to confirm the accuracy of the figures and assumptions used to construct and match DNA profiles. Most prominently, last December forty-one researchers signed a letter published in Science that cited this as its principal reason for NDIS disclosure. Also targeted are the assumed frequencies of some phenomena that have thus far only been calculated through simulation (for instance, the likelihood that a combination of three forensic samples could appear consistent with a combination of two samples, thereby confusing the forensic analysis). In each case, researchers argue that access to the NDIS could lead to a clearer and more accurate understanding of the likelihood of such phenomena.

Until recently, the FBI has taken a hostile stance toward attempts to study DNA databases, despite the express statutory recognition of the possibility of anonymous disclosure,7 citing privacy concerns and the burden of disclosure. In particular, though the DNA segments (technically referred to as short tandem repeats, or STRs) selected by the FBI for the composition of a DNA profile have been frequently referred to as “junk DNA,” some commentators have expressed concern that future scientific advances may reveal uses of limited forensic profiles that stretch beyond simple identification, including, for example, predicting an individual’s disease susceptibility.8 Other commentators counter that concern over the number of similar profiles discovered through studies of smaller databases is overblown. Nevertheless, in response to the Science letter, the FBI appears to have adopted a slightly more receptive tone: “We are exploring ways to investigate some of the topics,” says Dr. D. Christian Hassell, Assistant Director of the FBI’s Laboratory Division. Even so, without a release of anonymized DNA profiles for research (a decision that comes with its own, separate set of issues) or a relevant, thorough statistical analysis of those profiles, the movement for disclosure is unlikely to subside.

Definitive answers to these questions will go a long way in determining how comfortable Americans – and courts – are with proposals that would dramatically expand the circumstances in which forensic DNA profiles are collected and used. We’ll dig deeper into these and other issues in an upcoming post.

(image provided by Wikimedia Commons)


1 In addition, the NDIS maintains three indices that are related to the identification of missing persons: the Missing Persons Index, the Unidentified Human Remains Index, and the Biological Relatives of Missing Persons Index.

2 Each state uses CODIS, through a license from the FBI, to perform the same three searches.

3 Note that a match between an Offender profile and a Forensic profile is generally used as a basis for probable cause to obtain a new sample from the known contributor of the Offender profile, rather than as evidence used in court.

4 High-profile cases have been less successful in the United States. In 2008, California attempted to identify the “Grim Sleeper” using familial searching, but failed. More recently, however, a rapist on the Isle of Wight was identified thanks to a partial match between crime scene DNA and the DNA of the rapist’s daughter.

5 Apparently, routine searches in the CODIS program “allow[] for a little imprecision at each location because of so many different laboratories and agents.”

6 By way of illustration, five scientists published their findings on allele frequencies in the Journal of Forensic Sciences in 2003 using just 700 unique DNA profiles.

7 In 42 U.S.C. § 14132, Congress specifically contemplated and authorized disclosure of the anonymous profiles for research purposes.

8 Recent scientific research has shed further doubt on what is an increasingly discredited notion: that DNA which exists outside of protein-coding regions should be considered “junk DNA.”