bioRxiv | 2019

How lucky was the genetic investigation in the Golden State Killer case?

 
 

Abstract


Long-range forensic familial searching is a new method in forensic genetics. In long-range search, a sample of interest is genotyped at single-nucleotide polymorphism (SNP) markers, and the genotype is compared with a large database in order to find relatives. Here, we perform some simple calculations that explore the basic phenomena that govern long-range searching. Two opposing phenomena, one genealogical and one genetic, govern the success of the search in a database of a given size. As one considers more distant genealogical relationships, any target sample is likely to have more relatives: on average, one has more second cousins than first cousins, and so on. But more distant relatives are also harder to detect genetically. Starting with third cousins, there is an appreciable chance that a given genealogical relationship will not be detectable genetically. Given the balance of these genealogical and genetic phenomena and the size of databases currently queryable by law enforcement, it is likely that most people with substantial recent ancestry in the United States are accessible via long-range search.

Note: This material was originally posted on the Coop lab site on May 7th, 2018, soon after the reporting of the arrest of Joseph DeAngelo in the Golden State Killer case, one of the first high-profile uses of long-range familial search. Subsequently, Erlich et al. (2018) published a detailed analysis in a large empirical dataset along with a theoretical analysis of a model similar to the one we use here, obtaining results broadly consistent with the ones presented here. Because Erlich and colleagues kindly cited this work when describing their model, we thought it would be appropriate to post this material in a venue where it is more easily cited.
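The two opposing phenomena described in the abstract can be illustrated with a minimal sketch. It is not the calculation used in this work; it rests on assumptions that the abstract does not state. On the genealogical side it assumes every couple has the same number of children (`children_per_couple`, set to a hypothetical 2.5) and that all cousins are full cousins. On the genetic side it assumes the number of shared genomic segments is roughly Poisson, with a mean given by a standard approximation based on the number of meioses separating two relatives, an assumed 34 crossovers per meiosis and 22 autosomes. It also treats any shared segment as detectable, which overstates what real matching pipelines achieve. All function and parameter names are illustrative.

```python
import math


def expected_cousins(p, children_per_couple=2.5):
    """Expected number of p-th cousins under a constant family size (assumption).

    A person has 2**p ancestral couples p+1 generations back. Each couple has
    (children_per_couple - 1) children other than the person's own ancestor,
    and each of those children leaves children_per_couple**p descendants
    p generations later.
    """
    ancestral_couples = 2 ** p
    other_children = children_per_couple - 1
    return ancestral_couples * other_children * children_per_couple ** p


def expected_shared_segments(p, crossovers_per_meiosis=34.0, n_chromosomes=22):
    """Approximate mean number of segments shared by full p-th cousins.

    Assumed approximation: a * (r * d + c) / 2**(d - 1), with d = 2 * (p + 1)
    meioses separating the cousins, a = 2 shared ancestors, r crossovers per
    meiosis and c chromosomes. The parameter values are assumptions.
    """
    meioses = 2 * (p + 1)
    shared_ancestors = 2
    numerator = shared_ancestors * (crossovers_per_meiosis * meioses + n_chromosomes)
    return numerator / 2 ** (meioses - 1)


def prob_no_sharing(p):
    """Poisson probability that full p-th cousins share no segment at all."""
    return math.exp(-expected_shared_segments(p))


def expected_detectable_cousins(p, children_per_couple=2.5):
    """Expected number of p-th cousins who share at least one segment."""
    return expected_cousins(p, children_per_couple) * (1.0 - prob_no_sharing(p))


if __name__ == "__main__":
    for p in range(1, 7):
        print(
            f"p={p}: cousins={expected_cousins(p):.1f}, "
            f"P(no sharing)={prob_no_sharing(p):.3f}, "
            f"detectable={expected_detectable_cousins(p):.1f}"
        )
```

Under these assumptions the cousin count grows geometrically with `p` while the chance of sharing at least one segment falls, which is the balance the abstract describes. The specific numbers depend entirely on the assumed parameters.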

DOI 10.1101/531384
Language English
Journal bioRxiv

Full Text