Xiangpei Zeng
University of North Texas Health Science Center
Publication
Featured research published by Xiangpei Zeng.
Forensic Science International-genetics | 2014
Jonathan L. King; Bobby L. LaRue; Nicole M.M. Novroski; Monika Stoljarova; Seung Bum Seo; Xiangpei Zeng; David H. Warshauer; Carey Davis; Walther Parson; Antti Sajantila; Bruce Budowle
Mitochondrial DNA typing in forensic genetics has traditionally been performed using Sanger-type sequencing. Consequently, sequencing of a relatively large target such as the mitochondrial genome (mtGenome) is laborious and time consuming. Thus, sequencing typically focuses on the control region due to its high concentration of variation. Massively parallel sequencing (MPS) has become more accessible in recent years, allowing for high-throughput processing of large target areas. In this study, the Nextera® XT DNA Sample Preparation Kit and the Illumina MiSeq™ were utilized to generate quality whole genome mitochondrial haplotypes from 283 individuals in a manner that was both cost-effective and rapid. Results showed that haplotypes can be generated at a high depth of coverage with limited strand bias. The distribution of variants across the mitochondrial genome was described and demonstrated greater variation within the coding region than the non-coding region. Haplotype and haplogroup diversity were described with respect to the whole mtGenome and HVI/HVII. An overall improvement in haplotype (genetic) diversity and random match probability, as well as better haplogroup assignment, demonstrates that MPS of the mtGenome using the Illumina MiSeq system is a viable and reliable methodology.
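As an illustration of the diversity statistics mentioned in this abstract, the following is a minimal sketch of how random match probability and genetic diversity are commonly computed from a list of observed haplotypes. It assumes the standard definitions (RMP as the sum of squared haplotype frequencies, and Nei's unbiased diversity estimator); the abstract does not state which estimators the authors used, and the function name and toy data are hypothetical.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (random match probability, genetic diversity) for haplotype labels.

    Assumed definitions (not taken from the abstract):
      RMP       = sum of squared haplotype frequencies
      diversity = n / (n - 1) * (1 - RMP)   (Nei's unbiased estimator)
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    rmp = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - rmp)
    return rmp, diversity


# Hypothetical toy sample: four individuals, three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
rmp, diversity = haplotype_stats(sample)
print(f"RMP = {rmp:.4f}, diversity = {diversity:.4f}")
```

Under these definitions, resolving more distinct haplotypes (for example by typing the whole mtGenome rather than HVI/HVII only) lowers the RMP and raises the diversity estimate.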
Forensic Science International-genetics | 2015
Xiangpei Zeng; Jonathan L. King; Monika Stoljarova; David H. Warshauer; Bobby L. LaRue; Antti Sajantila; Jaynish Patel; Douglas R. Storts; Bruce Budowle
STR typing in forensic genetics has traditionally been performed using capillary electrophoresis (CE). However, CE-based methods have some limitations: only a small number of STR loci can be used, and results are affected by stutter products, dye artifacts, and low-level alleles. Massively parallel sequencing (MPS) has been considered a viable technology in recent years, allowing high-throughput coverage at a relatively affordable price. Some of the CE-based limitations may be overcome with the application of MPS. In this study, 24 samples were amplified with a prototype multiplex STR system (Promega), and libraries were prepared using the TruSeq DNA LT Sample Preparation Kit (Illumina). Results showed that the MinElute PCR Purification Kit (Qiagen) was a better size selection method than the recommended diluted bead mixtures. The library input sensitivity study showed that a wide range of amplicon product (6-200 ng) could be used for library preparation without apparent differences in the STR profile. The PCR sensitivity study indicated that 62 pg may be the minimum input amount for generating complete profiles. Reliability study results on 24 different individuals showed that high depth of coverage (DoC) and balanced heterozygote allele coverage ratios (ACRs) could be obtained with 250 pg of input DNA, and that 62 pg could generate complete or nearly complete profiles. These studies indicate that this STR multiplex system and the Illumina MiSeq can generate reliable STR profiles at a sensitivity level that competes with currently widely used CE-based methods.
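To make the two quality metrics named in this abstract concrete, here is a minimal sketch of locus depth of coverage and heterozygote allele coverage ratio. It assumes that the ACR is the lower-coverage allele divided by the higher-coverage allele, so that 1.0 means perfect balance; the abstract does not spell out the definition, and the function names and read counts are hypothetical.

```python
def depth_of_coverage(allele_reads):
    """Total reads attributed to the called alleles at one locus."""
    return sum(allele_reads)


def allele_coverage_ratio(reads_a, reads_b):
    """Heterozygote balance, assumed here to be lower coverage / higher coverage."""
    if reads_a <= 0 or reads_b <= 0:
        raise ValueError("both alleles need at least one read")
    return min(reads_a, reads_b) / max(reads_a, reads_b)


# Hypothetical read counts for the two alleles of a heterozygous locus.
reads = (900, 1000)
print(depth_of_coverage(reads))                  # total reads at the locus
print(round(allele_coverage_ratio(*reads), 2))   # balance between the two alleles
```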
Forensic Science International-genetics | 2015
Xiangpei Zeng; Jonathan L. King; Spencer Hermanson; Jaynish Patel; Douglas R. Storts; Bruce Budowle
Capillary electrophoresis (CE) and multiplex amplification with fluorescent tagging have been routinely used for STR typing in forensic genetics. However, CE-based methods restrict the number of markers that can be multiplexed simultaneously and cannot detect any intra-repeat variations within STRs. Several studies have already indicated that massively parallel sequencing (MPS) may be another potential technology for STR typing. In this study, the prototype PowerSeq™ Auto System (Promega), containing 23 STR loci and amelogenin, was evaluated using the Illumina MiSeq. Results showed that complete single-source profiles could be obtained using as little as 62 pg of input DNA. The reproducibility study showed that the profiles generated were consistent among multiple typing experiments for a given individual. The mixture study indicated that partial STR profiles of the minor contributor could be detected at mixture ratios up to 19:1. The mock forensic casework study showed that full or partial profiles could be obtained from different types of single-source and mixture samples. These studies indicate that the PowerSeq Auto System and the Illumina MiSeq can generate results concordant with current CE-based methods. In addition, MPS-based systems can facilitate mixture deconvolution through the detection of intra-repeat variations within length-based STR alleles.
BMC Genomics | 2015
Seung Bum Seo; Xiangpei Zeng; Jonathan L. King; Bobby L. LaRue; Mourad Assidi; Mohammad Hussain Al-Qahtani; Antti Sajantila; Bruce Budowle
Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results: Twenty-four samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 SNP mtGenome variants observed per sample for the 24 samples analyzed. The total of 1237 SNP variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8% and 100%. Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, supported that whole mtGenome sequence data with high accuracy can be obtained using the PGM platform.
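The strand bias summary in this abstract can be sketched as follows. This is only an illustration under a stated assumption: the ratio at each position is taken as the lower of the forward and reverse coverages divided by the higher, so it lies between 0 and 1. The abstract does not define the direction of the ratio, and the function names and coverage values are hypothetical.

```python
def strand_ratio(forward, reverse):
    """Strand balance at one position: min/max of the two strand coverages (assumed)."""
    higher = max(forward, reverse)
    if higher == 0:
        return 0.0  # no coverage on either strand
    return min(forward, reverse) / higher


def fraction_balanced(coverage, threshold=0.5):
    """Fraction of positions whose strand ratio exceeds the threshold."""
    ratios = [strand_ratio(f, r) for f, r in coverage]
    return sum(r > threshold for r in ratios) / len(ratios)


# Hypothetical (forward, reverse) read depths at three positions.
toy_coverage = [(100, 90), (80, 30), (50, 50)]
print(fraction_balanced(toy_coverage))  # two of the three positions exceed 0.5
```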
International Journal of Legal Medicine | 2016
Xiangpei Zeng; Ranajit Chakraborty; Jonathan L. King; Bobby L. LaRue; Rodrigo S. Moura-Neto; Bruce Budowle
Ancestry informative markers (AIMs) can be used to detect and adjust for population stratification and predict the ancestry of the source of an evidence sample. Autosomal single nucleotide polymorphisms (SNPs) are the best candidates for AIMs. It is essential to identify the most informative AIM SNPs across relevant populations. Several informativeness measures for ancestry estimation have been used for AIMs selection: absolute allele frequency differences (δ), F statistics (FST), and informativeness for assignment measure (In). However, their efficacy has not been compared objectively, particularly for determining affiliations of major US populations. In this study, these three measures were directly compared for AIMs selection among four major US populations, i.e., African American, Caucasian, East Asian, and Hispanic American. The results showed that the FST panel performed slightly better for population resolution based on principal component analysis (PCA) clustering than did the δ panel and both performed better than the In panel. Therefore, the 23 AIMs selected by the FST measure were used to characterize the four major American populations. Genotype data of nine sample populations were used to evaluate the efficiency of the 23-AIMs panel. The results indicated that individuals could be correctly assigned to the major population categories. Our AIMs panel could contribute to the candidate pool of AIMs for potential forensic identification purposes.
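The three informativeness measures compared in this abstract can be sketched for a single biallelic SNP as shown below. The sketch rests on assumptions that are not taken from the abstract: δ is computed pairwise, FST uses a simple heterozygosity-based form with equally weighted populations, and In follows the commonly cited form of the informativeness for assignment measure. The estimators actually used in the study may differ, and all function names and frequencies are hypothetical.

```python
import math


def delta(p1, p2):
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def fst(freqs):
    """Heterozygosity-based FST, (H_T - H_S) / H_T, with equal population weights."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    h_within = sum(2 * p * (1 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


def _xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def informativeness(freqs):
    """Informativeness for assignment (In) of one biallelic SNP across populations."""
    k = len(freqs)
    total = 0.0
    # Loop over the two alleles: the listed allele and its complement.
    for allele_freqs in (freqs, [1 - p for p in freqs]):
        mean = sum(allele_freqs) / k
        total += -_xlogx(mean) + sum(_xlogx(p) for p in allele_freqs) / k
    return total


# Hypothetical frequencies of one allele in two populations.
pops = [0.9, 0.1]
print(delta(*pops), fst(pops), informativeness(pops))
```

All three measures are zero when the allele frequency is identical across populations and grow as the frequencies diverge, which is why they tend to rank candidate SNPs similarly but not identically.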
Forensic Science International-genetics | 2016
Frank R. Wendt; David H. Warshauer; Xiangpei Zeng; Jennifer D. Churchill; Nicole M.M. Novroski; Bing Song; Jonathan L. King; Bobby L. LaRue; Bruce Budowle
Short tandem repeat (STR) loci are the traditional markers used for kinship, missing persons, and direct comparison human identity testing. These markers hold considerable value due to their highly polymorphic nature, amplicon size, and ability to be multiplexed. However, many STRs are still too large for use in analysis of highly degraded DNA. Small bi-allelic polymorphisms, such as insertions/deletions (INDELs), may be better suited for analyzing compromised samples, and their allele size differences are amenable to analysis by capillary electrophoresis. The INDEL marker allelic states range in size from 2 to 6 base pairs, enabling small amplicon size. In addition, heterozygote balance may be increased by minimizing preferential amplification of the smaller allele, as is more common with STR markers. Multiplexing a large number of INDELs allows for generating panels with high discrimination power. The Nextera™ Rapid Capture Custom Enrichment Kit (Illumina, Inc., San Diego, CA) and massively parallel sequencing (MPS) on the Illumina MiSeq were used to sequence 68 well-characterized INDELs in four major US population groups. In addition, the STR Allele Identification Tool: Razor (STRait Razor) was used in a novel way to analyze INDEL sequences and detect adjacent single nucleotide polymorphisms (SNPs) and other polymorphisms. This application enabled the discovery of unique allelic variants, which increased the discrimination power and decreased the single-locus random match probabilities (RMPs) of 22 of these well-characterized INDELs which can be considered as microhaplotypes. These findings suggest that additional microhaplotypes containing human identification (HID) INDELs may exist elsewhere in the genome.
American Journal of Forensic Medicine and Pathology | 2016
Frank R. Wendt; Xiangpei Zeng; Jennifer D. Churchill; Jonathan L. King; Bruce Budowle
Short tandem repeats and single nucleotide polymorphisms (SNPs) are used to individualize biological evidence samples. Short tandem repeat alleles are characterized by size separation during capillary electrophoresis (CE). Massively parallel sequencing (MPS) offers an alternative that can overcome limitations of CE. With MPS, libraries are prepared for each sample, entailing target enrichment and bar coding, purification, and normalization. The HaloPlex Target Enrichment System (Agilent Technologies) uses a capture-based enrichment system with restriction enzyme digestion to generate fragments containing custom-selected markers. It offers another possible workflow for typing reference samples. Its efficacy was assessed using a panel of 275 human identity SNPs, 88 short tandem repeats, and amelogenin. The data analyzed included locus typing success, depth of sequence coverage, heterozygote balance, and concordance. The results indicate that the HaloPlex Target Enrichment System provides genetic data similar to that obtained by conventional polymerase chain reaction-CE methods with the advantage of analyzing substantially more markers in 1 sequencing run. The genetic typing performance of HaloPlex is comparable to other MPS-based sample preparation systems that utilize primer-based target enrichment.
Croatian Medical Journal | 2017
Xiangpei Zeng; Jonathan L. King; Bruce Budowle
Aim: To characterize the noise and stutter distribution of 23 short tandem repeats (STRs) included in the PowerSeq™ Auto System. Methods: Raw FASTQ files were analyzed using STRait Razor v2s to display alleles and coverage. The sequence noise was divided into several categories: noise at allele position, noise at -1 repeat position, and artifact. The average relative percentages of locus coverage for each noise, stutter, and allele were calculated from the samples used for this locus noise analysis. Results: Stutter products could be routinely observed at the -2 repeat position, -1 repeat position, and +1 repeat position of alleles. Sequence noise at the allele position ranged from 10.22% to 28.81% of the total locus coverage. At the allele position, individual noise reads were relatively low. Conclusion: The data indicate that noise generally will be low. In addition, the PowerSeq™ Auto System could capture nine flanking region single nucleotide polymorphisms (SNPs) that would not be observed by other current kits for massively parallel sequencing (MPS) of STRs.
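The relative percentages of locus coverage described in the Methods can be illustrated with a short sketch. The category labels and read counts below are hypothetical and only mirror the kinds of categories named in the abstract; they are not data from the study, and the function name is an assumption.

```python
def relative_coverage(read_counts):
    """Convert per-category read counts at one locus into percentages of locus coverage."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("locus has no reads")
    return {category: 100 * n / total for category, n in read_counts.items()}


# Hypothetical read counts for one locus in one sample.
locus_reads = {
    "allele": 800,
    "stutter_minus_1": 100,
    "noise_at_allele": 80,
    "artifact": 20,
}
for category, pct in relative_coverage(locus_reads).items():
    print(f"{category}: {pct:.1f}%")
```

Averaging these per-sample percentages over all samples typed at a locus would give the kind of per-locus summary the abstract reports.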
International Journal of Legal Medicine | 2016
Xiangpei Zeng; David H. Warshauer; Jonathan L. King; Jennifer D. Churchill; Ranajit Chakraborty; Bruce Budowle
Ancestry informative markers (AIMs) can be used to determine population affiliation of the donors of forensic samples. In order to examine ancestry evaluations of the four major populations in the USA, 23 highly informative AIMs were identified from the International HapMap project. However, the efficacy of these 23 AIMs could not be fully evaluated in silico. In this study, these 23 SNPs were multiplexed to test their actual performance in ancestry evaluations. Genotype data were obtained from 189 individuals collected from four American populations. One SNP (rs12149261) on chromosome 16 was removed from this panel because it was duplicated on chromosome 1. The resultant 22-AIMs panel was able to empirically resolve the four major populations as in the in silico study. Eight individuals were assigned to a different group than indicated on their samples. The assignments of the 22 AIMs for these samples were consistent with AIMs results from the ForenSeq™ panel. No departures from Hardy-Weinberg equilibrium (HWE) and no linkage disequilibrium (LD) were detected for any of the 22 SNPs in the four US populations (after removing the eight problematic samples). The principal component analysis (PCA) results indicated that 181 individuals from these populations were assigned to the expected groups. These 22 SNPs can contribute to the candidate AIMs pool for potential forensic identification purposes in major US populations.
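As an illustration of the Hardy-Weinberg check mentioned in this abstract, the following is a minimal sketch of a chi-square goodness-of-fit test for one biallelic SNP. It is an assumption that a chi-square test is a reasonable stand-in here; the abstract does not say which test the authors used (exact tests are also common), and the function name and genotype counts are hypothetical.

```python
# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1 = 3.841


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expectation (monomorphic SNP).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts (AA, AB, BB) in one population sample.
stat = hwe_chi_square(40, 20, 40)
print(stat, "departure from HWE" if stat > CHI2_CRITICAL_DF1 else "consistent with HWE")
```

When many SNPs and populations are tested, as with a 22-SNP panel in four populations, a multiple-testing correction would normally be applied to the significance threshold.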
International Journal of Legal Medicine | 2018
Xiangpei Zeng; Kyleen Elwick; Carrie Mayes; Maiko Takahashi; Jonathan L. King; David Gangitano; Bruce Budowle; Sheree Hughes-Stamm
Skeletal remains recovered in missing persons cases are often exposed to harsh environmental conditions, resulting in DNA that is damaged or degraded and/or samples that contain PCR inhibitors. In this study, common extraction methods were evaluated for their efficacy in removing high levels of PCR inhibitors commonly encountered with human remains and for their downstream compatibility with the two leading sequencing chemistries and platforms for human identification purposes. Blood, hair, and bone samples were spiked with high levels of inhibitors commonly identified in each particular substrate in order to test the efficiency of various DNA extraction methods prior to sequencing. Samples were extracted using three commercial extraction kits (DNA IQ™, DNA Investigator, and PrepFiler® BTA), organic extraction (blood and hair only), and two total demineralization protocols (bone only). Massively parallel sequencing (MPS) was performed using two different systems: Precision ID chemistry and a custom AmpliSeq™ STR and iiSNP panel on the Ion S5™ System, and the ForenSeq DNA Signature Prep Kit on the MiSeq FGx™. The overall results showed that all DNA extraction methods were efficient and are fully compatible with both MPS systems. Key performance indicators such as STR and SNP reportable alleles, read depth, and heterozygote balance were comparable for each extraction method. In samples where CE-based STRs yielded partial profiles (bone), MPS-based STRs generated more complete or full profiles. Moreover, MPS panels contain more STR loci than current CE-based STR kits and also include SNPs, which can further increase the power of discrimination obtained from these samples, making MPS a desirable choice for the forensic analysis of such challenging samples.