Publication


Featured research published by Travis E. Doom.


IEEE Transactions on Systems, Man, and Cybernetics | 2003

Knowledge discovery in medical and biological datasets using a hybrid Bayes classifier/evolutionary algorithm

Michael L. Raymer; Travis E. Doom; Leslie A. Kuhn; William F. Punch

A key element of bioinformatics research is the extraction of meaningful information from large experimental data sets. Various approaches, including statistical and graph theoretical methods, data mining, and computational pattern recognition, have been applied to this task with varying degrees of success. Using a novel classifier based on the Bayes discriminant function, we present a hybrid algorithm that employs feature selection and extraction to isolate salient features from large medical and other biological data sets. We have previously shown that a genetic algorithm coupled with a k-nearest-neighbors classifier performs well in extracting information about protein-water binding from X-ray crystallographic protein structure data. The effectiveness of the hybrid EC-Bayes classifier is demonstrated to distinguish the features of this data set that are the most statistically relevant and to weight these features appropriately to aid in the prediction of solvation sites.
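
As a rough illustration of the evolutionary half of this approach, the sketch below uses a genetic algorithm to search for per-feature weights that improve a k-nearest-neighbors classifier, the weighting idea the hybrid EC-Bayes classifier builds on. It is a minimal sketch, not the authors' code: the toy dataset, GA parameters, and leave-one-out kNN fitness function are all illustrative assumptions.

```python
# Minimal sketch (not the authors' code): a GA evolves per-feature weights
# that improve a kNN classifier's leave-one-out accuracy.
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: 100 samples, 5 features, only the first two informative.
X = rng.normal(size=(100, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def knn_accuracy(weights, k=5):
    """Leave-one-out accuracy of a kNN classifier on feature-weighted data."""
    Xw = X * weights
    d = np.linalg.norm(Xw[:, None, :] - Xw[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # exclude the query point itself
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbors
    pred = (y[nn].mean(axis=1) > 0.5).astype(int)
    return (pred == y).mean()

# Plain generational GA: truncation selection plus Gaussian mutation.
pop = rng.uniform(0, 1, size=(30, X.shape[1]))
for gen in range(25):
    fitness = np.array([knn_accuracy(w) for w in pop])
    parents = pop[np.argsort(fitness)[-10:]]                  # keep the best third
    children = parents[rng.integers(0, 10, size=20)]          # clone parents
    children = children + rng.normal(0, 0.1, children.shape)  # mutate
    pop = np.vstack([parents, np.clip(children, 0, 1)])

best = pop[np.argmax([knn_accuracy(w) for w in pop])]
print("best feature weights:", np.round(best, 2))
```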


Metabolomics | 2011

Dynamic adaptive binning: an improved quantification technique for NMR spectroscopic data

Paul E. Anderson; Deirdre A. Mahle; Travis E. Doom; Nicholas V. Reo; Nicholas J. DelRaso; Michael L. Raymer

The interpretation of nuclear magnetic resonance (NMR) experimental results for metabolomics studies requires intensive signal processing and multivariate data analysis techniques. A key step in this process is the quantification of spectral features, which is commonly accomplished by dividing an NMR spectrum into several hundred integral regions or bins. Binning attempts to minimize effects from variations in peak positions caused by sample pH, ionic strength, and composition, while reducing the dimensionality for multivariate statistical analyses. Herein we develop a novel spectral quantification technique, dynamic adaptive binning. With this technique, bin boundaries are determined by optimizing an objective function using a dynamic programming strategy. The objective function measures the quality of a bin configuration based on the number of peaks per bin. This technique shows a significant improvement over both traditional uniform binning and other adaptive binning techniques. This improvement is quantified via synthetic validation sets by analyzing an algorithm’s ability to create bins that do not contain more than a single peak and that maximize the distance from peak to bin boundary. The validation sets are developed by characterizing the salient distributions in experimental NMR spectroscopic data. Further, dynamic adaptive binning is applied to a 1H NMR-based experiment monitoring rat urinary metabolites to empirically demonstrate improved spectral quantification.
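
To make the dynamic programming step concrete, here is a minimal sketch, not the published algorithm: the peak positions, candidate boundaries (for example, local minima of the spectrum), and the scoring function are illustrative assumptions. The DP selects the subset of candidate boundaries that maximizes the total bin score, penalizing bins holding more than one peak and rewarding peak-to-boundary distance.

```python
# A minimal sketch of DP-based adaptive binning (illustrative assumptions).

def bin_score(lo, hi, peaks):
    """Score one bin [lo, hi): favor exactly one peak, far from both edges."""
    inside = [p for p in peaks if lo <= p < hi]
    if len(inside) > 1:
        return -100.0                 # heavy penalty for multi-peak bins
    if not inside:
        return 0.0                    # empty bins are neutral
    return min(inside[0] - lo, hi - inside[0])

def dynamic_adaptive_bins(candidates, peaks):
    """candidates: sorted boundary positions (first = spectrum start, last = end)."""
    n = len(candidates)
    best = [float("-inf")] * n        # best[j]: top score binning up to candidate j
    best[0] = 0.0
    back = [0] * n                    # backpointers to recover the boundary set
    for j in range(1, n):
        for i in range(j):
            s = best[i] + bin_score(candidates[i], candidates[j], peaks)
            if s > best[j]:
                best[j], back[j] = s, i
    bounds, j = [candidates[-1]], n - 1
    while j > 0:                      # walk backpointers from the end
        j = back[j]
        bounds.append(candidates[j])
    return bounds[::-1]

peaks = [1.2, 1.35, 2.0, 3.6]                      # ppm positions of detected peaks
candidates = [0.0, 1.0, 1.3, 1.5, 2.5, 3.0, 4.0]   # e.g., local minima of the spectrum
print(dynamic_adaptive_bins(candidates, peaks))
```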


Journal of Forensic Sciences | 2005

Empirical Analysis of the STR Profiles Resulting from Conceptual Mixtures

David R. Paoletti; Travis E. Doom; Carissa M. Krane; Michael L. Raymer; Dan E. Krane

Samples containing DNA from two or more individuals can be difficult to interpret. Even ascertaining the number of contributors can be challenging, and the associated uncertainties can have dramatic effects on the interpretation of testing results. Using an FBI genotypes dataset containing complete genotype information from the 13 Combined DNA Index System (CODIS) loci for 959 individuals, all possible mixtures of three individuals were exhaustively and empirically computed. Allele sharing between pairs of individuals in the original dataset, a randomized dataset, and datasets of generated cousins and siblings was evaluated, as was the number of loci necessary to reliably deduce the number of contributors present in simulated mixtures of four or fewer contributors. The relatively small number of alleles detectable at most CODIS loci, and the fact that some alleles are likely to be shared between individuals within a population, can make the maximum number of different alleles observed at any tested locus an unreliable indicator of the maximum number of contributors to a mixed DNA sample. This analysis does not use other data available from the electropherograms (such as peak height or peak area) to estimate the number of contributors to each mixture; as a result, the study represents a worst-case analysis of mixture characterization. Within this dataset, approximately 3% of three-person mixtures would be mischaracterized as two-person mixtures, and more than 70% of four-person mixtures would be mischaracterized as two- or three-person mixtures, using only the maximum number of alleles observed at any tested locus.
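
The counting rule at issue takes only a few lines to state. The sketch below is an assumption-level illustration with a hypothetical profile: because each contributor adds at most two alleles per locus, the maximum allele count observed at any locus yields only a lower bound on the number of contributors, which is exactly why mixtures are so often mischaracterized.

```python
# Sketch of the maximum-allele-count rule (hypothetical profile data):
# each contributor adds at most two alleles per locus, so the minimum
# number of contributors implied is ceil(max alleles at any locus / 2).
from math import ceil

# Hypothetical mixed profile: locus name -> set of alleles detected.
mixture = {
    "D3S1358": {14, 15, 16, 17},
    "vWA":     {16, 18},
    "FGA":     {20, 22, 24},
}

max_alleles = max(len(alleles) for alleles in mixture.values())
min_contributors = ceil(max_alleles / 2)
print(min_contributors)   # 2 -- yet the true mixture may involve 3+ people
```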


Journal of Forensic Sciences | 2007

Run-Specific Limits of Detection and Quantitation for STR-based DNA Testing

Jason R. Gilder; Travis E. Doom; Keith Inman; Dan E. Krane

STR-based DNA profiling is an exceptionally sensitive analytical technique that is often used to obtain results at the very limits of its sensitivity. The challenge of reliably distinguishing between signal and noise in such situations is one that has been rigorously addressed in numerous other analytical disciplines. However, an inability to accurately determine the height of electropherogram baselines has caused forensic DNA profiling laboratories to utilize alternative approaches. Minimum thresholds established during laboratory validation studies have become the de facto standard for distinguishing between reliable signal and noise/technical artifacts. These minimum peak height thresholds generally fail to consider variability over time in the sensitivity of instruments, reagents, and the skill of the human analysts involved in the DNA profiling process. Software (BatchExtract) made publicly available by the National Center for Biotechnology Information now provides an alternative means of establishing limits of detection and quantitation that is more consistent with those used in other analytical disciplines. We have used that software to determine the height of each data collection point for each dye along a control sample's electropherogram trace. These values were then used to determine a limit of detection (the average amount of background noise plus three standard deviations) and a limit of quantitation (the average amount of background noise plus 10 standard deviations) for each control sample. Analyses of the electropherogram data associated with the positive, negative, and reagent blank controls included in 50 different capillary electrophoresis runs validate that this approach could be used to objectively determine run-specific thresholds for use in forensic DNA casework.
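
Once baseline heights have been extracted, the thresholds themselves are simple to compute. A minimal sketch of the formulas described above, with made-up noise values purely for illustration:

```python
# Run-specific thresholds from baseline (noise) heights of a control sample:
# LOD = mean + 3*SD, LOQ = mean + 10*SD, as described in the abstract above.
import statistics

baseline_heights = [4, 6, 5, 7, 3, 5, 6, 4, 5, 6]   # RFU at noise-only points (illustrative)

mu = statistics.mean(baseline_heights)
sd = statistics.stdev(baseline_heights)

lod = mu + 3 * sd     # limit of detection
loq = mu + 10 * sd    # limit of quantitation
print(f"LOD = {lod:.1f} RFU, LOQ = {loq:.1f} RFU")
```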


IEEE Transactions on Education | 2003

Crossing the interdisciplinary barrier: a baccalaureate computer science option in bioinformatics

Travis E. Doom; Michael L. Raymer; Dan E. Krane; Oscar N. Garcia

Bioinformatics is a new and rapidly evolving discipline that has emerged from the fields of experimental molecular biology and biochemistry, and from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. Largely because of the inherently interdisciplinary nature of bioinformatics research, academia has been slow to respond to strong industry and government demands for trained scientists to develop and apply novel bioinformatic techniques to the rapidly growing freely available repositories of genetic and proteomic data. While some institutions are responding to this demand by establishing graduate programs in bioinformatics, the entrance barriers for these programs are high, largely because of the significant amount of prerequisite knowledge in the disparate fields of biochemistry and computer science required for sophisticated new approaches to the analysis and interpretation of bioinformatics data. The authors present an undergraduate-level bioinformatics curriculum in computer science designed for the baccalaureate student. This program is designed to be tailored easily to the needs and resources of a variety of institutions.


IEEE Congress on Evolutionary Computation | 2005

GA-facilitated KNN classifier optimization with varying similarity measures

Michael R. Peterson; Travis E. Doom; Michael L. Raymer

Genetic algorithms are powerful tools for k-nearest neighbors classifier optimization. While traditional knn classification techniques typically employ Euclidean distance to assess pattern similarity, other measures may also be utilized. Previous research demonstrates that GAs can improve predictive accuracy by searching for optimal feature weights and offsets for a cosine similarity-based knn classifier. GA-selected weights determine the classification relevance of each feature, while offsets provide alternative points of reference when assessing angular similarity. Such optimized classifiers perform competitively with other contemporary classification techniques. This paper explores the effectiveness of GA weight and offset optimization for knowledge discovery using knn classifiers with varying similarity measures. Using Euclidean distance, cosine similarity, and Pearson correlation, untrained classifiers are compared with weight-optimized classifiers for several datasets. Simultaneous weight and offset optimization experiments are also performed for cosine similarity and Pearson correlation. This type of optimization represents a novel technique for maximizing Pearson correlation-based knn performance. While unoptimized cosine and Pearson classifiers often perform worse than their Euclidean counterparts, optimized cosine and Pearson classifiers typically show equivalent or improved performance over optimized Euclidean classifiers. In some cases, offset optimization provides further improvement for knn classifiers employing cosine similarity or Pearson correlation.
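
For concreteness, the sketch below shows the kinds of quantities being optimized: feature-weighted cosine similarity measured from a GA-selected offset, alongside a feature-weighted Pearson correlation. The data, weights, and offsets are illustrative assumptions, not the authors' implementation.

```python
# Illustrative similarity measures: the GA would evolve a weight w_i and an
# offset o_i per feature; similarity is computed on shifted, weighted vectors.
import numpy as np

def weighted_cosine(x, y, w, o):
    """Cosine similarity of feature-weighted vectors measured from offset o."""
    xv, yv = w * (x - o), w * (y - o)
    return xv @ yv / (np.linalg.norm(xv) * np.linalg.norm(yv))

def weighted_pearson(x, y, w):
    """Pearson correlation of feature-weighted vectors."""
    xv, yv = w * x, w * y
    xc, yc = xv - xv.mean(), yv - yv.mean()
    return (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 1.8, 3.5])
w = np.array([1.0, 0.5, 2.0])   # GA-selected feature weights (hypothetical)
o = np.array([0.0, 1.0, 0.0])   # GA-selected offsets (hypothetical)
print(weighted_cosine(x, y, w, o), weighted_pearson(x, y, w))
```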


Science | 2009

Time for DNA Disclosure

Dan E. Krane; V. Bahn; David J. Balding; B. Barlow; H. Cash; B. L. Desportes; P. D'Eustachio; Keith Devlin; Travis E. Doom; Itiel E. Dror; Simon Ford; C. Funk; Jason R. Gilder; G. Hampikian; Keith Inman; Allan Jamieson; P. E. Kent; Roger Koppl; Irving L. Kornfield; Sheldon Krimsky; Jennifer L. Mnookin; Laurence D. Mueller; E. Murphy; David R. Paoletti; Dmitri A. Petrov; Michael L. Raymer; D. M. Risinger; Alvin E. Roth; Norah Rudin; W. Shields

The legislation that established the U.S. National DNA Index System (NDIS) in 1994 explicitly anticipated that database records would be available for purposes of research and quality control “if personally identifiable information is removed” [42 U.S.C. Sec 14132(b)(3)(D)]. However, the Federal


ACM Technical Symposium on Computer Science Education (SIGCSE) | 2002

A proposed undergraduate bioinformatics curriculum for computer scientists

Travis E. Doom; Michael L. Raymer; Dan E. Krane; Oscar N. Garcia

Bioinformatics is a new and rapidly evolving discipline that has emerged from the fields of experimental molecular biology and biochemistry, and from the artificial intelligence, database, and algorithms disciplines of computer science. Largely because of the inherently interdisciplinary nature of bioinformatics research, academia has been slow to respond to strong industry and government demands for trained scientists to develop and apply novel bioinformatics techniques to the rapidly growing, freely available repositories of genetic and proteomic data. While some institutions are responding to this demand by establishing graduate programs in bioinformatics, the entrance barriers for these programs are high, largely due to the significant amount of prerequisite knowledge in the disparate fields of biochemistry and computer science required to author sophisticated new approaches to the analysis of bioinformatics data. We present a proposal for an undergraduate-level bioinformatics curriculum in computer science that lowers these barriers.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012

Inferring the Number of Contributors to Mixed DNA Profiles

David R. Paoletti; Dan E. Krane; Michael L. Raymer; Travis E. Doom

Forensic samples containing DNA from two or more individuals can be difficult to interpret. Even ascertaining the number of contributors to the sample can be challenging. These uncertainties can dramatically reduce the statistical weight attached to evidentiary samples. A probabilistic mixture algorithm that takes into account not just the number and magnitude of the alleles at a locus, but also their frequency of occurrence, allows the determination of likelihood ratios for different hypotheses concerning the number of contributors to a specific mixture. This probabilistic mixture algorithm can compute the probability of the alleles in a sample being present in a 2-person mixture, 3-person mixture, and so on. The ratio of any two of these probabilities then constitutes a likelihood ratio pertaining to the number of contributors to such a mixture.
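
A minimal sketch of this calculation under simplifying assumptions (independent loci, Hardy-Weinberg proportions, no allelic dropout, and no peak-height information; the allele frequencies are hypothetical): inclusion-exclusion gives the probability that 2n allele draws show exactly the observed allele set at a locus, and the ratio of the resulting evidence probabilities for two hypothesized contributor counts is the likelihood ratio.

```python
# Sketch of a likelihood ratio on the number of contributors
# (simplified model, hypothetical allele frequencies).
from itertools import combinations

def p_exact_alleles(freqs, n_contributors):
    """P(2n allele draws from freqs cover exactly this allele set)."""
    k, total = len(freqs), 2 * n_contributors
    p = 0.0
    for r in range(k + 1):                       # inclusion-exclusion over subsets
        for sub in combinations(freqs, r):
            p += (-1) ** (k - r) * sum(sub) ** total
    return p

# Population frequencies of the alleles observed at two loci (hypothetical).
loci = [[0.10, 0.25, 0.30], [0.15, 0.20]]

def p_evidence(n):
    prob = 1.0
    for freqs in loci:                           # assume independence across loci
        prob *= p_exact_alleles(freqs, n)
    return prob

lr = p_evidence(2) / p_evidence(3)               # H1: 2 people vs H2: 3 people
print(f"LR (2 vs 3 contributors) = {lr:.2f}")
```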


Genetic and Evolutionary Computation Conference | 2005

GA-facilitated classifier optimization with varying similarity measures

Michael R. Peterson; Travis E. Doom; Michael L. Raymer

Genetic algorithms are powerful tools for k-nearest neighbors classifier optimization. While traditional knn classification techniques typically employ Euclidean distance to assess pattern similarity, other measures may also be utilized. Previous research demonstrates that GAs can improve predictive accuracy by searching for optimal feature weights and offsets for a cosine similarity-based knn classifier. GA-selected weights determine the classification relevance of each feature, while offsets provide alternative points of reference when assessing angular similarity. Such optimized classifiers perform competitively with other contemporary classification techniques. This paper explores the effectiveness of GA weight and offset optimization for knowledge discovery using knn classifiers with varying similarity measures. Using Euclidean distance, cosine similarity, and Pearson correlation, untrained classifiers are compared with weight-optimized classifiers for several datasets. Simultaneous weight and offset optimization experiments are also performed for cosine similarity and Pearson correlation. This type of optimization represents a novel technique for maximizing Pearson correlation-based knn performance. While unoptimized cosine and Pearson classifiers often perform worse than their Euclidean counterparts, optimized cosine and Pearson classifiers typically show equivalent or improved performance over optimized Euclidean classifiers. In some cases, offset optimization provides further improvement for knn classifiers employing cosine similarity or Pearson correlation.

Collaboration


Dive into Travis E. Doom's collaborations.

Top Co-Authors

Dan E. Krane

Wright State University


Nicholas J. DelRaso

Wright-Patterson Air Force Base
