Randall H. Brown
Washington University in St. Louis
Publication
Featured research published by Randall H. Brown.
Genome Biology | 2006
Manimozhiyan Arumugam; Chaochun Wei; Randall H. Brown; Michael R. Brent
Background: This paper describes Pairagon+N-SCAN_EST, a gene annotation pipeline that uses only native alignments. For each expressed sequence it chooses the best genomic alignment. Systems like ENSEMBL and ExoGean rely on trans alignments, in which expressed sequences are aligned to the genomic loci of putative homologs. Trans alignments contain a high proportion of mismatches, gaps, and/or apparently unspliceable introns, compared to alignments of cDNA sequences to their native loci. The Pairagon+N-SCAN_EST pipeline's first stage is Pairagon, a cDNA-to-genome alignment program based on a PairHMM probability model. This model relies on prior knowledge, such as the fact that introns must begin with GT, GC, or AT and end with AG or AC. It produces very precise alignments of high-quality cDNA sequences. In the genomic regions between Pairagon's cDNA alignments, the pipeline combines EST alignments with de novo gene prediction by using N-SCAN_EST. N-SCAN_EST is based on a generalized HMM probability model augmented with a phylogenetic conservation model and EST alignments. It can predict complete transcripts by extending or merging EST alignments, but it can also predict genes in regions without EST alignments. Because they are based on probability models, both Pairagon and N-SCAN_EST can be trained automatically for new genomes and data sets.
Results: On the ENCODE regions of the human genome, Pairagon+N-SCAN_EST was as accurate as any other system tested in the EGASP assessment, including ENSEMBL and ExoGean.
Conclusion: With sufficient mRNA/EST evidence, genome annotation without trans alignments can compete successfully with systems like ENSEMBL and ExoGean, which use trans alignments.
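The splice-site prior mentioned in the abstract can be illustrated with a minimal sketch. This is not Pairagon's code: the function name, the coordinate convention, and the specific pairing of start and end dinucleotides (GT-AG, GC-AG, AT-AC) are assumptions made for illustration, since the abstract lists the allowed starts and ends without pairing them.

```python
# Allowed (start, end) dinucleotide pairs for an intron.
# ASSUMPTION: the pairing below is illustrative; the abstract only lists
# the allowed starts (GT, GC, AT) and ends (AG, AC).
ALLOWED_SPLICE_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def intron_is_spliceable(genome: str, start: int, end: int) -> bool:
    """Return True if genome[start:end] begins and ends with an allowed pair.

    `start` is the 0-based index of the first intron base and `end` is one
    past the last intron base (hypothetical convention for this sketch).
    """
    intron = genome[start:end].upper()
    if len(intron) < 4:
        # Too short to hold both a start and an end dinucleotide.
        return False
    return (intron[:2], intron[-2:]) in ALLOWED_SPLICE_PAIRS


if __name__ == "__main__":
    genome = "AAAGTCCCCAGTTT"
    print(intron_is_spliceable(genome, 3, 11))  # candidate intron GTCCCCAG
    print(intron_is_spliceable(genome, 2, 11))  # candidate intron AGTCCCCAG
```

An aligner could use a check like this to penalize or reject candidate gaps in a cDNA-to-genome alignment that do not look like introns, which is one way "apparently unspliceable introns" can be detected.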
Proceedings of the National Academy of Sciences of the United States of America | 2016
Drew G. Michael; Ezekiel Maier; Holly Brown; Stacey R. Gish; Christopher Fiore; Randall H. Brown; Michael R. Brent
Significance: The ability to engineer specific behaviors into cells would have a significant impact on biomedicine and biotechnology, including applications to regenerative medicine and biofuels production. One way to coax cells to behave in a desired way is to globally modify their gene expression state, making it more like the state of cells with the desired behavior. This paper introduces a broadly applicable algorithm for transcriptome engineering: designing transcription factor deletions or overexpressions to move cells to a gene expression state that is associated with a desired phenotype. This paper also presents an approach to benchmarking and validating such algorithms. The availability of systematic, objective benchmarks for a computational task often stimulates increased effort and rapid progress on that task.
Abstract: The ability to rationally manipulate the transcriptional states of cells would be of great use in medicine and bioengineering. We have developed an algorithm, NetSurgeon, which uses genome-wide gene-regulatory networks to identify interventions that force a cell toward a desired expression state. We first validated NetSurgeon extensively on existing datasets. Next, we used NetSurgeon to select transcription factor deletions aimed at improving ethanol production in Saccharomyces cerevisiae cultures that are catabolizing xylose. We reasoned that interventions that move the transcriptional state of cells using xylose toward that of cells producing large amounts of ethanol from glucose might improve xylose fermentation. Some of the interventions selected by NetSurgeon successfully promoted a fermentative transcriptional state in the absence of glucose, resulting in strains with a 2.7-fold increase in xylose import rates, a 4-fold improvement in xylose integration into central carbon metabolism, or a 1.3-fold increase in ethanol production rate. We conclude by presenting an integrated model of transcriptional regulation and metabolic flux that will enable future efforts aimed at improving xylose fermentation to prioritize functional regulators of central carbon metabolism.
Solid State Communications | 1987
Randall H. Brown; A. E. Carlsson
Abstract: The coordination number effects on the effective-pair-interaction of a model binary transition metal alloy are calculated as a function of band-filling and concentration. A six moment expansion along with the maximum-entropy method is used to approximate the density of states of a d-band, nearest-neighbor, tight-binding Hamiltonian, and a decoupled form is used for the higher order correlation functions. We find that the coordination number effects are large. The effective-pair-interaction scales approximately as $Z^{-3/2}$, where Z is the coordination number.
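To give a feel for how large the coordination number effect is, the short sketch below evaluates the approximate scaling, read here as Z to the power -3/2. The function name and the choice of lattices are illustrative assumptions, not taken from the paper; the nearest-neighbor coordination numbers used (12 for fcc, 8 for bcc) are standard values.

```python
def relative_pair_interaction(z: float, z_ref: float, exponent: float = -1.5) -> float:
    """Ratio of effective pair interactions at coordination numbers z and z_ref,
    assuming a pure power law V(Z) ~ Z**exponent (the approximate scaling only)."""
    return (z / z_ref) ** exponent


if __name__ == "__main__":
    # fcc (Z = 12) relative to bcc (Z = 8): roughly 0.54 under this scaling.
    print(round(relative_pair_interaction(12, 8), 2))
```

Under this approximation, going from 8 to 12 nearest neighbors reduces the effective pair interaction to a little over half its value, which is consistent with the abstract's statement that the effect is large.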
Archive | 2011
Ezekiel Maier; Randall H. Brown; Michael R. Brent
Optimization of Gene Prediction via More Accurate Phylogenetic Substitution Models. Department of Computer Science and Engineering, Washington University, Saint Louis, MO 63130.
Abstract: Determining the beginning and end positions of each exon in each protein coding gene within a genome can be difficult because the DNA patterns that signal a gene’s presence have multiple weakly related alternate forms and the DNA fragments that comprise a gene are generally small in comparison to the size of the genome. In response to this challenge, automated gene predictors were created to generate putative gene structures. N-SCAN identifies gene structures in a target DNA sequence and can use conservation patterns learned from alignments between a target and one or more informant DNA sequences. N-SCAN uses a Bayesian network, generated from a phylogenetic tree, to probabilistically relate the target sequence to the aligned sequence(s). Phylogenetic substitution models are used to estimate substitution likelihood along the branches of the tree. Although N-SCAN’s predictive accuracy is already a benchmark for de novo HMM-based gene predictors, optimizing its use of substitution models will allow for improved conservation pattern estimates leading to even better accuracy. Selecting optimal substitution models requires avoiding overfitting as more detailed models require more free parameters; unfortunately, the number of parameters is limited by the number of known genes available for parameter estimation (training). In order to optimize substitution model selection, we tested eight models on the entire genome including General, Reversible, HKY, Jukes-Cantor, and Kimura. In addition to testing models on the entire genome, genome-feature-based model selection strategies were investigated by assessing the ability of each model to accurately reflect the unique conservation patterns present in each genome region. Context dependency was examined using zeroth, first, and second order models. All models were tested on the human and D. melanogaster genomes. Analysis of the data suggests that the nucleotide equilibrium frequency assumption (denoted as i) is the strongest predictor of a model’s accuracy, followed by reversibility and transition/transversion inequality. Furthermore, second order models are shown to give an average of 0.6% improvement over first order models, which give an 18% improvement over zeroth order models. Finally, by limiting parameter usage by the number of training examples available for each feature, genome-feature-based model selection better estimates substitution likelihood, leading to a significant improvement in N-SCAN’s gene annotation accuracy.
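As a concrete illustration of what a substitution model computes along one branch of a tree, the sketch below gives the standard closed-form substitution probabilities for two of the simplest models named above. It is not N-SCAN code. It assumes "Kimura" refers to the Kimura two-parameter model (one rate for transitions, one for transversions, equal base frequencies), and all function and parameter names are hypothetical.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x: str, y: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes (A<->G, C<->T)."""
    return x != y and ((x in PURINES) == (y in PURINES))


def k80_probability(x: str, y: str, alpha: float, beta: float, t: float) -> float:
    """P(y at the end of a branch | x at the start) under Kimura two-parameter.

    alpha: transition rate; beta: rate of each of the two transversions;
    t: branch length in time units (names are assumptions for this sketch).
    """
    e1 = math.exp(-4.0 * beta * t)
    e2 = math.exp(-2.0 * (alpha + beta) * t)
    if x == y:
        return 0.25 + 0.25 * e1 + 0.5 * e2
    if is_transition(x, y):
        return 0.25 + 0.25 * e1 - 0.5 * e2
    return 0.25 - 0.25 * e1


def jc69_probability(x: str, y: str, rate: float, t: float) -> float:
    """Jukes-Cantor is the special case of K80 with all substitution rates equal."""
    return k80_probability(x, y, rate, rate, t)


if __name__ == "__main__":
    for x in BASES:
        row = [k80_probability(x, y, alpha=2.0, beta=1.0, t=0.1) for y in BASES]
        print(x, [round(p, 4) for p in row])
```

Jukes-Cantor has no free rate ratio, while the Kimura model adds one (the transition/transversion ratio). Richer models add unequal equilibrium frequencies and more rate classes, which is the trade-off between fit and number of free parameters that the abstract describes.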
Genome Research | 2004
Aaron E. Tenney; Randall H. Brown; Charles J. Vaske; Jennifer K. Lodge; Tamara L. Doering; Michael R. Brent
Genome Research | 2005
Randall H. Brown; Samuel S. Gross; Michael R. Brent
Physical Review B | 1985
Randall H. Brown; A. E. Carlsson
Bioinformatics | 2009
David V. Lu; Randall H. Brown; Manimozhiyan Arumugam; Michael R. Brent