Publication


Featured research published by Eric P. Nawrocki.


The ISME Journal | 2012

An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea

Daniel McDonald; Morgan N. Price; Julia K. Goodrich; Eric P. Nawrocki; Todd Z. DeSantis; Alexander J. Probst; Gary L. Andersen; Rob Knight; Philip Hugenholtz

Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a ‘taxonomy to tree’ approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408 315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.


Bioinformatics | 2009

Infernal 1.0: inference of RNA alignments

Eric P. Nawrocki; Diana L. Kolbe; Sean R. Eddy

Summary: Infernal builds consensus RNA secondary structure profiles called covariance models (CMs), and uses them to search nucleic acid sequence databases for homologous RNAs, or to create new sequence- and structure-based multiple sequence alignments. Availability: Source code, documentation and benchmark downloadable from http://infernal.janelia.org. Infernal is freely licensed under the GNU GPLv3 and should be portable to any POSIX-compliant operating system, including Linux and Mac OS/X. Contact: {nawrockie,kolbed,eddys}@janelia.hhmi.org


Nucleic Acids Research | 2009

Rfam: updates to the RNA families database.

Paul P. Gardner; Jennifer Daub; John G. Tate; Eric P. Nawrocki; Diana L. Kolbe; Stinus Lindgreen; Adam C. Wilkinson; Robert D. Finn; Sam Griffiths-Jones; Sean R. Eddy; Alex Bateman

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.


Nucleic Acids Research | 2013

Rfam 11.0: 10 years of RNA families

Sarah W. Burge; Jennifer Daub; Ruth Y. Eberhardt; John G. Tate; Lars Barquist; Eric P. Nawrocki; Sean R. Eddy; Paul P. Gardner; Alex Bateman

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.


Bioinformatics | 2013

Infernal 1.1: 100-fold faster RNA homology searches.

Eric P. Nawrocki; Sean R. Eddy

Summary: Infernal builds probabilistic profiles of the sequence and secondary structure of an RNA family called covariance models (CMs) from structurally annotated multiple sequence alignments given as input. Infernal uses CMs to search for new family members in sequence databases and to create potentially large multiple sequence alignments. Version 1.1 of Infernal introduces a new filter pipeline for RNA homology search based on accelerated profile hidden Markov model (HMM) methods and HMM-banded CM alignment methods. This enables ∼100-fold acceleration over the previous version and ∼10 000-fold acceleration over exhaustive non-filtered CM searches. Availability: Source code, documentation and the benchmark are downloadable from http://infernal.janelia.org. Infernal is freely licensed under the GNU GPLv3 and should be portable to any POSIX-compliant operating system, including Linux and Mac OS/X. Documentation includes a user’s guide with a tutorial, a discussion of file formats and user options and additional details on methods implemented in the software. Contact: [email protected]


Nucleic Acids Research | 2015

Rfam 12.0: updates to the RNA families database

Eric P. Nawrocki; Sarah W. Burge; Alex Bateman; Jennifer Daub; Ruth Y. Eberhardt; Sean R. Eddy; Evan W. Floden; Paul P. Gardner; Thomas A. Jones; John G. Tate; Robert D. Finn

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.


Nucleic Acids Research | 2011

Rfam: Wikipedia, clans and the “decimal” release

Paul P. Gardner; Jennifer Daub; John G. Tate; Benjamin L. Moore; Isabelle H. Osuch; Sam Griffiths-Jones; Robert D. Finn; Eric P. Nawrocki; Diana L. Kolbe; Sean R. Eddy; Alex Bateman

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.


PLOS Computational Biology | 2005

Query-dependent banding (QDB) for faster RNA similarity searches.

Eric P. Nawrocki; Sean R. Eddy

When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average case time complexity of CM alignment from LN^2.4 to LN^1.3 for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.


Nucleic Acids Research | 2011

RNIE: genome-wide prediction of bacterial intrinsic terminators

Paul P. Gardner; Lars Barquist; Alex Bateman; Eric P. Nawrocki; Zasha Weinberg

Bacterial Rho-independent terminators (RITs) are important genomic landmarks involved in gene regulation and terminating gene expression. In this investigation we present RNIE, a probabilistic approach for predicting RITs. The method is based upon covariance models which have been known for many years to be the most accurate computational tools for predicting homology in structural non-coding RNAs. We show that RNIE has superior performance in model species from a spectrum of bacterial phyla. Further analysis of species where a low number of RITs were predicted revealed a highly conserved structural sequence motif enriched near the genic termini of the pathogenic Actinobacteria, Mycobacterium tuberculosis. This motif, together with classical RITs, accounts for up to 90% of all the significantly structured regions from the termini of M. tuberculosis genic elements. The software, predictions and alignments described below are available from http://github.com/ppgardne/RNIE.

Collaboration


Dive into Eric P. Nawrocki's collaborations.

Top Co-Authors

Sean R. Eddy

Howard Hughes Medical Institute

Alex Bateman

European Bioinformatics Institute

Diana L. Kolbe

University of Iowa Hospitals and Clinics

John G. Tate

European Bioinformatics Institute

Robert D. Finn

European Bioinformatics Institute

Sarah W. Burge

European Bioinformatics Institute

Rob Knight

University of California

Sam Griffiths-Jones

Howard Hughes Medical Institute
