Publication


Featured research published by Scott J. Emrich.


Science | 2015

Extensive introgression in a malaria vector species complex revealed by phylogenomics

Michael Fontaine; James B. Pease; Aaron Steele; Robert M. Waterhouse; Daniel E. Neafsey; Igor V. Sharakhov; Xiaofang Jiang; Andrew Brantley Hall; Flaminia Catteruccia; Evdoxia G. Kakani; Sara N. Mitchell; Yi-Chieh Wu; Hilary A. Smith; R. Rebecca Love; Mara K. N. Lawniczak; Michel A. Slotman; Scott J. Emrich; Matthew W. Hahn; Nora J. Besansky

Introduction: The notion that species boundaries can be porous to introgression is increasingly accepted. Yet the broader role of introgression in evolution remains contentious and poorly documented, partly because of the challenges involved in accurately identifying introgression in the very groups where it is most likely to occur. Recently diverged species often have incomplete reproductive barriers and may hybridize where they overlap. However, because of retention and stochastic sorting of ancestral polymorphisms, inference of the correct species branching order is notoriously challenging for recent speciation events, especially those closely spaced in time. Without knowledge of species relationships, it is impossible to identify instances of introgression.
Rationale: Since the discovery that the single mosquito taxon described in 1902 as Anopheles gambiae was actually a complex of several closely related and morphologically indistinguishable sibling species, the correct species branching order has remained controversial and unresolved. This Afrotropical complex contains the world’s most important vectors of human malaria, owing to their close association with humans, as well as minor vectors and species that do not bite humans. On the basis of ecology and behavior, one might predict phylogenetic clustering of the three highly anthropophilic vector species. However, previous phylogenetic analyses of the complex based on a limited number of markers strongly disagree about relationships between the major vectors, potentially because of historical introgression between them. To investigate the history of the species complex, we used whole-genome reference assemblies, as well as dozens of resequenced individuals from the field.
Results: We observed a large amount of phylogenetic discordance between trees generated from the autosomes and X chromosome. The autosomes, which make up the majority of the genome, overwhelmingly supported the grouping of the three major vectors of malaria, An. gambiae, An. coluzzii, and An. arabiensis. In stark contrast, the X chromosome strongly supported the grouping of An. arabiensis with a species that plays no role in malaria transmission, An. quadriannulatus. Although the whole-genome consensus phylogeny unequivocally agrees with the autosomal topology, we found that the topology most often located on the X chromosome follows the historical species branching order, with pervasive introgression on the autosomes producing relationships that group the three highly anthropophilic species together. With knowledge of the correct species branching order, we are further able to uncover introgression between another species pair, as well as a complex history of balancing selection, introgression, and local adaptation of a large autosomal inversion that confers aridity tolerance.
Conclusion: We identify the correct species branching order of the An. gambiae species complex, resolving a contentious phylogeny. Notably, lineages leading to the principal vectors of human malaria were among the first in the complex to radiate and are not most closely related to each other. Pervasive autosomal introgression between these human malaria vectors, including nonsister vector species, suggests that traits enhancing vectorial capacity can be acquired not only through de novo mutation but also through a more rapid process of interspecific genetic exchange.
[Figure: Time-lapse photographs of an adult anopheline mosquito emerging from its pupal case.]
Related item in Science: D. E. Neafsey et al., Science 347, 1258522 (2015).
Introgressive hybridization is now recognized as a widespread phenomenon, but its role in evolution remains contested. Here, we use newly available reference genome assemblies to investigate phylogenetic relationships and introgression in a medically important group of Afrotropical mosquito sibling species. We have identified the correct species branching order to resolve a contentious phylogeny and show that lineages leading to the principal vectors of human malaria were among the first to split. Pervasive autosomal introgression between these malaria vectors means that only a small fraction of the genome, mainly on the X chromosome, has not crossed species boundaries. Our results suggest that traits enhancing vectorial capacity may be gained through interspecific gene flow, including between nonsister species.
Mosquito adaptability across genomes: Virtually everyone has first-hand experience with mosquitoes. Few recognize the subtle biological distinctions among these bloodsucking flies that render some bites mere nuisances and others the initiation of a potentially life-threatening infection. By sequencing the genomes of several mosquitoes in depth, Neafsey et al. and Fontaine et al. reveal clues that explain the mystery of why only some species of one genus of mosquitoes are capable of transmitting human malaria (see the Perspective by Clark and Messer). Science, this issue 10.1126/science.1258524 and 10.1126/science.1258522; see also p. 27.
Comparison of several genomes reveals the genetic history of mosquitoes’ ability to vector malaria among humans. [Also see Perspective by Clark and Messer]


Nucleic Acids Research | 2015

VectorBase: an updated bioinformatics resource for invertebrate vectors and other organisms related with human diseases

Gloria I. Giraldo-Calderón; Scott J. Emrich; Robert M. MacCallum; Gareth Maslen; Emmanuel Dialynas; Pantelis Topalis; Nicholas Ho; Sandra Gesing; Gregory R. Madey; Frank H. Collins; Daniel Lawson

VectorBase is a National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. Now in its 11th year, VectorBase currently hosts the genomes of 35 organisms including a number of non-vectors for comparative analysis. Hosted data range from genome assemblies with annotated gene features, transcript and protein expression data to population genetics including variation and insecticide-resistance phenotypes. Here we describe improvements to our resource and the set of tools available for interrogating and accessing BRC data including the integration of Web Apollo to facilitate community annotation and providing Galaxy to support user-based workflows. VectorBase also actively supports our community through hands-on workshops and online tutorials. All information and data are freely available from our website at https://www.vectorbase.org/.


Bioinformatics | 2004

A strategy for assembling the maize (Zea mays L.) genome

Scott J. Emrich; Srinivas Aluru; Yan Fu; Tsui-Jung Wen; Mahesh Narayanan; Ling Guo; Daniel Ashlock

Because the bulk of the maize (Zea mays L.) genome consists of repetitive sequences, sequencing efforts are being targeted to its gene-rich fraction. Traditional assembly programs are inadequate for this approach because they are optimized for a uniform sampling of the genome and inherently lack the ability to differentiate highly similar paralogs.
Results: We report the development of bioinformatics tools for the accurate assembly of the maize genome. This software, which is based on innovative parallel algorithms to ensure scalability, assembled 730,974 genomic survey sequence fragments in 4 h using 64 Pentium III 1.26 GHz processors of a commodity cluster. Algorithmic innovations are used to reduce the number of pairwise alignments significantly without sacrificing quality. Clone pair information was used to estimate the error rate for improved differentiation of polymorphisms versus sequencing errors. The assembly was also used to evaluate the effectiveness of various filtering strategies and thereby provide information that can be used to focus subsequent sequencing efforts.


GigaScience | 2013

Assemblathon 2: evaluating de novo

Keith Bradnam; Joseph Fass; Anton Alexandrov; Paul Baranay; Michael Bechner; Inanc Birol; Sébastien Boisvert; Jarrod Chapman; Guillaume Chapuis; Rayan Chikhi; Hamidreza Chitsaz; Wen-Chi Chou; Jacques Corbeil; Cristian Del Fabbro; T. Roderick Docking; Richard Durbin; Dent Earl; Scott J. Emrich; Pavel Fedotov; Nuno A. Fonseca; Ganeshkumar Ganapathy; Richard A. Gibbs; Sante Gnerre; Élénie Godzaridis; Steve Goldstein; Matthias Haimel; Giles Hall; David Haussler; Joseph Hiatt; Isaac Ho

Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly.
Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies.
Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.


Ecology Letters | 2015

Experimental evidence of genome-wide impact of ecological selection during early stages of speciation-with-gene-flow

Scott P. Egan; Gregory J. Ragland; Lauren A. Assour; Thomas H. Q. Powell; Glen R. Hood; Scott J. Emrich; Patrik Nosil; Jeffrey L. Feder

Theory predicts that speciation‐with‐gene‐flow is more likely when the consequences of selection for population divergence transition from mainly direct effects of selection acting on individual genes to a collective property of all selected genes in the genome. Thus, understanding the direct impacts of ecologically based selection, as well as the indirect effects due to correlations among loci, is critical to understanding speciation. Here, we measure the genome‐wide impacts of host‐associated selection between hawthorn and apple host races of Rhagoletis pomonella (Diptera: Tephritidae), a model for contemporary speciation‐with‐gene‐flow. Allele frequency shifts of 32 455 SNPs induced in a selection experiment based on host phenology were genome wide and highly concordant with genetic divergence between co‐occurring apple and hawthorn flies in nature. This striking genome‐wide similarity between experimental and natural populations of R. pomonella underscores the importance of ecological selection at early stages of divergence and calls for further integration of studies of eco‐evolutionary dynamics and genome divergence.


Proceedings of the National Academy of Sciences of the United States of America | 2016

Radical remodeling of the Y chromosome in a recent radiation of malaria mosquitoes

Andrew Brantley Hall; Philippos-Aris Papathanos; Atashi Sharma; Changde Cheng; Omar S. Akbari; Lauren A. Assour; Nicholas H. Bergman; Alessia Cagnetti; Andrea Crisanti; Tania Dottorini; Elisa Fiorentini; Roberto Galizi; Jonathan Hnath; Xiaofang Jiang; Sergey Koren; Tony Nolan; Diane Radune; Maria V. Sharakhova; Aaron Steele; Vladimir A. Timoshevskiy; Nikolai Windbichler; Simo Zhang; Matthew W. Hahn; Adam M. Phillippy; Scott J. Emrich; Igor V. Sharakhov; Zhijian Jake Tu; Nora J. Besansky

Significance: Interest in male mosquitoes has been motivated by the potential to develop novel vector control strategies, exploiting the fact that males do not feed on blood or transmit diseases, such as malaria. However, genetic studies of male Anopheles mosquitoes have been impeded by the lack of molecular characterization of the Y chromosome. Here we show that the Anopheles gambiae Y chromosome contains a very small repertoire of genes, with massively amplified tandem arrays of a small number of satellites and transposable elements constituting the vast majority of the sequence. These genes and repeats evolve rapidly, bringing about remodeling of the Y, even among closely related species. Our study provides a long-awaited foundation for studying mosquito Y chromosome biology and evolution.
Y chromosomes control essential male functions in many species, including sex determination and fertility. However, because of obstacles posed by repeat-rich heterochromatin, knowledge of Y chromosome sequences is limited to a handful of model organisms, constraining our understanding of Y biology across the tree of life. Here, we leverage long single-molecule sequencing to determine the content and structure of the nonrecombining Y chromosome of the primary African malaria mosquito, Anopheles gambiae. We find that the An. gambiae Y consists almost entirely of a few massively amplified, tandemly arrayed repeats, some of which can recombine with similar repeats on the X chromosome. Sex-specific genome resequencing in a recent species radiation, the An. gambiae complex, revealed rapid sequence turnover within An. gambiae and among species. Exploiting 52 sex-specific An. gambiae RNA-Seq datasets representing all developmental stages, we identified a small repertoire of Y-linked genes that lack X gametologs and are not Y-linked in any other species except An. gambiae, with the notable exception of YG2, a candidate male-determining gene. YG2 is the only gene conserved and exclusive to the Y in all species examined, yet sequence similarity to YG2 is not detectable in the genome of a more distant mosquito relative, suggesting rapid evolution of Y chromosome genes in this highly dynamic genus of malaria vectors. The extensive characterization of the An. gambiae Y provides a long-awaited foundation for studying male mosquito biology, and will inform novel mosquito control strategies based on the manipulation of Y chromosomes.


PLOS ONE | 2014

Standardized Metadata for Human Pathogen/Vector Genomic Sequences

Vivien G. Dugan; Scott J. Emrich; Gloria I. Giraldo-Calderón; Omar S. Harb; Ruchi M. Newman; Brett E. Pickett; Lynn M. Schriml; Timothy B. Stockwell; Christian J. Stoeckert; Daniel E. Sullivan; Indresh Singh; Doyle V. Ward; Alison Yao; Jie Zheng; Tanya Barrett; Bruce W. Birren; Lauren M. Brinkac; Vincent M. Bruno; Elizabet Caler; Sinéad B. Chapman; Frank H. Collins; Christina A. Cuomo; Valentina Di Francesco; Scott Durkin; Mark Eppinger; Michael Feldgarden; Claire M. Fraser; W. Florian Fricke; Maria Giovanni; Matthew R. Henn

High throughput sequencing has accelerated the determination of genome sequences for thousands of human infectious disease pathogens and dozens of their vectors. The scale and scope of these data are enabling genotype-phenotype association studies to identify genetic determinants of pathogen virulence and drug/insecticide resistance, and phylogenetic studies to track the origin and spread of disease outbreaks. To maximize the utility of genomic sequences for these purposes, it is essential that metadata about the pathogen/vector isolate characteristics be collected and made available in organized, clear, and consistent formats. Here we report the development of the GSCID/BRC Project and Sample Application Standard, developed by representatives of the Genome Sequencing Centers for Infectious Diseases (GSCIDs), the Bioinformatics Resource Centers (BRCs) for Infectious Diseases, and the U.S. National Institute of Allergy and Infectious Diseases (NIAID), part of the National Institutes of Health (NIH), informed by interactions with numerous collaborating scientists. It includes mapping to terms from other data standards initiatives, including the Genomic Standards Consortium’s minimal information (MIxS) and NCBI’s BioSample/BioProjects checklists and the Ontology for Biomedical Investigations (OBI). The standard includes data fields about characteristics of the organism or environmental source of the specimen, spatial-temporal information about the specimen isolation event, phenotypic characteristics of the pathogen/vector isolated, and project leadership and support. By modeling metadata fields into an ontology-based semantic framework and reusing existing ontologies and minimum information checklists, the application standard can be extended to support additional project-specific data fields and integrated with other data represented with comparable standards. 
The use of this metadata standard by all ongoing and future GSCID sequencing projects will provide a consistent representation of these data in the BRC resources and other repositories that leverage these data, allowing investigators to identify relevant genomic sequences and perform comparative genomics analyses that are both statistically meaningful and biologically relevant.


Molecular Ecology | 2014

Gene expression in closely related species mirrors local adaptation: consequences for responses to a warming world

Shawn T. O'Neil; Jason D. K. Dzurisin; Caroline M. Williams; Neil F. Lobo; Jessica K. Higgins; Jillian M. Deines; Rory Carmichael; Erliang Zeng; John C. Tan; Grace C. Wu; Scott J. Emrich; Jessica J. Hellmann

Local adaptation of populations could preclude or slow range expansions in response to changing climate, particularly when dispersal is limited. To investigate the differential responses of populations to changing climatic conditions, we exposed poleward peripheral and central populations of two Lepidoptera to reciprocal, common‐garden climatic conditions and compared their whole‐transcriptome expression. We found evidence of simple population differentiation in both species, and in the species with previously identified population structure and phenotypic local adaptation, we found several hundred genes that responded in a synchronized and localized fashion. These genes were primarily involved in energy metabolism and oxidative stress, and expression levels were most divergent between populations in the same environment in which we previously detected divergence for metabolism. We found no localized genes in the species with less population structure and for which no local adaptation was previously detected. These results challenge the assumption that species are functionally similar across their ranges and poleward peripheral populations are preadapted to warmer conditions. Rather, some taxa deserve population‐level consideration when predicting the effects of climate change because they respond in genetically based, distinctive ways to changing conditions.


Bioinformatics | 2015

RNA-Rocket: an RNA-Seq analysis resource for infectious disease research

Andrew S. Warren; Cristina Aurrecoechea; Brian P. Brunk; Prerak T. Desai; Scott J. Emrich; Gloria I. Giraldo-Calderón; Omar S. Harb; Deborah Hix; Daniel Lawson; Dustin Machi; Chunhong Mao; Michael McClelland; Eric K. Nordberg; Maulik Shukla; Leslie B. Vosshall; Alice R. Wattam; Rebecca Will; Hyun Seung Yoo; Bruno W. S. Sobral

Motivation: RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers.
Results: RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data.
Availability and implementation: RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal.
Supplementary information: Supplementary materials are available at Bioinformatics online.


Insect Molecular Biology | 2013

The characterization of the Phlebotomus papatasi transcriptome

Jenica Abrudan; Marcelo Ramalho-Ortigão; Shawn T. O'Neil; Gwen Stayback; Mariha Wadsworth; Megan Bernard; Doug Shoue; Scott J. Emrich; Phillip G. Lawyer; Shaden Kamhawi; Edgar D. Rowton; Michael J. Lehane; Paul A. Bates; Jesus G. Valenzeula; Chad Tomlinson; Elizabeth L. Appelbaum; Deborah Moeller; Brenda Thiesing; Rod J. Dillon; Sandra W. Clifton; Neil F. Lobo; Richard Wilson; Frank H. Collins; Mary Ann McDowell

As important vectors of human disease, phlebotomine sand flies are of global significance to human health, transmitting several emerging and re‐emerging infectious diseases. The most devastating of the sand fly transmitted infections are the leishmaniases, causing significant mortality and morbidity in both the Old and New World. Here we present the first global transcriptome analysis of the Old World vector of cutaneous leishmaniasis, Phlebotomus papatasi (Scopoli) and compare this transcriptome to that of the New World vector of visceral leishmaniasis, Lutzomyia longipalpis. A normalized cDNA library was constructed using pooled mRNA from Phlebotomus papatasi larvae, pupae, adult males and females fed sugar, blood, or blood infected with Leishmania major. A total of 47 615 generated sequences was cleaned and assembled into 17 120 unique transcripts. Of the assembled sequences, 50% (8837 sequences) were classified using Gene Ontology (GO) terms. This collection of transcripts is comprehensive, as demonstrated by the high number of different GO categories. An in‐depth analysis revealed 245 sequences with putative homology to proteins involved in blood and sugar digestion, immune response and peritrophic matrix formation. Twelve of the novel genes, including one trypsin, two peptidoglycan recognition proteins (PGRP) and nine chymotrypsins, have a higher expression level during larval stages. Two novel chymotrypsins and one novel PGRP are abundantly expressed upon blood feeding. This study will greatly improve the available genomic resources for P. papatasi and will provide essential information for annotation of the full genome.

Collaboration


Dive into Scott J. Emrich's collaborations.

Top Co-Authors

Douglas Thain

University of Notre Dame

Neil F. Lobo

University of Notre Dame

Aaron Steele

University of Notre Dame
