Publication


Featured research published by William Salerno.


BMC Bioinformatics | 2014

Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

Jeffrey G. Reid; Andrew Carroll; Narayanan Veeraraghavan; Mahmoud Dahdouli; Andreas Sundquist; Adam C. English; Matthew N. Bainbridge; Simon White; William Salerno; Christian Buhay; Fuli Yu; Donna M. Muzny; Richard Daly; Geoff Duyk; Richard A. Gibbs; Eric Boerwinkle

Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.


Genome Biology | 2011

Comparative genomics of the social amoebae Dictyostelium discoideum and Dictyostelium purpureum

Richard Sucgang; Alan Kuo; Xiangjun Tian; William Salerno; Anup Parikh; Christa L. Feasley; Eileen Dalin; Hank Tu; Eryong Huang; Kerrie Barry; Erika Lindquist; Harris Shapiro; David Bruce; Jeremy Schmutz; Asaf Salamov; Petra Fey; Pascale Gaudet; Christophe Anjard; M. Madan Babu; Siddhartha Basu; Yulia A. Bushmanova; Hanke van der Wel; Mariko Katoh-Kurasawa; Christopher Dinh; Pedro M. Coutinho; Tamao Saito; Marek Eliáš; Pauline Schaap; Robert R. Kay; Bernard Henrissat

Background: The social amoebae (Dictyostelia) are a diverse group of Amoebozoa that achieve multicellularity by aggregation and undergo morphogenesis into fruiting bodies with terminally differentiated spores and stalk cells. There are four groups of dictyostelids, with the most derived being a group that contains the model species Dictyostelium discoideum. Results: We have produced a draft genome sequence of another group dictyostelid, Dictyostelium purpureum, and compare it to the D. discoideum genome. The assembly (8.41× coverage) comprises 799 scaffolds totaling 33.0 Mb, comparable to the D. discoideum genome size. Sequence comparisons suggest that these two dictyostelids shared a common ancestor approximately 400 million years ago. In spite of this divergence, most orthologs reside in small clusters of conserved synteny. Comparative analyses revealed a core set of orthologous genes that illuminate dictyostelid physiology, as well as differences in gene family content. Interesting patterns of gene conservation and divergence are also evident, suggesting function differences; some protein families, such as the histidine kinases, have undergone little functional change, whereas others, such as the polyketide synthases, have undergone extensive diversification. The abundant amino acid homopolymers encoded in both genomes are generally not found in homologous positions within proteins, so they are unlikely to derive from ancestral DNA triplet repeats. Genes involved in the social stage evolved more rapidly than others, consistent with either relaxed selection or accelerated evolution due to social conflict. Conclusions: The findings from this new genome sequence and comparative analysis shed light on the biology and evolution of the Dictyostelia.


BMC Bioinformatics | 2014

PBHoney: identifying genomic variants via long-read discordance and interrupted mapping

Adam C. English; William Salerno; Jeffrey G. Reid

Background: As resequencing projects become more prevalent across a larger number of species, accurate variant identification will further elucidate the nature of genetic diversity and become increasingly relevant in genomic studies. However, the identification of larger genomic variants via DNA sequencing is limited by both the incomplete information provided by sequencing reads and the nature of the genome itself. Long-read sequencing technologies provide high-resolution access to structural variants often inaccessible to shorter reads. Results: We present PBHoney, software that considers both intra-read discordance and soft-clipped tails of long reads (>10,000 bp) to identify structural variants. As a proof of concept, we identify four structural variants and two genomic features in a strain of Escherichia coli with PBHoney and validate them via de novo assembly. PBHoney is available for download at http://sourceforge.net/projects/pb-jelly/. Conclusions: Implementing two variant-identification approaches that exploit the high mappability of long reads, PBHoney is demonstrated as being effective at detecting larger structural variants using whole-genome Pacific Biosciences RS II Continuous Long Reads. Furthermore, PBHoney is able to discover two genomic features: the existence of Rac-Phage in isolate; evidence of E. coli’s circular genome.


Proceedings of the National Academy of Sciences of the United States of America | 2006

Scale-invariant structure of strongly conserved sequence in genomic intersections and alignments

William Salerno; Paul Havlak; Jonathan Miller

A power-law distribution of the length of perfectly conserved sequence from mouse/human whole-genome intersection and alignment is exhibited. Spatial correlations of these elements within the mouse genome are studied. It is argued that these power-law distributions and correlations are comprised in part by functional noncoding sequence and ought to be accounted for in estimating the statistical significance of apparent sequence conservation. These inter-genomic correlations of conservation are placed in the context of previously observed intra-genomic correlations, and their possible origins and consequences are discussed.


Neurology Genetics | 2017

The Alzheimer's Disease Sequencing Project: Study design and sample selection

Gary W. Beecham; J. C. Bis; Eden R. Martin; Seung-Hoan Choi; Anita L. DeStefano; C. M. van Duijn; Myriam Fornage; Stacey Gabriel; Daniel C. Koboldt; D.E. Larson; Adam C. Naj; Bruce M. Psaty; William Salerno; William S. Bush; Tatiana Foroud; Ellen M. Wijsman; Lindsay A. Farrer; A. Goate; J.L. Haines; Margaret A. Pericak-Vance; Eric Boerwinkle; Richard Mayeux; Sudha Seshadri; Gerard D. Schellenberg

Late-onset Alzheimer disease (LOAD) is the leading cause of dementia worldwide, with substantial economic and public health implications [1]. LOAD is a neurodegenerative disease characterized by progressive dementia typically manifesting in the seventh to ninth decades. Neuropathological changes precede clinical symptoms by 10–20 years, resulting in clinically asymptomatic individuals carrying neuropathologic features of LOAD [2]. Much of the heritability of LOAD remains unexplained, despite LOAD having a high heritability (60%–80%) and despite the identification of the APOE locus, a major genetic determinant for LOAD [3]. Genetic analyses have identified more than 25 other variants associated with smaller individual effects on disease risk [4].


Nature Genetics | 2017

Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology

Jennifer A. Brody; Alanna C. Morrison; Joshua C. Bis; Jeffrey R. O'Connell; Michael R. Brown; Jennifer E. Huffman; Darren C. Ames; Andrew J. Carroll; Matthew P. Conomos; Stacey Gabriel; Richard A. Gibbs; Stephanie M. Gogarten; Namrata Gupta; Andrew D. Johnson; Joshua P. Lewis; Xiaoming Liu; Alisa K. Manning; George J. Papanicolaou; Achilleas N. Pitsillides; Kenneth Rice; William Salerno; Colleen M. Sitlani; Nicholas L. Smith; Susan R. Heckbert; Cathy C. Laurie; Braxton D. Mitchell; Stephen S. Rich; Jerome I. Rotter; James G. Wilson; Eric Boerwinkle



BMC Genomics | 2017

SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads

Oliver A. Hampton; Adam C. English; Mark Wang; William Salerno; Yue Liu; Donna M. Muzny; Yi Han; David A. Wheeler; Kim C. Worley; James R. Lupski; Richard A. Gibbs

Background: Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance upon short DNA fragment paired end sequencing has yielded a wealth of single nucleotide variants and internal sequencing read insertions-deletions, at the cost of limited SV detection. Multi-kilobase DNA fragment mate pair sequencing has supplemented the void in SV detection, but introduced new analytic challenges requiring SV detection tools specifically designed for mate pair sequencing data. Here, we introduce SVachra – Structural Variation Assessment of CHRomosomal Aberrations, a breakpoint calling program that identifies large insertions-deletions, inversions, inter- and intra-chromosomal translocations utilizing both inward and outward facing read types generated by mate pair sequencing. Results: We demonstrate SVachra’s utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional data set of long-read (Pacific BioSciences RSII) was also generated to validate SV calls from SVachra and other comparison SV calling programs. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges when compared to other SV callers. Conclusions: SVachra is a highly specific breakpoint calling program that exhibits a more unbiased SV detection methodology than other callers.


Molecular Psychiatry | 2018

Whole exome sequencing study identifies novel rare and common Alzheimer’s-Associated variants involved in immune response and transcriptional regulation

Joshua C. Bis; Xueqiu Jian; Brian W. Kunkle; Yuning Chen; Kara L. Hamilton-Nelson; William S. Bush; William Salerno; Daniel Lancour; Yiyi Ma; Alan E. Renton; Edoardo Marcora; John J. Farrell; Yi Zhao; Liming Qu; Shahzad Ahmad; Najaf Amin; Philippe Amouyel; Gary W. Beecham; Jennifer E. Below; Dominique Campion; Camille Charbonnier; Jaeyoon Chung; Paul K. Crane; Carlos Cruchaga; L. Adrienne Cupples; Jean-François Dartigues; Stéphanie Debette; Jean-François Deleuze; Lucinda Fulton; Stacey Gabriel

The Alzheimer’s Disease Sequencing Project (ADSP) undertook whole exome sequencing in 5,740 late-onset Alzheimer disease (AD) cases and 5,096 cognitively normal controls primarily of European ancestry (EA), among whom 218 cases and 177 controls were Caribbean Hispanic (CH). An age-, sex- and APOE based risk score and family history were used to select cases most likely to harbor novel AD risk variants and controls least likely to develop AD by age 85 years. We tested ~1.5 million single nucleotide variants (SNVs) and 50,000 insertion-deletion polymorphisms (indels) for association to AD, using multiple models considering individual variants as well as gene-based tests aggregating rare, predicted functional, and loss of function variants. Sixteen single variants and 19 genes that met criteria for significant or suggestive associations after multiple-testing correction were evaluated for replication in four independent samples; three with whole exome sequencing (2,778 cases, 7,262 controls) and one with genome-wide genotyping imputed to the Haplotype Reference Consortium panel (9,343 cases, 11,527 controls). The top findings in the discovery sample were also followed-up in the ADSP whole-genome sequenced family-based dataset (197 members of 42 EA families and 501 members of 157 CH families). We identified novel and predicted functional genetic variants in genes previously associated with AD. We also detected associations in three novel genes: IGHG3 (p = 9.8 × 10⁻⁷), an immunoglobulin gene whose antibodies interact with β-amyloid, a long non-coding RNA AC099552.4 (p = 1.2 × 10⁻⁷), and a zinc-finger protein ZNF655 (gene-based p = 5.0 × 10⁻⁶). The latter two suggest an important role for transcriptional regulation in AD pathogenesis.


Dementia and Geriatric Cognitive Disorders | 2018

Genetic Variation in Genes Underlying Diverse Dementias May Explain a Small Proportion of Cases in the Alzheimer’s Disease Sequencing Project

Elizabeth E. Blue; Joshua C. Bis; Michael O. Dorschner; Debby W. Tsuang; Sandra Barral; Gary W. Beecham; Jennifer E. Below; William S. Bush; Mariusz Butkiewicz; Carlos Cruchaga; Anita L. DeStefano; Lindsay A. Farrer; Alison Goate; Jonathan L. Haines; Jim Jaworski; Gyungah Jun; Brian W. Kunkle; Amanda Kuzma; Jenny J. Lee; Kathryn L. Lunetta; Yiyi Ma; Eden R. Martin; Adam C. Naj; Alejandro Q. Nato; Patrick A. Navas; Hiep Nguyen; Christiane Reitz; Dolly Reyes; William Salerno; Gerard D. Schellenberg

Background/Aims: The Alzheimer’s Disease Sequencing Project (ADSP) aims to identify novel genes influencing Alzheimer’s disease (AD). Variants within genes known to cause dementias other than AD have previously been associated with AD risk. We describe evidence of co-segregation and associations between variants in dementia genes and clinically diagnosed AD within the ADSP. Methods: We summarize the properties of known pathogenic variants within dementia genes, describe the co-segregation of variants annotated as “pathogenic” in ClinVar and new candidates observed in ADSP families, and test for associations between rare variants in dementia genes in the ADSP case-control study. The participants were clinically evaluated for AD, and they represent European, Caribbean Hispanic, and isolate Dutch populations. Results/Conclusions: Pathogenic variants in dementia genes were predominantly rare and conserved coding changes. Pathogenic variants within ARSA, CSF1R, and GRN were observed, and candidate variants in GRN and CHMP2B were nominated in ADSP families. An independent case-control study provided evidence of an association between variants in TREM2, APOE, ARSA, CSF1R, PSEN1, and MAPT and risk of AD. Variants in genes which cause dementing disorders may influence the clinical diagnosis of AD in a small proportion of cases within the ADSP.


bioRxiv | 2018

SVCollector: Optimized sample selection for validating and long-read resequencing of structural variants

Fritz J. Sedlazeck; Zachary H. Lemmon; Sebastian Soyk; William Salerno; Zachary Lippman; Michael C. Schatz

Summary: Structural Variations (SVs) are increasingly recognized for their importance in genomics. Short-read sequencing is the most widely-used approach for genotyping large numbers of samples for SVs but suffers from relatively poor accuracy. Here we present SVCollector, an open-source method that optimally selects samples to maximize variant discovery and validation using long read resequencing or PCR-based validation. SVCollector has two modes: selecting those samples that are individually the most diverse or those that collectively capture the largest number of variations. Availability: https://github.com/fritzsedlazeck/SVCollector Supplementary information: Supplementary data are available at Bioinformatics online.
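
The two selection modes named in the SVCollector summary can be illustrated with a small sketch. This is not SVCollector's code or interface: the function names `select_top_n` and `select_greedy`, the input layout (a mapping from sample name to a set of variant identifiers), and the tie-breaking rule are all assumptions made for illustration. "Individually most diverse" is read here as "carries the most variants", and the collective mode is approximated with a greedy set-cover heuristic, which the summary does not confirm.

```python
from typing import Dict, Hashable, List, Set


def select_top_n(sample_variants: Dict[str, Set[Hashable]], k: int) -> List[str]:
    """Pick the k samples that individually carry the most variants.

    Ties are broken alphabetically by sample name (an illustrative choice).
    """
    ranked = sorted(sample_variants, key=lambda s: (-len(sample_variants[s]), s))
    return ranked[:k]


def select_greedy(sample_variants: Dict[str, Set[Hashable]], k: int) -> List[str]:
    """Pick up to k samples that together cover many distinct variants.

    At each step, choose the sample contributing the most variants not yet
    covered by earlier picks (a greedy set-cover heuristic).
    """
    remaining = dict(sample_variants)
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    for _ in range(min(k, len(remaining))):
        best = min(remaining, key=lambda s: (-len(remaining[s] - covered), s))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical toy data: sample name -> set of variant identifiers.
samples = {"A": {1, 2, 3, 4}, "B": {1, 2, 3}, "C": {5, 6}}
top_two = select_top_n(samples, 2)
greedy_two = select_greedy(samples, 2)
```

On this toy input the two modes differ: ranking by individual count picks A and B, whose variants largely overlap, while the greedy pass picks A and then C because C adds variants not already covered.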

Collaboration


Dive into William Salerno's collaborations.

Top Co-Authors

Eric Boerwinkle

University of Texas Health Science Center at Houston

Donna M. Muzny

Baylor College of Medicine

Joshua C. Bis

University of Washington

Richard A. Gibbs

Baylor College of Medicine

Adam C. Naj

University of Pennsylvania

Amanda Kuzma

University of Pennsylvania

Li-San Wang

University of Pennsylvania