Publication


Featured research published by Wayne Pfeiffer.


Grid Computing Environments | 2010

Creating the CIPRES Science Gateway for inference of large phylogenetic trees

Mark A. Miller; Wayne Pfeiffer; Terri Schwartz

Understanding the evolutionary history of living organisms is a central problem in biology. Until recently, the ability to infer evolutionary relationships was limited by the amount of DNA sequence data available, but new DNA sequencing technologies have largely removed this limitation. As a result, DNA sequence data are readily available or obtainable for a wide spectrum of organisms, thus creating an unprecedented opportunity to explore evolutionary relationships broadly and deeply across the Tree of Life. Unfortunately, the algorithms used to infer evolutionary relationships are NP-hard, so the dramatic increase in available DNA sequence data has created a commensurate increase in the need for access to powerful computational resources. Local laptop or desktop machines are no longer viable for analysis of the larger data sets available today, and progress in the field relies upon access to large, scalable high-performance computing resources. This paper describes development of the CIPRES Science Gateway, a web portal designed to provide researchers with transparent access to the fastest available community codes for inference of phylogenetic relationships, and implementation of these codes on scalable computational resources. Meeting the needs of the community has included developing infrastructure to provide access, working with the community to improve existing community codes, developing infrastructure to ensure the portal is scalable to the entire systematics community, and adopting strategies that make the project sustainable by the community. The CIPRES Science Gateway has allowed more than 1,800 unique users to run jobs that required 2.5 million Service Units since its release in December 2009. (A Service Unit is a CPU-hour at unit priority.)


Genome Research | 2014

Somatic mutations found in the healthy blood compartment of a 115-yr-old woman demonstrate oligoclonal hematopoiesis

Henne Holstege; Wayne Pfeiffer; Daoud Sie; Marc Hulsman; Thomas J. Nicholas; Clarence Lee; Tristen Ross; Jue Lin; Mark A. Miller; Bauke Ylstra; Hanne Meijers-Heijboer; Martijn H. Brugman; Frank J. T. Staal; Gert Holstege; Marcel J. T. Reinders; Timothy T. Harkins; Samuel Levy; Erik A. Sistermans

The somatic mutation burden in healthy white blood cells (WBCs) is not well known. Based on deep whole-genome sequencing, we estimate that approximately 450 somatic mutations accumulated in the nonrepetitive genome within the healthy blood compartment of a 115-yr-old woman. The detected mutations appear to have been harmless passenger mutations: They were enriched in noncoding, AT-rich regions that are not evolutionarily conserved, and they were depleted for genomic elements where mutations might have favorable or adverse effects on cellular fitness, such as regions with actively transcribed genes. The distribution of variant allele frequencies of these mutations suggests that the majority of the peripheral white blood cells were offspring of two related hematopoietic stem cell (HSC) clones. Moreover, telomere lengths of the WBCs were significantly shorter than telomere lengths from other tissues. Together, this suggests that the finite lifespan of HSCs, rather than somatic mutation effects, may lead to hematopoietic clonal evolution at extreme ages.
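
For readers unfamiliar with the quantity behind this inference, the short sketch below shows how variant allele frequencies (VAFs) are computed from sequencing read counts and binned into a crude histogram; the read counts and numbers are invented for illustration and are not data or code from the study.

```python
# Illustrative sketch only: computes variant allele frequencies (VAFs) from
# hypothetical read counts and bins them into a crude text histogram.
# The numbers are invented; they are not data from the study.
from collections import Counter

# (reference reads, alternate reads) at hypothetical somatic-mutation sites
read_counts = [(72, 24), (65, 22), (80, 11), (70, 10), (68, 23), (75, 12)]

def vaf(ref, alt):
    """Variant allele frequency = alternate reads / total reads at the site."""
    return alt / (ref + alt)

vafs = [vaf(r, a) for r, a in read_counts]

# Bin VAFs to 5% resolution; clustering around a small number of distinct
# peaks would be consistent with mutations carried by a few major clones.
histogram = Counter(round(v * 20) / 20 for v in vafs)
for bin_center in sorted(histogram):
    print(f"VAF ~ {bin_center:.2f}: {'#' * histogram[bin_center]}")
```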


TeraGrid Conference | 2011

The CIPRES science gateway: a community resource for phylogenetic analyses

Mark A. Miller; Wayne Pfeiffer; Terri Schwartz

The CIPRES Science Gateway (CSG) provides researchers and educators with browser-based access to community codes for inference of phylogenetic relationships from DNA and protein sequence data. The CSG allows users to deploy jobs on the high-performance computers of the TeraGrid without requiring detailed knowledge of their complexities. Use of the CSG has grown rapidly; through March 2011 it had more than 2,200 users and enabled more than 180 peer-reviewed publications. The rapid growth in resource consumption was accommodated by deploying codes on Trestles, a new TeraGrid computer. Tools and policies were developed to ensure efficient and effective resource use. This paper describes progress in managing the growth of this public cyberinfrastructure resource and reviews the domain science that it has enabled.


IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2010

Hybrid MPI/Pthreads parallelization of the RAxML phylogenetics code

Wayne Pfeiffer; Alexandros Stamatakis

A hybrid MPI/Pthreads parallelization was implemented in the RAxML phylogenetics code. New MPI code was added to the existing Pthreads production code to exploit parallelism at two algorithmic levels simultaneously: coarse-grained with MPI and fine-grained with Pthreads. This hybrid, multi-grained approach is well suited for current high-performance computers, which typically are clusters of multi-core, shared-memory nodes.
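
The structure of such a two-level scheme can be sketched generically as follows. This is a conceptual Python illustration (mpi4py for the coarse-grained level, a thread pool for the fine-grained level), not the RAxML implementation, which is written in C with Pthreads; all task and function names are hypothetical.

```python
# Conceptual sketch of two-level (hybrid) parallelism, not the RAxML code:
# coarse-grained work units are distributed across MPI ranks, and each rank
# splits its unit into fine-grained pieces handled by local threads.
# Requires mpi4py; run with e.g. `mpiexec -n 4 python hybrid_sketch.py`.
# (Python threads only illustrate the structure; RAxML uses C Pthreads for
# true shared-memory parallelism.)
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Hypothetical coarse-grained work: independent tree searches.
coarse_tasks = list(range(16))
my_tasks = coarse_tasks[rank::size]          # round-robin across MPI ranks

def fine_grained_piece(task, piece):
    # Stand-in for per-thread likelihood work on a slice of alignment columns.
    return task * 1000 + piece

results = []
with ThreadPoolExecutor(max_workers=4) as pool:   # fine-grained level
    for task in my_tasks:
        results.extend(pool.map(lambda p, t=task: fine_grained_piece(t, p),
                                range(4)))

# Collect per-rank results on rank 0, mirroring the coarse-grained level.
all_results = comm.gather(results, root=0)
if rank == 0:
    print(f"collected {sum(len(r) for r in all_results)} fine-grained results")
```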


Extreme Science and Engineering Discovery Environment | 2012

The CIPRES science gateway: enabling high-impact science for phylogenetics researchers with limited resources

Mark A. Miller; Wayne Pfeiffer; Terri Schwartz

The CIPRES Science Gateway (CSG) provides browser-based access to computationally demanding phylogenetic codes run on large HPC resources. Since its release in December 2009, there has been a sustained, near-linear growth in the rate of CSG use, both in terms of number of users submitting jobs each month and number of jobs submitted. The average amount of computational time used per month by CSG increased more than 5-fold since its initial release. As of April 2012, more than 4,000 unique users have run parallel tree inference jobs on TeraGrid/XSEDE resources using the CSG. The steady growth in resource use suggests that the CSG is meeting an important need for computational resources in the Systematics/Evolutionary Biology community. To ensure that XSEDE resources accessed through the CSG are used effectively, policies for resource consumption were developed, and an advanced set of management tools was implemented. Studies of usage trends show that these new management tools helped in distributing XSEDE resources across a large user population that has low-to-moderate computational needs. In the first quarter of 2012, 30% of all active XSEDE users accessed computational resources through the CSG, while the analyses conducted by these users accounted for 0.7% of all allocable XSEDE computational resources. User survey results showed that the easy access to XSEDE/TeraGrid resources through the CSG had a critical and measurable scientific impact: at least 300 scholarly publications spanning all major groups within the Tree of Life have been enabled by the CSG since 2009. The same users reported that 82% of these publications would not have been possible without access to computational resources available through the CSG. The results indicate that the CSG is a critical and cost-effective enabler of science for phylogenetic researchers with limited resources.


International Parallel and Distributed Processing Symposium | 2008

Modeling and predicting application performance on parallel computers using HPC challenge benchmarks

Wayne Pfeiffer; Nicholas J. Wright

A method is presented for modeling application performance on parallel computers in terms of the performance of microkernels from the HPC Challenge benchmarks. Specifically, the application run time is expressed as a linear combination of inverse speeds and latencies from microkernels or system characteristics. The model parameters are obtained by an automated series of least squares fits using backward elimination to ensure statistical significance. If necessary, outliers are deleted to ensure that the final fit is robust. Typically three or four terms appear in each model: at most one each for floating-point speed, memory bandwidth, interconnect bandwidth, and interconnect latency. Such models allow prediction of application performance on future computers from easier-to-make predictions of microkernel performance. The method was used to build models for four benchmark problems involving the PARATEC and MILC scientific applications. These models not only describe performance well on the ten computers used to build the models, but also do a good job of predicting performance on three additional computers with newer design features. For the four application benchmark problems with six predictions each, the relative root mean squared error in the predicted run times varies between 13 and 16%. The method was also used to build models for the HPL and G-FFTE benchmarks in HPCC, including functional dependences on problem size and core count from complexity analysis. The model for HPL predicts performance even better than the application models do, while the model for G-FFTE systematically underpredicts run times.
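
A minimal sketch of this kind of fitting procedure is given below. It uses synthetic data, a simple t-statistic threshold in place of the paper's significance tests, and hypothetical predictor names, so it should be read as an outline of the approach rather than the authors' code.

```python
# Minimal sketch (not the authors' code) of fitting run time as a linear
# combination of microkernel-derived terms, with a crude backward elimination:
# repeatedly drop the least significant predictor until all remaining
# coefficients look significant. Data, names, and threshold are synthetic.
import numpy as np

rng = np.random.default_rng(0)
names = ["1/flop_speed", "1/mem_bandwidth", "1/net_bandwidth", "net_latency"]

# One row per hypothetical computer; columns follow `names`.
X = rng.uniform(0.5, 2.0, size=(10, 4))
true_coeffs = np.array([3.0, 1.5, 0.0, 0.8])           # third term irrelevant
y = X @ true_coeffs + rng.normal(0, 0.05, size=10)      # measured run times

def fit_with_backward_elimination(X, y, t_threshold=2.0):
    cols = list(range(X.shape[1]))
    while cols:
        Xs = X[:, cols]
        coef, residuals, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid_var = np.sum((y - Xs @ coef) ** 2) / max(len(y) - len(cols), 1)
        cov = resid_var * np.linalg.inv(Xs.T @ Xs)
        t_stats = np.abs(coef) / np.sqrt(np.diag(cov))
        weakest = int(np.argmin(t_stats))
        if t_stats[weakest] >= t_threshold:
            # All remaining terms appear significant; report the model.
            return {names[c]: coef[i] for i, c in enumerate(cols)}
        cols.pop(weakest)                                # drop weakest, refit
    return {}

print(fit_with_backward_elimination(X, y))
```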


Concurrency and Computation: Practice and Experience | 1990

Benchmarking advanced architecture computers

Paul C. Messina; Clive F. Baillie; Edward W. Felten; Paul G. Hipes; Ray Williams; Arnold Alagar; Anke Kamrath; Robert H. Leary; Wayne Pfeiffer; Jack M. Rogers; David W. Walker

Recently, a number of advanced architecture machines have become commercially available. These new machines promise better cost performance than traditional computers, and some of them have the potential of competing with current supercomputers, such as the CRAY X-MP, in terms of maximum performance. This paper describes the methodology and results of a pilot study of the performance of a broad range of advanced architecture computers using a number of complete scientific application programs. The computers evaluated include: (1) shared-memory bus architecture machines such as the Alliant FX/8, the Encore Multimax, and the Sequent Balance and Symmetry; (2) shared-memory network-connected machines such as the Butterfly; (3) distributed-memory machines such as the NCUBE, Intel, and Jet Propulsion Laboratory (JPL)/Caltech hypercubes; (4) very long instruction word machines such as the Cydrome Cydra-5; (5) SIMD machines such as the Connection Machine; and (6) 'traditional' supercomputers such as the CRAY X-MP, CRAY-2, and SCS-40. Seven application codes from a number of scientific disciplines have been used in the study, although not all the codes were run on every machine. The methodology and guidelines for establishing a standard set of benchmark programs for advanced architecture computers are discussed. The CRAYs offer the best performance on the benchmark suite; the shared memory multiprocessor machines generally permitted some parallelism, and when coupled with substantial floating point capabilities (as in the Alliant FX/8 and Sequent Symmetry), provided an order of magnitude less speed than the CRAYs. Likewise, the early generation hypercubes studied here generally ran slower than the CRAYs, but permitted substantial parallelism from each of the application codes.


BMC Bioinformatics | 2015

Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies

Kristopher A. Standish; Tristan M. Carland; Glenn K. Lockwood; Wayne Pfeiffer; Mahidhar Tatineni; C. Chris Huang; S. Lamberth; Y. Cherkas; Carrie Brodmerkel; Ed Jaeger; Lance Smith; Gunaretnam Rajagopal; Mark E. Curran; Nicholas J. Schork

Motivation: Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost. Results: We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study. Conclusions: We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging ‘big data’ problems in biomedical research brought on by the expansion of NGS technologies.
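
The job-packing idea can be sketched as follows. The genome identifiers, core counts, and command line are hypothetical, and the study's actual workflow on Gordon used the center's batch scheduler rather than this toy loop.

```python
# Toy sketch of job-packing: group many modest per-genome tasks so that each
# allocated node is kept full, rather than submitting one small job per genome.
# All names, core counts, and the command line are hypothetical.

genomes = [f"genome_{i:03d}" for i in range(437)]
cores_per_node = 16          # hypothetical node size
cores_per_task = 4           # hypothetical per-genome variant-calling threads
tasks_per_node = cores_per_node // cores_per_task

# Pack genomes into node-sized batches.
batches = [genomes[i:i + tasks_per_node]
           for i in range(0, len(genomes), tasks_per_node)]
print(f"{len(genomes)} genomes packed into {len(batches)} node-sized batches")

# Emit one block per batch that could go into a job script; the tasks in a
# batch run concurrently on one node (launched with '&' and joined by 'wait').
for batch in batches[:2]:    # show only the first two batches
    cmds = [f"call_variants --threads {cores_per_task} {g} &" for g in batch]
    print("\n".join(cmds) + "\nwait")
```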


International Conference on Cluster Computing | 2006

Marching Towards Nirvana: Configurations for Very High Performance Parallel File Systems

Phil Andrews; Chris Jordan; Wayne Pfeiffer

Over the past 7 years, the San Diego Supercomputer Center has worked to produce the highest possible performance file systems available to the National Science Foundation community in the USA. Most of this was done with GPFS, IBM's parallel file system, but several distinctly different configurations were designed and implemented, with numerous lessons learned in the process. All of these systems provided transfer rates in the multiple GB/s range. In this paper, we detail the configurations and their intended modes of operation and, as much as possible, show the resulting performance. We attempt to describe the advantages and disadvantages of each approach with an emphasis on the implications for future systems.


The Journal of Supercomputing | 1990

Benchmarking and optimization of scientific codes on the CRAY X-MP, CRAY-2, and SCS-40 vector computers

Wayne Pfeiffer; Arnold Alagar; Anke Kamrath; Robert H. Leary; Jack M. Rogers

Various scientific codes were benchmarked on three vector computers: the CRAY X-MP/48 and CRAY-2 supercomputers and the SCS-40/XM minisupercomputer. On the X-MP, two Fortran compilers were also compared. The benchmarks, which were initially all in Fortran, consisted of six research codes from Caltech, the 24 Livermore loops, and two cases from the LINPACK benchmark. As a corollary effort, the effect of manual optimization on the Caltech codes was also considered, including the selected use of assembly-language math routines. On each machine the ratio of the maximum to the minimum speeds for the various benchmarks was more than a factor of 50, even though the study was restricted to unitasked (i.e., single CPU) runs. The maximum speed for all-Fortran codes was more than 80% of the peak speed on the X-MP and SCS, but less than 40% of the peak speed on the CRAY-2. Despite having a clock that is 2.3 times faster, the CRAY-2 generally runs slower than the X-MP, typically by a factor of 1.3 for scalar code and even slower for moderately vectorized code. Only for highly vectorized codes does the CRAY-2 marginally outperform the X-MP, at least for in-core benchmarks. The poorer performance of the CRAY-2 is due to its slower scalar speed, its lack of chaining, its single port between each CPU and memory, and its relatively slow memory. The SCS runs slower than the X-MP by a factor of 2.6 in the scalar limit and by a factor of 4.7 (the clock ratio) in the vector limit when the same CFT compiler is used on both machines. Use of the newer CFT77 compiler on the X-MP negates the relative enhancement of the SCS scalar performance. On the X-MP, the CFT77 3.0 compiler produces significantly faster code than CFT 1.14, typically by a factor of 1.4. This is obtained, however, at the expense of compilation times that are three to five times longer. Regardless of the compiler, manual optimization is still worthwhile. For three of the six Caltech codes compiled with CFT77, run time speedups of 2, 4, and 16 were achieved due to Fortran optimization only.

Collaboration


Dive into Wayne Pfeiffer's collaborations.

Top Co-Authors

Mark A. Miller, University of California
Terri Schwartz, University of California
Amit Majumdar, University of California
Robert H. Leary, San Diego Supercomputer Center
Robert S. Sinkovits, United States Naval Research Laboratory
Shawn Strande, San Diego Supercomputer Center
Anke Kamrath, San Diego Supercomputer Center
Arnold Alagar, San Diego Supercomputer Center
Jack M. Rogers, San Diego Supercomputer Center