Publication


Featured research published by Le-Shin Wu.


Extreme Science and Engineering Discovery Environment | 2012

Trinity RNA-Seq assembler performance optimization

Robert Henschel; Matthias Lieber; Le-Shin Wu; Phillip M. Nista; Brian J. Haas; Richard D. LeDuc

RNA-sequencing is a technique to study RNA expression in biological material. It is quickly gaining popularity in the field of transcriptomics. Trinity is a software tool that was developed for efficient de novo reconstruction of transcriptomes from RNA-Seq data. In this paper we first conduct a performance study of Trinity and compare it to previously published data from 2011. The version from 2011 is much slower than many other de novo assemblers and biologists have thus been forced to choose between quality and speed. We examine the runtime behavior of Trinity as a whole as well as its individual components and then optimize the most performance-critical parts. We find that standard best practices for HPC applications can also be applied to Trinity, especially on systems with large amounts of memory. When combining best practices for HPC applications with our specific performance optimizations, we can decrease the runtime of Trinity by a factor of 3.9. This brings the runtime of Trinity in line with other de novo assemblers while maintaining superior quality. The purpose of this paper is to describe a series of improvements to Trinity, quantify the execution improvements achieved, and document the new version of the software.


Extreme Science and Engineering Discovery Environment | 2013

National Center for Genome Analysis Support leverages XSEDE to support life science research

Richard D. LeDuc; Thomas G. Doak; Le-Shin Wu; Philip D. Blood; Carrie L. Ganote; Matthew W. Vaughn

The National Center for Genome Analysis Support (NCGAS) is a response to the concern that NSF-funded life scientists were underutilizing the national cyberinfrastructure, because there has been little effort to tailor these resources to the life science community's needs. NCGAS is a multi-institutional service center that provides computational resources, specialized systems support for both end users and systems administrators, curated sets of applications, and, most importantly, scientific consultations for domain scientists unfamiliar with next-generation DNA sequence data analysis. NCGAS is a partnership between Indiana University Pervasive Technology Institute, Texas Advanced Computing Center, San Diego Supercomputer Center, and the Pittsburgh Supercomputing Center. NCGAS provides hardened bioinformatic applications and user support on all aspects of a user's data analysis, including data management, systems usage, bioinformatics, and biostatistics-related issues.


Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure | 2015

Cyberinfrastructure resources enabling creation of the loblolly pine reference transcriptome

Le-Shin Wu; Carrie L. Ganote; Thomas G. Doak; William K. Barnett; Keithanne Mockaitis; Craig A. Stewart

Today's genomics technologies generate more sequence data than ever before possible, and at substantially lower costs, serving researchers across biological disciplines in transformative ways. Building transcriptome assemblies from RNA sequencing reads is one application of next-generation sequencing (NGS) that has held a central role in biological discovery in both model and non-model organisms, with and without whole genome sequence references. A major limitation in effective building of transcriptome references is no longer the sequencing data generation itself, but the computing infrastructure and expertise needed to assemble, analyze and manage the data. Here we describe a currently available resource dedicated to achieving such goals, and its use for extensive RNA assembly of up to 1.3 billion reads representing the massive transcriptome of loblolly pine, using four major assembly software installations. The Mason cluster, an XSEDE second-tier resource at Indiana University, provides the necessary fast CPU cycles, large memory, and high I/O throughput for conducting large-scale genomics research. The National Center for Genome Analysis Support, or NCGAS, provides technical support in using HPC systems, bioinformatic support for determining the appropriate method to analyze a given dataset, and practical assistance in running computations. We demonstrate that a sufficient supercomputing resource and good workflow design are elements that are essential to large eukaryotic genomics and transcriptomics projects such as the complex transcriptome of loblolly pine, gene expression data that inform annotation and functional interpretation of the largest genome sequence reference to date.


Archive | 2013

Using Prior Knowledge to Improve Scoring in High-Throughput Top-Down Proteomics Experiments

Richard D. LeDuc; Le-Shin Wu


Archive | 2013

Galaxy based BLAST submission to distributed national high throughput computing resources

Soichi Hayashi; Sandra Gesing; Rob Quick; Scott Teige; Carrie Ganote; Le-Shin Wu; Elizabeth Prout


Archive | 2015

The National Center for Genomic Analysis Support: creating a national cyberinfrastructure environment for genomics researchers.

William K. Barnett; Thomas G. Doak; Le-Shin Wu; Carrie Ganote


Archive | 2015

Introduction to Galaxy 2015

Carrie Ganote; Le-Shin Wu; Thomas G. Doak


Archive | 2015

ACI-REF Mission: User Sensitivity 101

Carrie Ganote; Le-Shin Wu; Thomas G. Doak


Archive | 2015

Automating work in Galaxy

Carrie Ganote; Le-Shin Wu; Thomas G. Doak


Proceedings of International Symposium on Grids and Clouds (ISGC) 2014 — PoS(ISGC2014) | 2014

Galaxy based BLAST submission to Open Science Grid resources

Soichi Hayashi; Sandra Gesing; Robert Quick; Scott Teige; Carrie Ganote; Le-Shin Wu; Elizabeth Prout

Collaboration


Dive into Le-Shin Wu's collaborations.

Top Co-Authors

Thomas G. Doak

Indiana University Bloomington

William K. Barnett

Indiana University Bloomington

Craig A. Stewart

Indiana University Bloomington

Carrie L. Ganote

Indiana University Bloomington

Scott Teige

Indiana University Bloomington
