Douglas L. Maskell
Nanyang Technological University
Publication
Featured research published by Douglas L. Maskell.
BMC Research Notes | 2009
Yongchao Liu; Douglas L. Maskell; Bertil Schmidt
Background: The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings: Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion: CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
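For readers unfamiliar with the terms used in this abstract, the following is a minimal sequential sketch of Smith-Waterman local alignment scoring and of the GCUPS (billion cell updates per second) metric. It is not the CUDASW++ implementation: the function names, the simple match/mismatch scores, and the linear gap penalty are assumptions made for illustration, whereas real protein searches normally use a substitution matrix and often affine gap penalties.

```python
def smith_waterman_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty).

    Hypothetical helper for illustration; scoring values are assumptions.
    """
    prev = [0] * (len(subject) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            up = prev[j] + gap        # gap in the subject
            left = curr[j - 1] + gap  # gap in the query
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_length, database_residues, seconds):
    """Billion cell updates per second: one cell per (query, database) residue pair."""
    return query_length * database_residues / seconds / 1e9


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
print(gcups(500, 2_000_000_000, 100.0))
```

The quadratic cost is visible in the nested loops: every query residue is compared against every database residue, which is why throughput is reported in cell updates per second.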
BMC Research Notes | 2010
Yongchao Liu; Bertil Schmidt; Douglas L. Maskell
Background: Due to its high sensitivity, the Smith-Waterman algorithm is widely used for biological database searches. Unfortunately, the quadratic time complexity of this algorithm makes it highly time-consuming. The exponential growth of biological databases further worsens the situation. To accelerate this algorithm, many efforts have been made to develop techniques in high performance architectures, especially the recently emerging many-core architectures and their associated programming models. Findings: This paper describes the latest release of the CUDASW++ software, CUDASW++ 2.0, which makes new contributions to Smith-Waterman protein database searches using compute unified device architecture (CUDA). A parallel Smith-Waterman algorithm is proposed to further optimize the performance of CUDASW++ 1.0 based on the single instruction, multiple thread (SIMT) abstraction. For the first time, we have investigated a partitioned vectorized Smith-Waterman algorithm using CUDA based on the virtualized single instruction, multiple data (SIMD) abstraction. The optimized SIMT and the partitioned vectorized algorithms were benchmarked, and remarkably, have similar performance characteristics. CUDASW++ 2.0 achieves performance improvement over CUDASW++ 1.0 as much as 1.74 (1.72) times using the optimized SIMT algorithm and up to 1.77 (1.66) times using the partitioned vectorized algorithm, with a performance of up to 17 (30) billion cell updates per second (GCUPS) on a single-GPU GeForce GTX 280 (dual-GPU GeForce GTX 295) graphics card. Conclusions: CUDASW++ 2.0 is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant performance improvement over CUDASW++ 1.0 using either the optimized SIMT algorithm or the partitioned vectorized algorithm for Smith-Waterman protein database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
Bioinformatics | 2010
Yongchao Liu; Bertil Schmidt; Douglas L. Maskell
MOTIVATION: Multiple sequence alignment is of central importance to bioinformatics and computational biology. Although a large number of algorithms for computing a multiple sequence alignment have been designed, the efficient computation of highly accurate multiple alignments is still a challenge. RESULTS: We present MSAProbs, a new and practical multiple alignment algorithm for protein sequences. The design of MSAProbs is based on a combination of pair hidden Markov models and partition functions to calculate posterior probabilities. Furthermore, two critical bioinformatics techniques, namely weighted probabilistic consistency transformation and weighted profile-profile alignment, are incorporated to improve alignment accuracy. Assessed on the popular benchmarks BAliBASE, PREFAB, SABmark and OXBENCH, MSAProbs achieves statistically significant accuracy improvements over the existing top performing aligners, including ClustalW, MAFFT, MUSCLE, ProbCons and Probalign. Furthermore, MSAProbs is optimized for multi-core CPUs by employing a multi-threaded design, leading to a competitive execution time compared to other aligners. AVAILABILITY: The source code of MSAProbs, written in C++, is freely and publicly available from http://msaprobs.sourceforge.net.
Bioinformatics | 2012
Yongchao Liu; Bertil Schmidt; Douglas L. Maskell
MOTIVATION: New high-throughput sequencing technologies have promoted the production of short reads with dramatically low unit cost. The explosive growth of short read datasets poses a challenge to the mapping of short reads to reference genomes, such as the human genome, in terms of alignment quality and execution speed. RESULTS: We present CUSHAW, a parallelized short read aligner based on the compute unified device architecture (CUDA) parallel programming model. We exploit CUDA-compatible graphics hardware as accelerators to achieve high speed. Our algorithm uses a quality-aware bounded search approach based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini index to reduce the search space and achieve high alignment quality. Performance evaluation, using simulated as well as real short read datasets, reveals that our algorithm running on one or two graphics processing units achieves significant speedups in terms of execution time, while yielding comparable or even better alignment quality for paired-end alignments compared with three popular BWT-based aligners: Bowtie, BWA and SOAP2. CUSHAW also delivers competitive performance in terms of single-nucleotide polymorphism calling for an Escherichia coli test dataset. AVAILABILITY: http://cushaw.sourceforge.net
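As background to the BWT and Ferragina-Manzini (FM) index mentioned in this abstract, the sketch below shows the exact-match backward search that such indexes support. It is an illustrative assumption only: the function names are hypothetical, the index is built naively by sorting suffixes, and it does not include the quality-aware bounded (inexact) search or the GPU parallelization that CUSHAW describes.

```python
def build_fm_index(text):
    """Build a tiny FM-index (naive construction, for illustration only)."""
    text += "$"  # unique terminator that sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0)
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # first[c]: number of characters in text that are strictly smaller than c
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i]
    occ = {c: [0] for c in alphabet}
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return first, occ, len(text)


def count_occurrences(pattern, index):
    """Count exact occurrences of pattern by backward search over the index."""
    first, occ, n = index
    lo, hi = 0, n  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("ACGTACGTAC")
print(count_occurrences("ACG", index))
```

Each pattern character narrows the suffix-array interval in constant time, so a read of length m is matched in O(m) steps regardless of the reference length, which is what makes this family of indexes attractive for short read mapping.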
Bioinformatics | 2005
Timothy F. Oliver; Bertil Schmidt; Darran Nathan; Ralf Clemens; Douglas L. Maskell
Aligning hundreds of sequences using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. We present a new approach to compute multiple sequence alignments in far shorter time using reconfigurable hardware. This results in an implementation of ClustalW with significant runtime savings on a standard off-the-shelf FPGA.
Field Programmable Gate Arrays | 2005
Timothy F. Oliver; Bertil Schmidt; Douglas L. Maskell
Protein sequences with unknown functionality are often compared to a set of known sequences to detect functional similarities. Efficient dynamic-programming algorithms exist for solving this problem; however, current solutions still require significant scan times. These scan time requirements are likely to become even more severe due to exponential database growth. In this paper we present a new approach to bio-sequence database scanning using re-configurable FPGA-based hardware platforms to gain high performance at low cost. Efficient mappings of the Smith-Waterman algorithm using fine-grained parallel processing elements (PEs) that are tailored towards the parameters of a query have been designed. We use customization opportunities available at run-time to dynamically hyper-customize the systolic array to make better use of available resources. Our FPGA implementation achieves a speedup of approximately 170 for linear gap penalties and 125 for affine gap penalties as compared to a standard desktop computing platform. We show how hyper-customization at run-time can be used to further improve the performance.
IEEE Transactions on Circuits and Systems II: Express Briefs | 2005
Timothy F. Oliver; Bertil Schmidt; Douglas L. Maskell
Protein sequences with unknown functionality are often compared to a set of known sequences to detect functional similarities. Efficient dynamic-programming algorithms exist for solving this problem; however, current solutions still require significant scan times. These scan time requirements are likely to become even more severe due to the rapid growth in the size of these databases. In this paper, we present a new approach to bio-sequence database scanning using re-configurable field-programmable gate array (FPGA)-based hardware platforms to gain high performance at low cost. Efficient mappings of the Smith-Waterman algorithm using fine-grained parallel processing elements (PEs) that are tailored toward the parameters of a query have been designed. We use customization opportunities available at run time to dynamically reconfigure the PEs to make better use of available resources. Our FPGA implementation achieves a speedup of approximately 170 for linear gap penalties and 125 for affine gap penalties compared to a standard desktop computing platform. We show how run-time reconfiguration can be used to further improve performance.
Application-Specific Systems, Architectures and Processors | 2009
Yongchao Liu; Bertil Schmidt; Douglas L. Maskell
Progressive alignment is a widely used approach for computing multiple sequence alignments (MSAs). However, aligning several hundred or thousand sequences with popular progressive alignment tools such as ClustalW requires hours or even days on state-of-the-art workstations. This paper presents MSA-CUDA, a parallel MSA program, which parallelizes all three stages of the ClustalW processing pipeline using CUDA and achieves significant speedups compared to the sequential ClustalW for a variety of large protein sequence datasets. Our tests on a GeForce GTX 280 GPU demonstrate average speedups of 36.91 (for long protein sequences), 18.74 (for average-length protein sequences), and 11.27 (for short protein sequences) compared to the sequential ClustalW running on a Pentium 4 3.0 GHz processor. Our MSA-CUDA outperforms ClustalW-MPI running on 32 cores of a high performance workstation cluster.
Pattern Recognition Letters | 2010
Yongchao Liu; Bertil Schmidt; Weiguo Liu; Douglas L. Maskell
Motif discovery in biological sequences is of prime importance and a major challenge in computational biology. Consequently, numerous motif discovery tools have been developed to date. However, the rapid growth of both genomic sequence and gene transcription data establishes the need for the development of scalable motif discovery tools. An approach to improve the runtime of motif discovery by an order of magnitude without losing sensitivity is to employ emerging many-core architectures such as CUDA-enabled GPUs. In this paper, we present a highly parallel formulation and implementation of the MEME motif discovery algorithm using the CUDA programming model. To achieve high efficiency, we introduce two parallelization approaches: sequence-level and substring-level parallelization. Furthermore, a hybrid computing framework is described to take advantage of both CPU and GPU compute resources. Our performance evaluation on a GeForce GTX 280 GPU results in average runtime speedups of 21.4 (19.3) for the starting point search and 20.5 (16.4) for the overall runtime using the OOPS (ZOOPS) motif search model. The runtime speedups of CUDA-MEME on a single GPU are also comparable to those of ParaMEME running on 16 CPU cores of a high-performance workstation cluster. In addition to its high speed, CUDA-MEME is capable of finding motif instances consistent with those found by the sequential MEME.
IEEE Transactions on Instrumentation and Measurement | 1993
Graham S. Woods; Douglas L. Maskell; Michael V. Mahoney
A microwave ranging system that employs a composite frequency modulated continuous wave/continuous wave (FMCW/CW) measurement technique is described. Conventional FMCW radar techniques are employed to find the approximate range of the target. An ambiguous but very accurate set of range solutions is also determined through a CW measurement. The correct, precision CW distance measurement is resolved on the basis of the approximate FMCW solution. An adaptive, spatial digital filtering routine applied to the FMCW radar measurements reduces the influence of clutter, ensuring reliable operation. An X-band prototype system that achieves submillimeter accuracy is described.
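The coarse/fine combination described in this abstract can be sketched as follows. This is a simplified model under stated assumptions, not the processing used in the paper: it assumes the CW measurement yields a round-trip phase of 4πR/λ (mod 2π), so candidate ranges repeat every half wavelength, and that the FMCW estimate is accurate to better than a quarter wavelength. The function name and the numeric values are hypothetical.

```python
import math


def resolve_range(coarse_range_m, cw_phase_rad, wavelength_m):
    """Pick the CW range candidate closest to the coarse FMCW estimate.

    Assumed model: candidates are R = (phase / 2*pi + n) * wavelength / 2
    for integer n. Valid only if the coarse error is below wavelength / 4.
    """
    half_wavelength = wavelength_m / 2.0
    # Precise but ambiguous part of the range, within one half wavelength
    fractional = (cw_phase_rad % (2 * math.pi)) / (2 * math.pi) * half_wavelength
    # Integer number of half wavelengths, chosen using the coarse estimate
    n = round((coarse_range_m - fractional) / half_wavelength)
    return n * half_wavelength + fractional


wavelength = 0.03                      # roughly 10 GHz (X-band), assumed
true_range = 1.2345                    # metres, assumed for the demo
phase = (4 * math.pi * true_range / wavelength) % (2 * math.pi)
coarse = true_range + 0.004            # FMCW estimate with a 4 mm error
print(resolve_range(coarse, phase, wavelength))
```

In this model the final accuracy depends only on the phase measurement, while the FMCW estimate merely selects which half-wavelength interval the target lies in.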