Srinivas Aluru | Researchain

Publication

Featured research published by Srinivas Aluru.

Plant Journal | 2011

A brassinosteroid transcriptional network revealed by genome‐wide identification of BES1 target genes in Arabidopsis thaliana

Xiaofei Yu; Lei Li; Jaroslaw Zola; Maneesha Aluru; Huaxun Ye; Andrew Foudree; Hongqing Guo; Sarah Anderson; Srinivas Aluru; Peng Liu; Steve Rodermel; Yanhai Yin

Brassinosteroids (BRs) are important regulators of plant growth and development. BRs signal to control the activities of the BES1 and BZR1 family transcription factors. The transcriptional network through which BES1 and BZR1 regulate a large number of target genes is mostly unknown. By combining chromatin immunoprecipitation coupled with Arabidopsis tiling arrays (ChIP-chip) and gene expression studies, we have identified 1609 putative BES1 target genes, 404 of which are regulated by BRs and/or in the gain-of-function bes1-D mutant. BES1 targets contribute to BR responses and interactions with other hormonal or light signaling pathways. Computational modeling of gene expression data using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals that BES1-targeted transcription factors form a gene regulatory network (GRN). Mutants of many genes in the network displayed defects in BR responses. Moreover, we found that BES1 functions to inhibit chloroplast development by repressing the expression of GLK1 and GLK2 transcription factors, confirming a hypothesis generated from the GRN. Our results thus provide a global view of BR-regulated gene expression and a GRN that guides future studies in understanding BR-regulated plant growth.

Combinatorial Pattern Matching | 2003

Space efficient linear time construction of suffix arrays

Pang Ko; Srinivas Aluru

We present a linear time algorithm to sort all the suffixes of a string over a large alphabet of integers. The sorted order of the suffixes of a string is also called the suffix array, a data structure introduced by Manber and Myers that has numerous applications in pattern matching, string processing, and computational biology. Though the suffix tree of a string can be constructed in linear time and the sorted order of suffixes derived from it, a direct algorithm for suffix sorting is of great interest due to the space requirements of suffix trees. Our result improves upon the best known direct algorithm for suffix sorting, which takes O(n log n) time. We also show how to construct suffix trees in linear time from our suffix sorting result. Apart from being simple and applicable to alphabets not necessarily of fixed size, this method of constructing suffix trees is more space efficient.
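
To make the data structure concrete, here is a minimal Python sketch of a suffix array and a pattern search over it. It is not the linear time algorithm described in the abstract: it uses a naive comparison sort of the suffixes, and the function names `suffix_array` and `find_occurrences` are illustrative choices, not names from the paper.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order.

    Naive construction for illustration only: sorting the suffixes by direct
    comparison is far slower than a linear time construction.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return every start position of pattern in text via binary search on sa."""
    m = len(pattern)

    # Leftmost suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Leftmost suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with pattern.
    return sorted(sa[start:lo])
```

For the string "banana", the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", so the suffix array is [5, 3, 1, 0, 4, 2], and every occurrence of a pattern corresponds to one contiguous range of that array.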

Briefings in Bioinformatics | 2013

A survey of error-correction methods for next-generation sequencing

Xiao Yang; Sriram P. Chockalingam; Srinivas Aluru

Error correction is important for most next-generation sequencing applications because highly accurate sequenced reads will likely lead to higher quality results. Many techniques for error correction of sequencing data from next-generation platforms have been developed in recent years. However, compared with the fast development of sequencing technologies, there is a lack of a standardized evaluation procedure for different error-correction methods, making it difficult to assess their relative merits and demerits. In this article, we provide a comprehensive review of many error-correction methods, and establish a common set of benchmark data and evaluation criteria to provide a comparative assessment. We present experimental results on quality, run-time, memory usage and scalability of several error-correction methods. Apart from providing explicit recommendations useful to practitioners, the review serves to identify the current state of the art and promising directions for future research. Availability: All error-correction programs used in this article were downloaded from their hosting websites. The evaluation tool kit is publicly available at: http://aluru-sun.ece.iastate.edu/doku.php?id=ecr.

Archive | 2005

Handbook of Computational Molecular Biology

Srinivas Aluru

Sequence Alignments
  Pairwise Sequence Alignments (Benjamin N. Jackson and Srinivas Aluru)
  Spliced Alignment and Similarity-Based Gene Recognition (Alexey D. Neverov, Andrey A. Mironov, and Mikhail S. Gelfand)
  Multiple Sequence Alignment (Osamu Gotoh, Shinsuke Yamada, and Tetsushi Yada)
  Parametric Sequence Alignment (David Fernandez-Baca and Balaji Venkatachalam)

String Data Structures
  Lookup Tables, Suffix Trees and Suffix Arrays (Srinivas Aluru)
  Suffix Tree Applications in Computational Biology (Pang Ko and Srinivas Aluru)
  Enhanced Suffix Arrays and Applications (Mohamed I. Abouelhoda, Stefan Kurtz, and Enno Ohlebusch)

Genome Assembly and EST Clustering
  Computational Methods for Genome Assembly (Xiaoqiu Huang)
  Assembling the Human Genome (Richa Agarwala)
  Comparative Methods for Sequence Assembly (Vamsi Veeramachaneni)
  Information Theoretic Approach to Genome Reconstruction (Suchendra Bhandarkar, Jinling Huang, and Jonathan Arnold)
  Expressed Sequence Tags: Clustering and Applications (Anantharaman Kalyanaraman and Srinivas Aluru)
  Algorithms for Large-Scale Clustering and Assembly of Biological Sequence Data (Scott J. Emrich, Anantharaman Kalyanaraman, and Srinivas Aluru)

Genome-Scale Computational Methods
  Comparisons of Long Genomic Sequences: Algorithms and Applications (Michael Brudno and Inna Dubchak)
  Chaining Algorithms and Applications in Comparative Genomics (Enno Ohlebusch and Mohamed I. Abouelhoda)
  Computational Analysis of Alternative Splicing (Mikhail S. Gelfand)
  Human Genetic Linkage Analysis (Alejandro A. Schaffer)
  Combinatorial Methods for Haplotype Inference (Dan Gusfield and Steven Hecht Orzack)

Phylogenetics
  An Overview of Phylogeny Reconstruction (C. Randal Linder and Tandy Warnow)
  Consensus Trees and Supertrees (Oliver Eulenstein)
  Large-Scale Phylogenetic Analysis (Tandy Warnow)
  High-Performance Phylogeny Reconstruction (David A. Bader and Mi Yan)

Microarrays and Gene Expression Analysis
  Microarray Data: Annotation, Storage, Retrieval and Communication (Catherine A. Ball and Gavin Sherlock)
  Computational Methods for Microarray Design (Hui-Hsien Chou)
  Clustering Algorithms for Gene Expression Analysis (Pierre Baldi, G. Wesley Hatfield, and Li M. Fu)
  Biclustering Algorithms: A Survey (Amos Tanay, Roded Sharan, and Ron Shamir)
  Identifying Gene Regulatory Networks from Gene Expression Data (Vladimir Filkov)
  Modeling and Analysis of Gene Networks Using Feedback Control Theory (Hana El Samad and Mustafa Khammash)

Computational Structural Biology
  Predicting Protein Secondary and Supersecondary Structure (Mona Singh)
  Protein Structure Prediction with Lattice Models (William E. Hart and Alantha Newman)
  Protein Structure Determination via NMR Spectral Data (Guohui Lin, Xin Tu, and Xiang Wan)
  Geometric Processing of Reconstructed 3D Maps of Molecular Complexes (Chandrajit Bajaj and Zeyun Yu)
  In Search of Remote Homolog (Dong Xu, Ognen Duzlevski, and Xiu-Feng Wan)
  Biomolecular Modeling using Parallel Supercomputers (Laxmikant V. Kale, Klaus Schulten, Robert D. Skeel, Glenn Martyna, Mark Tuckerman, James C. Phillips, Sameer Kumar, and Gengbin Zheng)

Bioinformatic Databases and Data Mining
  String Search in External Memory: Data Structures and Algorithms (Paolo Ferragina)
  Index Structures for Approximate Matching in Sequence Databases (Tamer Kahveci and Ambuj K. Singh)
  Algorithms for Motif Search (Sanguthevar Rajasekaran)
  Data Mining in Computational Biology (Mohammed J. Zaki and Karlton Sequeira)

Index

Bioinformatics | 2010

Reptile: representative tiling for short read error correction

Xiao Yang; Karin S. Dorman; Srinivas Aluru

Motivation: Error correction is critical to the success of next-generation sequencing applications, such as resequencing and de novo genome sequencing. It is especially important for high-throughput short-read sequencing, where reads are much shorter and more abundant, and errors more frequent than in traditional Sanger sequencing. Processing massive numbers of short reads with existing error correction methods is both compute and memory intensive, yet the results are far from satisfactory when applied to real datasets. Results: We present a novel approach, termed Reptile, for error correction in short-read data from next-generation sequencing. Reptile works with the spectrum of k-mers from the input reads, and corrects errors by simultaneously examining: (i) Hamming distance-based correction possibilities for potentially erroneous k-mers; and (ii) neighboring k-mers from the same read for correct contextual information. By not needing to store input data, Reptile has the favorable property that it can handle data that does not fit in main memory. In addition to sequence data, Reptile can make use of available quality score information. Our experiments show that Reptile outperforms previous methods in the percentage of errors removed from the data and the accuracy in true base assignment. In addition, a significant reduction in run time and memory usage has been achieved compared with previous methods, making it more practical for short-read error correction when sampling larger genomes. Availability: Reptile is implemented in C++ and is available through the link: http://aluru-sun.ece.iastate.edu/doku.php?id=software
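
The two ingredients named in the abstract, a k-mer spectrum and Hamming distance-based correction candidates, can be sketched in a few lines of Python. This is a simplified illustration of generic spectrum-based correction, not Reptile itself: it ignores tiling, read context and quality scores, and the names `kmer_spectrum`, `hamming_neighbors`, `correction_candidates` and the `min_count` threshold are assumptions made for this example.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def hamming_neighbors(kmer, alphabet="ACGT"):
    """Yield every k-mer at Hamming distance exactly 1 from kmer."""
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def correction_candidates(kmer, spectrum, min_count):
    """Suggest frequent distance-1 neighbors for a rare (suspect) k-mer.

    A k-mer seen at least min_count times is treated as trusted and gets no
    candidates. The threshold is a hypothetical tuning parameter.
    """
    if spectrum[kmer] >= min_count:
        return []
    candidates = [n for n in hamming_neighbors(kmer) if spectrum[n] >= min_count]
    # Most frequent candidate first; ties broken alphabetically.
    return sorted(candidates, key=lambda n: (-spectrum[n], n))
```

With five copies of the read "ACGTACGT" and one erroneous read "ACGAACGT", the rare 4-mer "ACGA" has a single trusted neighbor, "ACGT", which differs in the last base. A real corrector would also consult neighboring k-mers from the same read before accepting such a change, as the abstract describes.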

Conference on High Performance Computing (Supercomputing) | 2007

Large-scale maximum likelihood-based phylogenetic analysis on the IBM BlueGene/L

Michael Ott; Jaroslaw Zola; Alexandros Stamatakis; Srinivas Aluru

Phylogenetic inference is a grand challenge in Bioinformatics due to immense computational requirements. The increasing popularity of multi-gene alignments in biological studies, which typically provide a stable topological signal due to a more favorable ratio of the number of base pairs to the number of sequences, coupled with rapid accumulation of sequence data in general, poses new challenges for high performance computing. In this paper, we demonstrate how state-of-the-art Maximum Likelihood (ML) programs can be efficiently scaled to the IBM BlueGene/L (BG/L) architecture, by porting RAxML, which is currently among the fastest and most accurate programs for phylogenetic inference under the ML criterion. We simultaneously exploit coarse-grained and fine-grained parallelism that is inherent in every ML-based biological analysis. Performance is assessed using datasets consisting of 212 sequences and 566,470 base pairs, and 2,182 sequences and 51,089 base pairs, respectively. To the best of our knowledge, these are the largest datasets analyzed under ML to date. The capability to analyze such datasets will help to address novel biological questions via phylogenetic analyses. Our experimental results indicate that the fine-grained parallelization scales well up to 1,024 processors. Moreover, a larger number of processors can be efficiently exploited by a combination of coarse-grained and fine-grained parallelism. Finally, we demonstrate that our parallelization scales equally well on an AMD Opteron cluster with a less favorable network latency to processor speed ratio. We recorded super-linear speedups in several cases due to increased cache efficiency.

Journal of Parallel and Distributed Computing | 2003

Parallel biological sequence comparison using prefix computations

Srinivas Aluru; Natsuhiko Futamura; Kishan G. Mehrotra

We present practical parallel algorithms using prefix computations for various problems that arise in pairwise comparison of biological sequences. We consider both constant and affine gap penalty functions, full-sequence and subsequence matching, and space-saving algorithms. Commonly used sequential algorithms solve the sequence comparison problems in O(mn) time and O(m + n) space, where m and n are the lengths of the sequences being compared. All the algorithms presented in this paper are time optimal with respect to the sequential algorithms and can use O(n/log n) processors, where n is the length of the larger sequence. While optimal parallel algorithms for many of these problems are known, we use a simple framework and demonstrate how these problems can be solved systematically using repeated parallel prefix operations. We also present a space-saving algorithm that uses O(m + n/p) space and runs in optimal time, where p is the number of processors used. We implemented the parallel space-saving algorithm and provide experimental results on an IBM SP-2 and a Pentium cluster.
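
One way to see why prefix computations apply is sketched below for global alignment with a constant gap penalty g. In row i of the dynamic programming table T, let w[j] be the best score that reaches cell (i, j) from the previous row. Then T[i][j] = max(w[j], T[i][j-1] - g), and substituting x[j] = T[i][j] + j*g turns this into x[j] = max(w[j] + j*g, x[j-1]), a prefix maximum. The Python sketch evaluates that prefix maximum sequentially with `itertools.accumulate`; a parallel implementation would replace that single call with a parallel prefix operation. The scoring values and function names are assumptions for illustration, and the code is not taken from the paper.

```python
from itertools import accumulate


def score(a, b, match=1, mismatch=-1):
    """Substitution score (illustrative values, not from the paper)."""
    return match if a == b else mismatch


def next_row(prev_row, a_char, b, gap):
    """Compute one row of the global alignment table from the previous row."""
    n = len(b)
    # w[j]: best score entering cell (i, j) from row i-1 (diagonal or from above).
    w = [prev_row[0] - gap]
    for j in range(1, n + 1):
        w.append(max(prev_row[j - 1] + score(a_char, b[j - 1]), prev_row[j] - gap))
    # x[j] = T[i][j] + j*gap is the running maximum of w[j] + j*gap.
    shifted = [w[j] + j * gap for j in range(n + 1)]
    x = accumulate(shifted, max)  # the prefix computation
    return [xj - j * gap for j, xj in enumerate(x)]


def global_score(a, b, gap=1):
    """Global alignment score of a and b with a constant gap penalty."""
    row = [-j * gap for j in range(len(b) + 1)]  # row 0: gaps only
    for a_char in a:
        row = next_row(row, a_char, b, gap)
    return row[-1]
```

Because max is associative, the running maximum over a row can be evaluated by a standard parallel prefix scheme, which is what makes each row amenable to parallel computation once the previous row is known.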

Genetics | 2006

Nearly Identical Paralogs: Implications for Maize (Zea mays L.) Genome Evolution

Scott J. Emrich; Li Li; Tsui-Jung Wen; Marna D. Yandeau-Nelson; Yan Fu; Ling Guo; Hui-Hsien Chou; Srinivas Aluru; Daniel Ashlock

As an ancient segmental tetraploid, the maize (Zea mays L.) genome contains large numbers of paralogs that are expected to have diverged by a minimum of 10% over time. Nearly identical paralogs (NIPs) are defined as paralogous genes that exhibit ≥98% identity. Sequence analyses of the “gene space” of the maize inbred line B73 genome, coupled with wet lab validation, have revealed that, conservatively, at least ∼1% of maize genes have a NIP, a rate substantially higher than that in Arabidopsis. In most instances, both members of maize NIP pairs are expressed and are therefore at least potentially functional. Of evolutionary significance, members of many NIP families also exhibit differential expression. The finding that some families of maize NIPs are closely linked genetically while others are genetically unlinked is consistent with multiple modes of origin. NIPs provide a mechanism for the maize genome to circumvent the inherent limitation that diploid genomes can carry at most two “alleles” per “locus.” As such, NIPs may have played important roles during the evolution and domestication of maize and may contribute to the success of long-term selection experiments in this important crop species.

IEEE International Conference on High Performance Computing, Data, and Analytics | 1997

Parallel domain decomposition and load balancing using space-filling curves

Srinivas Aluru; Fatih Erdogan Sevilgen

Partitioning techniques based on space filling curves have received much recent attention due to their low running time and good load balance characteristics. The basic idea underlying these methods is to order the multidimensional data according to a space filling curve and partition the resulting one dimensional order. However, space filling curves are defined for points that lie on a uniform grid of a particular resolution. It is typically assumed that the coordinates of the points are representable using a fixed number of bits, and the run times of the algorithms depend upon the number of bits used. We present a simple and efficient technique for ordering arbitrary and dynamic multidimensional data using space filling curves and its application to parallel domain decomposition and load balancing. Our technique is based on a comparison routine that determines the relative position of two points in the order induced by a space filling curve. The comparison routine could then be used in conjunction with any parallel sorting algorithm to effect parallel domain decomposition.
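
The idea of ordering points through a comparison routine, rather than by computing an explicit curve index, can be illustrated with the Z-order (Morton) curve. The Python sketch below is a well-known bit trick for non-negative integer coordinates, offered only as an illustration: the paper's routine addresses arbitrary coordinates, and nothing here is taken from it. The names `less_msb` and `cmp_zorder`, and the convention that dimension 0 is the most significant at each bit level, are assumptions.

```python
from functools import cmp_to_key


def less_msb(x, y):
    """True if the most significant set bit of x is lower than that of y."""
    return x < y and x < (x ^ y)


def cmp_zorder(p, q):
    """Compare two points by their position on a Z-order curve.

    Returns -1, 0 or 1. Coordinates must be non-negative integers. The
    dimension whose coordinates differ in the highest bit decides the order;
    ties between dimensions go to the lower-numbered dimension.
    """
    msd = 0  # dimension holding the most significant differing bit so far
    for dim in range(1, len(p)):
        if less_msb(p[msd] ^ q[msd], p[dim] ^ q[dim]):
            msd = dim
    return (p[msd] > q[msd]) - (p[msd] < q[msd])


def zorder_sort(points):
    """Sort points along the Z-order curve using only pairwise comparisons."""
    return sorted(points, key=cmp_to_key(cmp_zorder))
```

Since the routine never materializes an interleaved key, it places no limit on the number of bits per coordinate, and it can be passed to any comparison-based sort, which is the property the abstract relies on for parallel domain decomposition.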

IEEE/ACM Transactions on Networking | 2005

Scalable, memory efficient, high-speed IP lookup algorithms

Rama Sangireddy; Natsuhiko Futamura; Srinivas Aluru; Arun K. Somani

One of the central issues in router performance is IP address lookup based on longest prefix matching. IP address lookup algorithms can be evaluated on a number of metrics: lookup time, update time, memory usage, and, to a lesser extent, the time to construct the data structure used to support lookups and updates. Many of the existing methods are geared toward optimizing a specific metric, and do not scale well with the ever expanding routing tables and the forthcoming IPv6, where the IP addresses are 128 bits long. In contrast, our effort is directed at simultaneously optimizing multiple metrics and providing solutions that scale to IPv6, with its longer addresses and much larger routing tables. In this paper, we present two IP address lookup schemes: the Elevator-Stairs algorithm and the logW-Elevators algorithm. For a routing table with N prefixes, the Elevator-Stairs algorithm uses optimal O(N) memory, and achieves better lookup and update times than other methods with similar memory requirements. The logW-Elevators algorithm gives O(log W) lookup time, where W is the length of an IP address, while improving upon update time and memory usage. Experimental results using the MAE-West router with 29,487 prefixes show that the Elevator-Stairs algorithm gives an average throughput of 15.7 million lookups per second (Mlps) using 459 KB of memory, and the logW-Elevators algorithm gives an average throughput of 21.41 Mlps with a memory usage of 1259 KB.
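
For readers unfamiliar with longest prefix matching, the following Python sketch shows the problem being solved, using a plain binary trie. It is a baseline illustration only, with O(W) lookup time, and is neither the Elevator-Stairs nor the logW-Elevators algorithm. Prefixes and addresses are written as strings of "0" and "1" for readability; the class name `PrefixTrie` and the "hop" key are assumptions made for this example.

```python
class PrefixTrie:
    """Binary trie mapping bit-string prefixes to next hops."""

    def __init__(self):
        # Each node is a dict with optional children "0" and "1" and an
        # optional "hop" entry when a routing prefix ends at that node.
        self.root = {}

    def insert(self, prefix_bits, next_hop):
        """Add or update a routing entry such as ("1011", "B")."""
        node = self.root
        for bit in prefix_bits:
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, address_bits):
        """Return the next hop of the longest prefix matching the address."""
        node = self.root
        best = node.get("hop")  # the empty prefix acts as a default route
        for bit in address_bits:
            node = node.get(bit)
            if node is None:
                break
            if "hop" in node:
                best = node["hop"]  # a longer matching prefix was found
        return best
```

With the entries "10" mapped to A and "1011" mapped to B, the address 10110000 matches both prefixes and is routed to B, the longer match, while 10010000 matches only "10" and is routed to A.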