Sudha Balla
University of Connecticut
Publication
Featured research published by Sudha Balla.
Nature Structural & Molecular Biology | 2008
Katsutomo Okamura; Sudha Balla; Raquel Martin; Na Liu; Eric C. Lai
Cis-natural antisense transcripts (cis-NATs) have been speculated to be substrates for endogenous RNA interference (RNAi), but little experimental evidence for such a pathway in animals has been reported. Analysis of massive Drosophila melanogaster small RNA data sets now reveals two mechanisms that yield endogenous small interfering RNAs (siRNAs) via bidirectional transcription. First, >100 cis-NATs with overlapping 3′ exons generate 21-nt and, based on previously published small RNA data, Dicer-2 (Dcr-2)–dependent, 3′-end modified siRNAs. The processing of cis-NATs by RNA interference (RNAi) seems to be actively restricted, and the selected loci are enriched for nucleic acid–based functions and include Argonaute-2 (AGO2) itself. Second, we report that extended intervals of the thickveins and klarsicht genes generate exceptionally abundant siRNAs from both strands. These siRNA clusters derive from atypical cis-NAT arrangements involving introns and 5′ or internal exons, but their biogenesis is similarly Dcr-2– and AGO2-dependent. These newly recognized siRNA pathways broaden the scope of regulatory networks mediated by small RNAs.
Nature Methods | 2006
Sudha Balla; Vishal Thapar; Snigdha Verma; ThaiBinh Luong; Tanaz Faghri; Chun-Hsi Huang; Sanguthevar Rajasekaran; Jacob J. del Campo; Jessica H Shinn; William A. Mohler; Mark W. Maciejewski; Michael R. Gryk; Bryan Piccirillo; Stanley R Schiller; Martin R. Schiller
In addition to large domains, many short motifs mediate functional post-translational modification of proteins as well as protein-protein interactions and protein trafficking functions. We have constructed a motif database comprising 312 unique motifs and a web-based tool, Minimotif Miner (MnM), for identifying motifs in proteins. Functional motifs predicted by MnM can be ranked by several approaches, and we validated these scores by analyzing thousands of confirmed examples and by confirming prediction of previously unidentified 14-3-3 motifs in EFF-1.
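The kind of motif lookup described above can be illustrated with a minimal sketch. It assumes motifs are expressed as regular expressions over one-letter amino acid codes; the table `TOY_MOTIFS`, its names and patterns, and the function `find_motifs` are hypothetical placeholders, not entries from the MnM database or its actual search method.

```python
import re

# Hypothetical motif table: names and patterns are illustrative only,
# not taken from the Minimotif Miner database.
TOY_MOTIFS = {
    "toy_phospho_site": r"R..[ST].P",
    "toy_proline_rich": r"P..P",
}


def find_motifs(protein, motifs=TOY_MOTIFS):
    """Return (motif name, 1-based start, matched text) for every hit.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?=({pattern}))", protein):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    for hit in find_motifs("MKRAASAPLPPTPQ"):
        print(hit)
```

A real tool would also need a scoring step to rank hits, since short patterns match frequently by chance; the sketch only covers the matching stage.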
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2007
Jaime Davila; Sudha Balla; Sanguthevar Rajasekaran
We consider the planted (l, d) motif search problem, which consists of finding a substring of length l that occurs in a set of input sequences {s1, ..., sn} with up to d errors, a problem that arises from the need to find transcription factor-binding sites in genomic information. We propose a sequence of practical algorithms that build on the ideas considered in PMS1. These algorithms are exact, have modest space requirements, and are able to tackle challenging instances with larger d, taking less time on the instances reported solved by exact algorithms. In particular, one of the proposed algorithms, PMSprune, is able to solve challenging instances, such as (17, 6) and (19, 7), which were not previously reported as solved in the literature.
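To make the problem statement concrete, the following is a minimal exact sketch that assumes "errors" means Hamming-distance mismatches over the DNA alphabet. It enumerates the d-neighborhood of every l-mer in each sequence and intersects the results across sequences. This is a naive illustration, not PMSprune, and it is only practical for small l and d; the function names are hypothetical.

```python
from itertools import combinations, product

ALPHABET = "ACGT"


def neighbors(lmer, d):
    """All strings over ALPHABET within Hamming distance d of lmer."""
    result = set()
    for k in range(d + 1):
        for positions in combinations(range(len(lmer)), k):
            for letters in product(ALPHABET, repeat=k):
                candidate = list(lmer)
                for pos, letter in zip(positions, letters):
                    candidate[pos] = letter
                result.add("".join(candidate))
    return result


def planted_motifs(sequences, l, d):
    """Every l-mer that lies within distance d of some l-mer in each sequence."""
    common = None
    for seq in sequences:
        seq_neighbors = set()
        for i in range(len(seq) - l + 1):
            seq_neighbors |= neighbors(seq[i:i + l], d)
        # Keep only candidates that every sequence seen so far supports.
        common = seq_neighbors if common is None else common & seq_neighbors
    return common or set()


if __name__ == "__main__":
    seqs = ["TTACGTATT", "GGACCTAGG", "CCAAGTACC"]
    print(sorted(planted_motifs(seqs, l=5, d=1)))
```

The neighborhood of a single l-mer has size on the order of C(l, d) * 3^d, which is why instances such as (17, 6) call for the pruning techniques the abstract describes.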
Journal of Computational Biology | 2005
Sanguthevar Rajasekaran; Sudha Balla; Chun-Hsi Huang
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the literature that address this challenge. Many of these algorithms fall under the category of heuristic algorithms. In this paper we present algorithms for the planted (l, d)-motif problem that always find the correct answer(s). Our algorithms are very simple and are based on some ideas that are fundamentally different from the ones employed in the literature. We believe that the techniques we introduce in this paper will find independent applications.
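A planted (l, d) instance of the kind posed as a challenge can be generated with a short sketch like the one below. It assumes a uniform random DNA background and exactly d substitutions per planted copy; the function name `plant_instance` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def plant_instance(n, length, l, d, seed=0):
    """Build n random DNA sequences, each hiding one mutated copy of a motif.

    Returns the motif and the list of sequences. Each planted copy differs
    from the motif in exactly d positions.
    """
    rng = random.Random(seed)
    motif = "".join(rng.choice("ACGT") for _ in range(l))
    sequences = []
    for _ in range(n):
        background = [rng.choice("ACGT") for _ in range(length)]
        variant = list(motif)
        for pos in rng.sample(range(l), d):
            # Substitute with a different nucleotide so the mutation is real.
            variant[pos] = rng.choice([c for c in "ACGT" if c != variant[pos]])
        start = rng.randrange(length - l + 1)
        background[start:start + l] = variant
        sequences.append("".join(background))
    return motif, sequences


if __name__ == "__main__":
    motif, seqs = plant_instance(n=5, length=50, l=8, d=2, seed=1)
    print(motif)
    for s in seqs:
        print(s)
```

An exact algorithm in the sense of the abstract must recover the planted motif from the sequences alone, every time, rather than with some probability.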
Nucleic Acids Research | 2009
Sanguthevar Rajasekaran; Sudha Balla; Patrick R. Gradie; Michael R. Gryk; Krishna Kadaveru; Vamsi Kundeti; Mark W. Maciejewski; Tian Mi; Nicholas Rubino; Jay Vyas; Martin R. Schiller
Minimotif Miner (MnM) consists of a minimotif database and a web-based application that enables prediction of motif-based functions in user-supplied protein queries. We have revised MnM by expanding the database more than 10-fold to approximately 5000 motifs and standardized the motif function definitions. The web-application user interface has been redeveloped with new features including improved navigation, screencast-driven help, support for alias names and expanded SNP analysis. A sample analysis of prion shows how MnM 2 can be used. Weblink: http://mnm.engr.uconn.edu, weblink for version 1 is http://sms.engr.uconn.edu.
Asia-Pacific Bioinformatics Conference | 2005
Sanguthevar Rajasekaran; Sudha Balla; Chun-Hsi Huang
The problem of identifying meaningful patterns (i.e., motifs) from biological data has been studied extensively due to its paramount importance. Three versions of this problem have been identified in the literature. One of these three problems is the planted (l, d)-motif problem. Several instances of this problem have been posed as a challenge. Numerous algorithms have been proposed in the literature that address this challenge. Many of these algorithms fall under the category of approximation algorithms. In this paper we present algorithms for the planted (l, d)-motif problem that always find the correct answer(s). Our algorithms are very simple and are based on some ideas that are fundamentally different from the ones employed in the literature. We believe that the techniques we introduce in this paper will find independent applications. This research has been supported in part by the NSF Grants CCR-9912395 and ITR-0326155.
International Conference on Computational Science | 2006
Jaime Davila; Sudha Balla; Sanguthevar Rajasekaran
We consider the (l,d) Planted Motif Search Problem, a problem that arises from the need to find transcription factor-binding sites in genomic information. We propose the algorithms PMSi and PMSP, which are based on ideas considered in PMS1 [10]. These algorithms are exact, make use of less space than the known exact algorithms such as PMS, and are able to tackle instances with large values of d. In particular, algorithm PMSP is able to solve the challenge instance (17,6), which has not been reported solved before in the literature.
Journal of Clinical Monitoring and Computing | 2005
Sanguthevar Rajasekaran; Sudha Balla; Chun-Hsi Huang; Vishal Thapar; Michael R. Gryk; Mark W. Maciejewski; Martin R. Schiller
Objective. The human genome project has resulted in the generation of voluminous biological data. Novel computational techniques are called for to extract useful information from this data. One such technique is that of finding patterns that are repeated over many sequences (and possibly over many species). In this paper we study the problem of identifying meaningful patterns (i.e., motifs) from biological data, the motif search problem. Methods. The general version of the motif search problem is NP-hard. Numerous algorithms have been proposed in the literature to solve this problem. Many of these algorithms fall under the category of heuristics. We concentrate on exact algorithms in this paper. In particular, we concentrate on two different versions of the motif search problem and offer exact algorithms for them. Results. In this paper we present algorithms for two versions of the motif search problem. All of our algorithms are elegant and use only such simple data structures as arrays. For the first version of the problem described as Problem 1 in the paper, we present a simple sorting based algorithm, SMS (Simple Motif Search). This algorithm has been coded and experimental results have been obtained. For the second version of the problem (described in the paper as Problem 2), we present two different algorithms – a deterministic algorithm (called DMS) and a randomized algorithm (Monte Carlo algorithm). We also show how these algorithms can be parallelized. Conclusions. All the algorithms proposed in this paper are improvements over existing algorithms for these versions of motif search in biological sequence data. The algorithms presented have the potential of performing well in practice.
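The general idea of a sorting-based search that uses only arrays can be illustrated with the simplified sketch below. It is not the SMS algorithm from the paper: it assumes exact matches only (no wildcards or mismatches) and simply reports every l-mer that appears in at least a given number of input sequences. The function name and parameters are hypothetical.

```python
def lmers_in_at_least(sequences, l, min_sequences):
    """Sort all (l-mer, sequence index) pairs, then scan runs of equal l-mers.

    Returns a list of (l-mer, number of distinct sequences containing it)
    for every l-mer found in at least min_sequences sequences.
    """
    pairs = [
        (seq[i:i + l], idx)
        for idx, seq in enumerate(sequences)
        for i in range(len(seq) - l + 1)
    ]
    pairs.sort()  # equal l-mers become adjacent, ordered by sequence index

    result = []
    i = 0
    while i < len(pairs):
        j = i
        distinct = 0
        last_idx = None
        # Walk over the run of identical l-mers, counting distinct sequences.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] != last_idx:
                distinct += 1
                last_idx = pairs[j][1]
            j += 1
        if distinct >= min_sequences:
            result.append((pairs[i][0], distinct))
        i = j
    return result


if __name__ == "__main__":
    print(lmers_in_at_least(["ACGTAC", "TTACGT", "GGGACG"], l=3, min_sequences=2))
```

Because the work is a single sort followed by a linear scan, the approach needs no data structure beyond a flat array of pairs.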
Asia-Pacific Bioinformatics Conference | 2005
Sanguthevar Rajasekaran; Sudha Balla; Chun-Hsi Huang; Vishal Thapar; Michael R. Gryk; Mark W. Maciejewski; Martin R. Schiller
In this paper we study the problem of identifying meaningful patterns (i.e., motifs) from biological data. The general version of this problem is NP-hard. Numerous algorithms have been proposed in the literature to solve this problem. Many of these algorithms fall under the category of approximation algorithms. We concentrate on exact algorithms in this paper. In particular, we concentrate on two different versions of the motif search problem and offer exact algorithms for them. The proposed algorithms perform better than some of the best-known algorithms. This research was supported in part by the NSF Grants CCR-9912395 and ITR-0326155.
IEEE Transactions on Nanobioscience | 2007
Sudha Balla; Sanguthevar Rajasekaran
Selecting degenerate primers for multiplex polymerase chain reaction (MP-PCR) experiments, called the degenerate primer design problem (DPDP), is an important problem in computational molecular biology and has drawn the attention of numerous researchers in the recent past. Several variants of DPDP were formulated by Linhart and Shamir and proven to be NP-complete. A number of algorithms have been proposed for one such variant, namely, the maximum coverage degenerate primer design problem (MC-DPDP). In this paper, we consider another important variant called the minimum degeneracy degenerate primer design with errors problem (MD-DPDEP), propose an algorithm to design a degenerate primer of minimum degeneracy for a given set of DNA sequences and show experimental results of its performance on random and real biological datasets. Our algorithm combines methodologies in motif discovery and an iterative technique to design the primer.
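The two quantities involved, degeneracy and coverage with errors, can be illustrated with a small sketch. It assumes primers are written in standard IUPAC nucleotide codes, that degeneracy is the product of the number of nucleotides allowed at each position, and that an "error" is a position where the sequence base is not allowed by the primer symbol. The function names are hypothetical, and the sketch only evaluates a given primer; it does not implement the design algorithm from the paper.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct plain DNA strings the degenerate primer represents."""
    result = 1
    for symbol in primer:
        result *= len(IUPAC[symbol])
    return result


def mismatches(primer, site):
    """Count positions where the site's base is not allowed by the primer."""
    return sum(base not in IUPAC[symbol] for symbol, base in zip(primer, site))


def covers(primer, sequence, max_errors):
    """True if some window of the sequence matches with at most max_errors."""
    l = len(primer)
    return any(
        mismatches(primer, sequence[i:i + l]) <= max_errors
        for i in range(len(sequence) - l + 1)
    )


if __name__ == "__main__":
    print(degeneracy("ARNT"))
    print(covers("ARNT", "GGAGCTCC", max_errors=0))
```

Under these assumptions, a minimum-degeneracy design would look for the primer with the smallest `degeneracy` value for which `covers` holds on every input sequence.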