Sahely Bhadra
Indian Institute of Science
Publication
Featured research published by Sahely Bhadra.
Mathematical Programming | 2011
Aharon Ben-Tal; Sahely Bhadra; Chiranjib Bhattacharyya; J. Saketha Nath
This paper studies the problem of constructing robust classifiers when the training is plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP) which ensures that the uncertain data points are classified correctly with high probability. Unfortunately such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only the Chebyshev based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform state-of-the-art in many cases.
Nucleic Acids Research | 2009
Shivakumar Keerthikumar; Rajesh Raju; Kumaran Kandasamy; Atsushi Hijikata; Subhashri Ramabadran; Lavanya Balakrishnan; Mukhtar Ahmed; Sandhya Rani; Lakshmi Dhevi N. Selvan; Devi S. Somanathan; Somak Ray; Mitali Bhattacharjee; Sashikanth Gollapudi; Yl Ramachandra; Sahely Bhadra; Chiranjib Bhattacharyya; Kohsuke Imai; Shigeaki Nonoyama; Hirokazu Kanegane; Toshio Miyawaki; Akhilesh Pandey; Osamu Ohara; S. Sujatha Mohan
Availability of a freely accessible, dynamic and integrated database for primary immunodeficiency diseases (PID) is important both for researchers as well as clinicians. To build a PID informational platform and also as a part of action to initiate a network of PID research in Asia, we have constructed a web-based compendium of molecular alterations in PID, named Resource of Asian Primary Immunodeficiency Diseases (RAPID), which is available as a worldwide web resource at http://rapid.rcai.riken.jp/. It hosts information on sequence variations and expression at the mRNA and protein levels of all genes reported to be involved in PID patients. The main objective of this database is to provide detailed information pertaining to genes and proteins involved in primary immunodeficiency diseases along with other relevant information about protein–protein interactions, mouse studies and microarray gene-expression profiles in various organs and cells of the immune system. RAPID also hosts a tool, mutation viewer, to predict deleterious and novel mutations and also to obtain mutation-based 3D structures for PID genes. Thus, information contained in this database should help physicians and other biomedical investigators to further investigate the role of these molecules in PID.
International World Wide Web Conferences | 2011
Sandeepkumar Satpal; Sahely Bhadra; Sundararajan Sellamanickam; Rajeev Rastogi; Prithviraj Sen
In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages and sites. We use Markov Logic Networks (MLNs) to capture both content and structural features in a single unified framework, and this enables us to perform more accurate inference. We show that inference in our information extraction scenario reduces to solving an instance of the maximum weight subgraph problem. We develop specialized procedures for solving the maximum subgraph variants that are far more efficient than previously proposed inference methods for MLNs that solve variants of MAX-SAT. Experiments with real-life datasets demonstrate the effectiveness of our approach.
DNA Research | 2009
Shivakumar Keerthikumar; Sahely Bhadra; Kumaran Kandasamy; Rajesh Raju; Yl Ramachandra; Chiranjib Bhattacharyya; Kohsuke Imai; Osamu Ohara; S. Sujatha Mohan; Akhilesh Pandey
Screening and early identification of primary immunodeficiency disease (PID) genes is a major challenge for physicians. Many resources have catalogued molecular alterations in known PID genes along with their associated clinical and immunological phenotypes. However, these resources do not assist in identifying candidate PID genes. We have recently developed a platform designated Resource of Asian PIDs, which hosts information pertaining to molecular alterations, protein–protein interaction networks, mouse studies and microarray gene expression profiling of all known PID genes. Using this resource as a discovery tool, we describe the development of an algorithm for prediction of candidate PID genes. Using a support vector machine learning approach, we have predicted 1442 candidate PID genes using 69 binary features of 148 known PID genes and 3162 non-PID genes as a training data set. The power of this approach is illustrated by the fact that six of the predicted genes have recently been experimentally confirmed to be PID genes. The remaining genes in this predicted data set represent attractive candidates for testing in patients where the etiology cannot be ascribed to any of the known PID genes.
Knowledge Discovery and Data Mining | 2009
Sahely Bhadra; J. Saketha Nath; Aharon Ben-Tal; Chiranjib Bhattacharyya
This paper presents a Chance-constraint Programming approach for constructing maximum-margin classifiers which are robust to interval-valued uncertainty in training examples. The methodology ensures that uncertain examples are classified correctly with high probability by employing chance-constraints. The main contribution of the paper is to pose the resultant optimization problem as a Second Order Cone Program by using large deviation inequalities, due to Bernstein. Apart from support and mean of the uncertain examples these Bernstein based relaxations make no further assumptions on the underlying uncertainty. Classifiers built using the proposed approach are less conservative, yield higher margins and hence are expected to generalize better than existing methods. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle interval-valued uncertainty than state-of-the-art.
Knowledge Discovery and Data Mining | 2011
Sandeepkumar Satpal; Sahely Bhadra; Sundararajan Sellamanickam; Rajeev Rastogi; Prithviraj Sen
In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages and sites. We use Markov Logic Networks (MLNs) to capture both content and structural features in a single unified framework, and this enables us to perform more accurate inference. MLNs allow us to model a wide range of rich structural features like proximity, precedence, alignment, and contiguity, using first-order clauses. We show that inference in our information extraction scenario reduces to solving an instance of the maximum weight subgraph problem. We develop specialized procedures for solving the maximum subgraph variants that are far more efficient than previously proposed inference methods for MLNs that solve variants of MAX-SAT. Experiments with real-life datasets demonstrate the effectiveness of our MLN-based approach compared to existing state-of-the-art extraction methods.
Algorithms for Molecular Biology | 2009
Sahely Bhadra; Chiranjib Bhattacharyya; Nagasuma Chandra; I. Saira Mian
Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data.
Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the IN SILICO 1, IN SILICO 2 and IN SILICO 3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification.
Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
International Conference on Machine Learning | 2010
Sahely Bhadra; Sourangshu Bhattacharya; Chiranjib Bhattacharyya; Aharon Ben-Tal
Journal of Machine Learning Research | 2012
Aharon Ben-Tal; Sahely Bhadra; Chiranjib Bhattacharyya; Arkadi Nemirovski
Archive | 2012
Aharon Ben-Tal; William Davidson; Sahely Bhadra; Chiranjib Bhattacharyya; Arkadi Nemirovski; Milton Stewart