Chiranjib Bhattacharyya
Indian Institute of Science
Publication
Featured research published by Chiranjib Bhattacharyya.
Neural Computation | 2001
S. Sathiya Keerthi; Shirish Krishnaj Shevade; Chiranjib Bhattacharyya; K. R. K. Murthy
This article points out an important source of inefficiency in Platt's sequential minimal optimization (SMO) algorithm that is caused by the use of a single threshold value. Using clues from the KKT conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO. These modified algorithms perform significantly faster than the original SMO on all benchmark data sets tried.
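The two-threshold idea can be illustrated with a minimal sketch of a dual KKT check. The abstract does not give the formulas, so the following are assumptions: the function names are hypothetical, F[i] is taken to be sum_j alpha[j]*y[j]*K(x_i, x_j) - y[i], and the index sets are the usual split of points by label and by whether alpha sits at 0, at C, or strictly between. Under those assumptions the dual KKT conditions hold exactly when b_low <= b_up.

```python
def two_thresholds(alpha, y, F, C, eps=1e-12):
    """Return (b_up, b_low) for a candidate SVM dual solution.

    Assumed definition: F[i] = sum_j alpha[j]*y[j]*K(x_i, x_j) - y[i].
    b_up  is the minimum of F over points whose KKT condition reads F[i] >= beta.
    b_low is the maximum of F over points whose KKT condition reads F[i] <= beta.
    """
    up, low = [], []
    for a, yi, Fi in zip(alpha, y, F):
        at_zero = a <= eps
        at_c = a >= C - eps
        interior = not at_zero and not at_c
        if interior or (at_zero and yi == 1) or (at_c and yi == -1):
            up.append(Fi)
        if interior or (at_c and yi == 1) or (at_zero and yi == -1):
            low.append(Fi)
    b_up = min(up) if up else float("inf")
    b_low = max(low) if low else float("-inf")
    return b_up, b_low


def is_kkt_optimal(alpha, y, F, C, tol=1e-3):
    """Approximate optimality test: b_low <= b_up + 2*tol."""
    b_up, b_low = two_thresholds(alpha, y, F, C)
    return b_low <= b_up + 2 * tol
```

With this check no single threshold has to be estimated during training: any pair of points with F values on the wrong sides of each other is a violating pair that can be optimized next.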
IEEE Transactions on Neural Networks | 2000
Shirish Krishnaj Shevade; S. Sathiya Keerthi; Chiranjib Bhattacharyya; K. R. K. Murthy
This paper points out an important source of inefficiency in Smola and Schölkopf's sequential minimal optimization (SMO) algorithm for support vector machine (SVM) regression that is caused by the use of a single threshold value. Using clues from the KKT conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression. These modified algorithms perform significantly faster than the original SMO on the datasets tried.
Journal of Machine Learning Research | 2003
Gert R. G. Lanckriet; Laurent El Ghaoui; Chiranjib Bhattacharyya; Michael I. Jordan
When constructing a classifier, the probability of correct classification of future data points should be maximized. We consider a binary classification problem where the mean and covariance matrix of each class are assumed to be known. No further assumptions are made with respect to the class-conditional distributions. Misclassification probabilities are then controlled in a worst-case setting: that is, under all possible choices of class-conditional densities with given mean and covariance matrix, we minimize the worst-case (maximum) probability of misclassification of future data points. For a linear decision boundary, this desideratum is translated in a very direct way into a (convex) second order cone optimization problem, with complexity similar to a support vector machine problem. The minimax problem can be interpreted geometrically as minimizing the maximum of the Mahalanobis distances to the two classes. We address the issue of robustness with respect to estimation errors (in the means and covariances of the classes) via a simple modification of the input data. We also show how to exploit Mercer kernels in this setting to obtain nonlinear decision boundaries, yielding a classifier which proves to be competitive with current methods, including support vector machines. An important feature of this method is that a worst-case bound on the probability of misclassification of future data is always obtained explicitly.
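The explicit worst-case bound mentioned above can be sketched for a fixed linear boundary. This is not the paper's optimization procedure; it only evaluates, for a given hyperplane a.z = b, the one-sided Chebyshev-type bound 1 / (1 + d^2), where d^2 = (a.mu - b)^2 / (a' Sigma a) is the squared Mahalanobis distance from the class mean to the hyperplane. The function name and argument layout are hypothetical, and the covariance is assumed to be positive definite.

```python
def worst_case_error(a, b, mu, sigma, side=+1):
    """Worst-case probability that a point of one class falls on the wrong side.

    The class has mean `mu` and covariance `sigma` (list of rows) and is meant
    to satisfy side * (a.z - b) > 0. The supremum is over all distributions
    with that mean and covariance.
    """
    n = len(a)
    margin = side * (sum(a[i] * mu[i] for i in range(n)) - b)
    if margin <= 0:
        return 1.0  # the mean itself is misclassified, so the bound is vacuous
    variance = sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n))
    d2 = margin ** 2 / variance  # squared Mahalanobis distance to the hyperplane
    return 1.0 / (1.0 + d2)


identity = [[1.0, 0.0], [0.0, 1.0]]
err_x = worst_case_error([1.0, 0.0], 0.0, [2.0, 0.0], identity, side=+1)
err_y = worst_case_error([1.0, 0.0], 0.0, [-2.0, 0.0], identity, side=-1)
worst = max(err_x, err_y)  # bound on misclassification over both classes
```

In this toy setup both class means lie at Mahalanobis distance 2 from the boundary, so each class has a worst-case error of 1 / (1 + 4).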
IEEE Transactions on Neural Networks | 2000
S. Sathiya Keerthi; Shirish Krishnaj Shevade; Chiranjib Bhattacharyya; K. R. K. Murthy
In this paper we give a new fast iterative algorithm for support vector machine (SVM) classifier design. The basic problem treated is one that does not allow classification violations. The problem is converted to a problem of computing the nearest point between two convex polytopes. The suitability of two classical nearest point algorithms, due to Gilbert, and Mitchell et al., is studied. Ideas from both these algorithms are combined and modified to derive our fast algorithm. For problems which require classification violations to be allowed, the violations are quadratically penalized and an idea due to Cortes and Vapnik and Friess is used to convert it to a problem in which there are no classification violations. Comparative computational evaluation of our algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that our algorithm is very competitive.
International Conference on Document Analysis and Recognition | 2007
R Sriraghavendra; K. Karthik; Chiranjib Bhattacharyya
We propose a novel, language-neutral approach for searching online handwritten text using the Fréchet distance. Online handwritten data, which is available as a time series (x,y,t), is treated as representing a parameterized curve in two dimensions, and the problem of searching online handwritten text is posed as a problem of matching two curves in a two-dimensional Euclidean space. The Fréchet distance is a natural measure for matching curves. The main contribution of this paper is the formulation of a variant of the Fréchet distance that can be used for retrieving words even when only a prefix of the word is given as query. Extensive experiments on the UNIPEN dataset, consisting of over 16,000 words written by 7 users, show that our method outperforms the state-of-the-art DTW method. Experiments were also conducted on a multilingual dataset, generated on a PDA, with encouraging results. Our approach can be used to implement useful, exciting features like auto-completion of handwriting in PDAs.
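For readers unfamiliar with curve matching, here is a minimal sketch of the standard discrete Fréchet distance between two sampled pen trajectories, computed by dynamic programming. It is not the prefix-matching variant the paper proposes, whose details the abstract does not give; the function name and the use of (x, y) tuples without the time coordinate are assumptions made for illustration.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines given as lists of (x, y).

    ca[i][j] holds the smallest achievable "leash length" for walking both
    curves in order from their starts to p[i] and q[j].
    """
    if not p or not q:
        raise ValueError("both curves need at least one point")
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


stroke_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
stroke_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
distance = discrete_frechet(stroke_a, stroke_b)
```

Unlike a pointwise average, the measure respects the order in which the points were written, which is why it suits time-ordered handwriting data.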
Mathematical Programming | 2011
Aharon Ben-Tal; Sahely Bhadra; Chiranjib Bhattacharyya; J. Saketha Nath
This paper studies the problem of constructing robust classifiers when the training data are plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP) which ensures that the uncertain data points are classified correctly with high probability. Unfortunately, such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only the Chebyshev-based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform the state of the art in many cases.
Nucleic Acids Research | 2009
Shivakumar Keerthikumar; Rajesh Raju; Kumaran Kandasamy; Atsushi Hijikata; Subhashri Ramabadran; Lavanya Balakrishnan; Mukhtar Ahmed; Sandhya Rani; Lakshmi Dhevi N. Selvan; Devi S. Somanathan; Somak Ray; Mitali Bhattacharjee; Sashikanth Gollapudi; Yl Ramachandra; Sahely Bhadra; Chiranjib Bhattacharyya; Kohsuke Imai; Shigeaki Nonoyama; Hirokazu Kanegane; Toshio Miyawaki; Akhilesh Pandey; Osamu Ohara; S. Sujatha Mohan
Availability of a freely accessible, dynamic and integrated database for primary immunodeficiency diseases (PID) is important for both researchers and clinicians. To build a PID informational platform, and as part of an effort to initiate a network of PID research in Asia, we have constructed a web-based compendium of molecular alterations in PID, named Resource of Asian Primary Immunodeficiency Diseases (RAPID), which is available as a worldwide web resource at http://rapid.rcai.riken.jp/. It hosts information on sequence variations and expression at the mRNA and protein levels of all genes reported to be involved in PID patients. The main objective of this database is to provide detailed information pertaining to genes and proteins involved in primary immunodeficiency diseases along with other relevant information about protein–protein interactions, mouse studies and microarray gene-expression profiles in various organs and cells of the immune system. RAPID also hosts a tool, mutation viewer, to predict deleterious and novel mutations and also to obtain mutation-based 3D structures for PID genes. Thus, information contained in this database should help physicians and other biomedical investigators to further investigate the role of these molecules in PID.
Knowledge Discovery and Data Mining | 2006
J. Saketha Nath; Chiranjib Bhattacharyya; M. N. Murty
This paper presents a novel Second Order Cone Programming (SOCP) formulation for large scale binary classification tasks. Assuming that the class conditional densities are mixture distributions, where each component of the mixture has a spherical covariance, the second order statistics of the components can be estimated efficiently using clustering algorithms like BIRCH. For each cluster, the second order moments are used to derive a second order cone constraint via a Chebyshev-Cantelli inequality. This constraint ensures that any data point in the cluster is classified correctly with a high probability. This leads to a large margin SOCP formulation whose size depends on the number of clusters rather than the number of training data points. Hence, the proposed formulation scales well for large datasets when compared to the state-of-the-art classifiers, Support Vector Machines (SVMs). Experiments on real world and synthetic datasets show that the proposed algorithm outperforms SVM solvers in terms of training time and achieves similar accuracies.
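The per-cluster constraint can be sketched as follows. By the Chebyshev-Cantelli (one-sided Chebyshev) inequality, a point drawn from any distribution with mean mu and spherical covariance sigma^2 I lands on the correct side of the hyperplane w.x + b = 0 with probability at least eta whenever label * (w.mu + b) >= kappa * sigma * ||w||, with kappa = sqrt(eta / (1 - eta)). The abstract does not spell out the paper's exact formulation (for example, any margin constant), so the function names and this particular form are assumptions.

```python
import math


def cantelli_kappa(eta):
    """Multiplier kappa = sqrt(eta / (1 - eta)) for a confidence level 0 < eta < 1."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return math.sqrt(eta / (1.0 - eta))


def cluster_constraint_holds(w, b, mu, sigma, label, eta):
    """Check the second order cone constraint for one cluster.

    mu    : cluster mean
    sigma : standard deviation of the assumed spherical covariance sigma^2 * I
    label : +1 or -1, the class of the cluster
    """
    margin = label * (sum(wi * mi for wi, mi in zip(w, mu)) + b)
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return margin >= cantelli_kappa(eta) * sigma * norm_w


# One constraint per cluster, so the problem size follows the cluster count.
far_cluster_ok = cluster_constraint_holds([1.0, 0.0], 0.0, [2.0, 0.0], 1.0, +1, 0.5)
near_cluster_ok = cluster_constraint_holds([1.0, 0.0], 0.0, [0.5, 0.0], 1.0, +1, 0.5)
```

Because each cluster contributes a single constraint of this form, a solver sees as many cone constraints as there are clusters, regardless of how many raw points each cluster summarizes.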
DNA Research | 2009
Shivakumar Keerthikumar; Sahely Bhadra; Kumaran Kandasamy; Rajesh Raju; Yl Ramachandra; Chiranjib Bhattacharyya; Kohsuke Imai; Osamu Ohara; S. Sujatha Mohan; Akhilesh Pandey
Screening and early identification of primary immunodeficiency disease (PID) genes is a major challenge for physicians. Many resources have catalogued molecular alterations in known PID genes along with their associated clinical and immunological phenotypes. However, these resources do not assist in identifying candidate PID genes. We have recently developed a platform designated Resource of Asian Primary Immunodeficiency Diseases, which hosts information pertaining to molecular alterations, protein–protein interaction networks, mouse studies and microarray gene expression profiling of all known PID genes. Using this resource as a discovery tool, we describe the development of an algorithm for prediction of candidate PID genes. Using a support vector machine learning approach, we have predicted 1442 candidate PID genes using 69 binary features of 148 known PID genes and 3162 non-PID genes as a training data set. The power of this approach is illustrated by the fact that six of the predicted genes have recently been experimentally confirmed to be PID genes. The remaining genes in this predicted data set represent attractive candidates for testing in patients where the etiology cannot be ascribed to any of the known PID genes.
Conference on Information and Knowledge Management | 2011
Dyut Kumar Sil; Srinivasan H. Sengamedu; Chiranjib Bhattacharyya
Comments constitute an important part of Web 2.0. In this paper, we consider comments on news articles. To simplify the task of relating the comment content to the article content the comments are about, we propose the idea of showing comments alongside article segments and explore automatic mapping of comments to article segments. This task is challenging because of the vocabulary mismatch between the articles and the comments. We present supervised and unsupervised techniques for aligning comments to the segments of the article the comments are about. More specifically, we provide a novel formulation of the supervised alignment problem using the framework of structured classification. Our experimental results show that the structured classification model performs better than unsupervised matching and the binary classification model.