J. Saketha Nath
Indian Institute of Technology Bombay
Publication
Featured research published by J. Saketha Nath.
Mathematical Programming | 2011
Aharon Ben-Tal; Sahely Bhadra; Chiranjib Bhattacharyya; J. Saketha Nath
This paper studies the problem of constructing robust classifiers when the training data are plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP), which ensures that the uncertain data points are classified correctly with high probability. Unfortunately, such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only Chebyshev-based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform the state of the art in many cases.
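As a rough illustration of the kind of constraint the abstract describes, a chance-constrained maximum-margin problem can be written as below. The symbols ($w$, $b$, $\xi_i$, $\epsilon$, $C$, $\mu_i$, $\Sigma_i$) are assumptions introduced here for illustration and are not taken from the abstract; the second line shows the standard Chebyshev-type (Cantelli) relaxation that the abstract contrasts with, not the Bernstein relaxation proposed in the paper.

```latex
% Chance-constrained soft-margin classifier (illustrative notation):
% X_i is the uncertain i-th training point, y_i \in \{-1, +1\} its label.
\min_{w,\,b,\,\xi \ge 0} \; \tfrac{1}{2}\|w\|_2^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad
\Pr\!\left[\, y_i\,(w^\top X_i - b) \ge 1 - \xi_i \,\right] \ge 1 - \epsilon
\quad \forall i .

% Chebyshev-type sufficient condition using only the mean \mu_i and
% covariance \Sigma_i of X_i; it is a second order cone constraint in (w, b, \xi_i):
y_i\,(w^\top \mu_i - b) \;\ge\; 1 - \xi_i
  + \sqrt{\tfrac{1-\epsilon}{\epsilon}}\;\sqrt{w^\top \Sigma_i\, w} .
```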
knowledge discovery and data mining | 2006
J. Saketha Nath; Chiranjib Bhattacharyya; M. N. Murty
This paper presents a novel Second Order Cone Programming (SOCP) formulation for large scale binary classification tasks. Assuming that the class conditional densities are mixture distributions, where each component of the mixture has a spherical covariance, the second order statistics of the components can be estimated efficiently using clustering algorithms like BIRCH. For each cluster, the second order moments are used to derive a second order cone constraint via a Chebyshev-Cantelli inequality. This constraint ensures that any data point in the cluster is classified correctly with a high probability. This leads to a large margin SOCP formulation whose size depends on the number of clusters rather than the number of training data points. Hence, the proposed formulation scales well for large datasets when compared to the state-of-the-art classifiers, Support Vector Machines (SVMs). Experiments on real world and synthetic datasets show that the proposed algorithm outperforms SVM solvers in terms of training time and achieves similar accuracies.
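The per-cluster constraint described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a cluster is assumed to have mean `mu` and spherical covariance `sigma**2 * I`, the classifier is `sign(w·x - b)`, and `eta` is the desired probability of correct classification. The function names are hypothetical. By the Chebyshev-Cantelli inequality, the condition `y * (w·mu - b) >= 1 + kappa * sigma * ||w||` with `kappa = sqrt(eta / (1 - eta))` is sufficient for a point drawn from the cluster to satisfy the margin with probability at least `eta`.

```python
import math


def kappa(eta):
    """Cantelli multiplier sqrt(eta / (1 - eta)) for a target probability eta."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    return math.sqrt(eta / (1.0 - eta))


def cluster_constraint_slack(w, b, mu, sigma, y, eta):
    """Slack of the second order cone constraint for one spherical cluster.

    Returns y * (w.mu - b) - 1 - kappa(eta) * sigma * ||w||.
    A non-negative value means the constraint holds, so (by the
    Chebyshev-Cantelli inequality) a point from the cluster satisfies
    y * (w.x - b) >= 1 with probability at least eta.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    mean_margin = y * (sum(wi * mi for wi, mi in zip(w, mu)) - b)
    return mean_margin - 1.0 - kappa(eta) * sigma * norm_w


if __name__ == "__main__":
    # Hypothetical cluster: mean (1, 1), standard deviation 0.2, positive label.
    slack = cluster_constraint_slack(
        w=(3.0, 4.0), b=0.0, mu=(1.0, 1.0), sigma=0.2, y=1, eta=0.5
    )
    print("constraint satisfied:", slack >= 0.0)
```

Because there is one such constraint per cluster rather than per training point, the size of the resulting program grows with the number of clusters, which is the scaling property the abstract highlights.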
Pattern Recognition | 2006
J. Saketha Nath; Shirish Krishnaj Shevade
Support vector clustering involves three steps: solving an optimization problem, identifying clusters, and tuning hyper-parameters. In this paper, we introduce a pre-processing step that eliminates data points from the training data that are not crucial for clustering. Pre-processing is efficiently implemented using the R*-tree data structure. Experiments on real-world and synthetic datasets show that pre-processing drastically decreases the run-time of the clustering algorithm. In many cases, it also reduces the number of support vectors. Further, we suggest an improvement to the cluster identification step.
international conference on machine learning | 2007
Rashmin Babaria; J. Saketha Nath; S Krishnan; Sivaramakrishnan K R; Chiranjib Bhattacharyya; M. N. Murty
In this paper we propose a novel, scalable, clustering-based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more efficiently than general-purpose solvers. Another main contribution of the paper is to pose the problem of focused crawling as a large-scale Ordinal Regression problem and solve it using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing the problem of focused crawling as an Ordinal Regression problem avoids the need for a negative class and topic hierarchy, which are the main drawbacks of the existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR. Experiments also show that the proposed focused crawler outperforms the state of the art.
knowledge discovery and data mining | 2009
Sahely Bhadra; J. Saketha Nath; Aharon Ben-Tal; Chiranjib Bhattacharyya
This paper presents a Chance-Constrained Programming approach for constructing maximum-margin classifiers which are robust to interval-valued uncertainty in training examples. The methodology ensures that uncertain examples are classified correctly with high probability by employing chance constraints. The main contribution of the paper is to pose the resultant optimization problem as a Second Order Cone Program by using large deviation inequalities due to Bernstein. Apart from the support and mean of the uncertain examples, these Bernstein-based relaxations make no further assumptions on the underlying uncertainty. Classifiers built using the proposed approach are less conservative, yield higher margins, and hence are expected to generalize better than existing methods. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle interval-valued uncertainty than the state of the art.
knowledge discovery and data mining | 2016
Arun Shankar Iyer; J. Saketha Nath; Sunita Sarawagi
In this paper we present learning models for the class ratio estimation problem, which takes as input an unlabeled set of instances and predicts the proportions of instances in the set belonging to the different classes. This problem has applications in social and commercial data analysis. Existing models for class-ratio estimation, however, require instance-level supervision, whereas in domains like politics and demography, set-level supervision is more common. We present a new method for directly estimating class ratios using set-level supervision. Another serious limitation in applying these techniques to sensitive domains like health is data privacy. We propose a novel label privacy-preserving mechanism that is well-suited for supervised class ratio estimation and has guarantees for achieving efficient differential privacy, provided the per-class counts are large enough. We derive learning bounds for the estimation with and without privacy constraints, which lead to important insights for the data publisher. Extensive empirical evaluation shows that our model is more accurate than existing methods and that the proposed privacy mechanism and learning model are well-suited for each other.
siam international conference on data mining | 2007
J. Saketha Nath; Chiranjib Bhattacharyya
international conference on machine learning | 2014
Pratik Jawanpuria; Manik Varma; J. Saketha Nath
international conference on machine learning | 2014
Arun Shankar Iyer; J. Saketha Nath; Sunita Sarawagi
meeting of the association for computational linguistics | 2013
Ankit Ramteke; Akshat Malu; Pushpak Bhattacharyya; J. Saketha Nath