Krishnan Pillaipakkamnatt
Hofstra University
Publications
Featured research published by Krishnan Pillaipakkamnatt.
Journal of the ACM | 1996
Lisa Hellerstein; Krishnan Pillaipakkamnatt; Vijay Raghavan; Dawn Wilkins
We investigate the query complexity of exact learning in the membership and (proper) equivalence query model. We give a complete characterization of concept classes that are learnable with a polynomial number of polynomial-sized queries in this model. We give applications of this characterization, including results on learning a natural subclass of DNF formulas, and on learning with membership queries alone. Query complexity has previously been used to prove lower bounds on the time complexity of exact learning. We show a new relationship between query complexity and time complexity in exact learning: if any “honest” class is exactly and properly learnable with polynomial query complexity, but not learnable in polynomial time, then P ≠ NP. In particular, we show that an honest class is exactly polynomial-query learnable if and only if it is learnable using an oracle for Γ₄ᵖ.
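As a toy illustration of the membership-query model the abstract refers to (not the paper's characterization), a hidden monotone conjunction over n variables can be exactly identified with n membership queries; the function names below are illustrative, not from the paper:

```python
def learn_monotone_conjunction(n, membership_query):
    """Return the set of variable indices in a hidden monotone conjunction.

    membership_query(x) -> bool answers whether the 0/1 assignment x
    satisfies the hidden conjunction.  Flipping one bit of the all-ones
    assignment changes the label exactly when that variable is relevant,
    so n queries suffice.
    """
    relevant = set()
    all_ones = [1] * n
    for i in range(n):
        probe = all_ones.copy()
        probe[i] = 0  # zero out one variable and ask the oracle
        if not membership_query(tuple(probe)):
            relevant.add(i)
    return relevant

# Hidden target: x0 AND x2
oracle = lambda x: x[0] == 1 and x[2] == 1
print(learn_monotone_conjunction(4, oracle))  # {0, 2}
```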
international conference on data mining | 2009
Geetha Jagannathan; Krishnan Pillaipakkamnatt; Rebecca N. Wright
In this paper, we study the problem of constructing private classifiers using decision trees, within the framework of differential privacy. We first construct privacy-preserving ID3 decision trees using differentially private sum queries. Our experiments show that for many data sets a reasonable privacy guarantee can only be obtained via this method at a steep cost of accuracy in predictions. We then present a differentially private decision tree ensemble algorithm using the random decision tree approach. We demonstrate experimentally that our approach yields good prediction accuracy even when the size of the datasets is small. We also present a differentially private algorithm for the situation in which new data is periodically appended to an existing database. Our experiments show that our differentially private random decision tree classifier handles data updates in a way that maintains the same level of privacy guarantee.
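Differentially private sum and count queries of the kind mentioned above are commonly implemented with the Laplace mechanism. A minimal single-query sketch, assuming a count query of sensitivity 1 (the name `dp_count` and its interface are ours, not the paper's):

```python
import math
import random

def dp_count(records, predicate, epsilon):
    """Epsilon-differentially-private count via the Laplace mechanism.

    A count query has sensitivity 1 (adding or removing one record changes
    the answer by at most 1), so adding Laplace noise of scale 1/epsilon
    yields epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for r in records if predicate(r))
    # Draw Laplace(0, 1/epsilon) by inverse-CDF sampling.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    noise = -(1.0 / epsilon) * sign * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

In an ID3-style construction, noisy counts of (attribute value, class label) pairs like these would stand in for the exact counts that drive the split selection, with the privacy budget divided across the queries.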
conference on learning theory | 1994
Krishnan Pillaipakkamnatt; Vijay Raghavan
Bshouty, Goldman, Hancock, and Matar have shown that up to log n-term DNF formulas can be properly learned in the exact model with equivalence and membership queries. Given standard complexity-theoretic assumptions, we show that this positive result for proper learning cannot be significantly improved in the exact model or the PAC model extended to allow membership queries. Our negative results are derived from two general techniques for proving such results in the exact model and the extended PAC model. As a further application of these techniques, we consider read-thrice DNF formulas. Here we improve on Aizenstein, Hellerstein, and Pitt's negative result for proper learning in the exact model in two ways. First, we show that their assumption of NP ≠ co-NP can be replaced with the weaker assumption of P ≠ NP. Second, we show that read-thrice DNF formulas are not properly learnable in the extended PAC model, assuming RP ≠ NP.
Information & Computation | 1995
Krishnan Pillaipakkamnatt; Vijay Raghavan
We show that read-twice DNF formulas (Boolean formulas in disjunctive normal form in which each variable appears at most twice) are exactly and properly learnable in polynomial time. Our algorithm uses membership queries and proper equivalence queries and is based on a simple, new characterization of minimal read-twice DNF formulas. The algorithm improves on earlier results of Hancock and of Aizenstein and Pitt, which showed that read-twice DNF formulas are learnable using more powerful equivalence queries, i.e., ones whose hypotheses may be arbitrary DNF formulas. We also improve on the time complexity of these earlier algorithms. Other results that may be of independent interest outside of learning follow directly from this paper. Specifically, we show that read-twice DNF formulas can be tested for equivalence in polynomial time and that the smallest read-twice formula equivalent to a given one can be found in polynomial time.
international conference on artificial intelligence and law | 1997
Dawn Wilkins; Krishnan Pillaipakkamnatt
One of the difficult tasks in the court system is the scheduling of the entities involved at the various stages of the criminal justice system. These include judges, jurors, witnesses, defendants, attorneys, and courtrooms. In this paper we examine the feasibility of using machine learning techniques for the task of predicting the elapsed time between the arrest of an offender and the final disposition of his or her case. Accurate prediction of time to case disposition will aid in the resolution of conflicts that arise in the scheduling of the above entities. Using a pre-existing dataset called Offender Based Transaction Statistics (1990) and two well-known learning algorithms, we show that there is scope for the use of such techniques.
international conference on data mining | 2013
Geetha Jagannathan; Claire Monteleoni; Krishnan Pillaipakkamnatt
Motivated by the semi-supervised model in the data mining literature, we propose a model for differentially-private learning in which private data is augmented by public data to achieve better accuracy. Our main result is a differentially private classifier with significantly improved accuracy compared to previous work. We experimentally demonstrate that such a classifier produces good prediction accuracies even in those situations where the amount of private data is fairly limited. This expands the range of useful applications of differential privacy since typical results in the differential privacy model require large private data sets to obtain good accuracy.
international conference on data mining | 2007
Geetha Jagannathan; Krishnan Pillaipakkamnatt; Daryl Umano
We present a distributed privacy-preserving protocol for the clustering of data streams. The participants of the secure protocol learn cluster centers only on completion of the protocol. Our protocol does not reveal intermediate candidate cluster centers. It is also efficient in terms of communication. The protocol is based on a new memory-efficient clustering algorithm for data streams. Our experiments show that, on average, the accuracy of this algorithm is better than that of the well known k-means algorithm, and compares well with BIRCH, but has far smaller memory requirements.
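The protocol itself is secure and distributed, but the underlying bounded-memory idea can be sketched as a plain single-party sequential k-means pass over the stream. This is an illustrative sketch of stream clustering in O(k) memory, not the authors' algorithm:

```python
import math

def stream_kmeans(stream, k):
    """Sequential k-means over a stream of points using O(k) memory.

    The first k points seed the centers; each later point moves its
    nearest center toward it by a 1/count step, so each center tracks
    the running mean of the points assigned to it.
    """
    centers, counts = [], []
    for point in stream:
        if len(centers) < k:
            centers.append(list(point))
            counts.append(1)
            continue
        # Assign the point to its nearest center and update that center.
        j = min(range(k), key=lambda i: math.dist(centers[i], point))
        counts[j] += 1
        step = 1.0 / counts[j]
        centers[j] = [c + step * (p - c) for c, p in zip(centers[j], point)]
    return centers
```

A secure multiparty version would additionally have to keep these intermediate centers hidden, which is exactly what the paper's protocol addresses.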
ACM Journal of Experimental Algorithms | 2008
Michael A. Bender; Bryan Bradley; Geetha Jagannathan; Krishnan Pillaipakkamnatt
The sum-of-squares algorithm (SS) was introduced by Csirik, Johnson, Kenyon, Shor, and Weber for online bin packing of integral-sized items into integral-sized bins. First, we show the results of experiments from two new variants of the SS algorithm. The first variant, which runs in time O(n√B log B), appears to have almost identical expected waste as the sum-of-squares algorithm on all the distributions mentioned in the original papers on this topic. The other variant, which runs in O(n log B) time, performs well on most, but not on all of those distributions. We also apply SS to the online memory-allocation problem. Our experimental comparisons between SS and Best Fit indicate that neither algorithm is consistently better than the other. If the amount of randomness in item sizes is low, SS appears to have lower waste than Best Fit, whereas, if the amount of randomness is high, Best Fit appears to have lower waste than SS. Our experiments suggest that in both real and synthetic traces, SS does not seem to have an asymptotic advantage over Best Fit, in contrast with the bin-packing problem.
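For reference, the basic SS rule that the faster variants build on places each item so as to minimize the sum over levels h of N(h)², where N(h) is the number of open (not yet full) bins filled to level h. A straightforward O(nB)-per-run sketch, with tie-breaking simplified relative to the original:

```python
def ss_pack(items, B):
    """Online Sum-of-Squares bin packing: place each item of size s either
    in a new bin or in an open bin at some level h with h + s <= B, choosing
    the option that minimizes sum_h N(h)^2 afterward.  Returns the number
    of bins used.  Simple reference version, not the faster variants.
    """
    count = [0] * B  # count[h] = number of open bins at level h, 1 <= h < B
    full = 0
    for s in items:
        best_h, best_delta = 0, None
        for h in range(0, B - s + 1):
            if h > 0 and count[h] == 0:
                continue  # no open bin at this level
            delta = 0
            if h > 0:
                delta -= 2 * count[h] - 1          # N(h) drops by one
            if h + s < B:
                delta += 2 * count[h + s] + 1      # N(h+s) grows by one
            if best_delta is None or delta < best_delta:
                best_h, best_delta = h, delta
        if best_h > 0:
            count[best_h] -= 1
        if best_h + s < B:
            count[best_h + s] += 1
        else:
            full += 1  # bin filled exactly to capacity B
    return full + sum(count)
```

The delta terms use the identity (c ± 1)² − c² = ±2c + 1, so each placement option is scored in O(1) after the scan over levels.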
siam international conference on data mining | 2006
Geetha Jagannathan; Krishnan Pillaipakkamnatt; Rebecca N. Wright
Transactions on Data Privacy | 2010
Geetha Jagannathan; Krishnan Pillaipakkamnatt; Rebecca N. Wright; Daryl Umano