
Publication


Featured research published by Aryeh Kontorovich.


algorithmic learning theory | 2013

On the learnability of shuffle ideals

Dana Angluin; James Aspnes; Aryeh Kontorovich

Although PAC learning unrestricted regular languages has long been known to be a very difficult problem, one might suppose the existence (and even an abundance) of natural efficiently learnable sub-families. When our literature search for a natural efficiently learnable regular family came up empty, we proposed the shuffle ideals as a prime candidate. A shuffle ideal generated by a string u is simply the collection of all strings containing u as a (discontiguous) subsequence. This fundamental language family is of theoretical interest in its own right and also provides the building blocks for other important language families. Somewhat surprisingly, we discovered that even a class as simple as the shuffle ideals is not properly PAC learnable, unless RP = NP. In the positive direction, we give an efficient algorithm for properly learning shuffle ideals in the statistical query (and therefore also PAC) model under the uniform distribution.
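To make the definition concrete: membership in the shuffle ideal generated by u reduces to checking that u is a subsequence of the input, which a single greedy left-to-right scan decides. A minimal sketch (illustrative, not code from the paper):

    def in_shuffle_ideal(u, x):
        # x belongs to the shuffle ideal generated by u iff u occurs
        # in x as a (not necessarily contiguous) subsequence. Matching
        # each symbol of u as early as possible is always safe, so a
        # greedy scan is correct and runs in O(len(x)) time.
        it = iter(x)
        return all(ch in it for ch in u)

    # Example: "ab" is a subsequence of "axbx" but not of "bxa".
    assert in_shuffle_ideal("ab", "axbx")
    assert not in_shuffle_ideal("ab", "bxa")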


IEEE Transactions on Signal Processing | 2011

Model Selection for Sinusoids in Noise: Statistical Analysis and a New Penalty Term

Boaz Nadler; Aryeh Kontorovich

Detection of the number of sinusoids embedded in noise is a fundamental problem in statistical signal processing. Most parametric methods minimize the sum of a data fit (likelihood) term and a complexity penalty term. The latter is often derived via information theoretic criteria, such as minimum description length (MDL), or via Bayesian approaches including Bayesian information criterion (BIC) or maximum a posteriori (MAP). While the resulting estimators are asymptotically consistent, empirically their finite sample performance is strongly dependent on the specific penalty term chosen. In this paper we elucidate the source of this behavior by relating the detection performance to the extreme value distribution of the maximum of the periodogram and of related random fields. Based on this relation, we propose a combined detection-estimation algorithm with a new penalty term. Our proposed penalty term is sharp in the sense that the resulting estimator achieves a nearly constant false alarm rate. A series of simulations supports our theoretical analysis and shows the superior detection performance of the suggested estimator.
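For intuition, the general shape of such detectors is easy to sketch: take the k strongest periodogram peaks, score each candidate k by a penalized likelihood, and return the minimizer. The penalty below is a generic BIC/MDL-style choice for illustration only, not the sharp penalty term derived in the paper:

    import numpy as np

    def count_sinusoids(x, k_max=10):
        # Generic penalized model-order selection on the periodogram.
        n = len(x)
        per = np.abs(np.fft.rfft(x)) ** 2 / n     # periodogram
        peaks = np.sort(per[1:])[::-1]            # drop DC term, sort descending
        total = peaks.sum()
        best_k, best_crit = 0, np.inf
        for k in range(k_max + 1):
            resid = max(total - peaks[:k].sum(), 1e-12)   # power left after k peaks
            sigma2 = resid / n                            # crude residual variance proxy
            crit = n * np.log(sigma2) + 3 * k * np.log(n) # ~3 parameters per sinusoid
            if crit < best_crit:
                best_k, best_crit = k, crit
        return best_k

    # Example: two sinusoids in white noise; typically prints 2.
    t = np.arange(512)
    x = np.sin(0.31 * t) + 0.7 * np.sin(1.13 * t) + 0.5 * np.random.randn(512)
    print(count_sinusoids(x))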


IEEE Transactions on Information Theory | 2014

Efficient Classification for Metric Data

Lee-Ad Gottlieb; Aryeh Kontorovich; Robert Krauthgamer

Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose left open the questions of computational efficiency and of providing direct bounds on generalization error. We design a new algorithm for classification in general metric spaces, whose runtime and accuracy depend on the doubling dimension of the data points, and can thus achieve superior classification performance in many common scenarios. The algorithmic core of our approach is an approximate (rather than exact) solution to the classical problems of Lipschitz extension and of nearest neighbor search. The algorithm's generalization performance is guaranteed via the fat-shattering dimension of Lipschitz classifiers, and we present experimental evidence of its superiority to some common kernel methods. As a by-product, we offer a new perspective on the nearest neighbor classifier, which yields significantly sharper risk asymptotics than the classic analysis.
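For contrast with the kernel methods mentioned above, the simplest metric-space baseline fits in a few lines: 1-nearest-neighbor under string edit distance. This sketch is only the naive baseline; the paper's algorithm adds approximate Lipschitz extension and fast approximate nearest-neighbor search on top of it:

    def edit_distance(s, t):
        # Classic Levenshtein dynamic program, O(len(s) * len(t)).
        prev = list(range(len(t) + 1))
        for i in range(1, len(s) + 1):
            cur = [i] + [0] * len(t)
            for j in range(1, len(t) + 1):
                cur[j] = min(prev[j] + 1,                           # deletion
                             cur[j - 1] + 1,                        # insertion
                             prev[j - 1] + (s[i - 1] != t[j - 1]))  # substitution
            prev = cur
        return prev[-1]

    def nn_classify(x, examples):
        # examples: list of (string, label) pairs; return the label of
        # the training string closest to x in edit distance.
        return min(examples, key=lambda e: edit_distance(x, e[0]))[1]

    train = [("karolin", 0), ("kathrin", 0), ("jellyfish", 1), ("smellyfish", 1)]
    print(nn_classify("jellyfihs", train))  # prints 1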


Journal of Applied Probability | 2014

Uniform Chernoff and Dvoretzky-Kiefer-Wolfowitz-type inequalities for Markov chains and related processes

Aryeh Kontorovich; Roi Weiss

We observe that the technique of Markov contraction can be used to establish measure concentration for a broad class of non-contracting chains. In particular, geometric ergodicity provides a simple and versatile framework. This leads to a short, elementary proof of a general concentration inequality for Markov and hidden Markov chains (HMMs), which supersedes some of the known results and easily extends to other processes such as Markov trees. As applications, we give a Dvoretzky-Kiefer-Wolfowitz-type inequality and a uniform Chernoff bound. All of our bounds are dimension-free and hold for countably infinite state spaces.
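For reference, the classical i.i.d. Dvoretzky-Kiefer-Wolfowitz inequality that the uniform bound above generalizes states that for the empirical distribution function F_n of n i.i.d. samples from F,

    \Pr\left( \sup_{t \in \mathbb{R}} \left| F_n(t) - F(t) \right| > \varepsilon \right) \le 2 e^{-2 n \varepsilon^2};

in the Markov and hidden Markov setting the effective sample size in the exponent additionally pays a factor reflecting the chain's ergodicity (contraction) coefficients; see the paper for the exact constants.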


IEEE Transactions on Information Theory | 2014

Minimum KL-Divergence on Complements of L₁ Balls

Daniel Berend; Peter Harremoës; Aryeh Kontorovich

Pinsker's widely used inequality upper-bounds the total variation distance ∥P − Q∥₁ in terms of the Kullback-Leibler divergence D(P∥Q). Although, in general, a bound in the reverse direction is impossible, in many applications the quantity of interest is actually D*(v, Q), defined, for an arbitrary fixed Q, as the infimum of D(P∥Q) over all distributions P that are at least v-far away from Q in total variation. We show that D*(v, Q) ≤ Cv² + O(v³), where C = C(Q) = 1/2 for balanced distributions, thereby providing a kind of reverse Pinsker inequality. Some of the structural results obtained in the course of the proof may be of independent interest. An application to large deviations is given.
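In symbols, with divergences in nats, the two directions read (the first is Pinsker's classical inequality; the second is the reverse form established here, with C(Q) = 1/2 in the balanced case):

    D(P \| Q) \ge \tfrac{1}{2} \, \| P - Q \|_1^2,
    \qquad
    D^*(v, Q) = \inf_{\| P - Q \|_1 \ge v} D(P \| Q) \le C(Q) \, v^2 + O(v^3).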


Machine Learning | 2013

Exploiting label dependencies for improved sample complexity

Lena Chekina; Dan Gutfreund; Aryeh Kontorovich; Lior Rokach; Bracha Shapira

Multi-label classification exhibits several challenges not present in the binary case. The labels may be interdependent, so that the presence of a certain label affects the probability of other labels' presence. Thus, exploiting dependencies among the labels could be beneficial for the classifier's predictive performance. Surprisingly, only a few of the existing algorithms address this issue directly by identifying dependent labels explicitly from the dataset. In this paper we propose new approaches for identifying and modeling existing dependencies between labels. One principal contribution of this work is a theoretical confirmation of the reduction in sample complexity that is gained from unconditional dependence. Additionally, we develop methods for identifying conditionally and unconditionally dependent label pairs; clustering them into several mutually exclusive subsets; and finally, performing multi-label classification incorporating the discovered dependencies. We compare these two notions of label dependence (conditional and unconditional) and evaluate their performance on various benchmark and artificial datasets. We also compare and analyze labels identified as dependent by each of the methods. Moreover, we define an ensemble framework for the new methods and compare it to existing ensemble methods. An empirical comparison of the new approaches to existing baseline and state-of-the-art methods on 12 benchmark datasets demonstrates that in many cases the proposed single-classifier and ensemble methods outperform many multi-label classification algorithms. Perhaps surprisingly, we discover that the weaker notion of unconditional dependence plays the decisive role.
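One concrete, simplified way to flag unconditionally dependent label pairs is a chi-squared test on each pair's 2x2 co-occurrence table; the helper below is an illustrative assumption of this kind, not necessarily the paper's identification procedure:

    import numpy as np
    from scipy.stats import chi2_contingency

    def dependent_label_pairs(Y, alpha=0.01):
        # Y: binary label matrix of shape (n_samples, n_labels).
        # Returns pairs (i, j) whose co-occurrence table rejects independence.
        n, m = Y.shape
        pairs = []
        for i in range(m):
            for j in range(i + 1, m):
                table = np.array([[np.sum((Y[:, i] == a) & (Y[:, j] == b))
                                   for b in (0, 1)] for a in (0, 1)])
                if table.min() == 0:
                    continue  # skip degenerate tables
                _, p, _, _ = chi2_contingency(table)
                if p < alpha:
                    pairs.append((i, j))
        return pairs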


IEEE Transactions on Information Theory | 2017

Efficient Regression in Metric Spaces via Approximate Lipschitz Extension

Lee-Ad Gottlieb; Aryeh Kontorovich; Robert Krauthgamer

We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing an approximate Lipschitz extension (the smoothest function consistent with the observed data) after performing structural risk minimization to avoid overfitting. We obtain finite-sample risk bounds with minimal structural and noise assumptions, and a natural runtime-precision tradeoff. The offline (learning) and online (prediction) stages can be solved by convex programming, but this naive approach has runtime complexity O(n³), which is prohibitive for large data sets. We design instead a regression algorithm whose speed and generalization performance depend on the intrinsic dimension of the data, to which the algorithm adapts. While our main innovation is algorithmic, the statistical results may also be of independent interest.
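A minimal version of the prediction step, with all of the paper's efficiency machinery stripped away: fit the smallest Lipschitz constant consistent with the sample, then predict with the midpoint of the classical McShane (upper) and Whitney (lower) extensions. The names and structure here are illustrative; this naive form is exactly what the approximate extension and nearest-neighbor search are designed to accelerate:

    import numpy as np

    def fit_lipschitz_constant(X, y, dist):
        # Smallest L such that some L-Lipschitz function interpolates the data.
        L = 0.0
        for i in range(len(y)):
            for j in range(i + 1, len(y)):
                d = dist(X[i], X[j])
                if d > 0:
                    L = max(L, abs(y[i] - y[j]) / d)
        return L

    def predict(x, X, y, L, dist):
        # Midpoint of the upper (McShane) and lower (Whitney) L-Lipschitz extensions.
        upper = min(y[i] + L * dist(x, X[i]) for i in range(len(y)))
        lower = max(y[i] - L * dist(x, X[i]) for i in range(len(y)))
        return 0.5 * (upper + lower)

    dist = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
    X, y = [(0.0,), (1.0,)], [0.0, 1.0]
    L = fit_lipschitz_constant(X, y, dist)
    print(predict((0.25,), X, y, L, dist))  # prints 0.25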


algorithmic learning theory | 2013

Adaptive Metric Dimensionality Reduction

Lee-Ad Gottlieb; Aryeh Kontorovich; Robert Krauthgamer

We study data-adaptive dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling, which yields a new theoretical explanation for empirically reported improvements gained by preprocessing Euclidean data by PCA (Principal Components Analysis) prior to constructing a linear classifier. On the algorithmic front, we describe an analogue of PCA for metric spaces, namely an efficient procedure that approximates the data's intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.
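To make "intrinsic dimension" tangible, here is a crude empirical proxy for the doubling dimension: around random sample points, compare how many points fall within radius r versus r/2 and take the log-ratio of the counts. This heuristic is for illustration only; the paper gives an efficient approximation procedure with guarantees:

    import numpy as np

    def doubling_dim_estimate(X, n_centers=50, seed=0):
        # X: array of shape (n_points, ambient_dim). If point counts grow
        # like r**d locally, then log2(count(r) / count(r/2)) estimates d.
        rng = np.random.default_rng(seed)
        centers = rng.choice(len(X), size=min(n_centers, len(X)), replace=False)
        dims = []
        for c in centers:
            d = np.linalg.norm(X - X[c], axis=1)
            r = np.median(d)
            big, small = np.sum(d <= r), np.sum(d <= r / 2)
            if small > 1:
                dims.append(np.log2(big / small))
        return float(np.median(dims))

    # Example: 3-dimensional data embedded in a 50-dimensional ambient space;
    # the estimate comes out far below 50.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(2000, 3)) @ rng.normal(size=(3, 50))
    print(doubling_dim_estimate(X))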


Theoretical Computer Science | 2016

Daniel Berend; Aryeh Kontorovich


international symposium on information theory | 2012

Aryeh Kontorovich; Ari Trachtenberg

Collaboration


Dive into Aryeh Kontorovich's collaborations.

Top Co-Authors

Daniel Berend, Ben-Gurion University of the Negev
Roi Weiss, Ben-Gurion University of the Negev
Robert Krauthgamer, Weizmann Institute of Science
Steve Hanneke, Carnegie Mellon University
Boaz Nadler, Weizmann Institute of Science
Danny Hendler, Ben-Gurion University of the Negev