Network


Avrim Blum's latest external collaborations at the country level.

Hotspot


Dive into the research topics where Avrim Blum is active.

Publication


Featured research published by Avrim Blum.


conference on learning theory | 1998

Combining labeled and unlabeled data with co-training

Avrim Blum; Tom M. Mitchell

We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page. We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other. Our goal in this paper is to provide a PAC-style analysis for this setting, and, more broadly, a PAC-style framework for the general problem of learning from both labeled and unlabeled data. We also provide empirical results on real web-page data indicating that this use of unlabeled examples can lead to significant improvement of hypotheses in practice.
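
The strategy described in the abstract translates into a short loop. The following is a minimal sketch, not the paper's exact procedure: it assumes two scikit-learn-style classifiers (anything exposing fit and predict_proba) and promotes the most confident predictions overall, rather than the paper's fixed quotas of positive and negative examples per round.

```python
import numpy as np

def co_train(clf1, clf2, X1_lab, X2_lab, y_lab,
             X1_unlab, X2_unlab, rounds=10, per_round=5):
    """Co-training loop sketch: one classifier per view, each adding
    its most confident predictions on the unlabeled pool to the shared
    training set (simplified from the paper's fixed positive/negative
    quotas). clf1/clf2 need only fit(X, y) and predict_proba(X).
    """
    X1, X2, y = X1_lab.copy(), X2_lab.copy(), y_lab.copy()
    unlab = list(range(len(X1_unlab)))
    for _ in range(rounds):
        if not unlab:
            break
        clf1.fit(X1, y)
        clf2.fit(X2, y)
        for clf, X_view in ((clf1, X1_unlab), (clf2, X2_unlab)):
            proba = clf.predict_proba(X_view[unlab])
            picked = np.argsort(-proba.max(axis=1))[:per_round]
            idx = [unlab[i] for i in picked]
            # Map argmax columns back to class labels (sklearn keeps
            # them in classes_; fall back to column indices otherwise).
            classes = getattr(clf, "classes_", np.arange(proba.shape[1]))
            new_y = classes[proba[picked].argmax(axis=1)]
            X1 = np.vstack([X1, X1_unlab[idx]])
            X2 = np.vstack([X2, X2_unlab[idx]])
            y = np.concatenate([y, new_y])
            unlab = [j for j in unlab if j not in set(idx)]
    return clf1, clf2
```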


Artificial Intelligence | 1997

Selection of relevant features and examples in machine learning

Avrim Blum; Pat Langley

In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.


Artificial Intelligence | 1997

Fast planning through planning graph analysis

Avrim Blum; Merrick L. Furst

We introduce a new approach to planning in STRIPS-like domains based on constructing and analyzing a compact structure we call a Planning Graph. We describe a new planner, Graphplan, that uses this paradigm. Graphplan always returns a shortest-possible partial-order plan, or states that no valid plan exists. We provide empirical evidence in favor of this approach, showing that Graphplan outperforms the total-order planner, Prodigy, and the partial-order planner, UCPOP, on a variety of interesting natural and artificial planning problems. We also give empirical evidence that the plans produced by Graphplan are quite sensible. Since searches made by this approach are fundamentally different from the searches of other common planning methods, they provide a new perspective on the planning problem.
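
As an illustration of the graph-construction half of this idea, here is a minimal sketch that expands proposition layers forward until the goals appear or the graph levels off. It omits delete effects and the mutual-exclusion bookkeeping that full Graphplan depends on, and it does not perform Graphplan's backward plan-extraction search; the action names and example domain are made up for illustration.

```python
def expand_planning_graph(init, goals, actions, max_levels=50):
    """Forward expansion of proposition layers, the construction half
    of the planning-graph idea (delete effects, mutex relations, and
    Graphplan's backward plan-extraction search are all omitted).

    actions: list of (name, preconditions, add_effects) triples over
    hashable propositions. Returns the first level at which all goals
    are reachable, or None if the graph levels off without them.
    """
    props = set(init)
    for level in range(max_levels):
        if set(goals) <= props:
            return level
        applicable = [a for a in actions if set(a[1]) <= props]
        new_props = props | {p for a in applicable for p in a[2]}
        if new_props == props:  # leveled off: goals unreachable
            return None
        props = new_props
    return None

acts = [("boil", {"kettle"}, {"hot_water"}),
        ("brew", {"hot_water", "tea"}, {"cup_of_tea"})]
print(expand_planning_graph({"kettle", "tea"}, {"cup_of_tea"}, acts))  # 2
```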


Machine Learning archive | 2004

Correlation Clustering

Nikhil Bansal; Avrim Blum; Shuchi Chawla

We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or − depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of − edges between clusters (equivalently, minimizes the number of disagreements: the number of − edges inside clusters plus the number of + edges between clusters). This formulation is motivated from a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of “agnostic learning” problem. An interesting feature of this clustering formulation is that one does not need to specify the number of clusters k as a separate parameter, as in measures such as k-median or min-sum or min-max clustering. Instead, in our formulation, the optimal number of clusters could be any value between 1 and n, depending on the edge labels. We look at approximation algorithms for both minimizing disagreements and for maximizing agreements. For minimizing disagreements, we give a constant factor approximation. For maximizing agreements we give a PTAS, building on ideas of Goldreich, Goldwasser, and Ron (1998) and de la Vega (1996). We also show how to extend some of these results to graphs with edge labels in [−1, +1], and give some results for the case of random noise.
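
The objective itself is easy to state in code. The sketch below just evaluates a candidate clustering against the edge labels, counting disagreements as defined above; it implements none of the paper's approximation algorithms.

```python
def disagreements(edge_labels, cluster_of):
    """Count disagreements of a clustering on a complete signed graph:
    '-' edges inside a cluster plus '+' edges between clusters.

    edge_labels: dict mapping frozenset({u, v}) to '+' or '-'.
    cluster_of: dict mapping each vertex to a cluster id.
    """
    bad = 0
    for edge, sign in edge_labels.items():
        u, v = tuple(edge)
        same = cluster_of[u] == cluster_of[v]
        if (sign == '-') == same:   # '-' inside or '+' across
            bad += 1
    return bad

edges = {frozenset({'a', 'b'}): '+',
         frozenset({'a', 'c'}): '-',
         frozenset({'b', 'c'}): '-'}
print(disagreements(edges, {'a': 0, 'b': 0, 'c': 1}))  # 0
```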


symposium on principles of database systems | 2005

Practical privacy: the SuLQ framework

Avrim Blum; Cynthia Dwork; Frank McSherry; Kobbi Nissim

We consider a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database entries. In such a database, a query consists of a pair (S, f) where S is a set of rows in the database and f is a function mapping database rows to {0, 1}. The true answer is Σ_{i∈S} f(d_i), and a noisy version is released as the response to the query. Results of Dinur, Dwork, and Nissim show that a strong form of privacy can be maintained using a surprisingly small amount of noise -- much less than the sampling error -- provided the total number of queries is sublinear in the number of database rows. We call this query and (slightly) noisy reply the SuLQ (Sub-Linear Queries) primitive. The assumption of sublinearity becomes reasonable as databases grow increasingly large. We extend this work in two ways. First, we modify the privacy analysis to real-valued functions f and arbitrary row types, as a consequence greatly improving the bounds on noise required for privacy. Second, we examine the computational power of the SuLQ primitive. We show that it is very powerful indeed, in that slightly noisy versions of the following computations can be carried out with very few invocations of the primitive: principal component analysis, k-means clustering, the Perceptron Algorithm, the ID3 algorithm, and (apparently!) all algorithms that operate in the statistical query learning model [11].
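
A single invocation of the primitive can be sketched as follows. Both the Gaussian noise and its scale are assumptions of this sketch made for illustration; the paper's analysis pins down how much noise actually suffices for a given query budget T.

```python
import math
import random

def sulq_query(db, S, f, T, epsilon=0.1):
    """One invocation of a SuLQ-style primitive (illustrative sketch).

    db: list of rows; S: set of row indices; f: row -> {0, 1} (or, per
    the paper's extension, real-valued). Returns sum_{i in S} f(d_i)
    plus Gaussian noise. The noise scale below is an assumption of
    this sketch, not the paper's calibration.
    """
    true_answer = sum(f(db[i]) for i in S)
    sigma = math.sqrt(T) / epsilon  # illustrative: grows with budget T
    return true_answer + random.gauss(0.0, sigma)
```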


symposium on the theory of computing | 2008

A learning theory approach to non-interactive database privacy

Avrim Blum; Katrina Ligett; Aaron Roth

We demonstrate that, ignoring computational constraints, it is possible to release privacy-preserving databases that are useful for all queries over a discretized domain from any given concept class with polynomial VC-dimension. We show a new lower bound for releasing databases that are useful for halfspace queries over a continuous domain. Despite this, we give a privacy-preserving polynomial time algorithm that releases information useful for all halfspace queries, for a slightly relaxed definition of usefulness. Inspired by learning theory, we introduce a new notion of data privacy, which we call distributional privacy, and show that it is strictly stronger than the prevailing privacy notion, differential privacy.


Neural Networks | 1992

Training a 3-node neural network is NP-complete

Avrim Blum; Ronald L. Rivest

We consider a 2-layer, 3-node, n-input neural network whose nodes compute linear threshold functions of their inputs. We show that it is NP-complete to decide whether there exist weights and thresholds for the three nodes of this network so that it will produce output consistent with a given set of training examples. We extend the result to other simple networks. This result suggests that those looking for perfect training algorithms cannot escape inherent computational difficulties just by considering only simple or very regular networks. It also suggests the importance, given a training problem, of finding an appropriate network and input encoding for that problem. It is left as an open problem to extend our result to nodes with non-linear functions such as sigmoids.
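
The architecture in question is small enough to write down directly. The sketch below shows the 2-layer, 3-node threshold network whose training problem the paper proves NP-complete; the hardness lies in finding consistent weights, not in evaluating the network, which is trivial.

```python
def threshold(weights, theta, xs):
    """Linear threshold unit: output 1 iff w . x >= theta."""
    return int(sum(w * x for w, x in zip(weights, xs)) >= theta)

def three_node_net(w1, t1, w2, t2, w3, t3, xs):
    """Two hidden threshold units over the n inputs feed one threshold
    output unit: the 2-layer, 3-node architecture from the abstract."""
    h1 = threshold(w1, t1, xs)
    h2 = threshold(w2, t2, xs)
    return threshold(w3, t3, (h1, h2))

# With these weights the network computes XOR of two inputs:
# h1 = OR, h2 = AND, output = h1 AND NOT h2.
for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(xs, three_node_net((1, 1), 1, (1, 1), 2, (1, -1), 1, xs))
```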


symposium on the theory of computing | 1994

The minimum latency problem

Avrim Blum; Prasad Chalasani; Don Coppersmith; Bill Pulleyblank; Prabhakar Raghavan; Madhu Sudan

We are given a set of points p_1, …, p_n and a symmetric distance matrix (d_ij) giving the distance between p_i and p_j. We wish to construct a tour that minimizes ∑_{i=1}^n ℓ(i), where ℓ(i) is the latency of p_i, defined to be the distance traveled before first visiting p_i. This problem is also known in the literature as the deliveryman problem or the traveling repairman problem. It arises in a number of applications including disk-head scheduling, and turns out to be surprisingly different from the traveling salesman problem in character. We give exact and approximate solutions to a number of cases, including a constant-factor approximation algorithm whenever the distance matrix satisfies the triangle inequality.
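
The objective is simple to evaluate even though optimizing it is hard. A minimal sketch of the latency computation:

```python
def total_latency(tour, dist):
    """Sum of latencies: for each point, the distance traveled before
    it is first visited (the tour's starting point has latency 0).

    tour: visiting order of point indices; dist: symmetric matrix.
    """
    traveled, total = 0, 0
    for prev, cur in zip(tour, tour[1:]):
        traveled += dist[prev][cur]
        total += traveled
    return total

d = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
print(total_latency([0, 1, 2], d))  # latencies 0 + 1 + 3 = 4
```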


Machine Learning | 1997

Empirical Support for Winnow and Weighted-Majority Algorithms: Results on a Calendar Scheduling Domain

Avrim Blum

This paper describes experimental results on using Winnow and Weighted-Majority based algorithms on a real-world calendar scheduling domain. These two algorithms have been highly studied in the theoretical machine learning literature. We show here that these algorithms can be quite competitive practically, outperforming the decision-tree approach currently in use in the Calendar Apprentice system in terms of both accuracy and speed. One of the contributions of this paper is a new variant on the Winnow algorithm (used in the experiments) that is especially suited to conditions with string-valued classifications, and we give a theoretical analysis of its performance. In addition we show how Winnow can be applied to achieve a good accuracy/coverage tradeoff and explore issues that arise such as concept drift. We also provide an analysis of a policy for discarding predictors in Weighted-Majority that allows it to speed up as it learns.
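
For reference, the classic update the experiments build on looks like this. This is a standard Winnow2-style sketch for Boolean attributes, not the paper's new variant for string-valued classifications.

```python
def winnow(examples, n, alpha=2.0):
    """Winnow2-style sketch for Boolean attributes: predict 1 iff
    w . x >= n; on a mistake, multiplicatively promote (factor alpha)
    or demote (factor 1/alpha) the weights of the active attributes.

    examples: iterable of (x, y), x a 0/1 vector of length n, y in {0, 1}.
    Returns the learned weights and the mistake count.
    """
    w = [1.0] * n
    mistakes = 0
    for x, y in examples:
        pred = int(sum(wi * xi for wi, xi in zip(w, x)) >= n)
        if pred != y:
            mistakes += 1
            factor = alpha if y == 1 else 1.0 / alpha
            w = [wi * factor if xi else wi for wi, xi in zip(w, x)]
    return w, mistakes
```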


symposium on the theory of computing | 1994

Weakly learning DNF and characterizing statistical query learning using Fourier analysis

Avrim Blum; Merrick L. Furst; Jeffrey C. Jackson; Michael J. Kearns; Yishay Mansour; Steven Rudich

We present new results, both positive and negative, on the well-studied problem of learning disjunctive normal form (DNF) expressions. We first prove that an algorithm due to Kushilevitz and Mansour [16] can be used to weakly learn DNF using membership queries in polynomial time, with respect to the uniform distribution on the inputs. This is the first positive result for learning unrestricted DNF expressions in polynomial time in any nontrivial formal model of learning. It provides a sharp contrast with the results of Kharitonov [15], who proved that AC⁰ is not efficiently learnable in the same model (given certain plausible cryptographic assumptions). We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of DNF. For our negative results, we turn our attention to the recently introduced statistical query model of learning [11]. This model is a restricted version of the popular Probably Approximately Correct (PAC) model [23], and practically every class known to be efficiently learnable in the PAC model is in fact learnable in the statistical query model [11]. Here we give a general characterization of the complexity of statistical query learning in terms of the number of uncorrelated functions in the concept class. This is a distribution-dependent quantity yielding upper and lower bounds on the number of statistical queries required for learning on any input distribution. As a corollary, we obtain that DNF expressions and decision trees are not even weakly learnable in polynomial time in the statistical query model.
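
The Fourier connection in the title can be made concrete with a small estimator. Each correlation below, E[f(x)·χ_S(x)] under the uniform distribution, is exactly the kind of statistical query whose count the paper's characterization bounds; the sampling-based estimator is an illustration, not the Kushilevitz-Mansour algorithm.

```python
import random

def estimate_fourier_coeff(f, S, n, samples=20000):
    """Monte-Carlo estimate of f_hat(S) = E_x[f(x) * chi_S(x)] under
    the uniform distribution on {0,1}^n, where chi_S(x) = (-1)^(sum of
    x_i for i in S) is the parity on S and f maps {0,1}^n to {-1,+1}.
    """
    total = 0.0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(n)]
        chi = -1 if sum(x[i] for i in S) % 2 else 1
        total += f(x) * chi
    return total / samples

# f itself is the parity on {0, 1}: full correlation with S = {0, 1},
# near-zero correlation with any other S.
f = lambda x: -1 if (x[0] + x[1]) % 2 else 1
print(round(estimate_fourier_coeff(f, {0, 1}, 4), 1))  # ~1.0
print(round(estimate_fourier_coeff(f, {0}, 4), 1))     # ~0.0
```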

Collaboration


Dive into Avrim Blum's collaboration.

Top Co-Authors

Santosh Vempala (Georgia Institute of Technology)
Nika Haghtalab (Carnegie Mellon University)
Shuchi Chawla (University of Wisconsin-Madison)
Or Sheffet (Carnegie Mellon University)
Pranjal Awasthi (Carnegie Mellon University)
Prasad Chalasani (Carnegie Mellon University)