Reid Andersen
Microsoft
Publication
Featured research published by Reid Andersen.
foundations of computer science | 2006
Reid Andersen; Fan R. K. Chung; Kevin J. Lang
A local graph partitioning algorithm finds a cut near a specified starting vertex, with a running time that depends largely on the size of the small side of the cut, rather than the size of the input graph. In this paper, we present a local partitioning algorithm using a variation of PageRank with a specified starting distribution. We derive a mixing result for PageRank vectors similar to that for random walks, and show that the ordering of the vertices produced by a PageRank vector reveals a cut with small conductance. In particular, we show that for any set C with conductance Φ and volume k, a PageRank vector with a certain starting distribution can be used to produce a set with conductance O(√(Φ log k)). We present an improved algorithm for computing approximate PageRank vectors, which allows us to find such a set in time proportional to its size. In particular, we can find a cut with conductance at most ϕ, whose small side has volume at least 2^b, in time O(2^b log^2 m/ϕ^2), where m is the number of edges in the graph. By combining small sets found by this local partitioning algorithm, we obtain a cut with conductance ϕ and approximately optimal balance in time O(m log^4 m/ϕ).
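The two steps named in this abstract, computing an approximate PageRank vector from a seed and then ordering vertices by that vector to find a low-conductance cut, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the adjacency-dict representation, and the default values of `alpha` and `eps` are assumptions made for illustration, and the graph is assumed to be undirected, without self-loops or isolated vertices.

```python
from collections import deque


def approximate_pagerank(adj, seed, alpha=0.15, eps=1e-4):
    """Push-style approximation of a personalized PageRank vector (lazy walk).

    adj maps each vertex to a list of its neighbors (undirected, no self-loops).
    Stops when every residual r[u] is below eps * degree(u).
    """
    p, r = {}, {seed: 1.0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg_u = len(adj[u])
        ru = r.get(u, 0.0)
        if ru < eps * deg_u:
            continue  # stale queue entry, nothing to push
        # Move an alpha fraction of the residual into the estimate,
        # keep half of the rest at u, spread the other half to neighbors.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1 - alpha) * ru / 2
        share = (1 - alpha) * ru / (2 * deg_u)
        for v in adj[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
        if r[u] >= eps * deg_u:
            queue.append(u)
    return p


def sweep_cut(adj, p):
    """Return the prefix of the degree-normalized ordering with least conductance."""
    # Assumption: the total volume is computed by scanning the whole graph here;
    # a truly local implementation would take it as a known quantity.
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    order = sorted(p, key=lambda v: p[v] / len(adj[v]), reverse=True)
    in_set, vol, cut = set(), 0, 0
    best_set, best_phi = None, float("inf")
    for v in order:
        in_set.add(v)
        vol += len(adj[v])
        inside = sum(1 for w in adj[v] if w in in_set)
        cut += len(adj[v]) - 2 * inside  # new boundary edges minus absorbed ones
        denom = min(vol, total_vol - vol)
        if denom > 0 and cut / denom < best_phi:
            best_phi, best_set = cut / denom, set(in_set)
    return best_set, best_phi


# Two triangles joined by the single edge 2-3, seeded at vertex 0.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
ppr = approximate_pagerank(graph, seed=0)
community, phi = sweep_cut(graph, ppr)
```

On this toy graph the sweep should separate the seed's triangle from the other one, since the bridge edge is the only edge crossing that cut.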
international world wide web conferences | 2008
Reid Andersen; Christian Borgs; Jennifer T. Chayes; Uriel Feige; Abraham D. Flaxman; Adam Tauman Kalai; Vahab S. Mirrokni; Moshe Tennenholtz
High-quality, personalized recommendations are a key feature in many online systems. Since these systems often have explicit knowledge of social network structures, the recommendations may incorporate this information. This paper focuses on networks that represent trust and recommendation systems that incorporate these trust relationships. The goal of a trust-based recommendation system is to generate personalized recommendations by aggregating the opinions of other users in the trust network. In analogy to prior work on voting and ranking systems, we use the axiomatic approach from the theory of social choice. We develop a set of five natural axioms that a trust-based recommendation system might be expected to satisfy. Then, we show that no system can simultaneously satisfy all the axioms. However, for any subset of four of the five axioms we exhibit a recommendation system that satisfies those axioms. Next we consider various ways of weakening the axioms, one of which leads to a unique recommendation system based on random walks. We consider other recommendation systems, including systems based on personalized PageRank, majority of majorities, and minimum cuts, and search for alternative axiomatizations that uniquely characterize these systems. Finally, we determine which of these systems are incentive compatible, meaning that groups of agents interested in manipulating recommendations cannot induce others to share their opinion by lying about their votes or modifying their trust links. This is an important property for systems deployed in a monetized environment.
international world wide web conferences | 2006
Reid Andersen; Kevin J. Lang
Expanding a seed set into a larger community is a common procedure in link-based analysis. We show how to adapt recent results from theoretical computer science to expand a seed set into a community with small conductance and a strong relationship to the seed, while examining only a small neighborhood of the entire graph. We extend existing results to give theoretical guarantees that apply to a variety of seed sets from specified communities. We also describe simple and flexible heuristics for applying these methods in practice, and present early experiments showing that these methods compare favorably with existing approaches.
workshop on algorithms and models for the web graph | 2009
Reid Andersen; Kumar Chellapilla
We consider the problem of finding dense subgraphs with specified upper or lower bounds on the number of vertices. We introduce two optimization problems: the densest at-least-k-subgraph problem (dalks), which is to find an induced subgraph of highest average degree among all subgraphs with at least k vertices, and the densest at-most-k-subgraph problem (damks), which is defined similarly. These problems are relaxed versions of the well-known densest k-subgraph problem (dks), which is to find the densest subgraph with exactly k vertices. Our main result is that dalks can be approximated efficiently, even for web-scale graphs. We give a (1/3)-approximation algorithm for dalks that is based on the core decomposition of a graph, and that runs in time O(m + n), where n is the number of nodes and m is the number of edges. In contrast, we show that damks is nearly as hard to approximate as the densest k-subgraph problem, for which no good approximation algorithm is known. In particular, we show that if there exists a polynomial time approximation algorithm for damks with approximation ratio γ, then there is a polynomial time approximation algorithm for dks with approximation ratio γ^2/8. In the experimental section, we test the algorithm for dalks on large publicly available web graphs. We observe that, in addition to producing near-optimal solutions for dalks, the algorithm also produces near-optimal solutions for dks for nearly all values of k.
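One way to picture an approach based on the core decomposition is the following Python sketch, which repeatedly removes a minimum-degree vertex and remembers the densest intermediate subgraph that still has at least k vertices. This is an illustration under assumptions rather than the paper's algorithm: the function name and the adjacency-dict input are hypothetical, vertices are assumed to be mutually comparable (for example integers), the graph is assumed to be non-empty and undirected, and a binary heap is used for brevity, giving O(m log n) time instead of the O(m + n) bound stated in the abstract.

```python
import heapq


def densest_at_least_k(adj, k):
    """Peel minimum-degree vertices; return (vertex set, average degree)
    of the densest intermediate subgraph with at least k vertices."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    edges = sum(deg.values()) // 2
    alive = set(adj)
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    removed_order = []
    best_size, best_avg = len(alive), 2 * edges / len(alive)
    while len(alive) > k:
        d, v = heapq.heappop(heap)
        if v not in alive or d != deg[v]:
            continue  # stale heap entry left over from a degree update
        alive.remove(v)
        removed_order.append(v)
        edges -= deg[v]
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
                heapq.heappush(heap, (deg[w], w))
        avg = 2 * edges / len(alive)
        if avg > best_avg:
            best_size, best_avg = len(alive), avg
    # Rebuild the best intermediate set from the removal order.
    keep = set(adj) - set(removed_order[: len(adj) - best_size])
    return keep, best_avg


# A 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
graph = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3],
    3: [0, 1, 2, 4], 4: [3, 5], 5: [4],
}
dense_set, avg_degree = densest_at_least_k(graph, k=3)
```

Here density is measured as average degree, 2|E|/|V|, matching the definition in the abstract. On the toy graph, peeling the two path vertices exposes the 4-clique before the size bound k is reached.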
workshop on algorithms and models for the web graph | 2007
Reid Andersen; Christian Borgs; Jennifer T. Chayes; John Hopcroft; Vahab S. Mirrokni; Shang-Hua Teng
Motivated by the problem of detecting link-spam, we consider the following graph-theoretic primitive: Given a webgraph G, a vertex v in G, and a parameter δ ∈ (0, 1), compute the set of all vertices that contribute to v at least a δ fraction of v's PageRank. We call this set the δ-contributing set of v. To this end, we define the contribution vector of v to be the vector whose entries measure the contributions of every vertex to the PageRank of v. A local algorithm is one that produces a solution by adaptively examining only a small portion of the input graph near a specified vertex. We give an efficient local algorithm that computes an ε-approximation of the contribution vector for a given vertex by adaptively examining O(1/ε) vertices. Using this algorithm, we give a local approximation algorithm for the primitive defined above. Specifically, we give an algorithm that returns a set containing the δ-contributing set of v and at most O(1/δ) vertices from the δ/2-contributing set of v, and which does so by examining at most O(1/δ) vertices. We also give a local algorithm for solving the following problem: If there exist k vertices that contribute a ρ-fraction to the PageRank of v, find a set of k vertices that contribute at least a (ρ-ε)-fraction to the PageRank of v. In this case, we prove that our algorithm examines at most O(k/ε) vertices.
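A contribution vector of this kind can be approximated by pushing probability mass backwards along in-links, starting from the target vertex. The Python sketch below illustrates that idea under stated assumptions and should not be read as the paper's exact procedure: the function name, the `alpha` and `eps` defaults, and the absolute residual threshold are illustrative choices, every vertex is assumed to appear as a key with at least one out-link, and the in-link index is built by scanning the whole graph, whereas a truly local algorithm would assume in-links can be queried directly. The returned value for u estimates the personalized PageRank that a walk started at u assigns to the target; dividing by the number of vertices would give u's share of the target's global PageRank under a uniform teleport distribution.

```python
from collections import deque


def approximate_contributions(out_links, target, alpha=0.15, eps=1e-4):
    """Estimate, for each vertex u, the personalized PageRank of `target`
    under a walk started at u, by pushing residuals along in-links."""
    # Assumption: building the in-link index touches the whole graph.
    in_links = {u: [] for u in out_links}
    for u, outs in out_links.items():
        for w in outs:
            in_links[w].append(u)

    p, r = {}, {target: 1.0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        ru = r.get(u, 0.0)
        if ru < eps:
            continue  # residual already pushed or too small to matter
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = 0.0
        # Each in-neighbor w reaches u with probability 1 / outdegree(w).
        for w in in_links[u]:
            r[w] = r.get(w, 0.0) + (1 - alpha) * ru / len(out_links[w])
            if r[w] >= eps:
                queue.append(w)
    return p


# Directed 2-cycle: with alpha = 0.5 the exact values are 2/3 and 1/3.
cycle = {0: [1], 1: [0]}
contrib = approximate_contributions(cycle, target=0, alpha=0.5, eps=1e-6)
```

Each estimate is a lower bound on the exact value, and in this sketch the gap is bounded by `eps` because every remaining residual is below that threshold when the loop ends.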
international conference on computer communications | 2004
Reid Andersen; Fan R. K. Chung; Arunabha Sen; Guoliang Xue
In a WDM optical network, each fiber link can carry a certain set of wavelengths Λ = {λ_1, λ_2, ..., λ_W}. One scheme for tolerating a single link failure (or node failure) in the network is the path protection scheme, which establishes an active path and a link-disjoint (or node-disjoint) backup path, so that in the event of a link failure (node failure) on the active path, data can be quickly re-routed through the backup path. We consider a dynamic scenario, where requests to establish active-backup paths between a specified source-destination node pair arrive sequentially. If a link-disjoint (node-disjoint) active-backup path pair is found at the time of the request, the paths are established; otherwise, the request is blocked. In this scenario, at the time a request arrives, not every fiber link will have all W wavelengths available for new call establishment, as some of the wavelengths may already have been allocated to earlier requests and communication through these paths may still be in progress. We assume that the network nodes do not have any wavelength converters. This paper studies the existence of a pair of link-disjoint (node-disjoint) active-backup paths satisfying the wavelength continuity constraint between a specified source-destination node pair. First we prove that both the link-disjoint and node-disjoint versions of the problem are NP-complete. Then we focus on the link-disjoint version and present an approximation algorithm and an exact algorithm for the problem. Finally, through our experimental evaluations, we demonstrate that our approximation algorithm produces near-optimal solutions in almost all of the instances of the problem in a fraction of the time required by the exact algorithm.
workshop on algorithms and models for the web graph | 2007
Reid Andersen; Fan R. K. Chung; Kevin J. Lang
A local partitioning algorithm finds a set with small conductance near a specified seed vertex. In this paper, we present a generalization of a local partitioning algorithm for undirected graphs to strongly connected directed graphs. In particular, we prove that by computing a personalized PageRank vector in a directed graph, starting from a single seed vertex within a set S that has conductance at most α, and by performing a sweep over that vector, we can obtain a set of vertices S′ with conductance Φ_M(S′) = O(√α log |S|). Here, the conductance function Φ_M is defined in terms of the stationary distribution of a random walk in the directed graph. In addition, we describe how this algorithm may be applied to the PageRank Markov chain of an arbitrary directed graph, which provides a way to partition directed graphs that are not strongly connected.
Internet Mathematics | 2007
Reid Andersen; Fan R. K. Chung; Kevin J. Lang
A local graph partitioning algorithm finds a cut near a specified starting vertex, with a running time that depends largely on the size of the small side of the cut, rather than the size of the input graph. In this paper, we present a local partitioning algorithm using a variation of PageRank with a specified starting distribution. We derive a mixing result for PageRank vectors similar to that for random walks, and we show that the ordering of the vertices produced by a PageRank vector reveals a cut with small conductance. In particular, we show that for any set C with conductance Φ and volume k, a PageRank vector with a certain starting distribution can be used to produce a set with conductance O(√(Φ log k)). We present an improved algorithm for computing approximate PageRank vectors, which allows us to find such a set in time proportional to its size. In particular, we can find a cut with conductance at most ϕ, whose small side has volume at least 2^b, in time O(2^b log^2 m/ϕ^2), where m is the number of edges in the graph. By combining small sets found by this local partitioning algorithm, we obtain a cut with conductance ϕ and approximately optimal balance in time O(m log^4 m/ϕ^2).
adversarial information retrieval on the web | 2008
Reid Andersen; Christian Borgs; Jennifer T. Chayes; John E. Hopcroft; Kamal Jain; Vahab S. Mirrokni; Shang-Hua Teng
Since the link structure of the web is an important element in ranking systems on search engines, web spammers widely use the link structure of the web to increase the rank of their pages. Various link-based features of web pages have been introduced and have proven effective at identifying link spam. One particularly successful family of features (as described in the SpamRank algorithm) is based on examining the sets of pages that contribute most to the PageRank of a given vertex, called supporting sets. In a recent paper, the current authors described an algorithm for efficiently computing, for a single specified vertex, an approximation of its supporting sets. In this paper, we describe several link-based spam-detection features, both supervised and unsupervised, that can be derived from these approximate supporting sets. In particular, we examine the size of a node's supporting sets and the approximate ℓ2 norm of the PageRank contributions from other nodes. As a supervised feature, we examine the composition of a node's supporting sets. We perform experiments on two labeled real data sets to demonstrate the effectiveness of these features for spam detection, and demonstrate that these features can be computed efficiently. Furthermore, we design a variation of PageRank (called Robust PageRank) that incorporates some of these features into its ranking, argue that this variation is more robust against link spam engineering, and give an algorithm for approximating Robust PageRank.
theory and applications of models of computation | 2007
Reid Andersen; Fan R. K. Chung
We show that whenever there is a sharp drop in the numerical rank defined by a personalized PageRank vector, the location of the drop reveals a cut with small conductance. We then show that for any cut in the graph, and for many starting vertices within that cut, an approximate personalized PageRank vector will have a sharp drop sufficient to produce a cut with conductance nearly as small as the original cut. Using this technique, we produce a nearly linear time local partitioning algorithm whose analysis is simpler than previous algorithms.