Computer Science Data Structures And Algorithms - Researchain

Featured Research

Near-Quadratic Lower Bounds for Two-Pass Graph Streaming Algorithms

We prove that any two-pass graph streaming algorithm for the $s$-$t$ reachability problem in $n$-vertex directed graphs requires near-quadratic space of $n^{2-o(1)}$ bits. As a corollary, we also obtain near-quadratic space lower bounds for several other fundamental problems, including maximum bipartite matching and (approximate) shortest path in undirected graphs. Our results collectively imply that a wide range of graph problems admit essentially no non-trivial streaming algorithm even when two passes over the input are allowed. Prior to our work, such impossibility results were only known for single-pass streaming algorithms, and the best two-pass lower bounds only ruled out $o(n^{7/6})$-space algorithms, leaving open a large gap between (trivial) upper bounds and lower bounds.
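
For context, the trivial upper bound that this lower bound nearly matches can be sketched as follows: store every edge of the stream in an $n \times n$ bit matrix during a single pass, then answer reachability offline with breadth-first search. This is only an illustrative baseline, not a construction from the paper; the function name `reachable_from_stream` and the convention that vertices are labeled `0..n-1` are assumptions made for the example.

```python
from collections import deque


def reachable_from_stream(n, edge_stream, s, t):
    """One pass over the edges using n*n bits of state, then BFS from s."""
    # Each row is an n-bit bitmap held in a Python int (n * n bits overall).
    rows = [0] * n
    for u, v in edge_stream:          # the single streaming pass
        rows[u] |= 1 << v

    seen = 1 << s                     # bitmap of visited vertices
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        new = rows[u] & ~seen         # out-neighbours not yet visited
        seen |= new
        while new:
            low = new & -new          # isolate the lowest set bit
            queue.append(low.bit_length() - 1)
            new ^= low
    return False


edges = [(0, 1), (1, 2), (3, 0)]
print(reachable_from_stream(4, edges, 0, 2))
print(reachable_from_stream(4, edges, 2, 0))
```

The stored state is exactly the adjacency matrix, which is why $n^2$ bits always suffice, even in one pass.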

Data Structures And Algorithms

Near-linear Size Hypergraph Cut Sparsifiers

Cuts in graphs are a fundamental object of study, and play a central role in the study of graph algorithms. The problem of sparsifying a graph while approximately preserving its cut structure has been extensively studied and has many applications. In a seminal work, Benczúr and Karger (1996) showed that given any $n$-vertex undirected weighted graph $G$ and a parameter $\varepsilon \in (0,1)$, there is a near-linear time algorithm that outputs a weighted subgraph $G'$ of $G$ of size $\tilde{O}(n/\varepsilon^2)$ such that the weight of every cut in $G$ is preserved to within a $(1\pm\varepsilon)$-factor in $G'$. The graph $G'$ is referred to as a $(1\pm\varepsilon)$-approximate cut sparsifier of $G$. A natural question is whether such cut-preserving sparsifiers also exist for hypergraphs. Kogan and Krauthgamer (2015) initiated a study of this question and showed that given any weighted hypergraph $H$ where the cardinality of each hyperedge is bounded by $r$, there is a polynomial-time algorithm to find a $(1\pm\varepsilon)$-approximate cut sparsifier of $H$ of size $\tilde{O}(nr/\varepsilon^2)$. Since $r$ can be as large as $n$, in general this gives a hypergraph cut sparsifier of size $\tilde{O}(n^2/\varepsilon^2)$, which is a factor $n$ larger than the Benczúr-Karger bound for graphs. It has been an open question whether the Benczúr-Karger bound is achievable on hypergraphs. In this work, we resolve this question in the affirmative by giving a new polynomial-time algorithm for creating hypergraph sparsifiers of size $\tilde{O}(n/\varepsilon^2)$.
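
To make the $(1\pm\varepsilon)$ condition concrete for ordinary graphs, here is a minimal brute-force checker. It enumerates every cut, so it is only usable on tiny inputs and says nothing about how a sparsifier is constructed. The function names and the `(u, v, weight)` edge format are assumptions made for the example.

```python
from itertools import combinations


def cut_weight(edges, side):
    """Total weight of edges with exactly one endpoint in `side`."""
    return sum(w for u, v, w in edges if (u in side) != (v in side))


def is_cut_sparsifier(n, edges, sparse_edges, eps):
    """Check the (1 +/- eps) condition on every nontrivial cut (exponential time)."""
    # A cut and its complement are the same cut, so sizes 1..n//2 cover all of them.
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            side = set(subset)
            original = cut_weight(edges, side)
            approx = cut_weight(sparse_edges, side)
            if not (1 - eps) * original <= approx <= (1 + eps) * original:
                return False
    return True


# Unit-weight complete graph on 4 vertices, and a reweighted 4-cycle inside it.
k4 = [(u, v, 1.0) for u, v in combinations(range(4), 2)]
cycle = [(0, 1, 1.5), (1, 2, 1.5), (2, 3, 1.5), (0, 3, 1.5)]
print(is_cut_sparsifier(4, k4, cycle, 0.5))
print(is_cut_sparsifier(4, k4, cycle, 0.25))
```

In this toy instance the cut $\{0,2\}$ has weight 4 in the complete graph and 6 in the reweighted cycle, so the cycle qualifies for $\varepsilon = 0.5$ but not for $\varepsilon = 0.25$.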

Data Structures And Algorithms

Nearest Neighbor Search for Hyperbolic Embeddings

Embedding into hyperbolic space is emerging as an effective representation technique for datasets that exhibit hierarchical structure. This development motivates the need for algorithms that are able to effectively extract knowledge and insights from datapoints embedded in negatively curved spaces. We focus on the problem of nearest neighbor search, a fundamental problem in data analysis. We present efficient algorithmic solutions that build upon established methods for nearest neighbor search in Euclidean space, allowing for easy adoption and integration with existing systems. We prove theoretical guarantees for our techniques and our experiments demonstrate the effectiveness of our approach on real datasets over competing algorithms.
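
As a point of reference, the sketch below shows exact (brute-force) nearest neighbor search under hyperbolic distance. It assumes the Poincaré ball model, which the abstract does not specify, and it is a linear-scan baseline rather than the paper's method. The function names are hypothetical.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball."""
    sq_norm = lambda x: sum(c * c for c in x)
    diff = sq_norm([a - b for a, b in zip(u, v)])
    return math.acosh(1 + 2 * diff / ((1 - sq_norm(u)) * (1 - sq_norm(v))))


def nearest_neighbor(query, points):
    """Linear scan: return the point with the smallest hyperbolic distance to query."""
    return min(points, key=lambda p: poincare_distance(query, p))


query = (0.9, 0.0)
points = [(0.6, 0.0), (0.9, 0.25)]
print(nearest_neighbor(query, points))
```

In this example the Euclidean nearest neighbor of the query is `(0.9, 0.25)`, while the hyperbolic one is `(0.6, 0.0)`, because distances grow rapidly near the boundary of the ball. This is why a Euclidean index cannot be applied to such embeddings unchanged.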

Data Structures And Algorithms

Nearly Linear-Time, Parallelizable Algorithms for Non-Monotone Submodular Maximization

We study parallelizable algorithms for maximization of a submodular function, not necessarily monotone, with respect to a cardinality constraint $k$. We improve the best approximation factor achieved by an algorithm that has optimal adaptivity and query complexity, up to logarithmic factors in the size $n$ of the ground set, from $0.039-\epsilon$ to $0.193-\epsilon$. We provide two algorithms; the first has approximation ratio $1/6-\epsilon$, adaptivity $O(\log n)$, and query complexity $O(n \log k)$, while the second has approximation ratio $0.193-\epsilon$, adaptivity $O(\log^2 n)$, and query complexity $O(n \log k)$. Heuristic versions of our algorithms are empirically validated to use a low number of adaptive rounds and total queries while obtaining solutions with high objective value in comparison with state-of-the-art approximation algorithms, including continuous algorithms that use the multilinear extension.

Data Structures And Algorithms

New Approximation Algorithms for Forest Closeness Centrality -- for Individual Vertices and Vertex Groups

The emergence of massive graph data sets requires fast mining algorithms. Centrality measures to identify important vertices belong to the most popular analysis methods in graph mining. A measure that is gaining attention is forest closeness centrality; it is closely related to electrical measures using current flow but can also handle disconnected graphs. Recently, [Jin et al., ICDM'19] proposed an algorithm to approximate this measure probabilistically. Their algorithm processes small inputs quickly, but does not scale well beyond hundreds of thousands of vertices. In this paper, we first propose a different approximation algorithm; it is up to two orders of magnitude faster and more accurate in practice. Our method exploits the strong connection between uniform spanning trees and forest distances by adapting and extending recent approximation algorithms for related single-vertex problems. This results in a nearly-linear time algorithm with an absolute probabilistic error guarantee. In addition, we are the first to consider the problem of finding an optimal group of vertices w.r.t. forest closeness. We prove that this latter problem is NP-hard; to approximate it, we adapt a greedy algorithm by [Li et al., WWW'19], which is based on (partial) matrix inversion. Moreover, our experiments show that on disconnected graphs, group forest closeness outperforms existing centrality measures in the context of semi-supervised vertex classification.

Data Structures And Algorithms

New Data Structures for Orthogonal Range Reporting and Range Minima Queries

In this paper we present new data structures for two extensively studied variants of the orthogonal range searching problem. First, we describe a data structure that supports two-dimensional orthogonal range minima queries in $O(n)$ space and $O(\log^{\varepsilon} n)$ time, where $n$ is the number of points in the data structure and $\varepsilon$ is an arbitrarily small positive constant. Previously known linear-space solutions for this problem require $O(\log^{1+\varepsilon} n)$ (Chazelle, 1988) or $O(\log n \log\log n)$ time (Farzan et al., 2012). A modification of our data structure uses space $O(n \log\log n)$ and supports range minima queries in time $O(\log\log n)$. Both results can be extended to support three-dimensional five-sided reporting queries. Next, we turn to the four-dimensional orthogonal range reporting problem and present a data structure that answers queries in optimal $O(\log n/\log\log n + k)$ time, where $k$ is the number of points in the answer. This is the first data structure that achieves the optimal query time for this problem. Our results are obtained by exploiting the properties of three-dimensional shallow cuttings.

Data Structures And Algorithms

New FPT algorithms for finding the temporal hybridization number for sets of phylogenetic trees

We study the problem of finding a temporal hybridization network for a set of phylogenetic trees that minimizes the number of reticulations. First, we introduce an FPT algorithm for this problem on an arbitrary set of $m$ binary trees with $n$ leaves each, with a running time of $O(5^k \cdot n \cdot m)$, where $k$ is the minimum temporal hybridization number. We also present the concept of temporal distance, which is a measure of how close a tree-child network is to being temporal. Then we introduce an algorithm for computing a tree-child network with temporal distance at most $d$ and at most $k$ reticulations in $O((8k)^d 5^k \cdot n \cdot m)$ time. Lastly, we introduce an $O(6^k k! \cdot k \cdot n^2)$ time algorithm for computing a minimum temporal hybridization network for a set of two nonbinary trees. We also provide an implementation of all algorithms and an experimental analysis of their performance.

Data Structures And Algorithms

New Hardness Results for Planar Graph Problems in P and an Algorithm for Sparsest Cut

The Sparsest Cut is a fundamental optimization problem that has been extensively studied. For planar inputs the problem is in P and can be solved in $\tilde{O}(n^3)$ time if all vertex weights are $1$. Despite a significant amount of effort, the best algorithms date back to the early 90's and can only achieve $O(\log n)$-approximation in $\tilde{O}(n)$ time or a constant factor approximation in $\tilde{O}(n^2)$ time [Rao, STOC92]. Our main result is an $\Omega(n^{2-\epsilon})$ lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the $(\min,+)$-Convolution conjecture, showing that approximations are inevitable in the near-linear time regime. To complement the lower bound, we provide a constant factor approximation in near-linear time, improving upon the 25-year old result of Rao in both time and accuracy. Our lower bound accomplishes a repeatedly raised challenge by being the first fine-grained lower bound for a natural planar graph problem in P. Moreover, we prove near-quadratic lower bounds under SETH for variants of the closest pair problem in planar graphs, and use them to show that the popular Average-Linkage procedure for Hierarchical Clustering cannot be simulated in truly subquadratic time. We prove an $\Omega(n/\log n)$ lower bound on the number of communication rounds required to compute the weighted diameter of a network in the CONGEST model, even when the underlying graph is planar and all nodes are $D=4$ hops away from each other. This is the first $\mathrm{poly}(n) + \omega(D)$ lower bound in the planar-distributed setting, and it complements the recent $\mathrm{poly}(D, \log n)$ upper bounds of Li and Parter [STOC 2019] for (exact) unweighted diameter and for $(1+\epsilon)$-approximate weighted diameter.

Data Structures And Algorithms

New Quality Metrics for Dynamic Graph Drawing

In this paper, we present new quality metrics for dynamic graph drawings. Namely, we present a new framework for change faithfulness metrics for dynamic graph drawings, which compare the ground truth change in dynamic graphs and the geometric change in drawings. More specifically, we present two specific instances, cluster change faithfulness metrics and distance change faithfulness metrics. We first validate the effectiveness of our new metrics using deformation experiments. Then we compare various graph drawing algorithms using our metrics. Our experiments confirm that the best cluster (resp. distance) faithful graph drawing algorithms are also cluster (resp. distance) change faithful.

Data Structures And Algorithms

New Results and Bounds on Online Facility Assignment Problem

Consider an online facility assignment problem where a set of facilities $F=\{f_1, f_2, f_3, \dots, f_{|F|}\}$ of equal capacity $l$ is situated on a metric space and customers arrive one by one in an online manner on that space. We assign a customer $c_i$ to a facility $f_j$ before a new customer $c_{i+1}$ arrives. The cost of this assignment is the distance between $c_i$ and $f_j$. The objective of this problem is to minimize the sum of all assignment costs. Recently, Ahmed et al. (TCS, 806, pp. 455-467, 2020) studied the problem where the facilities are situated on a line and computed the competitive ratio of "Algorithm Greedy", which assigns the customer to the nearest available facility. They also computed the competitive ratio of an algorithm named "Algorithm Optimal-Fill", which assigns the new customer considering the optimal assignment of all previous customers. They also studied the problem where the facilities are situated on a connected unweighted graph. In this paper we first consider the case where $F$ is situated on the vertices of a connected unweighted grid graph $G$ of size $r \times c$ and customers arrive one by one at positions on the vertices of $G$. We show that Algorithm Greedy has competitive ratio $r \times c + r + c$ and Algorithm Optimal-Fill has competitive ratio $O(r \times c)$. We later show that the competitive ratio of Algorithm Optimal-Fill is $2|F|$ for any arbitrary graph. Our bound is tight and better than the previous result. We also consider the case where the facilities are distributed arbitrarily on a plane and provide an algorithm for this scenario. We also provide an algorithm that has competitive ratio $(2n-1)$. Finally, we consider a straight-line metric space and show that no algorithm for the online facility assignment problem has competitive ratio less than $9.001$.
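
The greedy rule described above (assign each arriving customer to the nearest facility that still has free capacity) can be sketched for the line metric as follows. This is a minimal illustration, not the authors' implementation. The function name `greedy_assign`, the representation of positions as numbers on a line, and breaking ties toward the lower facility index are all assumptions made for the example.

```python
def greedy_assign(facilities, capacity, customers):
    """Online greedy on a line: each customer goes to the nearest non-full facility.

    facilities: positions of the facilities on the line
    capacity:   common capacity l of every facility
    customers:  customer positions in arrival order
    Returns (list of chosen facility indices, total assignment cost).
    """
    remaining = [capacity] * len(facilities)
    assignment = []
    total_cost = 0
    for c in customers:
        open_ids = [j for j, left in enumerate(remaining) if left > 0]
        if not open_ids:
            raise ValueError("no facility has free capacity")
        # Assumed tie-break: min() keeps the lowest index among equidistant facilities.
        j = min(open_ids, key=lambda idx: abs(facilities[idx] - c))
        remaining[j] -= 1
        assignment.append(j)
        total_cost += abs(facilities[j] - c)
    return assignment, total_cost


print(greedy_assign([0, 10], 1, [6, 9]))
```

On this instance greedy sends the first customer (position 6) to the facility at 10 and is then forced to send the second (position 9) to the facility at 0, for a total cost of 13, whereas the offline optimum costs 7. Gaps of this kind are what the competitive ratio measures.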
