Featured Researches

Data Structures And Algorithms

Improved Algorithms for the General Exact Satisfiability Problem

The Exact Satisfiability problem asks whether there is a satisfying assignment in which exactly one literal in each clause is assigned 1, while the rest are all assigned 0. We can generalise this problem further by defining a C_j clause to be solved iff exactly j of its literals are assigned 1 and all others 0. We then introduce the family of Generalised Exact Satisfiability problems, G_iXSAT: given an instance consisting of C_j clauses with j ∈ {0, 1, …, i}, decide whether it has a satisfying assignment. In this paper, we present faster exact polynomial-space algorithms, using a nonstandard measure, that solve G_iXSAT for i ∈ {2, 3, 4} in O(1.3674^n), O(1.5687^n) and O(1.6545^n) time, respectively, where n is the number of variables. This improves the state of the art for polynomial-space algorithms from O(1.4203^n) time for G_2XSAT by Zhou, Jiang and Yin, from O(1.6202^n) time for G_3XSAT by Dahllöf, and from O(1.6844^n) time for G_4XSAT, also by Dahllöf. In addition, we present faster exact algorithms solving G_2XSAT, G_3XSAT and G_4XSAT in O(1.3188^n), O(1.3407^n) and O(1.3536^n) time respectively, at the expense of exponential space.
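As a concrete illustration of the definition, the sketch below checks whether an assignment solves a set of C_j clauses, together with a trivial exponential-time baseline that the paper's branching algorithms improve upon. Function and variable names here are our own, not from the paper.

```python
from itertools import product

def is_gixsat_satisfied(clauses, assignment):
    """A clause is a pair (j, literals): it is solved iff exactly j of its
    literals evaluate to 1. Literals are signed integers: +v means variable
    v, -v means its negation; assignment maps v -> bool."""
    for j, literals in clauses:
        ones = sum(1 for lit in literals
                   if assignment[abs(lit)] == (lit > 0))
        if ones != j:
            return False
    return True

def solve_gixsat_brute_force(clauses, num_vars):
    """Exponential-time baseline: try all 2^n assignments. The paper's
    algorithms replace this with clever branching under a nonstandard
    measure to get e.g. O(1.3674^n) for G_2XSAT."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        if is_gixsat_satisfied(clauses, assignment):
            return assignment
    return None

# A G_2XSAT instance: clause (2, [1, 2, 3]) needs exactly two of x1, x2, x3
# true; clause (1, [-1, 4]) needs exactly one of (not x1), x4 true.
clauses = [(2, [1, 2, 3]), (1, [-1, 4])]
print(solve_gixsat_brute_force(clauses, 4))
# → {1: False, 2: True, 3: True, 4: False}
```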


Improved Analysis of RANKING for Online Vertex-Weighted Bipartite Matching

In this paper, we consider the online vertex-weighted bipartite matching problem in the random arrival model. We study the generalization of the RANKING algorithm for this problem introduced by Huang, Tang, Wu, and Zhang (TALG 2019), who show that their algorithm has a competitive ratio of 0.6534. We show that the assumptions in their analysis can be weakened, allowing us to replace their derivation of a crucial function g on the unit square with a linear program that computes the values of a best possible g under these assumptions on a discretized unit square. We show that the discretization does not incur much error, and show computationally that we can obtain a competitive ratio of 0.6629. To compute the bound over our discretized unit square we used parallelization, and even so the computation required two days on a 64-core machine. Furthermore, by modifying our linear program somewhat, we can computationally show an upper bound of 0.6688 on our approach; any further progress beyond this bound will require either further weakening of the assumptions on g or a stronger analysis than that of Huang et al.


Improved Approximations for Min Sum Vertex Cover and Generalized Min Sum Set Cover

We study the generalized min sum set cover (GMSSC) problem: given a collection of hyperedges E with arbitrary covering requirements k_e, the goal is to find an ordering of the vertices that minimizes the total cover time of the hyperedges, where a hyperedge e is covered at the first time at which k_e of its vertices have appeared in the ordering. We give a 4.642-approximation algorithm for GMSSC, coming close to the best possible bound of 4, which holds already for the classical special case (all k_e = 1) of min sum set cover (MSSC) studied by Feige, Lovász and Tetali, and improving upon the previous best known bound of 12.4 due to Im, Sviridenko and van der Zwaan. Our algorithm is based on transforming the LP solution by a suitable kernel and applying randomized rounding. This also gives an LP-based 4-approximation for MSSC. As part of the analysis of our algorithm, we derive an inequality on the lower tail of a sum of independent Bernoulli random variables, which might be of independent interest and broader utility. Another well-known special case is the min sum vertex cover (MSVC) problem, in which the input hypergraph is a graph and k_e = 1 for every edge. We give a 16/9-approximation for MSVC, and show a matching integrality gap for the natural LP relaxation. This improves upon the previous best 1.999946-approximation of Barenholz, Feige and Peleg. (The claimed 1.79-approximation of Iwata, Tetali and Tripathi for MSVC turned out to have an unfortunate, seemingly unfixable, mistake.) Finally, we revisit MSSC and consider the ℓ_p norm of the cover times of the hyperedges. Using a dual fitting argument, we show that the natural greedy algorithm achieves approximation guarantees of (p+1)^{1+1/p} for all p ≥ 1, tight up to NP-hardness. For p = 1, this gives yet another proof of the 4-approximation for MSSC.
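To make the objective concrete, here is a small sketch (names are ours, not the paper's) that evaluates the GMSSC objective for a given vertex ordering:

```python
def total_cover_time(ordering, hyperedges, requirements):
    """Total cover time of a vertex ordering for GMSSC: hyperedge e is
    covered at the first position t at which k_e of its vertices have
    appeared among the first t vertices of the ordering."""
    position = {v: t for t, v in enumerate(ordering, start=1)}
    total = 0
    for e, k_e in zip(hyperedges, requirements):
        # The cover time of e is the k_e-th smallest position of its vertices.
        times = sorted(position[v] for v in e)
        total += times[k_e - 1]
    return total

# Classical MSSC is the special case with all k_e = 1.
hyperedges = [{'a', 'b'}, {'b', 'c'}, {'c'}]
print(total_cover_time(['b', 'c', 'a'], hyperedges, [1, 1, 1]))  # → 4
```

An approximation algorithm for GMSSC seeks an ordering whose total cover time is within the stated factor of the optimum; this helper only scores a candidate ordering.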


Improved Deterministic Network Decomposition

Network decomposition is a central tool in distributed graph algorithms. We present two improvements on the state of the art for network decomposition, which in turn improve the (deterministic and randomized) complexity of several well-studied graph problems.
- We provide a deterministic distributed network decomposition algorithm with O(log^5 n) round complexity, using O(log n)-bit messages. This improves on the O(log^7 n)-round algorithm of Rozhoň and Ghaffari [STOC'20], which used large messages, and their O(log^8 n)-round algorithm with O(log n)-bit messages. It directly yields similar improvements for a wide range of deterministic and randomized distributed algorithms whose solutions rely on network decomposition, including the general distributed derandomization of Ghaffari, Kuhn, and Harris [FOCS'18].
- One drawback of the algorithm of Rozhoň and Ghaffari, in the CONGEST model, was its dependence on the length of the identifiers. Because of this, for instance, the algorithm could not be used in the shattering framework in the CONGEST model. Thus, the state-of-the-art randomized complexity of several problems in this model retained an additive 2^{O(√(log log n))} term, a clear leftover of the older network decomposition complexity [Panconesi and Srinivasan, STOC'92]. We present a modified version that remedies this, constructing a decomposition whose quality does not depend on the identifiers, and thus improve the randomized round complexity of various problems.


Improved Distance Sensitivity Oracles with Subcubic Preprocessing Time

We consider the problem of building Distance Sensitivity Oracles (DSOs). Given a directed graph G = (V, E) with edge weights in {1, 2, …, M}, we need to preprocess it into a data structure that answers the following queries: given vertices u, v ∈ V and a failed vertex or edge f ∈ (V ∪ E), output the length of the shortest path from u to v that does not go through f. Our main result is a simple DSO with Õ(n^{2.7233} M) preprocessing time and O(1) query time. Moreover, if the input graph is undirected, the preprocessing time can be improved to Õ(n^{2.6865} M). The preprocessing algorithm is randomized and correct with probability ≥ 1 − 1/n^C, for a constant C that can be made arbitrarily large. Previously, the best known DSO had Õ(n^{2.8729} M) preprocessing time and polylog(n) query time [Chechik and Cohen, STOC'20]. At the core of our DSO is the following observation from [Bernstein and Karger, STOC'09]: if there is a DSO with preprocessing time P and query time Q, then we can construct a DSO with preprocessing time P + Õ(n^2)·Q and query time O(1). (Here Õ(·) hides polylog(n) factors.)
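For intuition about the query semantics, the following naive baseline (ours, not the paper's construction) answers a single query with a fresh Dijkstra run on G with the failure removed; a real DSO spends its preprocessing time up front precisely to avoid this per-query cost:

```python
import heapq

def shortest_path_avoiding(graph, u, v, failed):
    """Answer one DSO query naively: shortest u-v distance in the directed,
    weighted graph (adjacency lists of (neighbor, weight)) avoiding `failed`,
    which is either a vertex or an edge given as a (tail, head) tuple."""
    def blocked(x, y):
        if isinstance(failed, tuple):            # failed edge
            return (x, y) == failed or (y, x) == failed
        return failed in (x, y)                  # failed vertex
    dist = {u: 0}
    heap = [(0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == v:
            return d
        if d > dist.get(x, float('inf')):
            continue                             # stale heap entry
        for y, w in graph.get(x, []):
            if blocked(x, y):
                continue
            if d + w < dist.get(y, float('inf')):
                dist[y] = d + w
                heapq.heappush(heap, (d + w, y))
    return float('inf')

graph = {'u': [('f', 1), ('a', 2)], 'f': [('v', 1)], 'a': [('v', 3)]}
print(shortest_path_avoiding(graph, 'u', 'v', 'f'))         # vertex failure → 5
print(shortest_path_avoiding(graph, 'u', 'v', ('a', 'v')))  # edge failure → 2
```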


Improved FPT Algorithms for Deletion to Forest-like Structures

The Feedback Vertex Set problem is undoubtedly one of the most well-studied problems in Parameterized Complexity. In this problem, given an undirected graph G and a non-negative integer k, the objective is to test whether there exists a subset S ⊆ V(G) of size at most k such that G − S is a forest. After a long line of improvements, Li and Nederlof [SODA, 2020] recently designed a randomized algorithm for the problem running in time O*(2.7^k). The Parameterized Complexity literature has studied several problems around Feedback Vertex Set, including Independent Feedback Vertex Set (where the set S must additionally be an independent set in G), Pseudoforest Deletion and Almost Forest Deletion. In Pseudoforest Deletion, each connected component of G − S must have at most one cycle. In Almost Forest Deletion, the input is a graph G and non-negative integers k, ℓ ∈ N, and the objective is to test whether there exists a vertex subset S of size at most k such that G − S is ℓ edges away from a forest. In this paper, using the methodology of Li and Nederlof [SODA, 2020], we obtain the current fastest algorithms for all these problems. In particular, we obtain the following randomized algorithms. 1) Independent Feedback Vertex Set can be solved in time O*(2.7^k). 2) Pseudoforest Deletion can be solved in time O*(2.85^k). 3) Almost Forest Deletion can be solved in time O*(min{2.85^k · 8.54^ℓ, 2.7^k · 36.61^ℓ, 3^k · 1.78^ℓ}).
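To fix the problem definition, this sketch (ours, not part of the paper) verifies an FVS certificate, i.e. checks that G − S is a forest, using union-find; a cycle survives iff some remaining edge joins two already-connected vertices:

```python
def is_forest_after_deletion(n, edges, S):
    """Return True iff deleting vertex set S from the n-vertex undirected
    graph given by `edges` leaves an acyclic graph (a forest)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for a, b in edges:
        if a in S or b in S:
            continue                        # edge removed along with S
        ra, rb = find(a), find(b)
        if ra == rb:
            return False                    # surviving edge closes a cycle
        parent[ra] = rb
    return True

# A triangle 0-1-2 plus pendant vertex 3: deleting vertex 0 leaves a forest.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(is_forest_after_deletion(4, edges, {0}))    # → True
print(is_forest_after_deletion(4, edges, set()))  # → False
```

The variants in the abstract change only this certificate check: Pseudoforest Deletion would allow one cycle per component, and Almost Forest Deletion would allow up to ℓ surplus edges overall.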


Improved Hierarchical Clustering on Massive Datasets with Broad Guarantees

Hierarchical clustering is a powerful extension of one of today's most influential unsupervised learning methods: clustering. Its goal is to create a hierarchy of clusters, thus constructing a cluster evolutionary history and simultaneously finding clusterings at all resolutions. We propose four traits of interest for hierarchical clustering algorithms: (1) empirical performance, (2) theoretical guarantees, (3) cluster balance, and (4) scalability. While a number of algorithms are designed to achieve one or two of these traits at a time, none achieves all four. Inspired by Bateni et al.'s scalable and empirically successful Affinity Clustering [NeurIPS 2017], we introduce its successor, Matching Affinity Clustering. Like its predecessor, Matching Affinity Clustering maintains strong empirical performance and uses Massively Parallel Communication as its distributed model. Designed to maintain provably balanced clusters, our algorithm achieves good, constant-factor approximations for Moseley and Wang's revenue and Cohen-Addad et al.'s value. We show that Affinity Clustering cannot approximate either function. Along the way, we also introduce an efficient k-sized maximum matching algorithm in the MPC model.


Improved LP-based Approximation Algorithms for Facility Location with Hard Capacities

We present LP-based approximation algorithms for the capacitated facility location problem (CFL), a long-standing problem whose approximability has remained intriguingly unsettled since the 90s. We present an elegant iterative rounding scheme for the MFN relaxation that yields an approximation guarantee of (10 + √67)/2 ≈ 9.0927, a significant improvement upon the previous LP-based ratio of 288 due to An et al. in 2014. For CFL with cardinality facility cost (CFL-CFC), we present an LP-based 4-approximation algorithm, which not only surpasses the long-standing ratio of 5 due to Levi et al., unimproved since 2004, but also breaks the long-time tie with the best approximation for CFL, obtained via local search in 2012. Our result considerably deepens the current understanding of the CFL problem and indicates that an LP-based ratio strictly better than 5 in polynomial time for the general problem may still be within reach.


Improved Multi-Pass Streaming Algorithms for Submodular Maximization with Matroid Constraints

We give improved multi-pass streaming algorithms for maximizing a monotone or arbitrary non-negative submodular function subject to a general p-matchoid constraint, in the model in which elements of the ground set arrive one at a time in a stream. The family of constraints we consider generalizes both the intersection of p arbitrary matroid constraints and p-uniform hypergraph matching. For monotone submodular functions, our algorithm attains a guarantee of p + 1 + ε using O(p/ε) passes and requires storing only O(k) elements, where k is the maximum size of a feasible solution. This immediately gives an O(1/ε)-pass (2+ε)-approximation algorithm for monotone submodular maximization subject to a matroid constraint and a (3+ε)-approximation for monotone submodular matching. Our algorithm is oblivious to the choice of ε and can be stopped after any number of passes, delivering the appropriate guarantee. We extend our techniques to obtain the first multi-pass streaming algorithm for general non-negative submodular functions subject to a p-matchoid constraint whose number of passes is independent of the size of the ground set and of k. We show that a randomized O(p/ε)-pass algorithm storing O(p^3 k log(k)/ε^3) elements gives a (p + 1 + γ̄ + O(ε))-approximation, where γ̄ is the guarantee of the best known offline algorithm for the same problem.
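For readers new to the streaming model, the heavily simplified one-pass sketch below handles only the easiest special case, a cardinality constraint (a rank-k uniform matroid, the simplest matchoid), using a standard swap rule: keep a new element only if its marginal gain is at least twice the recorded gain of the weakest kept element. It is meant to illustrate the model, not the paper's multi-pass p-matchoid algorithm or its guarantees; all names are ours.

```python
def streaming_cardinality_submodular(stream, f, k):
    """One-pass sketch for maximizing a monotone submodular function f
    under |solution| <= k, storing at most k elements at any time."""
    kept = []  # list of (element, marginal gain recorded when added)
    for e in stream:
        current = [x for x, _ in kept]
        gain = f(current + [e]) - f(current)
        if len(kept) < k:
            if gain > 0:
                kept.append((e, gain))
        else:
            # Swap rule: replace the weakest kept element only if the
            # newcomer's gain is at least twice its recorded gain.
            weakest = min(range(k), key=lambda i: kept[i][1])
            if gain >= 2 * kept[weakest][1]:
                kept[weakest] = (e, gain)
    return [x for x, _ in kept]

# Coverage, a canonical monotone submodular function: f(S) = |union of sets|.
sets = {'a': {1, 2}, 'b': {2, 3}, 'c': {1, 2, 3, 4, 5, 6}}
f = lambda S: len(set().union(*(sets[x] for x in S)))
print(streaming_cardinality_submodular('abc', f, 2))  # → ['a', 'c']
```

Element 'b' is accepted while space remains, then evicted when 'c' arrives with a much larger marginal gain; multi-pass algorithms like the paper's revisit the stream to recover value lost to such greedy decisions.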


Improved Weighted Additive Spanners

Graph spanners and emulators are sparse structures that approximately preserve distances of the original graph. While there has been an extensive amount of work on additive spanners, so far little attention has been given to weighted graphs. Only very recently, [ABSKS20] extended the classical +2 (respectively, +4) spanners for unweighted graphs of size O(n^{3/2}) (resp., O(n^{7/5})) to the weighted setting, where the additive error is +2W (resp., +4W). This means that for every pair u, v, the additive stretch is at most +2W_{u,v}, where W_{u,v} is the maximal edge weight on the shortest u−v path. In addition, [ABSKS20] showed an algorithm yielding a +8W_max spanner of size O(n^{4/3}), where W_max is the maximum edge weight in the entire graph. In this work we improve the latter result by devising a simple deterministic algorithm for a +(6+ε)W spanner for weighted graphs with size O(n^{4/3}) (for any constant ε > 0), thus nearly matching the classical +6 spanner of size O(n^{4/3}) for unweighted graphs. Furthermore, we show a +(2+ε)W subsetwise spanner of size O(n·√|S|), improving the +4W_max result of [ABSKS20] (which had the same size). We also show a simple randomized algorithm for a +4W emulator of size Õ(n^{4/3}). In addition, we show that our technique is applicable to very sparse additive spanners of linear size. For weighted graphs, we use a variant of our simple deterministic algorithm that yields a linear-size +Õ(√n · W) spanner, and we also obtain a tradeoff between size and stretch. Finally, generalizing the technique of [DHZ00] for unweighted graphs, we devise an efficient randomized algorithm producing a +2W spanner for weighted graphs of size Õ(n^{3/2}) in Õ(n^2) time.
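To make the +cW stretch guarantee concrete, this sketch (ours, not from [ABSKS20]) computes W_{u,v}, the maximum edge weight on a shortest u−v path, and checks whether a candidate subgraph meets the additive bound d_H(u,v) ≤ d_G(u,v) + c·W_{u,v}:

```python
import heapq

def dijkstra(graph, src):
    """Standard Dijkstra on an undirected weighted adjacency-list graph;
    returns distance and predecessor maps."""
    dist, pred = {src: 0}, {src: None}
    heap = [(0, src)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue
        for y, w in graph[x]:
            if d + w < dist.get(y, float('inf')):
                dist[y], pred[y] = d + w, x
                heapq.heappush(heap, (d + w, y))
    return dist, pred

def max_weight_on_shortest_path(graph, u, v):
    """W_{u,v}: the maximum edge weight along one shortest u-v path."""
    dist, pred = dijkstra(graph, u)
    W, x = 0, v
    while pred[x] is not None:
        p = pred[x]
        W = max(W, dist[x] - dist[p])  # weight of tree edge (p, x)
        x = p
    return W

def additive_stretch_ok(graph, spanner, u, v, c):
    """Check the +cW guarantee for the pair (u, v)."""
    dG, _ = dijkstra(graph, u)
    dH, _ = dijkstra(spanner, u)
    return dH.get(v, float('inf')) <= dG[v] + c * max_weight_on_shortest_path(graph, u, v)

# Toy graph: shortest u-v path is u-a-v (length 2, W_{u,v} = 1); the
# candidate spanner drops edge (a, v), forcing the detour u-b-v (length 6).
graph = {'u': [('a', 1), ('b', 5)], 'a': [('u', 1), ('v', 1)],
         'b': [('u', 5), ('v', 1)], 'v': [('a', 1), ('b', 1)]}
spanner = {'u': [('a', 1), ('b', 5)], 'a': [('u', 1)],
           'b': [('u', 5), ('v', 1)], 'v': [('b', 1)]}
print(additive_stretch_ok(graph, spanner, 'u', 'v', 4))  # 6 <= 2 + 4*1 → True
print(additive_stretch_ok(graph, spanner, 'u', 'v', 2))  # 6 <= 2 + 2*1 → False
```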

