Computer Science Data Structures And Algorithms - Researchain

Featured Researches

Adjustable Coins

In this paper we consider a scenario where there are several algorithms for solving a given problem. Each algorithm is associated with a probability of success and a cost, and there is also a penalty for failing to solve the problem. The user may run one algorithm at a time for the specified cost, or give up and pay the penalty. The probability of success may be implied by randomization in the algorithm, or by assuming a probability distribution on the input space; these two interpretations lead to different variants of the problem. The goal is to minimize the expected cost of the process under the assumption that the algorithms are independent. We study several variants of this problem, and present possible solution strategies and a hardness result.
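As a rough illustration of the cost model described above (not the paper's own strategies), the following sketch computes the expected cost of running independent algorithms in a fixed order and paying the penalty if all of them fail. The function names and the `(probability, cost)` tuple representation are assumptions made for this example, and the exhaustive search is only practical for a handful of algorithms.

```python
from itertools import permutations


def expected_cost(order, penalty):
    """Expected cost of trying (p_success, cost) pairs in the given order,
    stopping at the first success and paying `penalty` if every one fails."""
    total = 0.0
    p_reach = 1.0  # probability that all earlier algorithms failed
    for p_success, cost in order:
        total += p_reach * cost
        p_reach *= 1.0 - p_success
    return total + p_reach * penalty


def best_order(algorithms, penalty):
    """Hypothetical brute-force helper: try every ordered subset of the
    algorithms (giving up after the subset is exhausted) and keep the cheapest."""
    best = (expected_cost([], penalty), ())
    for r in range(1, len(algorithms) + 1):
        for order in permutations(algorithms, r):
            cost = expected_cost(order, penalty)
            if cost < best[0]:
                best = (cost, order)
    return best
```

With two algorithms `(0.5, 1.0)` and `(0.5, 2.0)` and a penalty of 10, running the cheaper one first gives an expected cost of 1 + 0.5·2 + 0.25·10 = 4.5.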

Data Structures And Algorithms

Adwords in a Panorama

Three decades ago, Karp, Vazirani, and Vazirani (STOC 1990) defined the online matching problem and gave an optimal (1 − 1/e) ≈ 0.632-competitive algorithm. Fifteen years later, Mehta, Saberi, Vazirani, and Vazirani (FOCS 2005) introduced the first generalization, called AdWords and motivated by online advertising, and obtained the optimal 1 − 1/e competitive ratio in the special case of small bids. It has been open ever since whether there is an algorithm for general bids better than the 0.5-competitive greedy algorithm. This paper presents a 0.5016-competitive algorithm for AdWords, answering this open question in the affirmative. The algorithm builds on several ingredients, including a combination of the online primal dual framework and the configuration linear program of matching problems recently explored by Huang and Zhang (STOC 2020), a novel formulation of AdWords which we call the panorama view, and a generalization of the online correlated selection by Fahrbach, Huang, Tao, and Zadimorghaddam (FOCS 2020) which we call the panoramic online correlated selection.
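For context, the greedy baseline mentioned above can be sketched as follows: each arriving query goes to the advertiser offering the largest gain, where the gain is the bid truncated by the remaining budget. This is a minimal sketch of that baseline only, not of the paper's 0.5016-competitive algorithm; the function name and the input representation (a list of budgets and a list of `advertiser -> bid` dictionaries) are assumptions for illustration.

```python
def greedy_adwords(budgets, queries):
    """Assign each online query to the advertiser with the largest
    budget-truncated bid. Returns (total revenue, list of chosen advertisers)."""
    remaining = list(budgets)
    revenue = 0.0
    assignment = []
    for bids in queries:
        best, gain = None, 0.0
        for advertiser, bid in bids.items():
            g = min(bid, remaining[advertiser])  # cannot collect more than the budget
            if g > gain:
                best, gain = advertiser, g
        if best is not None:
            remaining[best] -= gain
            revenue += gain
        assignment.append(best)  # None means the query stays unmatched
    return revenue, assignment
```

On budgets `[2, 2]` with queries `[{0: 2, 1: 1}, {0: 2}]`, greedy collects 2 while the offline optimum collects 3, which shows how large general bids can hurt the greedy choice.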

Data Structures And Algorithms

Algorithms and Complexity for Variants of Covariates Fine Balance

We study several variants of the covariates fine balance problem, generalizing some of these problems and introducing a number of others. We present a comprehensive complexity study of the covariates problems, providing for each either a polynomial time algorithm or a proof of NP-hardness. The polynomial time algorithms described are mostly combinatorial and rely on network flow techniques. In addition, we present several fixed-parameter tractable results for problems where the number of covariates and the number of levels of each covariate are treated as parameters.

Data Structures And Algorithms

Algorithms and Complexity on Indexing Founder Graphs

We study the problem of matching a string in a labeled graph. Previous research has shown that unless the Orthogonal Vectors Hypothesis (OVH) is false, one cannot solve this problem in strongly sub-quadratic time, nor index the graph in polynomial time to answer queries efficiently (Equi et al. ICALP 2019, SOFSEM 2021). These conditional lower bounds cover even deterministic graphs with a binary alphabet, but there naturally also exist graph classes that are easy to index: e.g., Wheeler graphs (Gagie et al. Theor. Comp. Sci. 2017) cover graphs admitting a Burrows-Wheeler transform-based indexing scheme. However, it is NP-complete to recognize whether a graph is a Wheeler graph (Gibney, Thankachan, ESA 2019). We propose an approach to alleviate the construction bottleneck of Wheeler graphs. Rather than starting from an arbitrary graph, we study graphs induced from multiple sequence alignments (MSAs). Elastic degenerate strings (Bernardini et al. SPIRE 2017, ICALP 2019) can be seen as such graphs, and we introduce here their generalization: elastic founder graphs. We first prove that even such induced graphs are hard to index under OVH. Then we introduce two subclasses, repeat-free and semi-repeat-free graphs, that are easy to index. We give a linear time algorithm to construct a repeat-free non-elastic founder graph from a gapless MSA, and (parameterized) near-linear time algorithms to construct semi-repeat-free (repeat-free, respectively) elastic founder graphs from general MSAs. Finally, we show that repeat-free elastic founder graphs admit a reduction to Wheeler graphs in polynomial time.

Data Structures And Algorithms

Algorithms and Hardness for Multidimensional Range Updates and Queries

Traditional orthogonal range problems allow queries over a static set of points, each with some value. Dynamic variants allow points to be added or removed, one at a time. To support more powerful updates, we introduce the Grid Range class of data structure problems over integer arrays in one or more dimensions. These problems allow range updates (such as filling all cells in a range with a constant) and queries (such as finding the sum or maximum of values in a range). In this work, we consider these operations along with updates that replace each cell in a range with the minimum, maximum, or sum of its existing value, and a constant. In one dimension, it is known that segment trees can be leveraged to facilitate any n of these operations in Õ(n) time overall. Other than a few specific cases, until now, higher dimensional variants have been largely unexplored. We show that no truly subquadratic time algorithm can support certain pairs of these updates simultaneously without falsifying several popular conjectures. On the positive side, we show that truly subquadratic algorithms can be obtained for variants induced by other subsets. We provide two approaches to designing such algorithms that can be generalised to online and higher dimensional settings. First, we give almost-tight Õ(n^{3/2}) time algorithms for single-update variants where the update operation distributes over the query operation. Second, for other variants, we provide a general framework for reducing to instances with a special geometry. Using this, we show that O(m^{3/2−ε}) time algorithms for counting paths and walks of length 2 and 3 between vertex pairs in sparse graphs imply truly subquadratic data structures for certain variants; to this end, we give an Õ(m^{(4ω−1)/(2ω+1)}) = O(m^{1.478}) time algorithm for counting simple 3-paths between vertex pairs.
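The one-dimensional case mentioned above can be illustrated with a standard lazy segment tree. The sketch below supports one update type (fill a range with a constant) and one query type (range sum), each in O(log n) time; it is a textbook construction given for orientation, not the paper's higher-dimensional data structure, and the class name `FillSumTree` is an assumption of this example.

```python
class FillSumTree:
    """Lazy segment tree over n >= 1 integer cells (initially 0) supporting
    range fill with a constant and range sum, with inclusive 0-based indices."""

    def __init__(self, n):
        self.n = n
        self.total = [0] * (4 * n)       # sum of each node's segment
        self.pending = [None] * (4 * n)  # fill value not yet pushed to children

    def _apply(self, node, lo, hi, value):
        self.total[node] = value * (hi - lo + 1)
        self.pending[node] = value

    def _push(self, node, lo, hi):
        if self.pending[node] is not None:
            mid = (lo + hi) // 2
            self._apply(2 * node, lo, mid, self.pending[node])
            self._apply(2 * node + 1, mid + 1, hi, self.pending[node])
            self.pending[node] = None

    def fill(self, left, right, value, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return
        if left <= lo and hi <= right:
            self._apply(node, lo, hi, value)
            return
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        self.fill(left, right, value, 2 * node, lo, mid)
        self.fill(left, right, value, 2 * node + 1, mid + 1, hi)
        self.total[node] = self.total[2 * node] + self.total[2 * node + 1]

    def range_sum(self, left, right, node=1, lo=0, hi=None):
        if hi is None:
            hi = self.n - 1
        if right < lo or hi < left:
            return 0
        if left <= lo and hi <= right:
            return self.total[node]
        self._push(node, lo, hi)
        mid = (lo + hi) // 2
        return (self.range_sum(left, right, 2 * node, lo, mid)
                + self.range_sum(left, right, 2 * node + 1, mid + 1, hi))
```

Filling cells 0..7 with 1 and then cells 2..5 with 3 leaves a total of 2·1 + 4·3 + 2·1 = 16 over the whole array.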

Data Structures And Algorithms

Algorithms and Lower Bounds for the Worker-Task Assignment Problem

We study the problem of assigning workers to tasks where each task has demand for a particular number of workers, and the demands are dynamically changing over time. Specifically, a worker-task assignment function ϕ takes a multiset of w tasks T ⊆ [t] and produces an assignment ϕ(T) from the workers 1, 2, …, w to the tasks T. The assignment function ϕ is said to have switching cost at most k if, for all task multisets T, changing the contents of T by one task changes ϕ(T) by at most k worker assignments. The goal of the worker-task assignment problem is to produce an assignment function ϕ with the minimum possible switching cost. Prior work on this problem (SSS'17, ICALP'20) observed a simple assignment function ϕ with switching cost min(w, t−1), but there has been no success in constructing ϕ with sublinear switching cost. We construct the first assignment function ϕ with sublinear, and in fact polylogarithmic, switching cost. We give a probabilistic construction for ϕ that achieves switching cost O(log w log(wt)) and an explicit construction that achieves switching cost polylog(wt). From the lower bounds side, prior work has used involved arguments to prove constant lower bounds on switching cost, but no super-constant lower bounds are known. We prove the first super-constant lower bound on switching cost. In particular, we show that for any value of w there exists a value of t for which the optimal switching cost is w. That is, when w ≪ t, the trivial bound on switching cost is optimal. We also consider an application of the worker-task assignment problem to a metric embeddings problem. In particular, we use our results to give the first low-distortion embedding from sparse binary vectors into low-dimensional Hamming space.

Data Structures And Algorithms

Algorithms for the Minimum Dominating Set Problem in Bounded Arboricity Graphs: Simpler, Faster, and Combinatorial

We revisit the minimum dominating set problem on graphs with arboricity α. In the (standard) centralized setting, Bansal and Umboh [BU17] gave an O(α)-approximation LP rounding algorithm, which translates into a near-linear time algorithm using general-purpose approximation results for explicit covering LPs [KY14, You14, AZO19, Qua20]. Moreover, [BU17] showed that it is NP-hard to achieve an asymptotic improvement for the approximation factor. On the other hand, the previous two non-LP-based algorithms, by Lenzen and Wattenhofer [LW10] and Jones et al. [JLR+13], achieve an approximation factor of O(α^2) in linear time. There is a similar situation in the distributed setting: while there are O(log^2 n)-round LP-based O(α)-approximation algorithms [KMW06, DKM19], the best non-LP-based algorithm by Lenzen and Wattenhofer [LW10] is an implementation of their centralized algorithm, providing an O(α^2)-approximation within O(log n) rounds with high probability. We address the questions of whether one can achieve an O(α)-approximation algorithm that is elementary, i.e., not based on any LP-based methods, either in the centralized setting or in the distributed setting. We resolve both questions in the affirmative, and en route achieve algorithms that are faster than the state-of-the-art LP-based algorithms: 1. In the centralized setting, we provide a surprisingly simple combinatorial algorithm that is asymptotically optimal in terms of both approximation factor and runtime: an O(α)-approximation in linear time. The previous best O(α)-approximation algorithms are LP-based and have super-linear running time. 2. Based on our centralized algorithm, we design a distributed combinatorial O(α)-approximation algorithm in the CONGEST model that runs in O(α log n) rounds with high probability.

Data Structures And Algorithms

Algorithms, Reductions and Equivalences for Small Weight Variants of All-Pairs Shortest Paths

APSP with small integer weights in undirected graphs [Seidel'95, Galil and Margalit'97] has an Õ(n^ω) time algorithm, where ω < 2.373 is the matrix multiplication exponent. APSP in directed graphs with small weights, however, has a much slower running time that would be Ω(n^{2.5}) even if ω = 2 [Zwick'02]. To understand this n^{2.5} bottleneck, we build a web of reductions around directed unweighted APSP. We show that it is fine-grained equivalent to computing a rectangular Min-Plus product for matrices with integer entries; the dimensions and entry size of the matrices depend on the value of ω. As a consequence, we establish an equivalence between APSP in directed unweighted graphs, APSP in directed graphs with small (Õ(1)) integer weights, All-Pairs Longest Paths in DAGs with small weights, approximate APSP with additive error c in directed graphs with small weights, for c ≤ Õ(1), and several other graph problems. We also provide fine-grained reductions from directed unweighted APSP to All-Pairs Shortest Lightest Paths (APSLP) in undirected graphs with {0,1} weights and # mod c APSP in directed unweighted graphs (computing counts mod c). We complement our hardness results with new algorithms. We improve the known algorithms for APSLP in directed graphs with small integer weights and for approximate APSP with sublinear additive error in directed unweighted graphs. Our algorithm for approximate APSP with sublinear additive error is optimal, when viewed as a reduction to Min-Plus product. We also give new algorithms for variants of #APSP in unweighted graphs, as well as a near-optimal Õ(n^3)-time algorithm for the original #APSP problem in unweighted graphs. Our techniques also lead to a simpler alternative for the original APSP problem in undirected graphs with small integer weights.
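For readers unfamiliar with the Min-Plus product at the center of these reductions, the sketch below gives the naive definition-level computation for rectangular matrices: entry C[i][j] is the minimum over t of A[i][t] + B[t][j]. It runs in cubic time and is meant only to pin down the operation, not to reflect the fast algorithms discussed in the abstract; the function name is an assumption of this example.

```python
def min_plus_product(A, B):
    """Naive Min-Plus product of an n x k matrix A and a k x m matrix B,
    both given as lists of rows: C[i][j] = min over t of A[i][t] + B[t][j]."""
    n, k, m = len(A), len(B), len(B[0])
    inf = float("inf")
    C = [[inf] * m for _ in range(n)]
    for i in range(n):
        for t in range(k):
            a = A[i][t]
            for j in range(m):
                candidate = a + B[t][j]
                if candidate < C[i][j]:
                    C[i][j] = candidate
    return C
```

If A holds shortest-path distances using at most p edges, then the Min-Plus product of A with itself holds distances using at most 2p edges, which is why this product appears naturally in APSP algorithms.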

Data Structures And Algorithms

All instantiations of the greedy algorithm for the shortest superstring problem are equivalent

In the Shortest Common Superstring problem (SCS), one needs to find the shortest superstring for a set of strings. While SCS is NP-hard and MAX-SNP-hard, the Greedy Algorithm "choose two strings with the largest overlap; merge them; repeat" achieves a constant factor approximation that is known to be at most 3.5 and conjectured to be equal to 2. The Greedy Algorithm is not deterministic, so its instantiations with different tie-breaking rules may have different approximation factors. In this paper, we show that it is not the case: all factors are equal. To prove this, we show how to transform a set of strings so that all overlaps are different whereas their ratios stay roughly the same. We also reveal connections between the original version of SCS and the following one: find a superstring minimizing the number of occurrences of a given symbol. It turns out that the latter problem is equivalent to the original one.
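The Greedy Algorithm quoted above can be sketched directly. The version below fixes one particular tie-breaking rule (the first maximal-overlap pair in sorted order), which is exactly the kind of instantiation the abstract refers to; the function names and the quadratic pair search are choices made for this illustration rather than anything prescribed by the paper.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Repeatedly merge the pair with the largest overlap until one string is left."""
    # Drop duplicates and strings already contained in another input string.
    unique = set(strings)
    items = sorted(s for s in unique if not any(s != t and s in t for t in unique))
    while len(items) > 1:
        best_k, best_i, best_j = -1, None, None
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:  # strict '>' keeps the first maximal pair (tie-break)
                        best_k, best_i, best_j = k, i, j
        merged = items[best_i] + items[best_j][best_k:]
        items = [s for idx, s in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)
    return items[0] if items else ""
```

For the inputs `abc`, `bcd`, and `cde`, the sketch first merges `abc` and `bcd` into `abcd` and then appends `e`, giving `abcde`.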

Data Structures And Algorithms

All-Pairs LCA in DAGs: Breaking through the O(n^{2.5}) barrier

Let G = (V, E) be an n-vertex directed acyclic graph (DAG). A lowest common ancestor (LCA) of two vertices u and v is a common ancestor w of u and v such that no descendant of w has the same property. In this paper, we consider the problem of computing an LCA, if any, for all pairs of vertices in a DAG. The fastest known algorithms for this problem exploit fast matrix multiplication subroutines and have running times ranging from O(n^{2.687}) [Bender et al. SODA'01] down to O(n^{2.615}) [Kowaluk and Lingas ICALP'05] and O(n^{2.569}) [Czumaj et al. TCS'07]. Somewhat surprisingly, all those bounds would still be Ω(n^{2.5}) even if matrix multiplication could be solved optimally (i.e., ω = 2). This appears to be an inherent barrier for all the currently known approaches, which raises the natural question on whether one could break through the O(n^{2.5}) barrier for this problem. In this paper, we answer this question affirmatively: in particular, we present an Õ(n^{2.447}) (Õ(n^{7/3}) for ω = 2) algorithm for finding an LCA for all pairs of vertices in a DAG, which represents the first improvement on the running times for this problem in the last 13 years. A key tool in our approach is a fast algorithm to partition the vertex set of the transitive closure of G into a collection of O(ℓ) chains and O(n/ℓ) antichains, for a given parameter ℓ. As usual, a chain is a path while an antichain is an independent set. We then find, for all pairs of vertices, a candidate LCA among the chain and antichain vertices, separately. The first set is obtained via a reduction to min-max matrix multiplication. The computation of the second set can be reduced to Boolean matrix multiplication similarly to previous results on this problem. We finally combine the two solutions together in a careful (non-obvious) manner.
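To make the LCA definition above concrete, the following brute-force sketch computes one LCA per vertex pair from the transitive closure. It assumes that every vertex counts as its own ancestor (a convention not stated in the abstract), runs in roughly O(n^4) time in the worst case, and is unrelated to the chain/antichain technique of the paper; the function name and edge-list input are assumptions of this example.

```python
def all_pairs_lca(n, edges):
    """Return a dict mapping each pair (u, v) with u <= v to one LCA, or None.
    Vertices are 0..n-1; an edge (a, b) means a is a parent of b."""
    # reach[a][b] is True when a is an ancestor of b (reflexive by assumption).
    reach = [[i == j for j in range(n)] for i in range(n)]
    for a, b in edges:
        reach[a][b] = True
    for k in range(n):  # Warshall-style transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    lca = {}
    for u in range(n):
        for v in range(u, n):
            common = [w for w in range(n) if reach[w][u] and reach[w][v]]
            # A common ancestor is lowest if no other common ancestor lies below it.
            lowest = [w for w in common
                      if not any(x != w and reach[w][x] for x in common)]
            lca[(u, v)] = lowest[0] if lowest else None
    return lca
```

In the DAG with edges 0→1, 0→2, 1→3, 2→3, and 1→4, vertices 3 and 4 have common ancestors 0 and 1, and the sketch reports 1 because it lies below 0.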
