Featured Research

Data Structures And Algorithms

A Deterministic Parallel APSP Algorithm and its Applications

In this paper we show a deterministic parallel all-pairs shortest paths (APSP) algorithm for real-weighted directed graphs. The algorithm has Õ(nm + (n/d)^3) work and Õ(d) depth for any depth parameter d ∈ [1, n]. To the best of our knowledge, such a trade-off has previously been described only for the real-weighted single-source shortest paths problem, using randomization [Bringmann et al., ICALP'17]. Moreover, our result improves upon the parallelism of the state-of-the-art randomized parallel algorithm for computing transitive closure, which has Õ(nm + n^3/d^2) work and Õ(d) depth [Ullman and Yannakakis, SIAM J. Comput. '91]. Our APSP algorithm turns out to be a powerful tool for designing efficient planar graph algorithms in both the parallel and sequential regimes. One notable ingredient of our parallel APSP algorithm is a simple deterministic Õ(nm)-work, Õ(d)-depth procedure for computing Õ(n/d)-size hitting sets of shortest d-hop paths between all pairs of vertices of a real-weighted digraph. Such hitting sets have also been called d-hub sets. Hub sets have previously proved especially useful in designing parallel or dynamic shortest paths algorithms and are typically obtained via random sampling. Our procedure implies, for example, an Õ(nm)-time deterministic algorithm for finding a shortest negative cycle of a real-weighted digraph. Such a near-optimal bound for this problem had so far been achieved only by a randomized algorithm [Orlin et al., Discret. Appl. Math. '18].
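
To illustrate the hub-set notion, here is a minimal sketch of the classical randomized construction that the paper's deterministic procedure replaces: sampling each vertex independently with probability about (c ln n)/d yields an Õ(n/d)-size set hitting every shortest d-hop path with high probability. Function names and the constant c are our own assumptions, not the paper's.

```python
import math, random

def random_hub_set(vertices, d, c=2.0):
    """Classical randomized d-hub-set construction (the paper replaces
    this with a deterministic O~(nm)-work, O~(d)-depth procedure):
    keep each vertex with probability min(1, c*ln(n)/d).  The result
    has expected size O~(n/d) and hits any fixed collection of
    poly(n)-many paths of >= d hops with high probability."""
    n = len(vertices)
    p = min(1.0, c * math.log(max(n, 2)) / d)
    return {v for v in vertices if random.random() < p}

hubs = random_hub_set(list(range(1000)), d=50)
print(len(hubs))  # about (1000/50) * 2*ln(1000), i.e. roughly 276
```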

Read more
Data Structures And Algorithms

A Dynamic Data Structure for Temporal Reachability with Unsorted Contact Insertions

Temporal graphs represent interactions between entities over time. These interactions may be direct (a contact between two nodes at some time instant) or indirect, through sequences of contacts called temporal paths (journeys). Deciding whether an entity can reach another through a journey is useful for various applications in communication networks and epidemiology, among other fields. In this paper, we present a data structure which maintains temporal reachability information under the addition of new contacts (i.e., triplets (u, v, t) indicating that node u and node v interacted at time t). In contrast to previous works, the contacts can be inserted in arbitrary order -- in particular, non-chronologically -- which corresponds to systems where the information is collected a posteriori (e.g., when trying to reconstruct contamination chains among people). The main component of our data structure is a generalization of transitive closure called the timed transitive closure (TTC), which allows us to maintain reachability information relative to all nested time intervals without storing all these intervals, nor the journeys themselves. TTCs are of independent interest and we study a number of their general properties. Let n be the number of nodes and τ be the number of timestamps in the lifetime of the temporal graph. Our data structure answers reachability queries regarding the existence of a journey from a given node to another within a given time interval in time O(log τ); it has an amortized insertion time of O(n^2 log τ); and it can reconstruct a valid journey that witnesses reachability in time O(k log τ), where k < n is the maximum number of edges of such a journey. Finally, the space complexity of our reachability data structure is O(n^2 τ), which remains within the worst-case size of the temporal graph itself.
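
As a rough illustration of the TTC idea (a simplified sketch, not the paper's full data structure), one can store, for each ordered pair of nodes, the Pareto frontier of (departure, arrival) pairs of known journeys and answer interval queries by binary search in O(log τ). The class below is our own simplification and omits the paper's journey composition and reconstruction logic.

```python
import bisect

class PairTTC:
    """Pareto frontier of (departure, arrival) journey times for one
    ordered pair of nodes: (d1, a1) dominates (d2, a2) when d1 >= d2
    and a1 <= a2.  Kept sorted by departure; on the frontier the
    arrival times are then increasing as well."""

    def __init__(self):
        self.deps = []   # departure times, ascending
        self.arrs = []   # matching arrival times, also ascending

    def insert(self, dep, arr):
        i = bisect.bisect_left(self.deps, dep)
        # Dominated by an existing pair with dep' >= dep, arr' <= arr?
        if i < len(self.deps) and self.arrs[i] <= arr:
            return False
        # Drop pairs dominated by (dep, arr): dep' <= dep, arr' >= arr.
        j = i
        while j > 0 and self.arrs[j - 1] >= arr:
            j -= 1
        self.deps[j:i] = [dep]
        self.arrs[j:i] = [arr]
        return True

    def reachable(self, t1, t2):
        """Is there a journey departing >= t1 and arriving <= t2?
        O(log tau) by binary search on the frontier."""
        i = bisect.bisect_left(self.deps, t1)
        return i < len(self.deps) and self.arrs[i] <= t2

ttc = PairTTC()
ttc.insert(1, 5)
ttc.insert(3, 4)            # dominates (1, 5)
print(ttc.reachable(2, 4))  # True: depart at 3, arrive by 4
```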

Read more
Data Structures And Algorithms

A Fast Optimal Double Row Legalization Algorithm

In Placement Legalization, it is often assumed that (almost) all standard cells possess the same height and can therefore be aligned in cell rows, which can then be treated independently. However, this is no longer true for recent technologies, where a substantial number of cells of double- or even arbitrary multiple-row height is to be expected. Due to interdependencies between the cell placements within several rows, the legalization task becomes considerably harder. In this paper, we show how to optimize quadratic cell movement for pairs of adjacent rows comprising cells of single- as well as double-row height, with a fixed left-to-right ordering, in time O(n log n), where n denotes the number of cells involved. In contrast to prior works, we do not artificially bound the maximum cell movement and can guarantee to find an optimal solution. Experimental results show an average decrease of over 26% in total quadratic movement when compared to a legalization approach that fixes cells of more than single-row height after Global Placement.
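
For intuition, below is a hedged sketch of the classical single-row version of this problem (cluster merging in the style of the Abacus legalizer), which minimizes total quadratic movement for a fixed cell ordering in one unbounded row; the paper's contribution is the considerably harder double-row generalization, which this sketch does not attempt.

```python
def legalize_row(targets, widths):
    """Minimize sum_i (x_i - targets[i])^2 over non-overlapping
    placements that keep the given left-to-right order (single
    unbounded row).  Classical cluster merging: a cluster's optimal
    left edge is the mean of (target_i - offset_i) over its cells."""
    # cluster = [first_cell, count, sum(target - offset), width, x]
    clusters = []
    for i, (t, w) in enumerate(zip(targets, widths)):
        c = [i, 1, t, w, t]
        # Merge with the predecessor while they would overlap.
        while clusters and clusters[-1][4] + clusters[-1][3] > c[4]:
            p = clusters.pop()
            # Cells of c sit at extra offset p.width inside the merge.
            p[2] += c[2] - c[1] * p[3]
            p[1] += c[1]
            p[3] += c[3]
            p[4] = p[2] / p[1]
            c = p
        clusters.append(c)
    # Expand clusters back into per-cell coordinates.
    xs = [0.0] * len(targets)
    for first, count, _, _, x in clusters:
        for j in range(first, first + count):
            xs[j] = x
            x += widths[j]
    return xs

print(legalize_row([0.0, 0.0], [1.0, 1.0]))  # [-0.5, 0.5]
```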

Read more
Data Structures And Algorithms

A Fast Randomized Algorithm for Finding the Maximal Common Subsequences

Finding the common subsequences of L strings has many applications in bioinformatics, computational linguistics, and information retrieval. A well-known result states that finding a Longest Common Subsequence (LCS) of L strings is NP-hard, i.e., the computational complexity is exponential in L. In this paper, we develop a randomized algorithm, referred to as Random-MCS, for finding a random instance of a Maximal Common Subsequence (MCS) of multiple strings. A common subsequence is maximal if inserting any character into the subsequence no longer yields a common subsequence. A special case of MCS is LCS, where the length is the longest. We show that the complexity of our algorithm is linear in L, and therefore it is suitable for large L. Furthermore, we study the occurrence probability of a single instance of MCS and demonstrate via both theoretical and experimental studies that the longest subsequence from multiple runs of Random-MCS often yields a solution to LCS.
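
The following is a minimal sketch of the random-extension idea, in our own simplified form: it grows a common subsequence by random right-extension until nothing more can be appended, so the result is right-maximal only, whereas the paper's Random-MCS guarantees full maximality. Each step costs time linear in the number of strings L.

```python
import random

def random_common_subsequence(strings):
    """Grow a common subsequence by random right-extension: repeatedly
    pick a random character that still occurs in every string after
    its current match position, advancing each position to that
    character's earliest next occurrence.  Stops when no character can
    be appended (right-maximal; not necessarily a full MCS)."""
    pos = [0] * len(strings)      # next unmatched index in each string
    result = []
    while True:
        candidates = set(strings[0][pos[0]:])
        for s, p in zip(strings[1:], pos[1:]):
            candidates &= set(s[p:])
        if not candidates:
            return "".join(result)
        ch = random.choice(sorted(candidates))
        result.append(ch)
        pos = [s.index(ch, p) + 1 for s, p in zip(strings, pos)]

# The longest of several runs often recovers an LCS:
runs = [random_common_subsequence(["ACGGTAC", "AGTCCTA", "ATGTAC"])
        for _ in range(20)]
print(max(runs, key=len))
```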

Read more
Data Structures And Algorithms

A Faster Algorithm for Finding Closest Pairs in Hamming Metric

We study the Closest Pair Problem in Hamming metric, which asks to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input that outperforms previous approaches whenever the dimension of the input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known locality-sensitive-hashing-based algorithms. Technically, our algorithm follows similar design principles as Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the aforementioned regimes, we significantly simplify the analysis of these previous works. We give a modular analysis, which allows us to investigate the performance of the algorithm on non-uniform input distributions as well. Furthermore, we give a proof-of-concept implementation of our algorithm which performs well in comparison to a quadratic-search baseline. This is a first step towards answering an open question raised by May and Ozerov regarding the practicability of algorithms following these design principles.
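
For reference, the quadratic-search baseline mentioned above can be written in a few lines. This is a sketch with our own encoding of binary vectors as Python integers, so the Hamming distance is the popcount of an XOR.

```python
import random

def closest_pair_hamming(vectors):
    """Quadratic-search baseline for the Closest Pair Problem in
    Hamming metric: vectors are ints encoding binary strings, and the
    Hamming distance is the number of set bits in the XOR.  Checks
    all O(n^2) pairs."""
    best = (None, None, float("inf"))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            d = bin(vectors[i] ^ vectors[j]).count("1")
            if d < best[2]:
                best = (i, j, d)
    return best

vecs = [random.getrandbits(64) for _ in range(1000)]
print(closest_pair_hamming(vecs))  # (i, j, distance)
```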

Read more
Data Structures And Algorithms

A Faster Exponential Time Algorithm for Bin Packing With a Constant Number of Bins via Additive Combinatorics

In the Bin Packing problem one is given n items with weights w_1, …, w_n and m bins with capacities c_1, …, c_m. The goal is to find a partition of the items into sets S_1, …, S_m such that w(S_j) ≤ c_j for every bin j, where w(X) denotes ∑_{i∈X} w_i. Björklund, Husfeldt and Koivisto (SICOMP 2009) presented an O*(2^n)-time algorithm for Bin Packing. In this paper, we show that for every m ∈ ℕ there exists a constant σ_m > 0 such that an instance of Bin Packing with m bins can be solved in O(2^((1−σ_m)n)) randomized time. Before our work, such improved algorithms were not known even for m = 4. A key step in our approach is the following new result in Littlewood-Offord theory on the additive combinatorics of subset sums: for every δ > 0 there exists an ε > 0 such that if |{X ⊆ {1,…,n} : w(X) = v}| ≥ 2^((1−ε)n) for some v, then |{w(X) : X ⊆ {1,…,n}}| ≤ 2^(δn).
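
As a point of comparison, a plain exponential-time subset DP for Bin Packing looks as follows. This is a brute-force sketch, roughly O(3^n) over all submask enumerations, and is far from both the O*(2^n) bound of Björklund-Husfeldt-Koivisto and the paper's improvement.

```python
from functools import lru_cache

def bin_packing(weights, capacities):
    """Can the n items be partitioned into bins respecting the given
    capacities?  Brute-force subset DP over bitmasks of items."""
    n = len(weights)
    # wsum[mask] = total weight of the items in `mask`.
    wsum = [0] * (1 << n)
    for mask in range(1, 1 << n):
        low = mask & -mask
        wsum[mask] = wsum[mask ^ low] + weights[low.bit_length() - 1]

    @lru_cache(maxsize=None)
    def pack(mask, j):          # pack item set `mask` into bins j..m-1
        if mask == 0:
            return True
        if j == len(capacities):
            return False
        sub = mask
        while True:             # enumerate submasks of `mask` for bin j
            if wsum[sub] <= capacities[j] and pack(mask ^ sub, j + 1):
                return True
            if sub == 0:
                break
            sub = (sub - 1) & mask
        return False

    return pack((1 << n) - 1, 0)

print(bin_packing([4, 3, 2, 2], [5, 6]))  # True: {3,2} and {4,2}
```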

Read more
Data Structures And Algorithms

A Faster Interior Point Method for Semidefinite Programming

Semidefinite programs (SDPs) are a fundamental class of optimization problems with important recent applications in approximation algorithms, quantum complexity, robust learning, algorithmic rounding, and adversarial deep learning. This paper presents a faster interior point method to solve generic SDPs with variable size n×n and m constraints in time Õ(√n (mn^2 + m^ω + n^ω) log(1/ε)), where ω is the exponent of matrix multiplication and ε is the relative accuracy. In the predominant case of m ≥ n, our runtime outperforms that of the previous fastest SDP solver, which is based on the cutting plane method of Jiang, Lee, Song, and Wong [JLSW20]. Our algorithm's runtime can be naturally interpreted as follows: Õ(√n log(1/ε)) is the number of iterations needed for our interior point method, mn^2 is the input size, and m^ω + n^ω is the time to invert the Hessian and slack matrix in each iteration. These constitute natural barriers to further improving the runtime of interior point methods for solving generic SDPs.
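
A back-of-the-envelope evaluation of the stated bound shows which term dominates for given n and m. This is illustrative arithmetic only, ignoring polylog factors; the value of ω used is the commonly cited upper bound on the matrix multiplication exponent.

```python
def sdp_ipm_cost(n, m, omega=2.373):
    """Evaluate the dominant factors of the stated runtime
    O~(sqrt(n) * (m*n^2 + m^omega + n^omega) * log(1/eps)),
    dropping the polylog and log(1/eps) factors."""
    iters = n ** 0.5                       # interior point iterations
    per_iter = m * n**2 + m**omega + n**omega
    return iters * per_iter

# With m >= n, the m*n^2 (input size) and m^omega terms dominate:
for n, m in [(100, 100), (100, 10_000)]:
    print(n, m, f"{sdp_ipm_cost(n, m):.3e}")
```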

Read more
Data Structures And Algorithms

A Framework for Consistency Algorithms

We present a framework that provides deterministic consistency algorithms for given memory models. Such an algorithm checks whether the executions of a shared-memory concurrent program are consistent under the axioms defined by a model. For memory models like SC and TSO, checking consistency is NP-complete. Our framework shows that, despite this hardness, fast deterministic consistency algorithms can be obtained by employing tools from fine-grained complexity. The framework is based on a universal consistency problem which can be instantiated by different memory models. We construct an algorithm for the problem running in time O*(2^k), where k is the number of write accesses in the execution that is checked for consistency. Each instance of the framework then admits an O*(2^k)-time consistency algorithm. By applying the framework, we obtain corresponding consistency algorithms for SC, TSO, PSO, and RMO. Moreover, we show that the obtained algorithms for SC, TSO, and PSO are optimal in the fine-grained sense: there is no consistency algorithm for these models running in time 2^(o(k)) unless the exponential time hypothesis fails.
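
For contrast with the paper's O*(2^k) algorithms, a naive SC-consistency check explores interleavings exhaustively. This is a hedged sketch with our own operation encoding, and it is exponential in the whole execution rather than only in the number of writes.

```python
from functools import lru_cache

def sc_consistent(threads):
    """Naive sequential-consistency check: is there a total order
    respecting each thread's program order in which every read returns
    the most recent write?  Brute-force interleaving search, not the
    paper's fine-grained algorithm.  Ops are ('w', var, val) or
    ('r', var, val); every variable starts at None."""
    variables = {op[1] for t in threads for op in t}
    init = tuple(sorted((v, None) for v in variables))

    @lru_cache(maxsize=None)
    def search(positions, mem):
        if all(p == len(t) for p, t in zip(positions, threads)):
            return True
        memory = dict(mem)
        for i, t in enumerate(threads):
            p = positions[i]
            if p == len(t):
                continue
            kind, var, val = t[p]
            nxt = positions[:i] + (p + 1,) + positions[i + 1:]
            if kind == "r":
                if memory[var] == val and search(nxt, mem):
                    return True
            else:
                new_mem = tuple(sorted({**memory, var: val}.items()))
                if search(nxt, new_mem):
                    return True
        return False

    return search(tuple(0 for _ in threads), init)

# Classic store-buffering outcome: allowed under TSO, not under SC.
t0 = [("w", "x", 1), ("r", "y", None)]
t1 = [("w", "y", 1), ("r", "x", None)]
print(sc_consistent([t0, t1]))  # False
```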

Read more
Data Structures And Algorithms

A High-dimensional Sparse Fourier Transform in the Continuous Setting

In this paper, we propose a new hashing scheme to establish a sparse Fourier transform in high-dimensional space. Our analysis of the algorithm's complexity shows that this sparse Fourier transform can overcome the curse of dimensionality. To the best of our knowledge, this is the first polynomial-time algorithm to recover high-dimensional continuous frequencies.
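
The hashing idea behind sparse Fourier transforms can be illustrated in one discrete dimension: subsampling a signal in time aliases frequency f into bin f mod B, so a sparse spectrum is hashed into a few buckets. This is a toy demo only; the paper's scheme handles continuous, high-dimensional frequencies.

```python
import numpy as np

# Subsampling by a factor n // B aliases frequency f into bin f mod B,
# so a k-sparse spectrum lands in at most k of the B bins.
n, B = 1024, 16
freqs = [37, 300, 701]                    # k = 3 active frequencies
t = np.arange(n)
x = sum(np.exp(2j * np.pi * f * t / n) for f in freqs)

y = x[:: n // B]                          # subsample: length-B signal
bins = np.fft.fft(y) / B
print(np.nonzero(np.abs(bins) > 0.5)[0])  # [5 12 13] = freqs mod 16
```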

Read more
Data Structures And Algorithms

A Lightweight Algorithm to Uncover Deep Relationships in Data Tables

Much of the data we collect today is in tabular form, with rows as records and columns as attributes associated with each record. Understanding the structural relationships in tabular data can greatly facilitate the data science process. Traditionally, much of this relational information is stored in the table schema and maintained by its creators, usually domain experts. In this paper, we develop automated methods to uncover deep relationships in a single data table without expert or domain knowledge. Our method can decompose a data table into layers of smaller tables, revealing its deep structure. The key to our approach is a computationally lightweight forward-addition algorithm that we developed to recursively extract functional dependencies between table columns and that scales to tables with many columns. With our solution, data scientists are provided with automatically generated, data-driven insights when exploring new data sets.
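
To make the notion concrete, here is a hedged sketch of checking a functional dependency between columns and greedily growing a determinant set one column at a time. It is illustrative only; the paper's forward-addition algorithm is more refined, and the function names and example data are our own.

```python
def functional_dependency(rows, lhs, rhs):
    """Does the column set `lhs` functionally determine column `rhs`?
    True iff every distinct lhs-value maps to a single rhs-value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def forward_addition(rows, columns, rhs):
    """Greedy sketch of a forward-addition search for a determinant
    set of `rhs`: add one column at a time until the dependency
    holds, returning the accumulated column set."""
    lhs = []
    for c in columns:
        if c == rhs:
            continue
        lhs.append(c)
        if functional_dependency(rows, lhs, rhs):
            return lhs
    return None

rows = [{"city": "Nice", "zip": "06000", "country": "FR"},
        {"city": "Nice", "zip": "06000", "country": "FR"},
        {"city": "Lyon", "zip": "69001", "country": "FR"}]
print(forward_addition(rows, ["city", "zip", "country"], "zip"))  # ['city']
```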

Read more
