Featured Research

Data Structures And Algorithms

A performance study of some approximation algorithms for minimum dominating set in a graph

We implement and test the performance of several approximation algorithms for computing the minimum dominating set of a graph: the standard greedy algorithm, recent LP-rounding algorithms, and a hybrid algorithm that we design by combining the greedy and LP-rounding approaches. All algorithms perform better than their theoretical analysis anticipates and have small performance ratios, measured as the size of the output divided by the LP objective lower bound. However, each may have advantages over the others. For instance, the LP-rounding algorithm normally outperforms the others on sparse real-world graphs. On a graph with 400,000+ vertices, LP rounding took less than 15 seconds of CPU time to generate a solution with performance ratio 1.011, while the greedy and hybrid algorithms generated solutions of performance ratio 1.12 in similar time. For synthetic graphs, the hybrid algorithm normally outperforms the others, whereas for hypercubes and k-Queens graphs, greedy outperforms the rest. Another advantage of the hybrid algorithm is its ability to solve very large problems on which LP solvers crash, as demonstrated on a real-world graph with 7.7 million+ vertices.
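The standard greedy algorithm benchmarked above has a particularly compact form: repeatedly add the vertex that dominates the most not-yet-dominated vertices. A minimal Python sketch of the textbook algorithm (an illustration, not the authors' tested implementation):

```python
def greedy_dominating_set(adj):
    """Greedy approximation for minimum dominating set.

    adj: dict mapping each vertex to the set of its neighbors.
    Repeatedly picks the vertex that covers the most
    not-yet-dominated vertices (itself plus its neighbors).
    """
    undominated = set(adj)
    dom_set = []
    while undominated:
        # Vertex covering the most currently undominated vertices.
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        dom_set.append(best)
        undominated -= {best} | adj[best]
    return dom_set

# Example: a star graph -- the center alone dominates everything.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(greedy_dominating_set(star))  # → [0]
```

The greedy choice yields the classical logarithmic approximation guarantee; the performance ratios reported above show it typically does far better in practice.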

Read more
Data Structures And Algorithms

A polynomial time 12-approximation algorithm for the restricted Santa Claus problem

In this paper, we consider the restricted case of the Santa Claus problem and improve the current best approximation ratio by presenting a polynomial time 12-approximation algorithm that uses linear programming and semi-definite programming. Our algorithm starts by solving the configuration LP and uses the optimum value to obtain a 12-gap instance. This is followed by the well-known clustering technique of Bansal and Sviridenko. We then apply the analysis of Asadpour et al. to show that the clustered instance has an integer solution whose value is at least 1/6 times the best possible value, computed by solving the configuration LP. To find this solution, we formulate a problem we call the Extended Assignment Problem and express it as an LP. We then show that the associated polytope is integral and yields a fractional solution of value at least 1/6 times the optimum. From this solution we derive a solution to a new quadratic program, introduced to select one machine from each cluster, and show that the resulting instance has an Assignment LP fractional solution of value at least 1/6 times the optimum. Finally, we use the well-known rounding technique of Bezakova and Dani on the 12-gap instance to obtain our 12-approximate solution.

Read more
Data Structures And Algorithms

About Weighted Random Sampling in Preferential Attachment Models

The Barabási-Albert (BA) model is a popular scheme for creating scale-free graphs, but its definition has previously been shown to contain ambiguities. In this paper we discuss a new ambiguity in the definition of the BA model by identifying the tight relation between the preferential attachment process and unequal-probability random sampling. While the probability that each individual vertex is selected is set to be proportional to its degree, the model does not specify the joint probability that any tuple of m vertices is selected together for m > 1. We demonstrate the consequences using analytical, experimental, and empirical analyses and propose a concise definition of the model that addresses this ambiguity. Using the connection with unequal-probability random sampling, we also highlight a confusion about the process by which nodes are selected at each time step, on which -- despite being implicitly indicated in the original paper -- the current literature appears fragmented.
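The ambiguity can be made concrete in a few lines: for m > 1, "select each target with probability proportional to its degree" admits more than one joint distribution over the m targets. A hypothetical sketch (the function name and interface are illustrative, not from the paper):

```python
import random

def ba_step(degrees, m, distinct=True):
    """One preferential-attachment step: choose m target vertices
    with probability proportional to degree.

    The per-vertex selection probabilities are degree-proportional
    either way, but the *joint* distribution of the m targets differs:
    - distinct=False: m independent draws (targets may repeat),
    - distinct=True:  sequential draws without replacement.
    The BA model leaves this choice unspecified for m > 1.
    """
    targets = []
    pool = dict(degrees)  # vertex -> degree
    for _ in range(m):
        verts = list(pool)
        weights = [pool[v] for v in verts]
        v = random.choices(verts, weights=weights)[0]
        targets.append(v)
        if distinct:
            del pool[v]  # forbid repeats on later draws
    return targets
```

Both variants are valid readings of "proportional to degree", yet they induce different joint probabilities for tuples of targets, which is exactly the gap the abstract identifies.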

Read more
Data Structures And Algorithms

Accelerating Force-Directed Graph Drawing with RT Cores

Graph drawing with spring embedders employs an O(|V|²) all-pairs computation phase over the graph's vertex set to compute repulsive forces. Here, the efficacy of forces diminishes with distance: a vertex can effectively influence only those vertices within a certain radius of its position. The algorithm therefore lends itself to an implementation using spatial search data structures to reduce the runtime complexity. NVIDIA RT cores implement hierarchical tree traversal in hardware. We show how to map the problem of finding graph layouts with force-directed methods to a ray tracing problem that can subsequently be solved with dedicated ray tracing hardware. With that, we observe speedups of 4x to 13x over a CUDA software implementation.
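The cutoff-radius observation is what makes spatial acceleration possible in the first place. A CPU-side sketch using a uniform grid (a software stand-in for the hardware tree traversal that RT cores provide; the names and the inverse-square force law are illustrative assumptions, not the paper's method):

```python
import math
from collections import defaultdict

def repulsive_forces(pos, radius, strength=1.0):
    """Repulsive forces with a cutoff radius via a uniform grid.

    Instead of the all-pairs O(|V|^2) loop, each vertex only checks
    the 3x3 block of grid cells around it, since forces beyond
    `radius` are treated as zero.
    """
    grid = defaultdict(list)
    for i, (x, y) in enumerate(pos):
        grid[(int(x // radius), int(y // radius))].append(i)

    forces = [(0.0, 0.0) for _ in pos]
    for i, (x, y) in enumerate(pos):
        cx, cy = int(x // radius), int(y // radius)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid[(cx + dx, cy + dy)]:
                    if j == i:
                        continue
                    rx, ry = x - pos[j][0], y - pos[j][1]
                    d = math.hypot(rx, ry)
                    if 0 < d < radius:
                        f = strength / (d * d)  # inverse-square repulsion
                        fx, fy = forces[i]
                        forces[i] = (fx + f * rx / d, fy + f * ry / d)
    return forces
```

The paper's contribution is to replace this kind of software neighbor search with ray tracing hardware, phrasing "which vertices lie within the radius" as a ray/primitive intersection query.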

Read more
Data Structures And Algorithms

Access-Adaptive Priority Search Tree

In this paper we show that the priority search tree of McCreight, originally developed to answer a class of spatial search queries on 2-dimensional points, can be adapted to the problem of dynamically maintaining a set of keys so that the query complexity adapts to the distribution of queried keys. Presently, the best-known example of such a data structure is the splay tree, which reconfigures itself during each query so that frequently accessed keys move toward the top of the tree and can thus be retrieved with fewer comparisons than keys lower in the tree. However, while the splay tree is conjectured to offer optimal amortized adaptive query complexity, individual queries may require O(n) time. We show that an access-adaptive priority search tree (AAPST) can provide competitive adaptive query performance while guaranteeing O(log n) worst-case query time, potentially making it more suitable for certain interactive (e.g., online and real-time) applications in which the response time must be bounded.

Read more
Data Structures And Algorithms

Achieving anonymity via weak lower bound constraints for k-median and k-means

We study k-clustering problems with lower bounds, including k-median and k-means clustering with lower bounds. In addition to the point set P and the number of centers k, a k-clustering problem with (uniform) lower bounds takes a number B; the solution space is restricted to clusterings in which every cluster has at least B points. We demonstrate how to approximate k-median with lower bounds via a reduction to facility location with lower bounds, for which O(1)-approximation algorithms are known. We then propose a new constrained clustering problem with lower bounds in which points may be assigned multiple times (to different centers): for every point, the clustering specifies a set of centers to which it is assigned. We call this clustering with weak lower bounds. We give an 8-approximation for k-median clustering with weak lower bounds and an O(1)-approximation for k-means with weak lower bounds. We conclude by showing that, at a constant increase in the approximation factor, we can restrict the number of assignments of every point to 2 (or, if we allow fractional assignments, to 1+ϵ). This also leads to the first bicriteria approximation algorithm for k-means with (standard) lower bounds, where bicriteria is interpreted in the sense that the lower bounds are violated by a constant factor. All algorithms in this paper run in time polynomial in n and k (and d for the Euclidean variants considered).

Read more
Data Structures And Algorithms

Adapting k-means algorithms for outliers

This paper shows how to adapt several simple and classical sampling-based algorithms for the k-means problem to the setting with outliers. Recently, Bhaskara et al. (NeurIPS 2019) showed how to adapt the classical k-means++ algorithm to the setting with outliers. However, their algorithm needs to output O(log(k)·z) outliers, where z is the number of true outliers, to match the O(log k)-approximation guarantee of k-means++. In this paper, we build on their ideas and show how to adapt several sequential and distributed k-means algorithms to the setting with outliers, but with substantially stronger theoretical guarantees: our algorithms output (1+ε)z outliers while achieving an O(1/ε)-approximation to the objective function. In the sequential setting, we achieve this by adapting a recent algorithm of Lattanzi and Sohler (ICML 2019). In the distributed setting, we adapt a simple algorithm of Guha et al. (IEEE Trans. Knowl. Data Eng. 2003) and the popular k-means‖ of Bahmani et al. (PVLDB 2012). A theoretical application of our techniques is an algorithm with running time Õ(nk²/z) that achieves an O(1)-approximation to the objective function while outputting O(z) outliers, assuming k ≪ z ≪ n. This is complemented by a matching lower bound of Ω(nk²/z) for this problem in the oracle model.
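All of the adapted algorithms build on the D² sampling step at the heart of k-means++: each new center is drawn with probability proportional to the squared distance to the nearest center chosen so far. A minimal sketch of that core step in 2D, without the outlier-handling modifications the paper adds:

```python
import random

def kmeans_pp_seeding(points, k, rng=random):
    """D^2 seeding from k-means++ (2D points as (x, y) tuples).

    The first center is uniform; each subsequent center is sampled
    with probability proportional to the squared distance to the
    nearest center picked so far. Outlier-robust variants modify
    exactly this sampling step.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2
                  for cx, cy in centers) for px, py in points]
        centers.append(rng.choices(points, weights=d2)[0])
    return centers
```

Points far from every existing center carry large weight, which is precisely why plain D² sampling is fooled by outliers and needs the adaptations discussed above.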

Read more
Data Structures And Algorithms

Adaptive Exact Learning in a Mixed-Up World: Dealing with Periodicity, Errors and Jumbled-Index Queries in String Reconstruction

We study the query complexity of exactly reconstructing a string from adaptive queries, such as substring, subsequence, and jumbled-index queries. Such problems have applications, e.g., in computational biology. We provide a number of new and improved bounds for exact string reconstruction in settings where either the string or the queries are "mixed-up". For example, we show that a periodic (i.e., "mixed-up") string S = p^k p′ of smallest period p, where |p′| < |p|, can be reconstructed using O(σ|p| + lg n) substring queries, where σ is the alphabet size, even if n = |S| is unknown. We also show that we can reconstruct S after it has been corrupted by a small number d of errors, measured by Hamming distance; in this case, we give an algorithm that uses O(dσ|p| + d|p| lg(n/(d+1))) queries. In addition, we show that a periodic string can be reconstructed using 2σ⌈lg n⌉ + 2|p|⌈lg σ⌉ subsequence queries, and that general strings can be reconstructed using 2σ⌈lg n⌉ + n⌈lg σ⌉ subsequence queries, without knowledge of n in advance. This latter result improves the previous best, decades-old result by Skiena and Sundaram. Finally, we believe we are the first to study the exact-learning query complexity of string reconstruction using jumbled-index queries, a "mixed-up" type of query that has received much attention of late.
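To make the substring-query model concrete, here is a naive reconstruction baseline that uses O(σn) queries, far from the bounds above but illustrative of how adaptive queries pin down the hidden string (a hypothetical sketch, not the paper's algorithm):

```python
def reconstruct(is_substring, alphabet):
    """Reconstruct an unknown string using only substring queries:
    is_substring(t) answers whether t occurs in the hidden string.

    Grow a known substring to the right until no extension occurs,
    then to the left. A right-maximal substring must be a suffix of
    the hidden string, and a left-maximal suffix must be the whole
    string, so the result is exact.
    """
    s = ""
    # Extend to the right as long as some single-letter extension works.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if is_substring(s + c):
                s += c
                extended = True
                break
    # Then extend to the left.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if is_substring(c + s):
                s = c + s
                extended = True
                break
    return s
```

The results above show that structure (periodicity) and stronger query types cut this trivially linear query count down dramatically.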

Read more
Data Structures And Algorithms

Adaptive Sampling for Fast Constrained Maximization of Submodular Functions

Several large-scale machine learning tasks, such as data summarization, can be approached by maximizing functions that satisfy submodularity. These optimization problems often involve complex side constraints imposed by the underlying application. In this paper, we develop an algorithm with poly-logarithmic adaptivity for non-monotone submodular maximization under general side constraints. The adaptive complexity of a problem is the minimum number of sequential rounds required to achieve the objective. Our algorithm maximizes a non-monotone submodular function under a p-system side constraint, and it achieves a (p + O(√p))-approximation for this problem after only poly-logarithmic adaptive rounds and polynomially many queries to the valuation oracle. Furthermore, our algorithm achieves a (p + O(1))-approximation when the given side constraint is a p-extendible system. This yields an exponential speed-up, with respect to adaptivity, over any other known constant-factor approximation algorithm for this problem, while remaining competitive with previously known results in terms of query complexity. In experiments on various real-world applications, we find that our algorithm outperforms commonly used heuristics on these instances.
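For contrast with the poly-logarithmic adaptivity claimed above, the classical sequential greedy (shown here for a cardinality constraint, the simplest p-system) needs k adaptive rounds, since each pick depends on all previous ones. A minimal sketch:

```python
def greedy_submodular(f, ground, k):
    """Sequential greedy for monotone submodular maximization under a
    cardinality constraint (a p-system with p = 1). Each of the k
    iterations must observe the previous picks before querying f,
    so the method needs k adaptive rounds -- the sequential
    bottleneck that low-adaptivity algorithms sidestep."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground - S}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break  # no element improves the objective
        S.add(best)
    return S
```

Adaptive sampling methods instead evaluate many candidate elements in parallel within each round, shrinking the number of sequential rounds from k to poly-logarithmic.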

Read more
Data Structures And Algorithms

Additive Approximation Schemes for Load Balancing Problems

In this paper we introduce the concept of additive approximation schemes and apply it to load balancing problems. Additive approximation schemes aim to find a solution with an absolute error in the objective of at most ϵh for some suitable parameter h. When h provides a lower bound on the optimum, an additive approximation scheme implies a standard multiplicative approximation scheme, and it can be much stronger when h ≪ OPT. On the other hand, when no PTAS exists (or is unlikely to exist), additive approximation schemes provide a different notion of approximation. We consider the problem of assigning jobs to identical machines with lower and upper bounds on the loads of the machines. This setting generalizes problems such as makespan minimization, the Santa Claus problem (on identical machines), and the envy-minimizing Santa Claus problem. For the last problem, in which the objective is to minimize the difference between the maximum and minimum load, the optimal objective value may be zero, so it is NP-hard to obtain any multiplicative approximation guarantee. For this class of problems we present additive approximation schemes for h = p_max, the maximum processing time of the jobs. Our technical contribution is two-fold. First, we introduce a new relaxation based on integrally assigning slots to machines and fractionally assigning jobs to the slots (the slot-MILP). We identify structural properties of (near-)optimal solutions of the slot-MILP that allow us to solve it efficiently, assuming there are O(1) different lower and upper bounds on the machine loads (the relevant setting for the three problems mentioned above). The second technical contribution is a local-search-based algorithm that rounds a solution to the slot-MILP, introducing an additive error of at most ϵ·p_max on the target load intervals.
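A classical warm-up for the additive viewpoint with h = p_max is Graham's list scheduling, whose makespan is at most OPT + p_max, i.e. additive error p_max, which schemes of the kind described above sharpen to ϵ·p_max. A minimal sketch (illustrative, not the paper's algorithm):

```python
import heapq

def list_schedule(jobs, m):
    """Greedy list scheduling on m identical machines: each job goes
    to the currently least-loaded machine. The resulting makespan is
    at most the average load plus p_max, hence at most OPT + p_max."""
    loads = [(0, i) for i in range(m)]
    heapq.heapify(loads)  # min-heap of (load, machine index)
    assignment = [[] for _ in range(m)]
    for p in jobs:
        load, i = heapq.heappop(loads)
        assignment[i].append(p)
        heapq.heappush(loads, (load + p, i))
    makespan = max(sum(a) for a in assignment)
    return assignment, makespan
```

For example, jobs [3, 3, 2, 2, 2] on 2 machines give makespan 7, within p_max = 3 of the optimum 6, matching the additive guarantee.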

Read more
