Computer Science Data Structures And Algorithms - Researchain

Featured Researches

Edge Deletion to Restrict the Size of an Epidemic

Given a graph G=(V,E) and a set ℱ of forbidden subgraphs, we study ℱ-Free Edge Deletion, where the goal is to remove the minimum number of edges such that the resulting graph does not contain any F ∈ ℱ as a subgraph. For the parameter treewidth, the question of whether the problem is FPT has remained open. Here we give a negative answer by showing that the problem is W[1]-hard when parameterized by the treewidth, which rules out FPT algorithms under common assumptions. Thus we resolve the conjecture posed by Jessica Enright and Kitty Meeks in [Algorithmica 80 (2018) 1857-1889]. We also prove that the ℱ-Free Edge Deletion problem is W[2]-hard when parameterized by the solution size k, the feedback vertex set number or the pathwidth of the input graph. A special case of particular interest is the situation in which ℱ is the set T_{h+1} of all trees on h+1 vertices, so that we delete edges in order to obtain a graph in which every component contains at most h vertices. This is desirable from the point of view of restricting the spread of disease in a transmission network. We prove that the T_{h+1}-Free Edge Deletion problem is fixed-parameter tractable (FPT) when parameterized by the vertex cover number. We also prove that it admits a kernel with O(hk) vertices and O(h^2 k) edges when parameterized by the combined parameters h and the solution size k.
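
To make the special case concrete, the following minimal sketch checks whether a candidate set of deleted edges leaves every connected component with at most h vertices. It only verifies a given solution; it does not search for a minimum one and is not the algorithm from the paper. The function names and the edge-list representation are assumptions made for illustration.

```python
from collections import defaultdict

def largest_component_after_deletion(vertices, edges, deleted):
    """Size of the largest connected component once `deleted` edges are removed."""
    removed = {frozenset(e) for e in deleted}
    adj = defaultdict(set)
    for u, v in edges:
        if frozenset((u, v)) not in removed:
            adj[u].add(v)
            adj[v].add(u)
    seen, best = set(), 0
    for start in vertices:
        if start in seen:
            continue
        # Depth-first search over the component containing `start`.
        seen.add(start)
        stack, size = [start], 0
        while stack:
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        best = max(best, size)
    return best

def is_feasible(vertices, edges, deleted, h):
    """True if every component has at most h vertices after the deletion."""
    return largest_component_after_deletion(vertices, edges, deleted) <= h
```

For example, on the path 1-2-3-4 with h = 2, deleting the middle edge (2, 3) is feasible, while deleting nothing is not.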

Data Structures And Algorithms

Efficient Algorithms to Mine Maximal Span-Trusses From Temporal Graphs

Over the last decade, there has been an increasing interest in temporal graphs, driven by the growing availability of temporally-annotated network data coming from social, biological and financial networks. Despite the importance of analyzing complex temporal networks, there is a huge gap between the set of definitions, algorithms and tools available to study large static graphs and the ones available for temporal graphs. An important task in temporal graph analysis is mining dense structures, i.e., identifying high-density subgraphs together with the span in which this high density is observed. In this paper, we introduce the concept of (k,Δ)-truss (span-truss) in temporal graphs, a temporal generalization of the k-truss, in which k captures the information about the density and Δ captures the time span in which this density holds. We then propose novel and efficient algorithms to identify maximal span-trusses, namely the ones not dominated by any other span-truss either in the order k or in the interval Δ, and evaluate them on a number of publicly available datasets.
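
As background for the static notion being generalized, here is a minimal sketch of k-truss computation by edge peeling. It assumes the common definition in which every edge of the k-truss lies in at least k-2 triangles of the k-truss (some authors shift this threshold), and it ignores the time span Δ entirely, so it is not the span-truss algorithm of the paper. The function name and input format are illustrative, and vertices are assumed to be mutually comparable.

```python
def k_truss(edges, k):
    """Edges of the maximal subgraph in which every edge lies in >= k-2 triangles."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v are exactly the triangles on edge (u, v).
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {(u, v) for u in adj for v in adj[u] if u < v}
```

This repeated scan is simple rather than fast; practical implementations maintain per-edge triangle counts and a queue of edges to peel.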

Data Structures And Algorithms

Efficient Approximation Schemes for Stochastic Probing and Prophet Problems

Our main contribution is a general framework to design efficient polynomial time approximation schemes (EPTAS) for fundamental classes of stochastic combinatorial optimization problems. Given an error parameter ϵ>0, such algorithmic schemes attain a (1+ϵ)-approximation in only t(ϵ)⋅poly(n) time, where t(⋅) is some function that depends only on ϵ. Technically speaking, our approach relies on presenting tailor-made reductions to a newly-introduced multi-dimensional extension of the Santa Claus problem [Bansal-Sviridenko, STOC'06]. Even though the single-dimensional problem is already known to be APX-Hard, we prove that an EPTAS can be designed under certain structural assumptions, which hold for our applications. To demonstrate the versatility of our framework, we obtain an EPTAS for the adaptive ProbeMax problem as well as for its non-adaptive counterpart; in both cases, state-of-the-art approximability results have been inefficient polynomial time approximation schemes (PTAS) [Chen et al., NIPS'16; Fu et al., ICALP'18]. Turning our attention to selection-stopping settings, we further derive an EPTAS for the Free-Order Prophets problem [Agrawal et al., EC'20] and for its cost-driven generalization, Pandora's Box with Commitment [Fu et al., ICALP'18]. These results improve on known PTASes for their adaptive variants, and constitute the first non-trivial approximations in the non-adaptive setting.

Data Structures And Algorithms

Efficient Constant-Factor Approximate Enumeration of Minimal Subsets for Monotone Properties with Weight Constraints

A property Π on a finite set U is monotone if for every X⊆U satisfying Π, every superset Y⊆U of X also satisfies Π. Many combinatorial properties can be seen as monotone properties. The problem of finding a minimum subset of U satisfying Π is a central problem in combinatorial optimization. Although many approximate/exact algorithms have been developed to solve this kind of problem on numerous properties, a solution obtained by these algorithms is often unsuitable for real-world applications due to the difficulty of building accurate mathematical models of real-world problems. A promising approach to overcome this difficulty is to enumerate multiple small solutions rather than to find a single small solution. To this end, given a weight function w: U→ℕ and an integer k, we devise algorithms that approximately enumerate all minimal subsets of U with weight at most k satisfying Π for various monotone properties Π, where "approximate enumeration" means that the algorithms output all minimal subsets satisfying Π whose weight is at most k and may output some minimal subsets satisfying Π whose weight exceeds k but is at most ck for some constant c≥1. These algorithms allow us to efficiently enumerate minimal vertex covers, minimal dominating sets in bounded degree graphs, minimal feedback vertex sets, minimal hitting sets in bounded rank hypergraphs, etc., of weight at most k with constant approximation factors.
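
The definitions above can be illustrated with a brute-force enumerator. This is an exponential-time sketch for tiny instances that performs exact (not approximate) enumeration, so it is not one of the algorithms from the paper. The names `minimal_sets` and `is_vertex_cover` and the dictionary-based weight function are assumptions for illustration.

```python
from itertools import combinations

def is_vertex_cover(subset, edges):
    """Vertex cover is a monotone property: supersets of covers are covers."""
    return all(u in subset or v in subset for u, v in edges)

def minimal_sets(universe, satisfies, weight, k):
    """All inclusion-minimal subsets satisfying a monotone property with weight <= k."""
    found = []
    for r in range(len(universe) + 1):
        for combo in combinations(universe, r):
            s = set(combo)
            if sum(weight[x] for x in s) > k or not satisfies(s):
                continue
            # For a monotone property, s is minimal exactly when
            # removing any single element breaks the property.
            if all(not satisfies(s - {x}) for x in s):
                found.append(frozenset(s))
    return found
```

On the path a-b-c with unit weights, the minimal vertex covers are {b} and {a, c}; with k = 1 only {b} is reported.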

Data Structures And Algorithms

Efficient Exact Algorithms for Maximum Balanced Biclique Search in Bipartite Graphs

Given a bipartite graph, the maximum balanced biclique (MBB) problem, which asks for two mutually connected, equal-sized disjoint vertex sets of maximum cardinality, plays a significant role in mining the bipartite graph and has numerous applications. Despite the NP-hardness of the MBB problem, in this paper, we show that an exact MBB can be discovered extremely fast in bipartite graphs for real applications. We propose two exact algorithms dedicated to dense and sparse bipartite graphs respectively. For dense bipartite graphs, an O*(1.3803^n) algorithm is proposed. This algorithm in fact can find an MBB in near polynomial time for dense bipartite graphs that are common for applications such as VLSI design. This is because, using our proposed novel techniques, the search can quickly converge to sufficiently dense bipartite graphs which we prove to be polynomially solvable. For large sparse bipartite graphs typical for applications such as biological data analysis, an O*(1.3803^δ̈) algorithm is proposed, where δ̈ is only a few hundred for large sparse bipartite graphs with millions of vertices. The indispensable optimizations that lead to this time complexity are: we transform a large sparse bipartite graph into a limited number of dense subgraphs with size up to δ̈ and then apply our proposed algorithm for dense bipartite graphs on each of the subgraphs. To further speed up this algorithm, tighter upper bounds, faster heuristics and effective reductions are proposed, allowing an MBB to be discovered within a few seconds for bipartite graphs with millions of vertices. Extensive experiments are conducted on synthetic and real large bipartite graphs to demonstrate the efficiency and effectiveness of our proposed algorithms and techniques.
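
To pin down what an MBB is, the sketch below finds one by exhaustive search over subsets of the left side. It is a naive exponential baseline for tiny graphs, not either of the algorithms proposed in the paper. The function name and the (left, right) edge orientation are assumptions for illustration, and right-side vertices are assumed to be sortable.

```python
from itertools import combinations

def max_balanced_biclique(left, right, edges):
    """Return (A, B) with |A| == |B| maximal such that every a in A is adjacent to every b in B."""
    adj = {u: set() for u in left}
    for u, v in edges:  # each edge is given as (left vertex, right vertex)
        adj[u].add(v)
    for size in range(min(len(left), len(right)), 0, -1):
        for group in combinations(left, size):
            # Right vertices adjacent to every vertex in `group`.
            common = set(right)
            for u in group:
                common &= adj[u]
            if len(common) >= size:
                return set(group), set(sorted(common)[:size])
    return set(), set()
```

Trying sizes from largest to smallest means the first hit is a maximum balanced biclique.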

Data Structures And Algorithms

Efficient Graph Minors Theory and Parameterized Algorithms for (Planar) Disjoint Paths

In the Disjoint Paths problem, the input consists of an n-vertex graph G and a collection of k vertex pairs, {(s_i, t_i)}_{i=1}^k, and the objective is to determine whether there exists a collection {P_i}_{i=1}^k of k pairwise vertex-disjoint paths in G where the end-vertices of P_i are s_i and t_i. This problem was shown to admit an f(k)·n^3-time algorithm by Robertson and Seymour (Graph Minors XIII, The Disjoint Paths Problem, JCTB). In modern terminology, this means that Disjoint Paths is fixed-parameter tractable (FPT) with respect to k. Remarkably, the above algorithm for Disjoint Paths is a cornerstone of the entire Graph Minors Theory, and conceptually vital to the g(k)·n^3-time algorithm for Minor Testing (given two undirected graphs, G and H on n and k vertices, respectively, determine whether G contains H as a minor). In this semi-survey, we will first give an exposition of the Graph Minors Theory with emphasis on efficiency from the viewpoint of Parameterized Complexity. Secondly, we will review the state of the art with respect to the Disjoint Paths and Planar Disjoint Paths problems. Lastly, we will discuss the main ideas behind a new algorithm that combines treewidth reduction and an algebraic approach to solve Planar Disjoint Paths in time 2^{k^{O(1)}}·n^{O(1)} (for undirected graphs).
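
The problem statement can be made concrete with a naive backtracking search. This sketch runs in exponential time and has nothing to do with the Robertson-Seymour algorithm or the algebraic approach mentioned above; it only illustrates the input and output. It assumes all terminals are distinct and that the graph is given as an adjacency dictionary; the function name is hypothetical.

```python
def disjoint_paths(adj, pairs):
    """Return a list of pairwise vertex-disjoint paths, one per (s, t) pair, or None."""
    terminals = {x for pair in pairs for x in pair}

    def route(i, used):
        if i == len(pairs):
            return []
        s, t = pairs[i]

        def extend(path, used_now):
            u = path[-1]
            if u == t:
                rest = route(i + 1, used_now)
                return None if rest is None else [path] + rest
            for v in adj[u]:
                # Skip vertices already on a path and terminals of other pairs.
                if v in used_now or (v in terminals and v != t):
                    continue
                result = extend(path + [v], used_now | {v})
                if result is not None:
                    return result
            return None

        return extend([s], used | {s})

    return route(0, frozenset())
```

On a star with centre c and leaves a, b, d, e, the pairs (a, b) and (d, e) cannot be routed, because both paths would need c.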

Data Structures And Algorithms

Efficient Hierarchical Clustering for Classification and Anomaly Detection

We address the problem of large-scale real-time classification of content posted on social networks, along with the need to rapidly identify novel spam types. Obtaining manual labels for user-generated content using editorial labeling and taxonomy development lags behind the rate at which new content types need to be classified. We propose a class of hierarchical clustering algorithms that can be used both for efficient and scalable real-time multiclass classification and for detecting new anomalies in user-generated content. Our methods have low query time, linear space usage, and come with theoretical guarantees with respect to a specific hierarchical clustering cost function (Dasgupta, 2016). We compare our solutions against a range of classification techniques and demonstrate excellent empirical performance.

Data Structures And Algorithms

Efficient Maintenance of Distance Labelling for Incremental Updates in Large Dynamic Graphs

Finding the shortest path distance between an arbitrary pair of vertices is a fundamental problem in graph theory. A tremendous amount of research has been devoted to this problem, most of which is limited to static graphs. Due to the dynamic nature of real-world networks, there is a pressing need to address this problem for dynamic networks undergoing changes. In this paper, we propose an online incremental method to efficiently answer distance queries over very large dynamic graphs. Our proposed method incorporates incremental update operations, i.e., edge and vertex additions, into a highly scalable framework for answering distance queries. We theoretically prove the correctness of our method and the preservation of labelling minimality. We have also conducted extensive experiments on 12 large real-world networks to empirically verify the efficiency, scalability, and robustness of our method.

Data Structures And Algorithms

Efficient Network Reliability Computation in Uncertain Graphs

Network reliability is an important metric to evaluate the connectivity among given vertices in uncertain graphs. Since the network reliability problem is known to be #P-complete, existing studies have used approximation techniques. In this paper, we propose a new sampling-based approach that efficiently and accurately approximates network reliability. Our approach improves efficiency by reducing the number of samples based on stratified sampling. We theoretically guarantee that our approach improves the accuracy of approximation by using lower and upper bounds of network reliability, even though it reduces the number of samples. To efficiently compute the bounds, we develop an extended BDD, called S2BDD. While constructing the S2BDD, our approach employs dynamic programming for efficiently sampling possible graphs. Our experiments with real datasets demonstrate that our approach is up to 51.2 times faster than the existing sampling-based approach with higher accuracy.
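
For orientation, the sketch below shows the plain Monte Carlo baseline that sampling-based approaches start from: sample possible graphs and count how often the given vertices are connected. It is not the stratified sampling or S2BDD method of the paper. It assumes each edge exists independently with a given probability, which is a common uncertain-graph model but is not spelled out in the abstract; the function names and parameters are illustrative.

```python
import random

def _find(parent, x):
    """Union-find lookup with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def estimate_reliability(edge_probs, terminals, samples=1000, seed=0):
    """Estimate the probability that all `terminals` lie in one connected component."""
    rng = random.Random(seed)
    terminals = list(terminals)
    hits = 0
    for _ in range(samples):
        parent = {}
        # Sample one possible graph: keep each edge independently with probability p.
        for (u, v), p in edge_probs.items():
            if rng.random() < p:
                parent[_find(parent, u)] = _find(parent, v)
        root = _find(parent, terminals[0])
        if all(_find(parent, t) == root for t in terminals):
            hits += 1
    return hits / samples
```

With this estimator the error shrinks only as the number of samples grows, which is the cost that sample-reduction techniques aim to cut.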

Data Structures And Algorithms

Efficient Tensor Decomposition

This chapter studies the problem of decomposing a tensor into a sum of constituent rank-one tensors. While tensor decompositions are very useful in designing learning algorithms and in data analysis, computing them is NP-hard in the worst case. We will see how to design efficient algorithms with provable guarantees under mild assumptions, using beyond-worst-case frameworks like smoothed analysis.
