Kifayat Ullah Khan | Researchain

Publication

Featured researches published by Kifayat Ullah Khan.

Computing | 2015

Set-based approximate approach for lossless graph summarization

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

Graph summarization is a valuable approach for analyzing various real-life phenomena, such as communities, influential nodes, and information flow, in a big graph. To summarize a graph, nodes having similar neighbors are merged into super nodes and their corresponding edges are compressed into super edges. Existing methods find similar nodes either by node ordering or by pairwise similarity computations. Compression-by-node-ordering approaches are scalable but provide less compression, whereas their pairwise counterparts compress better at the cost of exhaustive similarity computations. In this paper, we propose a novel set-based summarization approach that directly summarizes naturally occurring sets of similar nodes in a graph. Our approach is scalable since we avoid explicit similarity computations with non-similar nodes and merge sets of nodes in each iteration. Similarly, we provide a good compression ratio as each set consists of highly similar nodes. To locate sets of similar nodes, we find candidate sets of similar nodes by using locality sensitive hashing. However, member nodes of every candidate set have varying similarities with each other. Therefore, we propose a heuristic based on similarity among degrees of candidate nodes, and a parameter-free pruning technique to effectively identify the subset of highly similar nodes among the candidate nodes. In experiments on real-world graphs, our approach requires less execution time than pairwise graph summarization, by an order of magnitude on graphs containing nodes with highly diverse neighborhoods, and produces summaries of similar accuracy. Similarly, we observe comparable scalability against the compression-by-node-ordering method, while providing a better compression ratio.
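The candidate-set step described in this abstract can be illustrated with a minimal MinHash-style sketch. This is not the authors' implementation: the function names, the `num_hashes` parameter, and the choice to use the full signature as a single bucket key are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def minhash_signature(neighbors, seeds):
    # One min-hash value per seed: the smallest hash over the neighbor set.
    return tuple(min(hash((seed, v)) for v in neighbors) for seed in seeds)


def candidate_sets(adj, num_hashes=4, seed=0):
    """Group nodes whose neighbor sets share the same MinHash signature.

    adj maps each node to the set of its neighbors (hypothetical input format).
    """
    rng = random.Random(seed)
    seeds = [rng.getrandbits(32) for _ in range(num_hashes)]
    buckets = defaultdict(list)
    for node, neighbors in adj.items():
        if neighbors:  # isolated nodes have no signature
            buckets[minhash_signature(neighbors, seeds)].append(node)
    # Only buckets holding at least two nodes are candidate sets.
    return [sorted(bucket) for bucket in buckets.values() if len(bucket) > 1]
```

Nodes with identical neighbor sets always share a bucket; nodes with merely similar neighborhoods share one only with some probability, which is why the abstract follows hashing with a separate pruning step.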

International Conference on Big Data and Cloud Computing | 2014

Set-Based Unified Approach for Attributed Graph Summarization

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

In this paper, we combine neighborhood and attribute similarity to summarize big graphs where each node has multiple attributes. The main intuition behind our approach is that sets of nodes having common links in a graph usually have the same attributes. Thus, compressing such Sets of Similar Nodes (SSNs) can significantly reduce the size of big graphs while preserving the overall properties of the underlying graphs. However, finding such sets is computationally expensive. Finding them using pairwise similarity computations is not scalable, while exploiting the node ordering in graphs does not consider their attributes. For this purpose, we propose a Unified Locality Sensitive Hashing (ULSH) approach to approximate the SSNs, since in graphs LSH can assemble the nodes based on neighborhood similarity only. Further, using the Minimum Description Length (MDL) principle, we propose a Unified Graph Summarization (UGS) technique to perform lossless compression of each set by creating a super node or adding a new virtual node in the graph. We compare our approach with two state-of-the-art methods by experiments on synthetic and publicly available real-world graphs and observe very encouraging results.

World Wide Web | 2017

Set-based unified approach for summarization of a multi-attributed graph

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

The rich real-world knowledge held in a graph, in the attributes of each vertex and its interactions, is a valuable source of information. However, it is hard to derive this knowledge since today's graphs either do not fit in main memory or cannot be processed efficiently. In this regard, it is better to create a meaningful summary graph that is compact yet preserves the intrinsic properties of its underlying graph. In this paper, we propose a summarization approach for a big graph where each node has multiple attributes. The main intuition behind our approach is the real-life observation that “friends of friends have many common friends and also have similar likes and preferences”. We use this phenomenon as the basis for identifying sets of nodes having a common neighborhood and similar attributes for summarization. Existing aggregation-based summarization methods use a pairwise heuristic to find similar pairs of nodes for compression. Although pairwise similarity computations can check both neighborhood and attribute similarities, they are impractical for summarizing a big graph. For this purpose, we propose a set-based approach for efficient summarization. To identify each set, we adopt Locality Sensitive Hashing (LSH) to restrict similarity computations to candidate similar nodes only. Since existing LSH techniques consider only neighborhood similarity in a graph, we propose a Unified LSH approach that simultaneously considers both attribute and neighborhood similarities. Further, using the Minimum Description Length (MDL) principle, we present a new technique to perform lossless summarization of each set by creating a super node or adding a new virtual node in the summary graph. We evaluate our proposed approach against state-of-the-art methods on synthetic and publicly available real-world graphs and observe better results in terms of execution time, compression ratio, and number of corrections needed to reconstruct the original graph.
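The idea of a lossless summary made of a super node plus corrections can be sketched as follows. This is an illustrative sketch, not the paper's technique: it assumes a simple majority rule for deciding when a super edge is worthwhile, ignores edges among the merged nodes themselves, and the names `summarize_set` and `expand` are hypothetical.

```python
def summarize_set(adj, members):
    """Merge `members` into one super node; return (super_edges, corrections).

    adj maps each member to its set of neighbors. Edges between members are
    ignored here (an assumption made to keep the sketch short).
    """
    members = set(members)
    outside = set().union(*(adj[m] for m in members)) - members
    super_edges, corrections = set(), []
    for x in outside:
        linked = {m for m in members if x in adj[m]}
        if 2 * len(linked) > len(members):
            # A super edge is cheaper: record only the members that lack it.
            super_edges.add(x)
            corrections += [("-", m, x) for m in members - linked]
        else:
            # No super edge: record each existing edge individually.
            corrections += [("+", m, x) for m in linked]
    return super_edges, corrections


def expand(members, super_edges, corrections):
    """Rebuild the original member-to-outside edges from the summary."""
    edges = {(m, x) for m in members for x in super_edges}
    for sign, m, x in corrections:
        if sign == "+":
            edges.add((m, x))
        else:
            edges.discard((m, x))
    return edges
```

Under this rule the description cost is the number of super edges plus the number of corrections, and `expand` recovers exactly the original edges, which is what makes the summary lossless.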

Information Sciences | 2017

Faster compression methods for a weighted graph using locality sensitive hashing

Kifayat Ullah Khan; Batjargal Dolgorsuren; Tu Nguyen Anh; Waqas Nawaz; Young-Koo Lee

Weights on the edges of a graph can show interactions among members of a social network, emails exchanged in any organization, and traffic flow on roads. However, mining hidden patterns is difficult when the size of the graph is large. Creating a compact summary is useful if it preserves the structural and edge weight information of its underlying graph. Existing work in this context provides a pairwise compression strategy to create a summary whose decompressed version has minimum difference in edge weights compared to its initial state. The resultant summary graph is compact, but the solution has quadratic time complexity due to exhaustive pairwise searching. Therefore, we present a set-based summarization approach that aggregates sets of nodes. We avoid explicit similarity computations and directly identify the required sets via Locality Sensitive Hashing (LSH). LSH accelerates the summarization process, but its hashing scheme cannot consider the edge weights. Considering the edge weight during hashing is necessary when the objective of the required summary is altered to a personalized view. Hence, we propose a non-parametric hashing scheme for LSH to generate candidate similar nodes from the weighted neighborhood of each node. We perform comparisons with state-of-the-art solutions and obtain better results using various experimental criteria.

International Conference on Ubiquitous Information Management and Communication | 2015

Lossless graph summarization using dense subgraphs discovery

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

Dense subgraph discovery in a large graph is useful for solving the community search problem. Motivated by this, we propose a graph summarization method in which we search for dense subgraphs and aggregate them into super nodes. Since the nodes of a dense subgraph have a high overlap of common neighbors, merging such subgraphs can produce a highly compact summary graph. Although the member nodes of each dense subgraph have many edges in common, they also have some edges that do not belong to the subgraph, which in turn reduce the compression ratio. To solve this problem, we propose the concept of AutoPruning, which effectively filters the dense subgraphs having a higher ratio of common to non-common neighbors. To summarize the dense subgraphs, we use the Minimum Description Length (MDL) principle to obtain a highly compact summary with the fewest edge corrections for lossless compression. We propose two alternatives that trade off computation time against compression ratio when creating a super node from each dense subgraph. Through experiments on two publicly available real-world graphs, we compare the proposed approach with the well-known Minimum Degree Measure (MDM) for dense subgraph discovery and observe very encouraging results.
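The ratio of common to non-common neighbors mentioned in this abstract can be computed as in the sketch below. The abstract does not specify how AutoPruning uses the ratio, so this only illustrates the quantity itself; the function name and the decision to ignore neighbors inside the subgraph are assumptions.

```python
def common_to_noncommon_ratio(adj, members):
    """Ratio of neighbors shared by all members to neighbors shared by only some.

    adj maps each member to its set of neighbors. Neighbors that are themselves
    members are ignored (an assumption of this sketch).
    """
    members = set(members)
    outside = [set(adj[m]) - members for m in members]
    common = set.intersection(*outside)
    non_common = set.union(*outside) - common
    if not non_common:
        return float("inf")  # every outside neighbor is shared by all members
    return len(common) / len(non_common)
```

A subgraph with a high ratio merges cleanly into a super node, since few per-edge corrections are needed to keep the summary lossless.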

International Conference on Data Engineering | 2015

Set-based approach for lossless graph summarization using Locality Sensitive Hashing

Kifayat Ullah Khan

Graph summarization is a valuable approach for in-memory processing of a big graph. A summary graph is compact, yet it maintains the overall characteristics of the underlying graph, and is thus suitable for querying and visualization. To summarize a big graph, the idea is to compress the similar nodes in dense regions of the graph. The existing approaches find these similar nodes either by node ordering or by pairwise similarity computations. The former approaches are scalable but cannot simultaneously consider the attribute and neighborhood similarity among the nodes. In contrast, the pairwise summarization methods can consider both similarity aspects but are impractical for a big graph. In this paper, we propose a set-based summarization method that aggregates sets of similar nodes in each iteration and thus provides scalability. To find each set, we approximate the candidate similar nodes without node ordering or explicit similarity computations by using Locality Sensitive Hashing (LSH). In conjunction with an information-theoretic approach, we present scalable solutions for lossless summarization of both attributed and non-attributed graphs.

Proceedings of the Sixth International Conference on Emerging Databases | 2016

Top-k frequent induced subgraph mining using sampling

Van T. T. Duong; Kifayat Ullah Khan; Byeong-Soo Jeong; Young-Koo Lee

Frequent Induced Subgraph Mining (FISM) is an active research direction in various application domains, such as biological, chemical, and social networks. A number of FISM approaches have been proposed over the years. However, existing methods take a long time to execute since they perform numerous subgraph isomorphism (SI) operations, an NP-hard problem, to count the frequency of subgraphs in a graph database. In this paper, we propose kFISM, a new sampling-based method for top-k Frequent Induced Subgraph Mining from a graph database. To avoid SI operations in kFISM, we present a measure, indFreq, to compute the frequency of subgraphs. kFISM executes a biased random walk-based sampling over fixed-size vertex-induced subgraphs so that the potentially frequent subgraphs are visited with high probability. We evaluate the execution time and accuracy of finding our desired types of subgraphs using kFISM on a real-life dataset. We observe that our proposed method outperforms the state-of-the-art approach in execution time and accuracy.

Archive | 2018

Parallel Compression of Weighted Graphs

Elena En; Aftab Alam; Kifayat Ullah Khan; Young-Koo Lee

Large graphs, such as social networks, web graphs, and biological networks, are complex and pose challenges for processing and visualization. Motivated by such issues, Toivonen et al. [1] proposed models and sequential algorithms for weighted graphs with the intention of generating a compressed graph. The proposed compression algorithm is expensive in terms of computation time because it is sequential. Weighted graph compression algorithms can be made faster by adopting parallel processing. In this paper, we adopt a parallel processing technique for the weighted graph compression problem, selecting multiple nodes for merging and using various graph clustering algorithms to avoid overlap between the nodes handled by different threads. To evaluate the performance of the proposed method, we carry out a series of tests on real networks. We perform extensive experiments on parallel graph summarization using different graph clustering algorithms. Our results demonstrate their effectiveness for parallel graph compression on real networks.

The Journal of Supercomputing | 2018

An effective graph summarization and compression technique for a large-scaled graph

Hojin Seo; Kisung Park; Yongkoo Han; Hyunwook Kim; Muhammad Umair; Kifayat Ullah Khan; Young-Koo Lee

Graphs are widely used in various applications, and their size keeps growing over time. It is necessary to reduce their size to minimize main memory needs and to save storage space on disk. For these purposes, graph summarization and compression approaches have been studied to reduce the size of a large graph. Graph summarization aggregates nodes having similar structural properties to represent a graph with reduced main memory requirements, whereas graph compression applies various encoding techniques so that the resultant graph needs less storage space on disk. Considering the usefulness of both paradigms, we propose to obtain the best of both worlds by combining summarization and compression approaches. Hence, we present a greedy algorithm that greatly reduces the size of a large graph by applying both compression and summarization. We also propose a novel cost model for calculating the compression ratio considering both the compression and summarization strategies. The algorithm uses the proposed cost model to determine whether to perform one or both of them in every iteration. Through comprehensive experiments on real-world datasets, we show that our proposed algorithm achieves a compression ratio up to 16% better than applying summarization approaches alone.

International Conference on Ubiquitous Information Management and Communication | 2017

Top-k frequent induced subgraph mining on a sliding window using sampling

Van T. T. Duong; Kifayat Ullah Khan; Young-Koo Lee

Finding frequent induced subgraphs in a stream of graph data is critical for many applications, such as finding frequent substructures in biological networks and chemical compounds, or community detection in social networks. Some approaches have been proposed for mining frequent induced subgraphs in a graph database. However, existing methods take a long time to execute since they perform numerous subgraph isomorphism (SI) operations; thus, these approaches are not efficient for mining in a streaming environment. In this paper, we propose k-FISMW, a new sampling-based method for top-k Frequent Induced Subgraph Mining on a sliding Window. We use a specialized data structure called WSTable (Window Summary Table) to maintain information about recent graphs in the sliding window. To avoid SI operations in k-FISMW, we present a measure, indFreq, to compute the frequency of subgraphs. k-FISMW executes a biased random walk-based sampling over fixed-size vertex-induced subgraphs on the current window so that the potentially frequent subgraphs are visited with high probability. We evaluate the execution time and accuracy of finding our desired types of subgraphs using k-FISMW on both real-life and synthetic datasets. We observe that our proposed method outperforms the state-of-the-art approach in execution time and accuracy.