Waqas Nawaz | Researchain

Publication

Featured researches published by Waqas Nawaz.

Distributed and Parallel Databases | 2015

Intra graph clustering using collaborative similarity measure

Waqas Nawaz; Kifayat-Ullah Khan; Young-Koo Lee; Sungyoung Lee

A graph is an extremely versatile data structure in terms of its expressiveness and flexibility to model a range of real-life phenomena. Various networks, such as social networks, sensor networks, and computer networks, are represented and stored in the form of graphs. The analysis of these kinds of graphs has long been of immense importance. It is performed from various aspects to get the maximum out of such multifaceted information repositories. When the analysis is targeted towards finding groups of vertices based on their similarity in a graph, clustering is the most conspicuous option. Previous graph clustering approaches focus on either the topological structure or attribute likeness; however, a few recent methods consider both aspects simultaneously. Due to the enormous computation requirements for similarity estimation, these methods often suffer from scalability issues. To overcome this limitation, we introduce a collaborative similarity measure (CSM) for intra-graph clustering. CSM is based on a shortest path strategy, instead of all paths, to define structural and semantic relevance among vertices. First, we calculate the pairwise similarity among vertices using CSM. Second, vertices are grouped together based on the calculated similarity under the k-Medoid framework. Empirical analysis, based on density and entropy, demonstrates the efficacy of CSM over existing measures. Moreover, CSM becomes a potential candidate for medium-scale graph analysis because it requires an order of magnitude fewer computations.
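
The two steps described in this abstract (pairwise scores from shortest paths, then grouping under a k-Medoid framework) can be illustrated with a minimal sketch. This is not the authors' CSM: the abstract does not give the formula, so plain hop distance from breadth-first search stands in for it, and vertex attributes are ignored. The function names `bfs_distances`, `pairwise_distances`, and `k_medoids` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every reachable vertex of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pairwise_distances(adj):
    """Step 1 (stand-in for CSM): shortest-path distance between all vertex pairs."""
    unreachable = len(adj)  # larger than any real hop count in this graph
    table = {}
    for u in adj:
        reach = bfs_distances(adj, u)
        table[u] = {v: reach.get(v, unreachable) for v in adj}
    return table


def k_medoids(dist, medoids, max_iter=20):
    """Step 2: alternate between assigning vertices and re-electing medoids."""
    medoids = list(medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assign every vertex to its nearest medoid (ties broken by vertex id).
        clusters = {m: [] for m in medoids}
        for v in dist:
            nearest = min(medoids, key=lambda m: (dist[v][m], m))
            clusters[nearest].append(v)
        # The new medoid of a cluster is the member with the smallest total distance.
        new_medoids = [
            min(members, key=lambda c: (sum(dist[c][x] for x in members), c))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return clusters


# Assumed toy graph: two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
clusters = k_medoids(pairwise_distances(adj), medoids=[0, 5])
```

On this toy graph the two triangles end up in separate clusters. A measure that also accounts for vertex attributes would replace only `pairwise_distances`; the grouping step is unchanged.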

Computing | 2015

Set-based approximate approach for lossless graph summarization

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

Graph summarization is a valuable approach to analyze various real-life phenomena, like communities, influential nodes, and information flow in a big graph. To summarize a graph, nodes having similar neighbors are merged into super nodes and their corresponding edges are compressed into super edges. Existing methods find similar nodes either by node ordering or by performing pairwise similarity computations. Compression-by-node-ordering approaches are scalable but provide less compression than their pairwise counterparts, which perform exhaustive similarity computations. In this paper, we propose a novel set-based summarization approach that directly summarizes naturally occurring sets of similar nodes in a graph. Our approach is scalable since we avoid explicit similarity computations with non-similar nodes and merge sets of nodes in each iteration. Similarly, we provide a good compression ratio as each set consists of highly similar nodes. To locate sets of similar nodes, we find candidate sets of similar nodes by using locality sensitive hashing. However, member nodes of every candidate set have varying similarities with each other. Therefore, we propose a heuristic based on similarity among degrees of candidate nodes, and a parameter-free pruning technique to effectively identify the subset of highly similar nodes from the candidate nodes. Experiments on real-world graphs show that our approach requires less execution time than pairwise graph summarization, by a margin of an order of magnitude on graphs containing nodes with highly diverse neighborhoods, and produces a summary of similar accuracy. Similarly, we observe comparable scalability against the compression-by-node-ordering method, while providing a better compression ratio.
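
The candidate-generation step mentioned in this abstract can be illustrated with a generic MinHash-based locality sensitive hashing sketch. It is an assumption-laden illustration, not the paper's method: the hash family, the banding parameters, and the names `make_hash_params`, `minhash_signature`, and `candidate_sets` are all hypothetical, node ids are assumed to be integers, and the degree-based heuristic and pruning step are not shown.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2^31 - 1, used as the modulus of the hash family


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(neighbors, params):
    """One minimum per hash function over the node's neighbor ids."""
    return tuple(min((a * x + b) % PRIME for x in neighbors) for a, b in params)


def candidate_sets(adj, num_hashes=8, bands=4, seed=0):
    """Group nodes whose signatures agree on at least one band."""
    params = make_hash_params(num_hashes, seed)
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for node, neighbors in adj.items():
        if not neighbors:
            continue  # isolated nodes have no neighborhood to hash
        sig = minhash_signature(neighbors, params)
        for band in range(bands):
            key = (band, sig[band * rows:(band + 1) * rows])
            buckets[key].add(node)
    # Only buckets holding two or more nodes are candidates for merging.
    return [nodes for nodes in buckets.values() if len(nodes) > 1]


# Assumed toy graph: nodes 1 and 2 share exactly the same neighbors.
adj = {
    1: {10, 11, 12}, 2: {10, 11, 12}, 3: {20, 21},
    10: {1, 2}, 11: {1, 2}, 12: {1, 2}, 20: {3}, 21: {3},
}
candidates = candidate_sets(adj)
```

Nodes with identical neighbor sets always produce identical signatures, so they are guaranteed to share a bucket. Nodes with merely overlapping neighborhoods collide only with some probability, which is why candidates still need a verification or pruning pass.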

Database Systems for Advanced Applications | 2012

Collaborative similarity measure for intra graph clustering

Waqas Nawaz; Young-Koo Lee; Sungyoung Lee

Assorted networks have emerged for analysis and visualization, including social community networks, biological networks, sensor networks, and many other information networks. Prior approaches focus on either the topological structure or attribute likeness for graph clustering. A few recent methods consider both aspects but do not scale because of their high time complexity. In this paper, we develop an intra-graph clustering strategy using a collaborative similarity measure (IGC-CSM), which scales comparatively well to medium-scale graphs. In this approach, the relationship intensity among vertices is calculated first, and clusters are then formed using the k-Medoid framework. Empirical analysis based on density and entropy shows the efficiency of the IGC-CSM algorithm without compromising the quality of the clusters.

International Conference on Big Data and Cloud Computing | 2014

Set-Based Unified Approach for Attributed Graph Summarization

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

In this paper, we combine neighborhood and attribute similarity to summarize big graphs where each node has multiple attributes. The main intuition behind our approach is that sets of nodes having common links in a graph usually have the same attributes. Thus, compressing such Sets of Similar Nodes (SSNs) can significantly reduce the size of big graphs while preserving the overall properties of the underlying graphs. However, efficiently finding such sets is computationally expensive. Finding them using pairwise similarity computations is not scalable, while exploiting the node ordering in graphs does not consider their attributes. For this purpose, we propose a Unified Locality Sensitive Hashing (ULSH) approach to approximate the SSNs, since in graphs LSH can group nodes based on neighborhood similarity only. Further, using the Minimum Description Length (MDL) principle, we propose a Unified Graph Summarization (UGS) technique to perform lossless compression of each set by creating a super node or adding a new virtual node in the graph. We compare our approach with two state-of-the-art methods through experiments on synthetic and publicly available real-world graphs and observe very encouraging results.

World Wide Web | 2017

Set-based unified approach for summarization of a multi-attributed graph

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

The rich availability of real-world knowledge in a graph, based on the attributes of each vertex and its interactions, is a valuable source of information. However, it is hard to derive this useful knowledge since graphs of the current era either do not fit in main memory or cannot be efficiently processed. In this regard, it is better to create a meaningful summary graph that is compact yet preserves the intrinsic properties of its underlying graph. In this paper, we propose a summarization approach for a big graph where each node has multiple attributes. The main intuition behind our approach is based on a real-life concept: "friends of friends have many common friends and also have similar likes and preferences". We use this phenomenon as the basis for identifying sets of nodes having a common neighborhood and similar attributes for summarization. Existing aggregation-based summarization methods use a pairwise heuristic to find similar pairs of nodes for compression. Although pairwise similarity computations can check both neighborhood and attribute similarities, they are impractical for summarizing a big graph. For this purpose, we propose a set-based approach for efficient summarization. To identify each set, we adopt Locality Sensitive Hashing (LSH) to restrict similarity computations to candidate similar nodes only. Since existing LSH techniques consider only neighborhood similarity in a graph, we propose a Unified LSH approach to simultaneously consider both attribute and neighborhood similarities. Further, using the Minimum Description Length (MDL) principle, we present a new technique to perform lossless summarization of each set by creating a super node or adding a new virtual node in the summary graph. We evaluate our proposed approach against state-of-the-art methods on synthetic and publicly available real-world graphs and observe better results in terms of execution time, compression ratio, and number of corrections to reconstruct the original graph.

Applied Intelligence | 2015

SPORE: shortest path overlapped regions and confined traversals towards graph clustering

Waqas Nawaz; Kifayat-Ullah Khan; Young-Koo Lee

An abundance of structural information has resulted in non-trivial graph traversals. Shortcut construction is among the utilized techniques implemented for efficient shortest path (SP) traversals on graphs. However, shortcut construction, being a computationally intensive task, required to be exclusive and offline, often produces unnecessary auxiliary data, i.e., shortcuts. Medium to large-scale graphs can take minutes to hours of computation time depending upon the utilization of computational resources and complexity of shortcut construction algorithms. In addition, the branching factor during SP expansions greatly increases due to excessive shortcuts. These factors make repeated SP queries unsuitable for graph mining tasks. This paper presents Shortest Path Overlapped Region (SPORE), a performance-based initiative that improves the shortcut construction performance by exploiting SP overlapped regions. Path overlapping has been overlooked by shortcut construction systems. SPORE takes advantage of this opportunity and provides a solution by constructing auxiliary shortcuts incrementally, using SP trees during traversals, instead of an exclusive step. SPORE is exposed to a graph clustering task, which requires extensive graph traversals to group similar vertices together, for realistic implications. We further suggest an optimization strategy to accelerate the performance of the clustering process using confined subgraph traversals. A performance evaluation of SPORE on real and synthetic graphs reveals an execution time gain of up to 40 %, having an order of magnitude fewer shortcuts over the SegTable approach. Leveraging the SPORE with multiple SP computations consistently reduces the latency of the entire clustering process. Furthermore, the confined subgraph traversal scheme improves the performance by an order of magnitude on undirected graphs, which is twice that of directed graphs.

Information Sciences | 2017

Faster compression methods for a weighted graph using locality sensitive hashing

Kifayat Ullah Khan; Batjargal Dolgorsuren; Tu Nguyen Anh; Waqas Nawaz; Young-Koo Lee

Weights on the edges of a graph can show interactions among members of a social network, emails exchanged in any organization, and traffic flow on roads. However, mining hidden patterns is difficult when the size of the graph is large. Creating a compact summary is useful if it preserves the structural and edge weight information of its underlying graph. Existing work in this context provides a pairwise compression strategy to create a summary whose decompressed version has minimum difference in edge weights compared to its initial state. The resultant summary graph is compact, but the solution has quadratic time complexity due to exhaustive pairwise searching. Therefore, we present a set-based summarization approach that aggregates sets of nodes. We avoid explicit similarity computations and directly identify the required sets via Locality Sensitive Hashing (LSH). LSH accelerates the summarization process, but its hashing scheme cannot consider the edge weights. Considering the edge weight during hashing is necessary when the objective of the required summary is altered to a personalized view. Hence, we propose a non-parametric hashing scheme for LSH to generate candidate similar nodes from the weighted neighborhood of each node. We perform comparisons with state-of-the-art solutions and obtain better results using various experimental criteria.

International Conference on Ubiquitous Information Management and Communication | 2015

Lossless graph summarization using dense subgraphs discovery

Kifayat Ullah Khan; Waqas Nawaz; Young-Koo Lee

Dense subgraph discovery in a large graph is useful for solving the community search problem. Motivated by this, we propose a graph summarization method in which we search for dense subgraphs and aggregate them into super nodes. Since dense subgraphs have a high overlap of common neighbors, merging such subgraphs can produce a highly compact summary graph. Although the member nodes of each dense subgraph have many edges in common, they also have some edges not belonging to the given subgraph, which in turn reduces the compression ratio. To solve this problem, we propose the concept of AutoPruning, which effectively filters the dense subgraphs having a higher ratio of common to non-common neighbors. To summarize the dense subgraphs, we use the Minimum Description Length (MDL) principle to obtain a highly compact summary with the fewest edge corrections for lossless compression. We propose two alternatives to trade off computation time and compression ratio while creating a super node from each dense subgraph. Through experiments on two publicly available real-world graphs, we compare the proposed approach with the well-known Minimum Degree Measure (MDM) for dense subgraph discovery and observe very encouraging results.

International Conference on Big Data and Cloud Computing | 2014

CORE Analysis for Efficient Shortest Path Traversal Queries in Social Graphs

Waqas Nawaz; Kifayat-Ullah Khan; Young-Koo Lee

Shortest path traversal queries require scanning the entire graph. The repetitive scans make it hard to answer traversal queries in a reasonable time. Many traversal algorithms employ precomputed information to speed up the run-time query process. However, computing effective auxiliary data at the pre-processing stage is still an active problem. It is useful but non-trivial to determine the occurrence of vertices or edges during shortest path computations. In this paper, we empirically analyze the continuous overlapped regions (COREs) to articulate the behavior of shortest path traversals on different real-life social graphs. First, we compute the shortest paths between a set of vertices. Each shortest path is considered as one transaction. Second, we utilize a pattern mining approach to identify the frequency of occurrence of vertices that appear together. Further, we evaluate the results in terms of network properties, i.e., degree distribution, average shortest path, and clustering coefficient, along with visual analysis.
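
The two-step analysis in this abstract (shortest paths treated as transactions, then counting which vertices appear together) can be sketched as follows. This is a simplified illustration under stated assumptions: the graph is unweighted, one BFS shortest path is taken per vertex pair, and plain counting of single vertices and vertex pairs stands in for the unspecified pattern mining approach. The names `shortest_path` and `co_occurrence` are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def shortest_path(adj, source, target):
    """One BFS shortest path from source to target, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            break
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def co_occurrence(adj, endpoints):
    """Treat each shortest path as a transaction and count vertex (pair) frequency."""
    vertex_count = Counter()
    pair_count = Counter()
    for s, t in combinations(endpoints, 2):
        path = shortest_path(adj, s, t)
        if path is None:
            continue
        vertex_count.update(path)
        # Sorting makes (u, v) and (v, u) count as the same pair.
        pair_count.update(combinations(sorted(path), 2))
    return vertex_count, pair_count


# Assumed toy graph: a star whose centre lies on every shortest path.
adj = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a", "b", "c"]}
vertex_count, pair_count = co_occurrence(adj, ["a", "b", "c"])
```

In the star graph the centre vertex appears in all three transactions, which is the kind of overlap the counts are meant to expose. A full frequent-itemset miner would extend the same idea to vertex sets larger than pairs.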

International Conference on Big Data | 2015

Dynamic Taxi Trip Information Management using G* System

Batjargal Dolgorsuren; Waqas Nawaz; Young-Koo Lee

In this paper, we provide an efficient mechanism to manage dynamic taxi trip information. Specifically, we have designed a graph storage model, which is more efficient than the relational data model, for the taxi dispatching problem. Our proposed model is capable of mapping the dynamics of the road network, in terms of taxi trip information, onto a set of graph instances or snapshots. G* can effectively handle the complexity and subtlety inherent in dynamic graphs and executes complex queries on large graphs using distributed operators to process graph data in parallel. We extended the G* system using our modeling strategy for transportation networks. We carry out our experiments using dynamic taxi trip information from the New York area. The experimental results show the superiority of our proposed system in terms of efficient storage and processing for taxi dispatching problems. This makes the existing taxi information management system more practical and profitable by improving the overall performance.
