Zubair Shah | Researchain

Publication

Featured research published by Zubair Shah.

advanced data mining and applications | 2013

Software Clustering Using Automated Feature Subset Selection

Zubair Shah; Rashid Naseem; Mehmet A. Orgun; Abdun Naser Mahmood; Sara Shahzad

This paper proposes a feature selection technique for software clustering that can be used in the architecture recovery of software systems. The recovered architecture can then be used in the subsequent phases of software maintenance, reuse, and re-engineering. A number of diverse features can be extracted from the source code of software systems; however, some of the extracted features may carry little information for relating entities to one another, which lowers the quality of the resulting software clusters. Therefore, further research is required to select those features that are highly relevant for finding associations between entities. In this article, we first propose a supervised feature selection technique for unlabeled data, and then apply this technique to software clustering. A number of feature subset selection techniques have been proposed for software architecture recovery; however, none of them focuses on automated feature selection in this domain. Experimental results on three software test systems reveal that the proposed approach produces results that are closer to the decompositions prepared by human experts than those discovered by the well-known K-Means algorithm.

IEEE Transactions on Big Data | 2017

A Spatio-temporal Data Summarization Paradigm for Real-time Operation of Smart Grid

Zubair Shah; Adnan Anwar; Abdun Naser Mahmood; Zahir Tari; Albert Y. Zomaya

In a smart grid distribution management system, operation, planning, forecasting, and decision making rely on demand-side management functions, which require real-time smart grid data. This data has significant dollar value because it is extremely useful for efficient control and intelligent prediction of energy consumption, and for expert management of residential and commercial load. However, the huge amount of smart grid data generated at very high velocity poses a number of challenges. Utility companies have a strong demand for efficient summarization techniques to mine interesting patterns and extract useful, actionable intelligence. Research from various domains has shown that data summarization can significantly improve the scalability and efficiency of various data analytic tasks (e.g., transactional database mining, data stream mining, network monitoring). This paper proposes a summarization approach (i.e., a set of algorithms, data structures, and query mechanisms) that enables the utility company to accurately infer various energy consumption patterns in real time by automatically monitoring smart grid data using significantly fewer computational resources. The proposed summarization approach is suitable for processing spatiotemporal streams, and it can also provide answers in real time to various smart grid applications (e.g., demand-side management, direct load control, smart pricing, and Volt-VAr control). Both a theoretical bound and an experimental evaluation are presented, which show that the memory required for the proposed data structure grows linearly for the first 52 weeks; after the first year, however, the memory growth is negligible. The experimental results show that the proposed approach can process around 4 million smart meter readings every second or 120 million readings every minute. The proposed approach outperforms widely used commercial Database Management Systems (DBMSs) in terms of update and query costs: it is about 200 times faster than DBMSs in update time and about 340 times faster in query time.

conference on industrial electronics and applications | 2014

Stochastic model of TCP and UDP traffic in IEEE 802.11b/g

Mazhar Hussain Malik; Mehmet Emin Aydin; Zubair Shah; Saqib Hussain

IEEE 802.11 networks have been widely explored over the last decade, and IEEE 802.11g was proposed to increase the data rate of wireless networks up to 54 Mbps while ensuring backward compatibility with IEEE 802.11b. The main goal of increasing the data rate is to handle high-data-rate traffic. Many studies have analyzed the performance of wireless networks under quality-of-service constraints. However, only a limited number of studies try to understand the behavior of TCP traffic while UDP is present. The IEEE 802.11 family has a standard mechanism to handle TCP and UDP traffic without consideration of traffic priorities. In the presence of UDP traffic, TCP exhibits high delay and saturation. In this paper, we embed Bianchi's Markov chain model with our packet fragmentation technique to reduce the saturation impact on TCP traffic while UDP is present. We analyze the throughput for IEEE 802.11b/g considering response times and packet retransmission. The theoretical results are validated by simulation, and it is observed that the TCP packet rate is improved by the proposed approach.
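
For readers unfamiliar with the Markov chain model named in the abstract above, the following is a minimal sketch of the textbook Bianchi saturation model only. It does not include the paper's packet fragmentation technique. The function names and the default values `w=32` (minimum contention window) and `m=5` (number of backoff stages) are illustrative assumptions, not values taken from the paper.

```python
def transmission_probability(p, w, m):
    """Per-slot transmission probability tau for collision probability p.

    Uses the series form tau = 2 / (1 + W + p*W*sum_{k<m} (2p)^k),
    which avoids the 0/0 case of the closed form at p = 0.5.
    """
    geometric = sum((2 * p) ** k for k in range(m))
    return 2.0 / (1 + w + p * w * geometric)


def collision_probability(tau, n):
    """Probability that at least one of the other n-1 stations transmits."""
    return 1.0 - (1.0 - tau) ** (n - 1)


def solve_bianchi(n, w=32, m=5, tol=1e-12):
    """Find the fixed point (tau, p) for n saturated stations by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # The gap is positive below the fixed point and negative above it.
        gap = transmission_probability(collision_probability(mid, n), w, m) - mid
        if gap > 0:
            lo = mid
        else:
            hi = mid
    tau = (lo + hi) / 2
    return tau, collision_probability(tau, n)


if __name__ == "__main__":
    for stations in (1, 5, 10, 20):
        tau, p = solve_bianchi(stations)
        print(f"n={stations:2d}  tau={tau:.4f}  p={p:.4f}")
```

With a single station there are no collisions, so the solver reduces to tau = 2 / (W + 1); as the number of stations grows, the collision probability rises and tau falls.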

knowledge discovery and data mining | 2016

Computing Hierarchical Summary of the Data Streams

Zubair Shah; Abdun Naser Mahmood; Michael Barlow

Data stream processing is an important function in many online applications such as network traffic analysis, web applications, and financial data analysis. Computing summaries of a data stream is challenging because streaming data is generally unbounded and cannot be permanently stored or accessed more than once. In this paper, we propose two counter-based hierarchical (CHS) ε-approximation algorithms to create hierarchical summaries of one-dimensional data. CHS maintains a data structure in which each entry contains the incoming data item and an associated counter that stores its frequency. Since not every item in streaming data can be stored, CHS maintains only frequent items, known as hierarchical heavy hitters, at various levels of the generalization hierarchy by exploiting the natural hierarchy of the data. The algorithm guarantees accuracy of count within an

computational science and engineering | 2013

Subset Selection Classifier (SSC): A Training Set Reduction Method

Zubair Shah; Abdun Naser Mahmood; Mehmet A. Orgun; M. Hadi Mashinchi

conference on information and knowledge management | 2015

A Data-Driven Approach to Distinguish Cyber-Attacks from Physical Faults in a Smart Grid

Adnan Anwar; Abdun Naser Mahmood; Zubair Shah

international conference on security and privacy in communication systems | 2014

Forensic Potentials of Solid State Drives

Zubair Shah; Abdun Naser Mahmood; Jill Slay

international conference on big data | 2014

A summarization paradigm for big data

Zubair Shah; Abdun Naser Mahmood
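
As a rough illustration of the counter-based summary described in the abstract of "Computing Hierarchical Summary of the Data Streams" above, the sketch below keeps one small table of (item, counter) entries per hierarchy level. It is not the paper's CHS algorithm: it substitutes the well-known Misra-Gries frequent-items counters at each level, and it reports frequent prefixes without discounting the counts of their descendants. The class names, the tuple-of-labels item encoding, and the `phi` threshold parameter are all assumptions made for this example.

```python
import math


class MisraGries:
    """Counter table that underestimates each frequency by at most epsilon * n."""

    def __init__(self, epsilon):
        self.capacity = math.ceil(1 / epsilon)  # maximum number of counters kept
        self.counters = {}

    def update(self, item):
        if item in self.counters:
            self.counters[item] += 1
        elif len(self.counters) < self.capacity:
            self.counters[item] = 1
        else:
            # Table is full: decrement every counter and drop those that hit zero.
            for key in list(self.counters):
                self.counters[key] -= 1
                if self.counters[key] == 0:
                    del self.counters[key]

    def estimate(self, item):
        return self.counters.get(item, 0)


class HierarchicalSummary:
    """One Misra-Gries table per level of a prefix hierarchy (assumed encoding)."""

    def __init__(self, depth, epsilon):
        self.epsilon = epsilon
        self.n = 0
        self.levels = [MisraGries(epsilon) for _ in range(depth)]

    def update(self, item):
        """item is a tuple of labels, most general first, e.g. IP address octets."""
        self.n += 1
        for level, table in enumerate(self.levels, start=1):
            table.update(item[:level])

    def estimate(self, prefix):
        return self.levels[len(prefix) - 1].estimate(prefix)

    def heavy_hitters(self, phi):
        """Prefixes whose estimated count is at least (phi - epsilon) * n."""
        threshold = (phi - self.epsilon) * self.n
        return {
            prefix: count
            for table in self.levels
            for prefix, count in table.counters.items()
            if count >= threshold
        }


if __name__ == "__main__":
    summary = HierarchicalSummary(depth=4, epsilon=0.25)
    stream = [("10", "0", "0", "1")] * 6 + [("10", "0", "1", "2")] * 3
    stream.append(("192", "168", "0", "1"))
    for address in stream:
        summary.update(address)
    print(summary.heavy_hitters(phi=0.8))
```

Because each table holds at most ceil(1/ε) counters, memory per level stays bounded no matter how long the stream runs, at the cost of counts that may be too low by up to ε times the stream length.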

conference on industrial electronics and applications | 2013