Mohammad Maifi Hasan Khan

Publication

Featured research published by Mohammad Maifi Hasan Khan.

International Conference on Embedded Networked Sensor Systems | 2008

Dustminer: troubleshooting interactive complexity bugs in sensor networks

Mohammad Maifi Hasan Khan; Hieu Khac Le; Hossein Ahmadi; Tarek F. Abdelzaher; Jiawei Han

This paper presents a tool for uncovering bugs due to interactive complexity in networked sensing applications. Such bugs are not localized to one faulty component, but rather result from complex and unexpected interactions between multiple, often individually non-faulty, components. Moreover, the manifestations of these bugs are often not repeatable, making them particularly hard to find, as the particular sequence of events that invokes the bug may not be easy to reconstruct. Because of the distributed nature of failure scenarios, our tool looks for sequences of events that may be responsible for faulty behavior, as opposed to localized bugs such as a bad pointer in a module. An extensible framework is developed where a front-end collects runtime data logs of the system being debugged and an offline back-end uses frequent discriminative pattern mining to uncover likely causes of failure. We provide a case study of debugging a recent multichannel MAC protocol that was found to exhibit corner cases of poor performance (worse than single-channel MAC). The tool helped uncover event sequences that lead to a highly degraded mode of operation. Fixing the problem significantly improved the performance of the protocol. We also provide a detailed analysis of tool overhead in terms of memory requirements and impact on the running application.
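The idea of discriminative pattern mining over event logs can be illustrated with a minimal Python sketch. It is not the Dustminer implementation: it assumes logs are plain lists of event names, considers only contiguous subsequences, and scores a pattern by the difference between its support in failing runs and in healthy runs. The function names, the `min_gap` parameter, and the event names are hypothetical.

```python
from collections import Counter


def contiguous_patterns(log, max_len):
    """Return the set of contiguous event subsequences of length 1..max_len."""
    found = set()
    for n in range(1, max_len + 1):
        for i in range(len(log) - n + 1):
            found.add(tuple(log[i:i + n]))
    return found


def discriminative_patterns(bad_logs, good_logs, max_len=3, min_gap=0.5):
    """Rank patterns by (support in failing logs) - (support in healthy logs).

    Support is the fraction of logs that contain the pattern at least once.
    Assumes both log collections are non-empty.
    """
    bad = Counter(p for log in bad_logs for p in contiguous_patterns(log, max_len))
    good = Counter(p for log in good_logs for p in contiguous_patterns(log, max_len))
    scored = []
    for pattern, count in bad.items():
        gap = count / len(bad_logs) - good[pattern] / len(good_logs)
        if gap >= min_gap:
            scored.append((gap, pattern))
    # Largest gap first; among ties prefer longer (more specific) patterns.
    return sorted(scored, key=lambda s: (-s[0], -len(s[1]), s[1]))


# Hypothetical logs: two failing runs and two healthy runs.
bad_logs = [["send", "chan_switch", "timeout"], ["recv", "chan_switch", "timeout"]]
good_logs = [["send", "ack"], ["recv", "chan_switch", "ack"]]
ranked = discriminative_patterns(bad_logs, good_logs)
```

In this toy input the top-ranked pattern is the pair `("chan_switch", "timeout")`, which appears in every failing run and in no healthy run, while `("send",)` is filtered out because it is equally common in both sets.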

High Performance Computing and Communications | 2015

Performance Prediction for Apache Spark Platform

Kewen Wang; Mohammad Maifi Hasan Khan

Apache Spark is an open source distributed data processing platform that uses a distributed memory abstraction to process large volumes of data efficiently. However, the performance of a particular job on the Apache Spark platform can vary significantly depending on the input data type and size, the design and implementation of the algorithm, and the computing capability, making it extremely difficult to predict performance metrics of a job such as execution time, memory footprint, and I/O cost. To address this challenge, in this paper, we present a simulation-driven prediction model that can predict job performance with high accuracy for the Apache Spark platform. Specifically, as Apache Spark jobs often consist of multiple sequential stages, the presented prediction model simulates the execution of the actual job by using only a fraction of the input data, and collects execution traces (e.g., I/O overhead, memory consumption, execution time) to predict job performance for each execution stage individually. We evaluated our prediction framework using four real-life applications on a 13-node cluster, and experimental results show that the model can achieve high prediction accuracy.
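A minimal Python sketch of the per-stage extrapolation idea follows. It assumes, purely for illustration, that each stage's execution time scales linearly with input size; the abstract does not state the scaling model the authors use, and the function name and stage names here are hypothetical.

```python
def predict_job_time(sample_stage_times, sample_fraction):
    """Extrapolate per-stage times measured on a data sample to the full input.

    sample_stage_times: mapping of stage name -> seconds measured on the sample.
    sample_fraction: fraction of the full input used for the sample run (0 < f <= 1).
    Linear scaling per stage is an assumption of this sketch.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    per_stage = {
        stage: seconds / sample_fraction
        for stage, seconds in sample_stage_times.items()
    }
    # Stages run sequentially, so the job estimate is the sum of stage estimates.
    return per_stage, sum(per_stage.values())


# Hypothetical trace from a run on 25% of the input.
per_stage, total = predict_job_time({"map": 2.0, "reduce": 1.0}, 0.25)
```

Predicting each stage separately lets a different scaling rule be substituted for any stage whose cost does not grow linearly.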

Distributed Computing in Sensor Systems | 2009

Finding Symbolic Bug Patterns in Sensor Networks

Mohammad Maifi Hasan Khan; Tarek F. Abdelzaher; Jiawei Han; Hossein Ahmadi

This paper presents a failure diagnosis algorithm for summarizing and generalizing patterns that lead to instances of anomalous behavior in sensor networks. Often, multiple seemingly different event patterns lead to the same type of failure manifestation. A hidden relationship among event attributes exists in those patterns and is somehow responsible for the failure. For example, in some systems, a message might always get corrupted if the sender is more than two hops away from the receiver (a distance relationship), irrespective of the senderId and receiverId. To uncover such failure-causing relationships, we present a new symbolic pattern extraction technique that identifies and symbolically expresses relationships correlated with anomalous behavior. Symbolic pattern extraction is a new concept in sensor network debugging that is unique in its ability to generalize over patterns that involve different combinations of nodes or message exchanges by extracting their common relationship. As a proof of concept, we provide synthetic traffic scenarios where we show that applying symbolic pattern extraction can uncover more complex bug patterns that are crucial to understanding the real causes of problems. We also use symbolic pattern extraction to diagnose a real bug and show that it generates far fewer and more accurate patterns compared to previous approaches.
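The hop-distance example can be sketched in a few lines of Python. This is only an illustration of generalizing from concrete node IDs to a relationship, not the paper's extraction algorithm: it assumes each event record already carries a `hops` attribute and a `corrupted` flag (both hypothetical field names) and searches for a single threshold relation of the form `hops > k`.

```python
def find_hop_threshold(events):
    """Return the smallest k such that 'hops > k' is true for every corrupted
    event and false for every clean event, or None if no such k exists."""
    bad = [e["hops"] for e in events if e["corrupted"]]
    good = [e["hops"] for e in events if not e["corrupted"]]
    if not bad:
        return None
    # Any valid k must be below the smallest hop count seen in a failure.
    for k in range(min(bad)):
        if all(h <= k for h in good):
            return k
    return None


# Hypothetical events: sender/receiver IDs differ, only the hop count matters.
events = [
    {"sender": 1, "receiver": 2, "hops": 1, "corrupted": False},
    {"sender": 3, "receiver": 7, "hops": 2, "corrupted": False},
    {"sender": 4, "receiver": 9, "hops": 3, "corrupted": True},
    {"sender": 8, "receiver": 2, "hops": 4, "corrupted": True},
]
threshold = find_hop_threshold(events)
```

Here the two failing events share no sender or receiver, so a pattern over concrete IDs would list them separately, whereas the relation `hops > 2` covers both.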

Human-centric Computing and Information Sciences | 2015

Designing challenge questions for location‐based authentication systems: a real‐life study

Yusuf Albayram; Mohammad Maifi Hasan Khan; Athanasios Bamis; Sotirios Kentros; Nhan Nguyen; Ruhua Jiang

Online service providers often use challenge questions (a.k.a. knowledge‐based authentication) to facilitate resetting of passwords or to provide an extra layer of security for authentication. While prior schemes explored both static and dynamic challenge questions to improve security, they did not systematically investigate the problem of designing challenge questions and its effect on user recall performance. As answering different styles of questions may require different amounts of cognitive effort and evoke different reactions among users, we argue that the style of challenge questions itself can have a significant effect on user recall performance and on the usability of such systems. To address this void and investigate the effect of question types on user performance, this paper explores location‐based challenge question generation schemes where different types of questions are generated based on users’ locations tracked by smartphones and presented to users. For evaluation, we deployed our location tracking application on users’ smartphones and conducted two real‐life studies using four different kinds of challenge questions. Each study was approximately 30 days long, and the studies had 14 and 15 users respectively. Our findings suggest that the question type can have a significant effect on user performance. Finally, as individual users may vary in terms of performance and recall rate, we investigate and present a Bayesian classifier based authentication algorithm that can authenticate legitimate users with high accuracy by leveraging individual response patterns while reducing the success rate of adversaries.
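A naive Bayes style decision over challenge responses might look like the following Python sketch. It is an assumption-laden illustration rather than the authors' classifier: the question type names, the per-type accuracy rates, and the independence assumption between answers are all hypothetical.

```python
def posterior_legitimate(responses, p_legit, p_adv, prior=0.5):
    """Posterior probability that the responder is the legitimate user.

    responses: list of (question_type, answered_correctly) pairs.
    p_legit / p_adv: per-question-type probability of a correct answer for the
    legitimate user and for an adversary (values strictly between 0 and 1).
    Answers are treated as conditionally independent (naive Bayes assumption).
    """
    like_legit, like_adv = prior, 1.0 - prior
    for qtype, correct in responses:
        like_legit *= p_legit[qtype] if correct else 1.0 - p_legit[qtype]
        like_adv *= p_adv[qtype] if correct else 1.0 - p_adv[qtype]
    return like_legit / (like_legit + like_adv)


# Hypothetical per-type accuracy rates learned from a user's past responses.
p_legit = {"recent": 0.9, "frequent": 0.8}
p_adv = {"recent": 0.2, "frequent": 0.4}

both_right = posterior_legitimate([("recent", True), ("frequent", True)], p_legit, p_adv)
both_wrong = posterior_legitimate([("recent", False), ("frequent", False)], p_legit, p_adv)
```

Keeping separate rates per user and per question type is what lets such a classifier tolerate a legitimate user who is consistently weak on one style of question.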

Military Communications Conference | 2011

PRONET: Network trust assessment based on incomplete provenance

Kannan Govindan; Xinlei Wang; Mohammad Maifi Hasan Khan; Gulustan Dogan; Kai Zeng; Gerald M. Powell; Ted Brown; Tarek F. Abdelzaher; Prasant Mohapatra

This paper presents ProNet, a tool for assessing network trust based on incomplete provenance. We consider a multihop scenario where a set of source nodes observe an event and disseminate their observations as an information item through a multihop path to the command center. Nodes are assumed to embed their provenance details in the information content. The provenance received at the command center may be incomplete because attackers drop provenance or because provenance is unavailable. We design ProNet, a tool located at the command center that acts on the received information item to determine the information trust, node-level trust, and sequence-level trust. ProNet consists of three steps. In the first step, it reconstructs the complete provenance details of the received information from the available provenance. In the second step, it employs a data classification scheme to classify the data into a good pool and a bad pool. In the third step, it employs pattern mining on the reconstructed provenance of the bad data pool to determine the frequently appearing nodes and node sequences. The frequency of appearance quantifies the trust level of nodes and node sequences. The quality/trust level of newly received information can then be determined based on the occurrences of these node/sequence patterns in its provenance data. We provide a detailed analysis of false positives and false negatives.
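The third step can be illustrated with a small Python sketch. It assumes that provenance has already been reconstructed into ordered node paths and that a bad pool has already been selected; it simply counts how often each node and each consecutive node pair appears in that pool. How ProNet maps such frequencies to trust values is not specified in the abstract, so no trust formula is shown, and the function and node names are hypothetical.

```python
from collections import Counter


def bad_pool_frequencies(bad_paths):
    """Return (node_freq, pair_freq): the fraction of bad-pool provenance
    paths that contain each node and each consecutive node pair."""
    nodes, pairs = Counter(), Counter()
    for path in bad_paths:
        nodes.update(set(path))                 # count each node once per path
        pairs.update(set(zip(path, path[1:])))  # consecutive hops in the path
    total = len(bad_paths)
    node_freq = {n: c / total for n, c in nodes.items()}
    pair_freq = {p: c / total for p, c in pairs.items()}
    return node_freq, pair_freq


# Hypothetical reconstructed provenance paths of items classified as bad.
bad_paths = [
    ["s1", "a", "b", "cc"],
    ["s2", "a", "b", "cc"],
    ["s3", "d", "b", "cc"],
]
node_freq, pair_freq = bad_pool_frequencies(bad_paths)
```

In this toy input node `b` appears in every bad path while `a` appears in two of three, so `b` would be the stronger candidate for a low trust level.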

International Conference on Cloud Computing | 2016

Modeling Interference for Apache Spark Jobs

Kewen Wang; Mohammad Maifi Hasan Khan; Nhan Nguyen; Swapna S. Gokhale

To maximize resource utilization and system throughput, hardware resources are often shared across multiple Apache Spark jobs through virtualization techniques in cloud platforms. However, while the performance of these jobs running in a virtualized environment can be negatively affected by interference caused by resource contention, it is nontrivial to predict the effect of interference on job performance in such settings, which is critical for efficient scheduling of such jobs and for performance troubleshooting. To address this challenge, in this paper, we develop analytical models to estimate the effect of interference among multiple Apache Spark jobs running concurrently on job execution time in a virtualized cloud environment. We evaluated the accuracy of our models using four real-life applications (PageRank, K-means, logistic regression, and word count) on a 6-node cluster while running up to four jobs concurrently. Our experimental results show that the models achieve high prediction accuracy: between 86% and 99% when four jobs run concurrently and all start simultaneously, and between 71% and 99% when four concurrent jobs start at different times.

ACM Transactions on Sensor Networks | 2014

Troubleshooting interactive complexity bugs in wireless sensor networks using data mining techniques

Mohammad Maifi Hasan Khan; Hieu Khac Le; Hossein Ahmadi; Tarek F. Abdelzaher; Jiawei Han

This article presents a tool for uncovering bugs due to interactive complexity in networked sensing applications. Such bugs are not localized to one component that is faulty, but rather result from complex and unexpected interactions between multiple often individually nonfaulty components. Moreover, the manifestations of these bugs are often not repeatable, making them particularly hard to find, as the particular sequence of events that invokes the bug may not be easy to reconstruct. Because of the distributed nature of failure scenarios, our tool looks for sequences of events that may be responsible for faulty behavior, as opposed to localized bugs such as a bad pointer in a module. We identified several challenges in applying discriminative sequence mining for root cause analysis when the system fails to perform as expected and presented our solutions to those challenges. We also present two alternative schemes, namely, two-stage mining and the progressive discriminative sequence mining to address the scalability challenge. An extensible framework is developed where a front-end collects runtime data logs of the system being debugged and an offline back-end uses frequent discriminative pattern mining to uncover likely causes of failure. We provided several case studies where we applied our tool successfully to troubleshoot the cause of the problem. We uncovered a kernel-level race condition bug in the LiteOS operating system and a protocol design bug in the directed diffusion protocol. We also presented a case study of debugging a multichannel MAC protocol that was found to exhibit corner cases of poor performance (worse than single-channel MAC). The tool helped to uncover event sequences that lead to a highly degraded mode of operation. Fixing the problem significantly improved the performance of the protocol. We also evaluated the extensions presented in this article. Finally, we provided a detailed analysis of tool overhead in terms of memory requirements and impact on the running application.

Human-centric Computing and Information Sciences | 2015

How does this message make you feel? A study of user perspectives on software update/warning message design

Michael Fagan; Mohammad Maifi Hasan Khan; Nhan Nguyen

Software update messages are commonly used to inform users about software updates, recent bug fixes, and various system vulnerabilities, and to suggest recommended actions (e.g., updating software). While various design features (e.g., update options, message layout, update message presentation) of these messages can influence the actions taken by users, no prior study can be found that investigated users' opinions regarding various design alternatives. To address this void, this paper focuses on identifying software update message design features (e.g., layout, color, content) that may affect users positively or negatively. To that end, we conducted a user study in which users were shown 13 software update messages along with 1 virus warning message. We collected responses from 155 users through an online survey. Participants gave a total of 809 positive comments and 866 negative comments, along with a ranking of each image in terms of perceived importance, noticeability, annoyance, and confusion. As many of the comments are repetitive and often contain multiple themes, we manually analyzed them and performed a bottom-up, inductive coding to identify and refine the underlying themes. Over multiple iterations, positive comments were grouped into 52 categories, which were subsequently grouped under four themes. Similarly, negative comments were first grouped into 38 categories, which were subsequently grouped under four themes. Based on our analysis, we present the list of design features that are found to be highly correlated with confusion, annoyance, noticeability, and importance, either positively or negatively.

ACM Transactions on Sensor Networks | 2015

Power-Based Diagnosis of Node Silence in Remote High-End Sensing Systems

Yong Yang; Lu Su; Mohammad Maifi Hasan Khan; Michael LeMay; Tarek F. Abdelzaher; Jiawei Han

Our prior work suggested the use of power traces of unresponsive sensor nodes to diagnose the cause of anomalous node silence, but that approach has limited scalability. To address this issue, we propose a new concept of power watermarking, a diagnostic service that actively produces unique power watermarks for each system state of interest so as to convey system information over power measurements. Failures of applications, hardware, or the watermark generator result in different watermark combinations or absence thereof. Experiments demonstrate high diagnostic accuracy and energy efficiency, even in the presence of multiple applications with similar natural power consumption patterns.

Journal of Systems and Software | 2015

A closed-loop context aware data acquisition and resource allocation framework for dynamic data driven applications systems (DDDAS) on the cloud

Nhan Nguyen; Mohammad Maifi Hasan Khan

A closed-loop integrated solution for maximizing the utility of information while minimizing data loss at the back-end. A centralized algorithm that attempts to maximize the overall quality of information for the whole network. A threshold based heuristic that helps system administrators to tune the algorithm. A proactive resource optimization framework that adaptively allocates resources.

Various dynamic data driven applications systems (DDDAS) such as hazard management, target tracking, and battlefield monitoring often leverage multiple heterogeneous sensors and generate huge volumes of data. Not surprisingly, researchers are investigating ways to support such applications on the cloud. However, in such applications, the importance of a subset of sensors may change quickly due to changes in the execution environment, which often requires adapting the sampling rates accordingly. Additionally, such variations in sampling rates can create significant load imbalance on back-end servers, leading to performance degradation. To address this, we investigate a closed-loop integrated solution as follows. First, we develop a centralized algorithm that attempts to maximize the overall quality of information for the whole network given the utility functions and the importance rankings of sensor nodes. Next, we present a threshold based heuristic that prevents omission of highly important nodes at critical times. Finally, we investigate a proactive resource optimization framework that adaptively allocates resources (e.g., servers) in response to changed sampling rates. Extensive evaluation on a cloud platform for various scenarios shows that our approach can quickly adapt sampling rates and reallocate resources in response to the changed importance of sensor nodes, minimizing data loss significantly.
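The interplay between importance-driven rate allocation and a threshold that protects highly important nodes can be sketched in Python as follows. This is a stand-in, not the paper's algorithm: the abstract does not give the utility functions, so the sketch uses plain proportional allocation of an integer sampling budget, with a reserved minimum rate for every sensor whose importance is at or above a threshold. All names and parameters are hypothetical.

```python
def allocate_rates(importance, budget, min_rate=1, threshold=8):
    """Split an integer sampling budget across sensors.

    importance: mapping of sensor name -> non-negative integer importance.
    Sensors at or above `threshold` first receive `min_rate` each, while the
    budget lasts, so they are never omitted; the remaining budget is shared in
    proportion to importance. Any remainder from integer rounding is left
    unallocated.
    """
    rates = {sensor: 0 for sensor in importance}
    for sensor, weight in importance.items():
        if weight >= threshold and budget >= min_rate:
            rates[sensor] += min_rate
            budget -= min_rate
    total = sum(importance.values())
    if total > 0:
        for sensor, weight in importance.items():
            rates[sensor] += budget * weight // total
    return rates


# Hypothetical importance rankings and a budget of 11 samples per second.
rates = allocate_rates({"a": 9, "b": 1}, budget=11)
```

With these inputs sensor `a` receives its reserved sample plus nine of the remaining ten, and sensor `b` receives one, so the total stays within the budget.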
