Malak Alshawabkeh
Northeastern University
Publication
Featured research published by Malak Alshawabkeh.
Operating Systems Review | 2011
Fatemeh Azmandian; Micha Moffie; Malak Alshawabkeh; Jennifer G. Dy; Javed A. Aslam; David R. Kaeli
As virtualization technology gains in popularity, so do attempts to compromise the security and integrity of virtualized computing resources. Anti-virus software and firewall programs are typically deployed in the guest virtual machine to detect malicious software. These security measures are effective in detecting known malware, but do little to protect against new variants of intrusions. Intrusion detection systems (IDSs) can be used to detect malicious behavior. Most intrusion detection systems for virtual execution environments track behavior at the application or operating system level, using virtualization as a means to isolate themselves from a compromised virtual machine. In this paper, we present a novel approach to intrusion detection of virtual server environments which utilizes only information available from the perspective of the virtual machine monitor (VMM). Such an IDS can harness the ability of the VMM to isolate and manage several virtual machines (VMs), making it possible to provide monitoring of intrusions at a common level across VMs. It also offers unique advantages over recent advances in intrusion detection for virtual machine environments. By working purely at the VMM-level, the IDS does not depend on structures or abstractions visible to the OS (e.g., file systems), which are susceptible to attacks and can be modified by malware to contain corrupted information (e.g., the Windows registry). In addition, being situated within the VMM provides ease of deployment as the IDS is not tied to a specific OS and can be deployed transparently below different operating systems. Due to the semantic gap between the information available to the VMM and the actual application behavior, we employ the power of data mining techniques to extract useful nuggets of knowledge from the raw, low-level architectural data. 
We show in this paper that by working entirely at the VMM-level, we are able to capture enough information to characterize normal executions and identify the presence of abnormal malicious behavior. Our experiments on more than 300 real-world malware samples and exploits illustrate that there is sufficient information embedded within the VMM-level data to allow accurate detection of malicious attacks, with an acceptable false alarm rate.
Architectural Support for Programming Languages and Operating Systems | 2010
Malak Alshawabkeh; Byunghyun Jang; David R. Kaeli
The Local Outlier Factor (LOF) is a powerful anomaly detection method used in machine learning and classification. The algorithm defines the notion of a local outlier, in which the degree to which an object is outlying depends on the density of its local neighborhood, and each object can be assigned an LOF that represents the likelihood of that object being an outlier. Although this concept of a local outlier is a useful one, computing LOF values for every data object requires a large number of k-nearest neighbor queries, and this computational overhead can limit the use of LOF. Given the growing popularity of Graphics Processing Units (GPUs) in general-purpose computing domains, and the availability of high-level programming languages designed specifically for general-purpose applications (e.g., CUDA), we look to apply this parallel computing approach to accelerate LOF. In this paper we explore how to utilize a CUDA-based GPU implementation of the k-nearest neighbor algorithm to accelerate LOF classification. We achieve more than a 100X speedup over a multi-threaded dual-core CPU implementation. We also consider the input data set size, the neighborhood size (i.e., the value of k) and the feature space dimension, and report on their impact on execution time.
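To make the LOF computation described in this abstract concrete, here is a minimal brute-force sketch in plain Python. It is not the paper's CUDA implementation; the function names `knn` and `lof_scores` are hypothetical, and the sketch assumes distinct points and truncates each neighborhood to exactly k points, ignoring ties at the k-distance. The nested distance computation in `knn` is the k-nearest neighbor step the abstract identifies as the bottleneck.

```python
import math

def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    """Compute the Local Outlier Factor of every point (brute force, O(n^2))."""
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]
```

Points inside a uniformly dense cluster score close to 1, while isolated points score well above 1.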
International Conference on Machine Learning and Applications | 2010
Malak Alshawabkeh; Micha Moffie; Fatemeh Azmandian; Javed A. Aslam; Jennifer G. Dy; David R. Kaeli
Virtualization is becoming an increasingly popular service hosting platform. Recently, intrusion detection systems (IDSs) which utilize virtualization have been introduced. This paper considers one particular challenge present in current virtualization-based IDSs: they are commonly faced with high-dimensional, imbalanced data. Improved feature selection methods are needed to achieve more accurate detection when presented with imbalanced data. These methods must select the right set of features, leading to fewer false alarms and higher correct detection rates. In this paper we propose a new Boosting-based feature selection method that evaluates the relative importance of individual features using the fractional absolute confidence that Boosting produces. Our approach accounts for the sample distributions by optimizing for the area under the Receiver Operating Characteristic (ROC) curve (i.e., the Area Under the Curve (AUC)). Empirical results on different commercial virtual appliances and malware indicate that proper input feature selection is key to an effective virtualization-based IDS that is lightweight and efficient.
International Conference on Machine Learning and Applications | 2011
Malak Alshawabkeh; Javed A. Aslam; Jennifer G. Dy; David R. Kaeli
Feature selection helps us to address problems possessing high dimensionality, retaining only those features that are most important for the classification task. However, traditional feature selection methods fail to account for imbalanced class distributions, leading to poor predictions for minority class samples. Recently, there has been a growing interest around the Area Under ROC curve (AUC) metric due to the fact that it can provide meaningful performance measures in the presence of imbalanced data. In this paper, we propose a new margin-based feature selection metric that defines the quality of a set of features by considering the maximized AUC margin it induces during the process of learning with boosting. Our algorithm measures the cumulative effect each feature has on the margin distribution associated with the weighted linear combination that boosting produces over the positive and the negative examples. Experiments on various real imbalanced data sets show the effectiveness of our algorithm when faced with selecting informative features from small data possessing skewed class distributions.
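As a small illustration of why AUC is informative under class imbalance, the sketch below computes AUC as the fraction of (positive, negative) pairs that a scorer ranks correctly. This is the standard pairwise definition of AUC, not the margin-based selection metric proposed in the paper; the function name `auc` and the 0/1 label encoding are assumptions made for the example.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Because every positive is compared with every negative, the result does not depend on how many examples each class contributes. A scorer that gives every example the same score gets 0.5 regardless of the class ratio, whereas its accuracy would simply track the majority class.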
International Conference on Tools with Artificial Intelligence | 2011
Malak Alshawabkeh; Javed A. Aslam; David R. Kaeli; Jennifer G. Dy
Intrusion detection systems (IDSs) are continuously evolving, with the goal of improving the security of computer infrastructures. However, one of the most significant challenges in this area is the poor detection rate caused by excessive features in data sets whose class distributions are imbalanced. Although feature selection methods are long established and promising, most of them fail to account for imbalanced class distributions, particularly for intrusion data, leading to poor predictions for minority class samples. In this paper, we propose a new feature selection algorithm to enhance the accuracy of IDSs for virtual server environments. Our algorithm assigns weights to subsets of features according to the maximized area under the ROC curve (AUC) margin each subset induces during the boosting process over the minority and the majority examples. The best subset of features is then selected by a greedy search strategy. The empirical experiments are carried out on multiple intrusion data sets using different commercial virtual appliances and real malware.
Proceedings of the First International Workshop on Secure and Resilient Architectures and Systems | 2012
Malak Alshawabkeh; David R. Kaeli; Javed A. Aslam; Jennifer G. Dy; Dana Schaa
Intrusion detection is a high-priority and challenging task in many technologies, particularly in virtualization technology. There is a need to safeguard these systems from known vulnerabilities and, at the same time, to detect new and previously unseen system abuses by developing more reliable and efficient intrusion detection systems. In this correspondence, we propose a machine learning based intrusion detection algorithm, built on Enhanced Boosting with Decision Stumps, that detects various categories of attacks using information embedded within the virtual machine monitor (VMM) level. In the algorithm, decision stumps are used as weak classifiers. Decision rules are provided for different types of features. By combining the weak classifiers for the heterogeneous mixture of feature types into a strong classifier, the relations between these features are handled naturally, without any forced conversions between them. Moreover, adjustable initial weights based on the area under the ROC curve (AUC) are adopted to make a tradeoff between the false-alarm and detection rates. Experimental results show that our algorithm has low computational complexity and low error rates when tested on real malware.
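The idea of combining decision stumps into a strong classifier can be sketched with textbook AdaBoost, shown below in plain Python. This is not the paper's Enhanced Boosting algorithm: it starts from uniform example weights rather than the AUC-based initial weights the abstract mentions, and it handles numeric features only. All function names are hypothetical, and labels are assumed to be -1 or +1.

```python
import math

def fit_stump(X, y, w):
    """Find the (error, feature, threshold, polarity) stump with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[f] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def stump_predict(stump, row):
    """Apply a single threshold rule to one example."""
    _, f, thr, pol = stump
    return pol if row[f] <= thr else -pol

def adaboost(X, y, rounds=10):
    """Train AdaBoost on labels in {-1, +1}; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n  # uniform initial weights (an assumption of this sketch)
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Re-weight: misclassified examples gain weight, then normalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1
```

Since each stump tests one feature against one threshold, features on different scales can be mixed without rescaling, which is one way to read the abstract's point about avoiding forced conversions between feature types.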
International Conference on Machine Learning and Applications | 2012
Malak Alshawabkeh; Alma Riska; Adnan Sahin; Motasem Awwad
In this paper, we develop an automated and adaptive framework that aims to move active data to high-performance storage tiers and inactive data to low-cost/high-capacity storage tiers by learning patterns of the storage workloads. The proposed framework is built on an efficient Markov chain correlation-based clustering (MCC) method, which can quickly predict or detect changes in the current workload based on what the system has experienced before. The workload data is first normalized, and Markov chains are constructed from the dynamics of the IO loads of the data storage units. Based on the correlation of the one-step Markov chain transition probabilities, the k-means method is employed to group the storage units that have similar behavior at each point. Such a framework can then easily be incorporated into various resource management policies that aim at enhancing performance, reliability, and availability. In particular, the predictive nature of the model makes a storage system both faster and lower-cost, because it uses high-performance tiers only when needed and low-cost/high-capacity tiers whenever possible.
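The first two steps described in this abstract, building a Markov chain from a normalized IO load series and comparing storage units by the correlation of their one-step transition probabilities, might look roughly like the sketch below. The abstract does not specify the normalization or the state definition, so min-max scaling, equal-width binning, the Pearson correlation, and the names `transition_matrix` and `correlation` are all assumptions of this example. The final k-means grouping step is omitted.

```python
import math

def transition_matrix(loads, n_states=3):
    """Estimate one-step transition probabilities from a load series.

    Loads are min-max normalised to [0, 1] and binned into n_states
    equal-width levels (both choices are assumptions of this sketch).
    """
    lo, hi = min(loads), max(loads)
    span = (hi - lo) or 1.0
    states = [min(int((x - lo) / span * n_states), n_states - 1) for x in loads]
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Unvisited states get a uniform row so every row is a valid distribution.
        matrix.append([c / total if total else 1.0 / n_states for c in row])
    return matrix

def correlation(m1, m2):
    """Pearson correlation between two flattened transition matrices."""
    a = [p for row in m1 for p in row]
    b = [p for row in m2 for p in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0
```

Two units whose loads alternate in the same way produce highly correlated matrices even when their absolute load levels differ, because each series is normalized before binning.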
Archive | 2009
Micha Moffie; David R. Kaeli; Aviram Cohen; Javed A. Aslam; Malak Alshawabkeh; Jennifer G. Dy; Fatemeh Azmandian
Archive | 2013
Sean C. Dolan; Dana Naamad; Alma Dimnaku; Malak Alshawabkeh; Adnan Sahin
Archive | 2018
Owen Martin; Malak Alshawabkeh; Hui Wang; Xiaomei Liu; Sean C. Dolan; Adnan Sahin