Mohiuddin Solaimani | Researchain

Publication

Featured research published by Mohiuddin Solaimani.

2014 IEEE Symposium on Computational Intelligence in Cyber Security (CICS) | 2014

Spark-based anomaly detection over multi-source VMware performance data in real-time

Mohiuddin Solaimani; Mohammed Iftekhar; Latifur Khan; Bhavani M. Thuraisingham; Joey Burton Ingram

Anomaly detection refers to identifying patterns in data that deviate from expected behavior. These non-conforming patterns are often termed outliers, malware, anomalies, or exceptions in different application domains. This paper presents a novel, generic, real-time distributed anomaly detection framework for multi-source stream data. As a case study, we detect anomalies in a multi-source VMware-based cloud data center. The framework continuously monitors VMware performance stream data (e.g., CPU load and memory usage). It collects these data simultaneously from all the VMware machines connected to the network. When it identifies abnormal behavior in the collected data, it notifies the resource manager to reschedule its resources dynamically. We use Apache Spark, a distributed framework, to process performance stream data and make predictions with minimal delay. Spark is chosen over traditional distributed frameworks (e.g., Hadoop, MapReduce, and Mahout), which are not ideal for stream data processing. We have implemented a flat incremental clustering algorithm to model benign characteristics in our distributed Spark-based framework. We compare the average per-tuple processing latency during clustering and prediction in Spark with that of Storm, another distributed framework for stream data processing. Experimentally, we find that Spark processes a tuple much faster than Storm on average.
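The abstract does not spell out the clustering algorithm, so the following is only a minimal single-machine sketch of what a flat incremental clustering model of benign behavior could look like. The class name `IncrementalClusterer`, the `radius` threshold, and the Euclidean distance are assumptions made for illustration, not the authors' implementation, and no Spark API is used.

```python
import math


class IncrementalClusterer:
    """Hypothetical flat incremental clustering model of benign data.

    A point joins the nearest cluster if it lies within `radius` of that
    cluster's centroid; otherwise it starts a new cluster.
    """

    def __init__(self, radius):
        self.radius = radius   # assumed distance threshold
        self.centroids = []    # one coordinate list per cluster
        self.counts = []       # number of points absorbed per cluster

    def _nearest(self, point):
        """Return (index, distance) of the closest centroid, or (None, inf)."""
        best_idx, best_dist = None, math.inf
        for i, centroid in enumerate(self.centroids):
            d = math.dist(point, centroid)
            if d < best_dist:
                best_idx, best_dist = i, d
        return best_idx, best_dist

    def learn(self, point):
        """Absorb one benign training point, updating the model in place."""
        idx, dist = self._nearest(point)
        if idx is None or dist > self.radius:
            self.centroids.append(list(point))
            self.counts.append(1)
            return
        # Running-mean update of the matched centroid.
        self.counts[idx] += 1
        n = self.counts[idx]
        centroid = self.centroids[idx]
        for j, x in enumerate(point):
            centroid[j] += (x - centroid[j]) / n

    def is_anomaly(self, point):
        """Flag a test point that is farther than `radius` from every cluster."""
        _, dist = self._nearest(point)
        return dist > self.radius


# Illustrative (made-up) normalized CPU load and memory usage pairs.
model = IncrementalClusterer(radius=0.2)
for sample in [(0.20, 0.30), (0.25, 0.35), (0.80, 0.70)]:
    model.learn(sample)
print(model.is_anomaly((0.22, 0.33)), model.is_anomaly((0.50, 0.99)))
```

Because each point updates the model in a single pass, a structure like this fits streaming input; in a distributed setting the per-batch updates would additionally have to be merged across workers, which this sketch leaves out.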

international conference on big data | 2014

Statistical technique for online anomaly detection using Spark over heterogeneous data from multi-source VMware performance data

Mohiuddin Solaimani; Mohammed Iftekhar; Latifur Khan; Bhavani M. Thuraisingham

Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected patterns. Depending on the domain, the non-conformant patterns are assigned various tags, e.g., anomalies, outliers, exceptions, and malware. Online anomaly detection aims to detect anomalies in data flowing in a streaming fashion. Such stream data is commonplace in today's cloud data centers, which house a large array of virtual machines (VMs) producing vast amounts of performance data in real time. A sophisticated detection mechanism will likely entail collating data from heterogeneous sources with diverse data formats and semantics. Therefore, detecting performance anomalies in this context requires a distributed framework with high throughput and low latency. Apache Spark is one such framework and represents the bleeding edge among its contemporaries. In this paper, we take up the challenge of anomaly detection in VMware-based cloud data centers. We employ a Chi-square-based statistical anomaly detection technique in Spark. We demonstrate how to take advantage of the high processing power of Spark to perform anomaly detection on heterogeneous data using statistical techniques. Our approach is designed to cope with the heterogeneity of input data streams, and the experiments we conducted testify to its efficacy in online anomaly detection.
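The abstract names a Chi-square-based technique without giving details. As a rough illustration only, the sketch below applies Pearson's chi-square statistic to compare the binned counts of a recent window against the counts expected under a benign distribution. The four-bin layout, the benign proportions, the window size, and the function names are assumptions; the 0.05 significance level is likewise an arbitrary choice for the example.

```python
# Critical value of the chi-square distribution with 3 degrees of freedom
# (4 bins - 1) at significance level 0.05.
CHI2_CRITICAL_DF3_P05 = 7.815


def chi_square_statistic(observed, expected):
    """Pearson's chi-square statistic: sum over bins of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def window_is_anomalous(observed, benign_proportions,
                        critical=CHI2_CRITICAL_DF3_P05):
    """Flag a window whose bin counts deviate too far from the benign model.

    `benign_proportions` is a hypothetical per-bin distribution learned from
    benign training data; expected counts are scaled to the window size.
    """
    n = sum(observed)
    expected = [p * n for p in benign_proportions]
    return chi_square_statistic(observed, expected) > critical


# Made-up example: CPU load bucketed into four ranges over a 20-sample window.
benign = [0.40, 0.40, 0.15, 0.05]
print(window_is_anomalous([9, 7, 3, 1], benign))
print(window_is_anomalous([2, 3, 5, 10], benign))
```

With such small expected counts the chi-square approximation is rough; a real deployment would use larger windows or merge sparse bins.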

information reuse and integration | 2014

Real-time anomaly detection over VMware performance data using storm

Mohiuddin Solaimani; Latifur Khan; Bhavani M. Thuraisingham

Anomaly detection is the identification of items or observations that deviate from an expected pattern in a dataset. This paper proposes a novel real-time anomaly detection framework for dynamic resource scheduling in a VMware-based cloud data center. The framework monitors VMware performance stream data (e.g., CPU load and memory usage). Hence, the framework needs to collect data continuously and make decisions with minimal delay. We use Apache Storm, a distributed framework, to handle performance stream data and make predictions with minimal delay. Storm is chosen over traditional distributed frameworks (e.g., Hadoop, MapReduce, and Mahout), which are better suited to batch processing. An incremental clustering algorithm that models benign characteristics is incorporated in our Storm-based framework. As the test stream arrives, if the model finds data that deviate from benign behavior, it treats them as anomalous. We show the effectiveness of our framework by providing real-time complex analytic functionality over stream data.

Software - Practice and Experience | 2016

Online anomaly detection for multi-source VMware using a distributed streaming framework

Mohiuddin Solaimani; Mohammed Iftekhar; Latifur Khan; Bhavani M. Thuraisingham; Joey Burton Ingram; Sadi Evren Seker

Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected patterns. Such non‐conformant patterns typically correspond to samples of interest and are assigned different labels in different domains, such as outliers, anomalies, exceptions, and malware. A daunting challenge is to detect anomalies in rapid, voluminous streams of data.

international conference on social computing | 2017

APART: Automatic Political Actor Recommendation in Real-time

Mohiuddin Solaimani; Sayeed Salam; Latifur Khan; Patrick T. Brandt; Vito D’Orazio

Extracting actor data from news reports is important when generating event data. Hand-coded dictionaries are used to code actors and actions. Manually updating dictionaries for new actors and roles is costly and there is no automated method. We propose a dynamic frequency-based actor ranking algorithm with partial string matching for new actor-role detection, based on similar actors in the CAMEO dictionary. This is compared to a graph-based weighted label propagation baseline method. Results show our method outperforms the alternatives.
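As a loose illustration of frequency-based ranking combined with partial string matching, the sketch below counts mentions of candidate names, skips names already covered by a dictionary, and merges partial mentions into the longest matching form. It is a simplification under stated assumptions: the token-subset matching rule, the function names, and the fictional names in the example are all hypothetical, and the dynamic (time-windowed) aspect and role detection described in the abstract are not modeled.

```python
from collections import Counter


def partial_match(a, b):
    """Assumed matching rule: one name's tokens are a subset of the other's."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    return tokens_a <= tokens_b or tokens_b <= tokens_a


def rank_new_actors(mentions, known_actors, top_k=3):
    """Rank candidate actors not yet in the dictionary by mention frequency."""
    counts = Counter()
    for name in mentions:
        # Names that partially match a dictionary entry are not new actors.
        if any(partial_match(name, known) for known in known_actors):
            continue
        # Merge with an existing candidate, keeping the longer surface form.
        for candidate in list(counts):
            if partial_match(name, candidate):
                longer = max(name, candidate, key=len)
                counts[longer] = counts.pop(candidate) + 1
                break
        else:
            counts[name] += 1
    return counts.most_common(top_k)


# Fictional names standing in for actors extracted from news reports.
mentions = ["Alice Moreno", "Moreno", "Ravi Patel", "Patel", "Patel", "Jo Lind"]
print(rank_new_actors(mentions, known_actors=["Alice Moreno"], top_k=2))
```

The highest-ranked candidates would then be proposed to a human coder as possible dictionary additions.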

intelligence and security informatics | 2016

Near real-time atrocity event coding

Mohiuddin Solaimani; Sayeed Salam; Ahmad M. Mustafa; Latifur Khan; Patrick T. Brandt; Bhavani M. Thuraisingham

In recent years, mass atrocities, terrorism, and political unrest have caused much human suffering. Thousands of innocent lives have been lost to these events. With the help of advanced technologies, we can now envision a tool that uses machine learning and natural language processing (NLP) techniques to warn of such events. Detecting atrocities demands structured event data that contain metadata with multiple fields and values (e.g., event date, victim, perpetrator). Traditionally, humans apply common sense to encode events from news stories, but this process is slow, expensive, and ambiguous. To accelerate it, we use machine coding to generate encoded events. In this paper, we develop a near-real-time supervised machine coding technique that uses an external knowledge base, WordNet, to generate structured events. We design a Spark-based distributed framework with a web scraper that periodically gathers news reports, processes them, and generates events. We use Spark to reduce the performance bottleneck of processing raw news text with CoreNLP.

international conference on data mining | 2013

Host-Based Anomaly Detection Using Learning Techniques

Ahmad M. Mustafa; Mohiuddin Solaimani; Latifur Khan; Ken Chiang; Joey Burton Ingram

Anomaly detection is a crucial part of computer security. This paper presents several host-based anomaly detection techniques. One technique uses clustering with a Markov network (CMN). In CMN, we first cluster the benign training data and then build a separate Markov network from each cluster to model the benign behavior. During testing, each Markov network calculates the probability of each testing instance. If the probability from multiple Markov networks is low, we classify the point as malicious. The paper also presents CMN with Outlying Subspace (CMN-OS). In CMN-OS, a training data set that consists of benign data and a few malicious samples is used to identify the outlying subspace, which serves as a lower-dimensional representation of the full-dimensional space. CMN then uses the new subspace to represent its training and testing data sets. Finally, the paper presents Clustered Label Propagation (CLP). CLP starts by clustering the benign and malicious training data. It then labels each cluster based on its central-most point. During testing, these points are added to the testing data as labeled points, and Label Propagation is used to label the testing data. We experimentally show that the CMN approach outperforms several other approaches and performs similarly to CMN-OS, and that it is less sensitive to noise than the other approaches.

international conference on big data | 2016