Wael Khreich

École de technologie supérieure

Publication

Featured research published by Wael Khreich.

Information Sciences | 2012

A survey of techniques for incremental learning of HMM parameters

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

The performance of Hidden Markov Models (HMMs) targeted for complex real-world applications is often degraded because they are designed a priori using limited training data and prior knowledge, and because the classification environment changes during operations. Incremental learning of new data sequences allows HMM parameters to be adapted as new data becomes available, without having to retrain from the start on all accumulated training data. This paper presents a survey of techniques found in the literature that are suitable for incremental learning of HMM parameters. These techniques are classified according to the objective function, optimization technique and target application, involving block-wise and symbol-wise learning of parameters. Convergence properties of these techniques are presented along with an analysis of time and memory complexity. In addition, the challenges faced when these techniques are applied to incremental learning are assessed for scenarios in which the new training data is either limited or abundant. While the convergence rate and resource requirements are critical factors when incremental learning is performed through one pass over an abundant stream of data, effective stopping criteria and management of validation sets are important when learning is performed through several iterations over limited data. In both cases, managing the learning rate to integrate pre-existing knowledge and new data is crucial for maintaining a high level of performance. Finally, this paper underscores the need for empirical benchmarking studies among techniques presented in the literature, and proposes several evaluation criteria based on non-parametric statistical testing to facilitate the selection of techniques given a particular application domain.

Pattern Recognition | 2012

Adaptive ROC-based ensembles of HMMs applied to anomaly detection

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

Hidden Markov models (HMMs) have been successfully applied in many intrusion detection applications, including anomaly detection from sequences of operating system calls. In practice, anomaly detection systems (ADSs) based on HMMs typically generate false alarms because they are designed using a limited amount of representative training data. Since new data may become available over time, an important feature of an ADS is the ability to accommodate newly acquired data incrementally, after it has originally been trained and deployed for operations. In this paper, a system based on the receiver operating characteristic (ROC) is proposed to efficiently adapt ensembles of HMMs (EoHMMs) in response to new data, according to a learn-and-combine approach. When a new block of training data becomes available, a pool of base HMMs is generated from the data using a different number of HMM states and random initializations. The responses from the newly trained HMMs are then combined with those of the previously trained HMMs in ROC space using a novel incremental Boolean combination (incrBC) technique. Finally, specialized algorithms for model management select a diversified EoHMM from the pool and adapt Boolean fusion functions and thresholds for improved performance, while pruning redundant base HMMs. The proposed system is capable of changing the desired operating point during operations, and this point can be adjusted to changes in prior probabilities and costs of errors. Computer simulations conducted on synthetic and real-world host-based intrusion detection data indicate that the proposed system can achieve a significantly higher level of performance than when parameters of a single best HMM are estimated, at each learning stage, using reference batch and incremental learning techniques. It also outperforms the learn-and-combine approaches using static fusion functions (e.g., majority voting). Over time, the proposed ensemble selection algorithms form compact EoHMMs, while maintaining or improving system accuracy. Pruning prevents the pool size from increasing indefinitely, thereby reducing the storage space needed for HMM parameters without negatively affecting the overall EoHMM performance. Although applied here to HMM-based ADSs, the proposed approach is general and can be employed for a wide range of classifiers and detection applications.

Pattern Recognition Letters | 2010

On the memory complexity of the forward-backward algorithm

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

The Forward-Backward (FB) algorithm forms the basis for estimation of Hidden Markov Model (HMM) parameters using the Baum-Welch technique. It is, however, known to be prohibitively costly when estimation is performed from long observation sequences. Several alternatives have been proposed in the literature to reduce the memory complexity of FB at the expense of increased time complexity. In this paper, a novel variation of the FB algorithm, called the Efficient Forward Filtering Backward Smoothing (EFFBS) algorithm, is proposed to reduce the memory complexity without the computational overhead. Given an HMM with N states and an observation sequence of length T, both the FB and EFFBS algorithms have the same time complexity, O(N^2 T). Nevertheless, FB has a memory complexity of O(NT), while EFFBS has a memory complexity that is independent of T, O(N). EFFBS requires fewer resources than FB, yet provides the same results.
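The complexity figures quoted above can be made concrete with a small sketch. The Python function below is not EFFBS, whose details the abstract does not give. It is only the standard scaled forward (filtering) pass, written to show that this half of FB needs O(N) working memory and O(N^2 T) time when only the current forward vector is kept. The names `forward_log_likelihood`, `pi`, `A`, `B` and `obs` are illustrative assumptions, not taken from the paper.

```python
import math


def forward_log_likelihood(pi, A, B, obs):
    """Return log P(obs | model) using the scaled forward pass.

    pi  : initial state probabilities, length N
    A   : N x N transition matrix, A[i][j] = P(next state j | state i)
    B   : N x M emission matrix, B[i][k] = P(symbol k | state i)
    obs : non-empty sequence of symbol indices, length T

    Only one length-N vector is stored at a time, so working memory is
    O(N) regardless of T. Standard FB stores all T forward vectors for
    the backward smoothing step, which is where O(NT) comes from.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # O(N^2) update: propagate through A, then weight by emission.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)           # equals P(o_t | o_1..o_{t-1})
        log_lik += math.log(scale)   # raises ValueError if obs is impossible
        alpha = [a / scale for a in alpha]
    return log_lik
```

Rescaling the vector at every step avoids numerical underflow on long sequences, which is the regime the abstract is concerned with.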

International Conference on Communications | 2009

Combining Hidden Markov Models for Improved Anomaly Detection

Wael Khreich; Eric Granger; Robert Sabourin; Ali Miri

In host-based intrusion detection systems (HIDS), anomaly detection involves monitoring for significant deviations from normal system behavior. Hidden Markov Models (HMMs) have been shown to provide a high level of performance for detecting anomalies in sequences of system calls to the operating system kernel. Although the number of hidden states is a critical parameter for HMM performance, it is often chosen heuristically or empirically, by selecting the single value that provides the best performance on training data. However, this single best HMM does not typically provide a high level of performance over the entire detection space. This paper presents a multiple-HMMs approach, where each HMM is trained using a different number of hidden states, and where HMM responses are combined in the Receiver Operating Characteristic (ROC) space according to the Maximum Realizable ROC (MRROC) technique. The performance of this approach compares favorably with that of a single best HMM and with a traditional sequence matching technique called STIDE, using different synthetic HIDS data sets. Results indicate that this approach provides a higher level of performance over a wide range of training set sizes with various alphabet sizes and irregularity indices, and different anomaly sizes, without significant computational or storage overhead.
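As a rough illustration of combining detectors in ROC space, the sketch below computes the upper convex hull of a set of (false positive rate, true positive rate) operating points, which is the usual way a maximum realizable ROC is described. It assumes each HMM and threshold has already been reduced to one such point. The function names are hypothetical, and this is not the authors' implementation.

```python
def _cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Return the upper convex hull of ROC points, left to right.

    points : iterable of (fpr, tpr) tuples, one per detector/threshold.
    The trivial detectors (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line
        # joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def mixing_probability(left, right, target_fpr):
    """Probability of using `right` (else `left`) to reach target_fpr.

    Operating points on a hull segment are realized by randomly choosing
    between the two detectors at its ends for each decision.
    """
    return (target_fpr - left[0]) / (right[0] - left[0])
```

Any detector whose point falls below the hull is dominated: some mixture of the hull detectors gives a higher detection rate at the same false alarm rate.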

International Journal of Biometrics | 2012

Fusion of biometric systems using Boolean combination: an application to iris-based authentication

Eric Granger; Wael Khreich; Robert Sabourin; Dmitry O. Gorodnichy

To improve accuracy and reliability, Boolean combination (BC) can efficiently integrate the responses of multiple biometric systems in the ROC space. However, BC techniques assume that recognition systems are conditionally independent and that their ROC curves are convex. These assumptions are rarely valid in practice, where systems face complex environments and are designed using limited enrollment data. In recent research, the authors have introduced an Iterative BC (IBC) technique that applies all Boolean functions iteratively, without prior assumptions. In this paper, IBC is considered for fusion of different commercial biometric systems at the decision level. Performance of IBC is assessed for biometric authentication applications in which the operational responses of unimodal iris-based systems are combined. Experiments performed with four different commercial systems using anonymised data collected by the Canada Border Services Agency indicate that IBC fusion with interpolation can significantly outperform related BC techniques and individual systems.

Computational Intelligence and Security | 2009

A comparison of techniques for on-line incremental learning of HMM parameters in anomaly detection

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

Hidden Markov Models (HMMs) have been shown to provide a high level of performance for detecting anomalies in intrusion detection systems. Since incomplete training data is always employed in practice, and environments being monitored are susceptible to changes, a system for anomaly detection should update its HMM parameters in response to new training data from the environment. Several techniques have been proposed in the literature for on-line learning of HMM parameters. However, the theoretical convergence of these algorithms is based on an infinite stream of data for optimal performance. When learning sequences with a finite length, on-line incremental versions of these algorithms can improve discrimination by allowing for convergence over several training iterations. In this paper, the performance of these techniques is compared for learning new sequences of training data in host-based intrusion detection. The discrimination of HMMs trained with different techniques is assessed from data corresponding to sequences of system calls to the operating system kernel. In addition, the resource requirements are assessed through an analysis of time and memory complexity. Results suggest that the techniques for on-line incremental learning of HMM parameters can provide a higher level of discrimination than those for on-line learning, yet require significantly fewer resources than batch training. On-line incremental learning techniques may provide a promising solution for adaptive intrusion detection systems.

International Conference on Pattern Recognition | 2010

Boolean Combination of Classifiers in the ROC Space

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

Using Boolean AND and OR functions to combine the responses of multiple one- or two-class classifiers in the ROC space may significantly improve performance of a detection system over a single best classifier. However, techniques found in the literature assume that the classifiers are conditionally independent, and that their ROC curves are convex. These assumptions are not valid in most real-world applications, where classifiers are designed using limited and imbalanced training data. A new Iterative Boolean Combination (IBC) technique applies all Boolean functions to combine the ROC curves produced by multiple classifiers without prior assumptions, and its time complexity is linear in the number of classifiers. The results of computer simulations conducted on synthetic and real-world host-based intrusion detection data indicate that combining the responses from multiple HMMs with IBC can achieve a significantly higher level of performance than with the AND and OR combinations, especially when training data is limited and imbalanced.
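The basic building block described above, fusing two thresholded detectors with AND or OR and locating the result in ROC space, can be sketched as follows. This is a minimal illustration and not the IBC algorithm itself, which applies all Boolean functions iteratively across thresholds and classifiers. The function names, scores and thresholds are hypothetical, and a score at or above the threshold is assumed to mean "positive".

```python
def roc_point(decisions, labels):
    """Return (fpr, tpr) for crisp decisions against 0/1 ground truth.

    Assumes labels contain at least one positive and one negative.
    """
    tp = sum(1 for d, y in zip(decisions, labels) if d and y)
    fp = sum(1 for d, y in zip(decisions, labels) if d and not y)
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    return fp / neg, tp / pos


def boolean_combine(scores_a, thr_a, scores_b, thr_b, op):
    """Threshold two detectors' scores and fuse the decisions.

    op : "and" flags a sample only if both detectors do;
         "or" flags it if either detector does.
    """
    dec_a = [s >= thr_a for s in scores_a]
    dec_b = [s >= thr_b for s in scores_b]
    if op == "and":
        return [a and b for a, b in zip(dec_a, dec_b)]
    if op == "or":
        return [a or b for a, b in zip(dec_a, dec_b)]
    raise ValueError(f"unsupported Boolean function: {op!r}")
```

On validation data, AND tends to lower the false positive rate at the cost of detections and OR does the opposite. Sweeping both thresholds and keeping the non-dominated points is how such combinations trace out a new ROC curve.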

International Conference on Multiple Classifier Systems | 2011

Incremental Boolean combination of classifiers

Wael Khreich; Eric Granger; Ali Miri; Robert Sabourin

The incremental Boolean combination (incrBC) technique is a new learn-and-combine approach proposed to adapt ensemble-based pattern classification systems over time, in response to new data acquired during operations. When a new block of training data becomes available, this technique generates a diversified pool of base classifiers from the data by varying training hyperparameters and random initializations. The responses of these classifiers are then combined with those of previously trained classifiers through Boolean combination in the ROC space. Through this process, an ensemble is selected from the pool, where Boolean fusion functions and thresholds are adapted for improved accuracy, while redundant base classifiers are pruned. Results of computer simulations conducted using Hidden Markov Models (HMMs) on synthetic and real-world host-based intrusion detection data indicate that incrBC can sustain a significantly higher level of accuracy than when the parameters of a single best HMM are re-estimated for each new block of data, using reference batch and incremental learning techniques. It also outperforms static fusion techniques such as majority voting for combining the responses of new and previously generated pools of HMMs. Pruning prevents pool sizes from increasing indefinitely over time, without adversely affecting the overall ensemble performance.

2011 IEEE Workshop on Computational Intelligence in Biometrics and Identity Management (CIBIM) | 2011

Exploring the upper bound performance limit of iris biometrics using score calibration and fusion

Dmitry O. Gorodnichy; Elan Dubrofsky; Richard Hoshino; Wael Khreich; Eric Granger; Robert Sabourin

Researchers now acknowledge that the ultimate goal for biometric technologies to be error-free may never be achieved for any biometric modality. The key interest for any biometric modality is therefore to know its current performance limits. For the iris modality, which is intensively used for trusted traveller programs in many countries, the question of iris recognition limitations is of particular importance, as it affects the security risk mitigation strategies employed by the programs. In this paper, we provide the answer to this question, based on the recent large-scale evaluations of state-of-the-art iris biometrics systems conducted by the National Institute of Standards and Technology (NIST) and the Canada Border Services Agency (CBSA), and on two performance-improving post-processing methods developed by the CBSA and its academic partners: one based on score recalibration and the other based on fusion of decisions from multiple systems. The paper places particular emphasis on the description of datasets used in iris evaluations and the presentation of the new large-scale iris dataset created for this purpose at the CBSA. The importance of proper evaluation metrics and methodologies used in iris evaluations, including the subject-based analysis, is discussed.

Canadian Conference on Electrical and Computer Engineering | 2016

A novel approach in household electricity consumption forecasting

Assad Sahebalam; Soosan Beheshti; Wael Khreich; Edward W. Nidoy

In this paper, we devise a comprehensive model to forecast the energy consumption of bulk energy consumers. The behavior of bulk energy consumers varies, and each consumer has its own consumption pattern; therefore, we segment the bulk energy consumers into two clusters: low-consumption and high-consumption. Individual predictive models are built for each segment. Box-Jenkins and regression models are utilized to forecast energy consumption. We use the time series to forecast energy usage and, as a novel method, we combine the time series results with weather data to form our regression model. Our results show a decrease in error and an extension of the forecasting period. We use the Akaike Information Criterion (AIC) for model selection among a finite set of models.
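For readers unfamiliar with AIC-based selection, the sketch below shows the standard definition, AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L the maximized likelihood, and picks the candidate with the lowest value. The model names and numbers in the usage note are made up for illustration; the abstract does not report the candidate models or their likelihoods.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name of the candidate model with the lowest AIC.

    candidates : dict mapping a model name to a tuple
                 (maximized log-likelihood, number of fitted parameters).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

With hypothetical candidates `{"small": (-10.0, 3), "large": (-9.5, 5)}`, the larger model fits slightly better but pays a penalty for two extra parameters, so `select_by_aic` returns `"small"`.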
