Mohamed Khalil-Hani | Researchain

Publication

Featured research published by Mohamed Khalil-Hani.

Future Generation Computer Systems | 2013

Biometric encryption based on a fuzzy vault scheme with a fast chaff generation algorithm

Mohamed Khalil-Hani; Muhammad Nadzir Marsono; Rabia Bakhteri

Fuzzy vault is a scheme that complements traditional cryptographic security systems by combining them with biometric authentication to overcome the security vulnerability inherent in cryptographic key storage. Biometric encryption systems based on the fuzzy vault scheme are suitable for stand-alone security and authentication devices in the form of a system-on-chip (SoC). However, the current fuzzy vault scheme has too many compute-intensive processes to make SoC implementation feasible. The most critical and compute-intensive function in the fuzzy vault scheme is chaff generation, which produces noise (chaff) points that hide the valid points inside the vault template. In this paper, we propose a new chaff generation algorithm that is computationally fast and, because it employs only simple arithmetic operations, viable for hardware acceleration. A complexity study shows that our algorithm has a complexity of O(n^2), a significant improvement over the existing method, which exhibits O(n^3) complexity. Our experimental results show that, to generate 500 chaff points, the proposed algorithm is more than 140 times faster than the existing Clancy's algorithm. With the new chaff generation algorithm, the fuzzy vault scheme becomes much more amenable to implementation in the resource-constrained environment of a system-on-chip.
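To make the role of chaff points concrete, here is a minimal sketch of vault locking. It is not the algorithm proposed in the paper, nor Clancy's algorithm: chaff is produced by naive rejection sampling, and the field size `PRIME`, the integer feature encoding, and the function names are all illustrative assumptions.

```python
import random

PRIME = 65537  # assumed field size, chosen only for illustration


def eval_poly(coeffs, x, p=PRIME):
    """Evaluate the polynomial c0 + c1*x + c2*x^2 + ... modulo p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def lock_vault(features, secret_coeffs, n_chaff, rng, p=PRIME):
    """Hide the genuine points (x, P(x)) among n_chaff random chaff points.

    A chaff point must not reuse an x value already in the vault and must
    not lie on the secret polynomial, otherwise it would act as a genuine point.
    """
    vault = {x: eval_poly(secret_coeffs, x, p) for x in features}
    target = len(vault) + n_chaff
    while len(vault) < target:
        x = rng.randrange(p)
        if x in vault:
            continue  # x collides with a genuine or earlier chaff point
        y = rng.randrange(p)
        if y == eval_poly(secret_coeffs, x, p):
            continue  # the point would lie on the secret polynomial
        vault[x] = y
    points = list(vault.items())
    rng.shuffle(points)  # remove any ordering that could reveal genuine points
    return points


features = [3, 17, 42, 100]   # hypothetical quantized biometric features
secret = [5, 7, 11]           # hypothetical key encoded as polynomial coefficients
vault_points = lock_vault(features, secret, n_chaff=20, rng=random.Random(0))
```

An attacker who sees only `vault_points` cannot tell which points lie on the secret polynomial without knowing enough genuine feature values. Practical schemes also enforce a minimum distance between points, which is where much of the computational cost of chaff generation tends to come from.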

International Conference on Intelligent Systems, Modelling and Simulation | 2010

Hardware Acceleration of OpenSSL Cryptographic Functions for High-Performance Internet Security

Mohamed Khalil-Hani; Vishnu P. Nambiar; Muhammad Nadzir Marsono

The Transport Layer Security (TLS) protocol is currently the predominant method of implementing Internet security. This paper proposes an FPGA-based embedded system integrating hardware that accelerates the cryptographic algorithms used in the SSL/TLS protocol. OpenSSL, an open-source implementation of the SSL v3 and TLS v1 protocols, is deployed in the proposed embedded system, which is powered by a Nios II embedded soft-core processor. The Nios2-Linux RTOS is used to provide Ethernet connectivity, multitasking, and support for the OpenSSL library. Key cipher functions used in SSL-driven connections, which include AES-256 symmetric encryption, SHA-2 hashing, and RSA-2048 public-key cryptography, are accelerated in hardware. The embedded cryptosystem is prototyped completely on an Altera Stratix II FPGA development board. Experimental results show significant improvements in the performance of SSL transactions when the proposed embedded cryptosystem is deployed in the networking system.

International Conference on High Performance Computing and Simulation | 2010

Securing cryptographic key with fuzzy vault based on a new chaff generation method

Mohamed Khalil-Hani; Rabia Bakhteri

A crucial issue in the design of a cryptographic system is the problem of key management. A state-of-the-art solution to this problem is to use bio-cryptosystems, in which cryptography is combined with biometrics. In this solution, the user's biometrics are used to protect the cryptographic key. A popular approach to the design of such bio-cryptosystems is the application of a fuzzy vault scheme. This so-called vault is a secure storage in which the key is hidden within the biometric data, mixed with meaningless chaff points. The most critical operation in the fuzzy vault scheme is the generation of these chaff points. Experiments show that this module is the most compute-intensive part of the whole system. This paper introduces a new chaff generation algorithm for the fuzzy vault in a bio-cryptosystem. The proposed algorithm, which is based on a circle packing mathematical theorem, is computationally less intensive than existing methods. Experimental results show that the proposed algorithm is around 100 times faster than existing methods when generating 200 or more chaff points, and is therefore suitable for a real-time embedded system implementation.

Neurocomputing | 2016

Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems

Shan Sung Liew; Mohamed Khalil-Hani; Rabia Bakhteri

This paper focuses on the enhancement of the generalization ability and training stability of deep neural networks (DNNs). New activation functions that we call bounded rectified linear unit (ReLU), bounded leaky ReLU, and bounded bi-firing are proposed. These activation functions are defined based on the desired properties of the universal approximation theorem (UAT). Additional work providing a new set of coefficient values for the scaled hyperbolic tangent function is also presented. This work results in improved classification performance and training stability in DNNs. Experiments using the multilayer perceptron (MLP) and convolutional neural network (CNN) models have shown that the proposed activation functions outperform their respective original forms with regard to classification accuracy and numerical stability. Tests on the MNIST, mnist-rot-bg-img handwritten digit, and AR Purdue face databases show that significant improvements of 17.31%, 9.19%, and 74.99% can be achieved in terms of the testing misclassification error rates (MCRs), applying both mean squared error (MSE) and cross-entropy (CE) loss functions. This is done without sacrificing computational efficiency. With the MNIST dataset, bounding the output of an activation function results in a 78.58% reduction in numerical instability, and with the mnist-rot-bg-img and AR Purdue databases the problem is completely eliminated. Thus, this work has demonstrated the significance of bounding an activation function in helping to alleviate the training instability problem when training a DNN model (particularly a CNN).
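The idea of bounding an activation function can be illustrated with a short sketch. The abstract does not give the exact definitions, so the forms below are assumptions: a ReLU clipped at a hypothetical upper bound `upper`, and a leaky ReLU clipped from above only, with an assumed negative slope of 0.01.

```python
def relu(x):
    """Standard ReLU: unbounded above, so activations can grow without limit."""
    return max(0.0, x)


def bounded_relu(x, upper=1.0):
    """ReLU clipped to [0, upper]; 'upper' is an assumed, illustrative bound."""
    return min(max(0.0, x), upper)


def bounded_leaky_relu(x, upper=1.0, slope=0.01):
    """Leaky ReLU clipped above at 'upper' (assumed form, not from the paper)."""
    leaky = x if x > 0 else slope * x
    return min(leaky, upper)


inputs = [-2.0, 0.5, 3.0]
outputs = [bounded_relu(v) for v in inputs]
```

Clipping the output keeps large pre-activations from propagating arbitrarily large values through the network, which is one plausible reason a bounded function would reduce numerical instability during training.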

Computer Methods and Programs in Biomedicine | 2016

Paroxysmal atrial fibrillation prediction method with shorter HRV sequences

K.H. Boon; Mohamed Khalil-Hani; Mb Malarvili; C.W. Sia

This paper proposes a method that predicts the onset of paroxysmal atrial fibrillation (PAF) using heart rate variability (HRV) segments that are shorter than those applied in existing methods, while maintaining good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of stabilizing the heart electrically and preventing the onset of atrial arrhythmias with different pacing techniques. We investigate the effect of HRV features extracted from different lengths of HRV segments prior to PAF onset with the proposed PAF prediction method. The pre-processing stage of the predictor includes QRS detection, HRV quantification and ectopic beat correction. Time-domain, frequency-domain, non-linear and bispectrum features are then extracted from the quantified HRV. In the feature selection stage, the HRV feature set and classifier parameters are optimized simultaneously using an optimization procedure based on a genetic algorithm (GA). Both the full feature set and the statistically significant feature subset are optimized by the GA. For the statistically significant feature subset, the Mann-Whitney U test is used to filter out features that are not statistically significant at the 20% significance level. The final stage of our predictor is a classifier based on a support vector machine (SVM). A 10-fold cross-validation is applied in performance evaluation, and the proposed method achieves 79.3% prediction accuracy using 15-minute HRV segments. This accuracy is comparable to that achieved by existing methods that use 30-minute HRV segments, most of which achieve an accuracy of around 80%. More importantly, our method significantly outperforms those that applied segments shorter than 30 minutes.
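As an illustration of the time-domain part of the feature extraction stage, the sketch below computes a few standard HRV statistics from a list of RR intervals. The abstract does not say which time-domain features the authors used, so this particular selection (mean RR, SDNN, RMSSD, pNN50) and the function name are assumptions.

```python
import math
import statistics


def hrv_time_features(rr_ms):
    """Compute common time-domain HRV features from RR intervals in milliseconds."""
    # Successive differences between adjacent RR intervals
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": statistics.fmean(rr_ms),
        # SDNN: sample standard deviation of the RR intervals
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of the successive differences
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        # pNN50: percentage of successive differences larger than 50 ms
        "pnn50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


rr = [800, 810, 790, 850, 800]  # hypothetical RR intervals, not real patient data
features = hrv_time_features(rr)
```

In a pipeline like the one described, a feature vector of this kind would be computed per HRV segment after ectopic beat correction and then passed, together with the other feature groups, to feature selection and the classifier.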

International Visual Informatics Conference | 2011

Character recognition of license plate number using convolutional neural network

Syafeeza Ahmad Radzi; Mohamed Khalil-Hani

This paper presents the recognition of machine-printed characters acquired from license plates using a convolutional neural network (CNN). A CNN is a special type of feed-forward multilayer perceptron, trained in supervised mode using a gradient descent backpropagation learning algorithm, that enables automated feature extraction. Common methods usually apply a combination of a handcrafted feature extractor and a trainable classifier, which may lead to sub-optimal results and low accuracy. CNNs have been shown to achieve state-of-the-art results in tasks such as optical character recognition, generic object recognition, real-time face detection and pose estimation, speech recognition, and license plate recognition. A CNN combines three architectural concepts, namely local receptive fields, shared weights and subsampling. The combination of these concepts and the optimization method resulted in an accuracy of around 98%. In this paper, the method implemented to increase the performance of character recognition using a CNN is proposed and discussed.
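The three architectural concepts named above can be shown in a few lines of plain Python. This is a generic sketch, not the network from the paper: the kernel, the image, and the use of 2x2 average pooling for subsampling are illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over the image (no padding, stride 1).

    Each output value depends only on a small window of the input
    (local receptive field), and the same kernel values are reused at
    every position (shared weights).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def subsample2x2(fmap):
    """Halve the resolution of a feature map by averaging 2x2 blocks."""
    return [
        [
            (fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 4.0
            for j in range(0, len(fmap[0]) - 1, 2)
        ]
        for i in range(0, len(fmap) - 1, 2)
    ]


image = [[5 * r + c for c in range(5)] for r in range(5)]  # toy 5x5 "image"
kernel = [[1, 0], [0, -1]]                                 # toy diagonal-difference kernel
feature_map = conv2d_valid(image, kernel)                  # 4x4 feature map
pooled = subsample2x2(feature_map)                         # 2x2 after subsampling
```

In a trained CNN the kernel values are learned by backpropagation rather than fixed by hand, which is what removes the need for a handcrafted feature extractor.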

Neurocomputing | 2016

An optimized second order stochastic learning algorithm for neural network training

Shan Sung Liew; Mohamed Khalil-Hani; Rabia Bakhteri

This paper proposes an improved stochastic second order learning algorithm for supervised neural network training. The proposed algorithm, named bounded stochastic diagonal Levenberg-Marquardt (B-SDLM), utilizes both gradient and curvature information to achieve fast convergence while requiring only minimal computational overhead compared with the stochastic gradient descent (SGD) method. B-SDLM has only a single hyperparameter, as opposed to most other learning algorithms, which suffer from the hyperparameter overfitting problem because they have more hyperparameters to tune. Experiments using the multilayer perceptron (MLP) and convolutional neural network (CNN) models have shown that B-SDLM outperforms other learning algorithms with regard to classification accuracy and computational efficiency (about 5.3% faster than SGD on the mnist-rot-bg-img database). It can classify all testing samples correctly in the face recognition case study based on the AR Purdue database. In addition, experiments on handwritten digit classification case studies show that significant improvements of 19.6% on the MNIST database and 17.5% on the mnist-rot-bg-img database can be achieved in terms of the testing misclassification error rates (MCRs). The computationally expensive Hessian calculations are kept to a minimum by using just 0.05% of the training samples in the estimation, or by updating the learning rates once every two training epochs, while maintaining or even achieving lower testing MCRs. It is also shown that B-SDLM works well in the mini-batch learning mode, and we are able to achieve a 3.32× speedup when deploying the proposed algorithm in a distributed learning environment with a quad-core processor.
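For context, the sketch below shows one update step of the classic stochastic diagonal Levenberg-Marquardt rule, in which each weight gets its own learning rate scaled by an estimate of the corresponding diagonal Hessian term. It is not B-SDLM: the bounding mechanism is not described in the abstract and is omitted here, and the parameter names `global_lr` and `mu` are assumptions.

```python
def sdlm_step(weights, grads, hess_diag, global_lr=0.01, mu=0.1):
    """One classic SDLM update (the unbounded rule, not the paper's B-SDLM).

    Per-weight learning rate: global_lr / (mu + h_kk), where h_kk is a
    non-negative estimate of the k-th diagonal Hessian term and mu > 0
    keeps the rate finite where the curvature estimate is near zero.
    """
    return [
        w - (global_lr / (mu + h)) * g
        for w, g, h in zip(weights, grads, hess_diag)
    ]


# Two weights with equal gradients but different curvature estimates:
# the weight in the flatter direction (smaller h) takes the larger step.
new_weights = sdlm_step([1.0, 1.0], [1.0, 1.0], [0.9, 9.9])
```

Because the curvature estimates only rescale the gradient per weight, the extra cost over SGD comes mainly from estimating `hess_diag`, which is consistent with the abstract's emphasis on estimating it from a small fraction of the training samples.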

Neurocomputing | 2014

Hardware implementation of evolvable block-based neural networks utilizing a cost efficient sigmoid-like activation function

Vishnu P. Nambiar; Mohamed Khalil-Hani; Riadh Sahnoun; Muhammad Nadzir Marsono

This paper presents the hardware implementation of an evolvable block-based neural network that utilizes a novel and cost-efficient sigmoid-like activation function. Evolvable block-based neural networks (BbNNs) feature simultaneous optimization of structure, and viable implementation in reconfigurable digital hardware such as field programmable gate arrays (FPGAs). Efficient hardware implementation of BbNN structures is the primary goal of this paper. Various aspects of BbNN modeling and design considerations are presented. The neuron blocks are designed with a properly described methodology, using only a single multiplier each, and implement a cost-efficient sigmoid-like activation function. A novel method of reusing the multiplier to smoothly approximate a hyperbolic tangent (tanh) function, used as the activation function for the neuron blocks, is also presented. This is an important contribution, because a sigmoid-like activation function is provided at almost no additional cost. The neuron blocks are very cost-efficient in terms of logic utilization when compared to previous work. The BbNN is designed as a system-on-chip (SoC), and is functionally verified and tested on several case studies. The system performance allows real-time classification, and executes up to 410× faster than embedded software.

Computing | 2013

HW/SW co-design of reconfigurable hardware-based genetic algorithm in FPGAs applicable to a variety of problems

Vishnu P. Nambiar; Sathivellu Balakrishnan; Mohamed Khalil-Hani; Muhammad Nadzir Marsono

This paper describes the implementation of a reconfigurable hardware-based genetic algorithm (HGA) accelerator using the hardware-software (HW/SW) co-design methodology. This HGA is coupled with a unique true random number generator (TRNG) that extracts random jitter from a phase-locked loop (PLL) to ensure proper GA operation. It is then applied and benchmarked with several case studies, which include the optimization of a simple fitness function, a constrained Michalewicz function, and the tuning of parameters in finger-vein biometrics. A HGA solution is necessary in systems that demand high performance during the optimization process. However, implementations that are completely designed in hardware result in a very rigid architecture, making it difficult to reconfigure the system for use in different applications. This paper aims to solve this issue by proposing a HGA design that provides reconfigurability and flexibility by moving problem-dependent processes into software. The prototyping platform used is an Altera Stratix II EP2S60 FPGA prototyping board with a clock frequency of 50 MHz. The HW/SW co-design technique is applied, and system partitioning is done based on aspects such as system constraints, operational intensity, process sequencing, hardware logic utilization, and reconfigurability. Experimental results show that the proposed HGA outperforms equivalent software implementations, built with an open-source C++ GA component library (GAlib) and running on the same prototyping platform, by up to 102 times. In the final case study, the application of the proposed HGA to tunable parameter optimization in finger-vein biometrics improved the matching rate, reducing the equal error rate (EER) from 1.004% to 0.101%.

International Conference on Computer Engineering and Technology | 2009

An AES Tightly Coupled Hardware Accelerator in an FPGA-based Embedded Processor Core

Arif Irwansyah; Vishnu P. Nambiar; Mohamed Khalil-Hani

This paper presents the implementation of a tightly coupled hardware architectural enhancement to the Altera FPGA-based Nios II embedded processor. The goal is to accelerate Advanced Encryption Standard (AES) operations with 128-, 192- and 256-bit keys, for application in a high-performance embedded system implementing symmetric key cryptography. The concept is to augment the embedded processor with a new custom instruction for encryption and decryption operations. To show the effectiveness of the tightly coupled hardware implementation over a coprocessor-based approach, we have also realized the design as a coprocessor using the same AES core. Experimental results show that, for encryption or decryption operations, the implementation with custom instructions and tightly coupled hardware is about 35% faster than the coprocessor-based hardware.
