Mohamed Bader-El-Den | Researchain

Publication

Featured research published by Mohamed Bader-El-Den.

International Conference on Neural Information Processing | 2012

GARF: towards self-optimised random forests

Mohamed Bader-El-Den; Mohamed Medhat Gaber

Ensemble learning is a machine learning approach in which a number of classifiers vote to identify the class label of an unlabelled instance. Random Forests (RF) is an ensemble classification approach that has demonstrated high accuracy. However, most of the commonly used selection methods are static. Motivated by the idea of a self-optimised RF capable of dynamically changing the trees in the forest, this study uses a genetic algorithm (GA) approach to further enhance the accuracy of RF. The approach is termed Genetic Algorithm based RF (GARF). Our extensive experimental study shows that RF performance can be boosted using the GA approach.
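The idea of evolving the set of trees in a forest can be illustrated with a minimal sketch. It is not the authors' GARF implementation: the bit-mask encoding, truncation selection, one-point crossover and every name below (`subset_accuracy`, `evolve_forest`) are assumptions made for illustration, and each "tree" is represented only by its precomputed predictions on a validation set.

```python
import random
from collections import Counter

def majority_vote(votes):
    """Return the most common label among the votes."""
    return Counter(votes).most_common(1)[0][0]

def subset_accuracy(mask, tree_preds, labels):
    """Accuracy of the majority vote over the trees selected by mask."""
    chosen = [preds for keep, preds in zip(mask, tree_preds) if keep]
    if not chosen:
        return 0.0  # an empty forest cannot classify anything
    correct = sum(
        majority_vote([preds[i] for preds in chosen]) == y
        for i, y in enumerate(labels)
    )
    return correct / len(labels)

def evolve_forest(tree_preds, labels, pop_size=20, generations=30,
                  mutation_rate=0.05, seed=0):
    """Evolve a 0/1 mask over the trees (assumes at least two trees)."""
    rng = random.Random(seed)
    n = len(tree_preds)

    def fitness(mask):
        return subset_accuracy(mask, tree_preds, labels)

    # Seed the population with the full forest plus random subsets.
    population = [[1] * n]
    population += [[rng.randint(0, 1) for _ in range(n)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation adds or removes individual trees.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Because the full forest is in the initial population and the best half always survives, the returned mask is never worse than the full forest on the validation data used for fitness. Accuracy on unseen data should still be checked separately.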

International Joint Conference on Neural Networks | 2016

Hierarchical classification for dealing with the Class imbalance problem

Mohamed Bader-El-Den; Eleman Teitei; Mo Adda

The aim of classification in machine learning is to use knowledge gained from applying learning algorithms to given data in order to determine which class an unlabelled instance with the same pattern belongs to. However, algorithms do not learn properly when there is a massive difference in size between data classes. This class imbalance problem exists in many real-world application domains and has been a popular area of focus for machine learning and data mining researchers. It is further complicated by the presence of associated data difficulty factors. Together, the two have been shown to greatly deteriorate classification performance. This paper introduces a two-phase data-level approach for binary classes which entails the temporary re-labelling of classes. The proposed approach takes advantage of the local neighbourhood of the minority instances to identify and treat difficult examples belonging to both classes. Its outcome was satisfactory when compared against various data-level methods using datasets extracted from the KEEL and UCI dataset repositories.
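One way to picture "using the local neighbourhood of minority instances" is a k-nearest-neighbour check. The sketch below is only an illustration under assumed choices (Euclidean distance, `k=3`, a 0.5 threshold, and the hypothetical helper names `same_class_ratio` and `difficult_minority`); the abstract does not specify how the paper defines or treats difficult examples.

```python
import math

def same_class_ratio(points, labels, index, k=3):
    """Fraction of the k nearest neighbours that share the instance's label."""
    others = [i for i in range(len(points)) if i != index]
    others.sort(key=lambda i: math.dist(points[index], points[i]))
    nearest = others[:k]
    return sum(labels[i] == labels[index] for i in nearest) / len(nearest)

def difficult_minority(points, labels, minority, k=3, threshold=0.5):
    """Indices of minority instances whose neighbourhood is mostly majority."""
    return [
        i for i, y in enumerate(labels)
        if y == minority and same_class_ratio(points, labels, i, k) < threshold
    ]
```

A minority point sitting inside a majority cluster gets a ratio near 0 and is flagged, while points inside a compact minority cluster get a ratio near 1 and are left alone.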

ACS/IEEE International Conference on Computer Systems and Applications | 2014

Self-adaptive heterogeneous random forest

Mohamed Bader-El-Den

Random Forest (RF) is an ensemble learning approach in which a number of classifiers vote to predict the class label of an unlabelled instance. Parameters such as the size of the forest N and the number of features used at each split M have a significant impact on the performance of the RF, especially on instances with a very large number of attributes. In previous work, genetic algorithms were used to dynamically optimise the size of the RF. This study extends that genetic algorithm approach to further enhance the accuracy of Random Forests by building the forest out of heterogeneous decision trees, where heterogeneous means trees with different M values. The approach is termed Heterogeneous Genetic Algorithm based Random Forests (HGARF). Random Forests typically generates a large number of decision trees, with randomisation over the feature space when splitting at each node of every tree, and this has motivated the development of a genetic algorithm based optimisation. HGARF accepts as input a forest RF of N trees. The initial population is randomly generated from RF as a number of smaller random forests rf_i, each containing n_i ≤ N trees. This population of forests is then evolved through a number of generations using genetic algorithms. Our extensive experimental study shows that Random Forests performance can be boosted using the genetic algorithm approach.
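The population set-up described above can be sketched as follows. This is a hedged illustration, not the published HGARF code: trees are stand-in `(id, M)` pairs rather than trained classifiers, and `make_forest`, `initial_population` and `crossover` are hypothetical names whose details (uniform sampling, the recombination rule) are assumptions.

```python
import random

def make_forest(n_trees, m_values, rng):
    """Stand-in for RF: each 'tree' is an (id, M) pair with its own M value."""
    return [(i, rng.choice(m_values)) for i in range(n_trees)]

def initial_population(forest, pop_size, rng, min_size=1):
    """Sample pop_size sub-forests rf_i, each with n_i <= N distinct trees."""
    population = []
    for _ in range(pop_size):
        n_i = rng.randint(min_size, len(forest))
        population.append(rng.sample(forest, n_i))
    return population

def crossover(rf_a, rf_b, rng):
    """Build a child from the pooled, de-duplicated trees of two parents."""
    pool = list(dict.fromkeys(rf_a + rf_b))  # keeps order, drops repeats
    low, high = sorted((len(rf_a), len(rf_b)))
    size = rng.randint(low, high)  # never exceeds len(pool)
    return rng.sample(pool, size)
```

Since every child draws only from trees already in RF, no sub-forest can grow beyond N trees, and a fitness function (for example validation accuracy of the sub-forest's vote) would then drive selection across generations.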

Memetic Computing | 2018

Guided genetic algorithm for the multidimensional knapsack problem

Abdellah Rezoug; Mohamed Bader-El-Den; Dalila Boughaci

Genetic Algorithm (GA) has emerged as a powerful method for solving a wide range of combinatorial optimisation problems in many fields. This paper presents a hybrid heuristic approach named Guided Genetic Algorithm (GGA) for solving the Multidimensional Knapsack Problem (MKP). GGA is a two-step memetic algorithm composed of a data pre-analysis and a modified GA. The pre-analysis of the problem data is performed using an efficiency-based method to extract useful information. This prior knowledge is integrated as a guide in a GA at two stages: to generate the initial population and to evaluate the produced offspring by the fitness function. Extensive experimentation was carried out to examine GGA on the MKP. The main GGA parameters were tuned and a comparative study with other methods was conducted on well-known MKP data. The real impact of GGA was checked by a statistical analysis using ANOVA, t-test and Welch’s t-test. The obtained results showed that the proposed approach largely improved standard GA and was highly competitive with other optimisation methods.
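A rough sketch of how efficiency information might guide the initial population is given below. The efficiency formula used here (profit divided by capacity-scaled weight) is one common choice and is an assumption, as are the function names and the `keep_prob` parameter; the abstract does not state which efficiency measure or construction rule GGA uses.

```python
import random

def efficiencies(profits, weights, capacities):
    """e_j = p_j / sum_i (w_ij / c_i); assumes positive weights and capacities."""
    return [
        p / sum(weights[i][j] / capacities[i] for i in range(len(capacities)))
        for j, p in enumerate(profits)
    ]

def is_feasible(solution, weights, capacities):
    """True if the selected items respect every knapsack constraint."""
    return all(
        sum(w * x for w, x in zip(row, solution)) <= cap
        for row, cap in zip(weights, capacities)
    )

def guided_individual(profits, weights, capacities, rng, keep_prob=0.8):
    """Visit items from most to least efficient, keeping each with keep_prob."""
    eff = efficiencies(profits, weights, capacities)
    order = sorted(range(len(profits)), key=lambda j: eff[j], reverse=True)
    solution = [0] * len(profits)
    for j in order:
        if rng.random() < keep_prob:
            solution[j] = 1
            if not is_feasible(solution, weights, capacities):
                solution[j] = 0  # undo an item that breaks a constraint
    return solution

def fitness(solution, profits):
    """Total profit of the selected items."""
    return sum(p * x for p, x in zip(profits, solution))
```

With `keep_prob=1.0` the construction is a deterministic greedy pass; lower values inject the diversity a GA population needs while still favouring efficient items.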

Information Processing and Management | 2018

Question categorization and classification using grammar based approach

Alaa Mohasseb; Mohamed Bader-El-Den; Mihaela Cocea

Question-answering has become one of the most popular information retrieval applications. Although most question-answering systems try to improve the user experience and the technology used in finding relevant results, many difficulties remain because of the continuous increase in the amount of web content. Question Classification (QC) plays an important role in question-answering systems, and one of the major tasks in enhancing the classification process is the identification of question types. A broad range of QC approaches has been proposed with the aim of solving the classification problem; most of these are based on bag-of-words or dictionaries. In this research, we present an analysis of the different types of questions based on their grammatical structure. We identify different patterns and use machine learning algorithms to classify them. A framework is proposed for question classification using a grammar-based approach (GQCC) which exploits the structure of the questions. Our findings indicate that using syntactic categories related to different domain-specific types of Common Nouns, Numeral Numbers and Proper Nouns enables the machine learning algorithms to better differentiate between question types. The paper presents a wide range of experiments; the results show that GQCC using the J48 classifier outperformed other classification methods, with 90.1% accuracy.

International Conference on Speech and Computer | 2017

Web Queries Classification Based on the Syntactical Patterns of Search Types

Alaa Mohasseb; Mohamed Bader-El-Den; Andreas Kanavos; Mihaela Cocea

Nowadays, people make frequent use of search engines to find the information they need on the web. The abundance of available data has made obtaining relevant information challenging in terms of processing and analysis. A broad range of web query classification techniques has been proposed with the aim of helping to understand the actual intent behind a web search. In this research, we categorise search queries by introducing Search Type Syntactical Patterns for automatically identifying and classifying search engine user queries. Experiments show that our approach has a good level of accuracy in identifying different search types.

International Conference on Machine Learning and Cybernetics | 2017

Domain specific syntax based approach for text classification in machine learning context

Alaa Mohasseb; Mohamed Bader-El-Den; Han Liu; Mihaela Cocea

Due to the vast amount of data, searching for and obtaining relevant information on the web is a challenging task. Although a broad range of classification techniques has been proposed to improve information retrieval methods, many difficulties remain because of the continuous increase in the amount of web content, as well as its diversity. In this paper, we propose a method that automatically identifies and classifies user queries using a domain-specific syntax approach, which is based on the syntactical pattern of each type of search query. A framework is developed to test the performance of the proposed method. Experimental results show that our approach leads to accurate identification of different query types.

IEEE International Conference on Fuzzy Systems | 2016

Fuzzy systems with multiple rule bases for selection of alternatives using TOPSIS

Abdul Malek Yaakob; Alexander Gegov; Mohamed Bader-El-Den; Siti Fatimah Abdul Rahman

This paper introduces a novel modification of the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method and uses a fuzzy system with multiple rule bases to solve multi-criteria decision making problems where both benefit and cost criteria are presented as subsystems. Thus, the decision maker evaluates the performance of each alternative for optimisation and further observes the performance for both benefit and cost criteria. This approach significantly improves the transparency of the TOPSIS method while ensuring high effectiveness in comparison to established methods. To ensure the practicality and effectiveness of the proposed method, a traded equity case study is considered. Furthermore, the ranking based on the proposed method is validated comparatively using Spearman's rho correlation. The proposed method outperforms the existing TOPSIS methods in terms of ranking for the case study under consideration.
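For orientation, the classic crisp TOPSIS procedure that the paper modifies can be sketched in a few lines. This is the textbook baseline with vector normalisation, not the fuzzy multiple-rule-base variant proposed in the paper; the function name `topsis` and its argument layout are assumptions, and the sketch assumes the alternatives are not all identical.

```python
import math

def topsis(matrix, weights, is_benefit):
    """Return a closeness score in [0, 1] for each alternative (row)."""
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    columns = list(zip(*v))
    # Ideal: best value per criterion (max for benefit, min for cost).
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores
```

Alternatives are ranked by descending score: an alternative that is best on every criterion scores 1, and one that is worst on every criterion scores 0.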

International Journal of Medical Informatics | 2017

Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach

Aya Awad; Mohamed Bader-El-Den; James McNicholas; Jim Briggs

BACKGROUND: Mortality prediction of hospitalized patients is an important problem. Over the past few decades, several severity scoring systems and machine learning mortality prediction models have been developed for predicting hospital mortality. By contrast, early mortality prediction for intensive care unit patients remains an open challenge. Most research has focused on severity of illness scoring systems or data mining (DM) models designed for risk estimation at least 24 or 48h after ICU admission.
OBJECTIVES: This study highlights the main data challenges in early mortality prediction in ICU patients and introduces a new machine learning based framework for Early Mortality Prediction for Intensive Care Unit patients (EMPICU).
MATERIALS AND METHODS: The proposed method is evaluated on the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database. Mortality prediction models are developed for patients aged 16 or above in the Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery Recovery Unit (CSRU). We employ the ensemble learning Random Forest (RF), the predictive Decision Trees (DT), the probabilistic Naive Bayes (NB) and the rule-based Projective Adaptive Resonance Theory (PART) models. The primary outcome was hospital mortality. The explanatory variables included demographic, physiological, vital signs and laboratory test variables. Performance measures were calculated using cross-validated area under the receiver operating characteristic curve (AUROC) to minimize bias. 11,722 patients with single ICU stays are considered.
RESULTS: The proposed EMPICU framework outperformed standard scoring systems (SOFA, SAPS-I, APACHE-II, NEWS and qSOFA) in terms of AUROC and time (i.e. at 6h compared to 48h or more after admission).
DISCUSSION AND CONCLUSION: The results show that although many values are missing in the first few hours of ICU admission, there is enough signal to effectively predict mortality during the first 6h of admission. The proposed framework, in particular the variant that uses the ensemble learning approach, EMPICU Random Forest (EMPICU-RF), offers a base on which to construct an effective and novel mortality prediction model in the early hours of an ICU patient admission, with an improved performance profile.
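The AUROC measure used for evaluation has a simple rank-based interpretation: it is the probability that a randomly chosen positive case (here, a patient who died) receives a higher predicted risk than a randomly chosen negative case. A minimal sketch of that definition follows; the function name `auroc` is an assumption, and the study's own evaluation code and cross-validation set-up are not reproduced here.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a cross-validated setting this would be computed on each held-out fold and the fold values averaged. The pairwise form is quadratic in the number of cases, so a sort-based version is preferable for cohorts of thousands of patients.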

International Conference on Machine Learning and Cybernetics | 2016

Oil PVT characterisation using ensemble systems

Munirudeen A. Oloso; Mohamed Hassan; James Buick; Mohamed Bader-El-Den

In reservoir engineering, there is always a need to estimate crude oil Pressure, Volume and Temperature (PVT) properties for many critical calculations and decisions such as reserve estimation, material balance design and oil recovery strategy, among others. Empirical correlations are often used instead of costly laboratory experiments to estimate these properties. However, these correlations do not always give sufficient accuracy. This paper develops ensemble support vector regression and ensemble regression tree models to predict two important crude oil PVT properties: bubblepoint pressure and oil formation volume factor at bubblepoint. The developed ensemble models are compared with standalone support vector machine (SVM) and regression tree models, and with commonly used empirical correlations. The ensemble models give better accuracy than correlations from the literature and more consistent results than the standalone SVM and regression tree models.
