Publication


Featured research published by Ghada Badr.


Computational Biology and Chemistry | 2015

Genetic Bee Colony (GBC) algorithm

Hala M. Alshamlan; Ghada Badr; Yousef Al-Ohali

Nature-inspired evolutionary algorithms have proven effective for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. To test the accuracy of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared it with the combination of mRMR with GA (mRMR-GA) and with Particle Swarm Optimization (mRMR-PSO). In addition, we compared the GBC algorithm with other related algorithms that have recently been published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This indicates that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification.


BioMed Research International | 2015

mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

Hala M. Alshamlan; Ghada Badr; Yousef Al-Ohali

Artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we present the first attempt at applying the ABC algorithm to the analysis of a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach uses a support vector machine (SVM) algorithm to measure the classification accuracy for the selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on both binary and multiclass datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.
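As a rough illustration of the mRMR filtering step mentioned in the abstract above, the following sketch greedily picks features that have high mutual information with the class label and low average mutual information with the features already chosen. It is only a minimal sketch under stated assumptions: expression values are taken to be already discretized, the score is the simple difference "relevance minus mean redundancy", and the function names `mutual_information` and `mrmr` are hypothetical. The ABC search and the SVM evaluation described in the abstract are not shown.

```python
from collections import Counter
from math import log2
from typing import List, Sequence


def mutual_information(x: Sequence, y: Sequence) -> float:
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written with raw counts
    return sum((c / n) * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def mrmr(features: List[Sequence], labels: Sequence, k: int) -> List[int]:
    """Greedily select up to k feature indices (assumed difference-form mRMR score)."""
    relevance = [mutual_information(f, labels) for f in features]
    selected: List[int] = []
    remaining = set(range(len(features)))

    def score(i: int) -> float:
        if not selected:
            return relevance[i]
        redundancy = sum(
            mutual_information(features[i], features[j]) for j in selected
        ) / len(selected)
        return relevance[i] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected
```

With two identical informative genes and a third gene that is equally informative but only weakly correlated with them, the second pick is the third gene rather than the duplicate, which is the redundancy-avoiding behaviour the filter is meant to provide.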


International Journal of Bioscience, Biochemistry and Bioinformatics | 2014

The Performance of Bio-Inspired Evolutionary Gene Selection Methods for Cancer Classification Using Microarray Dataset

Hala M. Alshamlan; Ghada Badr; Yousef Al-Ohali

Microarray-based gene expression profiling has become an important and promising data source for cancer classification, which is used for diagnosis and prognosis. It is important to determine the informative genes that cause the cancer in order to improve early cancer diagnosis and to give effective chemotherapy treatment. Furthermore, finding an accurate gene selection method that reduces the dimensionality and selects informative genes is a very significant issue in the cancer classification area. In the literature, there are several gene selection methods for cancer classification using microarray datasets. However, most of them do not focus on identifying the minimum number of informative genes with high classification accuracy. Therefore, in this study we discuss the performance of bio-inspired evolutionary gene selection methods in cancer classification using microarray datasets, and we show that bio-inspired evolutionary gene selection methods achieve superior classification accuracy with a minimum number of selected genes.


Lecture Notes in Computer Science | 2004

Dictionary-Based Syntactic Pattern Recognition Using Tries

B. John Oommen; Ghada Badr

This paper deals with the problem of estimating a transmitted string X* by processing the corresponding string Y, which is a noisy version of X*. We assume that Y contains substitution, insertion and deletion errors, and that X* is an element of a finite (but possibly large) dictionary, H. The best estimate X+ of X* is defined as that element of H which minimizes the Generalized Levenshtein Distance D(X, Y) between X and Y, for all X ∈ H. All existing techniques for computing X+ require a separate evaluation of the edit distances between Y and every X ∈ H. In this paper, we show how we can evaluate D(X, Y) for every X ∈ H simultaneously, without resorting to any parallel computations. This is achieved by using an additional data structure called the Linked List of Prefixes (LLP), which is built “on top of” the trie representation of the dictionary. The computational advantage gained (for a dictionary made from the set of 1023 most common words augmented by computer-related words) is at least 50% measured in terms of time and at least 80% measured in terms of the number of operations required. The accuracy forfeited is negligible.
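The idea of evaluating D(X, Y) for every dictionary word at once can be sketched with a plain trie: words that share a prefix also share the dynamic-programming rows for that prefix, so each row is computed once per trie node rather than once per word. The sketch below is an illustration under stated assumptions, not the method of the paper: it uses unit (0/1) edit costs instead of generalized costs, a depth-first traversal, and no Linked List of Prefixes. The names `TrieNode`, `build_trie`, `all_distances` and `best_match` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set when a dictionary word ends here


def build_trie(words: List[str]) -> TrieNode:
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.word = w
    return root


def all_distances(root: TrieNode, y: str) -> Dict[str, int]:
    """Unit-cost Levenshtein distance from y to every word stored in the trie."""
    result: Dict[str, int] = {}
    first_row = list(range(len(y) + 1))  # distances from the empty prefix to y[:j]
    if root.word is not None:
        result[root.word] = first_row[-1]
    stack = [(child, ch, first_row) for ch, child in root.children.items()]
    while stack:
        node, ch, prev = stack.pop()
        # One DP row per trie node; every word below this node reuses it.
        row = [prev[0] + 1]
        for j in range(1, len(y) + 1):
            cost = 0 if y[j - 1] == ch else 1
            row.append(min(row[j - 1] + 1, prev[j] + 1, prev[j - 1] + cost))
        if node.word is not None:
            result[node.word] = row[-1]
        stack.extend((child, k, row) for k, child in node.children.items())
    return result


def best_match(words: List[str], y: str) -> Tuple[str, int]:
    """Closest dictionary word to y; ties are broken alphabetically."""
    dist = all_distances(build_trie(words), y)
    return min(dist.items(), key=lambda kv: (kv[1], kv[0]))
```

For instance, `best_match(["cat", "cart", "card", "dog"], "crat")` returns `("cat", 1)`, and the rows for the shared prefix "ca" are computed only once for the three words that start with it.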


DaEng | 2014

A Comparative Study of Cancer Classification Methods Using Microarray Gene Expression Profile

Hala M. Alshamlan; Ghada Badr; Yousef Al-Ohali

Microarray-based gene expression profiling has emerged as an efficient technique for cancer classification, as well as for diagnosis, prognosis, and treatment purposes. The primary task of microarray data classification is to determine a computational model from the given microarray data that can determine the class of unknown samples. In recent times, the microarray technique has attracted more attention in both scientific and industrial fields. It is important to determine the informative genes that cause the cancer in order to improve early cancer diagnosis and to give effective chemotherapy treatment. Classifying cancer microarray gene expression data is a challenging task because a microarray is a high-dimensional, low-sample dataset with many noisy or irrelevant genes and missing data. Therefore, finding an accurate and effective cancer classification approach is a very significant issue in the medical domain. In this paper, we present a comparative study and categorize the effective binary classification approaches that have been applied to cancer microarray gene expression profiles. We conclude by identifying the most accurate classification method, the one that has the highest classification accuracy along with the smallest number of effective genes.


The Computer Journal | 2005

Self-Adjusting of Ternary Search Tries Using Conditional Rotations and Randomized Heuristics

Ghada Badr; B. John Oommen

A ternary search trie (TST) is a highly efficient dynamic dictionary structure applicable to strings and textual data. The strings are accessed based on a set of access probabilities and are to be arranged using a TST. We consider the scenario where the probabilities are not known a priori and are time-invariant. Our aim is to adaptively restructure the TST so as to yield the best access or retrieval time. Unlike the case of lists and binary search trees, where numerous methods have been proposed, the number of adaptive schemes currently reported for the TST is small. In this paper we consider various self-organizing schemes that were applied to binary search trees and apply them to TSTs. Three new schemes, namely the splaying, the conditional rotation and the randomization heuristics, have been proposed, tested and comparatively presented. The results demonstrate that the conditional rotation heuristic is the best when compared with the other heuristics considered in the paper.
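For readers unfamiliar with the structure, here is a minimal sketch of a ternary search trie with insertion and a lookup that also reports how many nodes were visited. It is meant only to show why the shape of the TST matters for access time; it does not implement splaying, conditional rotation, or the randomized heuristics from the paper. The class and method names (`TSTNode`, `TernarySearchTrie`, `search_cost`) are hypothetical.

```python
from typing import Optional, Tuple


class TSTNode:
    def __init__(self, ch: str) -> None:
        self.ch = ch
        self.lo: Optional["TSTNode"] = None  # keys whose current char is smaller
        self.eq: Optional["TSTNode"] = None  # keys that continue past this char
        self.hi: Optional["TSTNode"] = None  # keys whose current char is larger
        self.is_end = False                  # True if a key ends at this node


class TernarySearchTrie:
    def __init__(self) -> None:
        self.root: Optional[TSTNode] = None

    def insert(self, key: str) -> None:
        if not key:
            raise ValueError("empty keys are not supported in this sketch")
        self.root = self._insert(self.root, key, 0)

    def _insert(self, node: Optional[TSTNode], key: str, i: int) -> TSTNode:
        ch = key[i]
        if node is None:
            node = TSTNode(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, key, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, key, i)
        elif i + 1 < len(key):
            node.eq = self._insert(node.eq, key, i + 1)
        else:
            node.is_end = True
        return node

    def search_cost(self, key: str) -> Tuple[bool, int]:
        """Return (found, number of nodes visited during the lookup)."""
        if not key:
            return False, 0
        node, i, visited = self.root, 0, 0
        while node is not None:
            visited += 1
            ch = key[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(key):
                node, i = node.eq, i + 1
            else:
                return node.is_end, visited
        return False, visited
```

Inserting the keys "a", "b", "c" in sorted order produces a chain, so looking up "c" visits three nodes, whereas inserting "b" first makes it the root and the same lookup visits two. A self-adjusting scheme aims to move frequently accessed keys toward such cheaper positions without knowing the access probabilities in advance.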


International Symposium on Bioinformatics Research and Applications | 2011

Component-based matching for multiple interacting RNA sequences

Ghada Badr; Marcel Turcotte

RNA interactions are fundamental to a multitude of cellular processes including post-transcriptional gene regulation. Although much progress has been made recently in developing fast algorithms for predicting RNA interactions, much less attention has been devoted to the development of efficient algorithms and data structures for locating RNA interaction patterns. We present two algorithms for locating all the occurrences of a given interaction pattern in a set of RNA sequences. The baseline algorithm implements an exhaustive backtracking search. The second algorithm also finds all the matches, but uses additional data structures in order to considerably decrease the execution time, sometimes by one order of magnitude. The worst-case memory requirement for the latter algorithm increases exponentially with the input pattern length and does not depend on the database size, making it practical for large databases. The performance of the algorithms is illustrated with an application for locating RNA elements in a Diplonemid genome.


Pattern Analysis and Applications | 2007

Breadth-first search strategies for trie-based syntactic pattern recognition

B. John Oommen; Ghada Badr

Dictionary-based syntactic pattern recognition of strings attempts to recognize a transmitted string X* by processing its noisy version, Y, without sequentially comparing Y with every element X in the finite (but possibly large) dictionary, H. The best estimate X+ of X* is defined as that element of H which minimizes the generalized Levenshtein distance (GLD) D(X, Y) between X and Y, for all X ∈ H. The non-sequential PR computation of X+ involves a compact trie-based representation of H. In this paper, we show how we can optimize this computation by incorporating breadth first search schemes on the underlying graph structure. This heuristic emerges from the trie-based dynamic programming recursive equations, which can be effectively implemented using a new data structure called the linked list of prefixes that can be built separately or “on top of” the trie representation of H. The new scheme does not restrict the number of errors in Y to be merely a small constant, as is done in most of the available methods. The main contribution is that our new approach can be used for generalized GLDs and not merely for 0/1 costs. It is also applicable when all possible correct candidates need to be known, and not just the best match. These constitute the cases when the “cutoffs” cannot be used in the DFS trie-based technique (Shang and Merrettal in IEEE Trans Knowl Data Eng 8(4):540–547, 1996). The new technique is compared with the DFS trie-based technique (Risvik in United Patent 6377945 B1, 23 April 2002; Shang and Merrettal in IEEE Trans Knowl Data Eng 8(4):540–547, 1996) using three large and small benchmark dictionaries with different errors. In each case, we demonstrate marked improvements of up to 21% in the number of operations needed, while at the same time maintaining the same accuracy. Additionally, some further improvements can be obtained by introducing the knowledge of the maximum number or percentage of errors in Y.


Pattern Analysis and Applications | 2006

A novel look-ahead optimization strategy for trie-based approximate string matching

Ghada Badr; B. John Oommen

This paper deals with the problem of estimating a transmitted string X* by processing the corresponding string Y, which is a noisy version of X*. We assume that Y contains substitution, insertion, and deletion errors, and that X* is an element of a finite (but possibly large) dictionary, H. The best estimate X+ of X* is defined as that element of H which minimizes the generalized Levenshtein distance D(X, Y) between X and Y such that the total number of errors is not more than K, for all X ∈ H. The trie is a data structure that offers search costs that are independent of the document size. Tries also combine prefixes together, and so by using tries in approximate string matching we can utilize the information obtained in the process of evaluating any one D(Xi, Y), to compute any other D(Xj, Y), where Xi and Xj share a common prefix. In the artificial intelligence (AI) domain, branch and bound (BB) schemes are used when we want to prune paths that have costs above a certain threshold. These techniques have been applied to prune, for example, game trees. In this paper, we present a new BB pruning strategy that can be applied to dictionary-based approximate string matching when the dictionary is stored as a trie. The new strategy attempts to look ahead at each node, c, before moving further, by merely evaluating a certain local criterion at c. The search algorithm according to this pruning strategy will not traverse inside the subtrie(c) unless there is a “hope” of determining a suitable string in it. In other words, as opposed to the reported trie-based methods (Kashyap and Oommen in Inf Sci 23(2):123–142, 1981; Shang and Merrettal in IEEE Trans Knowledge Data Eng 8(4):540–547, 1996), the pruning is done a priori before even embarking on the edit distance computations. The new strategy depends highly on the variance of the lengths of the strings in H. It combines the advantages of partitioning the dictionary according to the string lengths, and the advantages gleaned by representing H using the trie data structure. The results demonstrate a marked improvement (up to 30% when costs are of a 0/1 form, and up to 47% when costs are general) with respect to the number of operations needed on three benchmark dictionaries.
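One way to picture a look-ahead test of this kind is a length check made before any edit-distance row is computed. In the sketch below, each trie node stores the shortest and longest word length in its subtrie; with unit costs and an error bound K, a subtrie can be skipped when every word in it differs from Y in length by more than K, since such words cannot be within K edits. This criterion and the names `Node`, `build` and `search` are assumptions made for illustration, and the abstract does not spell out the paper's exact local criterion.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.word: Optional[str] = None
        self.min_len: float = float("inf")  # shortest word stored in this subtrie
        self.max_len: int = 0               # longest word stored in this subtrie


def build(words: List[str]) -> Node:
    root = Node()
    for w in words:
        node, path = root, [root]
        for ch in w:
            node = node.children.setdefault(ch, Node())
            path.append(node)
        node.word = w
        for n in path:  # update length bounds along the inserted path
            n.min_len = min(n.min_len, len(w))
            n.max_len = max(n.max_len, len(w))
    return root


def search(root: Node, y: str, k: int, look_ahead: bool = True) -> Tuple[Dict[str, int], int]:
    """Return ({word: distance} for distance <= k, number of DP rows computed)."""
    hits: Dict[str, int] = {}
    rows = 0
    first = list(range(len(y) + 1))
    if root.word is not None and first[-1] <= k:
        hits[root.word] = first[-1]
    stack = [(child, ch, first) for ch, child in root.children.items()]
    while stack:
        node, ch, prev = stack.pop()
        # Look-ahead (assumed criterion): skip the subtrie before doing any DP work.
        if look_ahead and not (node.min_len - k <= len(y) <= node.max_len + k):
            continue
        rows += 1
        row = [prev[0] + 1]
        for j in range(1, len(y) + 1):
            cost = 0 if y[j - 1] == ch else 1
            row.append(min(row[j - 1] + 1, prev[j] + 1, prev[j - 1] + cost))
        if node.word is not None and row[-1] <= k:
            hits[node.word] = row[-1]
        stack.extend((child, c, row) for c, child in node.children.items())
    return hits, rows
```

Calling `search` with `look_ahead=False` returns the same matches but computes a row for every trie node, so comparing the two row counts gives a feel for how much the saving depends on the spread of word lengths in the dictionary.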


Algorithms for Molecular Biology | 2011

Listing all sorting reversals in quadratic time.

Krister M. Swenson; Ghada Badr; David Sankoff

We describe an average-case O(n²) algorithm to list all reversals on a signed permutation π that, when applied to π, produce a permutation that is closer to the identity. This algorithm is optimal in the sense that the time it takes to write the list is Ω(n²) in the worst case.
