Nagwa M. El-Makky
Alexandria University
Publication
Featured research published by Nagwa M. El-Makky.
Sigkdd Explorations | 2000
Khalil M. Ahmed; Nagwa M. El-Makky; Yousry I. Taha
In their paper [1], S. Brin, R. Motwani and C. Silverstein discussed measuring the significance of (generalized) association rules via support and the chi-squared test for correlation. They provided some illustrative examples and pointed out that the chi-squared test needs to be augmented by a measure of interest, which they also suggested. This paper presents a further elaboration and extension of their discussion. As suggested by Brin et al., the chi-squared test succeeds in measuring the cell dependencies in a 2x2 contingency table. However, it can be misleading in cases of bigger contingency tables. We give some illustrative examples based on those presented in [1]. We also propose a more appropriate reliability measure of association rules.
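As a rough illustration of the statistic discussed in this abstract, the sketch below computes the Pearson chi-squared value for a contingency table of counts, together with each cell's contribution. It is a minimal sketch and not the reliability measure the paper proposes; the function name `chi_squared` and the sample counts are hypothetical, and every row and column total is assumed to be non-zero.

```python
def chi_squared(table):
    """Return (statistic, per-cell contributions) for an r x c table of counts.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    contributions = []
    for i, row in enumerate(table):
        contrib_row = []
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            contrib_row.append((observed - expected) ** 2 / expected)
        contributions.append(contrib_row)
    statistic = sum(sum(r) for r in contributions)
    return statistic, contributions


# Hypothetical 2x2 counts, chosen only for illustration.
stat, cells = chi_squared([[20, 5], [70, 5]])
print(round(stat, 4))
```

Inspecting the per-cell contributions rather than only the total is one simple way to see which cells drive the statistic in a larger table.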
empirical methods in natural language processing | 2014
Heba Abdelnasser; Maha Ragab; Reham Mohamed; Alaa Mohamed; Bassant Farouk; Nagwa M. El-Makky; Marwan Torki
Recently, Question Answering (QA) has been one of the main focuses of natural language processing research. However, Arabic Question Answering is still not in the mainstream. The challenges of the Arabic language and the lack of resources have made it difficult to provide Arabic QA systems with high accuracy. While low accuracy may be acceptable for general-purpose systems, high accuracy is critical in some fields such as religious affairs. Therefore, there is a need for specialized, accurate systems that target these critical fields. In this paper, we propose Al-Bayan, a new Arabic QA system specialized for the Holy Quran. The system accepts an Arabic question about the Quran, retrieves the most relevant Quran verses, then extracts the passage that contains the answer from the Quran and its interpretation books (Tafseer). Evaluation results on a collected dataset show that the overall system can achieve 85% accuracy using the top-3 results.
international conference of the ieee engineering in medicine and biology society | 2014
Rania Ibrahim; Noha A. Yousri; Mohamed A. Ismail; Nagwa M. El-Makky
Selecting the most discriminative genes/miRNAs has been raised as an important task in bioinformatics to enhance disease classifiers and to mitigate the curse of dimensionality. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability to represent data at multiple levels of abstraction, allowing for better discrimination between different classes. However, the idea of using deep learning for feature selection is not yet widely used in the bioinformatics field. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension to use the technique for miRNAs is presented by considering the biological relation between miRNAs and genes. Experimental results show that the approach was able to outperform classical feature selection methods in hepatocellular carcinoma (HCC) by 9%, lung cancer by 6% and breast cancer by around 10% in F1-measure. Results also show the enhancement in F1-measure of our approach over recent related work in [1] and [2].
acs ieee international conference on computer systems and applications | 2005
Noha A. Yousri; Khalil M. Ahmed; Nagwa M. El-Makky
A data warehouse (DW) stores materialized views of data from one or more sources, for the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a DW is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views so that the total cost of processing a set of queries and maintaining the materialized views is minimized. In this paper, new algorithms are proposed for selecting materialized views in a data warehouse. Two research targets are considered. The first is to propose an approach that solves the problem while considering both multi-query optimization and maintenance process optimization. The second is to use a simple search strategy that reduces the search space for the view selection problem and reduces the time complexity from quadratic to linear.
bioinformatics and biomedicine | 2013
Rania Ibrahim; Noha A. Yousri; Mohamed A. Ismail; Nagwa M. El-Makky
A number of attempts to classify cancer samples using miRNA/gene expression profiles are known in the literature. However, semi-supervised learning models have only recently been introduced to exploit the huge number of unlabeled expression profiles in enhancing sample classification. It is important to combine both miRNA and gene expression sets, as that provides more information on the characteristics of cancer samples. The use of both labeled and unlabeled miRNA and gene expression sets to enhance sample classification has not been explored yet. In this paper, two semi-supervised machine learning approaches, namely self-learning and co-training, are adapted to enhance the quality of cancer sample classification. In self-learning, miRNA and gene based classifiers are enhanced independently, while in co-training, both miRNA and gene expression profiles are used simultaneously to provide different views of cancer samples. The approaches were evaluated using breast cancer, hepatocellular carcinoma (HCC) and lung cancer expression sets. Results show up to 20% improvement in F1-measure over Random Forests and SVM classifiers. Co-training also outperforms the Low Density Separation (LDS) approach by around 25% in F1-measure in breast cancer.
north american chapter of the association for computational linguistics | 2015
Reham Mohamed; Maha Ragab; Heba Abdelnasser; Nagwa M. El-Makky; Marwan Torki
This paper describes the Al-Bayan team's participation in SemEval-2015 Task 3, Subtask A. Task 3 targets semantic solutions for answer selection in community question answering systems. We propose a knowledge-based solution for answer selection of Arabic questions, specialized for Islamic sciences. We build a Semantic Interpreter to evaluate the semantic similarity between Arabic questions and answers using our Quranic ontology of concepts. Using supervised learning, we classify the candidate answers according to their relevance to the users' questions. Results show that our system achieves 74.53% accuracy, which is comparable to the other participating systems.
international conference on machine learning and applications | 2011
Amr Magdy; Noha A. Yousri; Nagwa M. El-Makky
The availability of streaming data in different fields and in various forms increases the importance of streaming data analysis. The huge size of continuously flowing data has put forward a number of challenges in data stream analysis. Exploring the structure of streamed data has been a major challenge that resulted in the introduction of various clustering algorithms. However, current clustering algorithms still lack the ability to efficiently discover clusters of arbitrary densities in data streams. In this paper, a new grid-based and density-based algorithm is proposed for clustering streaming data. It addresses drawbacks of recent algorithms in discovering clusters of arbitrary densities. The algorithm uses an online component to map the input data to grid cells. An offline component is then used to cluster the grid cells based on density information. Relative density relatedness measures and a dynamic range neighborhood are proposed to differentiate clusters of arbitrary densities. The experimental evaluation shows considerable improvements upon the state-of-the-art algorithms in both clustering quality and scalability. In addition, the output quality of the proposed algorithm is less sensitive to parameter selection errors.
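The two-stage structure described in this abstract (an online step that maps points to grid cells, then an offline step that clusters cells by density) can be sketched roughly as follows. This is a simplified illustration only and not the authors' algorithm: it assumes 2-D points, uses a single fixed density threshold in place of the relative density relatedness measures and dynamic range neighborhood mentioned above, and the names `map_to_grid`, `cluster_cells`, `cell_size` and `min_density` are hypothetical.

```python
from collections import defaultdict, deque


def map_to_grid(points, cell_size):
    """Online step: count how many 2-D points fall into each grid cell."""
    counts = defaultdict(int)
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def cluster_cells(counts, min_density):
    """Offline step: group adjacent dense cells into clusters via BFS.

    A cell is 'dense' if it holds at least min_density points (a fixed
    threshold, assumed here for simplicity). Returns {cell: cluster_label}.
    """
    dense = {cell for cell, n in counts.items() if n >= min_density}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            # Visit the 8 neighbouring cells (and the cell itself, a no-op).
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in dense and neighbour not in labels:
                        labels[neighbour] = next_label
                        queue.append(neighbour)
        next_label += 1
    return labels


# Hypothetical sample points, for illustration only.
points = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.3, 0.4),
          (5.1, 5.2), (5.3, 5.4), (9.0, 9.0)]
labels = cluster_cells(map_to_grid(points, cell_size=1.0), min_density=2)
print(labels)
```

In a real streaming setting the cell counts would be updated incrementally as points arrive, and the offline step would be rerun periodically or on demand.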
ICCI '91 Proceedings of the International Conference on Computing and Information: Advances in Computing and Information | 1991
Mohamed Eltoweissy; Nagwa M. El-Makky; M. Abougabal; Souheir A. Fouad
The diversity of available concurrency control algorithms in database systems necessitates the development of quantitative methods for evaluating their performance. This paper proposes an analytical model to analyze the performance of Time-stamp Ordering algorithms, in particular Time-stamp Ordering employing blocking and restarts, both with and without the Thomas Write Rule. The modeling approach is promising since it has the potential to provide useful insights to DBMS designers while being very inexpensive to use. Moreover, the results obtained are extensive and closely track those of simulation.
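For readers unfamiliar with the algorithms being modeled, the following is a minimal sketch of the textbook restart-based timestamp ordering checks, with an optional Thomas Write Rule. It illustrates the protocol only, not the paper's analytical model, and it omits the blocking variant. The names `Item`, `read` and `write` and the string return values are hypothetical.

```python
class Item:
    """A data item with the largest read and write timestamps seen so far."""

    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0
        self.value = None


def read(item, ts):
    """Read by a transaction with timestamp ts."""
    if ts < item.write_ts:
        # A younger transaction already wrote the item: reader must restart.
        return "restart"
    item.read_ts = max(item.read_ts, ts)
    return "ok"


def write(item, ts, value, thomas_write_rule=False):
    """Write by a transaction with timestamp ts."""
    if ts < item.read_ts:
        # A younger transaction already read the item: writer must restart.
        return "restart"
    if ts < item.write_ts:
        # Obsolete write: ignored under the Thomas Write Rule, else restart.
        return "skip" if thomas_write_rule else "restart"
    item.value = value
    item.write_ts = ts
    return "ok"


x = Item()
print(write(x, 5, "a"))                          # ok
print(write(x, 3, "b"))                          # restart
print(write(x, 3, "b", thomas_write_rule=True))  # skip
```

The only difference the Thomas Write Rule makes in this sketch is that an out-of-date write is silently dropped instead of forcing a restart.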
Clinical Genetics | 2018
Rana Momtaz; Nagia M. Ghanem; Nagwa M. El-Makky; Mohamed A. Ismail
Integrative approaches that combine multiple forms of data can more accurately capture pathway associations and so provide a comprehensive understanding of the molecular mechanisms that cause complex diseases. Association analyses based on single nucleotide polymorphism (SNP) genotypes, copy number variant (CNV) genotypes, and gene expression profiles are the 3 most common paradigms used for gene set/pathway enrichment analyses. Much work has been done to leverage information from 2 of these 3 types of data. However, to the best of our knowledge, no previous work has integrated all 3 paradigms together. In this article, we present an integrated analysis that combines SNP, CNV, and gene expression data to generate a single gene list. We present different methods to compare this gene list with the 3 other possible lists that result from combining the following pairs of data: SNP genotype with gene expression, CNV genotype with gene expression, and SNP genotype with CNV genotype. The comparison is done using 3 different cancer datasets and 2 different methods of comparison. Our results show that integrating SNP, CNV, and gene expression data gives better association results than integrating any pair of the 3 data types.
Proceedings of ICCI'93: 5th International Conference on Computing and Information | 1993
Mohamed Eltoweissy; Hussein M. Abdel-Wahab; M. Abougabal; Nagwa M. El-Makky; Souheir A. Fouad
Due to the complexity of the issues affecting the performance of concurrency control algorithms in database systems, most studies adopt different approaches and make different assumptions. We propose the use of a unified mean value analytic approach to the performance analysis of concurrency control algorithms. In this paper, we apply this modeling approach to analyze the performance of time-stamp ordering algorithms. We then compare our results with those reported on two-phase locking and optimistic algorithms, and generate several conclusions and recommendations.