Shady Shehata | Researchain

Publication

Featured research published by Shady Shehata.

IEEE Transactions on Knowledge and Data Engineering | 2010

An Efficient Concept-Based Mining Model for Enhancing Text Clustering

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most common text mining techniques are based on the statistical analysis of a term, either a word or a phrase. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying text mining model should indicate terms that capture the semantics of text. In this case, the mining model can capture terms that present the concepts of the sentence, which leads to discovery of the topic of the document. A new concept-based mining model that analyzes terms on the sentence, document, and corpus levels is introduced. The concept-based mining model can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. The proposed mining model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. The term which contributes to the sentence semantics is analyzed on the sentence, document, and corpus levels rather than through the traditional analysis of the document only. The proposed model can efficiently find significant matching concepts between documents, according to the semantics of their sentences. The similarity between documents is calculated based on a new concept-based similarity measure. The proposed similarity measure takes full advantage of the concept analysis measures on the sentence, document, and corpus levels in calculating the similarity between documents. Large sets of experiments using the proposed concept-based mining model on different data sets in text clustering are conducted. The experiments provide an extensive comparison between the concept-based analysis and the traditional analysis. Experimental results demonstrate a substantial enhancement of the clustering quality using the sentence-based, document-based, corpus-based, and combined approach concept analysis.
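The abstract above contrasts document-only term statistics with analysis at the sentence, document, and corpus levels. As a rough illustration of that contrast, and not the authors' actual measures (the abstract gives no formulas), the sketch below simply counts one term at each of the three levels. The function name `term_counts` and the nested-list input shape are assumptions made for this example.

```python
def term_counts(corpus, term):
    """Count `term` at the sentence, document, and corpus levels.

    Assumed (hypothetical) input shape: `corpus` is a list of documents,
    each document is a list of sentences, each sentence is a list of tokens.
    """
    # Sentence level: occurrences of the term in every sentence of every document.
    per_sentence = [[sentence.count(term) for sentence in doc] for doc in corpus]
    # Document level: total occurrences of the term in each document.
    per_document = [sum(counts) for counts in per_sentence]
    # Corpus level: number of documents that contain the term at least once.
    document_frequency = sum(1 for total in per_document if total > 0)
    return per_sentence, per_document, document_frequency


corpus = [
    [["the", "model", "finds", "concepts"], ["concepts", "capture", "meaning"]],
    [["term", "frequency", "is", "statistical"]],
]
per_sentence, per_document, document_frequency = term_counts(corpus, "concepts")
print(per_sentence, per_document, document_frequency)
```

A concept-based model would replace the raw token match with a test of whether the term contributes to the sentence semantics; that step is outside this sketch.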

Web Intelligence | 2007

Enhancing Search Engine Quality Using Concept-based Text Retrieval

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most common text retrieval techniques are based on the statistical analysis of a term, either a word or a phrase. Statistical analysis of term frequency captures the importance of a term within a document only. Thus, to achieve a more accurate analysis, the underlying representation should indicate terms that capture the semantics of text. In this case, the representation can capture terms that present the concepts of the sentence, which leads to discovering the topic of the document. A new concept-based representation, called the Conceptual Ontological Graph (COG), in which a concept can be either a word or a phrase and is totally dependent on the sentence semantics, is introduced. The aim of the proposed representation is to extract the most important terms in a sentence and a document with respect to the meaning of the text. The COG representation analyzes each term at both the sentence and the document levels. This is different from the classical approach of analyzing terms at the document level. First, the proposed representation denotes the terms which contribute to the sentence semantics. Then, each term is chosen based on its position within the COG representation. Lastly, the selected terms are associated with their documents as features for the purpose of indexing before text retrieval. The COG representation can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the key concepts that represent the sentence meaning. Large sets of experiments using the proposed COG representation on different datasets in text retrieval are conducted. Experimental results demonstrate a substantial enhancement of the text retrieval quality using the COG representation over the traditional techniques. The evaluation of results relies on two quality measures, bpref and P(10). Both quality measures improve when the newly developed COG representation is used to enhance the quality of the text retrieval results.

In this paper, we investigate the emotion classification of web blog corpora using support vector machine (SVM) and conditional random field (CRF) machine learning techniques. The emotion classifiers are trained at the sentence level and applied to the document level. Our methods also determine an emotion category by taking the context of a sentence into account. Experiments show that CRF classifiers outperform SVM classifiers. When applying emotion classification to a blog at the document level, the emotion of the last sentence in a document plays an important role in determining the overall emotion.

Knowledge Discovery and Data Mining | 2007

A concept-based model for enhancing text categorization

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most text categorization techniques are based on word and/or phrase analysis of the text. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying model should indicate terms that capture the semantics of text. In this case, the model can capture terms that present the concepts of the sentence, which leads to discovering the topic of the document. A new concept-based model that analyzes terms on the sentence and document levels, rather than the traditional analysis of the document only, is introduced. The concept-based model can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. The proposed model consists of a concept-based statistical analyzer, a conceptual ontological graph representation, and a concept extractor. The term which contributes to the sentence semantics is assigned two different weights by the concept-based statistical analyzer and the conceptual ontological graph representation. These two weights are combined into a new weight. The concepts that have maximum combined weights are selected by the concept extractor. A set of experiments using the proposed concept-based model on different datasets in text categorization is conducted. The experiments compare traditional weighting with the concept-based weighting obtained by the combined approach of the concept-based statistical analyzer and the conceptual ontological graph. The evaluation of results relies on two quality measures, the macro-averaged F1 and the error rate. These quality measures are improved when the newly developed concept-based model is used to enhance the quality of text categorization.
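For readers unfamiliar with the two quality measures named in this abstract, the following is a minimal sketch of macro-averaged F1 and error rate. It assumes a single-label setting in which each document has exactly one true and one predicted category; the paper's exact evaluation protocol is not given in the abstract, and the function names and sample labels are illustrative only.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-category F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the category never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def error_rate(y_true, y_pred):
    """Fraction of documents assigned the wrong category."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Made-up labels, for illustration only.
y_true = ["sports", "sports", "politics", "politics"]
y_pred = ["sports", "politics", "politics", "politics"]
print(macro_f1(y_true, y_pred), error_rate(y_true, y_pred))
```

Because the macro average weights every category equally, a rare category that is classified poorly lowers the score as much as a frequent one.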

International Conference on Data Mining | 2006

Enhancing Text Clustering Using Concept-based Mining Model

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most text mining techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying mining technique should indicate terms that capture the semantics of the text, from which the importance of a term in a sentence and in the document can be derived. A new concept-based mining model that relies on the analysis of both the sentence and the document, rather than the traditional analysis of the document dataset only, is introduced. The proposed mining model consists of a concept-based analysis of terms and a concept-based similarity measure. The term which contributes to the sentence semantics is analyzed with respect to its importance at the sentence and document levels. The model can efficiently find significant matching terms, either words or phrases, of the documents according to the semantics of the text. The similarity between documents relies on a new concept-based similarity measure which is applied to the matching terms between documents. Experiments using the proposed concept-based term analysis and similarity measure in text clustering are conducted. Experimental results demonstrate that the newly developed concept-based mining model enhances the clustering quality of sets of documents substantially.

International Conference on Data Mining | 2009

A WordNet-Based Semantic Model for Enhancing Text Clustering

Shady Shehata

Most text mining techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying mining technique should indicate terms that capture the semantics of the text, from which the importance of a term in a sentence and in the document can be derived. Incorporating semantic features from the WordNet lexical database is one of many approaches that have been tried to improve the accuracy of text clustering techniques. A new semantic-based model that analyzes documents based on their meaning is introduced. The proposed model analyzes terms and their corresponding synonyms and/or hypernyms on the sentence and document levels. If two documents contain different words that are semantically related, the proposed model can measure the semantic-based similarity between the two documents. The similarity between documents relies on a new semantic-based similarity measure which is applied to the matching concepts between documents. Experiments using the proposed semantic-based model in text clustering are conducted. Experimental results demonstrate that the newly developed semantic-based model enhances the clustering quality of sets of documents substantially.

International Conference on Data Mining | 2006

Enhancing Text Retrieval Performance using Conceptual Ontological Graph

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most data representation techniques are based on word and/or phrase analysis of the text. The statistical analysis of a term (word or phrase) frequency captures the importance of the term within a document. However, to achieve a more accurate analysis, the underlying data representation should indicate terms that capture the semantics of the text, from which the importance of a term in a sentence and in the document can be derived. A new concept-based representation that relies on the analysis of the sentence semantics, rather than the traditional analysis of the document dataset only, is introduced. The proposed conceptual ontological graph representation denotes the terms which contribute to the sentence semantics. Then, each term is chosen based on its position in the proposed representation. Lastly, the selected terms are associated with their documents as features for the purpose of indexing in text retrieval. Experiments using the proposed conceptual ontological graph representation in text retrieval are conducted. The evaluation of results relies on two quality measures, precision and recall. Both of these quality measures improve when the newly developed representation is used to enhance the performance of text retrieval.
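The two quality measures named in this abstract, precision and recall, can be sketched as follows for a single query. This is a generic set-based definition and not the paper's evaluation code; the document identifiers are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgments.
precision, recall = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(precision, recall)
```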

Knowledge and Information Systems | 2013

An efficient concept-based retrieval model for enhancing text retrieval quality

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most common text retrieval techniques are based on the statistical analysis of terms (words or phrases). Statistical analysis of term frequency captures the importance of the term within a document only. Thus, to achieve a more accurate analysis, the underlying model should indicate terms that capture the semantics of text. In this case, the model can capture terms that represent the concepts of the sentence, which leads to discovering the topic of the document. In this paper, a new concept-based retrieval model is introduced. The proposed concept-based retrieval model consists of a conceptual ontological graph (COG) representation and a concept-based weighting scheme. The COG representation captures the semantic structure of each term within a sentence. Then, all the terms are placed in the COG representation according to their contribution to the meaning of the sentence. The concept-based weighting analyzes terms at the sentence and document levels. This is different from the classical approach of analyzing terms at the document level only. The weighted terms are then ranked, and the top concepts are used to build a concept-based document index for text retrieval. The concept-based retrieval model can effectively discriminate between unimportant terms with respect to sentence semantics and terms which represent the concepts that capture the sentence meaning. Experiments using the proposed concept-based retrieval model on different data sets in text retrieval are conducted. The experiments compare traditional approaches with the concept-based retrieval model obtained by the combined approach of the conceptual ontological graph and the concept-based weighting scheme. The evaluation of results is performed using three quality measures: the preference measure (bpref), precision at 10 documents retrieved (P(10)), and the mean uninterpolated average precision (MAP). All of these quality measures are improved when the newly developed concept-based retrieval model is used, confirming that such a model enhances the quality of text retrieval.
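Two of the three measures named above, P(10) and MAP, can be sketched with their standard ranked-retrieval definitions. bpref is left out here because it additionally depends on explicitly judged non-relevant documents. The function names, rankings, and relevance sets below are illustrative assumptions, not data or code from the paper.

```python
def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Uninterpolated average precision for one query."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at the rank of each relevant hit
    # Relevant documents that are never retrieved contribute zero.
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranking, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries.
runs = [
    (["d3", "d1", "d7", "d2", "d9"], {"d1", "d2"}),
    (["d4", "d5"], {"d4"}),
]
print(precision_at_k(runs[0][0], runs[0][1], k=5), mean_average_precision(runs))
```

Dividing by `k` even when fewer than `k` documents are returned follows the usual convention, so a short result list is penalized rather than rewarded.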

Computational Intelligence | 2010

An Efficient Model for Enhancing Text Categorization Using Sentence Semantics

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most text categorization techniques are based on word and/or phrase analysis of the text. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying model should identify terms that capture the semantics of text. In this case, the model can capture terms that present the concepts of the sentence, which leads to discovering the topic of the document. A new concept-based model that analyzes terms on the sentence, document, and corpus levels, rather than the traditional analysis of the document only, is introduced. The concept-based model can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. A set of experiments using the proposed concept-based model on different datasets in text categorization is conducted in comparison with the traditional models. The results demonstrate a substantial enhancement of the categorization quality using the sentence-based, document-based, and corpus-based concept analysis.

Advanced Data Mining and Applications | 2008

Enhancing Text Categorization Using Sentence Semantics

Shady Shehata; Fakhri Karray; Mohamed S. Kamel

Most text categorization techniques are based on word and/or phrase analysis of the text. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying model should indicate terms that capture the semantics of text. In this case, the model can capture terms that present the concepts of the sentence, which leads to discovering the topic of the document. A new concept-based model that analyzes terms on the sentence and document levels, rather than the traditional analysis of the document only, is introduced. The concept-based model can effectively discriminate between non-important terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. A set of experiments using the proposed concept-based model on different datasets in text categorization is conducted. The experiments compare traditional weighting with concept-based weighting and demonstrate that the latter substantially enhances the categorization quality of sets of documents.

Archive | 2010

System, method and computer program for searching within a sub-domain by linking to other sub-domains

Shady Shehata; Fakhri Karray; Mohammed Salem Kamel

Collaboration

Top Co-Authors

Fakhri Karray

University of Waterloo

Mohamed S. Kamel

University of Waterloo
