
Publication


Featured research published by Emad M. Al-Shawakfa.


International Journal of Information Retrieval Research | 2011

The Effect of Stemming on Arabic Text Classification: An Empirical Study

Izzat Alsmadi; Mohammed Al-Kabi; Abdullah Wahbeh; Qasem A. Al-Radaideh; Emad M. Al-Shawakfa

The information world is rich in documents in different formats and applications, such as databases, digital libraries, and the Web. Text classification aids the search functionality offered by search engines and information retrieval systems in dealing with the large number of documents on the web. Many research papers in the field of text classification have been applied to English, Dutch, Chinese, and other languages, whereas fewer have been applied to the Arabic language. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic-language text documents using stemming as part of the preprocessing steps. Results showed that, without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy in the two test modes, with 87.79% and 88.54%. Stemming, on the other hand, negatively affected accuracy: the SVM accuracy in the two test modes dropped to 84.49% and 86.35%.


International Journal of Computer Processing of Languages | 2011

An Approach for Arabic Text Categorization Using Association Rule Mining

Qasem A. Al-Radaideh; Emad M. Al-Shawakfa; Abdullah S. Ghareb; Hani Abu-Salem

Text Categorization (TC) has become one of the major techniques for organizing and managing online information. Several studies have proposed so-called associative classification for databases, but few of them classify text documents into predefined categories based on their contents. In this paper, a new approach is proposed for Arabic text categorization. The approach facilitates the discovery of association rules for building a classification model for Arabic text categorization. An Apriori-based algorithm is employed for association rule mining. To validate the proposed approach, several experiments were conducted on a collection of Arabic documents. Three classification methods using association rules were compared in terms of their classification accuracy: ordered decision list, weighted rules, and majority voting. The results showed that the majority voting method is the best in most of the experiments, achieving an accuracy of up to 87%. On the other hand, the weigh...


Journal of Information Science | 2017

The impact of indexing approaches on Arabic text classification

Amer Al-Badarneh; Emad M. Al-Shawakfa; Basel Bani-Ismail; Khaleel Al-Rababah; Safwan Shatnawi

This paper investigates the impact of using different indexing approaches (full-word, stem, and root) when classifying Arabic text. In this study, the naïve Bayes classifier is used to construct the multinomial classification models and is evaluated using stratified k-fold cross-validation (k ranges from 2 to 10). The study uses a corpus of 1000 normalized Arabic documents. The results of one experiment show significant accuracy improvements when the full-word form is used in most k-folds. Further experiments show that the classifier achieved the highest accuracy in the eight-fold setting, with a 7/8–1/8 train–test ratio, regardless of the indexing approach used. The overall results show that the classifier achieved a maximum micro-average accuracy of 99.36% with either the full-word form or the stem form. This proves that the stem is a better choice when classifying Arabic text, because it makes the corpus dataset smaller, which improves both processing time and storage utilization while achieving the highest level of accuracy.
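As an illustration of the evaluation scheme named in this abstract, the following is a minimal sketch of stratified k-fold splitting in plain Python. It is not the authors' code: the function names, the round-robin assignment, and the assumption of five equally sized categories are hypothetical choices made for the example. With k = 8 it reproduces the 7/8–1/8 train–test ratio mentioned above.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign document indices to k folds so that every fold keeps
    roughly the same class proportions as the whole corpus."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        # Deal the documents of each class to the folds in round-robin order.
        for pos, idx in enumerate(by_class[label]):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels, k):
    """Yield (train_indices, test_indices) once per fold."""
    folds = stratified_folds(labels, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical corpus: 1000 documents, 5 categories of 200 documents each.
labels = [category for category in range(5) for _ in range(200)]
splits = list(train_test_splits(labels, 8))
train, test = splits[0]
print(len(train), len(test))
```

A classifier would be trained on each `train` index list and scored on the matching `test` list, and the per-fold scores would then be averaged.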


IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies | 2013

Evaluating English to Arabic machine translators

Taghreed M. Hailat; Mohammed N. Al-Kabi; Izzat Alsmadi; Emad M. Al-Shawakfa

Location and language are now less of a barrier to the expansion and spread of information around the world. Machine translators perform the tedious task of translating between languages quickly and reliably. However, compared with human translation, issues related to semantic meaning may always arise. Machine translators differ in their effectiveness, and they can be evaluated either by humans or by automatic methods. In this study, we attempt to evaluate the effectiveness of two popular Machine Translation (MT) systems (Google Translate and the Babylon machine translation system) in translating sentences from English to Arabic, using an automatic evaluation method called Bilingual Evaluation Understudy (BLEU). Our preliminary tests indicated that the Google Translate system is more effective than the Babylon MT system at translating English sentences into Arabic.
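For readers unfamiliar with BLEU, the sketch below shows the core of the metric: clipped n-gram precision combined by a geometric mean and scaled by a brevity penalty. It is a simplified single-reference, sentence-level variant without smoothing, written for illustration; the function names and the sample sentences are hypothetical, and the study's actual tooling and settings are not described in the abstract.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, uniform weights, no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision zeroes the score
        log_precisions.append(math.log(clipped / total))

    c, r = len(candidate), len(reference)
    # Brevity penalty: penalize candidates shorter than the reference.
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical tokenized output and reference (any language's tokens work).
reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
score = bleu(candidate, reference, max_n=2)
```

Comparing two MT systems in this style would mean scoring each system's output against the same human reference translations and comparing the resulting scores.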


International Journal of Continuing Engineering Education and Life-Long Learning | 2014

Studying and analysing students web search behaviours within three Jordanian universities

Abdullah Wahbeh; Mohammed N. Al-Kabi; Majdi Maabreh; Izzat Alsmadi; Emad M. Al-Shawakfa

The World Wide Web has become a phenomenon. The web is a crucial source of information for almost every person. People in different jobs use the internet to meet their professional needs, and university students constitute a large segment of those who use the internet frequently to satisfy their information needs. This research investigates how Jordanian university students use the web, what the characteristics of their search queries are, what the top searched queries are, and so on. These goals are achieved by extracting queries and other information from web log files and analysing them. Results showed that students are more interested in topics related to entertainment and society than in topics related to their studies. Students submit short queries consisting of 1, 2, or 3 terms, with little use of query operators. Unsurprisingly, they use Google as the preferred search engine and Arabic as the preferred search language.


International Conference on Computer, Information and Telecommunication Systems | 2012

Enhancing query retrieval efficiency using BGIT coding

Ameen A. Al-Jedady; Izzat Alsmadi; Emad M. Al-Shawakfa; Mohammed Al-Kabi

Data compression techniques are used to optimize time and space when sending and retrieving data. In information retrieval, search engines use data compression to reduce the size of their indexes, which improves the speed and performance of retrieving relevant information. The goal of this research project is to propose enhancements to search engine indexing using Bigram index term coding. The improvements in search-engine performance that result from encoding the terms of the index are also evaluated. Our experiments showed a good reduction in the size of index terms, which contributes to the overall index size. They also showed a significant reduction in the number of comparisons made to process user queries, as a result of reducing the number of symbols representing each index term.


International Journal of Advanced Computer Science and Applications | 2016

Classifying Arabic text using KNN classifier.

Amer Al-Badarenah; Emad M. Al-Shawakfa; Khaleel Al-Rababah; Safwan Shatnawi; Basel Bani-Ismail

With the tremendous amount of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. The selection of important terms is vital to classifier performance, so feature set reduction techniques such as stop word removal, stemming, and term threshold were used in this paper. Three term-selection techniques are used on a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using the full-word, stem, and root term indexing methods. A k-nearest-neighbors classifier is used in this study. The averages over all folds for Recall, Precision, Fallout, and Error-Rate were calculated. The results of the experiments carried out on the dataset show the importance of using k-fold testing since it presents the variations of averages of recall, precision, fallout, and error rate for each category over the 10-fold
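The four measures named in this abstract can be computed per category from a confusion table. The sketch below uses the standard information-retrieval definitions, which are assumed here because the abstract does not spell out the paper's formulas; the function name and the sample counts are hypothetical.

```python
def category_metrics(tp, fp, fn, tn):
    """Per-category measures from confusion counts.

    tp: documents of the category that were assigned to it
    fp: documents of other categories wrongly assigned to it
    fn: documents of the category assigned elsewhere
    tn: documents of other categories correctly kept out of it
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    total = tp + fp + fn + tn
    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "fallout": ratio(fp, fp + tn),
        "error_rate": ratio(fp + fn, total),
    }


# Hypothetical counts for one category in one test fold of 100 documents.
metrics = category_metrics(tp=18, fp=2, fn=2, tn=78)
print(metrics)
```

Averaging these dictionaries over the folds, category by category, gives the kind of per-category averages the abstract refers to.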


International Conference on Computer, Information and Telecommunication Systems | 2012

Enhancing retrieval and novelty detection for Arabic text using sentence level information pattern

Esra’a Alshdaifat; Mohammed Al-Kabi; Emad M. Al-Shawakfa; Abdulla H Wahbeh

Novelty detection is already used in many Natural Language Processing (NLP) applications, such as information retrieval systems, Web search engines, text summarization, and question answering systems. This study aims to detect novel Arabic sentence-level information patterns. It uses the Length Adjusted (LA) model, which is based on sentence-level information patterns and depends on sentence length. Test results show a significant improvement in the performance of novelty detection for Arabic texts in terms of precision at top ranks.


International Journal of Computer Processing of Languages | 2012

Evaluating Rule-Based and Statistical Filters to Detecting Arabic E-Mail Alert Messages

Emad M. Al-Shawakfa; Qasem A. Al-Radaideh; Ahmed Aleroud

Detecting and filtering e-mail alerts that are related to criminal or terrorist activities is of great interest to both security agencies and the public. This paper evaluates and compares the performance of a rule-based filter and the Paul Graham statistical filter for detecting alerts in Arabic e-mail messages. To evaluate the two filters, a set of 1500 Arabic messages related to criminal activities was collected manually from news websites such as Al-Jazeera Net and BBC Arabic news. The e-mails were preprocessed and normalized, and the relevant features were then extracted using categorical proportional difference (CPD) and term frequency variance (TFV) as feature weighting methods for the rule-based filter. To test the performance of the two filters, several experiments were conducted, and the results show that the Paul Graham statistical filter was more accurate. It was able to detect about 85% of the e-mail alerts used in the experiments. The rule-based filter achieved 80% accuracy using the CPD method and 70% accuracy using the TFV method.
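The statistical filter mentioned here scores a message by combining per-token probabilities. The sketch below shows that combining step as it is commonly described for Paul Graham's filter: take the tokens whose probabilities lie farthest from 0.5 and merge them with a naive Bayes style product. How the paper adapted the filter to Arabic alerts is not stated in the abstract, so the function names, the token table, and the defaults (0.4 for unseen tokens, 15 tokens) are assumptions for illustration.

```python
def combine(probabilities):
    """Merge per-token probabilities: prod(p) / (prod(p) + prod(1 - p)).

    Assumes every probability is strictly between 0 and 1, which is
    usually enforced by clamping token probabilities beforehand.
    """
    prod_p = 1.0
    prod_q = 1.0
    for p in probabilities:
        prod_p *= p
        prod_q *= 1.0 - p
    return prod_p / (prod_p + prod_q)


def alert_score(tokens, token_prob, unknown=0.4, top_n=15):
    """Score a message using its top_n most decisive distinct tokens."""
    probs = [token_prob.get(token, unknown) for token in set(tokens)]
    # The most "interesting" tokens are those farthest from the neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    return combine(probs[:top_n])


# Hypothetical table: probability that a message containing the token is an alert.
token_prob = {"attack": 0.95, "meeting": 0.20, "weather": 0.05}
score = alert_score(["attack", "attack", "tomorrow"], token_prob)
```

A message would be flagged as an alert when its score exceeds a chosen threshold, which would need to be tuned on labelled data.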


International Journal of Advanced Computer Science and Applications | 2011

A Comparison Study between Data Mining Tools over some Classification Methods

Abdullah Wahbeh; Qasem A. Al-Radaideh; Mohammed N. Al-Kabi; Emad M. Al-Shawakfa

Collaboration


Dive into Emad M. Al-Shawakfa's collaboration.

Top Co-Authors

Amer Al-Badarneh

Jordan University of Science and Technology
