Publication


Featured research published by Arafat Awajan.


2016 7th International Conference on Information and Communication Systems (ICICS) | 2016

Sentiment classification techniques for Arabic language: A survey

Mariam Biltawi; Wael Etaiwi; Sara Tedmori; Amjad Hudaib; Arafat Awajan

With the advent of online data, sentiment analysis has received growing attention in recent years. Sentiment analysis aims to determine the overall sentiment orientation of a speaker or writer towards a specific entity, or towards a specific feature of that entity. A fundamental task of sentiment analysis is sentiment classification, which aims to automatically classify opinionated text as positive, negative, or neutral. Although the literature on sentiment classification is quite extensive, only a few endeavors to classify opinionated text written in the Arabic language can be found. This paper provides a comprehensive survey of existing lexicon-based, machine learning, and hybrid sentiment classification techniques for the Arabic language.
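The lexicon-based family of techniques surveyed above can be sketched in a few lines: each word carries a polarity weight, and a text is classified by the sign of its summed score. This is a minimal illustration only; the toy lexicon below is a hypothetical stand-in, not one of the resources covered by the survey.

```python
# Minimal lexicon-based sentiment classification sketch:
# sum the polarity weights of known words, classify by the sign.
LEXICON = {"excellent": 1, "good": 1, "useful": 1,
           "bad": -1, "poor": -1, "disappointing": -1}

def classify(text):
    """Return 'positive', 'negative', or 'neutral' for a text."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("a good and useful survey"))  # positive
print(classify("a poor result"))             # negative
```

Machine-learning and hybrid approaches replace or augment the fixed lexicon with weights learned from labeled data.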


ACM Transactions on Asian and Low-Resource Language Information Processing | 2015

Keyword Extraction from Arabic Documents using Term Equivalence Classes

Arafat Awajan

The rapid growth of the Internet and other computing facilities in recent years has resulted in the creation of a large amount of text in electronic form, which has increased the interest in and importance of different automatic text processing applications, including keyword extraction and term indexing. Although keywords are very useful for many applications, most documents available online are not provided with keywords. We describe a method for extracting keywords from Arabic documents. This method identifies the keywords by combining linguistic and statistical analysis of the text, without using prior knowledge from its domain or information from any related corpus. The text is preprocessed to extract the main linguistic information, such as the roots and morphological patterns of derivative words. A cleaning phase is then applied to eliminate meaningless words from the text. The most frequent terms are clustered into equivalence classes in which the derivative words generated from the same root and the non-derivative words generated from the same stem are placed together, and their counts are accumulated. A vector space model is then used to capture the most frequent N-grams in the text. Experiments carried out using a real-world dataset show that the proposed method achieves good results, with an average precision of 31% and an average recall of 53% when tested against manually assigned keywords.
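The core equivalence-class step described above can be sketched as follows: words sharing a root (or stem) are merged into one class and their frequencies are accumulated. The `root()` lookup and the sample root mapping below are hypothetical stand-ins for a real Arabic morphological analyzer, not the paper's implementation.

```python
from collections import Counter

def root(word, analysis):
    """Look up a word's root; fall back to the surface form if unknown."""
    return analysis.get(word, word)

def equivalence_class_counts(words, analysis):
    """Accumulate word frequencies per root-based equivalence class."""
    counts = Counter()
    for w in words:
        counts[root(w, analysis)] += 1
    return counts

# Hypothetical analysis: three surface forms derived from the same root "ktb".
analysis = {"kitab": "ktb", "kutub": "ktb", "katib": "ktb"}
words = ["kitab", "kutub", "katib", "qalam"]
print(equivalence_class_counts(words, analysis))  # Counter({'ktb': 3, 'qalam': 1})
```

The most frequent classes then become keyword candidates, which the paper further filters with a vector space model over N-grams.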


Digital Information and Communication Technology and Its Applications | 2015

Semantic vector space model for reducing Arabic text dimensionality

Arafat Awajan

In this paper, we introduce an efficient method to represent Arabic texts in comparatively smaller sizes without losing significant information. The proposed method uses the linguistic features of the Arabic language, mainly its very productive morphology and its richness in synonyms, to reduce the dimension of the document vector and to improve its vector space model representation. We have incorporated semantic information from word thesauri such as WordNet to create clusters of similar words extracted from the same root and regrouped along with their synonyms. Distributional similarity measures are applied on the word-context matrix associated with the document in order to identify similar words based on the text's context. The experimental results have confirmed that the proposed method significantly reduces the size of the text representation: by about 20% compared with the stem-based vector space model, and by about 40% compared with the traditional bag-of-words model.
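The dimensionality reduction idea can be sketched simply: columns of a term vector whose terms fall in the same synonym cluster are summed into a single "concept" dimension. The cluster mapping below is an illustrative assumption, not a cluster produced by the paper's method.

```python
def reduce_vector(term_counts, clusters):
    """Map a term->count vector to a concept->count vector.

    `clusters` maps each term to its synonym-cluster label; terms
    without a cluster keep their own dimension.
    """
    reduced = {}
    for term, count in term_counts.items():
        concept = clusters.get(term, term)
        reduced[concept] = reduced.get(concept, 0) + count
    return reduced

# Hypothetical synonym cluster: three terms collapse into one dimension.
clusters = {"car": "vehicle", "automobile": "vehicle", "auto": "vehicle"}
counts = {"car": 2, "automobile": 1, "auto": 1, "road": 3}
print(reduce_vector(counts, clusters))  # {'vehicle': 4, 'road': 3}
```

Four dimensions become two, which is the mechanism behind the reported 20-40% size reductions.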


IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies | 2013

Extracting Arabic semantic graph from Aljazeera.net

Akram Alkouz; Arafat Awajan; Mahmoud Jeet; Abdelfattah Al-Zaqqa

Collaborative knowledge bases such as Wikipedia and Wiktionary have become valuable lexical semantic resources with high influence on diverse Natural Language Processing tasks. Arabic collaborative knowledge bases, however, are generally poorly developed and differ significantly from traditional linguistic knowledge bases in various respects. Aljazeera.net, by contrast, is professionally edited and has a rich semantic structure; it constitutes an asset, an impediment, and a challenge for research in Arabic Natural Language Processing. This paper addresses one such major impediment, namely the lack of suitable tools to access the knowledge stored in these large semantic knowledge bases. We present a framework designed for mining the explicit and implicit lexical semantic information embedded in the structure and content of Aljazeera.net, and it provides efficient, structured access to the resulting semantic graph. This framework will be freely available for research purposes to meet the needs of the Arabic Natural Language Processing community.


Computer and Information Technology | 2017

Bag-of-concept based keyword extraction from Arabic documents

Dima Suleiman; Arafat Awajan

This paper proposes a new keyword extraction method that uses a bag-of-concepts representation to extract keywords from Arabic text. The proposed algorithm utilizes a semantic vector space model instead of the traditional vector space model to group words into classes. The new method builds a word-context matrix in which synonymous words are grouped into the same class. The evaluation of the new approach was conducted using a dataset consisting of three documents and compared with the Keyword Extraction from Arabic Documents using Term Equivalence Classes method; experimental results showed that the proposed method provides significant results.


Procedia Computer Science | 2017

Statistical Arabic Name Entity Recognition Approaches: A Survey

Wael Etaiwi; Arafat Awajan; Dima Suleiman

With the increase of Arabic textual information available via internet websites and services, tools for processing Arabic text are needed to extract knowledge from it. Name Entity recognition aims to extract name entities, such as person names, locations, and organizations, from a given text. Name Entity recognition approaches are classified into two main categories: rule-based approaches and statistical approaches. Although the literature on Name Entity recognition is quite extensive, few works that extract Name Entities in the Arabic language can be found. This paper provides a comprehensive survey of statistical approaches to Arabic Name Entity extraction.


Procedia Computer Science | 2017

The Use of Hidden Markov Model in Natural Arabic Language Processing: A Survey

Dima Suleiman; Arafat Awajan; Wael Etaiwi

Hidden Markov Models are empirical tools that can be used in many applications related to natural language processing. In this paper, a comparative study was conducted between different applications in natural Arabic language processing that use Hidden Markov Models, such as morphological analysis, part-of-speech tagging, text classification, and name entity recognition. The comparative results showed that HMMs can be used at different layers of natural language processing, but mainly in the pre-processing phase, such as part-of-speech tagging, morphological analysis, and syntactic structure; in high-level applications such as text classification, their use is limited to a small number of studies.
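Part-of-speech tagging, the pre-processing use of HMMs highlighted above, comes down to Viterbi decoding: finding the most probable tag sequence given start, transition, and emission probabilities. The sketch below is a generic textbook Viterbi, not any system from the survey, and all probabilities are made-up illustrative numbers.

```python
def viterbi(words, tags, start, trans, emit):
    """Return the most probable tag sequence for `words` under an HMM."""
    # Each cell holds (probability of best path ending in tag, that path).
    V = [{t: (start[t] * emit[t].get(words[0], 1e-6), [t]) for t in tags}]
    for w in words[1:]:
        layer = {}
        for t in tags:
            prob, path = max(
                (V[-1][p][0] * trans[p][t] * emit[t].get(w, 1e-6), V[-1][p][1])
                for p in tags)
            layer[t] = (prob, path + [t])
        V.append(layer)
    return max(V[-1].values())[1]

# Toy two-tag model (noun/verb) with invented probabilities.
tags = ["N", "V"]
start = {"N": 0.7, "V": 0.3}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"book": 0.6, "flight": 0.4}, "V": {"book": 0.5, "flight": 0.5}}
print(viterbi(["book", "flight"], tags, start, trans, emit))  # ['N', 'V']
```

The same machinery underlies the morphological-analysis and syntactic uses the survey compares; only the state and observation alphabets change.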


2015 Fourth International Conference on Cyber Security, Cyber Warfare, and Digital Forensic (CyberSec) | 2015

MLDED: Multi-layer Data Exfiltration Detection System

Mohammad Ahmad Abu Allawi; Ali Hadi; Arafat Awajan

Due to the growing advancement of crimeware services, computer and network security has become a crucial issue. Detecting sensitive data exfiltration is a principal component of every information protection strategy. In this research, a Multi-Level Data Exfiltration Detection (MLDED) system that can handle different types of insider data leakage threats, with staircase difficulty levels and their implications for the organizational environment, has been proposed, implemented, and tested. The proposed system detects exfiltration of data outside an organization's information system, where the main goal is to use the detection results of the MLDED system for digital forensic purposes. The MLDED system consists of three major levels: Hashing, Keyword Extraction, and Labeling. However, it considers only certain types of documents, such as plain ASCII text and PDF files. In response to the challenging issue of identifying insider threats, a forensic-readiness data exfiltration system is designed that is capable of detecting and identifying sensitive information leaks. The results show that the proposed system has an overall detection accuracy of 98.93%.
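The layered idea behind the Hashing and Keyword Extraction levels can be sketched as a staged inspection of outgoing documents: an exact hash match catches known sensitive files, and a keyword match catches paraphrased content. This is a simplified illustration under assumed sample data, not the MLDED implementation; the Labeling level is omitted.

```python
import hashlib

# Hypothetical registry of sensitive material.
SENSITIVE_HASHES = {hashlib.sha256(b"quarterly financials").hexdigest()}
SENSITIVE_KEYWORDS = {"confidential", "salary"}

def inspect(document: str) -> str:
    """Run an outgoing document through two detection levels."""
    # Level 1 (Hashing): exact match against known sensitive files.
    if hashlib.sha256(document.encode()).hexdigest() in SENSITIVE_HASHES:
        return "block: exact sensitive file"
    # Level 2 (Keyword Extraction): registered sensitive terms.
    if set(document.lower().split()) & SENSITIVE_KEYWORDS:
        return "block: sensitive keywords"
    return "allow"

print(inspect("quarterly financials"))       # block: exact sensitive file
print(inspect("this memo is confidential"))  # block: sensitive keywords
print(inspect("lunch menu"))                 # allow
```

Later levels are only consulted when earlier, cheaper checks pass, which is what makes the staircase design practical for forensic readiness.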


International Journal of Information Technology and Web Engineering | 2010

Quality of Service for Multimedia and Real-Time Services

F. W. Albalas; B. A. Abu-Alhaija; Arafat Awajan; Khalid Al-Begain

New web technologies have encouraged the deployment of various network applications that are rich with multimedia and real-time services. These services have stringent requirements, defined through Quality of Service (QoS) parameters such as delay, jitter, and loss. To guarantee the delivery of these services, QoS routing algorithms that deal with multiple metrics are needed. Unfortunately, QoS routing with multiple metrics is an NP-complete problem that cannot be solved by a simple algorithm. This paper proposes three source-based QoS routing algorithms that find the path from the service provider to the user that best satisfies the QoS requirements for a particular service. The three algorithms use the same filtering technique to prune all paths that do not meet the requirements, which addresses the complexity of the NP-complete problem. Next, each of the three algorithms integrates a different Multiple Criteria Decision Making method to select one of the paths resulting from the route filtering: the Analytic Hierarchy Process (AHP), Multi-Attribute Utility Theory (MAUT), and Kepner-Tregoe (KT). Results show that the algorithms find a path using multiple constraints with a high ability to handle multimedia and real-time applications.
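The two-stage structure shared by the three algorithms can be sketched as: first prune any path that violates a QoS constraint, then rank the survivors with a decision-making score. A simple weighted sum stands in here for AHP/MAUT/KT, and the path metrics, limits, and weights are illustrative assumptions.

```python
def best_path(paths, limits, weights):
    """paths: {name: {metric: value}}; lower metric values are better."""
    # Filtering stage: drop paths that violate any QoS constraint.
    feasible = {name: m for name, m in paths.items()
                if all(m[k] <= limits[k] for k in limits)}
    if not feasible:
        return None
    # Decision stage: weighted sum as a stand-in for AHP/MAUT/KT.
    return min(feasible,
               key=lambda n: sum(weights[k] * feasible[n][k] for k in weights))

paths = {"A": {"delay": 30, "jitter": 5, "loss": 0.02},
         "B": {"delay": 80, "jitter": 2, "loss": 0.01},
         "C": {"delay": 20, "jitter": 9, "loss": 0.10}}
limits = {"delay": 50, "jitter": 10, "loss": 0.05}
weights = {"delay": 1.0, "jitter": 2.0, "loss": 100.0}
print(best_path(paths, limits, weights))  # A
```

Because filtering shrinks the candidate set before any multi-criteria comparison runs, the expensive decision step only sees paths that are already feasible.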


Journal of Computer Science | 2008

Mining Functional Dependency from Relational Databases Using Equivalent Classes and Minimal Cover

Jalal Atoum; Dojanah Bader; Arafat Awajan

Collaboration

Top co-authors of Arafat Awajan:

Dima Suleiman, Princess Sumaya University for Technology
Akram Alkouz, Princess Sumaya University for Technology
Hasan Al-Shalabi, Al-Hussein Bin Talal University
Wael Etaiwi, Princess Sumaya University for Technology
Sara Tedmori, Princess Sumaya University for Technology
Heba Abdel-Nabi, Princess Sumaya University for Technology
Mariam Biltawi, Princess Sumaya University for Technology
Nadim Obeid, Princess Sumaya University for Technology