Nazlia Omar

National University of Malaysia

Publication

Featured research published by Nazlia Omar.

Rough Sets and Knowledge Technology | 2010

Automatic part of speech tagging for Arabic: an experiment using Bigram hidden Markov model

Mohammed Albared; Nazlia Omar; Mohd Juzaiddin Ab Aziz; Mohd Zakree Ahmad Nazri

Part-of-speech (POS) tagging is the task of computationally determining which POS of a word is activated by its use in a particular context. A POS tagger is a useful preprocessing tool in many natural language processing (NLP) applications such as information extraction and information retrieval. In this paper, we present preliminary results of applying a bigram Hidden Markov Model (HMM) to the POS tagging problem for Arabic. In addition, we use different smoothing algorithms with the HMM to overcome the data sparseness problem. The Viterbi algorithm is used to assign the most probable tag to each word in the text. Furthermore, several lexical models have been defined and implemented to handle unknown-word POS guessing based on word substrings, i.e., prefix probability, suffix probability, or the linear interpolation of both. The average overall accuracy of this tagger is 95.8%.
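The decoding step named in the abstract above can be sketched as follows. This is a minimal illustration of Viterbi decoding for a bigram HMM, not the authors' implementation. The toy English tags, words, and probabilities are invented for the example, and the `floor` parameter is a crude stand-in for the smoothing algorithms the abstract mentions.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t]: best log-probability of any tag path ending in tag t at word i
    best = [{t: lp(start_p.get(t, 0.0)) + lp(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0.0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0.0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Hypothetical toy model (values chosen only for illustration).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.6, "dog": 0.05},
}

print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
```

Working in log space avoids numerical underflow on long sentences. A real tagger would estimate the three tables from an annotated corpus and replace the floor with a proper smoothing scheme.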

Artificial Intelligence Review | 2014

Arabic machine translation: a survey

Arwa Alqudsi; Nazlia Omar; Khalid Shaker

Although no machine learning technique fully meets human requirements, and each technique has its own advantages and disadvantages, finding a quick and efficient translation mechanism has become an urgent necessity, due to the differences between the languages spoken in the world’s communities and the vast development that has occurred worldwide. The purpose of this paper is therefore to shed light on some of the machine translation techniques available in the literature and to encourage researchers to study them. We discuss some of the linguistic characteristics of the Arabic language. Features of Arabic that are related to machine translation are discussed in detail, along with the difficulties they might present. This paper summarizes the major techniques used in machine translation from Arabic into English and discusses their strengths and weaknesses.

Asian Conference on Intelligent Information and Database Systems | 2011

Developing a competitive HMM Arabic POS tagger using small training corpora

Mohammed Albared; Nazlia Omar; Mohd Juzaiddin Ab Aziz

Part-of-speech (POS) tagging is the task of computationally determining which POS of a word is activated by its use in a particular context. POS tagging is one of the important processing steps for many natural language systems such as information extraction and question answering. This paper presents a study that aims to find an appropriate strategy for developing a fast and accurate Arabic statistical POS tagger when only a limited amount of training material is available. This is an essential factor when dealing with languages like Arabic, for which annotated resources are scarce and not easily available. Different configurations of an HMM tagger are studied: bigram and trigram models are tested, as well as different smoothing techniques. In addition, a new lexical model has been defined to handle unknown-word POS guessing based on the linear interpolation of word suffix probability and word prefix probability. Several experiments are carried out to determine the performance of the different HMM configurations with two small training corpora. The first corpus includes about 29,300 words from both Modern Standard Arabic and Classical Arabic. The second corpus is the Quranic Arabic Corpus, which consists of 77,430 words of Quranic Arabic.
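The unknown-word model described in the abstract above combines suffix and prefix evidence by linear interpolation. The sketch below shows one way this could look, assuming relative-frequency estimates, a fixed affix length `n`, and a single weight `lam`. None of these choices come from the paper, and the English toy words are hypothetical.

```python
from collections import Counter, defaultdict


def affix_tag_probs(tagged_words, n, kind):
    """Estimate P(tag | affix) from (word, tag) pairs by relative frequency."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        affix = word[:n] if kind == "prefix" else word[-n:]
        counts[affix][tag] += 1
    return {
        affix: {tag: c / sum(ctr.values()) for tag, c in ctr.items()}
        for affix, ctr in counts.items()
    }


def guess_tag_probs(word, prefix_probs, suffix_probs, n=2, lam=0.5):
    """P(tag | word) = lam * P(tag | suffix) + (1 - lam) * P(tag | prefix)."""
    p_pre = prefix_probs.get(word[:n], {})
    p_suf = suffix_probs.get(word[-n:], {})
    return {
        tag: lam * p_suf.get(tag, 0.0) + (1 - lam) * p_pre.get(tag, 0.0)
        for tag in set(p_pre) | set(p_suf)
    }


# Hypothetical toy training data (illustration only).
TRAIN = [
    ("walking", "VERB"), ("talking", "VERB"), ("building", "NOUN"),
    ("unhappy", "ADJ"), ("unkind", "ADJ"), ("undo", "VERB"),
]
PREFIX = affix_tag_probs(TRAIN, 2, "prefix")
SUFFIX = affix_tag_probs(TRAIN, 2, "suffix")

probs = guess_tag_probs("unwinding", PREFIX, SUFFIX)
print(max(probs, key=probs.get), probs)
```

In practice the interpolation weight would be tuned on held-out data, and the resulting distribution would serve as the emission estimate for words unseen in training.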

Asia Information Retrieval Symposium | 2014

A Comparative Study of Feature Selection and Machine Learning Algorithms for Arabic Sentiment Classification

Nazlia Omar; Mohammed Albared; Tareq Al-Moslmi; Adel Al-Shabi

Sentiment analysis is a very challenging and important task that involves natural language processing, web mining, and machine learning. Sentiment analysis in Arabic is more challenging than in other languages due to the morphological complexity of Arabic and the large variation among its dialects. This paper presents an empirical comparison of seven feature selection methods (Information Gain, Principal Components Analysis, Relief-F, Gini Index, Uncertainty, Chi-squared, and Support Vector Machines (SVMs)) and three machine learning classifiers (SVM, Naive Bayes, and K-nearest neighbor) for Arabic sentiment classification. A wide range of comparative experiments is conducted on an opinion corpus for Arabic (OCA). This paper demonstrates that feature selection does improve the performance of Arabic sentiment-based classification, but the result depends on the method used and the number of features selected. The experimental results demonstrate that feature reduction methods improve classifier performance. Moreover, the experimental results indicate that SVM-based feature selection yields the best performance for feature selection and that the SVM classifier outperforms the other techniques for Arabic sentiment-based classification. Finally, the experiments indicate that the SVM classifier combined with the SVM-based feature selection method is the best classification approach, with an accuracy of 92.4%.
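To make the feature selection step concrete, here is a small sketch of one of the seven methods listed above, chi-squared scoring, using the standard 2x2 contingency formula for a term and a class. The toy reviews, the `pos`/`neg` labels, and the function names are hypothetical. The sketch covers only term ranking, not the classifiers compared in the paper.

```python
def chi_squared(docs, labels, term, positive="pos"):
    """Chi-squared statistic between a term's presence and the positive class.

    docs   -- list of token sets, one per document
    labels -- list of class labels aligned with docs
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        is_pos = label == positive
        if has_term and is_pos:
            a += 1  # term present, positive class
        elif has_term:
            b += 1  # term present, other class
        elif is_pos:
            c += 1  # term absent, positive class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest chi-squared score."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return sorted(vocab, key=lambda w: chi_squared(docs, labels, w), reverse=True)[:k]


# Hypothetical toy corpus (illustration only).
DOCS = [{"great", "film"}, {"great", "acting"}, {"boring", "film"}, {"boring", "plot"}]
LABELS = ["pos", "pos", "neg", "neg"]

print(top_k_terms(DOCS, LABELS, 2))
```

Terms that occur equally often in both classes, such as "film" here, score zero and are dropped first, which is how feature selection reduces the vocabulary before a classifier is trained.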

2nd International Multi-Conference on Artificial Intelligence Technology, M-CAIT 2013 | 2013

Soft Computing Applications and Intelligent Systems: Second International Multi-Conference on Artificial Intelligence Technology, M-CAIT 2013, Shah Alam, August 28-29, 2013, Proceedings

Shahrul Azman Mohd Noah; Azizi Abdullah; Haslina Arshad; Azuraliza Abu Bakar; Zulaiha Ali Othman; Shahnorbanun Sahran; Nazlia Omar; Zalinda Othman

Determining real-world coordinates from image coordinates has many applications in computer vision. This paper proposes an algorithm, based on simple analytic geometry, for determining the real-world coordinates of a point on a plane from its image coordinates using a single calibrated camera. Experiments were carried out using images of a chessboard pattern taken from five different views. The results show that the exact real-world coordinates and their approximations lie on the same plane and that there is no significant difference between them.
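The abstract above does not spell out its geometry, so the following is only a generic sketch of one common way to recover a planar point from a pixel: back-project the pixel as a ray and intersect it with the plane. It assumes a pinhole camera with no skew or distortion, the convention x_cam = R * X_world + t, and a world plane at Z = 0. The function name and all numeric values are hypothetical.

```python
def image_to_plane(u, v, fx, fy, cx, cy, R, t):
    """Back-project pixel (u, v) onto the world plane Z = 0.

    fx, fy, cx, cy -- intrinsics of an assumed skew-free pinhole camera
    R, t           -- extrinsics (3x3 nested list, length-3 sequence) such that
                      x_cam = R @ X_world + t
    Returns the (X, Y) world coordinates of the point on the plane.
    """
    # Direction of the viewing ray in camera coordinates.
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the ray into world coordinates: d = R^T d_cam.
    d = [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]
    # Camera centre in world coordinates: C = -R^T t.
    c = [-sum(R[k][i] * t[k] for k in range(3)) for i in range(3)]
    if abs(d[2]) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane Z = 0")
    s = -c[2] / d[2]  # ray parameter at which Z becomes 0
    return (c[0] + s * d[0], c[1] + s * d[1])


# Hypothetical setup: camera 5 units from the plane, axes aligned with the world.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x, y = image_to_plane(420.0, 440.0, 500.0, 500.0, 320.0, 240.0, IDENTITY, (0.0, 0.0, 5.0))
print(round(x, 6), round(y, 6))
```

With these values the world point (1, 2, 0) projects to pixel (420, 440), so the call recovers approximately (1, 2). In a real setup the intrinsics and extrinsics would come from calibration, for example with a chessboard pattern.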

International Conference on Asian Digital Libraries | 2007

Semantic similarity measures for Malay sentences

Shahrul Azman Mohd Noah; Amru Yusrin Amruddin; Nazlia Omar

The concept of semantic similarity is an important element in many applications such as information extraction, information retrieval, document clustering and ontology learning. Most previous semantic similarity measures have been defined between words or concepts (i.e. word-to-word similarity), thus ignoring the text or sentence in which the concepts participate. Semantic text similarity was made possible by the availability of semantic lexicons such as WordNet for English and GermaNet for German. However, for languages such as Malay, text similarity has proved difficult due to the unavailability of similar resources. This paper describes our approach to text similarity in the Malay language. We used a preprocessed Malay dictionary and the overlap edge-counting-based method to first calculate word-to-word semantic similarity. The word-to-word semantic similarity measure is then used to identify semantic sentence similarity using a modified version of an approach for the English language. The experimental results are very encouraging and indicate the potential of semantic similarity measures for Malay sentences.

International Conference on Information Technology | 2014

Study on feature selection and machine learning algorithms for Malay sentiment classification

Ahmed Alsaffar; Nazlia Omar

Online social media is used to express the sentiments of different individuals about various subjects. Sentiment analysis, or opinion mining, has recently been considered one of the most dynamic research fields in natural language processing, Web mining, and machine learning. Very little research has focused on sentiment analysis in the Malay language. This study investigates how feature selection methods contribute to improving Malay sentiment classification performance. Three supervised machine learning classifiers and seven feature selection methods are used in a series of experiments to select appropriate methods for the automatic sentiment classification of online reviews written in Malay. The findings show that the classification of Malay sentiment improves when feature selection approaches are used. This work demonstrates that all feature reduction methods generally improve classifier performance. The Support Vector Machine (SVM) approach provides the highest feature selection accuracy for classifying Malay sentiment compared with other approaches such as PCA and chi-square. SVM records an experimental accuracy of 87% for feature selection.

International Conference on Information Technology | 2014

Automatic Arabic text summarization using clustering and keyphrase extraction

Hamzah Noori Fejer; Nazlia Omar

As the number of electronic documents increases rapidly, the need for faster techniques to assess the relevance of these documents emerges. A summary is a concise representation of the underlying text. A full understanding of the document is essential to form an ideal summary. However, achieving full understanding is either difficult or impossible for computers. Therefore, selecting important sentences from the original text and presenting them as a summary is the most common technique in automated text summarization. This paper proposes a hybrid clustering method (partitioning and hierarchical) to group many Arabic documents into several clusters. A keyphrase extraction module is then applied to extract important keyphrases from each cluster, which helps identify the most important sentences and find similar sentences based on several similarity algorithms. It is applied to extract one sentence from a group of similar sentences while ignoring the other similar sentences (i.e., sentences whose similarity exceeds the predefined threshold). This model is designed for both single- and multi-document Arabic text summarization. The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric is used for the evaluation. For the summarization dataset, the Essex Arabic Summaries Corpus was used; it has many topic-based articles with multiple human summaries. This model achieved an accuracy of 80% for single-document and 62% for multi-document summarization.
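The redundancy step described above, keeping one sentence from each group of similar sentences, can be illustrated with a short sketch. It assumes whitespace tokenization, bag-of-words cosine similarity, and a greedy pass in input order. The abstract mentions several similarity algorithms without naming them, so these choices, the threshold value, and the English toy sentences are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sentences using raw term counts."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(x * x for x in va.values()))
            * math.sqrt(sum(x * x for x in vb.values())))
    return dot / norm if norm else 0.0


def drop_redundant(sentences, threshold=0.7):
    """Keep a sentence only if it is not too similar to any already kept one."""
    kept = []
    for sentence in sentences:
        if all(cosine(sentence, k) <= threshold for k in kept):
            kept.append(sentence)
    return kept


# Hypothetical ranked candidate sentences (illustration only).
CANDIDATES = [
    "the summit opened in doha today",
    "the summit opened today in doha",
    "oil prices fell sharply",
]

print(drop_redundant(CANDIDATES))
```

Because the pass is greedy, the order of the input matters: if candidates are sorted by importance first, the highest-ranked sentence of each similar group is the one that survives.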

International Conference on Electrical Engineering and Informatics | 2009

Developing learning software for children with learning disabilities through Block-Based development approach

Afiza Ismail; Nazlia Omar; Abdullah Mohd Zin

Children with learning disabilities such as autism, who have serious impairments in social, emotional, and communication skills, require a high degree of personalization in the educational software developed for them. The aim of this paper is to propose a Block-Based Software Development method and approach that enables end-users (such as parents and teachers) to build application software that suits the different needs of an autistic child. This research can hopefully produce useful, tailorable learning software to assist in educating autistic children.

Natural Language Engineering | 2017

Mapping Arabic WordNet synsets to Wikipedia articles using monolingual and bilingual features

Abdulgabbar Saif; Mohd Juzaiddin Ab Aziz; Nazlia Omar

The alignment of WordNet and Wikipedia has received wide attention from researchers in computational linguistics who are building new lexical knowledge sources or enriching the semantic information of WordNet entities. The main challenge of this alignment is how to handle synonymy and ambiguity in the contents of two units from different sources. This paper therefore introduces a mapping method that links an Arabic WordNet synset to its corresponding article in Wikipedia. The method uses monolingual and bilingual features to overcome the lack of semantic information in Arabic WordNet. To evaluate the method, an Arabic mapping data set containing 1,291 synset–article pairs was compiled. The experimental analysis shows that the proposed method achieves promising results and outperforms state-of-the-art methods that depend only on monolingual features. The mapping method has also been used to increase the coverage of Arabic WordNet by inserting new synsets from Wikipedia.
