Khairullah Khan
Universiti Teknologi Petronas
Publication
Featured research published by Khairullah Khan.
IEEE International Conference on Digital Ecosystems and Technologies | 2009
Khairullah Khan; Baharum Baharudin; Aurangzeb Khan; Fazal-e-Malik
Opinion mining is a process for automatically extracting knowledge from the opinions of others about a particular topic or problem. With the growing availability of online resources and the popularity of fast, rich channels for sharing opinions, such as online review sites and personal blogs, opinion mining has become an interesting area of research. The World Wide Web is the fastest medium for collecting opinions from users. Human perception and user opinion have great potential for knowledge discovery and decision support. In this paper we present a survey of techniques and methods that promise to enable us to extract opinion-oriented information from text. This research effort deals with the techniques and challenges related to sentiment analysis and opinion mining. We followed a systematic literature review process to conduct this survey. Our focus was mainly on machine learning techniques, on the basis of their usage and importance for opinion mining. We have tried to identify the most commonly used classification techniques for opinionated documents to assist future research in this area.
Frontiers of Information Technology | 2010
Aurangzeb Khan; Baharum Baharudin; Khairullah Khan
Sentiment analysis is the process of analyzing and classifying the content of reviews about a product, event, place, etc. into positive, negative or neutral opinion. In this paper, we propose a sentence-level machine learning approach for sentiment classification of online reviews. The proposed method extracts the subjective sentences from the reviews and labels each sentence as either positive or negative based on its word-level features using a Naïve Bayesian (NB) classifier. The labeled sentences form an annotated set of sentences called the BOS (Bag-of-Sentences). We train a Support Vector Machine (SVM) classifier on the BOS for sentence polarity classification. The contextual information in each sentence structure is taken into consideration to calculate the semantic orientation. The effectiveness of the proposed method is evaluated through simulation. Results show that our proposed machine learning method achieves an average accuracy of 81%, and 83% with some contextual information. Unlike work based on word-level lexical features, this method improves sentiment polarity classification at the sentence level by focusing on sentences and on their contextual information.
International Conference on Software Engineering and Computer Systems | 2011
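The first stage described in this abstract, labeling sentences with a Naïve Bayes classifier over word-level features, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the training sentences, the function names `train_nb` and `classify_nb`, and the use of add-one smoothing are all assumptions, and the later SVM stage is not shown.

```python
import math
from collections import Counter

def train_nb(labeled_sentences):
    """Fit a multinomial Naive Bayes model from (sentence, label) pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training sentences
    vocab = set()
    for text, label in labeled_sentences:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify_nb(model, sentence):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in sentence.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training data, for illustration only.
TRAIN = [
    ("the battery life is great", "positive"),
    ("excellent screen and great sound", "positive"),
    ("the battery died quickly", "negative"),
    ("terrible screen and poor sound", "negative"),
]
model = train_nb(TRAIN)

# The NB-labeled sentences play the role of the annotated "Bag-of-Sentences".
bag_of_sentences = [(s, classify_nb(model, s)) for s in ["great sound", "poor battery"]]
```

In the approach the abstract describes, a set like `bag_of_sentences` would then serve as training data for a second classifier (an SVM in the paper).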
Aurangzeb Khan; Baharum Baharudin; Khairullah Khan
Sentiment analysis is the procedure by which information is extracted from the opinions, appraisals and emotions of people with regard to entities, events and their attributes. In decision making, the opinions of others have a significant effect on the ease with which customers make choices regarding online shopping, events, products, entities, etc. When an important decision needs to be made, consumers usually want to know the opinions, sentiments and emotions of others. With rapidly growing online resources such as online discussion groups, forums and blogs, people are commenting via the Internet. As a result, a vast and growing amount of new data is being generated in the form of customer reviews, comments and opinions about products, events and entities. It is therefore desirable to develop an efficient and effective sentiment analysis system for online customer reviews and comments. In this paper, a rule-based, domain-independent sentiment analysis method is proposed. The proposed method classifies sentences from reviews and blog comments as subjective or objective. The semantic scores of subjective sentences are extracted from SentiWordNet to calculate their polarity as positive, negative or neutral based on the contextual sentence structure. The results show the effectiveness of the proposed method, which outperforms word-level and machine learning methods. The proposed method achieves an accuracy of 97.8% at the feedback level and 86.6% at the sentence level.
International Symposium on Information Technology | 2010
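A rule-based, lexicon-driven scorer of the kind this abstract outlines might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's rule set: the tiny `LEXICON` stands in for SentiWordNet scores, the single negation rule is a simplified example of using sentence context, and sentences containing no lexicon word are treated as objective.

```python
import re

# Hypothetical polarity scores standing in for SentiWordNet entries.
LEXICON = {"good": 0.6, "excellent": 0.9, "bad": -0.6, "poor": -0.5, "slow": -0.3}
NEGATIONS = {"not", "never", "no"}

def sentence_score(sentence):
    """Return (is_subjective, score); a negation flips the next sentiment word."""
    score, negate, subjective = 0.0, False, False
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in NEGATIONS:
            negate = True
        elif token in LEXICON:
            value = LEXICON[token]
            score += -value if negate else value
            negate, subjective = False, True
    return subjective, score

def polarity(score, threshold=0.1):
    """Map a numeric score to positive, negative or neutral."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

def review_polarity(review):
    """Feedback-level polarity: sum the scores of the subjective sentences only."""
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    scored = [sentence_score(s) for s in sentences]
    return polarity(sum(score for is_subjective, score in scored if is_subjective))
```

For example, `sentence_score("The camera is not good")` is marked subjective with a negative score, while `"It ships in a box"` contains no lexicon word and is treated as objective.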
Aurangzeb Khan; Baharum Baharudin; Khairullah Khan
Feature selection and weighting are of vital concern in the text classification process, as they improve the efficiency and accuracy of the text classifier. The Vector Space Model is used to represent documents using the “Bag of Words” (BOW) model with term weighting. Document representation through this model has some limitations: it ignores term dependencies and the structure and ordering of the terms in documents. To overcome this problem, a Semantics-Based Feature Vector using Part of Speech (POS) is proposed, which is used to extract the concept of terms using WordNet, co-occurring terms and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency/inverse document frequency (TF-IDF) with the BOW feature selection method for text classification.
International Conference on Intelligent and Advanced Systems | 2012
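For reference, the TF-IDF-weighted bag-of-words baseline that this abstract compares against can be sketched as below. This is one common formulation (relative term frequency times the natural log of inverse document frequency), chosen here as an assumption; the abstract does not say which variant the authors used, and the function name `tf_idf` is illustrative.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy corpus, for illustration only.
vectors = tf_idf(["the cat sat", "the dog barked", "the cat ran"])
```

Because each vector is keyed only by term, word order and term dependencies are lost, which is the limitation of the BOW representation that the abstract points out.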
Khairullah Khan; Baharum Baharudin
Collecting consumer opinions about products through the web is becoming more popular day by day. The opinions of users are helpful to consumers, retailers and manufacturers in decision making. Due to the huge number of user reviews, it is impossible to summarize them manually. Therefore, systems are required for mining consumer review data efficiently. Opinion mining is an interesting area of research due to its applications in various fields. One of the challenging issues in this area is the identification of opinion components from unstructured reviews. The work of opinion mining is natural language dependent; therefore, syntactic patterns play a key role in identifying the opinion components. In this paper we present an analysis of syntactic patterns for identifying product features in unstructured reviews. Noun phrases are typically used for named entity identification; however, not all noun phrases are features. The problem is how to restrict the patterns to obtain the features. After in-depth analysis and evaluation, we identify a new pattern which shows comparatively the best results.
Scalable Information Systems | 2018
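To illustrate why unrestricted noun-phrase patterns over-generate, the sketch below extracts maximal runs of noun tags from a pre-tagged sentence. It is a generic baseline pattern and not the new pattern the paper reports, which the abstract does not specify. The Penn Treebank style tags, the sample sentence and the function name are assumptions.

```python
def noun_phrase_candidates(tagged_tokens):
    """Collect maximal runs of noun-tagged words (NN, NNS, ...) as candidates."""
    candidates, current = [], []
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            current.append(word)
        else:
            if current:
                candidates.append(" ".join(current))
            current = []
    if current:  # flush a run that ends the sentence
        candidates.append(" ".join(current))
    return candidates

# Hypothetical hand-tagged review sentence, for illustration only.
TAGGED = [
    ("The", "DT"), ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("amazing", "JJ"), ("but", "CC"), ("my", "PRP$"), ("friend", "NN"),
    ("disagrees", "VBZ"),
]
candidates = noun_phrase_candidates(TAGGED)
```

Here the pattern returns both "battery life", a plausible product feature, and "friend", which is a noun but not a feature. A more restrictive pattern would be needed to filter out the second candidate.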
Sadiq Nawaz Khan; Khairullah Khan; Wahab Khan
Urdu is the national language of Pakistan and among the most widely spoken and understood languages in the world. To accomplish successful Urdu NLP, robust, high-performance NLP tools and resources are essential. Word segmentation plays an important role for morphologically rich languages such as Urdu in diverse NLP domains such as named entity recognition, sentiment analysis, part-of-speech tagging, information retrieval, etc. The morphological richness of Urdu adds to the challenges of the word segmentation task, because a single word can be composed of zero or a few prefixes, a stem, and zero or a few suffixes. In this paper we present a supervised Urdu word segmentation scheme based on the part-of-speech (POS) information of the corresponding words. For the experiments, conditional random fields (CRF) with contextual features are used. The performance of the proposed system is evaluated on 300K words, and the results show evident improvements over the baseline approach.
Journal of Advances in Information Technology | 2010
Baharum Baharudin; Lam Hong Lee; Khairullah Khan
Journal of King Saud University - Computer and Information Sciences | 2014
Khairullah Khan; Baharum Baharudin; Aurangzeb Khan; Ashraf Ullah
Trends in Applied Sciences Research | 2011
Aurangzeb Khan; Baharum Baharudin; Khairullah Khan
The International Arab Journal of Information Technology | 2014
Khairullah Khan; Baharum Baharudin; Aurangzeb Khan