Yogan Jaya Kumar

Universiti Teknikal Malaysia Melaka

Publication

Featured research published by Yogan Jaya Kumar.

Applied Soft Computing | 2015

A framework for multi-document abstractive summarization based on semantic role labelling

Atif Khan; Naomie Salim; Yogan Jaya Kumar

We have proposed a framework for multi-document abstractive summarization based on semantic role labeling (SRL). To the best of our knowledge, SRL has not previously been employed for abstractive summarization. Integrating a genetic algorithm with the SRL-based framework gives improved summarization results. This study focuses on two highlights, and the discussion is based on them. We propose a framework for abstractive summarization of multiple documents, which aims to select the contents of the summary not from the source document sentences but from the semantic representation of the source documents. In this framework, the contents of the source documents are represented by predicate argument structures obtained through semantic role labeling. Content selection for the summary is made by ranking the predicate argument structures based on optimized features, and language generation is used to generate sentences from the predicate argument structures. Our proposed framework differs from other abstractive summarization approaches in a few aspects. First, it employs semantic role labeling for the semantic representation of text. Second, it analyzes the source text semantically, using a semantic similarity measure to cluster semantically similar predicate argument structures across the text. Finally, it ranks the predicate argument structures based on features weighted by a genetic algorithm (GA). The experiment in this study is carried out using DUC-2002, a standard corpus for text summarization. Results indicate that the proposed approach performs better than other summarization systems.

2012 International Conference on Information Retrieval & Knowledge Management | 2012

Text summarization features selection method using pseudo Genetic-based model

Albaraa Abuobieda; Naomie Salim; Ameer Tawfik Albaham; Ahmed Hamza Osman; Yogan Jaya Kumar

Features are considered the cornerstone of text summarization. The most important issue is which features to consider in the summarization process. Including all features may not be an optimal solution, so other methods need to be deployed. In this paper, five random features are used and investigated using a (pseudo) genetic concept as an optimized, trainable feature selection mechanism. The Document Understanding Conference (DUC2002) data set is used to train the proposed model; the objective of this paper is to learn the weight (importance) of each feature used. For each input document, using the genetic concept, the size of the generation is defined and the chromosome dimension (number of genes) is equal to the number of features used. Each gene represents a feature and is in binary format. A chromosome with a high fitness value is selected to be enrolled in the final round. The average of each gene is computed over all best chromosomes and taken as the weight of that feature. Our experimental results show that the proposed model is able to perform the feature selection process.
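
The gene-averaging step described in this abstract can be illustrated with a minimal sketch. It assumes one best binary chromosome has already been selected per document; the function name `feature_weights` and the sample chromosomes are hypothetical, and the fitness evaluation and selection rounds of the paper are not reproduced here.

```python
from typing import List


def feature_weights(best_chromosomes: List[List[int]]) -> List[float]:
    """Average each binary gene across the best chromosomes.

    Each chromosome has one gene per feature (1 = feature used, 0 = not used).
    The mean of a gene over all best chromosomes is taken as that feature's weight.
    """
    if not best_chromosomes:
        raise ValueError("need at least one chromosome")
    n_features = len(best_chromosomes[0])
    if any(len(c) != n_features for c in best_chromosomes):
        raise ValueError("all chromosomes must have the same number of genes")
    n = len(best_chromosomes)
    return [sum(c[i] for c in best_chromosomes) / n for i in range(n_features)]


# Hypothetical best chromosomes for four documents and five features.
best = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
]
weights = feature_weights(best)
print(weights)
```

A feature that appears in every best chromosome receives weight 1.0, while one that rarely appears receives a weight near 0.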

Applied Soft Computing | 2014

Multi document summarization based on news components using fuzzy cross-document relations

Yogan Jaya Kumar; Naomie Salim; Albaraa Abuobieda; Ameer Tawfik Albaham

Online information is growing enormously day by day thanks to the World Wide Web. Search engines often provide users with an abundant collection of articles, in particular news articles retrieved from different news sources reporting on the same event. In this work, we aim to produce high-quality multi-document news summaries by taking into account the generic components of a news story within a specific domain. We also present an effective method, named Genetic-Case Base Reasoning, to identify cross-document relations from un-annotated texts. Following that, we propose a new sentence scoring model based on fuzzy reasoning over the identified cross-document relations. The experimental findings show that the proposed approach performs better than the conventional graph-based and cluster-based approaches.

Asian Conference on Intelligent Information and Database Systems | 2013

An improved evolutionary algorithm for extractive text summarization

Albaraa Abuobieda; Naomie Salim; Yogan Jaya Kumar; Ahmed Hamza Osman

The main challenge of extractive text summarization is selecting the most representative sentences from the input document. Several techniques have been proposed to enhance the selection process, such as feature-based, cluster-based, and graph-based methods. This paper enhances a previous work and points out some limitations in that work's similarity calculation. It proposes an enhanced approach that mixes feature-based and cluster-based methods to produce a high-quality single-document summary. We used the Jaccard similarity measure to adjust the sentence clustering process instead of the Normalized Google Distance (NGD) similarity measure. In addition, this paper proposes a new real-to-integer value modulator in place of the genetic mutation operator adopted in the previous work. The Differential Evolution (DE) algorithm is used to train and test the proposed methods. The DUC2002 dataset was preprocessed and used as a test bed. The results show that our proposed differential mutant delivered satisfactory performance, while the genetic mutant proved to be the better of the two. In addition, our analysis of NGD similarity scores showed that NGD was an inappropriate choice in the previous study, as it performs well on a very large database such as Google. Our choice of the Jaccard measure proved effective, obtaining results superior to NGD with both the newly proposed modulator and the genetic operator. In addition, both algorithms outperformed the standard baselines, Microsoft Word Summarizer and Copernic.
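
For reference, the Jaccard similarity mentioned in this abstract is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to two sentences, assuming lowercase whitespace tokenization; the tokenization and the function name `jaccard` are illustrative assumptions, not the preprocessing used in the paper.

```python
def jaccard(sentence_a: str, sentence_b: str) -> float:
    """Jaccard similarity between the word sets of two sentences."""
    words_a = set(sentence_a.lower().split())
    words_b = set(sentence_b.lower().split())
    union = words_a | words_b
    if not union:
        # Two empty sentences: define the similarity as 0 to avoid dividing by zero.
        return 0.0
    return len(words_a & words_b) / len(union)


# Shared words: {"the", "cat"}; union: {"the", "cat", "sat", "ran"}.
score = jaccard("The cat sat", "the cat ran")
print(score)
```

Unlike NGD, this measure needs only the two sentences themselves, with no external corpus statistics.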

International Conference on Computing, Electrical and Electronic Engineering | 2013

Multi document summarization based on cross-document relation using voting technique

Yogan Jaya Kumar; Naomie Salim; Albaraa Abuobieda; Ameer Tawfik

News articles available through online search often present readers with a large collection of texts. Especially in the case of news stories, different news sources reporting on the same event usually return multiple articles in response to a reader's search. In this work, we first identify cross-document relations from un-annotated texts using a Genetic-CBR approach. Following that, we develop a new sentence scoring model based on a voting technique over the identified cross-document relations. Our experiments show that incorporating the proposed methods in the summarization process yields substantial improvement over the mainstream methods. The performance of all methods was evaluated using ROUGE, a standard evaluation metric used in text summarization.

Journal of Computer Science | 2016

A Review on Automatic Text Summarization Approaches

Yogan Jaya Kumar; Ong Sing Goh; Halizah Basiron; Ngo Hea Choon; Puspalata C Suppiah

It has been more than 50 years since the initial investigation of automatic text summarization began. Various techniques have been successfully used to extract the important content from a text document to represent the document summary. In this study, we review some of the studies that have been conducted in this still-developing research area. It covers the basics of text summarization, the types of summarization, the methods that have been used and some areas in which text summarization has been applied. Furthermore, this paper reviews the significant efforts that have been put into studies concerning sentence extraction, domain-specific summarization and multi-document summarization, and provides the theoretical explanation and fundamental concepts related to them. In addition, the advantages and limitations of the approaches commonly used for text summarization are highlighted in this study.

International Conference on Digital Information Processing and Communications | 2015

Genetic semantic graph approach for multi-document abstractive summarization

Atif Khan; Naomie Salim; Yogan Jaya Kumar

The aim of automatic multi-document abstractive summarization is to create a compressed version of the source text that preserves the salient information. Existing graph-based summarization methods treat a sentence as a bag of words, rely on a content similarity measure and do not consider semantic relationships between sentences. These methods may fail to detect redundant sentences that are semantically equivalent. This paper introduces a genetic semantic graph-based approach for multi-document abstractive summarization. A semantic graph is constructed from the document set in such a way that the graph nodes represent the predicate argument structures (PASs), extracted automatically by semantic role labeling (SRL), and the graph edges correspond to semantic similarity weights determined from PAS-to-PAS semantic similarity and the PAS-to-document set relationship. The PAS-to-document set relationship is represented by different features, weighted and optimized by a genetic algorithm. The salient graph nodes (PASs) are ranked using a modified graph-based ranking algorithm. To reduce redundancy, we use maximal marginal relevance (MMR) to re-rank the PASs and use language generation to generate summary sentences from the top-ranked PASs. The experiment in this study is carried out using DUC-2002, a standard corpus for text summarization. Experimental results reveal that the proposed approach performs better than other summarization systems.
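
The MMR re-ranking step mentioned in this abstract can be sketched in its generic greedy form: at each step, pick the candidate that maximizes lam * relevance minus (1 - lam) * its maximum similarity to the items already selected. The sketch assumes relevance scores and a pairwise similarity function are already available; the item IDs, scores, similarity values and the `lam` setting are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List


def mmr_rerank(
    relevance: Dict[str, float],
    similarity: Callable[[str, str], float],
    lam: float = 0.7,
    k: int = 3,
) -> List[str]:
    """Greedy maximal marginal relevance selection of up to k items."""
    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def mmr_score(item: str) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical PAS identifiers: "a" and "b" are near-duplicates of each other.
scores = {"a": 0.9, "b": 0.85, "c": 0.5}
pair_sim = {frozenset(("a", "b")): 0.95}


def sim(x: str, y: str) -> float:
    return pair_sim.get(frozenset((x, y)), 0.1)


order = mmr_rerank(scores, sim, lam=0.5, k=3)
print(order)
```

In this example the less relevant but novel item "c" is ranked ahead of "b", because "b" is penalized for its similarity to the already selected "a".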

Asian Conference on Intelligent Information and Database Systems | 2013

Opposition differential evolution based method for text summarization

Albaraa Abuobieda; Naomie Salim; Yogan Jaya Kumar; Ahmed Hamza Osman

Evolutionary Algorithms (EAs) save sufficient data about problem features, the search space, and population information during runtime. Accordingly, machine learning (ML) techniques have been employed to examine these data and improve the search performance of EAs compared with their classical versions. This paper employs Opposition-Based Learning (OBL) as an ML approach for enhancing the initial population of the Differential Evolution (DE) algorithm in the problem of text summarization. In addition, it investigates the use of the OBL technique in integer-based evolutionary populations. The objective of the proposed enhancement is to adjust the booting of the algorithm instead of relying on random number generation only. All methodology steps in this paper were presented in a previous study; the differences between the two are shown later. This paper therefore tries to estimate the size of the improvement that OBL can achieve and compares the results with a traditional DE-based text summarization application and other baseline methods. The DUC2002 data set was used as a test bed and the ROUGE toolkit was used to evaluate the performance of the methods. The experimental results confirmed the need to learn and improve random-based EAs before proceeding to generate solutions. The study findings show that the proposed method outperformed a classical DE and other baseline methods in terms of F-measure. OBL has been broadly tested before on numerical test beds; in this paper it is tested on a text-based test bed of news articles for the text summarization problem.
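
As a rough illustration of opposition-based population initialization in its commonly described form: for a value x in the range [low, high], the opposite point is low + high - x, and the fittest individuals from the union of a random population and its opposites form the initial population. The sketch assumes integer genes and a caller-supplied fitness function to maximize; the function names, bounds and toy fitness are hypothetical and are not the paper's summarization setup.

```python
import random
from typing import Callable, List


def opposite(individual: List[int], low: int, high: int) -> List[int]:
    """Opposite point of an integer vector within [low, high]."""
    return [low + high - x for x in individual]


def obl_init(
    pop_size: int,
    dim: int,
    low: int,
    high: int,
    fitness: Callable[[List[int]], float],
    seed: int = 0,
) -> List[List[int]]:
    """Keep the fittest pop_size individuals from a random population and its opposites."""
    rng = random.Random(seed)
    population = [[rng.randint(low, high) for _ in range(dim)] for _ in range(pop_size)]
    opposites = [opposite(ind, low, high) for ind in population]
    merged = population + opposites
    merged.sort(key=fitness, reverse=True)  # higher fitness is better
    return merged[:pop_size]


# Toy fitness for illustration only: the sum of the genes.
initial = obl_init(pop_size=4, dim=3, low=1, high=10, fitness=sum)
print(initial)
```

Because the opposite of an integer within integer bounds is itself an integer within those bounds, the same formula carries over to integer-based populations.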

International Conference on Advanced Machine Learning Technologies and Applications | 2012

Fuzzy Semantic Plagiarism Detection

Ahmed Hamza Osman; Naomie Salim; Yogan Jaya Kumar; Albaraa Abuobieda

This paper introduces a plagiarism detection scheme based on a Fuzzy Inference System and Semantic Role Labeling (FIS-SRL). The proposed technique analyses and compares text based on a semantic allocation for each term inside the sentence. SRL offers significant advantages when generating arguments for each sentence semantically. Voting on each argument generated by the FIS, in order to select important arguments, is another feature of the proposed method. It has been concluded that not all arguments in the text affect the plagiarism detection process. Therefore, only the most important arguments were selected by the FIS, and the results have been used in the similarity calculation process. Experimental tests were run on the PAN-PC-09 data set, and the results show that the proposed method performs better than recent available methods of plagiarism detection in terms of recall, precision and F-measure.

Asian Conference on Intelligent Information and Database Systems | 2017

Text Summarization Based on Classification Using ANFIS

Yogan Jaya Kumar; Fong Jia Kang; Ong Sing Goh; Atif Khan

The information overload faced by today’s society has created a big challenge for people who want to find relevant information on the internet. A lot of online documents are available, and digesting such large text collections is not an easy task. Hence, automatic text summarization is required to automate the process of summarizing text by extracting only the salient information from the documents. In this paper, we propose a text summarization model based on classification using an Adaptive Neuro-Fuzzy Inference System (ANFIS). The model can learn to filter high-quality summary sentences. We then compare the performance of our proposed model with existing approaches based on neural network and fuzzy logic techniques. ANFIS was able to alleviate the limitations of the existing approaches, and the experimental findings of this study show that the proposed model yields better results in terms of precision, recall and F-measure on the Document Understanding Conference (DUC) data corpus.
