Luciano de Souza Cabral

Publication

Featured research published by Luciano de Souza Cabral.

Expert Systems With Applications | 2013

Assessing sentence scoring techniques for extractive text summarization

Rafael Ferreira; Luciano de Souza Cabral; Rafael Dueire Lins; Gabriel de França Pereira e Silva; Fred Freitas; George D. C. Cavalcanti; Rinaldo Lima; Steven J. Simske; Luciano Favaro

Text summarization is the process of automatically creating a shorter version of one or more text documents. It is an important way of finding relevant information in large text libraries or on the Internet. Text summarization techniques are broadly classified as extractive or abstractive. Extractive techniques build a summary by selecting sentences from the documents according to some criteria. Abstractive techniques attempt to improve the coherence among sentences by eliminating redundancies and clarifying the context of sentences. Sentence scoring is the most widely used technique for extractive text summarization. This paper describes 15 sentence scoring algorithms available in the literature and assesses them quantitatively and qualitatively. Three different datasets (news, blogs and article contexts) were evaluated. In addition, directions to improve the sentence extraction results obtained are suggested.
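As an illustration of what a sentence scoring method does, the following is a minimal sketch of one common scoring criterion, word frequency. It is not taken from the paper: the stopword list, the naive sentence splitter and the function names are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "is", "in", "and", "to", "it", "on"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def score_sentences(text):
    """Score each sentence by the mean document frequency of its words."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    scores = []
    for s in sentences:
        words = tokenize(s)
        scores.append(sum(freq[w] for w in words) / len(words) if words else 0.0)
    return sentences, scores


def summarize(text, k=1):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences, scores = score_sentences(text)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, k=2)` returns the two highest-scoring sentences in document order. Other scoring criteria can be swapped in by replacing `score_sentences`.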

Expert Systems With Applications | 2014

A multi-document summarization system based on statistics and linguistic treatment

Rafael Ferreira; Luciano de Souza Cabral; Frederico Luiz Gonçalves de Freitas; Rafael Dueire Lins; Gabriel de França Pereira e Silva; Steven J. Simske; Luciano Favaro

The quantity of data available today on the Internet has reached such a volume that it has become humanly unfeasible to efficiently sieve useful information from it. One solution to this problem is offered by text summarization techniques. Text summarization, the process of automatically creating a shorter version of one or more text documents, is an important way of finding relevant information in large text libraries or on the Internet. This paper presents a multi-document summarization system that concisely extracts the main aspects of a set of documents, trying to avoid the typical problems of this type of summarization: information redundancy and diversity. This is achieved through a new sentence clustering algorithm based on a graph model that makes use of statistical similarities and linguistic treatment. The DUC 2002 dataset was used to assess the performance of the proposed system, which surpassed DUC competitors by a 50% margin in F-measure in the best case.

document analysis systems | 2014

A Context Based Text Summarization System

Rafael Ferreira; Frederico Luiz Gonçalves de Freitas; Luciano de Souza Cabral; Rafael Dueire Lins; Rinaldo Lima; Gabriel Franca; Steven J. Simske; Luciano Favaro

Text summarization is the process of creating a shorter version of one or more text documents. Automatic text summarization has become an important way of finding relevant information in large text libraries or on the Internet. Extractive text summarization techniques select entire sentences from documents according to some criteria to form a summary. Sentence scoring is currently the most widely used technique for extractive text summarization. Depending on the context, however, some techniques may yield better results than others. This paper advocates the thesis that the quality of the summary obtained with combinations of sentence scoring methods depends on the subject of the text. This hypothesis is evaluated using three different contexts: news, blogs and articles. The results obtained support the hypothesis and indicate which techniques are more effective in each of the contexts studied.

Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013

A Four Dimension Graph Model for Automatic Text Summarization

Rafael Ferreira; Frederico Luiz Gonçalves de Freitas; Luciano de Souza Cabral; Rafael Dueire Lins; Rinaldo Lima; Gabriel Franca; Steven J. Simske; Luciano Favaro

Text summarization is the process of automatically creating a shorter version of one or more text documents. In this context, word-based, sentence-based and graph-based approaches are widely used. Among these, graph-based methods for automatic text summarization produce summaries based on the relationships between sentences. These relationships may also support the creation of several text processing applications such as extractive and abstractive summaries, question-answering and information retrieval systems, among others. A new graph model for text processing applications is proposed in this paper. It relies on four dimensions (similarity, semantic similarity, coreference, discourse information) to create the graph. The rationale behind the proposal is to resort to more dimensions than previous works and to take coreference resolution into account, that is, the role of pronouns in connecting sentences. Coreference was not used in any previous graph-based summarization technique. An experiment was performed using the TextRank algorithm with the presented approach on the CNN corpus. The results show that the proposed model outperforms the current approaches both quantitatively and qualitatively.
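To make the graph-based idea concrete, the following is a minimal sketch of TextRank-style ranking over a sentence graph. It is not the four-dimension model of the paper: the edge weights here come from a single, assumed measure (Jaccard word overlap), and the function names, damping factor and iteration count are illustrative choices.

```python
import re


def tokens(sentence):
    """Set of lowercase alphabetic words in the sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap between word sets (an assumed stand-in edge weight)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Rank sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Weighted adjacency matrix with no self-loops.
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            # Each neighbour j passes on rank in proportion to edge weight w[j][i].
            incoming = sum(w[j][i] * rank[j] / out_sum[j]
                           for j in range(n) if out_sum[j] > 0)
            new_rank.append((1 - damping) / n + damping * incoming)
        rank = new_rank
    return rank
```

Sentences that share vocabulary with many others receive higher ranks, while an isolated sentence keeps only the baseline `(1 - damping) / n`. A richer model would replace `similarity` with a combination of several edge types.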

document engineering | 2015

Automatic Text Document Summarization Based on Machine Learning

Gabriel de França Pereira e Silva; Rafael Ferreira; Rafael Dueire Lins; Luciano de Souza Cabral; Hilário Oliveira; Steven J. Simske; Marcelo Riss

The need for automatic generation of summaries gained importance with the unprecedented volume of information available on the Internet. Automatic systems based on extractive summarization techniques select the most significant sentences of one or more texts to generate a summary. This article makes use of Machine Learning techniques to assess the quality of the twenty most referenced strategies used in extractive summarization, integrating them in a tool. Quantitative and qualitative aspects were considered in this assessment, demonstrating the validity of the proposed scheme. The experiments were performed on the CNN-corpus, possibly the largest and most suitable test corpus today for benchmarking extractive summarization strategies.

document engineering | 2014

A platform for language independent summarization

Luciano de Souza Cabral; Rafael Dueire Lins; Rafael Fe Mello; Fred Freitas; Bruno Tenório Ávila; Steven J. Simske; Marcelo Riss

The text data available on the Internet is not only huge in volume, but also diverse in subject, quality and language. Such factors make it infeasible to efficiently scavenge useful information from it. Automatic text summarization is a possible solution for efficiently addressing such a problem, because it aims to sieve the relevant information in documents by creating shorter versions of the text. However, most of the techniques and tools available for automatic text summarization are designed only for the English language, which is a severe restriction. There are multilingual platforms that support, at most, 2 languages. This paper proposes a language independent summarization platform that provides corpus acquisition, language classification, translation and text summarization for 25 different languages.

mexican international conference on artificial intelligence | 2015

Automatic Summarization of News Articles in Mobile Devices

Luciano de Souza Cabral; Rinaldo Lima; Rafael Dueire Lins; Manoel Neto; Rafael Ferreira; Steven J. Simske; Marcelo Riss

Smartphones and tablets provide access to the Web anywhere and anytime. Automatic text summarization techniques aim to extract the fundamental information in documents. Making automatic summarization work on portable devices is a challenge in several respects. This paper presents an automatic summarization application for Android devices. The proposed solution is a multi-feature, language independent summarization application targeted at news articles. Several evaluations were conducted, and they indicate that the proposed solution provides good results.

document engineering | 2015

Automatic Document Classification using Summarization Strategies

Rafael Ferreira; Rafael Dueire Lins; Luciano de Souza Cabral; Fred Freitas; Steven J. Simske; Marcelo Riss

An efficient way to automatically classify documents may be provided by automatic text summarization, the task of creating a shorter text from one or several documents. This paper presents an assessment of the 15 most widely used methods for automatic text summarization from the text classification perspective. A naive Bayes classifier was used, showing that some of the methods tested are better suited for such a task.
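The classification step can be sketched as follows: a multinomial naive Bayes classifier with add-one smoothing, trained on short texts that stand in for summaries. This is a minimal illustration under assumed names and toy data, not the experimental setup of the paper.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated words."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            # Add-one (Laplace) smoothing over the training vocabulary.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in doc.lower().split():
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy "summaries" and labels, invented for this example.
train_docs = ["team wins final match", "player scores goal",
              "market shares fall", "bank raises rates"]
train_labels = ["sports", "sports", "finance", "finance"]
model = NaiveBayes().fit(train_docs, train_labels)
```

With this toy data, `model.predict("team scores goal")` selects the label whose training summaries share the most vocabulary with the input. Comparing summarization methods would then amount to training the same classifier on the summaries each method produces and comparing accuracy.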

database and expert systems applications | 2013

An Inductive Logic Programming-Based Approach for Ontology Population from the Web

Rinaldo Lima; Bernard Espinasse; Hilário Oliveira; Rafael Ferreira; Luciano de Souza Cabral; Dimas Melo Filho; Fred Freitas; Renê Nóbrega de Sousa Gadelha

Developing linguistically data-compliant rules for entity extraction is usually an intensive and time-consuming process for any ontology engineer. Thus, an automated mechanism to convert textual data into ontology instances (Ontology Population) may be crucial. In this context, this paper presents an inductive logic programming-based method that induces rules for extracting instances of various entity classes. This method uses two sources of evidence: domain-independent linguistic patterns for identifying candidates of class instances, and a WordNet semantic similarity measure. These two sources of evidence are integrated as background knowledge to automatically generate extraction rules by a generic inductive logic programming system. Some experiments were conducted on the class instance classification problem with encouraging results.

web intelligence | 2013

A Four Dimension Graph Model for Automatic Text Summarization.

Rafael Ferreira; Frederico Luiz Gonçalves de Freitas; Luciano de Souza Cabral; Rafael Dueire Lins; Rinaldo Lima; Gabriel de França Pereira e Silva; Steven J. Simske; Luciano Favaro
