Fred Freitas

Federal University of Pernambuco

Publication

Featured research published by Fred Freitas.

Expert Systems With Applications | 2013

Assessing sentence scoring techniques for extractive text summarization

Rafael Ferreira; Luciano de Souza Cabral; Rafael Dueire Lins; Gabriel de França Pereira e Silva; Fred Freitas; George D. C. Cavalcanti; Rinaldo Lima; Steven J. Simske; Luciano Favaro

Text summarization is the process of automatically creating a shorter version of one or more text documents. It is an important way of finding relevant information in large text libraries or on the Internet. Text summarization techniques are broadly classified as extractive or abstractive. Extractive techniques summarize by selecting sentences from the documents according to some criteria. Abstractive techniques attempt to improve the coherence among sentences by eliminating redundancies and clarifying the context of sentences. Sentence scoring is the most widely used technique for extractive text summarization. This paper describes 15 sentence scoring algorithms available in the literature and assesses them quantitatively and qualitatively. Three different datasets (news, blogs and article contexts) were evaluated. In addition, directions for improving the sentence extraction results are suggested.
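
As an illustration of the sentence scoring idea described in this abstract, here is a minimal sketch of one simple scoring criterion, average word frequency. It is not claimed to be one of the 15 algorithms assessed in the paper; the function names (`split_sentences`, `tokenize`, `score_sentences`, `summarize`) and the naive sentence splitter are assumptions made for the example.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only; no stop-word removal or stemming.
    return re.findall(r"[a-z]+", sentence.lower())


def score_sentences(sentences):
    # Score each sentence by the average document-level frequency of its words.
    freq = Counter(w for s in sentences for w in tokenize(s))
    scores = []
    for s in sentences:
        words = tokenize(s)
        scores.append(sum(freq[w] for w in words) / len(words) if words else 0.0)
    return scores


def summarize(text, k=1):
    # Keep the k highest-scoring sentences, returned in their original order.
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    print(summarize("Cats chase mice. Cats sleep. Dogs bark loudly at night.", k=1))
```

Averaging by sentence length keeps long sentences from winning simply because they contain more words; other criteria (sentence position, title overlap, and so on) could be swapped in by replacing `score_sentences`.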

Journal of Web Semantics | 2012

MultiFarm: A benchmark for multilingual ontology matching

Christian Meilicke; Raúl García-Castro; Fred Freitas; Willem Robert van Hage; Elena Montiel-Ponsoda; Ryan Ribeiro de Azevedo; Heiner Stuckenschmidt; Ondřej Šváb-Zamazal; Vojtěch Svátek; Andrei Tamilin; Cássia Trojahn; Shenghui Wang

In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish), we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.

Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013

Evaluating Ontologies with Competency Questions

Camila Bezerra; Fred Freitas; Filipe Santana

Competency Questions (CQs) play an important role in the ontology development lifecycle, as they represent the ontology requirements. Although the main methodologies describe and use CQs, current ontology engineering practice makes only superficial use of them. One of the main problems hampering their proper use is the lack of tools that help users check whether CQs are fulfilled by the ontology being defined, particularly when the ontology is defined in OWL (Web Ontology Language), under the Description Logic formalism. We propose a mechanism to support evaluating whether an ontology satisfies its corresponding CQs.

Document Engineering | 2014

A new sentence similarity assessment measure based on a three-layer sentence representation

Rafael Ferreira; Rafael Dueire Lins; Fred Freitas; Steven J. Simske; Marcelo Riss

Sentence similarity measures the degree of similarity between sentences. It is used in many natural language applications, such as text summarization, information retrieval, text categorization, and machine translation. Current methods for assessing sentence similarity represent sentences as bag-of-words vectors or use the syntactic information of the words in the sentence. The degree of similarity between sentences is calculated by composing the similarity between their words. However, two important concerns in the area, the meaning problem and word order, are not handled. This paper proposes a new sentence similarity assessment measure that largely improves and refines a recently published method that takes into account the lexical, syntactic and semantic components of sentences. The new method was benchmarked using a publicly available standard dataset. The results show that the proposed measure outperforms the state-of-the-art systems and achieves results comparable to evaluations made by humans.
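
The word order concern raised in this abstract can be seen with a minimal sketch of bag-of-words cosine similarity. This is a generic baseline, not the measure proposed in the paper; the helper names `bag_of_words` and `cosine_similarity` are assumptions made for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    # Word-count vector; word order is discarded by construction.
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_similarity(a, b):
    # Cosine of the angle between the two word-count vectors.
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Same words, opposite meaning: the two vectors are identical.
    print(cosine_similarity("The dog bit the man", "The man bit the dog"))
    # No shared words: the similarity is zero.
    print(cosine_similarity("The dog barked", "A cat slept"))
```

Because both sentences in the first pair contain exactly the same words, a bag-of-words representation cannot tell them apart, which is the kind of limitation that motivates adding syntactic and semantic layers.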

Expert Systems With Applications | 2016

Assessing shallow sentence scoring techniques and combinations for single and multi-document summarization

Hilário Oliveira; Rafael Ferreira; Rinaldo Lima; Rafael Dueire Lins; Fred Freitas; Marcelo Riss; Steven J. Simske

We investigate eighteen shallow sentence scoring techniques and ensemble strategies. Experiments were performed on several datasets for single- and multi-document tasks. Ensemble strategies lead to improvements over the individual scoring techniques. Ensembles that perform competitively against the state of the art were identified.

The volume of text data has been growing exponentially in recent years, mainly due to the Internet. Automatic text summarization has emerged as an alternative to help users find relevant information in the content of one or more documents. This paper presents a comparative analysis of eighteen shallow sentence scoring techniques for computing the importance of a sentence in the context of extractive single- and multi-document summarization. Several experiments were conducted to assess the performance of these techniques individually and under different combination strategies. Results on the most traditional benchmark in the news domain demonstrate the feasibility of combining such techniques, which in most cases outperforms the results obtained by isolated techniques. Combinations that perform competitively with state-of-the-art systems were found.

Expert Systems With Applications | 2013

RetriBlog: An architecture-centered framework for developing blog crawlers

Rafael Ferreira; Fred Freitas; Patrick H. S. Brito; Jean Melo; Rinaldo Lima; Evandro Costa

Blogs have become an important social tool. They allow users to share their tastes, express their opinions, report news, and form groups around a subject, among other things. The information obtained from the blogosphere may be used to create applications in various fields. However, due to the growing number of blog posts published every day, as well as the dynamic nature of the blogosphere, extracting relevant information from blogs has become difficult and time consuming. In this paper, we use information retrieval and extraction techniques to deal with this problem. Furthermore, because blogs have many variation points, applications must be easy to adapt. To address this scenario, the work proposes RetriBlog, an architecture-centered framework for the development of blog crawlers. Finally, it presents an evaluation of the proposed algorithms and three case studies.

Computer Speech & Language | 2016

Assessing sentence similarity through lexical, syntactic and semantic analysis

Rafael Ferreira; Rafael Dueire Lins; Steven J. Simske; Fred Freitas; Marcelo Riss

The degree of similarity between sentences is assessed by sentence similarity methods. Such methods play an important role in areas such as summarization, search, categorization of texts, and machine translation. Current methods for assessing sentence similarity are based only on the similarity between the words in the sentences. They either represent sentences as bag-of-words vectors or are restricted to the syntactic information of the sentences. Two important problems in language understanding are not addressed by such strategies: word order and the meaning of the sentence as a whole. The new sentence similarity assessment measure presented here largely improves and refines a recently published method that takes into account the lexical, syntactic and semantic components of sentences. The new method was benchmarked using the Li–McLean dataset, showing that it outperforms the state-of-the-art systems and achieves results comparable to evaluations made by humans. In addition, the proposed method was extensively tested using the SemEval 2012 sentence similarity test set and in the evaluation of the degree of similarity between summaries using the CNN-corpus. In both cases, the proposed measure proved effective and useful.

Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2014

An Approach for Learning and Construction of Expressive Ontology from Text in Natural Language

Ryan Ribeiro de Azevedo; Fred Freitas; Rodrigo G. C. Rocha; José Antônio Alves de Menezes; Cleyton Rodrigues; Gabriel de França Pereira e Silva

In this paper, we present an approach based on Ontology Learning and Natural Language Processing for the automatic construction of expressive ontologies, specifically in OWL DL with ALC expressivity, from natural language text. The viability of our approach is demonstrated through the generation of descriptions of complex axioms from concepts defined by users and from glossaries found on Wikipedia. We evaluated our approach in an experiment with input sentences enriched with hierarchy axioms, disjunction, conjunction, and negation, as well as existential and universal quantification to impose property restrictions. The results obtained indicate that our model is an effective solution for knowledge representation and automatic construction of expressive ontologies. It thereby assists professionals involved in obtaining, constructing and modeling domain knowledge.

Intelligent Systems in Molecular Biology | 2011

Ontology patterns for tabular representations of biomedical knowledge on neglected tropical diseases

Filipe Santana; Daniel Schober; Zulma Medeiros; Fred Freitas; Stefan Schulz

Motivation: Ontology-like domain knowledge is frequently published in a tabular format embedded in scientific publications. We explore the re-use of such tabular content in the process of building NTDO, an ontology of neglected tropical diseases (NTDs), where the representation of the interdependencies between hosts, pathogens and vectors plays a crucial role. Results: As a proof of concept we analyzed a tabular compilation of knowledge about pathogens, vectors and geographic locations involved in the transmission of NTDs. After a thorough ontological analysis of the domain of interest, we formulated a comprehensive design pattern, rooted in the biomedical domain upper level ontology BioTop. This pattern was implemented in a VBA script which takes cell contents of an Excel spreadsheet and transforms them into OWL-DL. After minor manual post-processing, the correctness and completeness of the ontology were tested using pre-formulated competence questions as description logics (DL) queries. The expected results could be reproduced by the ontology. The proposed approach is recommended for optimizing the acquisition of ontological domain knowledge from tabular representations. Availability and implementation: Domain examples, source code and ontology are freely available on the web at http://www.cin.ufpe.br/~ntdo. Contact: [email protected]

Document Engineering | 2014

A platform for language independent summarization

Luciano de Souza Cabral; Rafael Dueire Lins; Rafael Fe Mello; Fred Freitas; Bruno Tenório Ávila; Steven J. Simske; Marcelo Riss

The text data available on the Internet is huge not only in volume, but also in diversity of subject, quality and language. Such factors make it difficult to efficiently extract useful information from it. Automatic text summarization is a possible solution to this problem, because it aims to sift out the relevant information in documents by creating shorter versions of the text. However, most of the techniques and tools available for automatic text summarization are designed only for the English language, which is a severe restriction. Existing multilingual platforms support at most 2 languages. This paper proposes a language-independent summarization platform that provides corpus acquisition, language classification, translation and text summarization for 25 different languages.