Publication


Featured research published by Maria Lucía del Rosario Castro Jorge.


Processing of the Portuguese Language | 2010

Formalizing CST-based content selection operations

Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo

This paper presents the definition and formalization of content selection operations based on CST (Cross-document Structure Theory) for multidocument summarization purposes.


2009 Seventh Brazilian Symposium in Information and Human Language Technology | 2009

Content Selection Operators for Multidocument Summarization Based on Cross-Document Structure Theory

Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo

This paper presents an analysis of content selection techniques for multidocument summarization based on the multidocument discourse theory CST (Cross-document Structure Theory). We approach the task of content selection by using CST-based operators and focus specifically on redundancy treatment, which is an important and pervasive problem in multidocument summarization. Our experiments with Brazilian Portuguese news texts show that CST improves summary quality by exploiting relations among texts. In particular, redundancy is reduced by identifying common information among texts, especially when the compression rate is low.


Brazilian Conference on Intelligent Systems | 2014

Building a Language Model for Local Coherence in Multi-document Summaries Using a Discourse-Enriched Entity-Based Model

Maria Lucía del Rosario Castro Jorge; Márcio de Souza Dias; Thiago Alexandre Salgueiro Pardo

Local coherence is a very important aspect of multi-document summarization, since good summaries not only condense the most relevant information but also present it in a well-organized structure. One of the most investigated models for local coherence is the Entity-based model, which has been used successfully because it facilitates the computational measurement of coherence. In particular, this model was used for the evaluation of local coherence in multi-document summaries, achieving promising results. In order to improve the potential of the Entity-based model, we propose the creation of a language model for multi-document summaries that integrates the Entity-based model with discourse knowledge, mainly from Cross-document Structure Theory. Our results show that this type of information enriches the Entity-based model by capturing other phenomena that are inherent to multi-document summaries, such as redundancy and complementarity, which improves the performance of the original model.


Information Processing and Management | 2014

Revisiting Cross-document Structure Theory for multi-document discourse parsing

Erick Galani Maziero; Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo

Multi-document discourse parsing aims to automatically identify the relations among textual spans from different texts on the same topic. Recently, with the growing amount of information and the emergence of new technologies that deal with many sources of information, more precise and efficient parsing techniques are required. The most relevant theory of multi-document relationships, Cross-document Structure Theory (CST), has been used for parsing purposes before, though the results were not satisfactory. CST has received much criticism because of its subjectivity, which may lead to low annotation agreement and, consequently, to poor parsing performance. In this work, we propose a refinement of the original CST, which consists of (i) formalizing the relationship definitions, (ii) pruning and combining some relations based on their meaning, and (iii) organizing the relations in a hierarchical structure. The hypothesis for this refinement is that it will lead to better agreement in the annotation and consequently to better parsing results. To this end, an annotated corpus was built according to this refinement, and an improvement in annotation agreement was observed. Based on this corpus, a parser was developed using machine learning techniques and hand-crafted rules. Specifically, hierarchical techniques were used to capture the hierarchical organization of the relations according to the proposed refinement of CST. These two approaches were used to identify the relations among text spans and to generate the multi-document annotation structure. The results outperformed those of other CST parsers, showing the adequacy of the proposed refinement of the theory.


Natural Language Processing and Cognitive Science | 2010

Identifying Multidocument Relations

Erick Galani Maziero; Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo


Archive | 2011

A Generative Approach for Multi-Document Summarization using the Noisy Channel Model

Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo


Processing of the Portuguese Language | 2014

Enriquecendo o Córpus CSTNews - a Criação de Novos Sumários Multidocumento [Enriching the CSTNews Corpus: the Creation of New Multidocument Summaries]

Márcio de Souza Dias; Alessandro Yovan Bokan Garay; Carla Chuman; Cláudia Dias de Barros; Erick Galani Maziero; Fernando Antônio Asevedo Nóbrega; Jackson Wilke da Cruz Souza; Marco Antonio Sobrevilla Cabezudo; Marina Delege; Maria Lucía del Rosario Castro Jorge; Naira L. Silva; Paula Christina Figueira Cardoso; Pedro Paulo Balage Filho; Roque Enrique López Condori; Vanessa Marcasso; Ariani Di Felippo; Maria das Graças Volpe Nunes; Thiago Alexandre Salgueiro Pardo


Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural, XXXI | 2015

Exploring the Rhetorical Structure Theory for multi-document summarization

Paula Christina Figueira Cardoso; Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo


Archive | 2012

Anotação de subtópicos do córpus multidocumento CSTNews [Subtopic Annotation of the CSTNews Multidocument Corpus]

Paula Christina Figueira Cardoso; Amanda P. Rassi; Erick Galani Maziero; Fernando Antônio Asevedo Nóbrega; Jackson Wilke da Cruz Souza; Márcio de Souza Dias; Maria Lucía del Rosario Castro Jorge; Pedro Paulo Balage Filho; Renata T. Camargo; Verônica Agostini; Ariani Di Felippo; Lucia Helena Machado Rino; Thiago Alexandre Salgueiro Pardo


STIL | 2011

A Generative Approach for Multi-Document Summarization using Semantic-Discursive information

Maria Lucía del Rosario Castro Jorge; Thiago Alexandre Salgueiro Pardo

Collaboration


Dive into Maria Lucía del Rosario Castro Jorge's collaborations.

Top Co-Authors


Ariani Di Felippo

Federal University of São Carlos


Jackson Wilke da Cruz Souza

Federal University of São Carlos
