Publication


Featured research published by Sandra Maria Aluísio.


document engineering | 2008

Towards Brazilian Portuguese automatic text simplification systems

Sandra Maria Aluísio; Lucia Specia; Thiago Alexandre Salgueiro Pardo; Erick Galani Maziero; Renata Pontin de Mattos Fortes

In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese and propose simplification strategies for this language. This study illustrates the need for text simplification to facilitate accessibility to information by poor literacy readers and potentially by people with other cognitive disabilities. It also highlights characteristics of simplification for Portuguese, which may differ from other languages. Such study consists of the first step towards building Brazilian Portuguese text simplification systems. One of the scenarios in which these systems could be used is that of reading electronic texts produced, e.g., by the Brazilian government or by relevant news agencies.


international conference on design of communication | 2009

Facilita: reading assistance for low-literacy readers

Willian Massami Watanabe; Arnaldo Candido Junior; Vinícius Rodrigues de Uzêda; Renata Pontin de Mattos Fortes; Thiago Alexandre Salgueiro Pardo; Sandra Maria Aluísio

Texts are the media content primarily available on Web sites and applications. However, this heavy use of texts creates an accessibility barrier to those who cannot read fluently in their mother tongue due to both text length and linguistic complexity. To offer an accessible alternative to these readers, shorter and simplified versions of text content should be provided. Taking that into consideration, this paper introduces Facilita, an assistive technology to help lower-literacy users understand the text content of Web applications. Facilita generates accessible content from Web pages automatically, using summarization and simplification techniques. It is also important to consider interface design requirements, since Facilita's target audience (the functionally illiterate) is often classified as computer illiterate as well. Thus, interaction and user interface design were developed considering the limitations and skills of the functionally illiterate.


workshop on innovative use of nlp for building educational applications | 2009

Supporting the Adaptation of Texts for Poor Literacy Readers: a Text Simplification Editor for Brazilian Portuguese

Arnaldo Candido; Erick Galani Maziero; Lucia Specia; Caroline Gasperin; Thiago Alexandre Salgueiro Pardo; Sandra Maria Aluísio

In this paper we investigate the task of text simplification for Brazilian Portuguese. Our purpose is three-fold: to introduce a simplification tool for such language and its underlying development methodology, to present an on-line authoring system of simplified text based on the previous tool, and finally to discuss the potentialities of such technology for education. The resources and tools we present are new for Portuguese and innovative in many aspects with respect to previous initiatives for other languages.


processing of the portuguese language | 2003

An account of the challenge of tagging a reference corpus for Brazilian Portuguese

Sandra Maria Aluísio; Jorge Marques Pelizzoni; Ana Raquel Marchi; Lucélia de Oliveira; Regiana Manenti; Vanessa Marquiafável

This article identifies and addresses the major linguistic/conceptual, as opposed to logistic, issues faced in the morphosyntactic tagging of MAC-Morpho, a 1.1 million word Brazilian Portuguese corpus of newspaper articles that has been developed in the Lacio-Web Project. Rather than simply presenting the annotated corpus and describing its tagset, we elaborate on the criteria for establishing the tagset and analyze some interesting cases amongst the linguistic problems we faced in this work.


international conference on design of communication | 2008

A corpus analysis of simple account texts and the proposal of simplification strategies: first steps towards text simplification systems

Sandra Maria Aluísio; Lucia Specia; Thiago Alexandre Salgueiro Pardo; Erick Galani Maziero; Helena de Medeiros Caseli; Renata Pontin de Mattos Fortes

In this paper we investigate the main linguistic phenomena that can make texts complex and how they could be simplified. We focus on a corpus analysis of simple account texts available on the web for Brazilian Portuguese (BP). This study illustrates the need for text simplification to facilitate accessibility to information by poor readers and by people with cognitive disabilities. It also highlights features of simplification for BP, which may differ from other languages. Moreover, we propose simplification strategies and a Simplification Annotation Editor. This study consists of the first step towards building BP text simplification systems. One of the scenarios in which these systems could be used is that of reading electronic texts produced, e.g., by the Brazilian government or by news agencies.


Journal of the Brazilian Computer Society | 2015

Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese

Erick Rocha Fonseca; João Luís Garcia Rosa; Sandra Maria Aluísio

Background: Part-of-speech tagging is an important preprocessing step in many natural language processing applications. Despite much work already carried out in this field, there is still room for improvement, especially in Portuguese. We experiment here with an architecture based on neural networks and word embeddings that has achieved promising results in English. Methods: We tested our classifier on different corpora: a new revision of the Mac-Morpho corpus, in which we merged some tags and performed corrections, and two previous versions of it. We evaluate the impact of using different types of word embeddings and explicit features as input. Results: We compare our tagger's performance with other systems and achieve state-of-the-art results in the new corpus. We show how different methods for generating word embeddings and additional features differ in accuracy. Conclusions: The work reported here contributes a new revision of the Mac-Morpho corpus and a new state-of-the-art tagger available for use out-of-the-box.


EPL | 2012

Complex networks analysis of language complexity

Diego R. Amancio; Sandra Maria Aluísio; Osvaldo N. Oliveira; Luciano da Fontoura Costa

Methods from statistical physics, such as those involving complex networks, have been increasingly used in the quantitative analysis of linguistic phenomena. In this paper, we represented pieces of text with different levels of simplification in co-occurrence networks and found that topological regularity correlated negatively with textual complexity. Furthermore, in less complex texts the distance between concepts, represented as nodes, tended to decrease. The complex networks metrics were treated with multivariate pattern recognition techniques, which allowed us to distinguish between original texts and their simplified versions. For each original text, two simplified versions were generated manually with increasing number of simplification operations. As expected, distinction was easier for the strongly simplified versions, where the most relevant metrics were node strength, shortest paths and diversity. Also, the discrimination of complex texts was improved with higher hierarchical network metrics, thus pointing to the usefulness of considering wider contexts around the concepts. Though the accuracy rate in the distinction was not as high as in methods using deep linguistic knowledge, the complex network approach is still useful for a rapid screening of texts whenever assessing complexity is essential to guarantee accessibility to readers with limited reading ability.
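As a rough illustration of the kind of representation the abstract describes, the sketch below builds a word co-occurrence network from a short text and computes two of the metrics it names, node strength and shortest paths. It is not the authors' pipeline: linking only adjacent words, skipping stopword removal and lemmatization, and the function names `cooccurrence_network`, `node_strength`, and `average_shortest_path` are all assumptions made for this example.

```python
from collections import defaultdict, deque


def cooccurrence_network(text):
    """Link each pair of adjacent words; the edge weight counts co-occurrences.

    Assumption: plain whitespace tokenization and a window of one word.
    """
    words = text.lower().split()
    graph = defaultdict(lambda: defaultdict(int))
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops from repeated words
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def node_strength(graph, node):
    """Sum of the weights of all edges attached to the node."""
    return sum(graph.get(node, {}).values())


def average_shortest_path(graph):
    """Mean hop distance over all reachable ordered node pairs (unweighted BFS)."""
    total, pairs = 0, 0
    for source in list(graph):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    g = cooccurrence_network("the cat sat on the mat")
    print(node_strength(g, "the"), average_shortest_path(g))
```

Comparing these values for an original text and its simplified version would be one way to probe the abstract's observation that distances between concepts tend to shrink in less complex texts.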


international conference on advanced learning technologies | 2001

How to learn the many unwritten "rules of the game" of the academic discourse: a hybrid approach based on critiques and cases to support scientific writing

Sandra Maria Aluísio; Iris Barcelos; Jandir Sampaio; Osvaldo N. Oliveira

We present the computational and composition theoretical bases for the design of a collaborative writing tool, based on the critiquing approach, to assist non-native novice researchers in understanding and producing the structure of scientific papers. This critiquing tool is embedded in a suite named AMADEUS that caters for various needs of non-native English users to produce a first draft of a paper, relying on the reuse of contextualized linguistic material as input for the user. Our emphasis is on the architecture and methodology to build the linguistic resources for the critiquing tool. Though originally targeted at non-native authors, the critiquing tool may also be useful for novice native English writers and as a teaching resource for English for Academic Purposes practitioners.


international conference on case based reasoning | 1995

A Case-Based Approach for Developing Writing Tools Aimed at Non-native English Users

Sandra Maria Aluísio; Osvaldo N. Oliveira Jr.

A writing tool has been developed for helping non-native English users to produce a first draft of Introductory Sections of scientific papers. A corpus analysis was carried out on 54 papers of Experimental Physics, which allowed one to identify the schematic structure of Introductions and 30 rhetorical strategies generally employed. Each one of the Introductions analysed constituted a case. The user chooses from menus features related to the rhetorical strategies for each component and gives the intended order for his/her Introduction, thus forming the requisition. Using three types of metric, the tool recovers the best-match cases that can be later modified in a revision process. Preliminary experiments showed that high precision and recall will only be obtained if the number of cases in the case base is considerably increased. In the revision process, four operations are suggested which consist in modifying/adding/deleting the different rhetorical messages that constitute the strategies of the chosen case.
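The retrieval step described in this abstract can be sketched as a nearest-case lookup. The abstract does not say which three metrics the tool uses, so the example below substitutes a single Jaccard overlap between the strategies in the requisition and those in each stored case. The case names, the strategy labels, and the functions `jaccard` and `retrieve` are hypothetical and serve only to show the general shape of case-based retrieval.

```python
def jaccard(a, b):
    """Overlap between two strategy sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def retrieve(requisition, case_base, top_k=1):
    """Return the names of the top_k cases most similar to the requisition.

    Ties are broken alphabetically by case name so results are deterministic.
    """
    scored = [(jaccard(requisition, strategies), name)
              for name, strategies in case_base.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]


# Hypothetical case base: each Introduction is stored as its list of strategies.
CASE_BASE = {
    "intro_A": ["background", "gap", "purpose"],
    "intro_B": ["background", "review"],
    "intro_C": ["purpose", "outline"],
}

if __name__ == "__main__":
    print(retrieve(["background", "gap", "purpose"], CASE_BASE, top_k=2))
```

A set-based metric like this ignores the intended order of components that the user supplies, so a fuller sketch would add an order-sensitive measure alongside it.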


processing of the portuguese language | 2010

Challenging choices for text simplification

Caroline Gasperin; Erick Galani Maziero; Sandra Maria Aluísio

In this paper we discuss particular choices we made during the development of a rule-based syntactic text simplification system. Such choices concern 1) how to deal with adverbial phrases in order to simplify sentences, and 2) the order in which to apply our set of simplification rules. Adverbial phrases have not been considered by previous work on text simplification, but have a considerable impact on the complexity of a sentence. Considering our whole set of simplification rules, we discuss and compare two different orders in which to apply them: empirical and hierarchical.

Collaboration


Dive into Sandra Maria Aluísio's collaboration.

Top Co-Authors

Rachel Aires

University of São Paulo
