Merley da Silva Conrado
University of São Paulo
Publication
Featured research published by Merley da Silva Conrado.
Journal of the Brazilian Computer Society | 2014
Merley da Silva Conrado; Ariani Di Felippo; Thiago Alexandre Salgueiro Pardo; Solange Oliveira Rezende
Background: Term extraction is highly relevant as it is the basis for several tasks, such as the building of dictionaries, taxonomies, and ontologies, as well as the translation and organization of text data.

Methods and Results: In this paper, we present a survey of the state of the art in automatic term extraction (ATE) for the Brazilian Portuguese language. The main contributions and projects related to this task have been classified according to the knowledge they use: statistical, linguistic, and hybrid (statistical and linguistic). We also present a review of the corpora used in term extraction for Brazilian Portuguese, as well as a geographic mapping of Brazil regarding such contributions, projects, and corpora, considering their origins.

Conclusions: In spite of the importance of ATE, there are still several gaps to be filled, for instance, the lack of consensus regarding the formal definition of the meaning of ‘term’. Such gaps are larger for Brazilian Portuguese than for other languages, such as English, Spanish, and French. Examples of gaps for Brazilian Portuguese include the lack of a baseline ATE system and the absence of more sophisticated linguistic information, such as the WordNet and Wikipedia knowledge bases. Nevertheless, there is an increase in the number of contributions related to ATE and an interesting tendency to use contrasting corpora and domain stoplists, even though most contributions only use frequency, noun phrases, and morphosyntactic patterns.
Mexican International Conference on Artificial Intelligence | 2013
Merley da Silva Conrado; Thiago Alexandre Salgueiro Pardo; Solange Oliveira Rezende
Despite the importance of term extraction methods and the several efforts devoted to improving them, they still have four main problems: (i) noise and silence generation; (ii) difficulty dealing with a high number of terms; (iii) the human effort and time needed to evaluate the terms; and (iv) still limited extraction results. In this paper, we deal with these four problems in automatic term extraction by exploring a rich feature set in a machine learning approach. We minimized these problems and achieved state-of-the-art results for unigrams in Brazilian Portuguese.
Expert Systems With Applications | 2016
Camila Vaccari Sundermann; Marcos Aurélio Domingues; Merley da Silva Conrado; Solange Oliveira Rezende
A method that treats additional information as virtual items in recommender systems. The method is instantiated in three different algorithms for recommender systems. The algorithms are evaluated among themselves and against the state of the art. Results show that the proposal improves the predictive ability of the recommenders.

A recommender system is used in various fields to recommend items of interest to users. Most recommender approaches focus only on the users and items to make the recommendations. However, in many applications, it is also important to incorporate contextual information into the recommendation process. Although the use of contextual information has received great attention in recent years, there is a lack of automatic methods to obtain such information for context-aware recommender systems. Some works address this problem by proposing supervised methods, which require greater human effort and whose results are not so satisfactory. In this scenario, we propose an unsupervised method to extract contextual information from web page content. Our method builds topic hierarchies from the textual content of pages considering, besides the traditional bag-of-words, valuable information from the texts such as named entities and domain terms (privileged information). The topics extracted from the hierarchies are used as contextual information in context-aware recommender systems. We conducted experiments using two data sets and two baselines: the first baseline is a recommender system that does not use contextual information and the second is a method proposed in the literature to extract contextual information. The results are, in general, very good and present significant gains.
In conclusion, our method has the following advantages and innovations: (i) it is unsupervised; (ii) it considers the context of the item (web page), instead of the context of the user as in most of the few existing methods, which is an innovation; (iii) it uses privileged information in addition to the existing technical information from pages; and (iv) it presented good and promising empirical results. This work represents an advance in the state of the art in context extraction, which is an important contribution to context-aware recommender systems, a kind of specialized and intelligent system.
Archive | 2015
Merley da Silva Conrado; Thiago Alexandre Salgueiro Pardo; Solange Oliveira Rezende
Term extraction is the basis for many tasks, such as the building of taxonomies, ontologies, and dictionaries, and the translation, organization, and retrieval of textual data. This paper studies the main challenge of semi-automatic term extraction methods, which is the difficulty of analyzing the ranking of candidates produced by these methods. The experimental evaluation performed in this work makes it possible to fairly compare a wide set of semi-automatic term extraction methods, which enables future investigations. Additionally, we discovered which level of knowledge and which threshold should be adopted for these methods in order to obtain good precision or F-measure. The results show that no single method is the best for all three corpora used.
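As an illustration of what evaluating a ranked candidate list at a threshold involves, the following is a minimal sketch and not the evaluation code used in the paper. The function name `evaluate_at_cutoff`, the candidate list, and the gold-standard set are all hypothetical.

```python
def evaluate_at_cutoff(ranked_candidates, gold_terms, cutoff):
    """Return (precision, recall, F-measure) for the top `cutoff` candidates.

    `ranked_candidates` is a list ordered from best to worst score;
    `gold_terms` is the reference list of true terms for the domain.
    """
    selected = ranked_candidates[:cutoff]
    gold = set(gold_terms)
    hits = sum(1 for candidate in selected if candidate in gold)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical toy data: a ranking produced by some extraction method
# and a small gold standard.
ranked = ["solo", "planta", "casa", "adubo", "dia"]
gold = {"solo", "adubo", "semente", "planta"}

for cutoff in (2, 5):
    p, r, f = evaluate_at_cutoff(ranked, gold, cutoff)
    print(f"cutoff={cutoff} precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Sweeping `cutoff` over the ranking shows the usual trade-off: a short list tends to favor precision, a longer one recall.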
Information Retrieval | 2016
Marcelo G. Manzato; Marcos Aurélio Domingues; Arthur da Costa Fortes; Camila Vaccari Sundermann; Rafael Martins D'Addio; Merley da Silva Conrado; Solange Oliveira Rezende; Maria G. C. Pimentel
Recommendation of textual documents requires indexing mechanisms to extract structured metadata for attribute-aware recommender systems. Applying a variety of text mining algorithms has the advantage of capturing different aspects of unstructured content, resulting in richer descriptions. However, it is difficult to integrate them into a single model so that these descriptions can efficiently improve recommendation accuracy. This article proposes a generic model based on ensemble learning that combines simple text mining methods in a post-processing approach. After executing each text mining technique, each set of metadata of a particular type is applied to the recommender module, which generates attribute-specific rankings. Then, the resulting recommendations are ensembled to generate a final personalized ranking for the user. We evaluated our ensemble technique with two attribute-aware collaborative recommenders (k-Nearest Neighbors and BPR-Mapping) and we demonstrate its generality by means of comparisons among different types of ensembles. We used two datasets from different domains: the first is from the Brazilian Embrapa Agency of Technology Information website, whose documents are written in Portuguese, and the second is the HetRec MovieLens 2k, published by the GroupLens Research Group, whose movie storylines are written in English. The experiments show that, particularly for the k-NN recommender, better accuracy can be obtained when multiple metadata types are combined. The proposed approach is extensible and flexible to new indexing and recommendation techniques.
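To make the idea of combining attribute-specific rankings concrete, here is a minimal sketch that merges several rankings with a Borda count. The Borda count is a stand-in chosen for illustration; the abstract does not state which ensemble strategies the article uses. The function name `borda_ensemble`, the attribute names, and the document identifiers are hypothetical.

```python
from collections import defaultdict


def borda_ensemble(rankings, top_n=None):
    """Merge several ranked lists into one using a Borda count.

    Each ranking gives its first item len(ranking) points, the second
    one point less, and so on. Items absent from a ranking get nothing
    from it. Ties are broken alphabetically for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        size = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += size - position
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered if top_n is None else ordered[:top_n]


# Hypothetical rankings, one per metadata type, for a single user.
rankings_by_attribute = {
    "terms": ["doc_a", "doc_b", "doc_c"],
    "named_entities": ["doc_b", "doc_a", "doc_d"],
    "topics": ["doc_b", "doc_c", "doc_a"],
}

final_ranking = borda_ensemble(rankings_by_attribute.values())
print(final_ranking)
```

Because the merge happens after each recommender has run, a new metadata type only adds one more ranking to the input, which matches the post-processing design described above.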
International Conference on Computational Science and Its Applications | 2012
Merley da Silva Conrado; Víctor Antonio Laguna Gutiérrez; Solange Oliveira Rezende
Text classification is an important task of Artificial Intelligence. Normally, this task uses large textual datasets whose representation is feasible because of normalization and selection techniques. In the literature, we can find three normalization techniques: stemming, lemmatization, and nominalization. Nevertheless, it is difficult to choose the most suitable technique for the text classification task. In this paper, we investigate this question experimentally by applying five different classifiers to four textual datasets in the Portuguese language. Additionally, the classification results are evaluated using unigrams, bigrams, and the combination of unigrams and bigrams. The results indicate that, in general, the number of terms obtained by each of the cases and the comprehensibility required in the results of the classification can be used as criteria to define the most suitable technique for the text classification task.
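The three term representations compared in the evaluation (unigrams, bigrams, and their combination) can be sketched as follows. This is an illustrative sketch only: the tokenizer is a simple assumption, the function names are hypothetical, and no stemming, lemmatization, or nominalization is applied, since those steps depend on language-specific tools not named in the abstract.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplistic)."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, mode="both"):
    """Build the term list for one document.

    mode: "unigrams", "bigrams", or "both" (unigrams followed by bigrams).
    """
    tokens = tokenize(text)
    if mode == "unigrams":
        return ngrams(tokens, 1)
    if mode == "bigrams":
        return ngrams(tokens, 2)
    if mode == "both":
        return ngrams(tokens, 1) + ngrams(tokens, 2)
    raise ValueError(f"unknown mode: {mode}")


document = "Mineração de textos"
for mode in ("unigrams", "bigrams", "both"):
    print(mode, extract_features(document, mode))
```

A normalization step would be applied to `tokens` before building the n-grams, which is where the number of distinct terms changes from one technique to another.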
Soft Computing | 2013
Renan de Padua; Fabiano Fernandes dos Santos; Merley da Silva Conrado; Veronica Oliveira de Carvalho; Solange Oliveira Rezende
Among the approaches for post-processing association rules, clustering is an interesting one. When an association rule set is clustered, the user is provided with an improved presentation of the mined patterns. The domain to be explored is structured so as to group association rules with similar knowledge. To take advantage of this organization, it is essential that good labels be assigned to the groups, in order to guide the user during the association rule exploration process. Few works have explored and proposed labeling methods for this context. Moreover, these methods have not been explored through subjective evaluations in order to measure their quality; usually, only objective evaluations are used. This paper subjectively evaluates five labeling methods used in association rule clustering. The evaluation aims to find the methods that present the best results based on the analysis of domain experts. The experimental results demonstrate that there is a disagreement between objective and subjective evaluations, as reported in other works in the literature.
North American Chapter of the Association for Computational Linguistics | 2013
Merley da Silva Conrado; Thiago Alexandre Salgueiro Pardo; Solange Oliveira Rezende
Archive | 2008
Bruno M. Nogueira; Maria Fernanda Moura; Merley da Silva Conrado; Rafael Geraldeli Rossi; Ricardo Marcondes Marcacini; Solange Oliveira Rezende
Brazilian Symposium on Multimedia and the Web | 2014
Rafael Martins D'Addio; Merley da Silva Conrado; Solange Resende; Marcelo G. Manzato