Myrian C. A. Costa
Federal University of Rio de Janeiro
Publication
Featured research published by Myrian C. A. Costa.
IEEE International Conference on High Performance Computing Data and Analytics | 2004
Alexandre G. Evsukoff; Myrian C. A. Costa; Nelson F. F. Ebecken
This work presents an implementation of a fuzzy rule-based classifier in which each single-variable fuzzy rule-based classifier (or a set of them) is assigned to a different processor in a parallel architecture. Partial conclusions are synchronized and processed by a master processor. This approach has been applied to a very large database, and the results are compared with a parallel neural network approach.
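The master/worker structure described in this abstract can be pictured with the minimal sketch below. It is not the authors' implementation: the triangular membership functions, the per-class rule weights, the summation used by the master, and the names `triangular`, `partial_conclusion`, and `classify` are all assumptions made for illustration, and a thread pool stands in for the separate processors.

```python
from concurrent.futures import ThreadPoolExecutor


def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set (assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def partial_conclusion(task):
    """One single-variable classifier: fuzzify x, then fire its rules.

    rules maps each fuzzy set (left, peak, right) to per-class weights.
    """
    x, rules = task
    n_classes = len(next(iter(rules.values())))
    scores = [0.0] * n_classes
    for fuzzy_set, weights in rules.items():
        mu = triangular(x, *fuzzy_set)
        for k, w in enumerate(weights):
            scores[k] += mu * w
    return scores


def classify(sample, rule_bases, n_workers=2):
    """Master step: gather partial conclusions, sum them, pick the best class."""
    tasks = list(zip(sample, rule_bases))
    # Each worker evaluates one variable; threads stand in for processors.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_conclusion, tasks))
    totals = [sum(col) for col in zip(*partials)]
    return max(range(len(totals)), key=totals.__getitem__)
```

Here `sample` holds one value per variable and `rule_bases` holds one rule dictionary per variable, so the classifier for each variable can run independently before the master combines the results.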
High Performance Computing for Computational Science (Vector and Parallel Processing) | 2008
Marta V. Modenesi; Alexandre G. Evsukoff; Myrian C. A. Costa
This work proposes a load-balancing algorithm for parallel processing based on a variation of the classical knapsack problem. The problem considers the distribution of a set of partitions, defined by the number of clusters, over a set of processors so as to achieve a minimal overall processing cost. The work optimizes the parallel fuzzy c-means (FCM) cluster analysis algorithm proposed in a previous work, and is composed of two distinct parts: the cluster analysis proper, which uses the FCM algorithm to calculate cluster centers and the PBM index to evaluate partitions, and the load balancing, which is modeled as a multiple knapsack problem and implemented through a heuristic that incorporates the restrictions related to cluster analysis in order to make the parallel process more efficient.
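As a rough illustration of the load-balancing idea, the sketch below assigns partitions to processors with a greedy "largest cost first" rule. It is not the heuristic from the paper: the cost model (cost proportional to the number of clusters in a partition) and the names `balance_partitions`, `cluster_counts`, and `n_processors` are assumptions made for this example.

```python
import heapq


def balance_partitions(cluster_counts, n_processors):
    """Greedily assign partitions to processors, largest cost first.

    Assumption: the cost of evaluating a partition is proportional to its
    number of clusters. Returns (assignment, loads), where assignment[p] is
    the list of cluster counts handled by processor p.
    """
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    assignment = [[] for _ in range(n_processors)]
    # Min-heap of (current load, processor index): the least loaded
    # processor is always at the top.
    heap = [(0, p) for p in range(n_processors)]
    heapq.heapify(heap)
    for c in sorted(cluster_counts, reverse=True):
        load, p = heapq.heappop(heap)
        assignment[p].append(c)
        heapq.heappush(heap, (load + c, p))
    loads = [sum(a) for a in assignment]
    return assignment, loads
```

For partitions with 2 to 9 clusters and three processors, `balance_partitions(range(2, 10), 3)` yields loads of 16, 15, and 13. A greedy rule like this does not guarantee an optimal assignment, which is one reason a knapsack formulation with problem-specific restrictions can be attractive.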
5. Congresso Brasileiro de Redes Neurais | 2016
Myrian C. A. Costa; Nelson F. F. Ebecken
This work discusses the implementation of a neural network algorithm within a data mining strategy for high performance computers. The study focuses on an application covering all aspects of data preparation and of neural network implementation and training. The performance of the implementation is also evaluated on two different computer architectures. Conclusions and recommendations drawn from this experience are presented.
IEEE International Conference on High Performance Computing Data and Analytics | 2010
Valeriana G. Roncero; Myrian C. A. Costa; Nelson F. F. Ebecken
The enormous amount of information stored in unstructured texts cannot simply be used for further processing by computers, which typically handle text as simple sequences of character strings. Text mining is the process of extracting interesting information and knowledge from unstructured text. One key difficulty with text classification learning algorithms is that they require many hand-labeled documents to learn accurately. In the text mining pattern discovery phase, the text classification step aims to automatically attribute one or more predefined classes to text documents. In this research, we propose to use an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and a naive Bayes classifier in a grid environment. This combination is based on a mixture of multinomials, which is commonly used in text classification. Naive Bayes is a probabilistic approach to inductive learning. It estimates the a posteriori probability that a document belongs to a class given the observed feature values of the document, assuming independence of the features. The class with the maximum a posteriori probability is assigned to the document. EM is a class of iterative algorithms for maximum likelihood or maximum a posteriori estimation in problems with unlabeled data. The grid environment is a geographically distributed computation infrastructure composed of a set of heterogeneous resources. The semi-supervised learning classifier is available as a grid service, expanding the functionality of the Aiuri Portal, which is a framework for a cooperative academic environment for education and research. Text classification methods are time-consuming, but using the grid infrastructure can bring significant benefits to the learning and classification process.
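The combination of EM and naive Bayes described in this abstract can be sketched in a few lines for a single machine. This is a minimal illustration under stated assumptions, not the grid service itself: documents are plain token lists, Laplace smoothing is used, the number of EM iterations is fixed, and the function names (`train_nb`, `posterior`, `em_naive_bayes`, `classify`) are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, weights, classes, vocab):
    """M-step: estimate priors and word probabilities from soft labels.

    docs is a list of token lists; weights[i][c] is P(class c | doc i).
    Laplace smoothing keeps every probability strictly positive.
    """
    prior, word_prob = {}, {}
    for c in classes:
        total_weight = sum(w[c] for w in weights)
        prior[c] = (1.0 + total_weight) / (len(classes) + len(docs))
        counts = Counter()
        for doc, w in zip(docs, weights):
            for token in doc:
                counts[token] += w[c]
        denom = len(vocab) + sum(counts.values())
        word_prob[c] = {t: (1.0 + counts[t]) / denom for t in vocab}
    return prior, word_prob


def posterior(doc, prior, word_prob, classes):
    """E-step for one document: P(class | doc) under naive Bayes."""
    log_p = {}
    for c in classes:
        # Tokens outside the training vocabulary are ignored.
        log_p[c] = math.log(prior[c]) + sum(
            math.log(word_prob[c][t]) for t in doc if t in word_prob[c]
        )
    top = max(log_p.values())
    unnorm = {c: math.exp(lp - top) for c, lp in log_p.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def em_naive_bayes(labeled, unlabeled, n_iter=10):
    """Fit naive Bayes on labeled docs, then refine it with EM on all docs."""
    classes = sorted({y for _, y in labeled})
    lab_docs = [d for d, _ in labeled]
    all_docs = lab_docs + list(unlabeled)
    vocab = sorted({t for d in all_docs for t in d})
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    prior, word_prob = train_nb(lab_docs, hard, classes, vocab)
    for _ in range(n_iter):
        # E-step: soft labels for unlabeled docs; labeled docs stay fixed.
        soft = [posterior(d, prior, word_prob, classes) for d in unlabeled]
        # M-step: re-estimate parameters from labeled and unlabeled docs.
        prior, word_prob = train_nb(all_docs, hard + soft, classes, vocab)
    return prior, word_prob, classes


def classify(doc, model):
    """Assign the class with the maximum a posteriori probability."""
    prior, word_prob, classes = model
    post = posterior(doc, prior, word_prob, classes)
    return max(classes, key=lambda c: post[c])
```

The unlabeled documents let the model learn words that never appear in the labeled set, which is the practical benefit of the semi-supervised setup when hand-labeled documents are scarce.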
International Conference on Data Technologies and Applications | 2009
V. G. Roncero; Myrian C. A. Costa; Nelson F. F. Ebecken
The enormous amount of information stored in unstructured texts cannot simply be used for further processing by computers, which typically handle text as simple sequences of character strings. Text mining is the process of extracting interesting information and knowledge from unstructured text. One key difficulty with text classification learning algorithms is that they require many hand-labeled documents to learn accurately. In the text mining pattern discovery phase, the text classification step aims to automatically attribute one or more predefined classes to text documents. In this research, we propose to use an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and a naive Bayes classifier in a grid environment. This combination is based on a mixture of multinomials, which is commonly used in text classification. Naive Bayes is a probabilistic approach to inductive learning. It estimates the a posteriori probability that a document belongs to a class given the observed feature values of the document, assuming independence of the features. The class with the maximum a posteriori probability is assigned to the document. EM is a class of iterative algorithms for maximum likelihood or maximum a posteriori estimation in problems with unlabeled data. The grid environment is a geographically distributed computation infrastructure composed of a set of heterogeneous resources. Text classification methods are time-consuming, but using the grid infrastructure can bring significant benefits to the learning and classification process.
High Performance Computing for Computational Science (Vector and Parallel Processing) | 2008
Antonio Anddre Serpa; Valeriana G. Roncero; Myrian C. A. Costa; Nelson F. F. Ebecken
The objective of this paper is to describe the implementation of text mining grid services for the Aiuri Project, a framework that includes a user-friendly interface, data and text mining tasks, database access, and a visualization tool integrated with various grid environments. The focus is on the development and testing of components for the analysis and evaluation of unstructured data in distinct grid environments. These components are grid services for text mining processes that use different execution approaches, depending on the grid environment to which the user chooses to submit jobs. All components are open source and freely available to the scientific community, providing access to existing services as well as encouraging the addition of new ones.
Grid and Cooperative Computing | 2003
S. R. R. Costa; L. G. Neves; F. Ayres; C. E. Mendonça; R. S. N. de Bragança; F. Gandour; L. V. Ferreira; Myrian C. A. Costa; Nelson F. F. Ebecken
The use of grid computing technology has been boosted in recent years by the growing demand for low-cost computing resources and by the idle computing capacity available in collaborative research and development environments. GridBR is a PETROBRAS project carried out in partnership with COPPE/UFRJ and IBM Brazil. The project aims to bring grid computing technology into the information technology strategy of the PETROBRAS Research and Development Center (CENPES). The present environment comprises a heterogeneous mix of architectures and operating systems, with IBM AIX and Linux workstations providing the required support for collaborative execution of applications. The results of a genetic algorithm optimization application are presented as an example of how to take advantage of the existing grid computing infrastructure at PETROBRAS.
XXXVIII Iberian-Latin American Congress on Computational Methods in Engineering | 2017
Daniel Lopes Braz dos Santos; Thais Medina Coeli Rochel de Camargo; Myrian C. A. Costa; Valeria M. Bastos; Nelson F. F. Ebecken
With the growing discussion of and interest in politics, among both the general public and specialists, there is a need to analyze, study, and follow the actions taken by the legislative and executive branches. However, the large volume of documents and bills, together with their length and their writing and layout conventions, makes this process difficult. A tool is needed that can identify and group bills by topic and separate these documents according to their polarity with respect to the identified topic. The objective of this dissertation is to present research on an efficient methodology for identifying the polarity and similarity of documents on the same topic, using as a case study bills for and against the liberalization of abortion in Brazil. The first aim is to identify and compare machine learning techniques for text that can classify the bills into these two positions. The methodology analyzes each part of a given bill using the K-means clustering technique and then applies a graph-based method to process all parameter combinations and find the bills that are most strongly connected to one another. Previously classified bills were used for testing, and the results obtained with this methodology revealed interesting and promising connections and peculiarities, thus helping to identify similarities and patterns in documents.
High Performance Computing for Computational Science (Vector and Parallel Processing) | 2008
Valeriana G. Roncero; Myrian C. A. Costa; Nelson F. F. Ebecken
Stemming algorithms are commonly used in Information Retrieval with the goal of reducing words that are morphological variants of one another to a common representation. Stemming analysis is one of the tasks in the pre-processing phase of text mining, and it consumes a lot of time. This study proposes a model of distributed stemming analysis in a grid environment to reduce the stemming processing time and thereby speed up text preparation. This model can be integrated into a grid-based text mining tool, helping to improve the overall performance of the text mining process.
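The idea of splitting the stemming workload across workers can be sketched as follows. This is only an illustration under stated assumptions: a local thread pool stands in for grid nodes, the suffix-stripping `stem` function is a toy rather than a real stemming algorithm, and the names `stem_chunk` and `distributed_stem` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy suffix list; a real system would use a full stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_chunk(words):
    """Work unit executed by one worker (a grid node in the paper's setting)."""
    return [stem(w) for w in words]


def distributed_stem(words, n_workers=4):
    """Split the word list into chunks, stem them concurrently, and merge."""
    if not words:
        return []
    size = -(-len(words) // n_workers)  # ceiling division
    chunks = [words[i : i + size] for i in range(0, len(words), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map returns results in the same order as the input chunks.
        results = list(pool.map(stem_chunk, chunks))
    return [w for chunk in results for w in chunk]
```

Because each word is stemmed independently of the others, the chunks need no communication while they run, which is what makes the task a natural fit for distribution.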
IEEE International Conference on High Performance Computing Data and Analytics | 2006
Marta V. Modenesi; Myrian C. A. Costa; Alexandre G. Evsukoff; Nelson F. F. Ebecken