Estevam R. Hruschka | Researchain

Publication

Featured research published by Estevam R. Hruschka.

Decision Support Systems | 2014

Tweet sentiment analysis with classifier ensembles

Nádia Félix Felipe da Silva; Eduardo R. Hruschka; Estevam R. Hruschka

Twitter is a microblogging site in which users can post updates (tweets) to friends (followers). It has become an immense dataset of the so-called sentiments. In this paper, we introduce an approach that automatically classifies the sentiment of tweets by using classifier ensembles and lexicons. Tweets are classified as either positive or negative concerning a query term. This approach is useful for consumers who can use sentiment analysis to search for products, for companies that aim at monitoring the public sentiment of their brands, and for many other applications. Indeed, sentiment classification in microblogging services (e.g., Twitter) through classifier ensembles and lexicons has not been well explored in the literature. Our experiments on a variety of public tweet sentiment datasets show that classifier ensembles formed by Multinomial Naive Bayes, SVM, Random Forest, and Logistic Regression can improve classification accuracy. We show that classifier ensembles are promising for tweet sentiment analysis. We compare bag-of-words and feature hashing for the representation of tweets. Classifier ensembles obtained from bag-of-words and feature hashing are discussed.
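
The ensemble idea in this abstract can be illustrated with a minimal sketch. It assumes majority voting as the combination rule, which the abstract does not specify, and it uses three toy rule-based functions as hypothetical stand-ins for the trained base classifiers (Multinomial Naive Bayes, SVM, Random Forest, Logistic Regression) named in the paper. The word lists are made up for illustration.

```python
from collections import Counter

# Tiny, made-up lexicons used only for this illustration.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}


def lexicon_classifier(tweet):
    """Hypothetical base classifier: compare positive and negative word counts."""
    words = tweet.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    return "positive" if score >= 0 else "negative"


def exclamation_classifier(tweet):
    """Hypothetical base classifier: treat an exclamation mark as positive."""
    return "positive" if "!" in tweet else "negative"


def negation_classifier(tweet):
    """Hypothetical base classifier: treat the word 'not' as negative."""
    return "negative" if "not" in tweet.lower().split() else "positive"


def majority_vote(classifiers, tweet):
    """Return the label predicted by most base classifiers."""
    votes = Counter(clf(tweet) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [lexicon_classifier, exclamation_classifier, negation_classifier]
print(majority_vote(ensemble, "I love this phone!"))
print(majority_vote(ensemble, "not good , awful battery"))
```

An odd number of base classifiers is used here so that a two-class vote cannot tie.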

Engineering Applications of Artificial Intelligence | 2009

Using Bayesian networks with rule extraction to infer the risk of weed infestation in a corn-crop

Gláucia M. Bressan; Vilma A. Oliveira; Estevam R. Hruschka; Maria do Carmo Nicoletti

This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed-crop competitiveness using as input categorical variables for the total density of weeds and corresponding proportions of narrow and broad-leaved weeds. The inferred categorical variable values for the weed-crop competitiveness, along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage, are then used as input for a second Bayesian network classifier to infer categorical variable values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationship among the nodes of the first and second Bayesian classifiers in a supervised fashion, respectively. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focuses on knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed.

Information Sciences | 2016

Using unsupervised information to improve semi-supervised tweet sentiment classification

Nádia Félix Felipe da Silva; Luiz F. S. Coletta; Eduardo R. Hruschka; Estevam R. Hruschka

Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applications such as tweet sentiment analysis, where a large amount of unlabeled data is available. Semi-supervised learning for tweet sentiment analysis, although quite appealing, is relatively new. We propose a semi-supervised learning framework that combines unsupervised information, captured from a similarity matrix constructed from unlabeled data, with a classifier. Our motivation is that such a similarity matrix is a powerful knowledge-discovery tool that can help classify unlabeled tweet sets. Our framework makes use of the well-known Self-training algorithm to induce a better tweet sentiment classifier. Experimental results in real-world datasets demonstrate that the proposed framework can improve the accuracy of tweet sentiment analysis.
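
The Self-training loop mentioned in this abstract can be sketched as follows. This is plain self-training only: the similarity-matrix component of the proposed framework is not reproduced. The nearest-centroid classifier, the one-dimensional data, the confidence measure and the threshold are all assumptions chosen to keep the example small.

```python
def fit_centroids(points, labels):
    """Return the mean of the points in each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_confidence(centroids, x):
    """Predict the nearest centroid; confidence is the distance gap to the runner-up."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    return best, abs(x - centroids[second]) - abs(x - centroids[best])


def self_train(labeled, unlabeled, threshold=1.0, max_rounds=10):
    """Repeatedly retrain, moving confidently predicted points into the labeled set."""
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(points, labels)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(centroids, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            points.append(x)
            labels.append(label)
        pool = [x for x, _ in remaining]
    return fit_centroids(points, labels), pool


# Hypothetical data: two labeled points and five unlabeled ones.
centroids, leftover = self_train([(0.0, "neg"), (10.0, "pos")], [1.0, 2.0, 9.0, 8.0, 5.0])
print(centroids, leftover)
```

The point at 5.0 is equidistant from both classes, so it never passes the confidence threshold and stays unlabeled.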

International Conference on Computational Linguistics | 2014

Biocom Usp: Tweet Sentiment Analysis with Adaptive Boosting Ensemble

Nádia Félix Felipe da Silva; Estevam R. Hruschka; Eduardo R. Hruschka

We describe our approach for the SemEval-2014 task 9: Sentiment Analysis in Twitter. We make use of an ensemble learning method for sentiment classification of tweets that relies on varied features such as feature hashing, part-of-speech, and lexical features. Our system was evaluated in the Twitter message-level task.
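
Feature hashing, one of the feature types listed above, maps tokens to a fixed-length vector without storing a vocabulary. The sketch below is a generic illustration, not the system's actual feature extractor: the bucket count, the use of MD5 as the hash, and the omission of a sign hash are all assumptions.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map tokens to a fixed-length count vector using a stable hash."""
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % n_buckets] += 1  # collisions simply add up
    return vector


tweet = "great phone great battery".split()
vec = hash_features(tweet)
print(vec)
```

Unlike bag-of-words, the vector length is fixed in advance, at the cost of occasional collisions between unrelated tokens.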

International Conference on Machine Learning and Applications | 2010

Evolutionary Algorithm Using Random Multi-point Crossover Operator for Learning Bayesian Network Structures

Edimilson Batista dos Santos; Estevam R. Hruschka; Nelson F. F. Ebecken

Variable ordering (VO) plays an important role when inducing Bayesian networks (BNs). Previous works in the literature suggest that the use of genetic/evolutionary algorithms (EAs) for dealing with VO, when learning a Bayesian network structure from data, is worth pursuing. This work proposes a new crossover operator, named Random Multi-point Crossover Operator (RMX), to be used with the Variable Ordering Evolutionary Algorithm (VOEA). Empirical results obtained by VOEA are compared to the ones achieved by VOGA (Variable Ordering Genetic Algorithm) and indicate an improvement in the quality of the VO and of the induced BN structure.
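
The abstract does not describe how RMX works, so the following is not that operator. It is a generic multi-point crossover over variable orderings, given only to show the kind of operation involved: a random number of cut points is drawn, segments alternate between the two parents, and a repair step (an assumption of this sketch) restores a valid ordering in which every variable appears once.

```python
import random


def multipoint_crossover(parent_a, parent_b, rng, max_points=3):
    """Generic multi-point crossover for orderings (permutations) of variables."""
    n = len(parent_a)
    k = rng.randint(1, min(max_points, n - 1))
    cuts = sorted(rng.sample(range(1, n), k))

    # Alternate segments between the two parents at each cut point.
    child, take_from_a, prev = [], True, 0
    for cut in cuts + [n]:
        source = parent_a if take_from_a else parent_b
        child.extend(source[prev:cut])
        take_from_a = not take_from_a
        prev = cut

    # Repair: replace repeated variables with the missing ones, in parent_b's order.
    present = set(child)
    missing = iter([v for v in parent_b if v not in present])
    seen, repaired = set(), []
    for v in child:
        if v in seen:
            v = next(missing)
        seen.add(v)
        repaired.append(v)
    return repaired


rng = random.Random(0)
order_a = list("ABCDEF")
order_b = list("FEDCBA")
print(multipoint_crossover(order_a, order_b, rng))
```

The repair never runs out of replacements, because the number of repeated entries in the child always equals the number of variables missing from it.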

Fundamenta Informaticae | 2013

Automatic Learning of Temporal Relations Under the Closed World Assumption

Maria do Carmo Nicoletti; Flávia O. Santos de Sá Lisboa; Estevam R. Hruschka

Time plays an important role in the vast majority of problems and, as such, it is a vital issue to be considered when developing computer systems for solving problems. In the literature, one of the most influential formalisms for representing time is known as Allen's Temporal Algebra, based on a set of 13 relations (basic and reversed) that may hold between two time intervals. In spite of having a few drawbacks and limitations, Allen's formalism is still a convenient representation due to its simplicity and implementability and also due to the fact that it has been the basis of several extensions. This paper explores the automatic learning of Allen's temporal relations by the inductive logic programming system FOIL, taking into account two possible representations for a time interval: (i) as a primitive concept and (ii) as a concept defined by the primitive concept of time point. The goals of the experiments described in the paper are (1) to explore the viability of both representations for use in automatic learning; (2) compare the facility and interpretability of the results; (3) evaluate the impact of the given examples for inducing a proper representation of the relations; and (4) experiment with both representations under the closed world assumption (CWA), which would ease continuous learning using FOIL. Experimental results are presented and discussed as evidence that the CWA can be a convenient strategy when learning Allen's temporal relations.
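
For readers unfamiliar with Allen's 13 relations, the sketch below determines which one holds between two intervals when each interval is given by its start and end time points, in the spirit of representation (ii). It hand-codes the definitions rather than learning them, so it illustrates the target concepts only, not the FOIL experiments. The relation names follow common usage and the function name is hypothetical.

```python
def allen_relation(a, b):
    """Return the Allen relation that holds from interval a to interval b.

    Each interval is a (start, end) pair of time points with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only proper overlaps remain at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"


print(allen_relation((1, 4), (2, 6)))
print(allen_relation((2, 3), (1, 5)))
```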

Systems, Man and Cybernetics | 2006

A Comparative Evaluation of Constructive Neural Networks Methods using PRM and BCP as TLU Training Algorithms

João Roberto Bertini; Maria do Carmo Nicoletti; Estevam R. Hruschka

Constructive neural network algorithms enable the architecture of a neural network to be constructed as an intrinsic part of the learning process. These algorithms are very dependent on the threshold logic unit (TLU) training algorithm they employ. Generally they use a Perceptron-based algorithm, such as Pocket or Pocket with Ratchet Modification (PRM), for training each individual node added to the network during the learning process. A vast selection of algorithms for training individual TLUs can be found in the literature. This paper investigates the use of the Barycentric Correction Procedure (BCP) algorithm with four constructive algorithms, namely Tower, Pyramid, Shift and Perceptron-Cascade. Results show that some constructive neural algorithms have better performance using BCP than using PRM.
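
As a rough illustration of Perceptron-based TLU training with a pocket, the sketch below keeps aside the best weight vector seen so far and replaces it only when a new one classifies more training examples correctly. It is a simplified variant: it sweeps the examples in order rather than sampling them at random, and it checks the full training set after every update. The AND data set and all names are assumptions for the example; BCP is not shown.

```python
def predict(weights, x):
    """Threshold logic unit: +1 if the weighted sum is positive, otherwise -1."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else -1


def count_correct(weights, data):
    """Number of training examples classified correctly."""
    return sum(predict(weights, x) == y for x, y in data)


def pocket_train(data, epochs=100):
    """Perceptron updates, keeping in the pocket the weights with the most correct answers."""
    weights = [0.0] * len(data[0][0])
    pocket, pocket_score = list(weights), count_correct(weights, data)
    for _ in range(epochs):
        for x, y in data:
            if predict(weights, x) != y:
                weights = [w + y * xi for w, xi in zip(weights, x)]
                score = count_correct(weights, data)
                if score > pocket_score:  # ratchet-style check on the whole training set
                    pocket, pocket_score = list(weights), score
    return pocket


# Logical AND, with a constant bias input as the last component of each example.
AND_DATA = [((0, 0, 1), -1), ((0, 1, 1), -1), ((1, 0, 1), -1), ((1, 1, 1), 1)]
tlu = pocket_train(AND_DATA)
print(tlu, count_correct(tlu, AND_DATA))
```

On data that are not linearly separable, the plain Perceptron weights keep changing, which is why returning the pocket rather than the final weights matters.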

Risk Analysis | 2014

Bayesian classifiers applied to the Tennessee Eastman process.

Edimilson Batista dos Santos; Nelson F. F. Ebecken; Estevam R. Hruschka; Ali Elkamel; Chandra Mouli R. Madhuranthakam

Fault diagnosis includes the main task of classification. Bayesian networks (BNs) present several advantages in the classification task, and previous works have suggested their use as classifiers. Because a classifier is often only one part of a larger decision process, this article proposes, for industrial process diagnosis, the use of a Bayesian method called dynamic Markov blanket classifier, whose main goal is the induction of accurate Bayesian classifiers that have dependable probability estimates and reveal actual relationships among the most relevant variables. In addition, a new method named variable ordering multiple offspring sampling, capable of inducing a BN to be used as a classifier, is presented. The performance of these methods is assessed on the data of a benchmark problem known as the Tennessee Eastman process. The obtained results are compared with naive Bayes and tree augmented network classifiers, and confirm that both proposed algorithms can provide good classification accuracies as well as knowledge about relevant variables.
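
The Markov blanket of a node in a Bayesian network consists of its parents, its children, and the other parents of its children. The sketch below computes it for a class variable in a small, made-up network. It shows the concept only and does not reproduce the dynamic Markov blanket classifier; the variable names are hypothetical and are not taken from the Tennessee Eastman data.

```python
def markov_blanket(parents, node):
    """Return the Markov blanket of node; parents maps each node to its set of parents."""
    children = {c for c, ps in parents.items() if node in ps}
    spouses = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | spouses) - {node}


# Hypothetical network in which Fault is the class variable.
PARENTS = {
    "Fault": {"Load"},
    "Pressure": {"Fault", "Valve"},
    "Temperature": {"Fault"},
    "Valve": set(),
    "Load": set(),
    "Flow": {"Valve"},
}

print(sorted(markov_blanket(PARENTS, "Fault")))
```

Flow is left out because, given the blanket, it carries no further information about Fault.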

Fundamenta Informaticae | 2013

Coupling as Strategy for Reducing Concept-Drift in Never-ending Learning Environments

Estevam R. Hruschka; Maisa C. Duarte; Maria do Carmo Nicoletti

The design and implementation of autonomous computational systems that incrementally learn and use what has been learnt to continually refine their learning abilities over time is still a goal far from being achieved. Such dynamic systems would conform to the main ideas of the automatic learning model conventionally characterized as never-ending learning (NEL). The never-ending approach to learning exhibits similarities to the semi-supervised (SS) model, which has been successfully implemented by bootstrap learning methods. Bootstrap learning has been one of the most successful among the SS methods proposed to date and, as such, is the natural candidate for implementing NEL systems. Bootstrap methods learn from an available labeled set of data, use the induced knowledge to label some unlabeled new data and, recurrently, learn again from both sets of data in a cyclic manner. However, the use of SS methods, particularly bootstrapping methods, to implement NEL systems can give rise to a problem known as concept-drift. Errors that may occur when the system automatically labels new unlabeled data can, over time, cause the system to run off track. The development of new strategies to lessen the impact of concept-drift is an important issue that should be addressed if the goal is to increase the plausibility of developing such systems employing bootstrap methods. Coupling techniques can play an important role in reducing concept-drift effects over machine learning systems, particularly those designed to perform tasks related to machine reading. This paper proposes and formalizes relevant coupling strategies for dealing with the concept-drift problem in a NEL environment implemented as the system RTWP (Read The Web in Portuguese); initial results have shown they are promising strategies for minimizing the problem, taking into account a few system settings.
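
The abstract does not say which coupling strategies are formalized. As one illustrative possibility, the sketch below assumes a mutual-exclusion constraint between categories: during a bootstrap round, an instance that two mutually exclusive categories both claim with high confidence is not promoted to either, which limits the errors that would otherwise feed the next round. The categories, instances, confidences and threshold are all made up.

```python
def promote(candidates, mutually_exclusive, threshold=0.8):
    """Select instances to add to each category in one bootstrap round.

    candidates maps category -> {instance: confidence}. A confident instance is
    dropped when a mutually exclusive category also claims it confidently.
    """
    confident = {
        cat: {inst for inst, conf in scored.items() if conf >= threshold}
        for cat, scored in candidates.items()
    }
    promoted = {}
    for cat, instances in confident.items():
        rivals = set()
        for a, b in mutually_exclusive:
            if cat == a:
                rivals |= confident.get(b, set())
            if cat == b:
                rivals |= confident.get(a, set())
        promoted[cat] = instances - rivals
    return promoted


# Hypothetical extractor output for two categories.
CANDIDATES = {
    "city": {"Lisboa": 0.95, "Porto": 0.90, "Amazonas": 0.85},
    "river": {"Amazonas": 0.90, "Tejo": 0.92, "Porto": 0.30},
}
result = promote(CANDIDATES, [("city", "river")])
print(result)
```

Without the constraint, the ambiguous instance would be labeled as both a city and a river and would then be used as training data for both categories.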

Intelligent Systems Design and Applications | 2007

Biomass Based Weed-Crop Competitiveness Classification Using Bayesian Networks

Gláucia M. Bressan; Vilma A. Oliveira; Estevam R. Hruschka; Maria do Carmo Nicoletti

This paper describes the modeling of a biomass-based weed-crop competitiveness classification process, based on a Bayesian network classifier. The understandability of the model is improved by its automatic translation into a set of classification rules, which are easily understood by human beings. The Bayesian approach is based on empirical data collected in a corn crop and uses the concept of maximum a posteriori probability to extract a set of probabilistic rules from the induced Bayesian network classifier. The features used to build the Bayesian network classifier are the total density of weeds and the corresponding proportions of narrow-leaf and broadleaf weeds, and the class variable is the weed biomass, from which the weed-crop competitiveness is inferred. The paper presents a set of 27 rules extracted from the Bayesian network classifier which classify the biomass of weeds.
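
Rule extraction by maximum a posteriori probability can be sketched as follows: for every combination of feature values, compute the posterior over classes and emit a rule whose conclusion is the most probable class. The sketch assumes a naive Bayes structure with Laplace smoothing, which may differ from the network used in the paper, and the feature values, class labels and data are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import product


def fit_naive_bayes(samples, alpha=1.0):
    """Fit a naive Bayes model on (feature_tuple, label) samples with Laplace smoothing."""
    class_counts = Counter(label for _, label in samples)
    n_features = len(samples[0][0])
    values = [sorted({f[i] for f, _ in samples}) for i in range(n_features)]
    cond = defaultdict(Counter)  # (feature index, class) -> counts of each value
    for features, label in samples:
        for i, v in enumerate(features):
            cond[(i, label)][v] += 1

    def posterior(features):
        scores = {}
        for c, nc in class_counts.items():
            p = nc / len(samples)
            for i, v in enumerate(features):
                p *= (cond[(i, c)][v] + alpha) / (nc + alpha * len(values[i]))
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    return values, posterior


def extract_rules(values, posterior):
    """One rule per feature combination: (features, MAP class, posterior probability)."""
    rules = []
    for combo in product(*values):
        probs = posterior(combo)
        best = max(probs, key=probs.get)
        rules.append((combo, best, probs[best]))
    return rules


# Hypothetical samples: (weed density, narrow-leaf proportion) -> competitiveness.
SAMPLES = (
    [(("low", "low"), "weak")] * 3
    + [(("high", "high"), "strong")] * 3
    + [(("high", "low"), "strong"), (("low", "high"), "weak")]
)
values, posterior = fit_naive_bayes(SAMPLES)
rules = extract_rules(values, posterior)
for features, label, prob in rules:
    print("IF", features, "THEN", label)
```

The number of rules equals the product of the numbers of values per feature, so the rule set grows quickly as features are added.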
