Alair Pereira do Lago

Publication

Featured research published by Alair Pereira do Lago.

International Conference on Artificial Immune Systems | 2008

Credit Card Fraud Detection with Artificial Immune System

Manoel Fernando Alonso Gadi; Xidi Wang; Alair Pereira do Lago

We apply Artificial Immune Systems (AIS) [4] to credit card fraud detection and compare them with other methods such as Neural Nets (NN) [8], Bayesian Nets (BN) [2], Naive Bayes (NB) and Decision Trees (DT) [13]. Exhaustive search and a Genetic Algorithm (GA) [7] are used to select optimized parameter sets, which minimize the fraud cost for a credit card database provided by a Brazilian card issuer. The specifics of the fraud database are taken into account, such as the skewness of the data and the different costs associated with false positives and false negatives. Tests are done with holdout sample sets, and all executions are run using Weka [18], a publicly available software package. Our results are consistent with the early result of Maes [12], which concludes that BN is better than NN; this held in all our evaluated tests. Although NN is widely used in the market today, the evaluated implementation of NN is among the worst methods for our database. Despite behaving poorly with the default parameter set, AIS has the best performance when parameters optimized by GA are used.
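
The cost-sensitive selection described in this abstract can be illustrated with a small sketch. It is not the authors' setup: the cost values, the toy scores, and the use of a decision threshold as the single tuned parameter are all assumptions made for illustration, and the function names are hypothetical.

```python
def fraud_cost(y_true, y_pred, cost_fn=100.0, cost_fp=1.0):
    """Total cost of a set of predictions.

    cost_fn and cost_fp are hypothetical values; the abstract only says
    that false negatives and false positives carry different costs.
    """
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * cost_fn + fp * cost_fp


def best_threshold(scores, y_true, thresholds, **costs):
    """Exhaustive search: try every candidate threshold, keep the cheapest."""
    def cost_at(threshold):
        y_pred = [int(s >= threshold) for s in scores]
        return fraud_cost(y_true, y_pred, **costs)
    return min(thresholds, key=cost_at)


# Toy data (assumed): classifier scores and true labels, 1 = fraud.
scores = [0.9, 0.6, 0.4, 0.2, 0.1]
y_true = [1, 0, 1, 0, 0]
print(best_threshold(scores, y_true, [0.3, 0.5, 0.7]))
```

Because a missed fraud is assumed to cost far more than a false alarm, the search prefers the lowest threshold here, even though it flags one legitimate transaction.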

Workshop on Algorithms in Bioinformatics | 2006

Alignment with non-overlapping inversions in O(n^3)-time

Augusto F. Vellozo; Carlos Eduardo Rodrigues Alves; Alair Pereira do Lago

Alignments of sequences are widely used for biological sequence comparisons. Usually only biological events like mutations, insertions and deletions are modeled, and other biological events like inversions are not automatically detected by the usual alignment algorithms. Alignment with inversions does not have a known polynomial algorithm, and a simplification of the problem that considers only non-overlapping inversions was proposed by Schoniger and Waterman [20] in 1992, together with a corresponding O(n^6) solution. An improvement to an algorithm with O(n^3 log n)-time complexity was announced in an extended abstract [1] and, in the present paper, we give an algorithm that solves this simplified problem in O(n^3)-time and O(n^2)-space in the more general framework of an edit graph. Inversions have recently [4,7,13,17] been discovered to be very important in Comparative Genomics, and Scherer et al. in 2005 [11] experimentally verified inversions that were found to be polymorphic in the human genome. Moreover, 10% of the 1,576 putative inversions reported overlap RefSeq genes in the human genome. We believe our new algorithms may open the possibility of more detailed studies of inversions in DNA sequences using exact optimization algorithms, and we hope this may be particularly interesting if applied to regions around known rearrangement boundaries. Scherer et al. report 29 such cases and prioritize them as candidates for biological and evolutionary studies.
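
To make the simplified problem concrete, here is a minimal sketch of a naive dynamic program for alignment with non-overlapping inversions. It is in the spirit of the O(n^6) baseline mentioned above, not the O(n^3) algorithm of the paper. The scoring values, the inversion penalty, and the modelling of an inversion as a reverse complement are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(s):
    """Reverse complement of a DNA string (assumed model of an inversion)."""
    return "".join(COMPLEMENT[c] for c in reversed(s))


def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Standard global alignment score, keeping one DP row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def align_with_inversions(a, b, inv=-1, match=1, mismatch=-1, gap=-1):
    """Naive DP: every cell also tries every inverted block ending there.

    inv is a hypothetical penalty paid once per inversion. Blocks are
    chained through S, so the chosen inversions never overlap.
    """
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = float("-inf")
            if i > 0 and j > 0:
                sub = match if a[i - 1] == b[j - 1] else mismatch
                best = max(best, S[i - 1][j - 1] + sub)
            if i > 0:
                best = max(best, S[i - 1][j] + gap)
            if j > 0:
                best = max(best, S[i][j - 1] + gap)
            # Align a[i0:i] against the reverse complement of b[j0:j].
            for i0 in range(i):
                for j0 in range(j):
                    block = global_score(a[i0:i], revcomp(b[j0:j]),
                                         match, mismatch, gap)
                    best = max(best, S[i0][j0] + inv + block)
            S[i][j] = best
    return S[n][m]


print(global_score("AAGG", "AACC"), align_with_inversions("AAGG", "AACC"))
```

On this toy pair the plain alignment scores 0, while allowing one inversion of the trailing block raises the score to 3. The nested loops over block boundaries, each with its own quadratic alignment, are what make the naive approach so expensive.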

ACM Symposium on Applied Computing | 2010

Fraud detection in reputation systems in e-markets using logistic regression

Rafael P. Maranzato; Adriano C. M. Pereira; Alair Pereira do Lago; Marden Neubert

Reputation systems are especially important in e-markets, where they help buyers decide whether or not to purchase a product. This work addresses the task of finding attempts to deceive reputation systems in e-markets. Our goal is to generate a list of users (sellers) ranked by the probability of fraud. First we describe characteristics of transactions that may indicate evidence of fraud, and we extend them to the sellers. We describe the results of a simple approach that ranks sellers by counting characteristics of fraud. Then we incorporate characteristics that cannot be used by the counting approach, and we apply logistic regression to both sets of characteristics, improved and not improved. We use real data from a large Brazilian e-market to train and evaluate our methods, and the improved set with logistic regression performs better. The list with the topmost 32.1% of probable fraudsters against the reputation system was selected. We increased by 110% the number of identified fraudsters against the reputation system, and no false positives were confirmed.
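
Ranking sellers by estimated fraud probability can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the two binary features, the toy training data, the learning rate, and the function names are all hypothetical, and the model is a plain logistic regression fitted by batch gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Estimated probability of fraud for one feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def rank_sellers(sellers, w, b):
    """Seller names ordered from most to least probable fraudster."""
    scored = [(predict(w, b, x), name) for name, x in sellers.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy training set (assumed): rows are sellers, columns are fraud indicators.
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]
w, b = train_logistic(X, y)
sellers = {"a": [1, 1], "b": [0, 1], "c": [0, 0], "d": [1, 0]}
print(rank_sellers(sellers, w, b))
```

A ranked list like this lets reviewers inspect only the top fraction, which mirrors the abstract's selection of the topmost part of the list.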

Algorithms for Molecular Biology | 2009

Lossless Filter for Multiple Repeats with Bounded Edit Distance

Pierre Peterlongo; Gustavo Sacomoto; Alair Pereira do Lago; Nadia Pisanti; Marie-France Sagot

Background: Identifying local similarity between two or more sequences, or identifying repeats occurring at least twice in a sequence, is an essential part of the analysis of biological sequences and of their phylogenetic relationship. Finding such fragments while allowing for a certain number of insertions, deletions, and substitutions is, however, known to be a computationally expensive task, and consequently exact methods usually cannot be applied in practice. Results: The filter TUIUIU that we introduce in this paper provides a possible solution to this problem. It can be used as a preprocessing step for any multiple alignment or repeats inference method, eliminating a possibly large fraction of the input that is guaranteed not to contain any approximate repeat. It consists of the verification of several strong necessary conditions that can be checked quickly. We implemented three versions of the filter. The first is simply a straightforward extension, to the case of multiple sequences, of an application of conditions already existing in the literature. The second uses a stronger condition which, as our results show, enables filtering noticeably more with negligible (if any) additional time. The third version uses an additional condition and pushes the sensibility of the filter even further, with a non-negligible additional time in many circumstances; our experiments show that it is particularly useful with large error rates. The latter version was applied as a preprocessing step for a multiple alignment tool, obtaining an overall time (filter plus alignment) on average 63 and at best 530 times smaller than before (direct alignment), with a better quality alignment in most cases. Conclusion: To the best of our knowledge, TUIUIU is the first filter designed for multiple repeats and for dealing with error rates greater than 10% of the repeats' length.

Journal of Discrete Algorithms | 2008

Lossless filter for multiple repetitions with Hamming distance

Pierre Peterlongo; Nadia Pisanti; Frédéric Boyer; Alair Pereira do Lago; Marie-France Sagot

Similarity search in texts, notably in biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been created in order to speed up the solution of the problem. However, previous filters were made for speeding up pattern matching, or for finding repetitions between two strings or occurring twice in the same string. In this paper, we present an algorithm called Nimbus for filtering strings prior to finding repetitions occurring twice or more in a string, or in two or more strings. Nimbus uses gapped seeds that are indexed with a new data structure, called a bi-factor array, that is also presented in this paper. Experimental results show that the filter can be very efficient: preprocessing with Nimbus a data set where one wants to find functional elements using a multiple local alignment tool such as Glam, the overall execution time can be reduced from 7.5 hours to 2 minutes.

Discrete Applied Mathematics | 2014

A convexity upper bound for the number of maximal bicliques of a bipartite graph

Alexandre Albano; Alair Pereira do Lago

Given a bipartite graph, we present an upper bound for its number of maximal bicliques as the product of the numbers of maximal bicliques of two appropriate subgraphs. Such an upper bound is a function of bipartite convexity, a generalization of the convex property for bipartite graphs. We survey known upper bounds present in the literature and construct families of graphs for which our bound is sharper than all the other known bounds. For particular families, only our upper bound is polynomial. We also show that determining convexity is NP-hard.
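
For intuition about the quantity being bounded, the following sketch counts the maximal bicliques of a small bipartite graph by brute force. It does not implement the convexity bound of the paper; it only enumerates, in exponential time, the objects that such bounds count. The adjacency representation and function names are assumptions, and bicliques are required to have both sides non-empty.

```python
from itertools import combinations


def common_neighbors(vertices, adj, universe):
    """Vertices of `universe` adjacent to every vertex in `vertices`."""
    result = set(universe)
    for v in vertices:
        result &= adj[v]
    return frozenset(result)


def maximal_bicliques(adj_left):
    """Enumerate maximal bicliques; adj_left maps left vertex -> right neighbors.

    For each non-empty subset S of the left side, take Y = common neighbors
    of S and then X = common neighbors of Y. Every such pair (X, Y) with Y
    non-empty is a maximal biclique, and every maximal biclique arises
    this way.
    """
    left = list(adj_left)
    right = set().union(*adj_left.values())
    adj_right = {r: {l for l in left if r in adj_left[l]} for r in right}
    found = set()
    for size in range(1, len(left) + 1):
        for subset in combinations(left, size):
            ys = common_neighbors(subset, adj_left, right)
            if not ys:
                continue
            xs = common_neighbors(ys, adj_right, left)
            found.add((xs, ys))
    return found


# Toy graph (assumed): left vertices "a", "b"; right vertices 1, 2, 3.
graph = {"a": {1, 2}, "b": {2, 3}}
print(len(maximal_bicliques(graph)))
```

Here the three maximal bicliques are ({a}, {1, 2}), ({b}, {2, 3}) and ({a, b}, {2}). Since the count can grow exponentially with the graph size, upper bounds that are polynomial for particular families are of practical interest.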

Latin American Web Congress | 2009

Feature Extraction for Fraud Detection in Electronic Marketplaces

Rafael P. Maranzato; Marden Neubert; Adriano C. M. Pereira; Alair Pereira do Lago

Electronic markets are software systems that enable online transactions between buyers and sellers. One of the major challenges in these markets is to establish trust among users. This is normally addressed by introducing a reputation system that allows users to be evaluated for each transaction they perform. This work considers the problem of detecting fraudulent behavior of users against reputation systems in electronic marketplaces. We select and exhibit seventeen features with good discrimination power that are effective for this task, and we conduct experiments using a real-world dataset from a large Brazilian marketplace, including a list of known fraudsters identified by fraud experts. As a first, quick application of these features, we investigate how a minimal number of features k can be used as stronger evidence of fraud. With k = 1 we cover as much as 97% of known frauds, but the precision is only 14.31% (F-measure 0.25). The best F-measure is 0.43 and occurs for k = 4 and k = 5. Since many sellers who defraud the reputation system are still undetected, the computed precisions are not reliable. Almost all supposed false positives with at least ten features were manually checked and confirmed by experts to have fraudulent behavior, changing the precision from 47% to at least 98% for k = 10. In the end, the fraudster list was increased by 32% by this first analysis, and the largest reviewed F-measure is 0.60.
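
The figures quoted for k = 1 can be checked with a short sketch. It assumes that the F-measure is the balanced F1 score and that "cover" refers to recall. The thresholding rule (flag a seller showing at least k features) follows the abstract, while the toy seller data and function names are hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(feature_counts, known_fraudsters, k):
    """Flag sellers with at least k features and score against known fraud."""
    flagged = {s for s, count in feature_counts.items() if count >= k}
    hits = len(flagged & known_fraudsters)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(known_fraudsters) if known_fraudsters else 0.0
    return precision, recall, f_measure(precision, recall)


# Precision 14.31% and recall 97% are the values quoted for k = 1.
print(round(f_measure(0.1431, 0.97), 2))  # prints 0.25

# Toy data (assumed): number of fraud features observed per seller.
counts = {"s1": 5, "s2": 1, "s3": 0, "s4": 3, "s5": 1}
known = {"s1", "s4"}
for k in (1, 3):
    print(k, evaluate(counts, known, k))
```

Raising k trades recall for precision, which is why the best F-measure in the abstract occurs at intermediate values of k.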

Latin American Symposium on Theoretical Informatics | 1998

Maximal Groups in Free Burnside Semigroups

Alair Pereira do Lago

We prove that any maximal group in the free Burnside semigroup defined by the equation x^n = x^(n+m), for any n ≥ 1 and any m ≥ 1, is a free Burnside group satisfying x^m = 1. We show that such a group is free over a well described set of generators whose cardinality is the cyclomatic number of a graph associated to the ℑ-class containing the group. For n = 2 and for every m ≥ 2 we present examples with 2m−1 generators. Hence, in these cases, we have infinite maximal groups for large enough m. This allows us to prove important properties of Burnside semigroups for the case n = 2, which was almost completely unknown until now. Surprisingly, the case n = 2 presents simultaneously the complexities of the cases n = 1 and n ≥ 3: the maximal groups are cyclic of order m for n ≥ 3, but they can have more generators and be infinite for n ≤ 2; there are exactly 2^|A| ℑ-classes and they are easily characterized for n = 1, but there are infinitely many ℑ-classes and they are difficult to characterize for n ≥ 2.

Theoretical Informatics and Applications | 2001

Free Burnside Semigroups

Alair Pereira do Lago; Imre Simon

This paper surveys the area of Free Burnside Semigroups. The theory of these semigroups, as is the case for groups, is far from being completely known. For semigroups, the most impressive results were obtained in the last 10 years. In this paper we give priority to the mathematical treatment of the problem and do not dwell on either the motivation or the historical aspects. No proofs are presented in this paper, but we have tried to give as many examples as possible.

Theoretical Informatics and Applications | 2005

A sparse dynamic programming algorithm for alignment with non-overlapping inversions

Alair Pereira do Lago; Ilya B. Muchnik; Casimir A. Kulikowski

Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, so some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many important results in the last decade, the complexity of the problem of alignment with inversions is still unknown. In 1992, [...]
