Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Carlos Santa Cruz is active.

Publication


Featured research published by Carlos Santa Cruz.


Pattern Recognition | 2012

Hierarchical linear support vector machine

Irene Rodriguez-Lujan; Carlos Santa Cruz; Ramón Huerta

The increasing size and dimensionality of real-world datasets make it necessary to design algorithms that are efficient not only in the training process but also in the prediction phase. In applications such as credit card fraud detection, the classifier must predict an event in at most 10 ms, so prediction-speed constraints heavily outweigh training costs. We propose a new classification method, the Hierarchical Linear Support Vector Machine (H-LSVM), based on the construction of an oblique decision tree in which each node split is obtained as a Linear Support Vector Machine. Although other methods have been proposed to break the data space down into subregions to speed up Support Vector Machines, the H-LSVM algorithm is a very simple model that is efficient in training and, above all, in prediction for large-scale datasets: only a few hyperplanes need to be evaluated in the prediction step, no kernel computation is required, and the tree structure makes parallelization possible. In experiments with medium and large datasets, H-LSVM reduces the prediction cost considerably while achieving classification results closer to those of the non-linear SVM than to those of the linear case.
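As a rough illustration of the idea described in the abstract (not the paper's implementation), the sketch below builds an oblique decision tree whose node splits are linear separators trained with hinge-loss SGD, a crude stand-in for a proper LSVM solver; all function and variable names are hypothetical:

```python
import numpy as np

def fit_linear_split(X, y, lr=0.1, lam=0.01, epochs=100, seed=0):
    """Hinge-loss SGD for a linear separator (stand-in for an LSVM solver)."""
    rng = np.random.default_rng(seed)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w + b) < 1:      # margin violation: hinge update
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                              # inside the margin: only shrink w
                w -= lr * lam * w
    return w, b

def build_hlsvm(X, y, depth=0, max_depth=3):
    """Oblique decision tree: each internal node is a single hyperplane."""
    majority = 1 if (y == 1).sum() >= (y == -1).sum() else -1
    if depth == max_depth or len(np.unique(y)) == 1:
        return ("leaf", majority)
    w, b = fit_linear_split(X, y)
    right = X @ w + b >= 0
    if right.all() or (~right).all():          # degenerate split: stop growing
        return ("leaf", majority)
    return ("node", w, b,
            build_hlsvm(X[~right], y[~right], depth + 1, max_depth),
            build_hlsvm(X[right], y[right], depth + 1, max_depth))

def predict(tree, x):
    """Prediction evaluates only O(depth) hyperplanes, with no kernel computation."""
    while tree[0] == "node":
        _, w, b, left, right = tree
        tree = right if x @ w + b >= 0 else left
    return tree[1]

# Smoke test on two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (100, 2)), rng.normal(2, 0.5, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)
tree = build_hlsvm(X, y)
acc = np.mean(np.array([predict(tree, x) for x in X]) == y)
```

The tree keeps prediction cheap exactly as the abstract describes: a test point traverses at most `max_depth` hyperplanes instead of evaluating a kernel against support vectors.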


IEEE Transactions on Signal Processing | 2003

Autoassociative neural networks and noise filtering

José R. Dorronsoro; Vicente López; Carlos Santa Cruz; Juan A. Sigüenza

We introduce linear autoassociative neural (AN) network filters for the removal of additive noise from one-dimensional (1-D) time series. The AN network has a (2M+1) × L × (2M+1) architecture, and for fixed M we show how to choose the optimal L value and output coordinate from square-error estimates between the AN filter outputs and the clean series. The frequency response of AN filters is also studied, and they are shown to act as matched band filters. A noise-variance estimate is also derived from this analysis. We numerically illustrate their behavior on two examples and compare their theoretical performance with that of optimal Wiener filters.
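A minimal sketch of the filtering scheme, under one simplifying assumption: a linear bottleneck autoencoder of this shape is equivalent to projecting the delay-embedded windows onto their top-L principal components, so the code below uses an SVD projection in place of an actually trained network:

```python
import numpy as np

M, L = 10, 4                       # window half-width and bottleneck size
rng = np.random.default_rng(0)
t = np.arange(2000)
clean = np.sin(2 * np.pi * t / 50)                  # 1-D series to recover
noisy = clean + 0.3 * rng.standard_normal(t.size)   # additive noise

# Delay embedding: each row is a window of 2M+1 consecutive samples.
win = 2 * M + 1
Xw = np.lib.stride_tricks.sliding_window_view(noisy, win)

# Linear (2M+1) x L x (2M+1) autoencoder == projection of the windows onto
# their top-L principal directions (PCA via SVD).
mu = Xw.mean(axis=0)
_, _, Vt = np.linalg.svd(Xw - mu, full_matrices=False)
P = Vt[:L]
recon = (Xw - mu) @ P.T @ P + mu

# The paper selects one output coordinate; the middle one is the natural choice.
denoised = recon[:, M]
target = clean[M:M + len(denoised)]
mse_noisy = np.mean((noisy[M:M + len(denoised)] - target) ** 2)
mse_filtered = np.mean((denoised - target) ** 2)
```

Because windows of a single sinusoid span only a two-dimensional subspace, a small bottleneck L already captures the clean signal while discarding most of the noise energy.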


international conference on artificial neural networks | 1996

A Nonlinear Discriminant Algorithm for Data Projection and Feature Extraction

Carlos Santa Cruz; José R. Dorronsoro

A nonlinear supervised feature extraction algorithm that directly combines Fisher's criterion function with a preliminary nonlinear projection of vectors in pattern space will be described. After some computational details are given, a comparison with Fisher's linear method will be made over a concrete example.
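A minimal sketch of the underlying idea, assuming a fixed quadratic feature map as the "preliminary nonlinear projection" (the paper learns its projection; here it is hand-picked purely for illustration):

```python
import numpy as np

def fisher_direction(Z, y, ridge=1e-6):
    """Fisher's discriminant direction w = Sw^{-1} (m1 - m0)."""
    m0, m1 = Z[y == 0].mean(0), Z[y == 1].mean(0)
    C0, C1 = Z[y == 0] - m0, Z[y == 1] - m1
    Sw = C0.T @ C0 + C1.T @ C1 + ridge * np.eye(Z.shape[1])
    return np.linalg.solve(Sw, m1 - m0)

def fisher_accuracy(Z, y):
    """Project on the Fisher direction; threshold at the midpoint of class means."""
    w = fisher_direction(Z, y)
    p = Z @ w
    thr = 0.5 * (p[y == 0].mean() + p[y == 1].mean())
    pred = (p > thr).astype(int)
    return (pred == y).mean()

# Two concentric rings: not linearly separable in the original pattern space.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 400)
r = np.where(np.arange(400) < 200, 1.0, 3.0) + 0.1 * rng.standard_normal(400)
X = np.c_[r * np.cos(theta), r * np.sin(theta)]
y = (np.arange(400) >= 200).astype(int)

phi = np.c_[X, X[:, 0] ** 2, X[:, 1] ** 2, X[:, 0] * X[:, 1]]  # quadratic map
acc_linear = fisher_accuracy(X, y)       # Fisher on raw vectors: near chance
acc_nonlinear = fisher_accuracy(phi, y)  # Fisher after a nonlinear projection
```

The squared-coordinate features make the class radius linearly visible, so Fisher's criterion, hopeless on the raw vectors, separates the rings almost perfectly after the projection.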


international work-conference on artificial and natural neural networks | 2001

Natural Gradient Learning in NLDA Networks

José R. Dorronsoro; Ana M. González; Carlos Santa Cruz

Neural network training is usually formulated as a problem in function minimization. More precisely, if W are the weights defining a network's architecture and e(W) is the weight-dependent error function, its gradient ∇e(W) is usually employed to arrive at the optimal weight set W*. There may be several ways of exploiting this information, and the simplest is plain gradient descent, which assumes a "Euclidean" structure in the underlying space of the W weights. Although very natural, this may sometimes result in quite slow network learning, both in batch and, especially, in on-line error minimization, where the global error function e(W) is replaced by an individual, pattern-dependent error function e(Z,W). Several procedures such as adaptive learning rates or the addition of momentum terms have been proposed [6]. A different approach is suggested by the fact that in some instances there may be metrics other than the Euclidean one better suited to describe weight space. This has been shown to be the case for a related problem, likelihood estimation for parametric probability models [1], [4], for which a Riemannian structure can be defined in weight space. The same reasoning can be applied to a concrete network model, the Multilayer Perceptron (MLP). When used in regression problems, that is, when the MLP tries to establish a relationship between an input X and an output y for each pattern Z = (X,y), a probability model p(Z;W) = p(X,y;W) can be defined in pattern space so that the on-line MLP error function e(Z,W) = e(X,y,W) = (y − F(X,W))²/2 is seen as the log-likelihood of p(Z;W); here F(X,W) denotes the network's transfer function. This allows one to recast network learning as the likelihood estimation of a certain semi-parametric probability density p(X,y;W). In this setting, there is [2] a natural Riemannian metric on the space {p(X,y;W) : W} of these densities, determined by a metric tensor given by the matrix

G(W) = E[(∇_W log p)(∇_W log p)^t] = ∫∫ (∂ log p / ∂W)(∂ log p / ∂W)^t p(X,y;W) dX dy.

G(W) is also known as the Fisher information matrix, as it gives the Cramér-Rao bound on the variance of the optimal parameter estimator. This suggests using the "natural" gradient in the Riemannian setting, that is, G(W)^{-1} ∇_W e(X,y;W), instead of the ordinary Euclidean gradient ∇_W e(X,y;W).


Archive | 2002

Extreme Sample Classification and Credit Card Fraud Detection

José R. Dorronsoro; Ana M. González; Carlos Santa Cruz

Credit card fraud detection is an obviously difficult problem, for two reasons. The first is the overwhelming majority of good operations over fraudulent ones. The second is the similarity of many bad operations to legal ones. In other words, catching a fraudulent operation is akin to finding needles in a haystack, only that some needles are in fact hay! In this type of problem (which we term below Extreme Sample problems) well-established methods for classifier construction, such as Multilayer Perceptrons (MLPs), may fail. Non-Linear Discriminant Analysis, an alternative method, is described here, and some issues pertaining to its practical use, such as fast convergence and architecture selection, are also discussed. Its performance is compared with that of MLPs over Extreme Sample problems, and it is shown to give better results both on synthetic data and on credit card fraud.


international conference on artificial neural networks | 2001

Architecture Selection in NLDA Networks

José R. Dorronsoro; Ana M. González; Carlos Santa Cruz

In Non-Linear Discriminant Analysis (NLDA) an MLP-like architecture is used to minimize a Fisher discriminant-analysis criterion function. In this work we study the architecture selection problem for NLDA networks. We derive asymptotic distribution results for NLDA weights, from which Wald-like tests can be obtained. We also discuss how to use them to make decisions on unit relevance based on the acceptance or rejection of a certain null hypothesis.


international work-conference on artificial and natural neural networks | 1999

Small sample discrimination and professional performance assessment

David Aguado; José R. Dorronsoro; Beatriz Lucía; Carlos Santa Cruz


international work-conference on artificial and natural neural networks | 1997

Noise Discrimination and Autoassociative Neural Networks

Carlos Santa Cruz; José R. Dorronsoro; Juan A. Sigüenza; Vicente López


international work-conference on artificial and natural neural networks | 1995

Fast Automatic Architecture Selection on RBF Networks

Ana M. González; Carlos Santa Cruz; Vicente López; José R. Dorronsoro


Journal of Machine Learning Research | 2010

Quadratic Programming Feature Selection

Irene Rodriguez-Lujan; Ramón Huerta; Charles Elkan; Carlos Santa Cruz
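The natural-gradient rule G(W)^{-1} ∇_W e(X,y;W) quoted in the NLDA abstracts above can be made concrete on a toy model. In the sketch below (hypothetical, not the papers' code) F(X,W) = X·W with unit-variance Gaussian noise, for which the Fisher matrix E[(∇_W log p)(∇_W log p)^t] reduces to E[X X^t]; on ill-conditioned inputs the natural step reaches the optimum while plain gradient descent crawls:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3
# Badly scaled inputs -> ill-conditioned Euclidean geometry in weight space.
X = rng.standard_normal((n, d)) * np.array([1.0, 5.0, 0.2])
W_true = np.array([1.0, -2.0, 3.0])
y = X @ W_true + 0.1 * rng.standard_normal(n)

def grad(W):
    """Euclidean gradient of the mean error e = (y - X W)^2 / 2."""
    return X.T @ (X @ W - y) / n

# Fisher matrix for this Gaussian model: the score is (y - X.W) X and, under
# the model, E[(y - F)^2] = 1, so G(W) = E[(score)(score)^t] = E[X X^t].
G = X.T @ X / n

W_nat = np.zeros(d)
W_gd = np.zeros(d)
for _ in range(5):
    W_nat = W_nat - np.linalg.solve(G, grad(W_nat))  # natural gradient step
    W_gd = W_gd - 0.02 * grad(W_gd)                  # plain gradient, small lr

err_nat = np.linalg.norm(W_nat - W_true)
err_gd = np.linalg.norm(W_gd - W_true)
```

For this quadratic error the natural step coincides with a Newton step, so it essentially converges in one iteration; plain gradient descent is throttled by the largest curvature direction and barely moves along the smallest one.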

Collaboration


Dive into Carlos Santa Cruz's collaborations.

Top Co-Authors

José R. Dorronsoro (Autonomous University of Madrid)
Vicente López (Autonomous University of Madrid)
Ana M. González (Autonomous University of Madrid)
Juan A. Sigüenza (Autonomous University of Madrid)
Jules Kouatchou (Goddard Space Flight Center)
Ramón Huerta (University of California)
Thomas L. Clune (Goddard Space Flight Center)
Beatriz Lucía (Autonomous University of Madrid)
David Aguado (Autonomous University of Madrid)