Erick Cantu-Paz | Researchain

Publication

Featured research published by Erick Cantu-Paz.

Information Retrieval | 2011

The sum of its parts: reducing sparsity in click estimation with query segments

Dustin Hillard; Eren Manavoglu; Hema Raghavan; Chris Leggetter; Erick Cantu-Paz; Rukmini Iyer

The critical task of predicting clicks on search advertisements is typically addressed by learning from historical click data. When enough history is observed for a given query-ad pair, future clicks can be accurately modeled. However, based on the empirical distribution of queries, sufficient historical information is unavailable for many query-ad pairs. The sparsity of data for new and rare queries makes it difficult to accurately estimate clicks for a significant portion of typical search engine traffic. In this paper we provide analysis to motivate modeling approaches that can reduce the sparsity of the large space of user search queries. We then propose methods to improve click and relevance models for sponsored search by mining click behavior for partial user queries. We aggregate click history for individual query words, as well as for phrases extracted with a CRF model. The new models show significant improvement in clicks and revenue compared to state-of-the-art baselines trained on several months of query logs. Results are reported on live traffic of a commercial search engine, in addition to results from offline evaluation.

Parameter Setting in Evolutionary Algorithms | 2007

Parameter Setting in Parallel Genetic Algorithms

Erick Cantu-Paz

Parallel genetic algorithms (GAs) have numerous parameters that affect their efficiency and accuracy. Traditionally, these parameters have been examined through empirical studies whose generality and limitations are difficult to assess. This chapter reviews existing theoretical models that predict the effects of the parameters. The models are used to examine the effect of communication topologies, migration rates, population sizing, and the choice of migrants and the individuals they replace in the receiving populations. The models should help practitioners make informed decisions about the setting of parameters of parallel GAs.

Soft Computing | 2008

Special issue on distributed bioinspired algorithms

Francisco Fernández de Vega; Erick Cantu-Paz

Parallel computing and distribution of information have been part of the history of computers from the first days. Yet, with the advent and the explosive growth of the Internet in the last decades, distributed systems have become a required backbone supporting everyday tasks. Meanwhile, microprocessor manufacturers are providing the general public with what was a chimera in the past: multi-core processors on a single chip. On the other hand, grid computing is moving from being a promising proposal to a useful technology allowing institutions to share computing resources (Foster and Kesselman 2004). Even computer users have been invited to collaborate among themselves and also with scientists thanks to peer-to-peer (P2P) (Minar and Hedlund 2004) and volunteer computing technologies (Anderson 2004). In summary, the advancements of communication technologies as well as decreasing costs of hardware have allowed distributed algorithms to benefit any area ranging from cognitive science to particle physics. Scientists are increasingly realizing the potential provided by parallel and distributed computing. Time-consuming algorithms from the past are being reconsidered to exploit the potential of parallel and distributed computing. Bioinspired algorithms have been applied to numerous problems in many different domains (Olariu and Zomaya

Scalable Optimization via Probabilistic Modeling | 2006

Feature Subset Selection with Hybrids of Filters and Evolutionary Algorithms

Erick Cantu-Paz

The performance of classification algorithms is affected by the features used to describe the labeled examples presented to the inducers. Therefore, the problem of feature subset selection has received considerable attention. Approaches to this problem based on evolutionary algorithms (EAs) typically use the wrapper method, treating the inducer as a black box that is used to evaluate candidate feature subsets. However, the evaluations might take a considerable time and the wrapper approach might be impractical for large data sets. Alternative filter methods use heuristics to select feature subsets from the data and are usually considered to scale better than wrappers with the dimensionality and volume of the data. This chapter describes hybrids of EAs and filter methods applied to the selection of feature subsets for classification problems. The proposed hybrids were compared against each of their components, two feature selection wrappers that are in wide use, and another filter-wrapper hybrid. The objective of this chapter is to determine if the proposed evolutionary hybrids present advantages over the other methods in terms of accuracy or speed. The experiments use decision tree and naive Bayes (NB) classifiers on public-domain and artificial data sets. The experimental results suggest that the evolutionary hybrids usually find compact feature subsets that result in the most accurate classifiers, while beating the execution time of the other wrappers.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2016

Amazon Search: The Joy of Ranking Products

Daria Sorokina; Erick Cantu-Paz

Amazon is one of the world's largest e-commerce sites and Amazon Search powers the majority of Amazon's sales. As a consequence, even small improvements in relevance ranking both positively influence the shopping experience of millions of customers and significantly impact revenue. In the past, Amazon's product search engine consisted of several hand-tuned ranking functions using a handful of input features. A lot has changed since then. In this talk we are going to cover a number of relevance algorithms used in Amazon Search today. We will describe a general machine learning framework used for ranking within categories, blending separate rankings in All Product Search, NLP techniques used for matching queries and products, and algorithms targeted at unique tasks of specific categories: books and fashion.

Web Search and Data Mining | 2010