Enrique Puertas Sanz | Researchain

Publication

Featured research published by Enrique Puertas Sanz.

document engineering | 2006

Content based SMS spam filtering

José María Gómez Hidalgo; Guillermo Cajigas Bringas; Enrique Puertas Sanz; Francisco Carrero García

In recent years, we have witnessed a dramatic increase in the volume of spam email. Other related forms of spam are increasingly emerging as significant problems, especially spam on Instant Messaging services (the so-called SPIM) and Short Message Service (SMS) or mobile spam. Like email spam, the SMS spam problem can be approached with legal, economic or technical measures. Among the wide range of technical measures, Bayesian filters are playing a key role in stopping email spam. In this paper, we analyze to what extent Bayesian filtering techniques used to block email spam can be applied to the problem of detecting and stopping mobile spam. In particular, we have built two SMS spam test collections of significant size, in English and Spanish. We have tested on them a number of message representation techniques and Machine Learning algorithms, in terms of effectiveness. Our results demonstrate that Bayesian filtering techniques can be effectively transferred from email to SMS spam.
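As a rough illustration of what a Bayesian content filter does, the following is a minimal sketch of a multinomial naive Bayes scorer with add-one smoothing. It is not the authors' system: the whitespace tokenizer, the function names and the four toy messages are assumptions made only for this example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokens; the paper compares several representations.
    return text.lower().split()

def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_log_odds(text, model):
    """Log-odds of spam versus ham; positive values favour spam."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts["spam"] / doc_counts["ham"])
    for label, sign in (("spam", 1), ("ham", -1)):
        # Add-one (Laplace) smoothing over the shared vocabulary.
        total = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += sign * math.log((word_counts[label][tok] + 1) / total)
    return score

# Hypothetical toy data, not taken from the SMS collections described above.
TOY = [
    ("win a free prize now", "spam"),
    ("free ringtone text now", "spam"),
    ("are we meeting for lunch", "ham"),
    ("call me when you are home", "ham"),
]
model = train(TOY)
```

With this toy model, `spam_log_odds("free prize", model)` is positive and `spam_log_odds("lunch at home", model)` is negative, which is the decision rule a filter would threshold on.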

conference on information and knowledge management | 2007

Spam filtering for short messages

Gordon V. Cormack; José María Gómez Hidalgo; Enrique Puertas Sanz

We consider the problem of content-based spam filtering for short text messages that arise in three contexts: mobile (SMS) communication, blog comments, and email summary information such as might be displayed by a low-bandwidth client. Short messages often consist of only a few words, and therefore present a challenge to traditional bag-of-words based spam filters. Using three corpora of short messages and message fields derived from real SMS, blog, and spam messages, we evaluate feature-based and compression-model-based spam filters. We observe that bag-of-words filters can be improved substantially using different features, while compression-model filters perform quite well as-is. We conclude that content filtering for short messages is surprisingly effective.
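The abstract does not name the alternative features it refers to. One common choice for very short texts, assumed here purely for illustration, is overlapping character n-grams, which still yield many features when a message has only a few words. The function name and padding scheme below are hypothetical.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a lowercased, space-padded message."""
    padded = " " + text.lower() + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

# Obfuscated spellings still share features with the plain word.
shared = set(char_ngrams("freee")) & set(char_ngrams("free"))
```

A word-level bag of words would treat "freee" and "free" as unrelated tokens, whereas their trigram sets overlap, which is one reason such features can help on short, noisy messages.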

international acm sigir conference on research and development in information retrieval | 2007

Feature engineering for mobile (SMS) spam filtering

Gordon V. Cormack; José María Gómez Hidalgo; Enrique Puertas Sanz

Mobile spam is an increasing threat that may be addressed using filtering systems like those employed against email spam. We believe that email filtering techniques require some adaptation to reach good levels of performance on SMS spam, especially regarding message representation. In order to test this assumption, we have performed experiments on SMS filtering using top performing email spam filters on mobile spam messages using a suitable feature representation, with results supporting our hypothesis.

conference on computational natural language learning | 2000

Combining text and heuristics for cost-sensitive spam filtering

José María Gómez Hidalgo; Manuel Maña López; Enrique Puertas Sanz

Spam filtering is a text categorization task with special features that make it interesting and difficult. First, the task has traditionally been performed using heuristics from the domain. Second, a cost model is required to avoid misclassification of legitimate messages. We present a comparative evaluation of several machine learning algorithms applied to spam filtering, considering the text of the messages and a set of heuristics for the task. Cost-oriented biasing and evaluation are performed.

Advances in Computers | 2008

Email spam filtering

Enrique Puertas Sanz; José María Gómez Hidalgo; José Carlos Cortizo Pérez

In recent years, email spam has become an increasingly important problem, with a large economic impact on society. In this work, we present the problem of spam, how it affects us, and how we can fight against it. We discuss legal, economic, and technical measures used to stop these unsolicited emails. Among all the technical measures, those based on content analysis have been particularly effective in filtering spam, so we focus on them, explaining how they work in detail. In summary, we explain the structure and the process of different Machine Learning methods used for this task, and how we can make them cost-sensitive through several methods like threshold optimization, instance weighting, or MetaCost. We also discuss how to evaluate spam filters using basic metrics, TREC metrics, and the receiver operating characteristic convex hull method, which best suits classification problems in which target conditions are not known, as is the case here. We also describe how actual filters are used in practice. Finally, we present different methods used by spammers to attack spam filters and what we can expect to find in the coming years in the battle of spam filters against spammers.
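Of the cost-sensitive methods mentioned, threshold optimization is the simplest to sketch: given spam scores on held-out messages, choose the decision threshold that minimizes total misclassification cost when blocking a legitimate message costs more than letting a spam through. The cost model, function names and toy scores below are assumptions for illustration, not the chapter's exact procedure.

```python
def total_cost(scores, labels, threshold, fp_cost):
    """Cost of classifying as spam every message whose score is >= threshold.

    labels: True for spam, False for legitimate.
    fp_cost: cost of blocking a legitimate message, relative to a missed spam (cost 1).
    """
    cost = 0.0
    for score, is_spam in zip(scores, labels):
        predicted_spam = score >= threshold
        if predicted_spam and not is_spam:
            cost += fp_cost   # legitimate message blocked
        elif not predicted_spam and is_spam:
            cost += 1.0       # spam delivered
    return cost

def best_threshold(scores, labels, fp_cost):
    # Candidate thresholds: each observed score, plus "never block".
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates, key=lambda t: total_cost(scores, labels, t, fp_cost))

# Hypothetical validation scores (higher means more spam-like) and true labels.
SCORES = [0.95, 0.80, 0.70, 0.40, 0.10]
LABELS = [True, True, False, True, False]
```

On this toy data, equal costs (`fp_cost=1`) give a threshold of 0.40, while `fp_cost=9` moves it up to 0.80: the filter becomes more conservative as blocking legitimate mail becomes more expensive.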

international conference natural language processing | 2005

Named entity recognition for web content filtering

José María Gómez Hidalgo; Francisco Carrero García; Enrique Puertas Sanz

Effective Web content filtering is a necessity in educational and workplace environments, but current approaches are far from perfect. We discuss a model for text-based intelligent Web content filtering, in which shallow linguistic analysis plays a key role. In order to demonstrate how this model can be realized, we have developed a lexical Named Entity Recognition system, and used it to improve the effectiveness of statistical Automated Text Categorization methods. We have performed several experiments that confirm this fact, and encourage the integration of other shallow linguistic processing techniques in intelligent Web content filtering.

international workshop on ambient assisted living | 2015

Big Data Processing Using Wearable Devices for Wellbeing and Healthy Activities Promotion

Diego Gachet Páez; Manuel de Buenaga Rodríguez; Enrique Puertas Sanz; María Teresa Villalba; Rafael Muñoz Gil

The aging population and the economic crisis, especially in developed countries, have led to a reduction in the funds dedicated to healthcare. It is therefore desirable to optimize the costs of public and private healthcare systems by reducing the flow of chronic and dependent people to care centers; promoting healthy lifestyles and activities can help people avoid chronic diseases such as hypertension. In this paper we describe a system for promoting an active and healthy lifestyle and for providing people with guidelines and valuable information about their habits. The proposed system is being developed around the Big Data paradigm, using bio-signal sensors and machine learning algorithms for recommendations.

Advances in Computers | 2009

Chapter 7 Web Content Filtering

José María Gómez Hidalgo; Enrique Puertas Sanz; Francisco Carrero García; Manuel de Buenaga Rodríguez

Over the years, the Internet has evolved from an academic network into a true communication medium, reaching impressive audience levels and becoming a multi-billion business. Many of our work, study, and entertainment activities are nowadays severely limited if we are disconnected from the net of networks. And of course, with the use comes abuse. The World Wide Web features a wide variety of content that is harmful for children or just inappropriate in the workplace. Web filtering and monitoring systems have emerged as valuable tools for the enforcement of suitable usage policies. These systems are routinely deployed in corporate, library, and school networks, and contribute to detecting and limiting Internet abuse. Their techniques are increasingly sophisticated and effective, and their development is contributing to the advance of the state of the art in a number of research fields, like text analysis and image processing. In this chapter, we review the main issues regarding Web content filtering, including its motivation, the main operational concerns and techniques used in filtering tools’ development, their evaluation and security, and a number of singular projects in this field.

Health Informatics Journal | 2018

Healthy and wellbeing activities' promotion using a Big Data approach.

Diego Gachet Páez; Manuel de Buenaga Rodríguez; Enrique Puertas Sanz; María Teresa Villalba; Rafael Muñoz Gil

The aging population and the economic crisis, especially in developed countries, have led to a reduction in the funds dedicated to health care; it is therefore desirable to optimize the costs of public and private healthcare systems by reducing the flow of chronic and dependent people to care centers; promoting healthy lifestyles and activities can help people avoid chronic diseases such as hypertension. In this article, we describe a system for promoting an active and healthy lifestyle and for providing people with guidelines and valuable information about their habits. The proposed system is being developed around the Big Data paradigm, using bio-signal sensors and machine-learning algorithms for recommendations.

Archive | 2007