Jesús Bescós
Autonomous University of Madrid
Publications
Featured research published by Jesús Bescós.
advanced concepts for intelligent vision systems | 2009
Sonsoles Herrero; Jesús Bescós
Moving object detection is a critical task for many computer vision applications: the objective is the classification of the pixels in the video sequence into either foreground or background. A commonly used technique to achieve this in scenes captured by a static camera is Background Subtraction (BGS). Several BGS techniques have been proposed in the literature, but a rigorous comparison that analyzes the different parameter configurations of each technique in different scenarios with precise ground-truth data is still lacking. To this end, we have implemented and evaluated the most relevant BGS techniques, and performed a quantitative and qualitative comparison between them.
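The abstract does not specify which BGS techniques were evaluated. Purely as an illustration of the basic idea (not one of the authors' evaluated configurations), a minimal running-average background model with per-pixel thresholding can be sketched as:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Running-average background model: bg <- (1-alpha)*bg + alpha*frame.
    alpha is the (illustrative) learning rate."""
    return (1.0 - alpha) * bg + alpha * frame.astype(np.float64)

def segment_foreground(bg, frame, threshold=25.0):
    """Classify each pixel: True (foreground) where the absolute
    difference from the background model exceeds the threshold."""
    return np.abs(frame.astype(np.float64) - bg) > threshold
```

Real BGS methods compared in the literature (e.g. mixture-of-Gaussians models) replace the single running average with richer per-pixel statistics, but the foreground/background decision has this same shape.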
machine vision applications | 2013
Javier Molina; Marcos Escudero-Viñolo; Alessandro Signoriello; Montse Pardàs; Christian Ferran; Jesús Bescós; Ferran Marqués; José M. Martínez
The use of hand gestures offers an alternative to commonly used human-computer interfaces, providing a more intuitive way of navigating among menus and multimedia applications. This paper presents a system for hand gesture recognition devoted to controlling windows applications. Starting from the images captured by a time-of-flight camera (a camera that produces images with an intensity level inversely proportional to the depth of the observed objects), the system performs hand segmentation as well as a low-level extraction of potentially relevant features related to the morphological representation of the hand silhouette. Classification based on these features discriminates between a set of possible static hand postures, which, combined with the estimated motion pattern of the hand, results in the recognition of dynamic hand gestures. The whole system works in real time, allowing practical interaction between user and application.
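The paper's segmentation stage is not detailed in the abstract. As a hedged sketch of the one property it does state, that time-of-flight intensity is inversely proportional to depth, a hand held towards the camera is the brightest region, so a first-cut segmentation can simply keep pixels near the maximum intensity (the `band` parameter is an assumption for illustration):

```python
import numpy as np

def segment_hand(depth_image, band=30):
    """Assume the hand is the closest object.  In a time-of-flight image
    where intensity is inversely proportional to depth, the closest
    pixels are the brightest, so keep pixels within `band` intensity
    levels of the maximum."""
    closest = depth_image.max()
    return depth_image >= (closest - band)
```

A real system would follow this with connected-component analysis and morphological cleanup before extracting silhouette features.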
international conference on image processing | 2008
Fabrizio Tiburzi; Marcos Escudero; Jesús Bescós; José M. Martínez
This paper describes the design procedure followed to generate a ground truth for the evaluation of motion-based algorithms for video-object segmentation. A thorough review and classification of the critical factors that affect the behavior of segmentation algorithms results in a set of video scripts which have then been filmed. Foreground objects have been recorded in a chroma studio, in order to automatically obtain pixel-level high quality segmentation masks for each generated sequence. The resulting corpus (segmentation ground-truth plus filmed sequences mounted over different backgrounds) is available for research purposes under a license agreement.
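The studio pipeline used by the authors is not described in the abstract; as a minimal sketch of the chroma-keying principle it relies on, a pixel can be labelled foreground when its colour lies far enough from the studio key colour (the green key and the threshold below are illustrative assumptions):

```python
import numpy as np

def chroma_key_mask(frame_rgb, key_color=(0, 255, 0), threshold=100.0):
    """Label as foreground every pixel whose Euclidean distance in RGB
    space from the studio key colour exceeds the threshold."""
    diff = frame_rgb.astype(np.float64) - np.asarray(key_color, dtype=np.float64)
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist > threshold
```

Production chroma keying additionally handles soft edges and colour spill, but the hard decision above is the core of how pixel-level masks are obtained automatically.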
Signal Processing-image Communication | 2007
Jesús Bescós; José M. Martínez; Luis Herranz; Fabricio Tiburzi
This work presents an on-line approach to the selection of a variable number of frames from a compressed video sequence, based solely on selection rules applied over domain-independent semantic features. The localization of these semantic features helps to infer the non-homogeneous distribution of semantically relevant information, which makes it possible to reduce the amount of adapted data while maintaining the meaningful information. The extraction of the required features is performed on-line, as demanded by many leading applications. This is achieved via techniques that operate in the compressed domain, which have been adapted to operate on-line. A subjective evaluation of on-line frame selection validates our results.
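The abstract does not give the authors' selection rules; a generic sketch of the idea of a variable-rate, feature-driven selection (denser sampling where content changes) could look like the following, where `scores` stands in for any per-frame semantic feature and `delta` is an illustrative threshold:

```python
def select_frames(scores, delta=0.2):
    """Keep a frame whenever its feature score differs from the score of
    the last selected frame by more than `delta`; frame 0 is always kept.
    This yields a variable number of frames, denser where the content
    changes, and works on-line (one pass, no look-ahead)."""
    selected = [0]
    last = scores[0]
    for i, s in enumerate(scores[1:], start=1):
        if abs(s - last) > delta:
            selected.append(i)
            last = s
    return selected
```

Because the rule only looks at the current frame and the last selected one, it can run as frames arrive, which is the on-line property the paper emphasizes.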
international conference on image processing | 2003
Jesús Bescós; José Manuel Menéndez; Narciso N. García
This paper addresses the problem of counting the customers crossing an uncontrolled access to a big store, from the information retrieved by a zenithal camera. The solution operates in real time and is scalable in two ways: it allows coverage of wide accesses (spatial scalability), and it works with a level of detail sufficient to allow detection of carried objects (functional scalability). In order to cope with the main problems of these systems (shadows, sudden changes in global lighting, and sporadic camera motion due to vibration), a novel DCT-based segmentation is presented.
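The paper's DCT-based segmentation is not specified in the abstract; as a hedged sketch of why a DCT representation helps with shadows and global lighting, one can compare blocks by their low-frequency AC coefficients while discarding the DC term, so that a pure brightness offset leaves the signature unchanged (block size, coefficient count, and threshold below are illustrative assumptions):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def block_dct_features(block, n_coeffs=4):
    """Low-frequency DCT coefficients of a square block (top-left
    n_coeffs x n_coeffs sub-square), with the DC term dropped so the
    feature ignores uniform brightness changes such as shadows."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block.astype(np.float64) @ C.T
    feats = coeffs[:n_coeffs, :n_coeffs].flatten()
    return feats[1:]  # drop the DC coefficient

def block_changed(bg_block, block, threshold=50.0):
    """A large distance between AC signatures flags a structural change;
    a brightness-only offset leaves the AC terms intact."""
    d = block_dct_features(block) - block_dct_features(bg_block)
    return bool(np.linalg.norm(d) > threshold)
```

With this signature, a block that merely darkens under a shadow keeps the same AC coefficients and is not flagged, while a block where an object appears is.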
workshop on image analysis for multimedia interactive services | 2008
J.C. San Miguel; Jesús Bescós; José M. Martínez; Álvaro García
This paper describes a generic, scalable, and distributed framework for real-time video analysis intended for research, prototyping, and service deployment purposes. The architecture considers multiple cameras and is based on a server/client model. The information generated by each analysis module and the context information are made accessible to the whole system by using a database system. System modules can be interconnected in several ways, thus achieving flexibility. The two main design criteria have been low computational cost and easy component integration. The experimental results show the potential use of this system.
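The abstract's key architectural choice is that modules share results through a database rather than direct connections. A toy in-memory stand-in (not the authors' actual database layer) shows why this decouples modules, since a module only needs to know the store, not its peers:

```python
class SharedStore:
    """Minimal in-memory stand-in for the database that decouples
    analysis modules: each module publishes results under its own name,
    and any other module can read them without a direct connection."""
    def __init__(self):
        self._data = {}

    def publish(self, module, key, value):
        self._data[(module, key)] = value

    def read(self, module, key, default=None):
        return self._data.get((module, key), default)
```

In the real framework this role is played by a database system shared across cameras and machines, which is what makes the interconnection of modules flexible.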
workshop on image analysis for multimedia interactive services | 2007
Fabricio Tiburzi; Jesús Bescós
This work describes a two-stage algorithm for camera motion analysis. First, it detects the frames that define changes in the camera motion pattern, and then it characterises the motion pattern of each video segment. The proposed scheme is motivated by hierarchical content-aware adaptation of on-line video. Its main characteristics are on-line operation, direct analysis in the compressed domain, efficiency, and robustness. Results are similar to those achieved by state-of-the-art camera motion classification schemes based on iterative fitting to a motion model, despite involving a significantly smaller number of operations.
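The abstract does not detail the characterisation stage; as an illustrative sketch of compressed-domain motion analysis in general, the motion vectors already present in the stream can be summarized with a robust translation estimate and a radial (zoom-like) component, without any iterative model fitting (all thresholds and labels below are assumptions for illustration):

```python
import numpy as np

def classify_camera_motion(mv_x, mv_y, pan_thr=1.0, zoom_thr=0.05):
    """Coarse camera-motion labelling from a dense motion-vector field
    (e.g. the vectors present in an MPEG stream).  A dominant uniform
    translation suggests a pan/tilt; a radial field (vectors scaling
    with position relative to the centre) suggests a zoom."""
    h, w = mv_x.shape
    # Median translation is robust to foreground-object outliers.
    tx, ty = np.median(mv_x), np.median(mv_y)
    # Zoom component: projection of residual vectors onto the radial field.
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs - (w - 1) / 2.0
    ys = ys - (h - 1) / 2.0
    denom = (xs ** 2 + ys ** 2).sum()
    zoom = ((mv_x - tx) * xs + (mv_y - ty) * ys).sum() / denom
    if abs(zoom) > zoom_thr:
        return 'zoom'
    if np.hypot(tx, ty) > pan_thr:
        return 'pan_tilt'
    return 'static'
```

A single pass over the vector field like this costs far fewer operations than iteratively fitting a parametric motion model, which is the trade-off the paper's comparison is about.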
Pattern Recognition Letters | 2012
Álvaro García-Martín; José M. Martínez; Jesús Bescós
This paper describes a corpus, dataset, and associated ground truth for the evaluation of people detection algorithms in surveillance video scenarios, along with the design procedure followed to generate it. Sequences from scenes with different levels of complexity have been manually annotated. Each person present in a scene has been labeled frame by frame, in order to automatically obtain a people detection ground truth for each sequence. Sequences have been classified into different complexity categories depending on critical factors that typically affect the behavior of detection algorithms. The resulting corpus, which exceeds other public pedestrian datasets in the number of video sequences and in complexity variability, is freely available for benchmarking and research purposes under a license agreement.
machine vision applications | 2014
Javier Molina; José Antonio Pajuelo; Marcos Escudero-Viñolo; Jesús Bescós; José M. Martínez
The use of hand gestures offers an alternative to the commonly used human-computer interfaces (i.e. keyboard, mouse, gamepad, voice, etc.), providing a more intuitive way of navigating among menus and in multimedia applications. This paper presents a dataset for the evaluation of hand gesture recognition approaches in human-computer interaction scenarios. It includes natural data and synthetic data from several state-of-the-art dictionaries. The dataset considers single-pose and multiple-pose gestures, as well as gestures defined by pose and motion or just by motion. Data types include static pose videos and gesture execution videos, performed by a set of eleven users and recorded with a time-of-flight camera, and synthetically generated gesture images. A novel collection of critical factors involved in the creation of a hand gestures dataset is proposed: capture technology, temporal coherence, nature of gestures, representativeness, pose issues, and scalability. Special attention is given to the scalability factor: a simple method is proposed for the synthetic generation of depth images of gestures, making it possible to extend the dataset with new dictionaries and gestures without the need to recruit new users, as well as providing more flexibility in point-of-view selection. The method is validated for the presented dataset. Finally, a separability study of the pose-based gestures of a dictionary is performed. The resulting corpus, which exceeds existing state-of-the-art datasets in terms of representativeness and scalability, provides a significant evaluation scenario for different kinds of hand gesture recognition solutions.
international conference on distributed smart cameras | 2014
Antonio González González; Rafael Martin-Nieto; Jesús Bescós; José M. Martínez
In this paper, we present a single-object long-term tracker that handles large appearance changes in the tracked target and occlusions, and that is also capable of recovering a target lost during the tracking process. The initial motivation was real-time automatic speaker tracking by a static camera, in order to control a PTZ camera capturing a lecture. The algorithm consists of a novel combination of state-of-the-art techniques. Subjective evaluation, over existing and newly recorded sequences, shows that the tracker is able to overcome the problems and difficulties of long-term tracking in a real lecture. Additionally, in order to further assess the performance of the proposed approach, a comparative evaluation over the VOT2013 dataset is presented.
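The paper's tracker is a combination of state-of-the-art techniques not named in the abstract; as a generic sketch of the track-then-re-detect pattern behind its lost-target recovery, a toy tracker can search near the last position and fall back to a full-frame search when the match confidence drops (the SSD matcher and `lost_thr` are illustrative assumptions, not the authors' components):

```python
import numpy as np

def match_template(frame, template):
    """Exhaustive sum-of-squared-differences search; returns the best
    top-left position and the negated SSD as a confidence score."""
    th, tw = template.shape
    fh, fw = frame.shape
    best, best_pos = -np.inf, (0, 0)
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            patch = frame[y:y + th, x:x + tw]
            score = -np.square(patch - template).sum()
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best

def track(frame, template, last_pos, search=5, lost_thr=-1.0):
    """Search a window around the previous position; if the best score
    falls below lost_thr, declare the target lost and re-detect over the
    whole frame -- the recovery behaviour described in the paper."""
    th, tw = template.shape
    y0 = max(0, last_pos[0] - search)
    y1 = min(frame.shape[0], last_pos[0] + th + search)
    x0 = max(0, last_pos[1] - search)
    x1 = min(frame.shape[1], last_pos[1] + tw + search)
    (dy, dx), score = match_template(frame[y0:y1, x0:x1], template)
    if score < lost_thr:
        return match_template(frame, template)[0]  # global re-detection
    return (y0 + dy, x0 + dx)
```

Long-term trackers replace both pieces with far stronger models (adaptive appearance, learned detectors), but the local-search-with-global-fallback structure is the same.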