Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Elisa Martínez is active.

Publication


Featured research published by Elisa Martínez.


Pattern Recognition | 2001

Qualitative vision for the guidance of legged robots in unstructured environments

Elisa Martínez; Carme Torras

Visual procedures especially tailored to the constraints and requirements of a legged robot are presented. They work for an uncalibrated camera, with pan and zoom, freely moving towards a stationary target in an unstructured environment that may contain independently moving objects. The goal is to dynamically analyse the image sequence in order to extract information about the robot motion, the target position and the environment structure. The development is based on the deformations of an active contour fitted to the target. Experimental results confirm that the proposed approach constitutes a promising alternative to the prevailing trend based on the costly computation of displacement or velocity fields.
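The contour-deformation idea above can be made concrete with a small sketch. Assuming the tracked contour's control points are available in two consecutive frames, the affine deformation between them can be recovered by least squares, and its trace gives a qualitative approach/recede cue. This is only an illustration of the idea, with hypothetical names, not the paper's implementation.

    import numpy as np

    def fit_affine(p0, p1):
        # Least-squares affine map taking control points p0 (N x 2)
        # onto p1 (N x 2): p1 ~ p0 @ A.T + t.
        X = np.hstack([np.asarray(p0, float), np.ones((len(p0), 1))])
        M, *_ = np.linalg.lstsq(X, np.asarray(p1, float), rcond=None)
        return M[:2].T, M[2]              # A (2 x 2), t (2,)

    # A contour that expands uniformly between frames suggests the robot
    # is approaching the target; contraction suggests it is receding.
    p0 = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.], [0.5, 1.5]])
    A, t = fit_affine(p0, 0.9 * p0)       # contour shrank by 10%
    scale_cue = 0.5 * np.trace(A) - 1.0   # ~ -0.1: receding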


Non-Linear Speech Processing | 2009

Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification

Ignasi Iriondo; Santiago Planet; Joan-Claudi Socoró; Elisa Martínez; Francesc Alías; Carlos Monzo

This paper presents an automatic system able to enhance expressiveness in speech corpora recorded from acted or stimulated speech. The system is trained with the results of a subjective evaluation carried out on a reduced set of the original corpus. Once the system has been trained, it is able to check the complete corpus and perform an automatic pruning of the unclear utterances, i.e., those whose expressive style differs from the one intended for the corpus. The content which most closely matches the subjective classification remains in the resulting corpus. An expressive speech corpus in Spanish, designed and recorded for speech synthesis purposes, has been used to test the presented proposal. The automatic refinement has been applied to the whole corpus and the result has been validated with a second subjective test.
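The refinement loop the abstract describes is easy to sketch. Below is a minimal, hypothetical version assuming prosodic feature vectors have already been extracted per utterance: a classifier is trained on the subjectively evaluated subset, and the rest of the corpus is pruned wherever the prediction disagrees with the intended style. The classifier choice is illustrative, not the paper's.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def prune_corpus(feats_labeled, subj_labels, feats_all, intended_styles):
        # Train on the subset that received a subjective evaluation...
        clf = RandomForestClassifier(n_estimators=200, random_state=0)
        clf.fit(feats_labeled, subj_labels)
        # ...then keep only utterances whose predicted expressive style
        # matches the style the corpus intended for them.
        keep = clf.predict(feats_all) == np.asarray(intended_styles)
        return keep                       # boolean mask over the full corpus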


IEEE Intelligent Vehicles Symposium | 2008

Driving assistance system based on the detection of head-on collisions

Elisa Martínez; Marta Díaz; Javier Melenchón; J. A. Montero; Ignasi Iriondo; Joan Claudi Socoró

An artificial vision system for vehicles is proposed in this article to alert drivers of potential head-on collisions. It is capable of detecting any type of frontal collision with any type of obstacle that may present itself in the vehicle's path. The system runs a sequence of algorithms on images recorded by a camera located in the moving vehicle, computing the Time-to-Contact from an analysis of the optical flow, which allows the vehicle's movement to be studied from a sequence of images.
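The Time-to-Contact computation mentioned above has a classical geometric reading: for pure translation towards a fronto-parallel surface, the optical flow radiates from the focus of expansion and its divergence equals 2/TTC. The sketch below illustrates that relation on a dense flow field; it is a minimal example under that assumption, not the authors' full detection pipeline.

    import numpy as np

    def time_to_contact(flow_u, flow_v, dx=1.0):
        # For frontal approach to a plane, div(flow) = 2 / TTC.
        du_dx = np.gradient(flow_u, dx, axis=1)
        dv_dy = np.gradient(flow_v, dx, axis=0)
        div = np.mean(du_dx + dv_dy)
        return 2.0 / div if div > 0 else np.inf   # inf: no approach

    # Synthetic looming flow with 40 frames to contact: u = x/tau, v = y/tau.
    tau = 40.0
    y, x = np.mgrid[-32:32, -32:32].astype(float)
    print(time_to_contact(x / tau, y / tau))      # ~ 40.0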


Journal of Robotic Systems | 2004

Fusing Visual and Inertial Sensing to Recover Robot Ego-motion

Guillem Alenyà; Elisa Martínez; Carme Torras

A method for estimating mobile robot egomotion is presented, which relies on tracking contours in real-time images acquired with a calibrated monocular video system. After fitting an active contour to an object in the image, 3D motion is derived from the affine deformations suffered by the contour in an image sequence. More than one object can be tracked at the same time, yielding several different pose estimations. Improvements in pose determination are then achieved by fusing all these estimations. Inertial information is used to obtain better estimates, as it introduces into the tracking algorithm a measure of the real velocity. Inertial information is also used to eliminate some ambiguities arising from the use of a monocular image sequence. As the algorithms developed are intended to be used in real-time control systems, computation costs are taken into account.
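The abstract says the pose estimations obtained from the individually tracked contours are fused. One standard way to combine several independent Gaussian estimates (not necessarily the authors' exact scheme) is inverse-covariance weighting, sketched below.

    import numpy as np

    def fuse_estimates(poses, covariances):
        # Linear-Gaussian fusion: the fused covariance is the inverse of
        # the summed information matrices, and each estimate contributes
        # in proportion to its information (inverse covariance).
        infos = [np.linalg.inv(P) for P in covariances]
        P_fused = np.linalg.inv(sum(infos))
        x_fused = P_fused @ sum(I @ x for I, x in zip(infos, poses))
        return x_fused, P_fused

    # Two pose estimates from two tracked contours, the second noisier.
    x1, P1 = np.array([1.0, 0.0, 0.1]), np.diag([0.01, 0.01, 0.02])
    x2, P2 = np.array([1.2, 0.1, 0.0]), np.diag([0.10, 0.10, 0.20])
    x, P = fuse_estimates([x1, x2], [P1, P2])     # x lies nearer x1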


IEEE Transactions on Audio, Speech, and Language Processing | 2009

Emphatic Visual Speech Synthesis

Javier Melenchón; Elisa Martínez; F. De la Torre; J. A. Montero

The synthesis of talking heads has been a flourishing research area over the last few years. Since human beings have an uncanny ability to read people's faces, most related applications (e.g., advertising, video-teleconferencing) require absolutely realistic photometric and behavioral synthesis of faces. This paper proposes a person-specific facial synthesis framework that allows high realism and includes a novel way to control visual emphasis (e.g., the level of exaggeration of visible articulatory movements of the vocal tract). There are three main contributions: a geodesic interpolation with visual unit selection, a parameterization of visual emphasis, and the design of minimum size corpora. Perceptual tests with human subjects reveal high realism properties, achieving perceptual scores similar to those of real samples. Furthermore, the visual emphasis level and two communication styles show a statistical interaction relationship.
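The geodesic interpolation contribution is described only at a high level in the abstract. As a purely illustrative stand-in, spherical linear interpolation is one textbook way to interpolate along a geodesic between two unit-norm appearance vectors rather than along the straight chord; the paper's actual construction may differ.

    import numpy as np

    def slerp(a, b, t):
        # Great-circle (geodesic) interpolation between unit vectors,
        # so intermediate appearance vectors keep unit norm.
        a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)
        omega = np.arccos(np.clip(a @ b, -1.0, 1.0))
        if omega < 1e-8:              # nearly parallel: plain lerp is fine
            return (1 - t) * a + t * b
        return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)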


Iberian Conference on Pattern Recognition and Image Analysis | 2007

Efficiently Downdating, Composing and Splitting Singular Value Decompositions Preserving the Mean Information

Javier Melenchón; Elisa Martínez

Three methods for the efficient downdating, composition and splitting of low-rank singular value decompositions are proposed. They are formulated in closed form, considering the mean information and providing exact results. Although these methods are presented in the context of computer vision, they can be used in any field where information must be forgotten, different eigenspaces combined into one, or particular dimensions of the column space of the data ignored. Application examples on face subspace learning and latent semantic analysis are given, and performance results are provided.
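Of the three operations, composition is the easiest to write down compactly. Two mean-centred eigenspace models can be merged without revisiting the raw data because the combined scatter matrix decomposes as S = S1 + S2 + (n1 n2 / n) d d^T, with d the difference of the two means. The dense-scatter sketch below is a reference implementation of that identity, not the paper's efficient low-rank formulation.

    import numpy as np

    def merge_eigenspaces(mean1, U1, s1, n1, mean2, U2, s2, n2):
        # Each model: mean, left singular vectors U_i (d x k) and singular
        # values s_i of the centred data block with n_i columns.
        n = n1 + n2
        mean = (n1 * mean1 + n2 * mean2) / n
        d = (mean1 - mean2)[:, None]
        # Combined scatter about the new mean (dense; the paper avoids this).
        S = (U1 * s1**2) @ U1.T + (U2 * s2**2) @ U2.T + (n1 * n2 / n) * (d @ d.T)
        evals, U = np.linalg.eigh(S)
        order = np.argsort(evals)[::-1]           # descending variance
        return mean, U[:, order], np.sqrt(np.maximum(evals[order], 0.0))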


International Conference on Image Processing | 2003

Text to visual synthesis with appearance models

J. Melenchón; F. De la Torre; Ignasi Iriondo; Francesc Alías; Elisa Martínez; L. Vicent

This paper presents a new method named text to visual synthesis with appearance models (TEVISAM) for generating videorealistic talking heads. In a first step, the system learns a person-specific facial appearance model (PSFAM) automatically. The PSFAM allows all facial components (e.g., eyes, mouth) to be modeled independently, and it is used to animate the face dynamically from the input text. As reported by other researchers, one of the key aspects in visual synthesis is the coarticulation effect. To solve this problem, we introduce a new interpolation method in the high-dimensional space of appearance, allowing the creation of photorealistic and videorealistic avatars. In this work, preliminary experiments synthesizing virtual avatars from text are reported. Summarizing, this paper introduces three novelties: first, we make use of color PSFAMs to animate virtual avatars; second, we introduce a nonlinear high-dimensional interpolation to achieve videorealistic animations; finally, the method allows new expressions to be generated by modeling the different facial elements.
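The claim that a PSFAM models facial components independently can be pictured with a toy composition step (a hypothetical helper, not the paper's synthesis code): each component's appearance is generated by its own model and written into its region of the face.

    import numpy as np

    def compose_face(base_face, components):
        # components: list of (boolean region mask, appearance image)
        # pairs, one per independently modelled facial element.
        face = base_face.copy()
        for mask, appearance in components:
            face[mask] = appearance[mask]
        return face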


International Conference on Image Processing | 2003

Subspace eyetracking for driver warning

F. De la Torre; C.J.G. Rubio; Elisa Martínez

Driver fatigue/distraction is one of the most common causes of traffic accidents. The aim of this paper is to develop a real-time system to detect anomalous situations while driving. In a learning stage, the user sits in front of the camera and the system learns a person-specific facial appearance model (PSFAM) in an automatic manner. The PSFAM is used to perform gaze detection and eye-activity recognition in real time based on subspace constraints. Preliminary experiments measuring the PERCLOS index (the average time that the eyes are closed) under a variety of conditions are reported.
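PERCLOS itself reduces to a sliding-window average of a per-frame eye-closed flag, as in the sketch below; the input flag would come from the subspace-based eye-activity recognizer, and the alert threshold is application-dependent.

    import numpy as np

    def perclos(eye_closed, fps, window_s=60.0):
        # Proportion of frames with eyes closed over a sliding window.
        win = int(window_s * fps)
        kernel = np.ones(win) / win
        return np.convolve(np.asarray(eye_closed, float), kernel, mode="valid")

    # Example: 2 minutes at 25 fps, eyes closed 10% of the time at random.
    rng = np.random.default_rng(0)
    flags = rng.random(2 * 60 * 25) < 0.1
    print(perclos(flags, fps=25).max())           # worst 60 s window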


Computer Vision and Image Understanding | 2008

Recovering epipolar direction from two affine views of a planar object

Maria Alberich-Carramiñana; Guillem Alenyà; Juan Andrade-Cetto; Elisa Martínez; Carme Torras

The mainstream approach to estimate epipolar geometry from two views requires matching the projections of at least four non-coplanar points in the scene, assuming a full projective camera model. Our work deviates from this in three respects: affine camera, planar scene and active contour tracking. A B-spline is fitted to a planar contour, which is tracked using a Kalman filter. The corresponding control points are used to compute the affine transformation between images. We prove that the affine epipolar direction can be computed as one of the eigenvectors of this affine transformation, provided camera motion is free of cyclorotation. A Stäubli robot is used to obtain calibrated image streams, which are used as ground truth to evaluate the performance of the method, and to test its limiting conditions in practice. The fact that our method and the gold standard algorithm produce comparable results shows the potential of our proposal.
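The eigenvector result stated in the abstract can be exercised directly: fit the 2x2 linear part of the affine map between matched control points in the two views and inspect its eigenvectors. The sketch below does only that; selecting which eigenvector is the true epipolar direction, and verifying the cyclorotation-free assumption, is the subject of the paper.

    import numpy as np

    def affine_epipolar_candidates(p0, p1):
        # Least-squares affine map p1 ~ p0 @ A.T + t from matched points.
        X = np.hstack([np.asarray(p0, float), np.ones((len(p0), 1))])
        M, *_ = np.linalg.lstsq(X, np.asarray(p1, float), rcond=None)
        A = M[:2].T
        # Per the paper, the affine epipolar direction is one of the
        # eigenvectors of A when the motion is free of cyclorotation
        # (complex eigenvectors would signal that assumption failing).
        _, evecs = np.linalg.eig(A)
        return evecs.T                            # rows: candidate directions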


International Work-Conference on Artificial and Natural Neural Networks | 2007

Validation of an expressive speech corpus by mapping automatic classification to subjective evaluation

Ignasi Iriondo; Santiago Planet; Francesc Alías; Joan Claudi Socoró; Elisa Martínez

This paper presents the validation of the expressive content of an acted corpus produced to be used in speech synthesis. Acted speech can be rather lacking in authenticity, and therefore validation of its expressiveness is required. The goal is to obtain an automatic classifier able to prune the bad utterances, i.e., those with wrong expressiveness. Firstly, a subjective test has been conducted with almost ten percent of the corpus utterances. Secondly, objective techniques have been carried out by means of automatic identification of emotions, using different algorithms applied to statistical features computed over the speech prosody. The relationship between both evaluations is established by an attribute selection process guided by a metric that measures the matching between the utterances misclassified by the users and those misclassified by the automatic process. The experiments show that this approach can be useful to provide a subset of utterances with poor or wrong expressive content.
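The abstract does not spell out the matching metric, so the sketch below is one plausible instantiation (an assumption, not the paper's definition): score an attribute subset by the F1-style overlap between the utterances the classifier misclassifies and those the listeners rejected.

    import numpy as np

    def misclassification_match(pred, intended, human_rejected):
        # Overlap between the classifier's "bad" set and the listeners'.
        auto_bad = np.asarray(pred) != np.asarray(intended)
        human_bad = np.asarray(human_rejected, bool)
        tp = np.sum(auto_bad & human_bad)
        if tp == 0:
            return 0.0
        precision = tp / np.sum(auto_bad)
        recall = tp / np.sum(human_bad)
        return 2 * precision * recall / (precision + recall)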

Collaboration


Dive into Elisa Martínez's collaborations.

Top Co-Authors

Carme Torras
Spanish National Research Council

Javier Melenchón
Open University of Catalonia

F. De la Torre
Carnegie Mellon University

Guillem Alenyà
Spanish National Research Council