Javier Ruiz Hidalgo | Researchain


Publication

Featured research published by Javier Ruiz Hidalgo.

ubiquitous computing | 2009

Integration of audiovisual sensors and technologies in a smart room

Joachim Neumann; Josep R. Casas; Dusan Macho; Javier Ruiz Hidalgo

At the Technical University of Catalonia (UPC), a smart room has been equipped with 85 microphones and 8 cameras. This paper describes the setup of the sensors, gives an overview of the underlying hardware and software infrastructure, and indicates possibilities for high- and low-level multi-modal interaction. An example of how the information collected from the distributed sensor network is used is explained in detail: the system supports a group of students who have to solve a problem related to a lab assignment.

acm sigmm conference on multimedia systems | 2013

Towards a format-agnostic approach for production, delivery and rendering of immersive media

O.A. Niamut; Axel Kochale; Javier Ruiz Hidalgo; Rene Kaiser; Jens Spille; Jean-François Macq; Gert Kienast; Oliver Schreer; Ben Shirley

The media industry is currently being pulled in the often-opposing directions of increased realism (high resolution, stereoscopic, large screen) and personalization (selection and control of content, availability on many devices). We investigate the feasibility of an end-to-end format-agnostic approach to support both these trends. In this paper, different aspects of a format-agnostic capture, production, delivery and rendering system are discussed. At the capture stage, the concept of layered scene representation is introduced, including panoramic video and 3D audio capture. At the analysis stage, a virtual director component is discussed that allows for automatic execution of cinematographic principles, using feature tracking and saliency detection. At the delivery stage, resolution-independent audiovisual transport mechanisms for both managed and unmanaged networks are treated. In the rendering stage, a rendering process that includes the manipulation of audiovisual content to match the connected display and loudspeaker properties is introduced. Different parts of the complete system are revisited demonstrating the requirements and the potential of this advanced concept.

IEEE Transactions on Circuits and Systems for Video Technology | 2006

On the use of indexing metadata to improve the efficiency of video compression

Javier Ruiz Hidalgo; Philippe Salembier

For the last few years, video indexing and video compression have been considered as two separate functionalities. However, multimedia content is growing at such a rate that multimedia services will need to consider both the compression and the indexing aspects of the content in order to manage this audio-video content efficiently. Therefore, it is interesting to study the synergy between the representations used for compression and indexing, and in particular to find new schemes that exploit indexing/compression information in order to increase the efficiency of video compression/indexing capabilities. The principal contribution of this paper is to study and develop new techniques in which the compression efficiency of video codecs is improved by the use of indexing metadata, where indexing metadata refers to information that has been generated to support indexing capabilities.

international conference on acoustics, speech, and signal processing | 2001

Robust segmentation and representation of foreground key regions in video sequences

Javier Ruiz Hidalgo; Philippe Salembier

This paper deals with the extraction and characterization of foreground objects in video sequences. The algorithm first computes the mosaic image representing the background information and then extracts foreground objects. In this last step, the foreground objects are progressively extracted, taking into account the reliability of the contour information. This extraction step is based on morphological tools. Finally, the foreground objects are characterized by their shape, texture and motion trajectory. Moreover, some information about the temporal evolution of non-rigid objects is also extracted. This feature extraction algorithm is particularly suitable for indexing, search and retrieval applications.
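
The background-then-foreground pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a static camera, so the background mosaic is replaced by a per-pixel temporal median, and the morphological, contour-based extraction is replaced by a plain difference threshold. All function names and the `threshold` parameter are hypothetical.

```python
from statistics import median


def background_median(frames):
    """Estimate the background as the per-pixel temporal median.

    Assumption: static camera and equally sized grayscale frames
    (each frame is a list of rows), standing in for a true mosaic.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def foreground_mask(frame, background, threshold=30):
    """Mark pixels that differ from the background by more than threshold."""
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]


def centroid(mask):
    """Return the (x, y) centroid of the foreground pixels, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not points:
        return None
    count = len(points)
    return (sum(x for x, _ in points) / count, sum(y for _, y in points) / count)


def trajectory(frames, threshold=30):
    """Characterize the foreground by its centroid position in every frame."""
    background = background_median(frames)
    return [centroid(foreground_mask(f, background, threshold)) for f in frames]
```

Calling `trajectory(frames)` on a short sequence returns one centroid per frame, which is a crude stand-in for the motion trajectory descriptor mentioned in the abstract; shape and texture descriptors are not covered by this sketch.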

british machine vision conference | 2013

Bayesian region selection for adaptive dictionary-based Super-Resolution

Eduardo Perez-Pellitero; Jordi Salvador; Javier Ruiz Hidalgo; Bodo Rosenhahn

The performance of dictionary-based super-resolution (SR) strongly depends on the contents of the training dataset. Nevertheless, many dictionary-based SR methods randomly select patches from a larger set of training images to build their dictionaries [8, 14, 19, 20], thus relying on the patches being diverse enough. This paper describes a dictionary building method for SR based on adaptively selecting an optimal subset of patches out of the training images. Each training image is divided into sub-image entities, named regions, of such a size that texture consistency is preserved and high-frequency (HF) energy is present. For each input patch to super-resolve, the best-fitting region is found through a Bayesian selection. In order to handle the high number of regions in the training dataset, a local Naive Bayes Nearest Neighbor (NBNN) approach is used. Trained with this adapted subset of patches, sparse coding SR is applied to recover the high-resolution image. Experimental results demonstrate that using our adaptive algorithm produces an improvement in SR performance with respect to non-adaptive training.
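
The region selection step can be sketched with the basic NBNN decision rule: each region is scored by the summed squared distance from every query descriptor to its nearest descriptor in that region, and the lowest score wins. This is only an illustration under stated assumptions. It uses exhaustive search rather than the local NBNN approximation named in the abstract, and the descriptor format, the `regions` dictionary and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_select_region(query_descriptors, regions):
    """Pick the region whose descriptors best explain the query.

    regions maps a region id to a non-empty list of descriptor vectors
    (an assumed layout, not the paper's data structure).
    """
    best_region, best_cost = None, float("inf")
    for region_id, descriptors in regions.items():
        # NBNN cost: sum over query descriptors of the distance to the
        # nearest neighbor inside this region.
        cost = sum(
            min(squared_distance(q, d) for d in descriptors)
            for q in query_descriptors
        )
        if cost < best_cost:
            best_region, best_cost = region_id, cost
    return best_region
```

The exhaustive loop visits every region for every query, which is exactly the cost that grows with the number of regions; the abstract's local NBNN approach is what addresses that, and it is not reproduced here.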

international conference on acoustics, speech, and signal processing | 2007

Long Term Selection of Reference Frame Sub-Blocks using MPEG-7 Indexing Metadata

Javier Ruiz Hidalgo; Philippe Salembier

Traditionally, video indexing and compression have been considered as two separate functionalities. However, the high amount of available multimedia content creates the need for multimedia services to consider both the compression and the indexing aspects of the content in order to efficiently manage it. Therefore, it is interesting to find new techniques that efficiently exploit the indexing/compression information in order to improve the compression/indexing capabilities of the content. This paper focusses on the development of one technique where the compression efficiency of the H.264 encoder is increased by the use of standard indexing information, called indexing metadata. This indexing metadata, even if extracted or generated to support indexing capabilities, can be exploited to enhance current standard video codecs such as H.264.

artificial intelligence applications and innovations | 2006

Multimodal Integration of Sensor Network

Joachim Neumann; Josep R. Casas; Dusan Macho; Javier Ruiz Hidalgo

At the Universitat Politecnica de Catalunya (UPC), a Smart Room has been equipped with 85 microphones and 8 cameras. This paper describes the setup of the sensors, gives an overview of the underlying hardware and software infrastructure, and indicates possibilities for high- and low-level multi-modal interaction. An example of how the information collected from the distributed sensor network is used is explained in detail: the system supports a group of students who have to solve a problem related to a lab assignment.

Lecture Notes in Computer Science | 2001

Morphological Tools for Robust Key-Region Extraction and Video Shot Modeling

Javier Ruiz Hidalgo; Philippe Salembier

In recent years, the use of multimedia content has grown exponentially. In this context, new image/video sequence representations are becoming a necessity for many applications. This paper deals with the structuring of video shots in terms of various foreground key-regions and a background mosaic. The key-regions represent the different foreground objects that appear throughout the sequence, in the same way that the mosaic image represents the background information of the complete sequence. We focus on the value of morphological tools such as connected operators or watersheds for performing the shot analysis and computing the key-regions and the mosaic. It will be shown that morphological tools are particularly attractive for improving the robustness of the various steps of the algorithm.
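
As a small illustration of what a connected operator does, the sketch below implements a binary area opening: it keeps or removes whole connected components and never alters the contours of the ones it keeps. It is a generic textbook example, not the specific operators used in the paper; the 4-connectivity choice and the `min_area` parameter are assumptions.

```python
from collections import deque


def area_opening(mask, min_area):
    """Remove 4-connected foreground components smaller than min_area pixels.

    mask is a list of rows of truthy/falsy values; the result is a list
    of rows of booleans with the same dimensions.
    """
    height, width = len(mask), len(mask[0])
    output = [[False] * width for _ in range(height)]
    seen = [[False] * width for _ in range(height)]
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill to collect one connected component.
            component, queue = [], deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                component.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Connected operator: the component is kept whole or dropped whole.
            if len(component) >= min_area:
                for x, y in component:
                    output[y][x] = True
    return output
```

Applied to a noisy foreground mask, a filter of this kind discards isolated specks while leaving the boundaries of larger regions untouched, which is one reason connected operators are associated with robustness.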

human factors in computing systems | 2013