Carles Fernández | Researchain

Publication

Featured researches published by Carles Fernández.

Pattern Recognition Letters | 2015

A deep analysis on age estimation

Ivan Huerta; Carles Fernández; Carlos Segura; Javier Hernando; Andrea Prati

Two novel methods for age estimation, using simple alignment unlike previous works. Fusing local texture/appearance descriptors improves over complex features like BIF. We propose a deep learning scheme to improve current state-of-the-art. Exhaustive validation over large databases, outperforming previous results in the field.

The automatic estimation of age from face images is increasingly gaining attention, as it facilitates applications including advanced video surveillance, demographic statistics collection, customer profiling, or search optimization in large databases. Nevertheless, it becomes challenging to estimate age from uncontrollable environments, with insufficient and incomplete training data, dealing with strong person-specificity and high within-range variance. These difficulties have been recently addressed with complex and strongly hand-crafted descriptors, difficult to replicate and compare. This paper presents two novel approaches: first, a simple yet effective fusion of descriptors based on texture and local appearance; and second, a deep learning scheme for accurate age estimation. These methods have been evaluated under a diversity of settings, and the extensive experiments carried out on two large databases (MORPH and FRGC) demonstrate state-of-the-art results over previous work.

International Workshop on Face and Facial Expression Recognition from Real World Videos | 2014

A Comparative Evaluation of Regression Learning Algorithms for Facial Age Estimation

Carles Fernández; Ivan Huerta; Andrea Prati

The problem of automatic age estimation from facial images poses a great number of challenges: uncontrollable environment, insufficient and incomplete training data, strong person-specificity, and high within-range variance, among others. These difficulties have made researchers of the field propose complex and strongly hand-crafted descriptors, which make it difficult to replicate and compare the validity of posterior classification and regression schemes. We present a practical evaluation of four machine learning regression techniques from some of the most representative families in age estimation: kernel techniques, ensemble learning, neural networks, and projection algorithms. Additionally, we propose the use of simple HOG descriptors for robust age estimation, which achieve comparable performance to the state-of-the-art, without requiring piecewise facial alignment through tens of landmarks, nor fine-tuned and specific modeling of facial aging, nor additional demographic annotations such as gender or ethnicity. By using HOG descriptors, we discuss the benefits and drawbacks among the four learning algorithms. The accuracy and generalization of each regression technique is evaluated through cross-validation and cross-database validation over two large databases, MORPH and FRGC.
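The cross-validation protocol mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the mean absolute error metric, the function names, and the stand-in regressor (which simply predicts the mean training age) are all assumptions introduced here, and any of the four regression families could be plugged in through the `fit` argument.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_mean_regressor(features, ages):
    """Hypothetical stand-in: ignores features, predicts the mean training age."""
    mean_age = sum(ages) / len(ages)
    return lambda feats: [mean_age for _ in feats]


def cross_validate(features, ages, k, fit):
    """Return the MAE averaged over k held-out folds."""
    scores = []
    for test_idx in k_fold_indices(len(ages), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(ages)) if i not in held_out]
        model = fit([features[i] for i in train_idx],
                    [ages[i] for i in train_idx])
        preds = model([features[i] for i in test_idx])
        scores.append(mean_absolute_error([ages[i] for i in test_idx], preds))
    return sum(scores) / len(scores)
```

Cross-database validation follows the same pattern with a single split: fit on all samples of one database and score on all samples of the other.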

International Conference on Computer Vision | 2011

Real-time GPU-based face detection in HD video sequences

David Oro; Carles Fernández; Javier R. Saeta; Xavier Martorell; Javier Hernando

Modern GPUs have evolved into fully programmable parallel stream multiprocessors. Due to the nature of the graphic workloads, computer vision algorithms are in good position to leverage the computing power of these devices. An interesting problem that greatly benefits from parallelism is face detection. This paper presents a highly optimized Haar-based face detector that works in real time over high definition videos. The proposed kernel operations exploit both coarse and fine grain parallelism for performing integral image computations and filter evaluations, thus being beneficial not only for face detection but also for other computer vision techniques. Compared to previous implementations, the experiments show that our proposal achieves a sustained throughput of 35 fps under 1080p resolutions using a sliding window with step of one pixel.
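The two operations named in this abstract, integral image computation and Haar filter evaluation, can be illustrated with a small sequential sketch. It is not the paper's GPU kernel: the function names and the two-rectangle feature layout are assumptions chosen for illustration, and the code runs on the CPU over plain nested lists.

```python
def integral_image(img):
    """Return a summed-area table padded with one zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle at (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like response: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Once the table is built, every rectangle sum costs four lookups regardless of its size, so each sliding-window position can be evaluated independently of the others. That independence is what makes the filter evaluation a natural fit for parallel hardware.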

Signal Processing: Image Communication | 2008

Interpretation of complex situations in a semantic-based surveillance framework

Carles Fernández; Pau Baiget; Xavier Roca; Jordi Gonzàlez

The integration of cognitive capabilities in computer vision systems requires both to enable high semantic expressiveness and to deal with high computational costs as large amounts of data are involved in the analysis. This contribution describes a cognitive vision system conceived to automatically provide high-level interpretations of complex real-time situations in outdoor and indoor scenarios, and to eventually maintain communication with casual end users in multiple languages. The main contributions are: (i) the design of an integrative multilevel architecture for cognitive surveillance purposes; (ii) the proposal of a coherent taxonomy of knowledge to guide the process of interpretation, which leads to the conception of a situation-based ontology; (iii) the use of situational analysis for content detection and a progressive interpretation of semantically rich scenes, by managing incomplete or uncertain knowledge, and (iv) the use of such an ontological background to enable multilingual capabilities and advanced end-user interfaces. Experimental results are provided to show the feasibility of the proposed approach.

European Conference on Computer Vision | 2014

Facial Age Estimation Through the Fusion of Texture and Local Appearance Descriptors

Ivan Huerta; Carles Fernández; Andrea Prati

Automatic extraction of soft biometric characteristics from face images is a very prolific field of research. Among these soft biometrics, age estimation can be very useful for several applications, such as advanced video surveillance [5, 12], demographic statistics collection, business intelligence and customer profiling, and search optimization in large databases. However, estimating age from uncontrollable environments, with insufficient and incomplete training data, dealing with strong person-specificity, and high within-range variance, can be very challenging. These difficulties have been addressed in the past with complex and strongly hand-crafted descriptors, which make it difficult to replicate and compare the validity of posterior classification schemes. This paper presents a simple yet effective approach which fuses and exploits texture- and local appearance-based descriptors to achieve faster and more accurate results. A series of local descriptors and their combinations have been evaluated under a diversity of settings, and the extensive experiments carried out on two large databases (MORPH and FRGC) demonstrate state-of-the-art results over previous work.

Pattern Recognition Letters | 2011

Augmenting video surveillance footage with virtual agents for incremental event evaluation

Carles Fernández; P. Baiget; F. X. Roca; Jordi Gonzàlez

The fields of segmentation, tracking and behavior analysis demand challenging video resources to test, in a scalable manner, complex scenarios like crowded environments or scenes with high semantics. Nevertheless, existing public databases cannot scale the presence of appearing agents, which would be useful to study long-term occlusions and crowds. Moreover, creating these resources is expensive and often too particularized to specific needs. We propose an augmented reality framework to increase the complexity of image sequences in terms of occlusions and crowds, in a scalable and controllable manner. Existing datasets can be extended with augmented sequences containing virtual agents. Such sequences are automatically annotated, thus facilitating evaluation in terms of segmentation, tracking, and behavior recognition. In order to easily specify the desired contents, we propose a natural language interface to convert input sentences into virtual agent behaviors. Experimental tests and validation in indoor, street, and soccer environments are provided to show the feasibility of the proposed approach in terms of robustness, scalability, and semantics.

International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation | 2015

The AXIOM project (Agile, eXtensible, fast I/O Module)

Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Carlos Álvarez; Eduard Ayguadé; Javier Bueno; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Carlos Segura; Carles Fernández; David Oro; Javier R. Saeta; Paolo Gai; Antonio Rizzo; Roberto Giorgi

The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such task (energy efficiency), scale up through modularity, allow for an easy programmability across performance scaling, and exploit at best existing standards at minimal costs.

Microprocessors and Microsystems | 2016

The AXIOM software layers

Carlos Álvarez; Eduard Ayguadé; Jaume Bosch; Javier Bueno; Artem Cherkashin; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Miquel Vidal; Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Davide Catani; David Oro; Carles Fernández; Carlos Segura; Javier Rodríguez; Javier Hernando; Claudio Scordino; Paolo Gai; Pierluigi Passera; Alberto Pomella; Nicola Bettin; Antonio Rizzo; Roberto Giorgi

People and objects will soon share the same digital network for information exchange, in what has been called the age of cyber-physical systems. The general expectation is that people and systems will interact in real time. This puts pressure on system design to support increasing demands on computational power while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important to ensure that these systems become widespread. The whole set of expectations imposes scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).

Digital Systems Design | 2015

The AXIOM Software Layers

Carlos Álvarez; Eduard Ayguadé; Javier Bueno; Antonio Filgueras; Daniel Jiménez-González; Xavier Martorell; Nacho Navarro; Dimitris Theodoropoulos; Dionisios N. Pnevmatikatos; Davide Catani; Claudio Scordino; Paolo Gai; Carlos Segura; Carles Fernández; David Oro; Javier R. Saeta; Pierluigi Passera; Alberto Pomella; Antonio Rizzo; Roberto Giorgi

Congress of the Italian Association for Artificial Intelligence | 2007

Semantic Annotation of Complex Human Scenes for Multimedia Surveillance

Carles Fernández; Pau Baiget; F. Xavier Roca; Jordi Gonzàlez

A Multimedia Surveillance System (MSS) is considered for automatically retrieving semantic content from complex outdoor scenes, involving both human behavior and traffic domains. To characterize the dynamic information attached to detected objects, we consider a deterministic modeling of spatio-temporal features based on abstraction processes towards fuzzy logic formalism. A situational analysis over conceptualized information will not only allow us to describe human actions within a scene, but also to suggest possible interpretations of the behaviors perceived, such as situations involving thefts or dangers of running over. Towards this end, the different levels of semantic knowledge implied throughout the process are also classified into a proposed taxonomy.
