Antonio-Javier Gallego

Publication

Featured research published by Antonio-Javier Gallego.

Expert Systems With Applications | 2017

Staff-line removal with selectional auto-encoders

Antonio-Javier Gallego; Jorge Calvo-Zaragoza

Staff-line removal is an important preprocessing stage in most Optical Music Recognition systems. The common procedures employed to carry out this task involve image processing techniques. In contrast to these traditional methods, which are based on hand-engineered transformations, the problem can also be approached from a machine learning point of view if representative examples of the task are provided. We propose doing this with a new approach involving auto-encoders that select the appropriate features of an input feature set (Selectional Auto-Encoders). Within the context of the problem at hand, the model is trained to select those pixels of a given image that belong to a musical symbol, thus removing the lines of the staves. Our results show that the proposed technique is quite competitive and significantly outperforms the other state-of-the-art strategies considered, particularly when dealing with grayscale input images.

BVAI'07: Proceedings of the 2nd International Conference on Advances in Brain, Vision and Artificial Intelligence | 2007

3D reconstruction and mapping from stereo pairs with geometrical rectification

Antonio-Javier Gallego; Rafael Molina; Patricia Compañ; Carlos Villagrá

In this paper a new method for reconstructing 3D scenes from stereo images is presented, as well as an algorithm for environment mapping, as an application of the previous method. In the reconstruction process a geometrical rectification filter is used to remove the conical perspective of the images. It is essential to recover the geometry of the scene (with real data of depth and volume) and to achieve a realistic appearance in 3D reconstructions. It also uses sub-pixel precision to solve the lack of information for distant objects. Finally, the method is applied to a mapping algorithm in order to show its usefulness.
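The link between sub-pixel precision and distant objects can be illustrated with the standard rectified-stereo relation Z = f · B / d (depth equals focal length times baseline divided by disparity). The sketch below is not the method of the paper; it assumes an ideal pinhole model, and the focal length, baseline and disparity values are hypothetical numbers chosen only for illustration.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair (ideal pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig (assumed values, not taken from the paper).
FOCAL_PX = 700.0     # focal length in pixels
BASELINE_M = 0.125   # distance between the two cameras in metres

near = depth_from_disparity(FOCAL_PX, BASELINE_M, 40.0)

# For distant objects the disparity is only a few pixels, so rounding it to
# an integer discards a large part of the depth information.
far_integer = depth_from_disparity(FOCAL_PX, BASELINE_M, 2.0)
far_subpixel = depth_from_disparity(FOCAL_PX, BASELINE_M, 2.5)

print(near, far_integer, far_subpixel)
```

With these assumed values, half a pixel of disparity changes the estimated depth of the far point by several metres, while the same half pixel would barely move the near point. This is one way to see why sub-pixel disparity estimates matter most at long range.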

Neurocomputing | 2018

MirBot: A collaborative object recognition system for smartphones using convolutional neural networks

Antonio Pertusa; Antonio-Javier Gallego; Marisa Bernabeu

MirBot is a collaborative application for smartphones that allows users to perform object recognition. This app can be used to take a photograph of an object, select the region of interest and obtain the most likely class (dog, chair, etc.) by means of similarity search using features extracted from a convolutional neural network (CNN). The answers provided by the system can be validated by the user so as to improve the results for future queries. All the images are stored together with a series of metadata, thus enabling a multimodal incremental dataset labeled with synset identifiers from the WordNet ontology. This dataset grows continuously thanks to the users’ feedback, and is publicly available for research. This work details the MirBot object recognition system, analyzes the statistics gathered after more than four years of usage, describes the image classification methodology, and performs an exhaustive evaluation using handcrafted features, neural codes, different transfer learning techniques, PCA compression and metadata, which can be used to improve the image classifier results. The app is freely available at the Apple and Google Play stores.

Remote Sensing | 2018

Automatic Ship Classification from Optical Aerial Images with Convolutional Neural Networks

Antonio-Javier Gallego; Antonio Pertusa; Pablo Gil

The automatic classification of ships from aerial images is a considerable challenge. Previous works have usually applied image processing and computer vision techniques to extract meaningful features from visible spectrum images in order to use them as the input for traditional supervised classifiers. We present a method for determining whether a visible-spectrum aerial image contains a ship. The proposed architecture is based on Convolutional Neural Networks (CNN), and it combines neural codes extracted from a CNN with a k-Nearest Neighbor method so as to improve performance. The kNN results are compared to those obtained with the CNN Softmax output. Several CNN models have been configured and evaluated in order to seek the best hyperparameters, and the most suitable setting for this task was found by using transfer learning at different levels. A new dataset (named MASATI) composed of aerial imagery with more than 6000 samples has also been created to train and evaluate our architecture. The experimentation shows a success rate of over 99% for our approach, in contrast with the 79% obtained with traditional methods in classification of ship images, also outperforming other methods based on CNNs. A dataset of images (NWPU VHR-10) used in previous works was additionally used to evaluate the proposed approach. Our best setup achieves a success ratio of 86% with these data, significantly outperforming previous state-of-the-art ship classification methods.
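The idea of classifying neural codes with a k-Nearest Neighbor rule can be sketched as follows. This is a minimal illustration, not the architecture from the paper: the 2-D vectors stand in for neural codes (which would normally be much higher-dimensional CNN activations), and the Euclidean distance, the value of k and the label names are all assumptions.

```python
import math
from collections import Counter


def knn_predict(query, codes, labels, k=3):
    """Return the majority label among the k stored codes closest to the query."""
    ranked = sorted(range(len(codes)), key=lambda i: math.dist(query, codes[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up stand-ins for feature vectors extracted from a CNN layer.
codes = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.85, 0.15]]
labels = ["ship", "ship", "no_ship", "no_ship", "ship"]

pred_a = knn_predict([0.7, 0.3], codes, labels)
pred_b = knn_predict([0.15, 0.9], codes, labels)
print(pred_a, pred_b)
```

In a real pipeline the `codes` list would be filled by running training images through a pretrained network and storing the activations of a chosen layer; that extraction step is omitted here.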

Iberian Conference on Pattern Recognition and Image Analysis | 2013

MirBot: A Multimodal Interactive Image Retrieval System

Antonio Pertusa; Antonio-Javier Gallego; Marisa Bernabeu

This study presents a multimodal interactive image retrieval system for smartphones (MirBot). The application is designed as a collaborative game where users can categorize photographs according to the WordNet hierarchy. After taking a picture, the region of interest of the target can be selected, and the image information is sent with a set of metadata to a server in order to classify the object. The user can validate the category proposed by the system to improve future queries. The result is a labeled database with a structure similar to ImageNet, but with contents selected by the users, fully marked with regions of interest, and with novel metadata that can be useful to constrain the search space in a future work. The MirBot app is freely available on the Apple app store.

Pattern Recognition | 2019

A selectional auto-encoder approach for document image binarization

Jorge Calvo-Zaragoza; Antonio-Javier Gallego

Binarization plays a key role in automatic information retrieval from document images. This process is usually performed in the first stages of document analysis systems, and serves as a basis for subsequent steps. Hence it has to be robust in order to allow the full analysis workflow to be successful. Several methods for document image binarization have been proposed so far, most of which are based on hand-crafted image processing strategies. Recently, Convolutional Neural Networks have shown remarkable performance in many different tasks related to computer vision. In this paper we discuss the use of convolutional auto-encoders devoted to learning an end-to-end map from an input image to its selectional output, in which activations indicate the likelihood of pixels to be either foreground or background. Once trained, documents can therefore be binarized by passing them through the model and applying a global threshold. This approach has proven to outperform existing binarization strategies in a number of document types.
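The final step described in the abstract, applying a global threshold to the per-pixel activations, can be sketched in a few lines. The network itself is not reproduced here; the activation values below are invented, and the threshold of 0.5 is an assumption rather than the value used by the authors.

```python
def binarize(activations, threshold=0.5):
    """Map per-pixel foreground likelihoods to 1 (foreground) or 0 (background)."""
    return [[1 if p >= threshold else 0 for p in row] for row in activations]


# Invented output of a trained model for a tiny 2x3 image patch.
probs = [
    [0.02, 0.91, 0.40],
    [0.75, 0.10, 0.55],
]

mask = binarize(probs)
print(mask)
```

Because the threshold is global, a single value is applied to every pixel; any adaptation to local contrast has to come from the model's activations rather than from the thresholding step.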

International Symposium on Distributed Computing | 2018

Hand Gesture Detection with Convolutional Neural Networks

Samer Alashhab; Antonio-Javier Gallego; Miguel Angel Lozano

In this paper, we present a method for locating and recognizing hand gestures from images, based on Deep Learning. Our goal is to provide an intuitive and accessible way to interact with Computer Vision-based mobile applications aimed at assisting visually impaired people (e.g. pointing a finger at an object in a real scene to zoom in for a close-up of the pointed object). First, we defined different hand gestures that can be assigned to different actions. We then created a database containing images corresponding to these gestures. Lastly, this database was used to train Neural Networks with different topologies (testing different input sizes, weight initialization, and data augmentation process). In our experiments, we obtained high accuracies both in localization (96%–100%) and in recognition (99.45%) with Networks that are suitable for porting to mobile devices.

Sensors | 2018

Segmentation of Oil Spills on Side-Looking Airborne Radar Imagery with Autoencoders

Antonio-Javier Gallego; Pablo Gil; Antonio Pertusa; Robert B. Fisher

In this work, we use deep neural autoencoders to segment oil spills from Side-Looking Airborne Radar (SLAR) imagery. Synthetic Aperture Radar (SAR) has been much exploited for ocean surface monitoring, especially for oil pollution detection, but few approaches in the literature use SLAR. Our sensor consists of two SAR antennas mounted on an aircraft, enabling a quicker response than satellite sensors for emergency services when an oil spill occurs. Experiments on TERMA radar were carried out to detect oil spills on Spanish coasts using deep selectional autoencoders and RED-nets (very deep Residual Encoder-Decoder Networks). Different configurations of these networks were evaluated and the best topology significantly outperformed previous approaches, correctly detecting 100% of the spills and obtaining an F1 score of 93.01% at the pixel level. The proposed autoencoders perform accurately in SLAR imagery that has artifacts and noise caused by the aircraft maneuvers, in different weather conditions and with the presence of look-alikes due to natural phenomena such as shoals of fish and seaweed.
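For readers unfamiliar with the pixel-level F1 score quoted above, the following sketch shows how such a score can be computed from a predicted mask and a ground-truth mask. The masks are toy examples, not SLAR data, and the evaluation protocol of the paper may differ in details not given in the abstract.

```python
def pixel_f1(pred, truth):
    """F1 score over all pixels of two binary masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # spill pixel correctly detected
            elif p and not t:
                fp += 1          # false alarm (e.g. a look-alike)
            elif t and not p:
                fn += 1          # missed spill pixel
    if tp == 0:
        return 0.0
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)


# Toy masks: 1 marks a pixel labeled as oil spill.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

score = pixel_f1(pred, truth)
print(score)
```

Here two pixels are true positives, one is a false positive and one is a false negative, so precision and recall are both 2/3 and the F1 score is also 2/3.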

Journal of Computer and System Sciences | 2017

Grammatical inference of directed acyclic graph languages with polynomial time complexity

Antonio-Javier Gallego; Damián López; Jorge Calera-Rubio

In this paper we study the learning of graph languages. We extend the well-known classes of k-testability and k-testability in the strict sense languages to directed graph languages. We propose a grammatical inference algorithm to learn the class of directed acyclic k-testable in the strict sense graph languages. The algorithm runs in polynomial time and identifies this class of languages from positive data. We study its efficiency under several criteria, and carry out comprehensive experiments with four datasets to show the validity of the method. Many fields, from pattern recognition to data compression, can take advantage of these results.
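To give some intuition for k-testability in the strict sense, the sketch below shows the classical string version of the idea, which the paper extends to directed acyclic graphs. It is not the graph algorithm from the paper. It assumes one common formulation in which the model stores the allowed prefixes and suffixes of length k-1, the allowed segments of length k, and the accepted strings shorter than k; the function names are hypothetical.

```python
def learn_ktss(sample, k):
    """Infer a k-testable-in-the-strict-sense model from positive strings."""
    prefixes, suffixes, segments, short = set(), set(), set(), set()
    for w in sample:
        if len(w) < k:
            short.add(w)                      # too short to contain a k-segment
            continue
        prefixes.add(w[:k - 1])
        suffixes.add(w[len(w) - (k - 1):])
        for i in range(len(w) - k + 1):
            segments.add(w[i:i + k])          # every window of length k
    return prefixes, suffixes, segments, short


def accepts(model, w, k):
    """Check membership: w must use only prefixes, suffixes and segments seen in training."""
    prefixes, suffixes, segments, short = model
    if len(w) < k:
        return w in short
    return (
        w[:k - 1] in prefixes
        and w[len(w) - (k - 1):] in suffixes
        and all(w[i:i + k] in segments for i in range(len(w) - k + 1))
    )


model = learn_ktss(["ab", "abab"], k=2)
print(accepts(model, "ababab", 2), accepts(model, "aab", 2), accepts(model, "ba", 2))
```

The learner makes a single pass over the positive sample and needs no negative examples, which mirrors the "polynomial time, positive data" property stated in the abstract. The string "ababab" is accepted although it was never seen, because all of its length-2 segments, its prefix and its suffix appear in the sample.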

Computer Analysis of Images and Patterns | 2007

Rectified reconstruction from stereo pairs and robot mapping

Antonio-Javier Gallego; Rafael Molina; Patricia Compañ; Carlos Villagrá

The reconstruction and mapping of real scenes is a crucial element in several fields such as robot navigation. Stereo vision can be a powerful solution. However, the perspective effect, along with other problems, arises when the reconstruction is tackled using depth maps obtained from stereo images. A new approach is proposed to avoid the perspective effect, based on a geometrical rectification using the vanishing point of the image. It also uses sub-pixel precision to compensate for the lack of information for distant objects. Finally, the method is applied to map a whole scene, introducing a cubic filter.