Raúl Ramos-Pollán | Researchain

Publication

Featured researches published by Raúl Ramos-Pollán.

International Conference of the IEEE Engineering in Medicine and Biology Society | 2015

Convolutional neural networks for mammography mass lesion classification.

John Arevalo; Fabio A. González; Raúl Ramos-Pollán; José Luís Oliveira; Miguel Angel Guevara López

Feature extraction is a fundamental step when mammography image analysis is addressed using learning-based approaches. Traditionally, problem-dependent handcrafted features are used to represent the content of images. An alternative approach successfully applied in other domains is the use of neural networks to automatically discover good features. This work presents an evaluation of convolutional neural networks to learn features for mammography mass lesions before feeding them to a classification stage. Experimental results showed that this approach is a suitable strategy, improving on the state-of-the-art representation from 79.9% to 86% in terms of area under the ROC curve.
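
The area under the ROC curve quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that computation, not the evaluation code used in the paper; the function name `roc_auc` and the toy labels and scores are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs (1 = malignant, 0 = benign), made up for illustration.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based formulation would be preferable for large test sets.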

International Conference on e-Science | 2012

BIGS: A framework for large-scale image processing and analysis over distributed and heterogeneous computing resources

Raúl Ramos-Pollán; Fabio A. González; Juan C. Caicedo; Angel Cruz-Roa; Jorge E. Camargo; Jorge A. Vanegas; Santiago A. Perez; Jose David Bermeo; Juan Sebastian Otalora; Paola K. Rozo; John Arevalo

This paper presents BIGS, the Big Image Data Analysis Toolkit, a software framework for large-scale image processing and analysis over heterogeneous computing resources, such as those available in clouds, grids, computer clusters or scattered computer resources (desktops, labs) used in an opportunistic manner. Through BIGS, eScience for image processing and analysis is conceived to exploit coarse-grained parallelism based on data partitioning and parameter sweeps, avoiding the need for inter-process communication and, therefore, enabling loosely coupled computing nodes (BIGS workers). It adopts an uncommitted resource allocation model where (1) experimenters define their image processing pipelines in a simple configuration file, (2) a schedule of jobs is generated and (3) workers, as they become available, take over pending jobs as long as their dependencies on other jobs are fulfilled. BIGS workers act autonomously, querying the job schedule to determine which job to take over. This removes the need for a central scheduling node, requiring only that all workers have access to a shared information source. Furthermore, BIGS workers are encapsulated within different technologies to enable their agile deployment over the available computing resources. Currently they can be launched through the Amazon EC2 service over its cloud resources, through Java Web Start from any desktop computer and through regular scripting or SSH commands. This suits different kinds of research environments well, both when accessing dedicated computing clusters or clouds with committed computing capacity and when using opportunistic computing resources whose access is infrequent or cannot be provisioned in advance. We also adopt a NoSQL storage model to ensure the scalability of the shared information sources required by all workers, including within BIGS support for HBase and Amazon's DynamoDB service.
Overall, BIGS now enables researchers to run large scale image processing pipelines in an easy, affordable and unplanned manner with the capability to take over computing resources as they become available at run time. This is shown in this paper by using BIGS in different experimental setups in the Amazon cloud and in an opportunistic manner, demonstrating its configurability, adaptability and scalability capabilities.
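
The uncommitted allocation model described in this abstract (workers polling a shared schedule and taking over any pending job whose dependencies are fulfilled) can be sketched as follows. This is an illustrative single-process simulation, not BIGS code: the `Job` class, the function names, the job names and the in-memory dictionary standing in for the shared NoSQL store are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    depends_on: list = field(default_factory=list)
    status: str = "pending"  # pending -> running -> done
    owner: str = ""


def claim_next_job(schedule, worker_id):
    """Return a pending job whose dependencies are all done, marking it claimed."""
    for job in schedule.values():
        ready = all(schedule[dep].status == "done" for dep in job.depends_on)
        if job.status == "pending" and ready:
            job.status = "running"
            job.owner = worker_id
            return job
    return None


def run_workers(schedule, worker_ids):
    """Round-robin simulation of autonomous workers polling the shared schedule."""
    completed = []
    progress = True
    while progress:
        progress = False
        for worker_id in worker_ids:
            job = claim_next_job(schedule, worker_id)
            if job is None:
                continue  # nothing runnable for this worker right now
            # A real worker would execute the image processing step here.
            job.status = "done"
            completed.append((worker_id, job.name))
            progress = True
    return completed


# Hypothetical four-stage pipeline; the dict stands in for the shared store.
schedule = {
    job.name: job
    for job in [
        Job("extract_features"),
        Job("build_dictionary", depends_on=["extract_features"]),
        Job("train_classifier", depends_on=["build_dictionary"]),
        Job("evaluate", depends_on=["train_classifier"]),
    ]
}
completed = run_workers(schedule, ["worker-a", "worker-b"])
```

With workers running on separate machines, the claim step would have to be atomic in the shared store (for example through a conditional write), otherwise two workers could take over the same job; the simulation sidesteps this by running in one process.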

International Conference on Computational Collective Intelligence | 2015

Supervised Greedy Layer-Wise Training for Deep Convolutional Networks with Small Datasets

Diego Rueda-Plata; Raúl Ramos-Pollán; Fabio A. González

Deep convolutional neural networks (DCNs) are increasingly being used with considerable success in image classification tasks trained over large datasets. However, such large datasets are not always available or affordable in many application areas where we would like to apply DCNs, which often have only datasets on the order of a few thousand labelled images, acquired and annotated through lengthy and costly processes (such as in plant recognition, medical imaging, etc.). In such cases DCNs do not generally show competitive performance and one must resort to fine-tuning networks that were pretrained at high cost on large generic datasets, with no a priori guarantee that they will work well in specialized domains. In this work we propose to train DCNs with a greedy layer-wise method, analogous to that used in unsupervised deep networks. We show how, for small datasets, this method outperforms both DCNs that do not use pretrained models and results reported in the literature with other methods. Additionally, our method learns more interpretable and cleaner visual features. Our results are also competitive with convolutional methods based on pretrained models when applied to general purpose datasets, and we obtain them with much smaller datasets (10K vs. 1.2 million images) at a fraction of the computational cost. We therefore consider this work a first milestone in our quest to successfully use DCNs for small specialized datasets.

Archive | 2014

High Throughput Location Proteomics in Confocal Images from the Human Protein Atlas Using a Bag-of-Features Representation

Raúl Ramos-Pollán; John Arevalo; Angel Cruz-Roa; Fabio A. González

This work addresses the problem of predicting protein subcellular locations within cells from confocal images, which is a key issue in revealing information about cell function. The Human Protein Atlas (HPA) is a world-scale project aimed at proteomics research. The HPA stores immunohistological and immunofluorescence images from most human tissues. This paper concentrates on the problem of analyzing HPA immunofluorescence images from immunohistochemically stained tissues and cells to automatically identify the subcellular location of the expression of particular proteins. This problem has been previously tackled using computer vision methods which train classification models able to discriminate subcellular locations based on particular visual features extracted from images. One of the challenges of applying this approach is the high computational cost of training the computer vision models, which includes the cost of visual feature extraction from multichannel images, classifier training and evaluation, and parameter tuning. This work addresses this challenging problem using a high-throughput computer-vision approach by (1) learning a visual dictionary of the image collection for representing visual content through a bag-of-features histogram image representation, (2) using a supervised learning process to predict subcellular locations of proteins and (3) developing a software framework to seamlessly develop machine learning algorithms for computer vision and harness computing power for those processes.
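
Step (1) above, representing an image as a bag-of-features histogram, can be sketched as follows: each local descriptor is assigned to its nearest visual word and the assignments are counted. This is a minimal illustration that assumes the dictionary is already available (in practice it is typically learned by clustering descriptors); the function names, the two-dimensional toy descriptors and the three-word dictionary are assumptions, not the paper's actual features.

```python
import math
from collections import Counter


def nearest_word(descriptor, dictionary):
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    return min(
        range(len(dictionary)),
        key=lambda i: math.dist(descriptor, dictionary[i]),
    )


def bag_of_features(descriptors, dictionary):
    """Normalized histogram of visual-word occurrences for one image."""
    if not descriptors:
        raise ValueError("an image needs at least one descriptor")
    counts = Counter(nearest_word(d, dictionary) for d in descriptors)
    total = len(descriptors)
    return [counts.get(i, 0) / total for i in range(len(dictionary))]


# Hypothetical visual dictionary (3 words) and local descriptors of one image.
dictionary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]
histogram = bag_of_features(descriptors, dictionary)
```

The resulting fixed-length histogram is what a supervised classifier, as in step (2), would take as input, regardless of how many descriptors each image produced.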

International Conference on Localization and GNSS | 2016

Data driven Vertical Total Electron Content workflow for GNSS positioning for single frequency receivers

Susana Sánchez-Naranjo; Fabio A. González; Raúl Ramos-Pollán; Marc Solé

Global Navigation Satellite Systems (GNSS) are strongly affected by ionospheric effects, which are larger at low latitudes. Traditional mitigation approaches rely either on high-cost equipment or on simplified models that have shown limitations for the complex ionospheric dynamics of the equatorial region. Data availability from increasingly larger GNSS receiver networks, together with scalable machine learning techniques, provides the foundation to approach this problem from a data-driven perspective. This paper shows the use of data-driven models to predict the Vertical Total Electron Content (VTEC) of the ionosphere based on pseudorange measurements and, thus, improve GNSS positioning using single frequency receivers, especially at low latitudes. Our results reduce the error in VTEC prediction by more than 60% compared to the Klobuchar model used by GPS, achieving an average positioning residual error below 1 meter, both in medium and low latitudes and in periods of high and low solar activity.
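
To see why a VTEC estimate matters for a single frequency receiver, recall the standard first-order approximation of the ionospheric group delay: it is proportional to the electron content along the signal path and inversely proportional to the square of the carrier frequency. The sketch below applies that textbook relation to a vertical path only; it is not the paper's model, it ignores the slant mapping needed for satellites away from the zenith, and the function name and example VTEC value are assumptions.

```python
TECU = 1.0e16           # electrons per square metre in one TEC unit
GPS_L1_HZ = 1575.42e6   # GPS L1 carrier frequency in Hz


def iono_delay_m(vtec_tecu, freq_hz=GPS_L1_HZ):
    """First-order ionospheric range delay in metres for a vertical path.

    Uses delay = 40.3 * TEC / f**2, with TEC in electrons per square metre.
    """
    return 40.3 * vtec_tecu * TECU / freq_hz ** 2


delay_per_tecu = iono_delay_m(1.0)   # range error per TEC unit at L1
delay_20 = iono_delay_m(20.0)        # hypothetical VTEC of 20 TECU
```

At L1 this works out to roughly 0.16 m of range error per TEC unit, so VTEC prediction errors of several TEC units translate directly into pseudorange errors of around a metre or more.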

Iberoamerican Congress on Pattern Recognition | 2015

Classification of Low-Level Atmospheric Structures Based on a Pyramid Representation and a Machine Learning Method

Sebastián Sierra; Juan F. Molina; Angel Cruz-Roa; José Daniel Pabón; Raúl Ramos-Pollán; Fabio A. González; Hugo Franco

The atmosphere is a highly complex fluid system where multiple intrinsic and extrinsic phenomena are superposed in the same spatial and temporal domains at different scales, making its characterization a challenging task. Despite the novel methods for pattern recognition and detection available in the literature, most climate data analysis and weather forecasting rely on the ability of specialized personnel to visually detect atmospheric patterns present in climate data plots. This paper presents a method for classifying low-level wind flow configurations, namely: confluences, difluences, vortices and saddle points. The method combines specialized image features, which capture the particular structure of low-level wind flow configurations through a pyramid layout representation, with a state-of-the-art machine learning classification method. The method was validated on a set of volumes extracted from climate simulations and manually annotated by experts. The best result on the independent test dataset was an average accuracy of 0.81 across the four atmospheric structures.

IEEE International Conference on High Performance Computing Data and Analytics | 2014

Distributed Cache Strategies for Machine Learning Classification Tasks over Cluster Computing Resources

John Edilson Arévalo Ovalle; Raúl Ramos-Pollán; Fabio A. González

Scaling machine learning (ML) methods to learn from large datasets requires devising distributed data architectures and algorithms that support their iterative nature, where the same data records are processed several times. Data caching becomes key to minimizing data transmission across iterations at each node and, thus, contributes to overall scalability. In this work we propose a two-level caching architecture (disk and memory) and benchmark different caching strategies in a distributed machine learning setup over a cluster with no shared RAM. Our results strongly favour strategies where (1) datasets are partitioned and preloaded throughout the distributed memory of the cluster nodes and (2) algorithms use data locality information to synchronize computations at each iteration. This supports the convergence towards models where “computing goes to data”, as observed in other Big Data contexts, and allows us to align strategies for parallelizing ML algorithms and to configure computing infrastructures appropriately.
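
A two-level (memory and disk) cache of the kind this abstract describes can be sketched on a single node as follows. This is only an illustration of the idea, not the architecture benchmarked in the paper: the class name, the LRU eviction policy, the pickle-based disk format and the `load_partition` stand-in for a remote fetch are all assumptions.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TwoLevelCache:
    """Memory cache with LRU eviction, backed by a disk cache of pickled partitions."""

    def __init__(self, loader, memory_slots, cache_dir):
        self.loader = loader              # fetches a partition from the slow source
        self.memory_slots = memory_slots  # max partitions held in memory
        self.cache_dir = cache_dir
        self.memory = OrderedDict()
        self.stats = {"memory": 0, "disk": 0, "source": 0}

    def _path(self, key):
        return os.path.join(self.cache_dir, f"{key}.pkl")

    def get(self, key):
        if key in self.memory:            # level 1: memory hit
            self.memory.move_to_end(key)
            self.stats["memory"] += 1
            return self.memory[key]
        path = self._path(key)
        if os.path.exists(path):          # level 2: disk hit
            with open(path, "rb") as fh:
                value = pickle.load(fh)
            self.stats["disk"] += 1
        else:                             # miss: fetch from source, persist to disk
            value = self.loader(key)
            with open(path, "wb") as fh:
                pickle.dump(value, fh)
            self.stats["source"] += 1
        self.memory[key] = value
        if len(self.memory) > self.memory_slots:
            self.memory.popitem(last=False)  # evict least recently used partition
        return value


def load_partition(key):
    """Hypothetical stand-in for fetching a data partition over the network."""
    return [key * 10 + i for i in range(3)]


with tempfile.TemporaryDirectory() as tmp:
    cache = TwoLevelCache(load_partition, memory_slots=2, cache_dir=tmp)
    for part in (0, 1, 0, 2, 1):
        cache.get(part)
    stats = dict(cache.stats)
```

In this toy access pattern, three requests go to the source, one is served from memory and one from disk. Iterative algorithms that rescan every partition in the same order defeat LRU eviction when memory is smaller than the dataset, which is one intuition for why preloading fixed partitions into each node's memory can pay off.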

Computer Methods and Programs in Biomedicine | 2016