John Arevalo

National University of Colombia

Publication

Featured research published by John Arevalo.

International Conference of the IEEE Engineering in Medicine and Biology Society | 2015

Convolutional neural networks for mammography mass lesion classification.

John Arevalo; Fabio A. González; Raúl Ramos-Pollán; José Luís Oliveira; Miguel Angel Guevara López

Feature extraction is a fundamental step when mammography image analysis is addressed using learning-based approaches. Traditionally, problem-dependent handcrafted features are used to represent the content of images. An alternative approach successfully applied in other domains is the use of neural networks to automatically discover good features. This work presents an evaluation of convolutional neural networks to learn features for mammography mass lesions before feeding them to a classification stage. Experimental results showed that this approach is a suitable strategy, improving on the state-of-the-art representation from 79.9% to 86% in terms of area under the ROC curve.
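The area under the ROC curve quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that computation, assuming binary labels (1 for the positive class, 0 for the negative class); the labels and scores are hypothetical toy values, not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six lesions (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of examples; it is used here only because it makes the definition explicit.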

Artificial Intelligence in Medicine | 2015

An unsupervised feature learning framework for basal cell carcinoma image analysis

John Arevalo; Angel Cruz-Roa; Viviana Arias; Eduardo Romero; Fabio A. González

OBJECTIVE: The paper addresses the problem of automatic detection of basal cell carcinoma (BCC) in histopathology images. In particular, it proposes a framework to both learn the image representation in an unsupervised way and visualize discriminative features supported by the learned model. MATERIALS AND METHODS: This paper presents an integrated unsupervised feature learning (UFL) framework for histopathology image analysis that comprises three main stages: (1) local (patch) representation learning using different strategies (sparse autoencoders, reconstruction independent component analysis and topographic independent component analysis (TICA)), (2) global (image) representation learning using a bag-of-features representation or a convolutional neural network, and (3) a visual interpretation layer to highlight the most discriminant regions detected by the model. The integrated unsupervised feature learning framework was exhaustively evaluated on a histopathology image dataset for BCC diagnosis. RESULTS: The experimental evaluation produced a classification performance of 98.1%, in terms of the area under the receiver-operating-characteristic curve, for the proposed framework, outperforming the state-of-the-art discrete cosine transform patch-based representation by 7%. CONCLUSIONS: The proposed UFL-representation-based approach outperforms state-of-the-art methods for BCC detection. Thanks to its visual interpretation layer, the method is able to highlight discriminative tissue regions, providing better diagnostic support. Among the different UFL strategies tested, TICA-learned features exhibited the best performance thanks to their ability to capture low-level invariances, which are inherent to the nature of the problem.

Iberoamerican Congress on Pattern Recognition | 2012

Bag of Features for Automatic Classification of Alzheimer’s Disease in Magnetic Resonance Images

Andrea Rueda; John Arevalo; Angel Cruz; Eduardo Romero; Fabio A. González

The goal of this paper is to evaluate the suitability of a bag-of-features representation for automatic classification of Alzheimer’s disease in brain magnetic resonance (MR) images. The evaluated method uses a bag-of-features (BOF) to represent the MR images, which are then fed to a support vector machine trained to distinguish between normal control and Alzheimer’s disease. The method was applied to a set of images from the OASIS data set. An exhaustive exploration of different BOF parameters was performed, covering feature extraction, dictionary construction and the classification model. The experimental results show that the evaluated method reaches competitive performance in terms of accuracy, sensitivity and specificity. In particular, the method based on a BOF representation outperforms the best published result on this data set, improving the equal error classification rate by about 10% (0.80 to 0.95 for Group 1 and 0.71 to 0.81 for Group 2).
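To make the bag-of-features idea concrete, the sketch below shows only the encoding step: each local descriptor is assigned to its nearest visual word, and the image is summarized as a normalized histogram of word counts. The dictionary and descriptors are hypothetical two-dimensional toy values; in practice the dictionary would typically be learned by clustering descriptors from training images, and the resulting histograms would be passed to a classifier such as the support vector machine mentioned above. The function names are illustrative and not taken from the paper.

```python
import math


def nearest_word(descriptor, dictionary):
    """Return the index of the visual word closest to the descriptor."""
    best_index, best_dist = 0, math.inf
    for i, word in enumerate(dictionary):
        d = math.dist(descriptor, word)  # Euclidean distance
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def bof_histogram(descriptors, dictionary):
    """Encode one image as a normalized histogram over visual words."""
    counts = [0] * len(dictionary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, dictionary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [c / total for c in counts]


# Hypothetical dictionary of three visual words and four patch descriptors.
dictionary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.7, 5.1)]
histogram = bof_histogram(descriptors, dictionary)
print(histogram)
```

The histogram has a fixed length equal to the dictionary size, regardless of how many patches the image contains, which is what lets images of different sizes share one classifier.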

Medical Image Computing and Computer Assisted Intervention | 2015

Combining Unsupervised Feature Learning and Riesz Wavelets for Histopathology Image Representation: Application to Identifying Anaplastic Medulloblastoma

Sebastian Otálora; Angel Cruz-Roa; John Arevalo; Manfredo Atzori; Anant Madabhushi; Alexander R. Judkins; Fabio A. González; Henning Müller; Adrien Depeursinge

Medulloblastoma (MB) is a type of brain cancer that represents roughly 25% of all brain tumors in children. In the anaplastic medulloblastoma subtype, it is important to identify the degree of irregularity and lack of organization of cells, as this correlates with disease aggressiveness and is of clinical value when evaluating patient prognosis. This paper presents an image representation to distinguish these subtypes in histopathology slides. The approach combines learned features from (i) an unsupervised feature learning method using topographic independent component analysis that captures scale, color and translation invariances, and (ii) learned linear combinations of Riesz wavelets calculated at several orders and scales, capturing the granularity of multiscale rotation-covariant information. The contribution of this work is to show that the combination of two complementary approaches for feature learning (unsupervised and supervised) improves the classification performance. Our approach outperforms the best methods in the literature with statistical significance, achieving 99% accuracy over region-based data comprising 7,500 square regions from 10 patient studies diagnosed with medulloblastoma (5 anaplastic and 5 non-anaplastic).

International Conference on e-Science | 2012

BIGS: A framework for large-scale image processing and analysis over distributed and heterogeneous computing resources

Raúl Ramos-Pollán; Fabio A. González; Juan C. Caicedo; Angel Cruz-Roa; Jorge E. Camargo; Jorge A. Vanegas; Santiago A. Perez; Jose David Bermeo; Juan Sebastian Otalora; Paola K. Rozo; John Arevalo

This paper presents BIGS, the Big Image Data Analysis Toolkit, a software framework for large-scale image processing and analysis over heterogeneous computing resources, such as those available in clouds, grids, computer clusters or throughout scattered computer resources (desktops, labs) in an opportunistic manner. Through BIGS, eScience for image processing and analysis is conceived to exploit coarse-grained parallelism based on data partitioning and parameter sweeps, avoiding the need for inter-process communication and, therefore, enabling loosely coupled computing nodes (BIGS workers). It adopts an uncommitted resource allocation model where (1) experimenters define their image processing pipelines in a simple configuration file, (2) a schedule of jobs is generated and (3) workers, as they become available, take over pending jobs as long as their dependencies on other jobs are fulfilled. BIGS workers act autonomously, querying the job schedule to determine which job to take over. This removes the need for a central scheduling node, requiring only access by all workers to a shared information source. Furthermore, BIGS workers are encapsulated within different technologies to enable their agile deployment over the available computing resources. Currently they can be launched through the Amazon EC2 service over its cloud resources, through Java Web Start from any desktop computer and through regular scripting or SSH commands. This suits different kinds of research environments well, whether accessing dedicated computing clusters or clouds with committed computing capacity or using opportunistic computing resources whose access is infrequent or cannot be provisioned in advance. We also adopt a NoSQL storage model to ensure the scalability of the shared information sources required by all workers, including within BIGS support for HBase and Amazon's DynamoDB service.
Overall, BIGS now enables researchers to run large-scale image processing pipelines in an easy, affordable and unplanned manner, with the capability to take over computing resources as they become available at run time. This is shown in this paper by using BIGS in different experimental setups in the Amazon cloud and in an opportunistic manner, demonstrating its configurability, adaptability and scalability.
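The allocation model described above, in which autonomous workers query a shared job schedule and take over pending jobs whose dependencies are fulfilled, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the schedule layout, job names and functions are hypothetical and are not the BIGS API, and an in-memory dictionary stands in for the shared NoSQL store. With truly concurrent workers, claiming a job would presumably need an atomic operation on the shared store, which this sequential sketch does not model.

```python
def next_runnable(schedule):
    """Return the id of a pending job whose dependencies are all done."""
    for job_id, job in schedule.items():
        if job["status"] != "pending":
            continue
        if all(schedule[dep]["status"] == "done" for dep in job["deps"]):
            return job_id
    return None


def worker_step(schedule, worker_name):
    """One autonomous worker step: query the schedule, take a job, run it."""
    job_id = next_runnable(schedule)
    if job_id is None:
        return None
    # Hypothetical: the job's actual image-processing work would run here.
    schedule[job_id]["status"] = "done"
    schedule[job_id]["worker"] = worker_name
    return job_id


# Hypothetical schedule generated from a pipeline definition.
schedule = {
    "extract_a": {"deps": [], "status": "pending"},
    "extract_b": {"deps": [], "status": "pending"},
    "build_dictionary": {"deps": ["extract_a", "extract_b"], "status": "pending"},
    "classify": {"deps": ["build_dictionary"], "status": "pending"},
}

# Two workers take turns; no central scheduler decides who runs what.
workers = ["worker-1", "worker-2"]
order = []
turn = 0
while True:
    job_id = worker_step(schedule, workers[turn % len(workers)])
    if job_id is None:
        break
    order.append(job_id)
    turn += 1
print(order)
```

Because each worker only reads and updates the shared schedule, workers can join or leave at any time without any other component being reconfigured.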

IX International Seminar on Medical Information Processing and Analysis | 2013

Hybrid image representation learning model with invariant features for basal cell carcinoma detection

John Arevalo; Angel Cruz-Roa; Fabio A. González

This paper presents a novel method for basal-cell carcinoma detection, which combines state-of-the-art methods for unsupervised feature learning (UFL) and bag of features (BOF) representation. BOF, which is a form of representation learning, has shown good performance in automatic histopathology image classification. In BOF, patches are usually represented using descriptors such as SIFT and DCT. We propose to use UFL to learn the patch representation itself. This is accomplished by applying a topographic UFL method (T-RICA), which automatically learns visual invariance properties of color, scale and rotation from an image collection. These learned features also reveal the visual properties associated with cancerous and healthy tissues and improve carcinoma detection results by 7% with respect to traditional autoencoders and by 6% with respect to standard DCT representations, obtaining on average 92% in terms of F-score and 93% in terms of balanced accuracy.
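For reference, the two metrics reported above can be computed from a binary confusion matrix as sketched below. The sketch assumes the F-score is the balanced F1 variant (harmonic mean of precision and recall), which the abstract does not state explicitly, and the counts are hypothetical rather than results from the paper.

```python
def f_score_and_balanced_accuracy(tp, fp, tn, fn):
    """Compute F1 and balanced accuracy from confusion-matrix counts.

    tp/fn: positive (carcinoma) images classified correctly/incorrectly.
    tn/fp: negative (healthy) images classified correctly/incorrectly.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    return f1, balanced_accuracy


# Hypothetical counts for a small test set.
f1, bal_acc = f_score_and_balanced_accuracy(tp=8, fp=2, tn=6, fn=2)
print(f"F1 = {f1:.3f}, balanced accuracy = {bal_acc:.3f}")
```

Balanced accuracy averages the per-class recall, so it is less affected than plain accuracy when one tissue class is much more frequent than the other.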

Tenth International Symposium on Medical Information Processing and Analysis | 2015

A comparative evaluation of supervised and unsupervised representation learning approaches for anaplastic medulloblastoma differentiation

Angel Cruz-Roa; John Arevalo; Ajay Basavanhally; Anant Madabhushi; Fabio A. González

Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building blocks that better represent the information in it. Digitized histopathology images represent a very good testbed for representation learning since they involve large amounts of highly complex visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on which type of learning architecture (deep or shallow) and which type of learning (unsupervised or supervised) are optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruction independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances, and that topographic constraints provide useful features invariant to scale and rotation for efficient tumor differentiation.

Content-Based Multimedia Indexing | 2014

Unsupervised feature learning for content-based histopathology image retrieval

Jorge A. Vanegas; John Arevalo; Fabio A. González

This paper proposes a strategy for content-based image retrieval, which combines unsupervised feature learning (UFL) with the classical bag-of-features (BOF) representation. In BOF, patches are usually represented using standard classical descriptors (e.g., SIFT, SURF, DCT, among others). We propose to use UFL to learn the patch representation itself. This is achieved by applying a topographic UFL method, which automatically learns visual invariance properties of color, scale and rotation from an image collection. The learned image representation is used as input for a multimodal latent semantic indexing system, which enriches the visual representation with semantics from image annotations. The overall strategy is evaluated in a particular histopathology image collection retrieval task, showing that the learned representation has a positive impact on retrieval performance for this particular task.

11th International Symposium on Medical Information Processing and Analysis (SIPAIM 2015) | 2015

A method for medulloblastoma tumor differentiation based on convolutional neural networks and transfer learning

Angel Cruz-Roa; John Arevalo; Alexander R. Judkins; Anant Madabhushi; Fabio A. González

Convolutional neural networks (CNN) have been very successful at addressing different computer vision tasks thanks to their ability to learn image representations directly from large amounts of labeled data. Features learned from a dataset can be used to represent images from a different dataset via an approach called transfer learning. In this paper we apply transfer learning to the challenging task of medulloblastoma tumor differentiation. We compare two different CNN models which were previously trained in two different domains (natural and histopathology images). The first CNN is a state-of-the-art approach in computer vision, a large and deep CNN with 16 layers, the Visual Geometry Group (VGG) CNN. The second (IBCa-CNN) is a 2-layer CNN trained for invasive breast cancer tumor classification. Both CNNs are used as visual feature extractors of histopathology image regions of anaplastic and non-anaplastic medulloblastoma tumors from digitized whole-slide images. The features from the two models are used, separately, to train a softmax classifier to discriminate between anaplastic and non-anaplastic medulloblastoma image regions. Experimental results show that the transfer learning approach produces competitive results in comparison with the state-of-the-art approaches for IBCa detection. Results also show that features extracted from the IBCa-CNN have better performance in comparison with features extracted from the VGG-CNN. The former obtains 89.8% while the latter obtains 76.6% in terms of average accuracy.
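The final stage described above, training a softmax classifier on features produced by a fixed, pre-trained network, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two-dimensional feature vectors are hypothetical stand-ins for CNN activations, and the learning rate, epoch count and function names are assumptions chosen for the toy data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """Linear score for each class: W[k] . x + b[k]."""
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(W))]


def train_softmax(features, labels, n_classes, lr=0.5, epochs=200):
    """Fit a softmax classifier with stochastic gradient descent
    on the cross-entropy loss. The feature extractor stays frozen;
    only W and b are learned."""
    dim = len(features[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    W[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical features already extracted by a pre-trained CNN
# (0 = non-anaplastic, 1 = anaplastic).
features = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
labels = [0, 0, 1, 1]
W, b = train_softmax(features, labels, n_classes=2)
predictions = [predict(W, b, x) for x in features]
print(predictions)
```

Keeping the feature extractor fixed means only a small linear model has to be fitted, which is what makes transfer learning practical when labeled target-domain data are scarce.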

Archive | 2014

High Throughput Location Proteomics in Confocal Images from the Human Protein Atlas Using a Bag-of-Features Representation

Raúl Ramos-Pollán; John Arevalo; Angel Cruz-Roa; Fabio A. González

This work addresses the problem of predicting protein subcellular locations within cells from confocal images, which is a key issue to reveal information about cell function. The Human Protein Atlas (HPA) is a world-scale project aimed at proteomics research. The HPA stores immunohistological and immunofluorescence images from most human tissues. This paper concentrates on the problem of analyzing HPA immunofluorescence images from immunohistochemically stained tissues and cells to automatically identify the subcellular location of the expression of particular proteins. This problem has been previously tackled using computer vision methods which train classification models able to discriminate subcellular locations based on particular visual features extracted from images. One of the challenges of applying this approach is the high computational cost of training the computer vision models, which includes the cost of visual feature extraction from multichannel images, classifier training and evaluation, and parameter tuning. This work addresses this challenging problem using a high-throughput computer-vision approach by (1) learning a visual dictionary of the image collection for representing visual content through a bag-of-features histogram image representation, (2) using a supervised learning process to predict subcellular locations of proteins and (3) developing a software framework to seamlessly develop machine learning algorithms for computer vision and harness computing power for those processes.
