Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Christopher Kanan is active.

Publication


Featured research published by Christopher Kanan.


Visual Cognition | 2009

SUN: Top-down saliency using natural statistics

Christopher Kanan; Mathew H. Tong; Lingyun Zhang; Garrison W. Cottrell

When people try to find particular objects in natural scenes, they make extensive use of knowledge about how and where objects tend to appear in a scene. Although many forms of such “top-down” knowledge have been incorporated into saliency map models of visual search, surprisingly, the role of object appearance has been infrequently investigated. Here we present an appearance-based saliency model derived in a Bayesian framework. We compare our approach with bottom-up saliency algorithms and with the state-of-the-art Contextual Guidance model of Torralba et al. (2006) at predicting human fixations. Although the two top-down approaches use very different types of information, they achieve similar performance, each substantially better than the purely bottom-up models. Our experiments reveal that a simple model of object appearance can predict human fixations quite well, even making the same mistakes as people.
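A compact way to state the Bayesian decomposition the abstract refers to (a sketch assuming the standard SUN formulation, with F the local visual features at point z, L its location, and C the event that the target is present there):

    \log s_z = -\log p(F = f_z) + \log p(F = f_z \mid C = 1) + \log p(C = 1 \mid L = l_z)

The first term is bottom-up self-information, the second is the top-down appearance likelihood, and the third is a location prior. The appearance-based model described here concentrates on the middle term, i.e., on how likely the local features are given the target's learned appearance.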


Computer Vision and Pattern Recognition | 2010

Robust classification of objects, faces, and flowers using natural image statistics

Christopher Kanan; Garrison W. Cottrell

Classification of images in many-category datasets has rapidly improved in recent years. However, systems that perform well on particular datasets typically have one or more limitations, such as a failure to generalize across visual tasks (e.g., requiring a face detector or extensive retuning of parameters), insufficient translation invariance, inability to cope with partial views and occlusion, or significant performance degradation as the number of classes is increased. Here we attempt to overcome these challenges using a model that combines sequential visual attention using fixations with sparse coding. The model's biologically inspired filters are acquired using unsupervised learning applied to natural image patches. Using only a single feature type, our approach achieves 78.5% accuracy on Caltech-101 and 75.2% on the 102 Flowers dataset when trained on 30 instances per class, and it achieves 92.7% accuracy on the AR Face database with 1 training instance per person. The same features and parameters are used across these datasets to illustrate the approach's robust performance.
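A minimal sketch of the kind of pipeline the abstract describes, assuming a sparse-coding dictionary learned from natural image patches and patches sampled at fixation points (random here; the paper uses a saliency-driven fixation sequence). The patch size, dictionary size, pooling, and classifier below are illustrative rather than the paper's exact settings:

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder
    from sklearn.svm import LinearSVC

    def sample_fixation_patches(image, n_fixations=30, patch=16, seed=None):
        # Crop square patches at fixation points (random here for illustration).
        rng = np.random.default_rng(seed)
        h, w = image.shape
        ys = rng.integers(0, h - patch, n_fixations)
        xs = rng.integers(0, w - patch, n_fixations)
        return np.stack([image[y:y + patch, x:x + patch].ravel() for y, x in zip(ys, xs)])

    # 1) Unsupervised learning of filters from (placeholder) natural image patches.
    natural_patches = np.random.rand(5000, 16 * 16)
    dico = MiniBatchDictionaryLearning(n_components=256, alpha=1.0).fit(natural_patches)

    # 2) Sparse-code each image's fixation patches and pool the codes.
    coder = SparseCoder(dictionary=dico.components_, transform_algorithm="omp",
                        transform_n_nonzero_coefs=10)

    def image_descriptor(image):
        codes = coder.transform(sample_fixation_patches(image))
        return np.abs(codes).mean(axis=0)  # average pooling over fixations

    # 3) Classify the pooled descriptors with an off-the-shelf linear classifier.
    X = np.stack([image_descriptor(np.random.rand(128, 128)) for _ in range(20)])
    y = np.arange(20) % 2
    clf = LinearSVC().fit(X, y)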


PLOS ONE | 2012

Color-to-Grayscale: Does the Method Matter in Image Recognition?

Christopher Kanan; Garrison W. Cottrell

In image recognition it is often assumed that the method used to convert color images to grayscale has little impact on recognition performance. We compare thirteen different grayscale algorithms with four types of image descriptors and demonstrate that this assumption is wrong: not all color-to-grayscale algorithms work equally well, even when using descriptors that are robust to changes in illumination. These methods are tested using a modern descriptor-based image recognition framework, on face, object, and texture datasets, with relatively few training instances. We identify a simple method that generally works best for face and object recognition, and two that work well for recognizing textures.
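Three of the common conversion choices of the kind being compared (the paper benchmarks thirteen methods; these are illustrative examples, not its full set):

    import numpy as np

    def grayscale(rgb, method="luminance"):
        # rgb: HxWx3 float image with values in [0, 1].
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        if method == "intensity":    # unweighted mean of the channels
            return (r + g + b) / 3.0
        if method == "luminance":    # ITU-R BT.601 luma weights
            return 0.299 * r + 0.587 * g + 0.114 * b
        if method == "value":        # HSV value channel (per-pixel channel maximum)
            return rgb.max(axis=-1)
        raise ValueError(f"unknown method: {method}")

    img = np.random.rand(64, 64, 3)
    for m in ("intensity", "luminance", "value"):
        print(m, grayscale(img, m).mean())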


Eye Tracking Research & Applications | 2014

Predicting an observer's task using multi-fixation pattern analysis

Christopher Kanan; Nicholas A. Ray; Dina N.F. Bseiso; Janet Hui-wen Hsiao; Garrison W. Cottrell

Since Yarbus's seminal work in 1965, vision scientists have argued that people's eye movement patterns differ depending upon their task. This suggests that we may be able to infer a person's task (or mental state) from their eye movements alone. Recently, this was attempted by Greene et al. [2012] in a Yarbus-like replication study; however, they were unable to successfully predict the task given to their observers. We reanalyze their data and show that by using more powerful algorithms it is possible to predict the observer's task. We also used our algorithms to infer the image being viewed by an observer and the observer's identity. More generally, we show how off-the-shelf algorithms from machine learning can be used to make inferences from an observer's eye movements, using an approach we call Multi-Fixation Pattern Analysis (MFPA).
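A hedged sketch of the "off-the-shelf machine learning" idea behind MFPA: summarize each trial's fixation sequence as a fixed-length feature vector and train a standard classifier to predict the task label. The features (duration statistics plus a spatial histogram) and the SVM are illustrative, not the paper's exact MFPA variants:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    def fixation_features(fixations, grid=4):
        # fixations: (N, 3) array of (x, y, duration) with x, y normalized to [0, 1].
        x, y, dur = fixations[:, 0], fixations[:, 1], fixations[:, 2]
        hist, _, _ = np.histogram2d(x, y, bins=grid, range=[[0, 1], [0, 1]])
        hist = hist.ravel() / max(len(fixations), 1)
        return np.concatenate([[len(fixations), dur.mean(), dur.std()], hist])

    # Toy data: 40 trials, each a random scanpath, with a binary "task" label.
    rng = np.random.default_rng(0)
    trials = [rng.random((rng.integers(5, 20), 3)) for _ in range(40)]
    labels = rng.integers(0, 2, size=40)

    X = np.stack([fixation_features(t) for t in trials])
    print("cross-validated accuracy:", cross_val_score(SVC(), X, labels, cv=5).mean())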


Computer Vision and Pattern Recognition | 2016

Answer-Type Prediction for Visual Question Answering

Kushal Kafle; Christopher Kanan

Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended, text-based questions about images, a task known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.
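A minimal sketch of the "predict the answer's form from the question" idea, using an ordinary bag-of-words text classifier. The answer-type categories, features, and classifier below are assumptions for illustration; the paper itself combines the type prediction with image evidence in a Bayesian framework:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Toy (question, answer-type) pairs; the categories are illustrative.
    questions = [
        "is the dog sleeping", "are there any people in the photo",
        "how many cars are parked", "how many birds are flying",
        "what color is the bus", "what color is her shirt",
        "what is on the table", "what is the man holding",
    ]
    answer_types = ["yes/no", "yes/no", "number", "number",
                    "color", "color", "object", "object"]

    # Bag-of-words unigrams/bigrams feeding a logistic-regression type predictor.
    type_clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                             LogisticRegression(max_iter=1000))
    type_clf.fit(questions, answer_types)

    # The predicted type (or its probability distribution) can then reweight the
    # candidate answers produced by the rest of the VQA system.
    print(type_clf.predict(["how many dogs are in the photo"]))
    print(type_clf.predict_proba(["is the light on"]).round(2))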


Computer Vision and Image Understanding | 2017

Visual question answering: Datasets, algorithms, and future challenges

Kushal Kafle; Christopher Kanan

Visual Question Answering (VQA) is a recent problem in computer vision and natural language processing that has garnered a large amount of interest from the deep learning, computer vision, and natural language processing communities. In VQA, an algorithm needs to answer text-based questions about images. Since the release of the first VQA dataset in 2014, additional datasets have been released and many algorithms have been proposed. In this review, we critically examine the current state of VQA in terms of problem formulation, existing datasets, evaluation metrics, and algorithms. In particular, we discuss the limitations of current datasets with regard to their ability to properly train and assess VQA algorithms. We then exhaustively review existing algorithms for VQA. Finally, we discuss possible future directions for VQA and image understanding research.


Computer Vision and Pattern Recognition | 2015

VAIS: A dataset for recognizing maritime imagery in the visible and infrared spectrums

Mabel M. Zhang; Jean Choi; Kostas Daniilidis; Michael T. Wolf; Christopher Kanan

The development of fully autonomous seafaring vessels has enormous implications for the world's global supply chain and militaries. To obey international marine traffic regulations, these vessels must be equipped with machine vision systems that can classify other ships nearby during the day and night. In this paper, we address this problem by introducing VAIS, the world's first publicly available dataset of paired visible and infrared ship imagery. This dataset contains more than 1,000 paired RGB and infrared images among six ship categories - merchant, sailing, passenger, medium, tug, and small - which are salient for control and for following maritime traffic regulations. We provide baseline results on this dataset using two off-the-shelf algorithms: gnostic fields and deep convolutional neural networks. Using these classifiers, we are able to achieve 87.4% mean per-class recognition accuracy during the day and 61.0% at night.
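A hedged sketch of the deep-CNN baseline idea: fine-tune an ImageNet-pretrained network on the six ship categories. The backbone, optimizer, and hyperparameters are illustrative, not the paper's exact configuration (the original baselines also predate this torchvision API):

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 6  # merchant, sailing, passenger, medium, tug, small

    # ImageNet-pretrained backbone with a new classification head.
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    # One illustrative training step on a dummy batch of 224x224 images
    # (visible RGB, or infrared replicated to three channels).
    images = torch.randn(8, 3, 224, 224)
    labels = torch.randint(0, NUM_CLASSES, (8,))
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    print("loss:", loss.item())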


International Symposium on Visual Computing | 2010

Color constancy algorithms for object and face recognition

Christopher Kanan; Arturo Flores; Garrison W. Cottrell

Brightness and color constancy is a fundamental problem faced both in computer vision and by our own visual system. We easily recognize objects despite changes in illumination, but without a mechanism to cope with this, many object and face recognition systems perform poorly. In this paper we compare approaches in computer vision and computational neuroscience for inducing brightness and color constancy based on their ability to improve recognition. We analyze the relative performance of the algorithms on the AR face and ALOI datasets using both a SIFT-based recognition system and a simple pixel-based approach. Quantitative results demonstrate that color constancy methods can significantly improve classification accuracy. We also evaluate the approaches on the Caltech-101 dataset to determine how these algorithms affect performance under relatively normal illumination conditions.
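The gray-world algorithm is one classic, representative example of the kind of color constancy method being compared (the paper evaluates several computer-vision and neuroscience-inspired approaches; this is only an illustrative baseline):

    import numpy as np

    def gray_world(rgb, eps=1e-8):
        # Rescale each channel so the image's mean color becomes neutral gray,
        # discounting a global illuminant color cast (the gray-world assumption).
        rgb = rgb.astype(np.float64)
        channel_means = rgb.reshape(-1, 3).mean(axis=0)
        gain = channel_means.mean() / (channel_means + eps)
        return np.clip(rgb * gain, 0.0, 1.0)

    # An image with a reddish cast is pulled back toward a neutral mean color.
    img = np.random.rand(32, 32, 3) * np.array([1.0, 0.6, 0.5])
    print("mean color before:", img.reshape(-1, 3).mean(axis=0).round(3))
    print("mean color after: ", gray_world(img).reshape(-1, 3).mean(axis=0).round(3))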


IEEE Transactions on Geoscience and Remote Sensing | 2017

Self-Taught Feature Learning for Hyperspectral Image Classification

Ronald Kemker; Christopher Kanan

In this paper, we study self-taught learning for hyperspectral image (HSI) classification. Supervised deep learning methods are currently state of the art for many machine learning problems, but these methods require large quantities of labeled data to be effective. Unfortunately, existing labeled HSI benchmarks are too small to directly train a deep supervised network. Instead, we use self-taught learning, an unsupervised method for learning feature-extracting frameworks from unlabeled hyperspectral imagery. These models learn how to extract generalizable features by training on sufficiently large quantities of unlabeled data that are distinct from the target data set. Once trained, these models can extract features from smaller labeled target data sets. We study two self-taught learning frameworks for HSI classification: the first is a shallow approach that uses independent component analysis, and the second is a three-layer stacked convolutional autoencoder. Our models are applied to the Indian Pines, Salinas Valley, and Pavia University data sets, which were captured by two separate sensors at different altitudes. Despite large variation in scene type, our algorithms achieve state-of-the-art results across all three data sets.
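A minimal sketch of the convolutional-autoencoder half of the approach: train an encoder/decoder to reconstruct unlabeled hyperspectral patches, then reuse the encoder as a feature extractor for the small labeled target set. The number of bands, layer sizes, and patch size are illustrative, not the paper's architecture:

    import torch
    import torch.nn as nn

    BANDS = 64  # illustrative number of hyperspectral bands

    class ConvAutoencoder(nn.Module):
        def __init__(self, bands=BANDS):
            super().__init__()
            # Three-convolution encoder; the decoder mirrors it.
            self.encoder = nn.Sequential(
                nn.Conv2d(bands, 128, 3, padding=1), nn.ReLU(),
                nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            )
            self.decoder = nn.Sequential(
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
                nn.Conv2d(128, bands, 3, padding=1),
            )

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = ConvAutoencoder()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Unsupervised step: reconstruct (dummy) unlabeled 9x9 hyperspectral patches.
    patches = torch.randn(16, BANDS, 9, 9)
    loss = nn.functional.mse_loss(model(patches), patches)
    optimizer.zero_grad(); loss.backward(); optimizer.step()

    # After training, the encoder output serves as features for a small
    # supervised classifier trained on the labeled target data set.
    features = model.encoder(patches).flatten(1)
    print(features.shape)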


Workshop on Applications of Computer Vision | 2014

Fine-grained object recognition with Gnostic Fields

Christopher Kanan

Much object recognition research is concerned with basic-level classification, in which objects differ greatly in visual shape and appearance, e.g., desk vs. duck. In contrast, fine-grained classification involves recognizing objects at a subordinate level, e.g., Wood duck vs. Mallard duck. At the basic level, objects tend to differ greatly in shape and appearance, but these differences are usually much more subtle at the subordinate level, making fine-grained classification especially challenging. In this work, we show that Gnostic Fields, a brain-inspired model of object categorization, excel at fine-grained recognition. Gnostic Fields exceeded state-of-the-art methods on benchmark bird classification and dog breed recognition datasets, achieving a relative improvement of 30.5% over the state of the art on the Caltech-UCSD Birds-200 (CUB-200) dataset and a 25.5% relative improvement on the Stanford Dogs dataset. We also demonstrate that Gnostic Fields can be sped up, enabling real-time classification in less than 70 ms per image.
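A hedged, highly simplified sketch of the gnostic-field idea: each category keeps a set of unit-norm "gnostic units" (here learned with spherical k-means over that category's descriptors), each image descriptor activates its best-matching unit per category, and the per-descriptor evidence is summed before taking the arg-max. Whitening, competitive normalization, and other details from the paper are omitted:

    import numpy as np

    def normalize(v, eps=1e-12):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

    def spherical_kmeans(X, k, iters=20, seed=0):
        # Cluster unit-norm descriptors by cosine similarity.
        rng = np.random.default_rng(seed)
        X = normalize(X)
        centers = X[rng.choice(len(X), k, replace=False)].copy()
        for _ in range(iters):
            assign = (X @ centers.T).argmax(axis=1)
            for j in range(k):
                if np.any(assign == j):
                    centers[j] = X[assign == j].sum(axis=0)
            centers = normalize(centers)
        return centers

    def fit_gnostic_field(descriptors_per_class, units_per_class=32):
        return [spherical_kmeans(D, units_per_class) for D in descriptors_per_class]

    def classify(field, descriptors):
        D = normalize(descriptors)
        # Best-matching unit per category for each descriptor, summed over descriptors.
        evidence = np.array([(D @ units.T).max(axis=1).sum() for units in field])
        return int(evidence.argmax())

    # Toy usage with random "descriptors" for three classes.
    rng = np.random.default_rng(1)
    train = [rng.normal(size=(200, 64)) + i for i in range(3)]
    field = fit_gnostic_field(train)
    print("predicted class:", classify(field, rng.normal(size=(50, 64)) + 2))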

Collaboration


Dive into Christopher Kanan's collaborations.

Top Co-Authors

Ronald Kemker, Rochester Institute of Technology
Kushal Kafle, Rochester Institute of Technology
Gabriel Diaz, Rochester Institute of Technology
Jeff B. Pelz, Rochester Institute of Technology
Kamran Binaee, Rochester Institute of Technology
Mohammed Yousefhussien, Rochester Institute of Technology
Rakshit Kothari, Rochester Institute of Technology
Carl Salvaggio, Rochester Institute of Technology
Reynold J. Bailey, Rochester Institute of Technology