Cheston Tan
McGovern Institute for Brain Research
Publications
Featured research published by Cheston Tan.
Vision Research | 2010
Sharat Chikkerur; Thomas Serre; Cheston Tan; Tomaso Poggio
In the theoretical framework of this paper, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information, while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena, including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves, and shifts in contrast responses, all emerge naturally as predictions of the model. We also show that the Bayesian model accurately predicts human eye fixations (considered as a proxy for shifts of attention) in natural scenes.
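The inference at the heart of the theory is compact enough to sketch in code. The following minimal Python toy (all distributions and sizes are illustrative assumptions, not taken from the paper) casts attention as Bayesian inference over object identity O and location L given features F, with spatial attention as a sharp prior on L and feature-based attention as a sharp prior on O:

# Toy sketch of attention as Bayesian inference over "what" (object
# identity O) and "where" (location L) given observed features F.
# All numbers below are illustrative assumptions, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_objects, n_locations = 3, 5

# Likelihood P(F | O, L): how strongly the observed features support
# object o being present at location l (made-up values).
likelihood = rng.random((n_objects, n_locations))
likelihood /= likelihood.sum()

def posterior(prior_o, prior_l):
    """Joint posterior P(O, L | F) proportional to P(F | O, L) P(O) P(L)."""
    joint = likelihood * prior_o[:, None] * prior_l[None, :]
    return joint / joint.sum()

uniform_o = np.full(n_objects, 1 / n_objects)
uniform_l = np.full(n_locations, 1 / n_locations)

# Spatial attention: a sharp prior on location reduces uncertainty
# about identity (the "what").
spatial_prior = np.array([0.02, 0.02, 0.90, 0.04, 0.02])
p_o = posterior(uniform_o, spatial_prior).sum(axis=1)

# Feature-based attention: a sharp prior on identity reduces
# uncertainty about location (the "where").
feature_prior = np.array([0.90, 0.05, 0.05])
p_l = posterior(feature_prior, uniform_l).sum(axis=0)

print("P(O | F) under spatial attention:", p_o.round(3))
print("P(L | F) under feature-based attention:", p_l.round(3))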
Machine Learning for Computer Vision | 2013
Cheston Tan; Joel Z. Leibo; Tomaso Poggio
In recent years, scientific and technological advances have produced artificial systems that have matched or surpassed human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines remains far from solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes: the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence.
PLOS ONE | 2016
Cheston Tan; Tomaso Poggio
Faces are an important and unique class of visual stimuli, and have been of interest to neuroscientists for many years. Faces are known to elicit certain characteristic behavioral markers, collectively labeled “holistic processing”, while non-face objects are not processed holistically. However, little is known about the underlying neural mechanisms. The main aim of this computational simulation work is to investigate the neural mechanisms that make face processing holistic. Using a model of primate visual processing, we show that a single key factor, “neural tuning size”, is able to account for three important markers of holistic face processing: the Composite Face Effect (CFE), Face Inversion Effect (FIE) and Whole-Part Effect (WPE). Our proof-of-principle specifies the precise neurophysiological property that corresponds to the poorly-understood notion of holism, and shows that this one neural property controls three classic behavioral markers of holism. Our work is consistent with neurophysiological evidence, and makes further testable predictions. Overall, we provide a parsimonious account of holistic face processing, connecting computation, behavior and neurophysiology.
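A toy simulation can make the "neural tuning size" idea concrete. The sketch below (an illustration under assumed numbers, not the paper's model of primate visual processing) gives one unit a small tuning region covering only the top half of a feature vector and another a large region covering the whole vector; only the large-tuning unit's response to an identical top half changes when the bottom half changes, mirroring the Composite Face Effect:

# Illustrative sketch: how a unit's tuning size can produce holistic
# responses. Feature indices and tuning width are assumptions.
import numpy as np

rng = np.random.default_rng(1)

n_features = 10  # features 0-4: top half of face, 5-9: bottom half
face = rng.random(n_features)  # the stored face template

def unit_response(stimulus, template, idx):
    """Gaussian-tuned response over the features the unit covers."""
    d = np.linalg.norm(stimulus[idx] - template[idx])
    return np.exp(-d**2 / 0.5)

top = np.arange(5)             # small tuning size: top half only
whole = np.arange(n_features)  # large tuning size: whole face

same_bottom = face.copy()
new_bottom = face.copy()
new_bottom[5:] = rng.random(5)  # identical top half, changed bottom half

for idx, name in [(top, "small tuning (part-based)"),
                  (whole, "large tuning (holistic)")]:
    r1 = unit_response(same_bottom, face, idx)
    r2 = unit_response(new_bottom, face, idx)
    print(f"{name}: same bottom = {r1:.3f}, new bottom = {r2:.3f}")

# Only the large-tuning unit's response changes when the (task-irrelevant)
# bottom half changes, the signature of composite-face-like holism.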
Proceedings of SPIE | 2011
Sharat Chikkerur; Thomas Serre; Cheston Tan; Tomaso Poggio
David Marr famously defined vision as knowing what is where by seeing. In the framework described here, attention is the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that performs well in recognition tasks and that predicts some of the main properties of attention at the level of psychophysics and physiology. We propose an algorithmic implementation, a Bayesian network, that can be mapped onto the basic functional anatomy of attention involving the ventral and dorsal streams. This description integrates bottom-up, feature-based, and spatial (context-based) attentional mechanisms. We show that the Bayesian model accurately predicts human eye fixations (considered as a proxy for shifts of attention) in natural scenes, and can improve accuracy in object recognition tasks involving cluttered real-world images. In both cases, the proposed model predicts human performance better than existing bottom-up and top-down computational models.
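One way to picture the fixation-prediction result: treat the model's posterior marginal over locations as a priority map and score it against human fixations. The sketch below uses toy data and a simple ROC-style AUC (the map, grid size, and fixations are placeholders; the paper's evaluation uses real images and recorded fixations):

# Sketch: scoring a location posterior as a fixation-prediction map.
# p_loc stands in for P(L | F) from a model like the one above.
import numpy as np

rng = np.random.default_rng(2)
H, W = 8, 8

p_loc = rng.random((H, W))   # placeholder priority map
p_loc /= p_loc.sum()

# Hypothetical human fixation map: True at fixated grid cells.
fixations = np.zeros((H, W), dtype=bool)
fixations[rng.integers(0, H, 5), rng.integers(0, W, 5)] = True

# Simple AUC: does the map rank fixated cells above non-fixated ones?
scores, labels = p_loc.ravel(), fixations.ravel()
pos, neg = scores[labels], scores[~labels]
auc = (pos[:, None] > neg[None, :]).mean()
print(f"fixation-prediction AUC (toy data): {auc:.2f}")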
Journal of Vision | 2010
Cheston Tan; Thomas Serre; Gabriel Kreiman; Tomaso Poggio
Feedforward hierarchical models of the visual cortex constitute a popular class of models of object recognition. In these models, position- and scale-invariant recognition is achieved via selective pooling mechanisms, resulting in units at the top of the hierarchy having large receptive fields that signal the presence of specific image features within those fields, irrespective of scale and location. Hence, it is often assumed that such models are incompatible with data that suggest a representation for configurations between objects or parts. Here, we consider a specific implementation of this class of models (Serre et al., 2005) and show that location, scale and configural information is implicitly encoded by a small population of IT units. First, we show that model IT units agree quantitatively with the coarse location and scale information read out from neurons in macaque IT cortex (Hung et al., 2005). Next, we consider the finding by Biederman et al. (VSS 2007) that changes in configuration are reflected both behaviorally and in the BOLD signal measured in adaptation experiments. Model results are qualitatively similar to theirs: for stimuli consisting of two objects, stimuli that differ in location (objects shifted together) evoke similar responses, while stimuli that differ in configuration (object locations swapped) evoke dissimilar responses. Finally, the model replicates psychophysical findings by Hayworth et al. (VSS 2007), further demonstrating sensitivity to configuration. Line drawings of objects were split into complementary pairs A and B by assigning every other vertex to A and the complementary vertices to B. Scrambled versions A′ and B′ were then generated. Both human subjects and the model rated A as more similar to B than to A′. Altogether, our results suggest that implicit location, scale and configural information exists in feedforward hierarchical models based on a large dictionary of shape components with various levels of invariance.
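The readout logic behind the first result can be illustrated with a small simulation (an assumed setup, not the paper's code): location is only implicitly encoded in a simulated population of model "IT" units, yet a simple linear readout recovers it, in the spirit of Hung et al. (2005):

# Sketch: linear readout of stimulus location from a simulated unit
# population. Population sizes, biases and noise levels are assumptions.
import numpy as np

rng = np.random.default_rng(3)
n_units, n_locations, n_trials = 50, 4, 40

# Each unit carries a mild, unit-specific location bias buried in noise,
# so location is only implicitly encoded across the population.
bias = rng.normal(0, 1, (n_locations, n_units))
X = np.vstack([bias[l] + rng.normal(0, 1.0, (n_trials, n_units))
               for l in range(n_locations)])
y = np.repeat(np.arange(n_locations), n_trials)

# Least-squares one-vs-all linear readout (no external libraries).
A = np.c_[X, np.ones(len(X))]          # add a bias column
Y = np.eye(n_locations)[y]             # one-hot targets
W, *_ = np.linalg.lstsq(A, Y, rcond=None)
pred = (A @ W).argmax(axis=1)
print(f"training accuracy of linear location readout: {(pred == y).mean():.2f}")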
arXiv: Learning | 2015
Fabio Anselmi; Lorenzo Rosasco; Cheston Tan; Tomaso Poggio
Archive | 2009
Tomaso Poggio; Thomas Serre; Cheston Tan; Sharat Chikkerur
Archive | 2013
Cheston Tan; Tomaso Poggio
Neural Information Processing Systems | 2013
Cheston Tan; Jedediah M. Singer; Thomas Serre; David L. Sheinberg; Tomaso Poggio
arXiv: Artificial Intelligence | 2014
Cheston Tan; Tomaso Poggio