Kandan Ramakrishnan
University of Amsterdam
Publications
Featured research published by Kandan Ramakrishnan.
Frontiers in Computational Neuroscience | 2015
Kandan Ramakrishnan; H. Steven Scholte; I. Groen; Arnold W. M. Smeulders; Sennay Ghebreab
The human visual system is assumed to transform low-level visual features into object and scene representations via features of intermediate complexity. How the brain computationally represents intermediate features is still unclear. To further elucidate this, we compared the biologically plausible HMAX model and the Bag of Words (BoW) model from computer vision. Both models use visual dictionaries, candidate features of intermediate complexity, to represent visual scenes, and both have proven effective in automatic object and scene recognition. The models differ, however, in how they compute visual dictionaries and in their pooling techniques. We investigated where in the brain, and to what extent, human fMRI responses to a short video can be accounted for by multiple hierarchical levels of the HMAX and BoW models. Brain activity of 20 subjects, recorded while they viewed a short video clip, was analyzed voxel-wise using a distance-based variation partitioning method. Results revealed that both HMAX and BoW explain a significant amount of brain activity in early visual regions V1, V2, and V3. However, BoW accounts for brain activity more consistently across subjects than HMAX. Furthermore, the visual dictionary representations of HMAX and BoW explain a significant amount of brain activity in higher areas believed to process intermediate features. Overall, our results indicate that, although both HMAX and BoW account for activity in the human visual system, BoW appears to represent neural responses in low- and intermediate-level visual areas of the brain more faithfully.
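To illustrate the analysis, the sketch below shows one way distance-based variance partitioning can be set up: pairwise stimulus distances are computed for two model representations and for voxel responses, and the brain distances are then partitioned into unique and shared parts. The feature matrices and dimensions are random stand-ins, not the actual HMAX or BoW pipelines used in the paper.

```python
# Hypothetical sketch: distance-based variance partitioning between two model
# representations (e.g. HMAX-like and BoW-like features) and voxel responses.
import numpy as np
from scipy.spatial.distance import pdist
from numpy.linalg import lstsq

def r_squared(X, y):
    """Variance in y explained by a linear combination of the columns of X."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n_stimuli = 50
model_a = rng.normal(size=(n_stimuli, 200))   # stand-in for HMAX-like features
model_b = rng.normal(size=(n_stimuli, 300))   # stand-in for BoW-like features
voxels  = rng.normal(size=(n_stimuli, 1000))  # stand-in fMRI responses per stimulus

# Pairwise distances between stimuli in each representation.
d_a, d_b, d_brain = (pdist(m, metric="correlation") for m in (model_a, model_b, voxels))

# Partition the variance of the brain distances into unique and shared parts.
full   = r_squared(np.column_stack([d_a, d_b]), d_brain)
only_a = r_squared(d_a[:, None], d_brain)
only_b = r_squared(d_b[:, None], d_brain)
print("unique A:", full - only_b, "unique B:", full - only_a, "shared:", only_a + only_b - full)
```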
Journal of Vision | 2015
Kandan Ramakrishnan; Steven Scholte; Victor A. F. Lamme; Arnold W. M. Smeulders; Sennay Ghebreab
Biologically inspired computational models replicate the hierarchical visual processing of the human ventral stream. One recent such model, the Convolutional Neural Network (CNN), has achieved state-of-the-art performance on automatic visual recognition tasks. The CNN architecture contains successive layers of convolution and pooling, resembling the simple and complex cell hierarchy proposed by Hubel and Wiesel. This makes it a candidate model to test against the human brain. In this study we examine 1) where in the brain different layers of the CNN account for brain responses, and 2) how the CNN compares against existing, widely used hierarchical vision models such as Bag-of-Words (BoW) and HMAX. fMRI brain activity of 20 subjects, recorded while they viewed a short video clip, was analyzed voxel-wise using a distance-based variation partitioning method. Variation partitioning was applied to successive CNN layers to determine the unique contribution of each layer in explaining fMRI brain activity. We observe that each of the 7 CNN layers accounts for brain activity consistently across subjects in areas known to be involved in visual processing. In addition, we find a relation between the visual processing hierarchy in the brain and the 7 CNN layers: visual areas such as V1, V2 and V3 are sensitive to lower layers of the CNN, while areas such as LO, TO and PPA are sensitive to higher layers. Comparison of the CNN with HMAX and BoW furthermore shows that while all three models explain brain activity in early visual areas, the CNN additionally explains activity deeper in the brain. Overall, our results suggest that Convolutional Neural Networks provide a suitable computational basis for visual processing in the brain, allowing feed-forward representations in the visual brain to be decoded. Meeting abstract presented at VSS 2015.
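A minimal sketch of the layer-wise comparison idea follows, assuming a toy convolutional network, random stand-in stimuli and a synthetic "brain" representation rather than the 7-layer CNN and fMRI recordings used in the study; per-layer activations are captured with forward hooks and compared to a brain representational dissimilarity matrix (RDM).

```python
# Hypothetical sketch: capture per-layer CNN activations with forward hooks,
# then ask which layer's representation best matches a brain area's responses.
import torch
import torch.nn as nn
import numpy as np

net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 16 * 16, 10),
)

activations = {}
def hook(name):
    def fn(module, inp, out):
        activations[name] = out.flatten(1).detach().numpy()
    return fn

for i, layer in enumerate(net):
    layer.register_forward_hook(hook(f"layer{i}"))

images = torch.randn(20, 3, 64, 64)      # stand-in stimulus set
net(images)

# Toy "brain" responses for one visual area; in practice these come from fMRI.
rng = np.random.default_rng(0)
brain = rng.normal(size=(20, 500))

def rdm(x):
    """Condensed representational dissimilarity matrix (1 - correlation)."""
    return 1 - np.corrcoef(x)[np.triu_indices(len(x), k=1)]

brain_rdm = rdm(brain)
for name, act in activations.items():
    r = np.corrcoef(rdm(act), brain_rdm)[0, 1]
    print(f"{name}: RDM correlation = {r:.3f}")
```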
international conference on multimedia and expo | 2014
Kandan Ramakrishnan; I. Groen; H.S. Scholte; Arnold W. M. Smeulders; Sennay Ghebreab
The human visual system is thought to use features of intermediate complexity for scene representation. How the brain computationally represents intermediate features is, however, still unclear. Here we tested two widely used computational models, the biologically plausible HMAX model and the Bag of Words (BoW) model from computer vision, against human brain activity. Both models use visual dictionaries, candidate features of intermediate complexity, to represent visual scenes, and both have proven effective in automatic object and scene recognition. We analyzed where in the brain, and to what extent, human fMRI responses to natural scenes can be accounted for by the HMAX and BoW representations. Voxel-wise application of a distance-based variation partitioning method reveals that HMAX explains significant brain activity in early visual regions and also in higher regions such as LO and TO, while BoW primarily explains brain activity in early visual areas. Notably, both HMAX and BoW explain the most brain activity in higher areas such as V4 and TO. These results suggest that visual dictionaries might provide a suitable computation for the representation of intermediate features in the brain.
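As a rough illustration of what a visual dictionary is, the sketch below builds a toy Bag-of-Words encoding: local patch descriptors are clustered into "visual words" with k-means, and each image is then represented as a histogram over those words. The raw pixel patches and dictionary size here are simplifying assumptions; actual BoW pipelines use richer local descriptors.

```python
# Hypothetical sketch of a Bag-of-Words visual dictionary.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.random((30, 32, 32))          # stand-in grayscale stimuli

def extract_patches(img, size=8, step=8):
    """Non-overlapping raw pixel patches as crude local descriptors."""
    return np.array([img[i:i+size, j:j+size].ravel()
                     for i in range(0, img.shape[0] - size + 1, step)
                     for j in range(0, img.shape[1] - size + 1, step)])

all_patches = np.vstack([extract_patches(im) for im in images])
dictionary = KMeans(n_clusters=50, n_init=4, random_state=0).fit(all_patches)

def encode(img):
    """Histogram over visual words = the BoW representation of one image."""
    words = dictionary.predict(extract_patches(img))
    return np.bincount(words, minlength=50) / len(words)

bow_features = np.stack([encode(im) for im in images])
print(bow_features.shape)   # (30, 50): one 50-word histogram per image
```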
Cortex | 2018
H. Steven Scholte; Max Losch; Kandan Ramakrishnan; Edward H.F. de Haan; Sander M. Bohte
Vision research has been shaped by the seminal insight that we can understand the higher-tier visual cortex from the perspective of multiple functional pathways with different goals. In this paper, we try to give a computational account of the functional organization of this system by reasoning from the perspective of multi-task deep neural networks. Machine learning has shown that tasks become easier to solve when they are decomposed into subtasks with their own cost function. We hypothesize that the visual system optimizes multiple cost functions of unrelated tasks, and that this causes the emergence of a ventral pathway dedicated to vision for perception and a dorsal pathway dedicated to vision for action. To evaluate the functional organization of multi-task deep neural networks, we propose a method that measures the contribution of a unit towards each task, applying it to two networks that have been trained on either two related or two unrelated tasks, using an identical stimulus set. Results show that the network trained on unrelated tasks exhibits a decreasing degree of feature-representation sharing towards higher-tier layers, while the network trained on related tasks shows a uniformly high degree of sharing. We conjecture that the method we propose can be used to analyze the anatomical and functional organization of the visual system and beyond. We predict that the degree to which tasks are related is a good descriptor of the degree to which they share downstream cortical units.
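One simple way such a per-unit, per-task contribution measure could be implemented is ablation: zero out a shared hidden unit and record how much each task's loss increases. The sketch below assumes a toy two-head network with random, untrained weights and synthetic data purely for illustration; it is not the exact measure proposed in the paper.

```python
# Hypothetical sketch: estimate how much each shared hidden unit contributes to
# each task by zeroing it out and measuring the increase in that task's loss.
# In practice the network would be trained on both tasks first.
import torch
import torch.nn as nn

torch.manual_seed(0)
x   = torch.randn(200, 10)
y_a = torch.randn(200, 1)                  # task A targets (stand-in)
y_b = torch.randn(200, 1)                  # task B targets (stand-in)

shared = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
head_a, head_b = nn.Linear(16, 1), nn.Linear(16, 1)
loss_fn = nn.MSELoss()

def task_losses(mask=None):
    h = shared(x)
    if mask is not None:
        h = h * mask                       # zero out selected hidden units
    return loss_fn(head_a(h), y_a).item(), loss_fn(head_b(h), y_b).item()

base_a, base_b = task_losses()
for unit in range(16):
    mask = torch.ones(16)
    mask[unit] = 0.0
    la, lb = task_losses(mask)
    # A unit whose ablation raises both losses is a feature shared by both tasks.
    print(f"unit {unit:2d}: d_loss A = {la - base_a:+.4f}, d_loss B = {lb - base_b:+.4f}")
```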
bioRxiv | 2017
Kandan Ramakrishnan; Iris I. A. Groen; Arnold W. M. Smeulders; H. Steven Scholte; Sennay Ghebreab
Convolutional neural networks (CNNs) have recently emerged as promising models of human vision, based on their ability to predict hemodynamic brain responses to visual stimuli measured with functional magnetic resonance imaging (fMRI). However, the degree to which CNNs can predict the temporal dynamics of visual object recognition, as reflected in neural measures with millisecond precision, is less well understood. Additionally, while deeper CNNs with more layers perform better on automated object recognition, it is unclear whether this also results in better correlation with brain responses. Here, we examined 1) to what extent CNN layers predict visual evoked responses in the human brain over time, and 2) whether deeper CNNs better model brain responses. Specifically, we tested how well CNN architectures with 7 (CNN-7) and 15 (CNN-15) layers predicted electroencephalography (EEG) responses to several thousand natural images. Our results show that both CNN architectures correspond to EEG responses in a hierarchical spatio-temporal manner, with lower layers explaining responses early in time at electrodes overlying early visual cortex, and higher layers explaining responses later in time at electrodes overlying lateral occipital cortex. While the variance in neural responses explained by individual layers did not differ between CNN-7 and CNN-15, combining representations across layers resulted in improved performance of CNN-15 relative to CNN-7, but only from 150 ms after stimulus onset. This suggests that CNN representations reflect both early (feed-forward) and late (feedback) stages of visual processing. Overall, our results show that CNN depth indeed plays a role in explaining time-resolved EEG responses.
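A minimal sketch of time-resolved encoding of this kind follows: at each time point, EEG responses are regressed on a given layer's features and the cross-validated explained variance is tracked over time. The layer features, single-electrode EEG data and dimensions are random stand-ins, not the CNN-7/CNN-15 features and recordings used in the study.

```python
# Hypothetical sketch: time-resolved encoding of EEG responses from CNN layer features.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_images, n_times = 200, 60
layer_feats = {f"layer{k}": rng.normal(size=(n_images, 100)) for k in (1, 4, 7)}
eeg = rng.normal(size=(n_images, n_times))    # one electrode, 60 time points

scores = {}                                   # layer -> explained variance per time point
for name, feats in layer_feats.items():
    scores[name] = [
        cross_val_score(RidgeCV(alphas=[0.1, 1.0, 10.0]), feats, eeg[:, t],
                        cv=5, scoring="r2").mean()
        for t in range(n_times)
    ]
    # The time of the peak indicates where in the response this layer matters most.
    print(name, "peaks at time index", int(np.argmax(scores[name])))
```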
Journal of Vision | 2016
Kandan Ramakrishnan; H. Steven Scholte; Arnold W. M. Smeulders; Sennay Ghebreab
Journal of Vision | 2016
Kandan Ramakrishnan; H. Steven Scholte; Sennay Ghebreab
Journal of Vision | 2016
Noor Seijdel; Kandan Ramakrishnan; Max Losch; Steven Scholte
Journal of Vision | 2016
H. Steven Scholte; Max Losch; Noor Seijdel; Kandan Ramakrishnan; Cees G. M. Snoek
Journal of Vision | 2016
Max Losch; Noor Seijdel; Kandan Ramakrishnan; Cees G. M. Snoek; H. Steven Scholte