James Bonaiuto
University of Southern California
Publications
Featured research published by James Bonaiuto.
Biological Cybernetics | 2007
James Bonaiuto; Edina Rosta; Michael A. Arbib
The paper introduces mirror neuron system II (MNS2), a new version of the MNS model (Oztop and Arbib in Biol Cybern 87 (2):116–140, 2002) of action recognition learning by mirror neurons of the macaque brain. The new model uses a recurrent architecture that is biologically more plausible than that of the original model. Moreover, MNS2 extends the capacity of the model to address data on audio-visual mirror neurons and on the response of mirror neurons when the target object was recently visible but is currently hidden.
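To make the architectural point concrete, here is a minimal sketch, not the authors' implementation: an Elman-style recurrent classifier over hand-state trajectories, illustrating why recurrence lets a recognition estimate persist when visual input is transiently missing (as when the target object is hidden). All dimensions, feature counts, and grasp categories are hypothetical.

```python
# Minimal sketch (not the MNS2 implementation): a recurrent classifier
# over hand-state trajectories. The hidden state carries information
# across time steps, so frames with missing input do not erase it.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 7, 20, 3   # hypothetical: 7 hand-state features, 3 grasp types
W_in  = rng.normal(0, 0.1, (N_HID, N_IN))
W_rec = rng.normal(0, 0.1, (N_HID, N_HID))
W_out = rng.normal(0, 0.1, (N_OUT, N_HID))

def recognize(trajectory):
    """Run a sequence of hand-state vectors through the recurrent layer
    and return a probability over grasp types."""
    h = np.zeros(N_HID)
    for x in trajectory:
        h = np.tanh(W_in @ x + W_rec @ h)          # recurrent update
    logits = W_out @ h
    return np.exp(logits) / np.exp(logits).sum()   # softmax over grasp types

# Toy usage: 20 visible time steps, then 10 steps with the object
# "hidden" (input zeroed); the hidden state still supports an output.
traj = [rng.normal(size=N_IN) for _ in range(20)] + [np.zeros(N_IN)] * 10
print(recognize(traj))
```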
Psychological Research / Psychologische Forschung | 2009
Michael A. Arbib; James Bonaiuto; Stéphane Jacobs; Scott H. Frey
We review recent neurophysiological data from macaques and humans suggesting that the use of tools extends the internal representation of the actor’s hand, and relate it to our modeling of the visual control of grasping. We introduce the idea that, in addition to extending the body schema to incorporate the tool, tool use involves distalization of the end-effector from hand to tool. Different tools extend the body schema in different ways, with a displaced visual target and a novel, task-specific processing of haptic feedback to the hand. This distalization is critical in order to exploit the unique functional capacities engendered by complex tools.
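At the coordinate level, distalization can be pictured as shifting the controlled end-effector frame from the hand to the tool tip. The sketch below is illustrative only and is not drawn from the paper; the function names, tool geometry, and numbers are hypothetical.

```python
# Illustrative sketch only: "distalization" as a change of controlled frame.
import numpy as np

def tool_tip_position(hand_pos, hand_rot, tool_offset):
    """Tool tip in world coordinates: hand position plus the rotated
    hand-to-tip offset. Controlling this point, rather than the hand,
    is the coordinate-level analogue of distalization."""
    return hand_pos + hand_rot @ tool_offset

def hand_target_for_tip(tip_target, hand_rot, tool_offset):
    """Invert the transform: where must the hand go so that the tool
    tip reaches the displaced visual target?"""
    return tip_target - hand_rot @ tool_offset

hand_rot = np.eye(3)                     # hand orientation (identity for simplicity)
tool_offset = np.array([0.0, 0.0, 0.3])  # hypothetical 30 cm tool
tip_target = np.array([0.5, 0.1, 0.2])   # displaced visual target
print(hand_target_for_tip(tip_target, hand_rot, tool_offset))
```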
Computer Vision and Pattern Recognition | 2005
James Bonaiuto; Laurent Itti
Bottom-up visual attention allows primates to quickly select regions of an image that contain salient objects. In artificial systems, restricting the task of object recognition to these regions allows faster recognition and unsupervised learning of multiple objects in cluttered scenes. A problem is that objects superficially dissimilar to the target are given the same consideration in recognition as similar objects. Here we investigate rapid pruning of the recognition search space using the already-computed low-level features that guide attention. Itti and Koch’s bottom-up visual attention algorithm selects salient locations based on low-level features such as contrast, orientation, color, and intensity. Lowe’s SIFT recognition algorithm then extracts a signature of the attended object, for comparison with the object database. The database search is prioritized for objects which better match the low-level features used to guide attention to the current candidate for recognition. The SIFT signatures of prioritized database objects are then checked for match against the attended candidate. By comparing performance of Lowe’s recognition algorithm and Itti and Koch’s bottom-up attention model with or without search space pruning, we demonstrate that our pruning approach improves the speed of object recognition in complex natural scenes.
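The pruning step can be summarized in a few lines. This is a hedged sketch of the idea rather than Itti & Koch's or Lowe's code: rank database objects by how closely their stored low-level feature vectors match those at the attended location, then run the expensive signature comparison in that order, stopping at the first match. `match_fn` stands in for SIFT matching and the threshold is arbitrary.

```python
# Hedged sketch of search-space pruning (not the published implementation).
import numpy as np

def feature_distance(a, b):
    """Distance between low-level feature vectors (contrast,
    orientation, color, intensity)."""
    return np.linalg.norm(a - b)

def prioritized_recognition(attended_features, attended_signature,
                            database, match_fn, threshold=0.8):
    """database: list of (name, low_level_features, signature) tuples.
    match_fn(sig_a, sig_b) -> similarity score; stands in for SIFT."""
    ranked = sorted(database,
                    key=lambda entry: feature_distance(attended_features,
                                                       entry[1]))
    for name, _, signature in ranked:       # most promising objects first
        if match_fn(attended_signature, signature) >= threshold:
            return name                     # early exit: search space pruned
    return None
```

The speedup comes from the early exit: cheap, already-computed features reorder the candidates so that the costly signature match usually succeeds after only a few comparisons.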
Image and Vision Computing | 2006
James Bonaiuto; Laurent Itti
Bottom-up visual attention allows primates to quickly select regions of an image that contain salient objects. In artificial systems, restricting the task of object recognition to these regions allows faster recognition and unsupervised learning of multiple objects in cluttered scenes. A problem with this approach is that objects superficially dissimilar to the target are given the same consideration in recognition as similar objects. In video, objects recognized in previous frames at locations distant from the current fixation point are given the same consideration in recognition as objects previously recognized at locations closer to the current target of attention. Due to the continuity of smooth motion, objects recently recognized in previous frames at locations close to the current focus of attention have a high probability of matching the current target. Here we investigate rapid pruning of the facial recognition search space using the already-computed low-level features that guide attention, together with spatial information derived from previous video frames. For each video frame, Itti & Koch's bottom-up visual attention algorithm is used to select salient locations based on low-level features such as contrast, orientation, color, intensity, flicker, and motion. This algorithm has been shown to be highly effective in selecting faces as salient objects. Lowe's SIFT object recognition algorithm then extracts a signature of the attended object for comparison with the facial database. The database search is prioritized for faces that better match the low-level features used to guide attention to the current candidate for recognition, or that were previously recognized near the current candidate's location. The SIFT signatures of the prioritized faces are then checked against the attended candidate for a match. By comparing the performance of Lowe's recognition algorithm and Itti & Koch's bottom-up attention model with and without search space pruning, we demonstrate that our pruning approach improves the speed of facial recognition in video footage.
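The video extension amounts to adding a spatial term to the priority score from the 2005 work. Again a sketch under stated assumptions, not the published code: the score combines low-level feature similarity with the proximity of each face's last recognized location to the current fixation, exploiting the continuity of smooth motion. The weights `w_feat` and `w_space` are hypothetical.

```python
# Sketch of the video extension: feature similarity plus spatial prior.
import numpy as np

def priority(entry, attended_features, fixation, w_feat=1.0, w_space=1.0):
    """entry: dict with 'features' and, if the face was recognized in a
    previous frame, 'last_location'. Lower score = checked earlier."""
    score = w_feat * np.linalg.norm(attended_features - entry["features"])
    if entry.get("last_location") is not None:
        # Faces recently seen near the current fixation rank higher.
        score += w_space * np.linalg.norm(fixation - entry["last_location"])
    return score

def prioritized_face_search(attended_features, fixation, faces):
    """Order the facial database before running SIFT signature checks."""
    return sorted(faces, key=lambda e: priority(e, attended_features, fixation))
```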
Neural Networks | 2014
James Bonaiuto; Michael A. Arbib
Winner-take-all models are commonly used to model decision-making tasks where one outcome must be selected from several competing options. Related random walk and diffusion models have been used to explain such processes and to fit psychometric and neurophysiological data. Recent model-based fMRI studies have sought the neural correlates of decision-making processes. However, because hemodynamic responses likely reflect synaptic rather than spiking activity, the expected BOLD signature of winner-take-all circuits is not clear. A powerful way to integrate data from neurophysiology and brain imaging is to develop biologically plausible neural network models constrained and testable by neural and behavioral data, and then to use Synthetic Brain Imaging: transforming the output of model simulations into predictions testable against neuroimaging data. We developed a biologically realistic spiking winner-take-all model composed of coupled excitatory and inhibitory neural populations. We varied the difficulty of a decision-making task by adjusting the contrast, or relative strength, of the inputs representing two response options. Synthetic brain imaging was used to estimate the BOLD response of the model and to analyze its peak as a function of input contrast. We performed a parameter space analysis to determine the values for which the model performs the task accurately and, given accurate performance, the distribution of input contrast-BOLD response relationships. This underscores the need for models grounded in neurophysiological data when brain imaging analyses attempt to localize the neural correlates of cognitive processes based on predicted BOLD responses.
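The following is a drastically simplified, rate-based sketch of the pipeline's shape (the paper itself uses spiking populations and a full Synthetic Brain Imaging procedure): two excitatory pools compete through shared inhibition, and a BOLD proxy is built from total absolute synaptic drive convolved with a gamma-shaped hemodynamic response. All weights, time constants, and the HRF form are assumptions for illustration.

```python
# Simplified rate-based sketch of WTA dynamics plus a synthetic BOLD proxy.
import numpy as np

def simulate_wta(contrast, T=2000, dt=1e-3):
    """contrast in [0, 1]: relative strength of the two inputs."""
    e1 = e2 = inh = 0.0
    w_ee, w_ei, w_ie = 2.0, 2.0, 2.5        # hypothetical coupling weights
    base = 1.0
    I1, I2 = base * (1 + contrast), base * (1 - contrast)
    syn = np.zeros(T)                        # total synaptic drive per step
    for t in range(T):
        in1 = w_ee * e1 - w_ie * inh + I1
        in2 = w_ee * e2 - w_ie * inh + I2
        in_i = w_ei * (e1 + e2)
        e1 += dt / 0.02 * (-e1 + max(in1, 0.0))
        e2 += dt / 0.02 * (-e2 + max(in2, 0.0))
        inh += dt / 0.01 * (-inh + max(in_i, 0.0))
        # BOLD is taken to track synaptic (input) activity, not spiking:
        syn[t] = abs(in1) + abs(in2) + abs(in_i)
    # Gamma-shaped hemodynamic response (stand-in for a canonical HRF)
    tt = np.arange(0, 20, dt)
    hrf = (tt ** 5) * np.exp(-tt)
    hrf /= hrf.sum()
    bold = np.convolve(syn, hrf)[:T]
    return e1, e2, bold.max()

for c in (0.1, 0.3, 0.6):
    e1, e2, peak = simulate_wta(c)
    print(f"contrast={c:.1f}  winner={'1' if e1 > e2 else '2'}  peak BOLD proxy={peak:.2f}")
```

Because the proxy is driven by total synaptic input rather than by the winning population's firing rate alone, its peak need not track input contrast in the intuitive way, which is the ambiguity the parameter space analysis probes.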
Proceedings of the 6th International Conference on the Evolution of Language (EVOLANG6) | 2006
Michael A. Arbib; James Bonaiuto; Edina Rosta
The Mirror System Hypothesis (MSH) of the evolution of brain mechanisms supporting language distinguishes a monkey-like mirror neuron system from a chimpanzee-like mirror system that supports simple imitation and a human-like mirror system that supports complex imitation and language. This paper briefly reviews the seven evolutionary stages posited by MSH and then focuses on the early stages, which precede language but are claimed to ground it. It introduces MNS2, a new model of action recognition learning by mirror neurons of the macaque brain that addresses data on audio-visual mirror neurons. In addition, the paper offers an explicit hypothesis on how to embed a macaque-like mirror system in a larger human-like circuit with the capacity for imitation by both direct and indirect routes. Implications for the study of speech are briefly noted.
Biological Cybernetics | 2015
James Bonaiuto; Michael A. Arbib
The activity of certain parietal neurons has been interpreted as encoding affordances (directly perceivable opportunities) for grasping. Separate computational models have been developed for infant grasp learning and affordance learning, but no single model has yet combined these processes in a neurobiologically plausible way. We present the Integrated Learning of Grasps and Affordances (ILGA) model that simultaneously learns grasp affordances from visual object features and motor parameters for planning grasps using trial-and-error reinforcement learning. As in the Infant Learning to Grasp Model, we model a stage of infant development prior to the onset of sophisticated visual processing of hand–object relations, but we assume that certain premotor neurons activate neural populations in primary motor cortex that synergistically control different combinations of fingers. The ILGA model is able to extract affordance representations from visual object features, learn motor parameters for generating stable grasps, and generalize its learned representations to novel objects.
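The shape of such a trial-and-error loop can be shown in miniature. This is a hypothetical sketch, not the ILGA model itself: a linear map from object features to grasp parameters is perturbed on each trial, and perturbations that improve a scalar "grasp stability" reward are reinforced. The stand-in reward function, dimensions, and learning rates are all assumptions.

```python
# Hypothetical sketch of reward-driven grasp parameter learning
# (perturbation-based reinforcement; not the ILGA implementation).
import numpy as np

rng = np.random.default_rng(1)

N_FEAT, N_PARAM = 4, 3   # e.g. object size/shape features -> wrist angle,
                         # aperture, finger-synergy weight (all hypothetical)

def grasp_stability(params, best_params):
    """Stand-in environment: reward peaks when the grasp parameters
    match an object-specific optimum unknown to the learner."""
    return -np.linalg.norm(params - best_params)

def train(features, best_params, trials=500, lr=0.05, sigma=0.1):
    W = rng.normal(0, 0.1, (N_PARAM, N_FEAT))
    baseline = grasp_stability(W @ features, best_params)
    for _ in range(trials):
        noise = rng.normal(0, sigma, N_PARAM)      # exploratory perturbation
        r = grasp_stability(W @ features + noise, best_params)
        # Reinforce the perturbation in proportion to reward improvement
        W = W + lr * (r - baseline) * np.outer(noise, features)
        baseline += 0.1 * (r - baseline)           # running reward baseline
    return W @ features

features = np.array([0.8, 0.2, 0.5, 0.1])   # toy object description
optimum = np.array([0.3, -0.2, 0.6])        # toy "stable grasp" parameters
print(train(features, optimum))              # learned parameters approach optimum
```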
Biological Cybernetics | 2010
James Bonaiuto; Michael A. Arbib
Mind & Society | 2008
Michael A. Arbib; James Bonaiuto
Embodied Communication in Humans and Machines | 2008
Stefan Kopp; Ipke Wachsmuth; James Bonaiuto; Michael A. Arbib