Roberto A. Santiago | Researchain

Publication

Featured research published by Roberto A. Santiago.

Neural Networks | 1995

Adaptive critic designs: a case study for neurocontrol

Danil V. Prokhorov; Roberto A. Santiago; Donald C. Wunsch

For the first time, different adaptive critic designs (ACDs), a conventional proportional-integral-derivative (PID) regulator, and backpropagation of utility are compared on the same control problem: automatic aircraft landing. The original problem proved to contain little challenge, since various conventional and neural network techniques had solved it very well. After the problem had been made much more difficult by a change of parameters, increasingly better performance was observed in going from the simplest ACD to more sophisticated designs, with dual heuristic programming ranked best of all. This case study is of use in general intelligent control problems because it provides an example of the capabilities of different adaptive critic designs.

International Symposium on Neural Networks | 2004

Context discerning multifunction networks: reformulating fixed weight neural networks

Roberto A. Santiago

Research in recurrent neural networks has produced a genre of networks referred to as fixed weight neural networks (FWNNs), which have the ability to adapt without changing explicit weights. FWNNs are unique in that they adapt their processing based on the spatiotemporal characteristics of the incoming signal without need for weight change. As a result, a single FWNN is able to model and control many families of disparate systems without weight changes. FWNNs pose an interesting model for contextual memory in neural systems. The work reported takes an FWNN, decomposes it, and analyzes its internal workings. Using new insight, FWNNs are reformulated into a simpler structure, context discerning multifunction networks (CDMN).

Biological Cybernetics | 2008

An implementation of reinforcement learning based on spike timing dependent plasticity

Patrick D. Roberts; Roberto A. Santiago; Gerardo Lafferriere

An explanatory model is developed to show how synaptic learning mechanisms modeled through spike-timing dependent plasticity (STDP) can result in long-term adaptations consistent with reinforcement learning models. In particular, the reinforcement learning model known as temporal difference (TD) learning has been used to model neuronal behavior in the orbitofrontal cortex (OFC) and ventral tegmental area (VTA) of the macaque monkey during reinforcement learning. While some research has observed, empirically, a connection between STDP and TD, there has not been an explanatory model directly connecting TD to STDP. Through analysis of the learning dynamics that result from a general form of an STDP learning rule, the connection between STDP and TD is explained. We further demonstrate that an STDP learning rule drives the spike probability of a reward predicting neuronal population to a stable equilibrium. The equilibrium solution has an increasing slope where the steepness of the slope predicts the probability of the reward, similar to the results from electrophysiological recordings suggesting a different slope that predicts the value of the anticipated reward of Montague and Berns [Neuron 36(2):265–284, 2002]. This connection begins to shed light on more recent data gathered from VTA and OFC which are not well modeled by TD. We suggest that STDP provides the underlying mechanism for explaining reinforcement learning and other higher level perceptual and cognitive function.
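
For readers unfamiliar with the temporal difference (TD) learning that this abstract refers to, the sketch below shows a generic textbook tabular TD(0) update. It is not the authors' STDP model. The chain task, the function name `td0_chain`, and the values of `alpha` and `gamma` are illustrative assumptions that do not come from the paper.

```python
def td0_chain(n_states=5, reward=1.0, alpha=0.1, gamma=0.9, episodes=2000):
    """Tabular TD(0) on a deterministic left-to-right chain (hypothetical task).

    The agent steps from state 0 to a terminal state; the only reward is
    delivered on the final transition. Returns the learned state values.
    """
    # values[n_states] is the terminal state and is never updated (stays 0).
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            r = reward if s_next == n_states else 0.0
            # TD error: reward plus discounted next value, minus current estimate.
            td_error = r + gamma * values[s_next] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]


if __name__ == "__main__":
    for state, value in enumerate(td0_chain()):
        print(f"V({state}) = {value:.4f}")
```

In this toy setting the learned values rise toward the rewarded transition, approaching `gamma ** k` for a state that is `k` steps before the reward.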

International Symposium on Neural Networks | 2002

Proposed framework for applying adaptive critics in real-time realm

George G. Lendaris; Roberto A. Santiago; Michael S. Carroll

Adaptive critics have shown much promise for designing optimal nonlinear controllers in an off-line context. Still, their greatest potential exists in the context of reconfigurable control, that is, real-time controller redesign in response to (substantial) changes in plant dynamics. To accomplish this, a framework is proposed for the application of adaptive critics in real-time control (for those critic methods requiring a model of the plant). The framework is presented in the context of work being done in reconfigurable flight control by the NW Computational Intelligence Lab (NWCIL) at Portland State University. The proposal incorporates recent work (by others) in fast and efficient on-line plant identification, considerations for bounding the computational costs of converging neural networks, and a novel approach (by us) toward the task of assuring system stability during the adaptation process. The potential and limitations of the proposed framework are discussed. It is suggested that with the recent rapid reduction in computational barriers, only certain theoretical issues remain as the central barriers to successful on-line application of the methods.

International Symposium on Neural Networks | 2005

Reinforcement learning and the frame problem

Roberto A. Santiago; George G. Lendaris

The frame problem, originally proposed within AI, has grown to be a fundamental stumbling block for building intelligent agents and modeling the mind. The source of the frame problem stems from the nature of symbolic processing. Unfortunately, connectionist approaches have long been criticized as having weaker representational capabilities than symbolic systems so have not been considered by many. The equivalence between the representational power of symbolic systems and connectionist architectures is redressed through neural manifolds, and reveals an associated frame problem. Working within the construct of neural manifolds, the frame problem is solved through the use of contextual reinforcement learning, a new paradigm recently proposed.

BMC Neuroscience | 2007

Spike timing dependent plasticity implements reinforcement learning

Roberto A. Santiago; Patrick D. Roberts; Gerardo Lafferriere

An explanatory model is developed to show how synaptic learning mechanisms modeled through spike-timing dependent plasticity (STDP) can result in longer term adaptations consistent with reinforcement learning models. In particular, the reinforcement learning model known as temporal difference (TD) learning has been used to model neuronal behavior in the orbitofrontal cortex (OFC) and ventral tegmental area (VTA) of the macaque monkey during reinforcement learning. While some research has observed, empirically, a connection between STDP and TD, there is as yet no explanatory model directly connecting TD to STDP. Through analysis of the STDP rule, the connection between STDP and TD is explained. We further show that an STDP learning rule drives the spike probability of reward predicting neurons to a stable equilibrium. The equilibrium solution has an increasing slope where the steepness of the slope predicts the probability of the reward. This connection begins to shed light on more recent data gathered from VTA and OFC which are not well modeled by TD. We suggest that STDP provides the underlying mechanism for explaining reinforcement learning and other higher level perceptual and cognitive function.

International Symposium on Neural Networks | 2003

Accelerating critic learning in approximate dynamic programming via value templates and perceptual learning

Thaddeus T. Shannon; Roberto A. Santiago; G.L. George

The concepts of value templates and perceptual learning are introduced as refinements to the reinforcement learning (RL) paradigm. We demonstrate a method for accelerating dual heuristic programming (DHP) critic training using value templates and perceptual learning. Both faster and more stable learning are achieved by using the value template and utilizing its inherent constraints to regularize the perceptual learning task. The method is demonstrated by tuning a neurofuzzy control system for a highly nonlinear second-order plant proposed by Sanner and Slotine. We take advantage of the TSK model framework throughout to keep the controller, critic, and model components used in DHP highly interpretable.

International Symposium on Neural Networks | 2003

An automated method for neuronal spike source identification

Roberto A. Santiago; James McNames; K. Burchiel; George G. Lendaris

Analysis of microelectrode recordings (MER) of extracellular neuronal activity has gained increasing interest due to potential improvements to surgical techniques involving ablation or placement of deep brain stimulators, as is common in the treatment of Parkinson's disease. Critical to these procedures is the identification of different brain structures such as the globus pallidus internus (GPI). Evidence suggests that the spike trains from individual neurons contain enough information to identify the brain structure in which they are located. For the work reported here, spike train data gathered during surgical procedures from multiple patients is used. Using a moving window sampling approach, a novel feature extraction method for spike trains was developed. This method is then used in combination with a support vector classification algorithm. Results strongly indicate that the sampling methods reported here are able to extract the necessary information for highly accurate spike source identification.
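
The abstract does not specify the features that were extracted, so the following is only a generic sketch of what moving-window sampling of a spike train can look like. It counts spikes in overlapping windows. The function name `windowed_spike_counts`, the window and step sizes, and the choice of spike counts as the feature are all assumptions for illustration, not the authors' method.

```python
from bisect import bisect_left


def windowed_spike_counts(spike_times, window, step, t_end):
    """Count spikes in half-open windows [start, start + window).

    Windows begin at 0 and advance by `step`; only windows that fit
    entirely within [0, t_end] are included. All times share one unit.
    """
    times = sorted(spike_times)
    counts = []
    k = 0
    # Compute each start as k * step to avoid accumulating float error.
    while k * step + window <= t_end:
        start = k * step
        lo = bisect_left(times, start)
        hi = bisect_left(times, start + window)
        counts.append(hi - lo)
        k += 1
    return counts


if __name__ == "__main__":
    spikes = [0.1, 0.2, 0.9, 1.5, 1.6, 1.7]  # made-up spike times in seconds
    print(windowed_spike_counts(spikes, window=1.0, step=0.5, t_end=2.0))
```

A feature vector of this kind could then be passed to a classifier such as a support vector machine; that step is omitted here.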

International Conference of the IEEE Engineering in Medicine and Biology Society | 2002

Automatic target localization using microelectrode recordings

Roberto A. Santiago; James McNames; H. Falkenberg; Kim J. Burchiel

We describe an algorithm that objectively and automatically identifies target regions in the brain for ablation or stimulation during neurosurgery for Parkinson's disease and other movement disorders. The algorithm uses microelectrode recordings to distinguish between the target and adjacent anatomic structures during stereotactic neurosurgery. This algorithm uses a novel method of signal feature extraction that enables standard classification algorithms such as support vector machines to perform well. The algorithm was validated on microelectrode recordings acquired near the globus pallidus internus and labeled by the neurosurgeon.

Neurocomputing | 2007

Storage of auditory temporal patterns in the songbird telencephalon

Patrick D. Roberts; Roberto A. Santiago; Tarciso Velho; Claudio V. Mello

A quantitative model of auditory learning is presented to predict how auditory patterns are stored in the songbird auditory forebrain. This research focuses on the caudomedial nidopallium (NCM) in the songbird telencephalon, a candidate site for song perception and the formation of song auditory memories. The objective is to introduce simplified features of bird song that could be used by the auditory forebrain to identify and distinguish memorized songs. The results elucidate which biological mechanisms are sufficient for temporal pattern prediction and the storage of higher-order patterns, where by higher order we mean the specific arrangement of syllables into song motifs (phrases) to reveal neural mechanisms of syntax.
