
Publication


Featured research published by Bertram E. Shi.


International Conference on Computer Vision | 2015

Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks

Lin Sun; Kui Jia; Dit Yan Yeung; Bertram E. Shi

Human actions in video sequences are three-dimensional (3D) spatio-temporal signals characterizing both the visual appearance and motion dynamics of the involved humans and objects. Inspired by the success of convolutional neural networks (CNN) for image classification, recent attempts have been made to learn 3D CNNs for recognizing human actions in videos. However, partly due to the high complexity of training 3D convolution kernels and the need for large quantities of training videos, only limited success has been reported. This motivated us to investigate a new deep architecture that can handle 3D signals more effectively. Specifically, we propose factorized spatio-temporal convolutional networks (FstCN) that factorize the original 3D convolution kernel learning as a sequential process of learning 2D spatial kernels in the lower layers (called spatial convolutional layers), followed by learning 1D temporal kernels in the upper layers (called temporal convolutional layers). We introduce a novel transformation and permutation operator to make factorization in FstCN possible. Moreover, to address the issue of sequence alignment, we propose an effective training and inference strategy based on sampling multiple video clips from a given action video sequence. We have tested FstCN on two commonly used benchmark datasets (UCF-101 and HMDB-51). Without using auxiliary training videos to boost the performance, FstCN outperforms existing CNN-based methods and achieves comparable performance with a recent method that benefits from using auxiliary training videos.
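The core idea, a 3D kernel split into 2D spatial and 1D temporal parts, can be illustrated with a small NumPy sketch (a toy with invented shapes and random kernels, not the authors' FstCN, which learns non-separable mappings across layers): when the kernel is exactly separable, full 3D convolution equals per-frame 2D convolution followed by 1D convolution along time.

```python
import numpy as np

rng = np.random.default_rng(0)
T, H, W = 8, 16, 16
video = rng.standard_normal((T, H, W))

# A separable 3D kernel: outer product of a 1D temporal and a 2D spatial part.
k_t = rng.standard_normal(T)
k_s = rng.standard_normal((H, W))
k_3d = k_t[:, None, None] * k_s[None, :, :]

def circ_conv(x, k):
    """Circular convolution via the FFT (output has the shape of x)."""
    return np.real(np.fft.ifftn(np.fft.fftn(x) * np.fft.fftn(k, s=x.shape)))

full_3d = circ_conv(video, k_3d)  # direct 3D convolution

# Factorized version: 2D spatial convolution per frame, then 1D temporal.
spatial = np.real(np.fft.ifft2(
    np.fft.fft2(video, axes=(1, 2)) * np.fft.fft2(k_s)[None], axes=(1, 2)))
factorized = np.real(np.fft.ifft(
    np.fft.fft(spatial, axis=0) * np.fft.fft(k_t)[:, None, None], axis=0))

print(np.allclose(full_3d, factorized))  # the two pipelines agree
```

FstCN uses this structure with learned (not separable-by-construction) kernels, inserting a transformation and permutation operator between the spatial and temporal stages.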


IEEE Transactions on Circuits and Systems I: Regular Papers | 2005

Neuromorphic implementation of orientation hypercolumns

Thomas Yu Wing Choi; Paul Merolla; John V. Arthur; Kwabena Boahen; Bertram E. Shi

Neurons in the mammalian primary visual cortex are selective along multiple stimulus dimensions, including retinal position, spatial frequency, and orientation. Neurons tuned to different stimulus features but the same retinal position are grouped into retinotopic arrays of hypercolumns. This paper describes a neuromorphic implementation of orientation hypercolumns, which consists of a single silicon retina feeding multiple chips, each of which contains an array of neurons tuned to the same orientation and spatial frequency, but different retinal locations. All chips operate in continuous time, and communicate with each other using spikes transmitted by the address-event representation protocol. This system is modular in the sense that orientation coverage can be increased simply by adding more chips, and expandable in the sense that its output can be used to construct neurons tuned to other stimulus dimensions. We present measured results from the system, demonstrating neuronal selectivity along position, spatial frequency and orientation. We also demonstrate that the system supports recurrent feedback between neurons within one hypercolumn, even though they reside on different chips. The measured results from the system are in excellent concordance with theoretical predictions.
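The address-event representation used for inter-chip communication can be sketched conceptually in Python (a toy raster with invented sizes, not the asynchronous hardware protocol): only the addresses of spiking neurons are placed on the shared channel, and a receiver reconstructs the spike raster from the event stream.

```python
import numpy as np

rng = np.random.default_rng(5)
T, N = 100, 16                                  # time steps, neurons (invented)
rates = rng.uniform(0, 0.2, N)
spikes = rng.random((T, N)) < rates             # dense spike raster

# AER encoding: transmit only (timestamp, neuron address) pairs for each
# spike; the bus carries addresses of events, not full images.
events = [(t, n) for t in range(T) for n in range(N) if spikes[t, n]]

# A receiving chip rebuilds the raster from the event stream.
recon = np.zeros((T, N), dtype=bool)
for t, n in events:
    recon[t, n] = True

print(np.array_equal(recon, spikes), len(events))
```

For sparse activity the event list is far smaller than the raster, which is what makes spike-based inter-chip links economical.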


IEEE Transactions on Circuits and Systems I: Regular Papers | 2007

Expandable Networks for Neuromorphic Chips

Paul Merolla; John V. Arthur; Bertram E. Shi; Kwabena Boahen

We have developed a grid network that broadcasts spikes (all-or-none events) in a multichip neuromorphic system by relaying them from chip to chip. The grid is expandable because, unlike a bus, its capacity does not decrease as more chips are added. The multiple relays do not increase latency because the grid's cycle time is shorter than the bus's. We describe an asynchronous relay implementation that automatically assigns chip addresses to indicate the source of spikes, encoded as word-serial address-events. This design, which is integrated on each chip, connects neurons at corresponding locations on each of the chips (pointwise connectivity) and supports oblivious, targeted, and excluded delivery of spikes. Results from two chips fabricated in 0.25-μm technology are presented, showing word rates of up to 45.4 Mevents/s.


IEEE Transactions on Circuits and Systems I: Regular Papers | 1992

Resistive grid image filtering: input/output analysis via the CNN framework

Bertram E. Shi; Leon O. Chua

The cellular neural network framework developed by L.O. Chua and L. Yang (IEEE Trans. Circuits Syst., vol. 35, Oct. 1988) is used to analyze the image filtering operation performed by the VLSI linear resistive grid. In particular, it is shown in detail how the resistive grid can be cast as a CNN, and the use of frequency-domain techniques to characterize the input-output behavior of resistive grids of both infinite and finite size is discussed. These results lead to a theoretical justification of one of the so-called folk theorems commonly held by researchers using resistive grids: resistive grids are robust in the presence of variations in the values of the resistors. An application to edge detection is proposed. In particular, it is shown that the filtering performed by the grid is similar to the exponential filter in the edge detection algorithm proposed by J. Shen and S. Castan (1986).
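The exponential-filter connection can be made concrete with a 1D discrete sketch (assumed parameter values; not the paper's frequency-domain derivation): the steady-state node voltages of a resistive grid solve a screened-Poisson-type linear system, and the impulse response decays geometrically away from the injection point.

```python
import numpy as np

N, lam = 101, 4.0                         # grid size, smoothing strength (assumed)
# Graph Laplacian of a 1D path: lateral resistors between neighboring nodes.
L = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L[0, 0] = L[-1, -1] = 1                   # open boundary

u = np.zeros(N)
u[N // 2] = 1.0                           # unit input at the center node
v = np.linalg.solve(np.eye(N) + lam * L, u)   # steady-state node voltages

# Away from the input, successive voltages fall off by a constant factor,
# i.e. the impulse response is a two-sided exponential, as in Shen-Castan.
ratios = v[N // 2 + 2 : N // 2 + 12] / v[N // 2 + 1 : N // 2 + 11]
print(ratios.std(), ratios[0])
```

For this grid the decay factor solves `lam*r**2 - (1 + 2*lam)*r + lam = 0`, so a larger lateral-to-vertical conductance ratio gives broader smoothing, which is one way to read the robustness result: small resistor variations only perturb the decay constant.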


IEEE Transactions on Circuits and Systems I: Regular Papers | 1998

Gabor-type filtering in space and time with cellular neural networks

Bertram E. Shi

Gabor filters are preprocessing stages in image-processing and computer-vision applications. One drawback is that they are computationally intensive on a digital computer. This paper describes the design of cellular neural networks (CNNs) which compute the outputs of filters similar to Gabor filters. Analog VLSI implementations of these CNNs might eventually relieve the computational bottleneck associated with Gabor filtering image-processing algorithms. The CNNs compute both the real and imaginary parts of the filter outputs simultaneously, which is an important feature in applying them in algorithms utilizing the phase of the Gabor output.
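A software stand-in shows what such a filter computes (a minimal NumPy sketch with invented parameters, not the CNN or its analog VLSI realization): a complex Gabor kernel yields the real and imaginary responses in one pass, from which magnitude and phase follow directly.

```python
import numpy as np

def gabor_kernel(size=21, wavelength=6.0, theta=0.0, sigma=3.0):
    """Complex Gabor kernel: a Gaussian envelope times a complex exponential."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate to the filter orientation
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return env * np.exp(1j * 2 * np.pi * xr / wavelength)

# Filter a toy image by circular convolution in the frequency domain.
img = np.zeros((64, 64))
img[:, 32:] = 1.0                                 # a vertical step edge
k = gabor_kernel(theta=0.0)
resp = np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k, s=img.shape))

magnitude, phase = np.abs(resp), np.angle(resp)   # both parts available at once
```

Having the real and imaginary parts simultaneously is what enables phase-based algorithms (e.g. phase-based disparity estimation) downstream.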


IEEE Transactions on Circuits and Systems I: Regular Papers | 2004

An ON-OFF orientation selective address event representation image transceiver chip

Thomas Yu Wing Choi; Bertram E. Shi; Kwabena Boahen

This paper describes the electronic implementation of a four-layer cellular neural network architecture implementing two components of a functional model of neurons in the visual cortex: linear orientation-selective filtering and half-wave rectification. Separate ON and OFF layers represent the positive and negative outputs of two phase-quadrature Gabor-type filters, whose orientation and spatial-frequency tunings are electronically adjustable. To enable the construction of a multichip network to extract different orientations in parallel, the chip includes an address event representation (AER) transceiver that accepts and produces two-dimensional images that are rate encoded as spike trains. It also includes routing circuitry that facilitates point-to-point signal fan-in and fan-out. We present measured results from a 32×64 pixel prototype, which was fabricated in the TSMC 0.25-μm process on a 3.84 mm × 2.54 mm die. Quiescent power dissipation is 3 mW and is determined primarily by the spike activity on the AER bus. Settling times are on the order of a few milliseconds. In comparison with a two-layer network implementing the same filters, this network results in a more symmetric circuit design with lower quiescent power dissipation, albeit at the expense of twice as many transistors.
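The ON/OFF split described above amounts to half-wave rectifying each signed filter output into two nonnegative channels, one per layer. A toy NumPy sketch (random stand-in values, not chip data):

```python
import numpy as np

rng = np.random.default_rng(1)
even = rng.standard_normal((4, 4))   # real (even) Gabor output, stand-in values
odd = rng.standard_normal((4, 4))    # imaginary (odd) Gabor output

def on_off(x):
    """Half-wave rectify a signed signal into nonnegative ON and OFF channels."""
    return np.maximum(x, 0.0), np.maximum(-x, 0.0)

even_on, even_off = on_off(even)
odd_on, odd_off = on_off(odd)

# The signed output is recoverable as ON - OFF, and at most one channel is
# active per pixel, matching the four-layer ON/OFF organization.
print(np.allclose(even_on - even_off, even))
```

Nonnegative channels matter here because spike rates on the AER bus cannot be negative.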


IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 2000

A low-power orientation-selective vision sensor

Bertram E. Shi

We describe the implementation of a focal plane array for computing the outputs of orientation-selective filters similar to Gabor filters using weak inversion transistor circuits. Both the scale and orientation selectivity of the filter can be tuned electronically. We exploit the concept of the transistor as a pseudo-conductance or diffuser and use current, rather than voltage, to represent signals of interest. This design enables energy-efficient computation of the filter responses with similar circuit complexity, as compared with previous strong inversion designs. Test results from a 12×14 pixel array fabricated in 1.2-μm technology are presented.


IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 1993

Design of linear cellular neural networks for motion sensitive filtering

Bertram E. Shi; Tamás Roska; Leon O. Chua

On the basis of the cellular neural network (CNN) paradigm, the authors propose a new architecture for spatio-temporal filtering called a CNN filter array and demonstrate the design of CNN filter arrays for motion sensitive filtering. One advantage of this approach to motion sensitive filtering is that a global convolution in space and time can be performed by using only spatially local interconnections and exploiting the continuous time dynamics of the CNN filter array. No storage of any past image frames is required.
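The no-frame-storage property can be illustrated with a toy 1D streaming filter (invented coefficients; a discrete-time stand-in for the CNN's continuous-time dynamics): a locally coupled first-order update, run frame by frame, produces the same output as an explicit global convolution over space and time.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 6, 32
frames = rng.standard_normal((T, N))          # a 1D "image" per time step
w = np.array([0.25, 0.5, 0.25])               # local spatial coupling (assumed)
a = 0.7                                        # temporal decay of the cell dynamics

# Streaming update: only the current state is stored, never past frames.
state = np.zeros(N)
outputs = []
for f in frames:
    spatial = np.convolve(f, w, mode="same")  # spatially local interconnection
    state = a * state + (1 - a) * spatial     # first-order cell dynamics
    outputs.append(state.copy())
outputs = np.stack(outputs)

# Same result as explicitly convolving with the separable space-time kernel
# h[tau] = (1 - a) * a**tau (temporal) times w (spatial), applied causally.
check = np.zeros_like(outputs)
for t in range(T):
    for tau in range(t + 1):
        check[t] += (1 - a) * a**tau * np.convolve(frames[t - tau], w, mode="same")
print(np.allclose(outputs, check))
```

The explicit version needs all past frames; the recursive version folds the entire temporal history into a single state array, which is the point of exploiting the array's own dynamics.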


International Conference on Development and Learning | 2012

A unified model of the joint development of disparity selectivity and vergence control

Yu Zhao; Constantin A. Rothkopf; Jochen Triesch; Bertram E. Shi

Reinforcement learning is a prime candidate as a general mechanism to learn how to progressively choose behaviorally better options in animals and humans. An important problem is how the brain finds representations of relevant sensory input to use for such learning. Extensive empirical data have shown that such representations are also adapted throughout development. Thus, learning sensory representations for tasks and learning of task solutions occur simultaneously. Here we propose a novel framework for efficient coding and task learning in the full perception and action cycle and apply it to the learning of disparity representation for vergence eye movements. Our approach integrates learning of a generative model of sensory signals and learning of a behavior policy with the identical objective of making the generative model work as effectively as possible. We show that this naturally leads to a self-calibrating system learning to represent binocular disparity and produce accurate vergence eye movements. Our framework is very general and could be useful in explaining the development of various sensorimotor behaviors and their underlying representations.
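The coupling of coding and behavior can be caricatured in a few lines (a deliberately trivial stand-in: a shift-based "generative model" and exhaustive action search in place of the paper's learned sparse coder and reinforcement learner): the action that best cancels the disparity is exactly the one that minimizes the model's reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(4)
true_disparity = 3                      # invented ground truth for the toy world
patch = rng.standard_normal(32)         # a random 1D "image" patch

def reconstruction_error(action):
    """Residual of a trivial generative model that predicts the right view
    from the left view; zero when vergence cancels the disparity."""
    left = patch
    right = np.roll(patch, true_disparity - action)
    return np.sum((left - right) ** 2)

# A greedy "policy": choose the vergence action that makes the generative
# model fit best, i.e. minimizes the coding error.
actions = np.arange(-5, 6)
best = actions[np.argmin([reconstruction_error(a) for a in actions])]
print(best)
```

The paper's system shares this single objective between two learners, so the sensory code and the eye-movement policy calibrate each other without an external error signal.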


International Symposium on Circuits and Systems | 2010

GPU implementation of fast Gabor filters

XinXin Wang; Bertram E. Shi

With their parallel multi-core architecture, Programmable Graphics Processing Units (GPUs) are well suited for implementing biologically-inspired visual processing algorithms, such as Gabor filtering. We compare several GPU implementations of Gabor filtering. On the same graphics card (an NVIDIA GeForce 9800 GTX+) and for convolution kernel radii from 8 to 48 pixels, an algorithm that decomposes Gabor filtering into a number of simpler steps is 2.2 to 33 times faster than direct 2D convolution and 2.8 to 6.6 times faster than an FFT-based approach. Surprisingly, in comparison with an optimized algorithm for Gabor filtering running on a PC (Core2 Duo 3.16 GHz), it is only 4–10 times faster. The PC can efficiently implement a recursive 1D filter, which requires far fewer arithmetic operations than convolution. However, due to data dependencies, this recursive filter typically runs slower than 1D convolution on the GPU. This highlights the importance of simultaneously considering both arithmetic and memory operations in porting algorithms to GPUs.
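One classic way to decompose Gabor filtering into simpler steps (shown here in 1D NumPy; the paper's GPU kernels are not reproduced) is demodulate-smooth-remodulate: multiplying by a complex carrier turns the Gabor convolution into a plain Gaussian convolution.

```python
import numpy as np

rng = np.random.default_rng(3)
N, sigma, omega = 128, 4.0, 0.5           # signal length, envelope, frequency (assumed)
f = rng.standard_normal(N)
x = np.arange(N)

u = np.arange(-16, 17)                    # kernel support
gauss = np.exp(-u**2 / (2 * sigma**2))    # Gaussian envelope
gabor = gauss * np.exp(1j * omega * u)    # 1D complex Gabor kernel

# Direct convolution with the complex Gabor kernel.
direct = np.convolve(f, gabor, mode="same")

# Decomposition: demodulate by the carrier, Gaussian-smooth, remodulate.
carrier = np.exp(1j * omega * x)
decomposed = carrier * np.convolve(f / carrier, gauss, mode="same")

print(np.allclose(direct, decomposed))
```

The Gaussian step is separable (and, on a CPU, realizable as a cheap recursive filter), which is why splitting the work this way pays off on both architectures, though for different reasons.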

Collaboration


Dive into Bertram E. Shi's collaborations.

Top Co-Authors

Jochen Triesch (Frankfurt Institute for Advanced Studies)
Eric K. C. Tsang (University of Science and Technology)
Yicong Meng (Hong Kong University of Science and Technology)
Stanley Y. M. Lam (Hong Kong University of Science and Technology)
Feijun Jiang (Hong Kong University of Science and Technology)
Pascale Fung (Hong Kong University of Science and Technology)
Thomas Yu Wing Choi (Hong Kong University of Science and Technology)
Tamás Roska (Pázmány Péter Catholic University)
Chong Zhang (Hong Kong University of Science and Technology)