
Publication


Featured research published by Gerald G. Pechanek.


Proceedings. 24th EUROMICRO Conference (Cat. No.98EX204) | 1998

The sum-absolute-difference motion estimation accelerator

Stamatis Vassiliadis; Edwin A. Hakkennes; J. S. S. M. Wong; Gerald G. Pechanek

We investigate the Sum Absolute Difference (SAD) operation, an operation frequently used by a number of algorithms for digital motion estimation. For this operation, we propose a single vector instruction that can be performed (in hardware) on an entire block of data in parallel. We investigate possible implementations of such an instruction. Assuming a machine cycle comparable to that of a two-cycle multiply, we show that for a block of 16×1 or 16×16, the SAD operation can be performed in 3 or 4 machine cycles, respectively. The proposed implementation operates as follows. First, we determine in parallel which operand in each pair is the smaller. Second, we compute the absolute difference of each pair by subtracting the smaller value from the larger, and finally we accumulate the results. The operations associated with the second and third steps are performed in parallel, resulting in a multiply-accumulate type of operation. Our approach also covers the Mean Absolute Difference (MAD) operation, excluding the final shift (division) operation.
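As a point of reference, the following is a minimal scalar C model of the SAD and MAD computations described above, following the paper's three steps (find the smaller operand, subtract it from the larger, accumulate). It is a behavioral sketch only, not the proposed parallel hardware, and the function names are ours.

#include <stdint.h>
#include <stddef.h>

/* Behavioral model of SAD over n pixel pairs. The hardware performs
 * the min/subtract/accumulate steps on all pairs in parallel; this
 * loop applies the same three steps sequentially. */
uint32_t sad(const uint8_t *a, const uint8_t *b, size_t n) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t lo = a[i] < b[i] ? a[i] : b[i];  /* step 1: smaller operand */
        uint8_t hi = a[i] < b[i] ? b[i] : a[i];
        acc += (uint32_t)(hi - lo);              /* steps 2 and 3: difference, accumulate */
    }
    return acc;
}

/* MAD is SAD divided by the pair count; with n a power of two
 * (e.g. 256 for a 16x16 block) the division reduces to a shift. */
uint32_t mad(const uint8_t *a, const uint8_t *b, size_t n) {
    return sad(a, b, n) / (uint32_t)n;
}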


Proceedings of the 26th Euromicro Conference. EUROMICRO 2000. Informatics: Inventing the Future | 2000

The ManArray™ embedded processor architecture

Gerald G. Pechanek; Stamatis Vassiliadis

The BOPS® ManArray™ architecture is presented as a scalable DAP platform for the embedded processor domain. In this domain, ManArray-based processors use a single architecture definition that supports multiple configurations of processing elements (PEs), from a low-end single PE to large arrays of PEs, and a single tool set. The ManArray (selectable) parallelism architecture mixes control-oriented operations, VLIWs, packed-data operations, and distributed array processing in a cohesive, independently selectable manner. In addition, scalable conditional execution and single-cycle communications across a high-connectivity, low-cost network are integrated into the architecture. This allows another level of selectivity that enhances the application of the parallel resources to high-performance algorithms. Coupled with the array DSP is a scalable DMA engine that runs in the background and provides programmer-selectable data-distribution patterns and a high-bandwidth data-streaming interface to system peripherals and global memory. This paper introduces the embedded scalable ManArray architecture and a number of benchmarks. For example, a standard ASIC flow DSP/coprocessor core, the BOPS2040, can process a distributed 256-point complex FFT in 425 cycles and an 8×8 2D IDCT that meets IEEE standards in 34 cycles.
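The packed-data operations mentioned above act on several subword values held in one register. As a generic illustration of the idea (a classic SWAR technique in C, not the actual ManArray instruction set), the following adds four 16-bit lanes packed in a 64-bit word while preventing carries from crossing lane boundaries:

#include <stdint.h>

/* Add four packed 16-bit lanes in a 64-bit word. Summing the low 15
 * bits of each lane keeps carries inside the lane; the lane sign bits
 * are then folded back in with XOR. */
uint64_t packed_add16x4(uint64_t a, uint64_t b) {
    const uint64_t H = 0x8000800080008000ULL;  /* top bit of each lane */
    return ((a & ~H) + (b & ~H)) ^ ((a ^ b) & H);
}

One such operation does the work of four scalar adds, which is the kind of data-level parallelism a packed-data instruction set exposes.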


Proceedings of an international workshop on VLSI for neural networks and artificial intelligence | 1995

A VLSI pipelined neuroemulator

José G. Delgado-Frias; Stamatis Vassiliadis; Gerald G. Pechanek; Wei Lin; Steven M. Barber; Hui Ding

Applications of, and interest in, artificial neural networks (ANNs) have been increasing in recent years. Applications include pattern matching, associative memory, image processing and word recognition (Simpson 1992). ANNs are a novel computing paradigm in which an artificial neuron produces an output that depends on its inputs (from other neurons), the strengths or weights associated with those inputs, and an activation function.


International Symposium on Neural Networks | 1996

A neuro-emulator with learning and virtual emulation capabilities

V. C. Aikens; J. G. Delgado-Frias; S. M. Barber; Gerald G. Pechanek; Stamatis Vassiliadis

In this paper we present and evaluate a novel neuro-emulator. The architecture of this neuro-emulator provides support for learning as well as handling large neural networks in virtual mode. We have identified a set of computational, communication and storage requirements for learning in artificial neural networks. These requirements are representative of a wide variety of algorithms for different styles of learning. The proposed neuro-emulator provides the computational ability for the stated requirements. While meeting all the identified requirements, the new architecture maintains high utilization of the machine's resources during learning. To show the capabilities of the proposed machine, we present four diverse learning algorithms. We include an evaluation of the machine's performance as well as a comparison with other architectures. It is shown that, with a modest amount of hardware, the proposed architecture yields an extremely high number of connections per second.
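The abstract does not name the four learning algorithms evaluated. As a hedged sketch of the computational pattern learning typically imposes (a multiply-accumulate per weight, which drives the connections-per-second figure), here is the classic delta rule for a single linear neuron in C; it is representative only, not necessarily one of the paper's four algorithms.

#include <stddef.h>

/* One delta-rule step for a linear neuron y = sum_j w[j] * x[j]:
 * each weight moves by eta * (target - y) * x[j]. */
void delta_rule_step(double *w, const double *x, size_t n,
                     double target, double eta) {
    double y = 0.0;
    for (size_t j = 0; j < n; j++)
        y += w[j] * x[j];              /* forward pass (recall) */
    double err = target - y;
    for (size_t j = 0; j < n; j++)
        w[j] += eta * err * x[j];      /* weight update (learning) */
}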


Proceedings of an international workshop on VLSI for neural networks and artificial intelligence | 1995

A dataflow approach for neural networks

Thomas F. Ryan; José G. Delgado-Frias; Stamatis Vassiliadis; Gerald G. Pechanek; Douglas M. Green

Artificial neural networks have been introduced as a novel computing paradigm (Kohonen 1988). Processing (or retrieving) in neural networks requires a collective interaction of a number of neurons. Outputs of neurons are computed based on the inputs from other neurons, the weights associated with those inputs, and a non-linear activation function. Specifically, most artificial neurons follow a mathematical model that is expressed as:

Y_i(t + 1) = F\left( \sum_{j = 1}^{N} W_{ij} Y_j(t) \right)    (1)

where W_{ij} is the weight, Y_j(t) is the neuron input, N is the number of neurons connected to neuron i, and F is a non-linear function, usually a sigmoid (Hopfield 1984).
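A direct scalar rendering of equation (1) in C, assuming the usual sigmoid for F; the function and variable names below are ours, for illustration only.

#include <math.h>
#include <stddef.h>

/* Sigmoid activation, the usual choice for F (Hopfield 1984). */
static double sigmoid(double v) {
    return 1.0 / (1.0 + exp(-v));
}

/* Equation (1): Y_i(t+1) = F( sum_j W_ij * Y_j(t) ) for every neuron i.
 * W is an N x N weight matrix stored row-major; y_now holds Y(t) and
 * y_next receives Y(t+1). */
void neuron_step(const double *W, const double *y_now, double *y_next, size_t N) {
    for (size_t i = 0; i < N; i++) {
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            s += W[i * N + j] * y_now[j];  /* weighted sum of inputs */
        y_next[i] = sigmoid(s);
    }
}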


Archive | 2001

Methods and apparatus for scalable instruction set architecture with dynamic compact instructions

Gerald G. Pechanek; Edwin Frank Barry; Juan Guillermo Revilla; Larry D. Larsen


Archive | 2000

Accessing tables in memory banks using load and store address generators sharing store read port of compute register file separated from address register file

Edwin Frank Barry; Charles W. Kurak; Gerald G. Pechanek; Larry D. Larsen



Archive | 2001

Methods and apparatus for dynamic instruction controlled reconfigurable register file with extended precision

Gerald G. Pechanek; Edwin Franklin Barry


Archive | 1995

Processor using folded array structures for transposition memory and fast cosine transform computation

Gerald G. Pechanek; Stamatis Vassiliadis



Archive | 2001

Merged control/process element processor for executing VLIW simplex instructions with SISD control/SIMD process mode bit

Gerald G. Pechanek; Juan Guillermo Revilla
