P.G. Fernandez | Researchain

Publication

Featured researches published by P.G. Fernandez.

midwest symposium on circuits and systems | 2000

A RNS-based matrix-vector-multiply FCT architecture for DCT computation

P.G. Fernandez; Antonio G. García; Javier Ramírez; L. Parrilla; A. Lloris

A Field-Programmable Logic (FPL) implementation of the Discrete Cosine Transform (DCT) based on the Residue Number System (RNS) is presented. It uses a combination of the Fast Cosine Transform (FCT) algorithm and the matrix-vector multiplication (MVM). This paper shows that the RNS-based FCT-MVM implementation provides a throughput improvement of up to 72% over the equivalent binary system, while its advantage over the binary distributed arithmetic implementation is up to 128%.

international symposium on circuits and systems | 2000

A new architecture to compute the discrete cosine transform using the quadratic residue number system

Javier Ramírez; Antonio G. García; P.G. Fernandez; L. Parrilla; A. Lloris

A new methodology to compute the N-point DCT (Discrete Cosine Transform) in the QRNS (Quadratic Residue Number System) is presented, with a significant improvement in complexity and speed compared to the corresponding binary version. The reduction in the total number of arithmetic adders and multipliers is up to 21% and 26% for an 8-point and a 16-point DCT, respectively. In addition, large and slow binary adders and multipliers with a long carry propagation delay are replaced by high-speed small word-length modular adders and LUT (Look-Up Table) multipliers. When a Field Programmable Logic (FPL) implementation of the proposed architecture for the 8-point DCT on Altera FLEX10KE devices is considered, throughput is three times better than that obtained with the corresponding fixed-point 2s complement binary implementation.

signal processing systems | 2000

Fast RNS-based 2D-DCT computation on field-programmable devices

P.G. Fernandez; Antonio G. García; Javier Ramírez; A. Lloris

This paper shows the implementation of an 8×8 2D-DCT (discrete cosine transform) processor based on the residue number system (RNS). It makes use of a fast cosine transform (FCT) algorithm that requires a single multiplication stage for each signal path, while most other algorithms include paths with more than one multiplication. The row-column decomposition technique is used and each 1D-DCT processor requires only 14 multipliers and 32 adders and subtractors. The proposed RNS-based 2D-DCT processor provides a throughput improvement over the equivalent 2s complement system of up to 147% when 8-bit moduli are used. This is achieved due to the synergy between RNS and modern FPL device families.
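The row-column decomposition mentioned in this abstract can be illustrated with a short floating-point sketch. It is not the paper's RNS or FCT design: the 1D transform below is a direct orthonormal DCT-II chosen for clarity, and the names `dct_1d` and `dct_2d_row_column` are hypothetical.

```python
import math


def dct_1d(x):
    """Direct orthonormal DCT-II of a sequence (O(n^2), not a fast algorithm)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def dct_2d_row_column(block):
    """2D DCT computed as 1D DCTs over the rows, then over the columns."""
    # Pass 1: transform every row.
    rows = [dct_1d(row) for row in block]
    # Pass 2: transform every column of the row-transformed block.
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols[v][u] holds coefficient (u, v); transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    flat = [[1.0] * 8 for _ in range(8)]
    coeffs = dct_2d_row_column(flat)
    # A constant block has all of its energy in the DC coefficient.
    print(round(coeffs[0][0], 6))
```

Because the 2D transform is separable, the same 1D processor can be reused for both passes, which is why the abstract quotes resource counts per 1D-DCT processor.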

asilomar conference on signals, systems and computers | 1999

A new implementation of the discrete cosine transform in the residue number system

P.G. Fernandez; Antonio G. García; Javier Ramírez; L. Parrilla; A. Lloris

A field-programmable logic (FPL) implementation of a discrete cosine transform (DCT) based on the residue number system (RNS) is presented. Compared with a binary distributed arithmetic implementation, the presented architecture provides approximately 21% throughput improvement. Moreover, the performance improvement over a conventional binary implementation is up to 103%. This is achieved due to the synergy between RNS and modern FPL device families.

signal processing systems | 2001

Index-based RNS DWT architectures for custom IC designs

Javier Ramírez; P.G. Fernandez; Uwe Meyer-Bäse; Fred J. Taylor; Antonio G. García; A. Lloris

The design of high-performance, high-precision, real-time digital signal processing (DSP) systems, such as those associated with wavelet signal processing, is a challenging problem. This paper reports on the innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks. The disclosed system uses an enhanced index-transformation defined over Galois fields to efficiently support different wavelet filter instantiations without adding any extra cost or additional lookup tables (LUT). An exhaustive comparison against existing twos complement (2C) designs for different custom IC technologies was carried out. These structures have been demonstrated to be well suited for field programmable logic (FPL) assimilation as well as for CBIC (cell-based integrated circuit) technologies.
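As a rough illustration of the idea behind an index transformation over a Galois field, the sketch below shows the textbook version only: in GF(p), every nonzero element is a power of a generator, so a modular multiplication becomes an addition of indices modulo p - 1. The enhanced scheme described in the abstract is not reproduced here, and the modulus `P = 31`, the generator `G = 3`, and all function names are assumptions made for the example.

```python
P = 31  # hypothetical small prime modulus, chosen for illustration
G = 3   # assumed generator of the multiplicative group modulo 31


def build_tables(p=P, g=G):
    """Build index (discrete log) and inverse-index (power) tables for GF(p)."""
    index_of = {}
    power_of = []
    value = 1
    for k in range(p - 1):
        index_of[value] = k
        power_of.append(value)
        value = (value * g) % p
    # A true generator visits every nonzero residue exactly once.
    if len(index_of) != p - 1:
        raise ValueError("g does not generate the multiplicative group mod p")
    return index_of, power_of


INDEX_OF, POWER_OF = build_tables()


def index_multiply(a, b, p=P):
    """Multiply modulo p using index addition instead of a multiplier."""
    a %= p
    b %= p
    # Zero has no index, so it is handled as a special case.
    if a == 0 or b == 0:
        return 0
    return POWER_OF[(INDEX_OF[a] + INDEX_OF[b]) % (p - 1)]


if __name__ == "__main__":
    print(index_multiply(17, 29), (17 * 29) % P)
```

In hardware terms, the two dictionaries stand in for lookup tables and the multiplier is replaced by a modulo p - 1 adder.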

midwest symposium on circuits and systems | 2000

Implementation of RNS analysis and synthesis filter banks for the orthogonal discrete wavelet transform over FPL devices

Javier Ramírez; Antonio G. García; L. Parrilla; A. Lloris; P.G. Fernandez

RNS architectures to compute the orthogonal DWT and its inverse are shown. The relation between the coefficients of the analysis and synthesis filters allows one to halve the number of required LUTs and modular adders. Simulations of one- and two-octave implementations using VHDL and FPL devices show a performance advantage of up to 23.45% and 96.58%, respectively, when compared to the 2s complement arithmetic versions.

field programmable logic and applications | 2000

Analysis of RNS-FPL Synergy for High Throughput DSP Applications: Discrete Wavelet Transform

Javier Ramírez; Antonio G. García; P.G. Fernandez; Luis Parilla; Antonio Lloris-Ruíz

This paper focuses on the implementation over FPL devices of high throughput DSP applications taking advantage of RNS arithmetic. The synergy between the RNS and modern FPGA device families, providing built-in tables and fast carry and cascade chains, makes it possible to accelerate MAC intensive real-time and DSP systems. In this way, a slow high dynamic range binary 2s complement system can be partitioned into various parallel and high throughput small word-length RNS channels without inter-channel carry dependencies. To illustrate the design methodology, novel RNS-based architectures for multi-octave orthogonal DWT and its inverse are implemented using structural level VHDL synthesis. Area analysis and performance simulation are conducted. A relevant throughput improvement for the proposed RNS-based solution is obtained, compared to the equivalent 2s complement implementation.
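The partitioning described in this abstract can be sketched in software: a wide multiply-accumulate is split into independent small-modulus channels, and the result is recovered with the Chinese Remainder Theorem. This is a minimal sketch under stated assumptions, not the paper's architecture; the moduli set and the function names are hypothetical, and it needs Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod

# Hypothetical pairwise-coprime 8-bit-class moduli, chosen for illustration.
MODULI = (251, 253, 255, 256)


def to_rns(x, moduli=MODULI):
    """Map an integer to its residues; negative inputs wrap modulo each m."""
    return tuple(x % m for m in moduli)


def rns_mac(acc, a, b, moduli=MODULI):
    """acc + a * b computed per channel, with no carries between channels."""
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Chinese Remainder Theorem reconstruction into the signed range."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # pow(..., -1, m) is the inverse
    x = total % big_m
    # Interpret the upper half of the range as negative values.
    return x - big_m if x >= big_m // 2 else x


def dot_rns(xs, ys, moduli=MODULI):
    """Dot product accumulated entirely in the residue domain."""
    acc = to_rns(0, moduli)
    for x, y in zip(xs, ys):
        acc = rns_mac(acc, to_rns(x, moduli), to_rns(y, moduli), moduli)
    return from_rns(acc, moduli)


if __name__ == "__main__":
    print(dot_rns([3, -7, 120], [45, 66, -2]))
```

The result is only valid while the true value stays within half the product of the moduli in magnitude, which is the dynamic-range constraint any RNS design has to budget for.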

asilomar conference on signals, systems and computers | 2000

A new RNS architecture for the computation of the scaled 2D-DCT on field-programmable logic

P.G. Fernandez; Javier Ramírez; Antonio G. García; L. Parrilla; A. Lloris

This paper shows the implementation of an 8×8 scaled two-dimensional Discrete Cosine Transform processor (2D-DCT) based on the Residue Number System (RNS). The row-column decomposition technique is used and each 1D-DCT processor has been derived by the application of a previously developed scaled Fast Cosine Transform (FCT) algorithm that requires a reduced number of multiplications. Simulations of binary 2s complement and RNS versions of the scaled 2D-DCT processor using VHDL over Field-Programmable Logic (FPL) devices provide a throughput improvement for the proposed RNS-based 2D-DCT processor of up to 148% when 8-bit moduli are used. This is achieved due to the synergy between RNS and modern FPL device families.

asilomar conference on signals, systems and computers | 2000

Implementation of canonical and retimed RNS architectures for the orthogonal 1-D DWT over FPL devices

Javier Ramírez; Antonio G. García; P.G. Fernandez; L. Parrilla; A. Lloris

This paper shows the design and implementation of canonical and retimed RNS (Residue Number System) architectures for the direct and inverse orthogonal Discrete Wavelet Transform (DWT). They allow sharing of the low-pass filter Look-Up Tables (LUTs) to compute the two filter bank outputs. These architectures enable the use of fine-grain pipelining in the modular adder trees, the use of only one adder tree and the redistribution of module multiplier LUTs. Their implementation on Field-Programmable Logic (FPL) devices was carried out to compare both solutions with binary 2s complement arithmetic solutions.

asilomar conference on signals, systems and computers | 1999

A novel RNS-based SIMD RISC processor for digital signal processing

Javier Ramírez; Antonio G. García; L. Parrilla; P.G. Fernandez; A. Lloris

The architectural design and field-programmable logic (FPL) implementation of a digital signal processor (DSP) based on the residue number system (RNS) is presented. This processor makes use of the intrinsic parallelism of RNS for high speed digital signal processing. It consists of a certain number of RNS channels that perform data processing in parallel without any dependency between them. In this way, efficiency is achieved by the reduction in channel word-length. The processor has been modelled at the structural level using VHDL and implemented in Altera FLEX10K devices. Comparison with commercial DSPs for several applications reveals an improvement of up to 133%.
