Anders Berkeman
Lund University
Publication
Featured research published by Anders Berkeman.
IEEE Journal of Solid-State Circuits | 2000
Anders Berkeman; Viktor Öwall; Mats Torkelson
A combinatorial complex multiplier has been designed for use in a pipelined fast Fourier transform processor. The throughput of the processor is limited by the multiplication. Therefore, the multiplier is optimized to make the input-to-output delay as short as possible. A new architecture based on distributed arithmetic, Wallace trees, and carry-lookahead adders has been developed. The multiplier has been fabricated using standard cells in a 0.5-µm process and verified for functionality, speed, and power consumption. Running at 40 MHz, a multiplier with input wordlengths of 16+16 times 10+10 bits consumes 54% less power than a distributed arithmetic array multiplier fabricated under equal conditions.
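As a rough illustration of the distributed-arithmetic idea mentioned in this abstract, the sketch below computes a complex product by stepping through the bits of one operand and accumulating small lookup-table entries instead of using full multipliers. It is a behavioural model only and is not the architecture from the paper: the function name `da_complex_multiply`, the table layout, and the bit-serial loop are assumptions made for illustration, and the Wallace trees and carry-lookahead adders are not modelled.

```python
def da_complex_multiply(a: int, b: int, c: int, d: int, nbits: int = 16) -> tuple[int, int]:
    """Return (real, imag) of (a + jb) * (c + jd).

    Hypothetical behavioural sketch: a and b are two's complement integers
    that fit in nbits; c and d may be any integers.
    """
    lo, hi = -(1 << (nbits - 1)), (1 << (nbits - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("a and b must fit in nbits two's complement bits")

    # Lookup tables indexed by the bit pair (a_i, b_i), with index = 2*a_i + b_i.
    real_lut = [0, -d, c, c - d]  # a_i*c - b_i*d
    imag_lut = [0, c, d, c + d]   # a_i*d + b_i*c

    mask = (1 << nbits) - 1
    ua, ub = a & mask, b & mask   # raw two's complement bit patterns

    real = imag = 0
    for i in range(nbits):
        index = (((ua >> i) & 1) << 1) | ((ub >> i) & 1)
        # The most significant bit carries negative weight in two's complement.
        weight = -(1 << i) if i == nbits - 1 else (1 << i)
        real += weight * real_lut[index]
        imag += weight * imag_lut[index]
    return real, imag


print(da_complex_multiply(3, -4, 5, 7))
```

In hardware the per-bit table outputs would typically be summed in parallel rather than in a loop; the loop here only shows that the table-based decomposition gives the same result as direct complex multiplication.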
International Symposium on Circuits and Systems | 2003
Anders Berkeman; Viktor Öwall
The division operation is essential in many digital signal processing algorithms. For a hardware implementation, the requirements and constraints on the divider circuit differ significantly between applications. Therefore, it is not possible to design one divider component with optimal performance and cost for all target applications. Instead, the presented divider has a modular architecture, based on instantiation of small, efficient divider sub-blocks. The configuration of the divider architecture is set by a number of parameters controlling wordlength, number of quotient bits, number of clock cycles per operation, and fixed- or floating-point operation. Digit recurrence algorithms with carry-save arithmetic and on-the-fly two's complement output quotient conversion are used to make the sub-blocks small, fast, and power efficient. The modularity gives the designer freedom to vary the parameters and explore the design space. Two applications using the proposed divider are presented. Furthermore, an example divider circuit has been fabricated and performance measurements are included.
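To make the term "digit recurrence" concrete, the following minimal sketch shows a radix-2 restoring recurrence that produces one quotient bit per iteration, with the number of quotient bits as a parameter. It is an assumption-laden illustration, not the divider from the paper: the function name `digit_recurrence_divide` is hypothetical, operands are unsigned, and the carry-save arithmetic and on-the-fly two's complement conversion described in the abstract are left out.

```python
def digit_recurrence_divide(x: int, d: int, quotient_bits: int) -> tuple[int, int]:
    """Return (q, w) such that x * 2**quotient_bits == q * d + w and 0 <= w < d.

    Hypothetical sketch of a radix-2 restoring recurrence for 0 <= x < d.
    """
    if d <= 0 or not 0 <= x < d:
        raise ValueError("requires d > 0 and 0 <= x < d")
    if quotient_bits < 0:
        raise ValueError("quotient_bits must be non-negative")

    w = x  # partial remainder
    q = 0  # quotient, built up most significant bit first
    for _ in range(quotient_bits):
        w <<= 1                        # shift the partial remainder
        digit = 1 if w >= d else 0     # select the next quotient digit
        w -= digit * d                 # subtract the divisor if the digit is 1
        q = (q << 1) | digit           # append the digit to the quotient
    return q, w


print(digit_recurrence_divide(3, 7, 8))
```

Because each iteration yields one quotient bit, the loop count plays the role of the "number of quotient bits" parameter; a hardware design could unroll several iterations per clock cycle to trade area for latency.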
European Solid-State Circuits Conference | 1998
Anders Berkeman; Viktor Öwall; Mats Torkelson
A complex multiplier has been designed for use in a pipelined fast Fourier transform processor. The throughput of the processor is limited by the multiplication. Therefore, the multiplier is optimized to make the input-to-output delay as short as possible. A new architecture based on distributed arithmetic and Wallace trees has been developed and is compared to a previous multiplier realized as a regular distributed arithmetic array. The simulated gain in speed for the presented multiplier is about 100%. For verification, the multiplier has been fabricated in a three-metal-layer 0.5 µm CMOS process using a standard cell library. The fabricated multiplier chip has been functionally verified.
International Symposium on Circuits and Systems | 2000
Anders Berkeman; Viktor Öwall
In application-specific implementations of digital signal processing algorithms, optimization is important for a low-power solution, not only at block level but also between blocks. This paper presents a co-optimization of a fast Fourier transform and a finite impulse response filter in a silicon implementation of an acoustic echo canceller. The optimization gain can be measured in the number of operations and memory accesses performed per second, and therefore in processing power. The optimization can also be applied to other algorithms with a similar constellation of Fourier transforms and finite impulse response filters.
European Solid-State Circuits Conference | 2003
Anders Berkeman; Viktor Öwall
This paper presents a hardware implementation of a high-quality acoustic echo canceller for use in hands-free telecommunication systems. The implementation is based on an algorithm with no delay in the signal path, which is attractive for communication systems where low delay is crucial. However, a zero-delay algorithm has higher complexity than other canceller solutions. A custom silicon implementation fulfils the quality and real-time requirements while sustaining low power consumption. The fabricated processor contains two million transistors, and the core occupies 20 mm² in a 0.35 µm CMOS process. At a clock frequency of 16 MHz, the chip processes 16-bit samples at a rate of 16 kHz, while consuming 55 mW for uncorrelated input data.
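The figures quoted in this abstract imply a per-sample budget, worked out below. This is simple arithmetic on the stated clock frequency, sample rate, and power; it assumes the 55 mW figure applies at the 16 MHz operating point, and it is not a result reported in the paper.

```latex
\[
\frac{16\ \text{MHz}}{16\ \text{kHz}} = 1000\ \text{clock cycles per sample},
\qquad
\frac{55\ \text{mW}}{16\ \text{kHz}} \approx 3.4\ \mu\text{J per sample}.
\]
```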
International Conference on Electronics, Circuits and Systems | 1998
Anders Berkeman; Viktor Öwall; Mats Torkelson
A complex multiplier has been designed for use in a pipelined fast Fourier transform processor. The throughput of the processor is limited by the multiplication. Therefore, the multiplier is optimized to make the input-to-output delay as short as possible. A new architecture based on distributed arithmetic and Wallace trees has been developed and is compared to a previous multiplier realized as a regular distributed arithmetic array. The simulated gain in speed for the presented multiplier is approximately 100%. For verification, the multiplier is currently under fabrication in a three-metal-layer 0.5 µm CMOS process using a standard cell library.
Midwest Symposium on Circuits and Systems | 1999
Anders Berkeman; Viktor Öwall; Mats Torkelson
The high computational complexity of acoustic echo cancellation algorithms requires application-specific implementations to sustain real-time signal processing with affordable power consumption. This is especially true for systems where a delayless approach is considered important, e.g. wireless communication systems. This paper presents architectural considerations for reaching a feasible hardware solution.
NORCHIP | 1997
Anders Berkeman; Viktor Öwall; Mats Torkelson
NORSIG | 2002
Anders Berkeman; Viktor Öwall
ISSN: 1402-8662 | 2002
Anders Berkeman