Andreas Ehliar | Researchain

Publication

Featured researches published by Andreas Ehliar.

field-programmable logic and applications | 2007

An FPGA Based Open Source Network-on-Chip Architecture

Andreas Ehliar; Dake Liu

Networks on chip (NoCs) have long been seen as a potential solution to the problems encountered when implementing large digital hardware designs. In this paper we describe an open source FPGA based NoC architecture with low area overhead, high throughput and low latency compared to other published works. The architecture has been optimized for Xilinx FPGAs and the NoC is capable of operating at a frequency of 260 MHz in a Virtex-4 FPGA. We have also developed a bridge so that generic Wishbone bus compatible IP blocks can be connected to the NoC.

IEEE Transactions on Nuclear Science | 2012

A High-Rate Energy-Resolving Photon-Counting ASIC for Spectral Computed Tomography

Mikael Gustavsson; Farooq Amin; Anders Bjorklid; Andreas Ehliar; Cheng Xu; Christer Svensson

We describe a high-rate energy-resolving photon-counting ASIC aimed at spectral computed tomography. The chip has 160 channels and 8 energy bins per channel. It demonstrates a noise level of ENC = 214 electrons at 5 pF input load at a power consumption of < 5 mW/channel. The maximum count rate is 17 Mcps at a peak time of 40 ns, made possible through a new filter reset scheme, and the maximum read-out frame rate is 37 kframe/s.
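As a rough illustration of what "8 energy bins per channel" means, the sketch below sorts pulse heights into bins using ascending thresholds, which is one common way to describe energy-resolving photon counting. The threshold values, the function name `bin_counts`, and the sample pulses are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

def bin_counts(pulse_heights, thresholds):
    """Count pulses per energy bin.

    thresholds must be sorted ascending; bin i collects pulses with
    thresholds[i] <= height < thresholds[i + 1], and the last bin is open-ended.
    Pulses below the first threshold are discarded as noise.
    """
    counts = [0] * len(thresholds)
    for height in pulse_heights:
        index = bisect_right(thresholds, height) - 1
        if index >= 0:
            counts[index] += 1
    return counts

# Hypothetical thresholds in keV for one channel with 8 bins.
THRESHOLDS_KEV = [20, 30, 40, 50, 60, 70, 80, 90]
pulses = [5, 25, 25, 47, 95, 120, 60]
counts = bin_counts(pulses, THRESHOLDS_KEV)
```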

norchip | 2006

High Performance, Low Latency FPGA based Floating Point Adder and Multiplier Units in a Virtex 4

Per Karlström; Andreas Ehliar; Dake Liu

Since the invention of FPGAs, the increase in their size and performance has allowed designers to use FPGAs for more complex designs. FPGAs are generally good at bit manipulations and fixed point arithmetics but have a harder time coping with floating point arithmetics. In this paper we describe methods used to construct high performance floating point components in a Virtex-4. We have constructed a floating point adder/subtracter and multiplier which we then used to construct a complex radix-2 butterfly. Our adder/subtracter can operate at a frequency of 361 MHz in a Virtex-4SX35 (speed grade -12).
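For readers unfamiliar with the term, a complex radix-2 butterfly can be written entirely in terms of real multiplications, additions and subtractions, which is why an adder/subtracter and a multiplier are enough to build one. The sketch below is the textbook decimation-in-time form, not the paper's hardware design; the function name `butterfly` and the argument layout are assumptions made for illustration.

```python
def butterfly(ar, ai, br, bi, wr, wi):
    """Textbook radix-2 decimation-in-time butterfly on complex inputs.

    Computes (a + w*b, a - w*b) where a = ar + j*ai, b = br + j*bi and
    w = wr + j*wi is the twiddle factor. Uses 4 real multiplications and
    6 real additions/subtractions.
    """
    # Complex product t = w * b
    tr = br * wr - bi * wi
    ti = br * wi + bi * wr
    # Sum and difference outputs
    return (ar + tr, ai + ti), (ar - tr, ai - ti)

# a = 1, b = j, w = -j, so w*b = 1
upper, lower = butterfly(1.0, 0.0, 0.0, 1.0, 0.0, -1.0)
```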

multimedia signal processing | 2004

Using low precision floating point numbers to reduce memory cost for MP3 decoding

Johan Eilert; Andreas Ehliar; Dake Liu

The purpose of our work has been to evaluate the practicality of using a 16-bit floating point representation to store the intermediate sample values and other data in memory during the decoding of MP3 bit streams. A floating point number representation offers a better trade-off between dynamic range and precision than a fixed point representation for a given word length. Using a floating point representation means that smaller memories can be used which leads to smaller chip area and lower power consumption without reducing sound quality. We have designed and implemented a DSP processor based on 16-bit floating point intermediate storage. The DSP processor is capable of decoding all MP3 bit streams at 20 MHz and this has been demonstrated on an FPGA prototype.
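The trade-off between dynamic range and precision can be seen in a small experiment. The sketch below compares the relative rounding error of a 16-bit floating point value with that of a 16-bit fixed point value for samples of different magnitude. It assumes the IEEE 754 half-precision layout (via Python's `struct` format `e`) and a Q15 fixed point format; the 16-bit format actually used in the paper is not described here and may differ.

```python
import struct

def to_float16(x):
    """Round x to IEEE 754 half precision (assumed format) and back."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def to_q15(x):
    """Round x to 16-bit fixed point with 15 fractional bits, saturating to [-1, 1)."""
    n = max(-32768, min(32767, round(x * 32768)))
    return n / 32768

def rel_error(approx, exact):
    """Relative error of approx with respect to a non-zero exact value."""
    return abs(approx - exact) / abs(exact)

# Small samples lose most of their precision in fixed point but not in floating point.
for sample in (0.5, 0.01, 0.0001):
    print(sample, rel_error(to_float16(sample), sample), rel_error(to_q15(sample), sample))
```

For small sample values the fixed point format has only a few significant bits left, while the floating point format keeps roughly the same relative precision across its normal range.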

Proceedings of the 7th FPGAworld Conference | 2010

Optimizing Xilinx designs through primitive instantiation

Andreas Ehliar

This paper is intended as a guideline for people who are interested in manual instantiation of FPGA primitives as a way of improving the performance of an FPGA design. The focus of the paper is on designs where slice primitives like flip-flops and lookup tables are instantiated. Guidelines on how to develop a design with manual instantiation are presented together with a case study of a high performance bit-serial two's complement divider where a majority of the area is manually instantiated. This divider is capable of reaching a maximum frequency of 345 MHz in the fastest Virtex-4 while utilizing less than 150 LUTs thanks to the high amount of manual optimizations. An open source library containing modules intended to promote the structured development of modules with manually instantiated components is also presented.
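To give a feel for what a bit-serial divider does, the sketch below models generic restoring division that produces one quotient bit per loop iteration, much as serial hardware would produce one bit per clock cycle. This is a textbook algorithm chosen for illustration, not the divider from the paper; the function name `serial_divide`, the `width` parameter, and the sign handling (truncation toward zero) are assumptions.

```python
def serial_divide(dividend, divisor, width=16):
    """Signed restoring division, one quotient bit per iteration.

    Returns (quotient, remainder) with the quotient truncated toward zero
    and the remainder taking the sign of the dividend.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    negative = (dividend < 0) != (divisor < 0)
    n, d = abs(dividend), abs(divisor)
    if n >= (1 << width):
        raise ValueError("dividend magnitude does not fit in width bits")

    remainder = 0
    quotient = 0
    # Process the dividend from its most significant bit downwards.
    for i in range(width - 1, -1, -1):
        remainder = (remainder << 1) | ((n >> i) & 1)
        quotient <<= 1
        if remainder >= d:
            remainder -= d
            quotient |= 1

    if negative:
        quotient = -quotient
    if dividend < 0:
        remainder = -remainder
    return quotient, remainder
```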

IET Computers and Digital Techniques | 2008

High-performance, low-latency field-programmable gate array-based floating-point adder and multiplier units in a Virtex 4

Per Karlström; Andreas Ehliar; Dake Liu

There is increasing interest in floating-point arithmetics in field programmable gate arrays (FPGAs) because of the increase in their size and performance. FPGAs are generally good at bit manipulations and fixed-point arithmetics, but they have a harder time coping with floating-point arithmetics. An architecture used to construct high-performance floating-point components in a Virtex-4 FPGA is described in detail. Floating-point adder/subtracter and multiplier units have been constructed. The adder/subtracter can operate at a frequency of 377 MHz in a Virtex-4SX35 (speed grade -12).

field-programmable logic and applications | 2008

A high performance microprocessor with DSP extensions optimized for the Virtex-4 FPGA

Andreas Ehliar; Per Karlström; Dake Liu

As the use of FPGAs increases, the importance of highly optimized processors for FPGAs will increase. In this paper we present the microarchitecture of a soft microprocessor core optimized for the Virtex-4 architecture. The core can operate at 357 MHz, which is significantly faster than Xilinx's MicroBlaze architecture on the same FPGA. At this frequency it is necessary to keep the logic complexity down and this paper shows how this can be done while retaining sufficient functionality for a high performance processor.

field-programmable logic and applications | 2009

An ASIC perspective on FPGA optimizations

Andreas Ehliar; Dake Liu

In this paper we discuss how various design components perform in both FPGAs and standard cell based ASICs. We also investigate how various common FPGA optimizations will affect the performance and area of an ASIC port. We find that most techniques that are used to optimize a design for an FPGA will not have a negative impact on the area in an ASIC. The intended audience for this paper is engineers charged with creating designs or IP cores that are optimized for both FPGAs and ASICs.

field-programmable technology | 2014

Area efficient floating-point adder and multiplier with IEEE-754 compatible semantics

Andreas Ehliar

In this paper we describe an open source floating-point adder and multiplier implemented using a 36-bit custom number format based on radix-16 and optimized for the 7-series FPGAs from Xilinx. Although this number format is not identical to the single-precision IEEE-754 format, the floating-point operators are designed in such a way that the numerical results for a given operation will be identical to the result from an IEEE-754 compliant operator with support for round-to-nearest even, NaNs and Infs, and subnormal numbers. The drawback of this number format is that the rounding step is more involved than in a regular, radix-2 based operator. On the other hand, the use of a high radix means that the area cost associated with normalization and denormalization can be reduced, leading to a net area advantage for the custom number format, under the assumption that support for subnormal numbers is required. The area of the floating-point adder in a Kintex-7 FPGA is 261 slice LUTs and the area of the floating-point multiplier is 235 slice LUTs and 2 DSP48E blocks. The adder can operate at 319 MHz and the multiplier can operate at a frequency of 305 MHz.
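One way to see why a high radix can reduce the cost of normalization is to count the shift positions a normalizer has to support. In radix 16 the mantissa is shifted in steps of 4 bits, so a shifter needs far fewer distinct positions than a radix-2 one. The sketch below illustrates this with a hypothetical 24-bit mantissa; the width, the function name `normalize`, and the example value are assumptions, not details of the 36-bit format in the paper.

```python
def normalize(mantissa, width, digit_bits):
    """Shift mantissa left one digit at a time until its top digit is non-zero.

    digit_bits is 1 for radix 2 and 4 for radix 16.
    Returns (normalized mantissa, number of digit shifts performed).
    """
    if mantissa == 0:
        return 0, 0
    top_digit_mask = ((1 << digit_bits) - 1) << (width - digit_bits)
    shifts = 0
    while (mantissa & top_digit_mask) == 0:
        mantissa <<= digit_bits
        shifts += 1
    return mantissa, shifts

WIDTH = 24  # hypothetical mantissa width in bits
radix16_result = normalize(0x000123, WIDTH, 4)
radix2_result = normalize(0x000123, WIDTH, 1)

# Number of distinct shift amounts a normalizer must support.
radix16_positions = WIDTH // 4
radix2_positions = WIDTH // 1
```

In this example the radix-16 normalizer chooses among 6 shift amounts while the radix-2 normalizer chooses among 24, at the price of leaving up to three leading zero bits in the top digit.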

2014 International Symposium on Integrated Circuits (ISIC) | 2014

Challenging the limits of FFT performance on FPGAs (Invited paper)

Mario Garrido; Miguel Acevedo; Andreas Ehliar; Oscar Gustafsson

This paper analyzes the limits of FFT performance on FPGAs. For this purpose, an FFT generation tool has been developed. This tool is highly parameterizable and allows for generating FFTs with different FFT sizes and amounts of parallelization. Experimental results for FFT sizes from 16 to 65536, and 4 to 64 parallel samples have been obtained. They show that even the largest FFT architectures fit well in today's FPGAs, achieving throughput rates from several GSamples/s to tens of GSamples/s.
