Claudio Brunelli
Nokia
Publication
Featured research published by Claudio Brunelli.
EURASIP Journal on Wireless Communications and Networking | 2011
Omer Anjum; Tapani Ahonen; Fabio Garzia; Jari Nurmi; Claudio Brunelli; Heikki Berg
Software Defined Radio (SDR) is an innovative approach that is becoming an increasingly promising technology for future mobile handsets. Universities and industry have introduced several proposals in the field of embedded systems to support SDR applications. This article presents an overview of current platforms and analyzes the related architectural choices, the current issues in SDR, and potential future trends.
International Symposium on System-on-Chip | 2008
Heikki Berg; Claudio Brunelli; Ulf Lücking
Applying design principles and methodologies established in the software domain, and adapting them to the complete execution environment, provides new perspectives for future multi-radio computers. In order to share the underlying hardware resources efficiently, the overall system architecture and the related programming model have to support dynamic behavior and extensive configuration changes at run-time. The requirements for such a multi-radio computer are demanding, as there will be various radio access stacks with inhomogeneous characteristics executing in parallel. This implies a configuration and control framework, besides the different protocol stacks, that is aware of the managed system in every state and is capable of dynamically scheduling different dataflow graphs corresponding to the applications running on the underlying system. This paper presents the main concepts behind such a reactive system, focusing in particular on the proposed model of computation, and gives an overview of the software architecture and the related problems to be solved.
Signal Processing Systems | 2009
Claudio Brunelli; Heikki Berg; David Guevorkian
Sine is one of the fundamental mathematical functions and is widely used in a number of application fields. In particular, signal processing and telecommunications need to calculate the sine and cosine of numerical values for several different purposes. One of the challenges in implementing sine calculation in Digital Signal Processing (DSP) has been the choice of a method based on rational functions, which would allow sine to be calculated in a digital computer system. One possibility is to exploit Taylor polynomials, even though their main drawback is a relatively high degree (and thus computational load) even for relatively low-precision approximations. This paper proposes a variable-precision method that allows approximating sine and cosine functions with Taylor polynomials while significantly reducing the computational load required. Our analysis shows that the method achieves the same accuracy as other approximation methods at a lower computational cost.
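As a rough illustration of the baseline the abstract refers to, the sketch below evaluates a plain truncated Taylor polynomial for sine. It is not the paper's variable-precision method, which the abstract does not describe; the function name `taylor_sin`, the `terms` parameter, and the range-reduction step are all assumptions made for this example.

```python
import math


def taylor_sin(x, terms=6):
    """Approximate sin(x) with the first `terms` non-zero Taylor terms about 0.

    Hypothetical helper for illustration only; `terms` controls the
    polynomial degree (2 * terms - 1) and therefore the precision.
    """
    # Reduce the argument to [-pi, pi] so the truncated series stays accurate.
    x = math.remainder(x, 2.0 * math.pi)
    # Horner-style evaluation in x^2:
    # sin(x) = x * (1 - x^2/(2*3) * (1 - x^2/(4*5) * (1 - ...)))
    x2 = x * x
    acc = 1.0
    for k in range(terms - 1, 0, -1):
        acc = 1.0 - x2 / ((2 * k) * (2 * k + 1)) * acc
    return x * acc
```

Raising `terms` lowers the error at the cost of more multiplications, which is the degree-versus-load trade-off the abstract mentions.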
International Symposium on System-on-Chip | 2009
Fabio Garzia; Roberto Airoldi; Jari Nurmi; Carmelo Giliberto; Claudio Brunelli
This paper describes the implementation of an FFT on a system based on a GP core and a reconfigurable coarse-grain accelerator. The entire system has been prototyped on an Altera Stratix II device. On the prototype, a 1024-point FFT gives a 40X speed-up in comparison with the software implementation. The 1024-point FFT is executed in 400 μs. Considering an ASIC synthesis of the coarse-grain array, the 1024-point FFT is executed in 42 μs, against the 104 μs of a DSP implementation.
International Symposium on System-on-Chip | 2010
Claudio Brunelli; Roberto Airoldi; Jari Nurmi
This paper analyzes the execution performance of a few commonly used versions of the Fast Fourier Transform (FFT) algorithm. We started from C implementations of these FFT algorithms, then profiled their execution on a series of multicore platforms, both embedded and not. This work has several aims. First, we tried to find out how well different FFT algorithms map to different multicore processors. Second, we wanted to understand how well the performance scales with the number of cores, and how well current compilers exploit the available hardware when compared to handcrafted programs. Results show that the Radix-4 Cooley-Tukey FFT is on average the best among the algorithms considered.
IEEE EUROCON | 2009
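For readers unfamiliar with the radix-4 variant named in the results, the following is a minimal recursive decimation-in-time sketch. It is an illustration only: the paper profiled C programs, and the function name `fft_radix4` and the restriction to input lengths that are powers of 4 are assumptions of this example.

```python
import cmath


def fft_radix4(x):
    """Radix-4 decimation-in-time FFT (illustrative sketch, not the paper's code).

    The input length must be a power of 4.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Transform the four interleaved sub-sequences x[r], x[r+4], x[r+8], ...
    f0, f1, f2, f3 = (fft_radix4(x[r::4]) for r in range(4))
    out = [0j] * n
    for k in range(q):
        w1 = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        a = f0[k]
        b = w1 * f1[k]
        c = w1 * w1 * f2[k]
        d = w1 * w1 * w1 * f3[k]
        # Radix-4 butterfly: a 4-point DFT whose factors are 1, -1j, -1, 1j.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out
```

The butterfly needs no multiplications beyond the three twiddle products, because multiplying by 1j or -1j only swaps and negates real and imaginary parts.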
Fabio Garzia; Claudio Brunelli; Carmelo Giliberto; Jari Nurmi
This paper describes the implementation of W-CDMA cell search on a reconfigurable architecture. The architecture is composed of a general-purpose processor core and a reconfigurable coarse-grain array accelerator. In this work we used a computational kernel mapped on the reconfigurable array to execute the W-CDMA target cell search. The acceleration produces a 26X total speed-up against a 4X overhead in area.
International Symposium on System-on-Chip | 2011
Claudio Brunelli; Eero Aho; Heikki Berg
This paper presents several OpenCL implementations of Cholesky decomposition, a very popular algorithm used in linear algebra and signal processing applications. The Cholesky algorithm is a very interesting candidate for OpenCL implementation since it contains sequential parts besides parallel ones. Furthermore, one step involves just a small amount of calculation. These characteristics pose challenges which call for suitable techniques to overcome the limitations of the language. We propose several versions of the implementation of the Cholesky algorithm, then provide an analysis of the trade-off between complexity and performance offered by each of them. We also analyze the differences between execution of the program on GPU and on multicore CPU.
International Symposium on System-on-Chip | 2010
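The mix of sequential and parallel parts mentioned in the abstract can be seen in a plain column-by-column formulation. The sketch below is sequential Python, not OpenCL and not any of the paper's versions; the function name `cholesky` and the list-of-lists matrix layout are assumptions of this example.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (illustrative sketch).

    `a` is a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Sequential part: column j depends on all previous columns.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        # Small step: a single square root per column.
        l[j][j] = math.sqrt(s)
        # Parallel part: every row i below the diagonal is independent
        # of the others, so these updates could run concurrently.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l
```

In this formulation the outer loop over `j` cannot be parallelized directly, while the inner loop over `i` can, which is one way to read the abstract's remark about sequential and parallel parts.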
Tomi Aarnio; Claudio Brunelli; Timo Viitanen
We propose a novel hardware design for decoding compressed floating-point textures in a graphics processing unit (GPU). Our decoder is based on the NXR texture format, which provides lossy, fixed-rate 6∶1 compression for floating-point textures. Our design exploits the constraints of the compressed pixel blocks to produce the correct output using only fixed-point arithmetic. This results in significantly lower silicon area occupation compared to pre-existing floating-point texture decoders.
Archive | 2011
Claudio Brunelli; Heikki Berg; David Guevorkian
Archive | 2008
Claudio Brunelli; Heikki Berg; David Guevorkian