Antonio Deledda | Researchain

Publication

Featured researches published by Antonio Deledda.

international solid-state circuits conference | 2005

XiSystem: a XiRisc-based SoC with a reconfigurable IO module

Andrea Cappelli; Andrea Lodi; Massimo Bocchi; Claudio Mucci; Massimiliano Innocenti; C. De Bartolomeis; Luca Ciccarelli; Roberto Giansante; Antonio Deledda; Fabio Campi; Mario Toma; Roberto Guerrieri

In the nanometer era, the increase in nonrecurring engineering costs is a challenge for SoCs that can be faced through a standardization process. Hardware specialization of a standard platform to a given application can be achieved by exploiting reconfigurable technology. This paper presents a XiSystem SoC, which integrates two different field-programmable devices to provide application-specific computing blocks and IOs. A XiRisc reconfigurable processor is exploited to achieve more than one order of magnitude speed-up and energy consumption reduction vis-à-vis a DSP-like processor, while an eFPGA is integrated in the system in order to make it flexible enough to support various IO ports and protocols. The reconfigurable IO device is also utilized for pre/post data processing and implementation of some standard computational blocks.

design, automation, and test in europe | 2007

A dynamically adaptive DSP for heterogeneous reconfigurable platforms

Fabio Campi; Antonio Deledda; Matteo Pizzotti; Luca Ciccarelli; Pier Luigi Rolandi; Claudio Mucci; Andrea Lodi; Arseni Vitkovski; Luca Vanzolini

This paper describes a digital signal processor based on a multi-context, dynamically reconfigurable datapath, suitable for inclusion as an IP-block in complex SoC design projects. The IP was realized in 90 nm CMOS technology. The most relevant features offered by the proposed architecture with respect to the state of the art are zero overhead for switching between successive configurations, significant computational density per unit area and energy on computational kernels (an average of 2 GOPS/mm², 0.2 GOPS/mW) and relatively small area occupation (18 mm²), making it suitable for acceleration or upgrade of multi-core heterogeneous embedded platforms. The processor is delivered with a software tool chain providing the application developer with algorithmic analysis and design space exploration based on ANSI C, with no utilization of hardware-related constructs or description languages.

design, automation, and test in europe | 2008

Design of a HW/SW communication infrastructure for a heterogeneous reconfigurable processor

Antonio Deledda; Claudio Mucci; Arseni Vitkovski; M. Kuehnle; F. Ries; Michael Huebner; Jürgen Becker; Philippe Bonnot; A. Grasset; Philippe Millet; Marcello Coppola; Lorenzo Pieralisi; Riccardo Locatelli; Giuseppe Maruccia; Fabio Campi; T. DeMarco

Reconfigurable architectures and NoC (Network-on-Chip) have introduced new research directions for technology and flexibility issues, which have been largely investigated in the last decades. Exploiting run-time adaptivity opens a new area of research by considering dynamic reconfiguration. In this paper, we present the architecture and associated development tools of a heterogeneous reconfigurable SoC, focusing on the chosen communication infrastructure. The SoC integrates units of various sizes of reconfiguration granularity. The included NoC approach demonstrates the mentioned benefits and scalability for current and future SoC designs. On a reference 90 nm CMOS implementation, the described interconnect system works at the system reference frequency of 200 MHz, sustaining the required run-time bandwidth on a set of reference applications, at a cost of less than 10% in area and power consumption with respect to the overall system.

international symposium on system-on-chip | 2007

Intelligent cameras and embedded reconfigurable computing: a case-study on motion detection

Claudio Mucci; Luca Vanzolini; Antonio Deledda; Fabio Campi; Gerard Gaillat

Image processing for intelligent cameras, like those used in video surveillance applications, implies computationally demanding algorithms activated as a function of unpredictable events, such as the content of the image or user requests. For such applications, hardwired acceleration must be restricted to a minimum subset of kernels, due to the increasing NREs when application updates become necessary. Embedded reconfigurable processors, coupling in the same computing engine a general-purpose embedded processor and field-programmable fabrics, provide an appealing trade-off point between pure software and dedicated hardware acceleration. As a case study, this paper presents the implementation of a set of image processing operators utilized for motion detection on the DREAM adaptive DSP. With respect to pure software solutions, the proposed implementation achieves a performance improvement of 2-3 orders of magnitude, while retaining the same degree of programmability and, from the end-user point of view, the same economic perspectives as processor-based approaches.

design, automation, and test in europe | 2007

Implementation of AES/Rijndael on a dynamically reconfigurable architecture

Claudio Mucci; Luca Vanzolini; Andrea Lodi; Antonio Deledda; Roberto Guerrieri; Fabio Campi; Mario Toma

Reconfigurable architectures provide the user the capability to couple performance typical of hardware design with the flexibility of software. This paper presents the design of AES/Rijndael on a dynamically reconfigurable architecture. A performance improvement of three orders of magnitude was shown compared to the reference code, and a speed-up of up to 24× with respect to fast C implementations on a RISC processor. A maximum throughput of 546 Mbit/sec is achieved. Compared to prior art, a better energy efficiency with respect to the other programmable solutions was shown, obtaining up to 3 Mbit/sec/mW.

international parallel and distributed processing symposium | 2005

A cycle-accurate ISS for a dynamically reconfigurable processor architecture

Claudio Mucci; Fabio Campi; Antonio Deledda; Alberto Fazzi; Mirco Ferri; Massimo Bocchi

Reconfigurable processor architectures (RAs) have proven to be an effective way to couple significant performance improvements with severe energy constraints, such as those imposed by modern portable real-time applications. XiRisc is a VLIW RISC processor architecture featuring a reconfigurable dataflow-oriented functional unit, the so-called PiCoGA, allowing run-time dynamic extension of the instruction set. In this paper, we propose a LISA-based instruction set simulator (ISS) for the reconfigurable processor, retargetable through a dynamically linked library that emulates instruction set extension. The ISS comprises a SystemC system-level model with embedded bus architecture and memory hierarchy (on-chip and off-chip) to provide a reconfigurable system-on-chip performance evaluator.

international symposium on system-on-chip | 2009

RTL-to-layout implementation of an embedded coarse grained architecture for dynamically reconfigurable computing in systems-on-chip

Fabio Campi; Ralf König; M. Dreschmann; Moritz Neukirchner; Damien Picard; Michael Jüttner; Eberhard Schüler; Antonio Deledda; Davide Rossi; Alberto Pasini; Michael Hübner; Jürgen Becker; Roberto Guerrieri

This paper describes the RTL-to-layout implementation of the PACT XPP-III coarse-grained reconfigurable architecture (CGRA). The implementation activity was strictly based on a hierarchical approach in order to exploit performance optimization at all levels, as well as guarantee maximum scalability and provide a portfolio of IP-blocks that could be reused to build different configurations and embodiments of the same CGRA template. The final result can be seamlessly introduced in any SoC design flow as an embedded accelerator. It is designed in STMicroelectronics 90 nm GP technology, occupies 42.5 mm², delivers 13 16-bit GOPS (0.8 GOPS/mW, 10 MOPS/mW) and has a measured maximum frequency of 150 MHz, requiring a measured 13 mW/MHz dynamic power and 93 mW static power. A silicon prototype was also produced embedding XPP-III in a complex system-on-chip including an ARM processor as system controller as well as different ASIC blocks.

IEEE Design & Test of Computers | 2008

An Interconnect Strategy for a Heterogeneous, Reconfigurable SoC

Matthias Kühnle; Michael Hübner; Jürgen Becker; Antonio Marcello Coppola; Lorenzo Pieralisi; Riccardo Locatelli; Giuseppe Maruccia; Tommaso DeMarco; Fabio Campi; Antonio Deledda; Claudio Mucci; Florian Ries

Data-intensive processing in embedded systems is receiving much attention in multimedia computing and high-speed telecommunications. The memory bandwidth problem of traditional von Neumann architectures, however, is impairing processor efficiency. On the other hand, ASIC designs suffer from skyrocketing manufacturing costs and long development cycles. This results in an increasing need for postfabrication programmability at both software and hardware levels. FPGAs provide maximum flexibility with their fine-grained architecture but bring severe overhead in timing, area, and power consumption. Word- or subword-oriented runtime reconfigurable architectures offer highly parallel, scalable solutions combining hardware performance with software flexibility. Their coarser granularity reduces area, delay, power consumption, and reconfiguration time, but they introduce trade-offs in processing-element design.

design, automation, and test in europe | 2008

Implementation of parallel LFSR-based applications on an adaptive DSP featuring a pipelined configurable Gate Array

Claudio Mucci; Luca Vanzolini; Ilario Mirimin; Daniele Gazzola; Antonio Deledda; Sebastian Goller; Joachim Knaeblein; Axel Schneider; Luca Ciccarelli; Fabio Campi

Linear feedback shift registers (LFSRs) are common structures in many application fields, including cryptography, digital broadcasting and communication. High-throughput requirements need highly parallel implementations, usually accomplished in state-of-the-art systems-on-chip (SoCs) with application-specific coprocessors. Although this approach achieves the required performance, it rapidly shows a lack of flexibility when those devices are proposed, for example, for multi-standard modems or for security applications in which run-time update can provide added value. This paper shows the implementation of parallel LFSR-based applications on an embedded adaptive DSP featuring a Pipelined Configurable Gate Array (PiCoGA). With respect to standard embedded FPGAs, pipelined devices usually provide better performance, e.g. in terms of speed, but they commonly show the undeniable drawback of additional design constraints. As a test case, we consider the implementation of the 32-bit CRC used in the Ethernet standard, which achieves up to ~25 Gbit/sec throughput on the target architecture, with a parallel LFSR processing 128 bits at a time; this is comparable to the performance offered by some ASIC devices.
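
The idea of a parallel LFSR can be illustrated in software. The following is a minimal Python sketch, not the PiCoGA mapping described in the paper: it computes the Ethernet CRC-32 first with a bit-serial LFSR and then 128 bits per step, using the fact that the LFSR update is linear over GF(2). The function names (`lfsr_step`, `build_columns`, `crc32_serial`, `crc32_parallel`) are hypothetical and chosen only for this illustration; the block-wise version assumes the input length is a multiple of the block width.

```python
import zlib

POLY = 0xEDB88320  # reflected form of the CRC-32 polynomial used by Ethernet


def lfsr_step(state, bit):
    """Advance the serial LFSR by one input bit (bits are consumed LSB first)."""
    feedback = (state ^ bit) & 1
    state >>= 1
    if feedback:
        state ^= POLY
    return state


def crc32_serial(data):
    """Bit-serial CRC-32: one LFSR step per input bit."""
    state = 0xFFFFFFFF
    for byte in data:
        for i in range(8):
            state = lfsr_step(state, (byte >> i) & 1)
    return state ^ 0xFFFFFFFF


def build_columns(width=128):
    """Precompute how each state bit and each input bit affects the state
    after `width` serial steps. Valid because the LFSR is linear over GF(2)."""
    state_cols = []
    for k in range(32):
        s = 1 << k
        for _ in range(width):
            s = lfsr_step(s, 0)
        state_cols.append(s)
    input_cols = []
    for j in range(width):
        s = lfsr_step(0, 1)              # inject input bit j at step j
        for _ in range(width - j - 1):   # then run the remaining steps
            s = lfsr_step(s, 0)
        input_cols.append(s)
    return state_cols, input_cols


def crc32_parallel(data, width=128):
    """Block-wise CRC-32: consume `width` bits per update (assumption:
    len(data) * 8 is a multiple of `width`)."""
    if (len(data) * 8) % width != 0:
        raise ValueError("data length must be a multiple of the block width")
    state_cols, input_cols = build_columns(width)
    nbytes = width // 8
    state = 0xFFFFFFFF
    for off in range(0, len(data), nbytes):
        # Little-endian packing puts bit j of the block at integer bit j.
        block = int.from_bytes(data[off:off + nbytes], "little")
        nxt = 0
        for k in range(32):
            if (state >> k) & 1:
                nxt ^= state_cols[k]
        for j in range(width):
            if (block >> j) & 1:
                nxt ^= input_cols[j]
        state = nxt
    return state ^ 0xFFFFFFFF


if __name__ == "__main__":
    sample = bytes(range(64))  # four 128-bit blocks
    print(crc32_serial(sample) == crc32_parallel(sample) == zlib.crc32(sample))
```

In hardware, each precomputed column corresponds to a fixed XOR network, so one 128-bit block can be absorbed per update instead of one bit; the software loops above only emulate that structure and say nothing about the paper's actual timing or resource use.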

international symposium on circuits and systems | 2006

A stream register file unit for reconfigurable processors

Fabio Campi; P. Zoffoli; Claudio Mucci; Massimo Bocchi; Antonio Deledda; M. De Dominicis; Arseni Vitkovski

This paper presents a local buffer memory in the form of a stream register file (SRF) that was developed in order to connect, in a compiler-friendly pattern, large-bandwidth run-time configurable logic units in processor-based SoCs. The proposed SRF offers the host SoC system performance speedups in the range of 4×, with area/power overhead in the order of 6%. The described hardware and algorithm mapping strategy was implemented on silicon in a SoC based on the PiCoGA reconfigurable architecture. The SoC provides an average of 450 MOPS (mega operations per second) in STM 90 nm CMOS technology running at 100 MHz.
