Daniele Ludovici | Researchain




Publication

Featured research published by Daniele Ludovici.

Design, Automation, and Test in Europe | 2011

Exploiting Network-on-Chip structural redundancy for a cooperative and scalable built-in self-test architecture

Alessandro Strano; Crispín Gómez; Daniele Ludovici; Michele Favalli; María Engracia Gómez; Davide Bertozzi

This paper proposes a built-in self-test/self-diagnosis procedure at start-up of an on-chip network (NoC). Concurrent BIST operations are carried out after reset at each switch, so that test application time remains scalable with network size. The key principle consists of exploiting the inherent structural redundancy of the NoC architecture in a cooperative way, thus detecting faults in test pattern generators too. At-speed testing of stuck-at faults can be performed in less than 1200 cycles regardless of network size, with a hardware overhead of less than 11%.

Design, Automation, and Test in Europe | 2009

Assessing fat-tree topologies for regular network-on-chip design under nanoscale technology constraints

Daniele Ludovici; F. Gilabert; Simone Medardoni; Crispín Gómez; María Engracia Gómez; Pedro López; Georgi Gaydadjiev; Davide Bertozzi

Most past evaluations of fat-trees for on-chip interconnection networks rely on oversimplified or even unrealistic architecture and traffic pattern assumptions, and very few layout analyses are available to relieve practical feasibility concerns in nanoscale technologies. This work aims at providing an in-depth assessment of the physical synthesis efficiency of fat-trees and at extrapolating silicon-aware performance figures to back-annotate in the system-level performance analysis. A 2D mesh is used as a reference architecture for comparison, and a 65 nm technology is targeted by our study. Finally, in an attempt to mitigate the implementation cost of k-ary n-tree topologies, we also review an alternative unidirectional multi-stage interconnection network which is able to simplify the fat-tree architecture with minimal impact on performance.
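As a rough illustration of why a fat-tree and a 2D mesh differ in implementation cost, the sketch below counts switches and inter-switch links using the standard textbook formulas for k-ary n-trees and k-ary n-meshes. It is not taken from the paper: the function names are hypothetical, the mesh is assumed to attach one core per switch, and the counts say nothing about wire length or switch radix, which the physical synthesis in the study actually measures.

```python
def k_ary_n_tree(k: int, n: int) -> dict:
    """Textbook resource counts for a k-ary n-tree (fat-tree)."""
    return {
        "terminals": k ** n,               # cores attached at the leaves
        "switches": n * k ** (n - 1),      # n stages of k^(n-1) switches
        "switch_links": (n - 1) * k ** n,  # bidirectional links between stages
    }


def k_ary_n_mesh(k: int, n: int) -> dict:
    """Textbook resource counts for a k-ary n-mesh (n=2 gives a 2D mesh)."""
    return {
        "terminals": k ** n,                      # assumption: one core per switch
        "switches": k ** n,
        "switch_links": n * (k - 1) * k ** (n - 1),  # no wrap-around links
    }


if __name__ == "__main__":
    # Two ways to connect 16 cores.
    print("4-ary 2-tree:", k_ary_n_tree(4, 2))
    print("4x4 2D mesh: ", k_ary_n_mesh(4, 2))
```

For 16 cores, the fat-tree needs fewer but higher-radix switches than the mesh; whether that trade-off pays off on silicon is the question the layout analysis addresses.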

Complex, Intelligent and Software Intensive Systems | 2009

Designing Regular Network-on-Chip Topologies under Technology, Architecture and Software Constraints

F. Gilabert; Daniele Ludovici; Simone Medardoni; Davide Bertozzi; L. Benini; Georgi Gaydadjiev

Regular multi-core processors are appearing in the embedded system market as high performance software programmable solutions. The use of regular interconnect fabrics for them allows fast design time, ease of routing, predictability of electrical parameters and good scalability. k-ary n-mesh topologies are candidate solutions for these systems, borrowed from the domain of off-chip interconnection networks. However, the on-chip integration has to deal with unique challenges at different levels of abstraction. From a technology viewpoint, interconnect reverse scaling causes critical paths to go across global links. Poor interconnect performance might also impact IP core speed depending on the synchronization mechanism at the interface. Finally, this might also conflict with the requirements that communication libraries employed in the MPSoC domain pose on the underlying interconnect fabric. This paper provides a comprehensive overview of these topics, by characterizing physical feasibility of representative k-ary n-mesh topologies and by providing silicon-aware system-level performance figures.

Networks on Chips | 2009

Comparing tightly and loosely coupled mesochronous synchronizers in a NoC switch architecture

Daniele Ludovici; Alessandro Strano; Davide Bertozzi; Luca Benini; Georgi Gaydadjiev

With the advent of Networks-on-Chip (NoCs), interest in mesochronous synchronizers is again on the rise due to the intricacies of skew-controlled chip-wide clock tree distribution. Recently proposed schemes agree on a source synchronous design style with some form of ping-pong buffering to counter timing and metastability concerns. However, the integration issues of such synchronizers in a NoC setting are still largely unexplored. Most schemes are in fact placed between communicating switches, thus neglecting the abrupt increase of buffering resources needed at switch input stages. This paper goes a step further and aims at deep integration of the synchronizer in the switch architecture, thus merging key tasks such as synchronization, buffering and flow control into a single architecture block. This paper compares the integrated and the loosely coupled solutions from a performance and area viewpoint, while devoting special attention to their robustness with respect to physical design parameters.

International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation | 2010

A library of dual-clock FIFOs for cost-effective and flexible MPSoC design

Alessandro Strano; Daniele Ludovici; Davide Bertozzi

Customization of IP blocks in a multi-processor system-on-chip (MPSoC) is the historical approach to the cost-effective implementation of such systems. A recent trend consists of structuring an MPSoC into loosely coupled voltage and frequency islands to meet tight power budgets. In this context, synchronization between islands of synchronicity becomes a major design issue. Dual-clock FIFOs compare favorably with synchronizer-based designs and pausible clocking interfaces from a performance viewpoint, but incur a significant area, power and latency overhead. This paper proposes a library of dual-clock FIFOs for cost-effective MPSoC design, where each architecture variant in the library has been designed to match well-defined operating conditions at the minimum implementation cost. Each FIFO synchronizer is suitable for plug-and-play insertion into the NoC architecture, and selection depends on the performance requirements of the synchronization interface at hand. Above all, components of our synchronization library have not been conceived in isolation, but have been tightly co-designed with the switching fabric of the on-chip interconnection network, thus making conscious use of power-hungry buffering resources and leading to affordable implementations in the resource-constrained MPSoC domain.

Proceedings of the Fifth International Workshop on Interconnection Network Architecture | 2011

Mesochronous NoC technology for power-efficient GALS MPSoCs

Daniele Ludovici; Alessandro Strano; Georgi Gaydadjiev; Davide Bertozzi

MPSoCs are today frequently designed as the composition of multiple voltage/frequency islands, thus calling for a GALS clocking style. In this context, the on-chip interconnection network can either be inferred as a single and independent clock domain or be distributed among core domains. This paper targets the former scenario, since it results in the homogeneous speed of the NoC switching elements. From a physical design viewpoint, however, the main issues lie in the chip-wide extension of the network domain and in the growing uncertainties affecting nanoscale silicon technologies. This paper proves that by partitioning the network into mesochronous domains and merging synchronizers with NoC building blocks, two main advantages can be achieved. First, it is possible to evolve synchronous networks to mesochronous ones with marginal performance and area overhead. Second, the mesochronous NoC exposes more degrees of freedom for power optimization.

ACM Transactions on Embedded Computing Systems | 2013

A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs

Alberto Ghiribaldi; Daniele Ludovici; Francisco Triviño; Alessandro Strano; Jose Flich; José L. Sánchez; Francisco José Alfaro; Michele Favalli; Davide Bertozzi

Networks-on-chip need to survive manufacturing faults in order to sustain yield. An effective testing and configuration strategy, however, implies two opposite requirements. On one hand, a fast and scalable built-in self-testing and self-diagnosis procedure has to be carried out concurrently at NoC switches. On the other hand, programming the NoC routing mechanism to go around faulty links and switches can be optimally performed by a centralized controller with global network visibility. To the best of our knowledge, this article proposes for the first time a global network testing and configuration strategy that meets the opposite requirements by means of a fault-tolerant dual network architecture and a fast configuration algorithm for the most common failure patterns. Experimental results report an area overhead as low as 12.5% with respect to the baseline switch architecture while achieving a high degree of fault tolerance. In fact, even when multiple stuck-at faults are considered, the capability of fault masking by the dual network is always over 80%, and the support for multiple link failures is more than 90% in the presence of two unusable links in the main network with minimum set-up times.
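To make the idea of a centralized controller routing around faulty links concrete, here is a minimal sketch that assumes a 2D mesh and uses a plain breadth-first search over the links reported as healthy. It is not the configuration algorithm of the article: the function names and data layout are hypothetical, and the sketch ignores deadlock freedom, which a real routing configuration would also have to guarantee.

```python
from collections import deque


def mesh_neighbors(node, width, height):
    """Yield the mesh neighbors of a switch at coordinates (x, y)."""
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            yield (nx, ny)


def route_around_faults(src, dst, width, height, faulty_links):
    """Return a shortest switch-to-switch path avoiding faulty links, or None.

    faulty_links is an iterable of (node_a, node_b) pairs; links are treated
    as bidirectional, so the order within a pair does not matter.
    """
    faulty = {frozenset(link) for link in faulty_links}
    parent = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:   # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in mesh_neighbors(node, width, height):
            if nxt not in parent and frozenset((node, nxt)) not in faulty:
                parent[nxt] = node
                queue.append(nxt)
    return None                   # destination unreachable


if __name__ == "__main__":
    # 3x3 mesh with the direct link out of the source marked as faulty.
    print(route_around_faults((0, 0), (2, 0), 3, 3, [((0, 0), (1, 0))]))
```

The search relies on the global view of the fault map that only a central controller has, which is the reason the abstract contrasts it with the per-switch, concurrent self-test phase.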

2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip | 2011

System-level infrastructure for boot-time testing and configuration of networks-on-chip with programmable routing logic

Alberto Ghiribaldi; Daniele Ludovici; Michele Favalli; Davide Bertozzi

Great Lakes Symposium on VLSI | 2009

Capturing topology-level implications of link synthesis techniques for nanoscale networks-on-chip

Daniele Ludovici; Georgi Gaydadjiev; Davide Bertozzi; Luca Benini

In the context of nanoscale networks-on-chip (NoCs), each link implementation solution is not just a specific synthesis optimization technique with local performance and power implications, but gives rise to a well-differentiated point in the architecture design space. This is an effect of the tight interaction existing between architecture and physical design layers in nanoscale technologies. This work assesses several NoC link inference techniques (buffering options, link pipelining) by means of commercial backend synthesis tools, taking the system-level perspective. In fact, performance speed-ups and power overhead are not evaluated for the links in isolation but for the network topology as a whole, thus showing their sensitivity to the link inference strategy. k-ary n-mesh topologies are considered for the sake of analysis, in that they provide a range of topologies with increasing total wirelength.

Network on Chip Architectures | 2011

Contrasting multi-synchronous MPSoC design styles for fine-grained clock domain partitioning: the full-HD video playback case study

Hervé Tatenguem; Daniele Ludovici; Alessandro Strano; Davide Bertozzi; Helmut Reinig

Fine-grained (per-core) multi-synchronous systems call for new clocking strategies and new architecture design techniques. This paper compares two fundamental multi-synchronous implementation variants, based on the extensive use of dual-clock FIFOs and of mesochronous synchronizers respectively. The architecture-homogeneous experimental setting, the cost-effective merging of synchronizers with NoC switch buffers, the sharing of as many physical synthesis steps as possible between the two architectures and the requirements of a realistic full-HD video playback application are the key innovations of this study.
