Igor Loi
University of Bologna
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Igor Loi.
international solid-state circuits conference | 2010
G. Van der Plas; Paresh Limaye; Igor Loi; Abdelkarim Mercha; Herman Oprins; C. Torregiani; Steven Thijs; Dimitri Linten; Michele Stucchi; Guruprasad Katti; Dimitrios Velenis; Vladimir Cherman; Bart Vandevelde; V. Simons; I. De Wolf; Riet Labie; Dan Perry; S. Bronckers; Nikolaos Minas; Miro Cupac; Wouter Ruythooren; J. Van Olmen; Alain Phommahaxay; M. de Potter de ten Broeck; A. Opdebeeck; M. Rakowski; B. De Wachter; M. Dehan; Marc Nelis; Rahul Agarwal
In this paper key design issues and considerations of a low-cost 3-D Cu-TSV technology are investigated. The impact of TSV on BEOL interconnect reliability is limited, no failures have been observed. The impact of TSV stress on MOS devices causes shifts, further analysis is required to understand their importance. Thermal hot spots in 3-D chip stacks cause temperature increases three times higher than in 2-D chips, necessitating a careful thermal floorplanning to avoid thermal failures. We have monitored for ESD during 3-D processing and have found no events take place, however careful further monitoring is required. The noise coupling between two tiers in a 3-D chip-stack is 20 dB lower than in a 2-D SoC, opening opportunities for increased mixed signal system performance. The impact on digital circuit performance of TSVs is accurately modeled with the presented RC model and digital gates can directly drive signals through TSVs at high speed and low power. Experimental results of a 3-D Network-on-Chip implementation demonstrate that the NoC concept can be extended from 2-D SoC to 3-D SoCs at low area (0.018 ) and power (3%) overhead.
international conference on computer aided design | 2008
Igor Loi; Subhasish Mitra; Thomas H. Lee; Shinobu Fujita; Luca Benini
Three-dimensional die stacking integration provides the ability to stack multiple layers of processed silicon with a large number of vertical interconnects. Through Silicon Vias (TSVs) provide a promising area- and power-efficient way to support communication between different stack layers. Unfortunately, low TSV yield significantly impacts design of three-dimensional die stacks with a large number of TSVs. This paper presents a defect-tolerance technique for TSVs-based multi-bit links through an efficient and effective use of redundancy. This technique is ideally suited for three-dimensional network-on-chip (NoC) links. Simulation results demonstrate significant yield improvement, from 66% to 98%, with a low area cost (17% on a vertical link in a NoC switch, which leads a modest 2.1% increase the total switch area) in 130 nm technology, with minimal impact of VLSI design and test flows.
design, automation, and test in europe | 2011
Abbas Rahimi; Igor Loi; Mohammad Reza Kakoee; Luca Benini
Shared L1 memory is an interesting architectural option for building tightly-coupled multi-core processor clusters. We designed a parametric, fully combinational Mesh-of-Trees (MoT) interconnection network to support high-performance, single-cycle communication between processors and memories in L1-coupled processor clusters. Our interconnect IP is described in synthesizable RTL and it is coupled with a design automation strategy mixing advanced synthesis and physical optimization to achieve optimal delay, power, area (DPA) under a wide range of design constraints. We explore DPA for a large set of network configurations in 65nm technology. Post placement&routing delay is 38FO4 for a configuration with 8 processors and 16 32-bit memories (8×16); when the number of both processors and memories is increased by a factor of 4, the delay increases almost logarithmically, to 84FO4, confirming scalability across a significant range of configurations. DPA tradeoff flexibility is also promising: in comparison to the maxperformance 16×32 configuration, there is potential to save power and area by 45% and 12 % respectively, at the expense of 30% performance degradation.
Nano-Net '07 Proceedings of the 2nd international conference on Nano-Networks | 2007
Igor Loi; Federico Angiolini; Luca Benini
Three-dimensional (3D) manufacturing technologies are viewed as promising solutions to the bandwidth bottlenecks in VLSI communication. At the architectural level, Networks-on-chip (NoCs) have been proposed to address the complexity of interconnecting an ever-growing number of cores, memories and peripherals. NoCs are a promising choice for implementing scalable 3D interconnect architectures. However, the development of 3D NoCs is still at an early development stage. In this paper, we present a semi-automated design flow for 3D NoCs. Starting from an accurate physical and geometric model of Through-Silicon Vias (TSVs), we extract a circuit-level model for vertical interconnections, and we use it to evaluate the design implications of extending switch architectures with ports in the vertical direction. In addition, we present a design flow allowing for post-layout simulation of NoCs with links in all three physical dimensions.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011
Igor Loi; Federico Angiolini; Shinobu Fujita; Subhasish Mitra; Luca Benini
Through silicon vias (TSVs) provide an efficient way to support vertical communication among different layers of a vertically stacked chip, enabling scalable 3-D networks-on-chip (NoC) architectures. Unfortunately, low TSV yields significantly impact the feasibility of high-bandwidth vertical connectivity. In this paper, we present a semi-automated design flow for 3-D NoCs including a defect-tolerance scheme to increase the global yield of 3-D stacked chips. Starting from an accurate physical and geometrical model of TSVs: 1) we extract a circuit-level model for vertical interconnections; 2) we use it to evaluate the design implications of extending switch architectures with ports in the vertical direction; moreover, 3) we present a defect-tolerance technique for TSV-based multi-bit links through an effective use of redundancy; and finally, 4) we present a design flow allowing for post-layout simulation of NoCs with links in all three physical dimensions. Experimental results show that a 3-D NoC implementation yields around 10% frequency improvement over a 2-D one, thanks to the propagation delay advantage of TSVs and the shorter links. In addition, the adopted fault tolerance scheme demonstrates a significant yield improvement, ranging from 66% to 98%, with a low area cost (20.9% on a vertical link in a NoC switch, which leads a modest 2.1% increase in the total switch area) in 130 nm technology, with minimal impact on very large-scale integrated design and test flows.
design, automation, and test in europe | 2010
Igor Loi; Luca Benini
Historically, processor performance has increased at a much faster rate than that of main memory and up-coming NoC-based many-core architectures are further tightening the memory bottleneck. 3D integration based on TSV technology may provide a solution, as it enables stacking of multiple memory layers, with orders-of-magnitude increase in memory interface bandwidth, speed and energy efficiency. To fully exploit this potential, the architectural interface to vertically stacked memory must be streamlined. In this paper we present an efficient and flexible distributed memory interface for 3D-stacked DRAM. Our interface ensures ultra-low-latency access to the memory modules on top of each processing element (vertically local memory neighborhoods). Communication to these local modules do not travel through the NoC and takes full advantage of the lower latency of vertical interconnect, thus speeding up significantly the common case. The interface still supports a convenient global address space abstraction with high-latency remote access, due to the slower horizontal interconnect. Experimental results demonstrate significant bandwidth improvement that ranges from 1.44x to 7.40x as compared to the JEDEC standard, with peaks of 4.53GB/s for direct memory access, and 850MB/s for remote access through the NoC.
design, automation, and test in europe | 2008
Igor Loi; Federico Angiolini; Luca Benini
The network-on-chip (NoC) interconnection paradigm has been gaining momentum thanks to its flexibility, scalability and suitability to deep submicron technology processes. The next challenge is to use NoCs as the backbones of the upcoming generation of 3D chips, assembled by stacking multiple silicon layers. Multiple technical issues have to be tackled in this respect. One of the foremost is the unsuitability of a purely synchronous design style, as it is not straightforward to impose a strict bound on the clock skew among multiple clock trees across different layers. In this paper, we present a scheme to handle mesochronous communication in 3D NoCs and analyze (i) the circuit design, (ii) the timing properties, (iii) the requirements to support flow control across mesochronous links, (iv) the implementation cost of such a scheme after placement and routing.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2013
Christian Weis; Igor Loi; Luca Benini; Norbert Wehn
Energy efficiency is the major optimization criterion for systems-on-chip (SoCs) for mobile devices (smartphones and tablets). Through silicon via (TSV) technology enables 3-D integration of dies and the heterogeneous stacking of multiple memory or logic layers, allowing increased bandwidth and lower energy consumption of the memory interface compared to traditional approaches. In this paper, we explore the 3-D-DRAM architecture design space. The result is an optimized 2 Gb 3-D-DRAM, which shows a 83% lower energy/bit than a 2 Gb device. Furthermore, we propose a highly energy-efficient DRAM subsystem for next-generation 3-D-integrated SoCs, consisting of a SDR/DDR 3-D-DRAM controller and an attached 3-D-DRAM cube with fine-grained access and a flexible (WIDE-IO) interface. We assess the energy efficiency using a synthesizable model of the SDR/DDR 3-D-DRAM channel controller (CC) as well as functional models of the 3-D-stacked DRAM, including an accurate power estimation engine. We also investigate different DRAM families (WIDE IO SDR/DDR, LPDDR, and LPDDR2) and densities from 256 Mb to 4 Gb per channel. The implementation results of the proposed 3-D-DRAM subsystem show that energy optimized accesses to the 3-D-DRAM enable up to 50% energy savings compared to standard accesses. To the best of our knowledge this is the first design space exploration for 3-D-stacked DRAM considering different technologies based on real-world physical data and the first design of a 3-D-DRAM CC and 3-D-DRAM model featuring co-optimization of memory and controller architecture.
Vlsi Design | 2007
Paolo Meloni; Igor Loi; Federico Angiolini; Salvatore Carta; Massimo Barbaro; Luigi Raffo; Luca Benini
Networks-on-Chip (NoCs) are emerging as scalable interconnection architectures, designed to support the increasing amount of cores that are integrated onto a silicon die. Compared to traditional interconnects, however, NoCs still lack well established CAD deployment tools to tackle the large amount of available degrees of freedom, starting from the choice of a network topology. “Silicon-aware” optimization tools are now emerging in literature; they select an NoC topology taking into account the tradeoff between performance and hardware cost, that is, area and power consumption. A key requirement for the effectiveness of these tools, however, is the availability of accurate analytical models for power and area. Such models are unfortunately not as available and well understood as those for traditional communication fabrics. Further, simplistic models may turn out to be totally inaccurate when applied to wire dominated architectures; this observation demands at least for a model validation step against placed and routed devices. In this work, given an NoC reference architecture, we present a flow to devise analytical models of area occupation and power consumption of NoC switches, and propose strategies for coefficient characterization which have different tradeoffs in terms of accuracy and of modeling activity effort. The models are parameterized on several architectural, synthesis-related, and traffic variables, resulting in maximum flexibility. We finally assess the accuracy of the models, checking whether they can also be applied to placed and routed NoC blocks.
power and timing modeling optimization and simulation | 2011
Ahmed Yasir Dogan; David Atienza; Andreas Burg; Igor Loi; Luca Benini
This study presents a single-core and a multi-core processor architecture for health monitoring systems where slow biosignal events and highly parallel computations exist. The single-core architecture is composed of a processing core (PC), an instruction memory (IM) and a data memory (DM), while the multi-core architecture consists of PCs, individual IMs for each core, a shared DM and an interconnection crossbar between the cores and the DM. These architectures are compared with respect to power vs performance trade-offs for a multi-lead electrocardiogram signal conditioning application exploiting near threshold computing. The results show that the multi-core solution consumes 66% less power for high computation requirements (50.1 MOps/s), whereas 10.4% more power for low computation needs (681 kOps/s).