Emna Amouri
Pierre-and-Marie-Curie University
Publication
Featured research published by Emna Amouri.
Applied Reconfigurable Computing | 2013
Vinod Pangracious; Zied Marrakchi; Emna Amouri; Habib Mehrez
A Tree-based 3D multilevel FPGA architecture that unifies two unidirectional programmable interconnection networks is presented in this paper. In a Tree-based FPGA architecture, the interconnects are arranged in a multilevel network, with the switch blocks placed at different tree levels using a Butterfly-Fat-Tree network topology. Developing a two-dimensional layout for a Tree-based multilevel interconnect is a major challenge for Tree-based FPGAs. We discuss a 3D interconnect network technology that leverages Through-Silicon Vias (TSVs) to redistribute the Tree interconnects into multiple silicon layers, based on network delay and thermal considerations. The impact of TSVs and the performance improvement of the 3D Tree-based FPGA are analyzed. We present an optimized physical design technology that leverages TSVs, Thermal TSVs (TTSVs), and thermal analysis. Compared to a 3D Mesh-based FPGA, the 3D Tree-based FPGA design reduces the number of TSVs by 29% and improves performance by 53%, based on our place-and-route experiments.
Reconfigurable Computing and FPGAs | 2013
Emna Amouri; Adrien Blanchardon; Roselyne Chotin-Avot; Habib Mehrez; Zied Marrakchi
This paper presents an improved cluster-based Mesh architecture. The architecture has a depopulated intra-cluster interconnect and a new hierarchical topology for the switch box that unifies a downward and an upward unidirectional network. Experimental results on 20 MCNC benchmarks show that density is improved and the interconnect area requirement is reduced by 42% compared to the cluster-based VPR architecture.
Parallel, Distributed and Network-Based Processing | 2016
Sonda Chtourou; Zied Marrakchi; Emna Amouri; Vinod Pangracious; Habib Mehrez; Mohamed Abid
In this paper, we propose a 2D and 3D interconnect network based on a Mesh-of-Clusters (MoC) topology for the implementation of an efficient Field-Programmable Gate Array (FPGA) architecture. The proposed MoC-based FPGA architecture presents new hierarchical Switch Boxes (SBs) and a depopulated intra-cluster interconnect based on the Butterfly-Fat-Tree (BFT) topology. Long routing wires that span multiple SBs in every row and column are used to improve performance. By adjusting the percentage and span of the long wires, we can design and build high-density 3D MoC-based FPGAs. To design a 3D MoC-based FPGA, we cut the 2D FPGA into two equal FPGA dies and adjust the long-wire span factor to connect the two dies. These long wire segments are then converted into 3D through-silicon vias (TSVs). We also present a design methodology and CAD tools to explore the performance of the proposed 2D and 3D MoC-based FPGA architectures in terms of power, energy, area and delay. Experimental results with large benchmarks show that, with the 3D MoC-based FPGA, the average gains in terms of frequency, energy and area are 23%, 37% and 47% respectively, compared to the 2D MoC-based FPGA.
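The cut-and-convert step described in this abstract can be illustrated with a toy model. The sketch below is not the authors' CAD flow. It assumes a hypothetical representation in which each long wire along a row is a (start, end) pair of switch-box column indices, cuts the array into two equal halves, and counts the wires that straddle the cut, on the assumption that each such wire maps to one TSV. The function names and the staggered wire pattern are illustrative only.

```python
from typing import List, Tuple

# Hypothetical representation: a long wire is (start_col, end_col),
# the switch-box columns it connects along one row, with start_col < end_col.
Wire = Tuple[int, int]


def crosses_cut(wire: Wire, cut_col: int) -> bool:
    """True if the wire spans the cut placed between cut_col - 1 and cut_col."""
    start, end = wire
    return start < cut_col <= end


def count_tsvs(wires: List[Wire], num_cols: int) -> int:
    """Cut the array into two equal halves and count the crossing wires.

    Assumption of this sketch: every wire crossing the cut maps to one TSV.
    """
    cut_col = num_cols // 2
    return sum(crosses_cut(w, cut_col) for w in wires)


def staggered_long_wires(num_cols: int, span: int) -> List[Wire]:
    """One long wire of the given span starting at every feasible column."""
    return [(s, s + span) for s in range(num_cols - span)]


if __name__ == "__main__":
    for span in (2, 4):
        wires = staggered_long_wires(num_cols=8, span=span)
        print(f"span={span}: {count_tsvs(wires, num_cols=8)} crossing wires")
```

In this toy model a larger span makes more wires straddle the cut, which gives a feel for why the span factor is a tuning knob when the two dies are connected.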
Microprocessors and Microsystems | 2016
Sonda Chtourou; Zied Marrakchi; Emna Amouri; Vinod Pangracious; Mohamed Abid; Habib Mehrez
This paper presents an improved interconnect network for a Mesh-of-Clusters (MoC) Field-Programmable Gate Array (FPGA) architecture. The proposed architecture has a depopulated intra-cluster interconnect with a flexible Rent’s parameter. It presents a new multilevel Switch Box (SB) interconnect that unifies a downward and an upward unidirectional network based on the Butterfly-Fat-Tree (BFT) topology. To improve the routability of the proposed MoC-based FPGA, long routing segments with adjustable span are introduced as a function of channel width. Compared to the basic Versatile Place and Route (VPR) Mesh architecture, the proposed MoC-based architecture saves 32% of area and 30% of power. Based on analytical and experimental methods, we identified and explored the architecture parameters that control the interconnect flexibility of the proposed MoC-based FPGA, such as Rent’s parameter, cluster size, Look-Up-Table (LUT) size, and long-wire span and percentage. Experimental results show that an architecture with LUT size 4 and cluster arity 8 offers the best trade-off between power consumption and density. In general, a long-wire span of 4 and a long-wire percentage between 20% and 30% produce the most efficient results in terms of density and power.
Microelectronics Journal | 2014
Vinod Pangracious; Emna Amouri; Zied Marrakchi; Habib Mehrez
We describe a methodology to design and optimize a three-dimensional (3D) Tree-based FPGA by introducing a break-point at a particular tree-level interconnect to optimize speed, area, and power consumption. The ability of the design flow to decide on a horizontal or vertical network break-point based on design specifications is a defining feature of our design methodology. The vertical partitioning balances the placement of logic blocks and switch blocks across multiple tiers, while the horizontal partitioning optimizes the interconnect delay by segregating the logic blocks and programmable interconnect resources into multiple tiers to build a 3D stacked Tree-based FPGA. Finally, we evaluate the effect of Look-Up-Table (LUT) size and cluster size on the speed, area and power consumption of the proposed 3D Tree-based FPGA using our in-house experimental flow, and show that the horizontally partitioned 3D stacked Tree-based FPGA with LUT and cluster sizes equal to 4 has the best area-delay product for designing and manufacturing 3D Tree-based FPGAs.
Field-Programmable Technology | 2013
Arwa Ben Dhia; Saif Ur Rehman; Adrien Blanchardon; Lirida A. B. Naviner; Mounir Benabdenbi; Roselyne Chotin-Avot; Habib Mehrez; Emna Amouri; Zied Marrakchi
In this paper, we propose the implementation of multiple defect-tolerant techniques on an SRAM-based FPGA. These techniques include redundancy in both the logic block and the intra-cluster interconnect. In the logic block, redundancy is implemented at the multiplexer level. Its efficiency is analyzed by injecting a single defect at the output of a multiplexer, considering all possible locations and input combinations. At the interconnect level, fine-grain redundancy is introduced, which not only bypasses defects but also increases routability. Taking advantage of the sparse intra-cluster interconnect structures, routability is further improved by an efficient distribution of feedback paths, allowing more flexibility in the connections among logic blocks. Emulation results show a significant improvement of about 15% and 34% in the robustness of the logic block and the intra-cluster interconnect, respectively. Furthermore, the impact of these hardening schemes on the testability of the FPGA cluster for manufacturing defects is also investigated in terms of maximum achievable fault coverage and the respective cost.
International Conference on High Performance Computing and Simulation | 2016
Sonda Chtourou; Mohamed Abid; Zied Marrakchi; Emna Amouri; Habib Mehrez
The interconnect structure in common FPGA architectures is generally designed to maximize logic utilization. A fully populated routing interconnect is simple and provides high flexibility, at the cost of power and area overhead. Moreover, the utilization rate of interconnect switches is extremely low. In this paper, we explore new cluster-based mesh FPGA architectures with a depopulated routing network. First, we propose a Depopulated FPGA (DFPGA) architecture with depopulated intra-cluster and inter-cluster interconnects. Compared with a common Mesh architecture, power and area are improved by an average of 23% and 30%, respectively. However, these improvements come at the cost of wiring complexity, congestion and low flexibility when routing complex circuits. To alleviate these weaknesses, we propose to populate the inter-cluster interconnect by using hierarchy. We show experimentally that the second proposed FPGA architecture, with Multilevel Switch blocks (MS-FPGA), has good routability and interesting power consumption and area density compared to the common cluster-based mesh FPGA. Moreover, the additional switches used in the hierarchical inter-cluster interconnect of the MS-FPGA are compensated by better flexibility. Unlike DFPGA, MS-FPGA can deal with complex circuits.
Applied Reconfigurable Computing | 2015
Sonda Chtourou; Zied Marrakchi; Vinod Pangracious; Emna Amouri; Habib Mehrez; Mohamed Abid
In this paper, we present an improved Mesh of Clusters (MoC) architecture with a new hierarchical Switch Box (SB) topology and a depopulated intra-cluster interconnect with a flexible Rent’s parameter. The aim of this paper is to explore the effect of different architecture parameters, such as the architecture Rent’s parameter, the design Rent’s parameter and the channel width. We then analyze how these factors interact and how to tune them to satisfy various application-specific constraints and quality metrics such as power consumption and area. The proposed exploration methodology unifies two procedures: an analytical method based on Rent’s rule modeling and an experimental method based on the implementation of benchmark circuits. A comparison with the VPR mesh architecture shows gains in power and area of 30% and 32%, respectively.
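For readers unfamiliar with Rent’s rule, it estimates the number of external terminals T of a block containing N logic elements as T = t * N^p, where t is the average number of terminals per element and p is the Rent exponent. The sketch below is a minimal illustration of that formula applied to a cluster hierarchy. The values of t, p and the cluster arity are hypothetical and not taken from the paper, and the function names are illustrative.

```python
from typing import List


def rent_terminals(num_blocks: int, t: float, p: float) -> float:
    """Rent's rule: T = t * N**p."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return t * num_blocks ** p


def terminals_per_level(arity: int, levels: int, t: float, p: float) -> List[float]:
    """Terminal estimate for one cluster at each hierarchy level.

    Assumption of this sketch: a cluster at level l groups arity**l logic blocks.
    """
    return [rent_terminals(arity ** level, t, p) for level in range(1, levels + 1)]


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the trend.
    for p in (0.5, 0.75, 1.0):
        estimates = terminals_per_level(arity=4, levels=3, t=4.0, p=p)
        print(f"p={p}: {[round(x, 1) for x in estimates]}")
```

A lower p means fewer terminals per cluster and therefore a more depopulated interconnect, at the risk of failing to route designs whose own Rent exponent is higher.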
IEEE International 3D Systems Integration Conference | 2014
Sonda Chtourou; Mohamed Abid; Vinod Pangracious; Emna Amouri; Zied Marrakchi; Habib Mehrez
In this study, we propose a three-dimensional (3D) interconnect network implementation based on a modified Mesh-of-Clusters (MoC) topology for FPGA architecture design. A design and experimental setup is developed to demonstrate the improvement in performance, power and area of 2.5D and 3D MoC-based FPGA architectures. MoC starts with a mesh of nodes and builds a separate hierarchical network along each row and column in the mesh. To obtain the optimal MoC programmable interconnect structure with high performance and density, the routing architecture of the 2D MoC-based FPGA is modified to include long routing segments that span multiple switch blocks in every row and column. By adjusting the percentage and span of the long wires, we can design and build high-density 2.5D and 3D MoC FPGAs. To design a 3D MoC-based FPGA, we cut the 2D MoC FPGA into two equal FPGA dies and adjust the long-wire span factor to connect the two dies. These long wire segments are then converted into 3D through-silicon vias (TSVs). To design a 2.5D interposer-based multi-FPGA architecture, we use the same cutting principle and adjust the long-wire span so that long wires remain within a die. However, we apply constraints at the cutline location to reduce the die-to-die interposer connections. A 3D physical design CAD flow for MoC-based FPGAs is developed using the Global Foundries 130nm technology node, modified to use TSV designs from Tezzaron Semiconductor Inc. Using our 3D design and simulation tool flow developed for MoC-based FPGAs, we demonstrate that the speed, power and area of the 3D MoC-based FPGA architecture are improved by 35%, 21% and 47%, respectively, in comparison to the 2D MoC-based FPGA.
International Journal of Reconfigurable Computing | 2013
Emna Amouri; Habib Mehrez; Zied Marrakchi
Wave dynamic differential logic (WDDL) has been identified as a promising countermeasure to increase the robustness of cryptographic devices against differential power attacks (DPA). However, to guarantee the effectiveness of the WDDL technique, the routing of the direct and complementary paths must be balanced. This paper tackles the problem of unbalanced dual-rail signals in WDDL designs. We describe placement techniques suitable for tree-based and mesh-based FPGAs and quantify the gain they confer. Then, we introduce a timing-balance-driven routing algorithm that is architecture independent. Our placement and routing techniques achieve gains of 95%, 93%, and 85% in delay balance in tree-based, simple mesh, and cluster-based mesh architectures, respectively. To further reduce the switch and delay unbalance in the Mesh architecture, we propose a differential pair routing algorithm that is specific to the cluster-based mesh architecture. It achieves perfectly balanced routed signals in terms of wire length and switch number.
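The abstract does not define how the gain in delay balance is computed, so the following is only one plausible reading, sketched under stated assumptions: the unbalance of a dual-rail pair is taken as the absolute difference between the delays of its direct and complementary nets, and the gain is the relative reduction of the mean unbalance after applying a balancing technique. All names and sample delays are hypothetical.

```python
from typing import List, Tuple

# (direct net delay, complementary net delay), in arbitrary time units.
DelayPair = Tuple[float, float]


def mean_unbalance(pairs: List[DelayPair]) -> float:
    """Average absolute delay difference over all dual-rail pairs."""
    if not pairs:
        raise ValueError("at least one dual-rail pair is required")
    return sum(abs(direct - comp) for direct, comp in pairs) / len(pairs)


def balance_gain(before: List[DelayPair], after: List[DelayPair]) -> float:
    """Relative reduction in mean unbalance (assumed metric, 1.0 = perfect)."""
    baseline = mean_unbalance(before)
    if baseline == 0:
        return 0.0  # Already balanced: nothing left to gain.
    return 1.0 - mean_unbalance(after) / baseline


if __name__ == "__main__":
    # Hypothetical delays for two dual-rail pairs, before and after balancing.
    before = [(10.0, 6.0), (8.0, 4.0)]
    after = [(10.0, 9.0), (8.0, 7.0)]
    print(f"gain in delay balance: {balance_gain(before, after):.0%}")
```

Under this reading, a gain of 100% would correspond to pairs whose direct and complementary nets have identical delays.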