Xifan Tang

École Polytechnique Fédérale de Lausanne

Publication

Featured research published by Xifan Tang.

IEEE Transactions on Circuits and Systems I: Regular Papers | 2016

A Study on the Programming Structures for RRAM-Based FPGA Architectures

Xifan Tang; Gain Kim; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

Field Programmable Gate Arrays (FPGAs) can gain non-volatility and high performance by exploiting Resistive Random Access Memories (RRAMs). In RRAM-based FPGAs, the memories not only replace the SRAMs and store configurations, but can also replace the transmission gates and propagate datapath signals. The high performance achievable by RRAM-based FPGAs comes from the fact that the on-resistance of the memory devices, RLRS, is smaller than the equivalent resistance of a transmission gate. Efficient programming structures for RRAMs should provide high current density with a small area footprint, to obtain a low RLRS. In this paper, we first examine the efficiency of the widely used 2Transistor/1RRAM (2T1R) programming structure and identify four major limitations of the 2T1R structure. To overcome these limitations, we propose two programming structures: 2Transmission-Gates/1RRAM (2TG1R) and 4Transistor/1RRAM (4T1R). We perform both theoretical analysis and electrical simulations on the evaluated programming structures. The 4T1R programming structure is the best in terms of current density, with 1.4× and 1.1× that of the 2T1R and 2TG1R counterparts, respectively. We also investigate the effect of boosting the programming voltage Vprog of the programming structures. Experimental results show that boosting Vprog improves the driving current of the evaluated programming structures by 3× and their area efficiency by 1.7× on average.

field-programmable technology | 2014

A high-performance low-power near-Vt RRAM-based FPGA

Xifan Tang; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

The routing architecture, which relies heavily on programmable switches, dominates the area, delay and power of Field Programmable Gate Arrays (FPGAs). Resistive Random Access Memories (RRAMs) enable high-performance routing architectures through the replacement of Static Random Access Memory (SRAM)-based programming switches. Exploiting the very low on-resistance state achievable by RRAMs, RRAM-based routing multiplexers can be used to significantly reduce the FPGA routing delays. In addition, RRAM-based routing architectures are less sensitive to supply voltage reductions and show promise for low-power FPGA designs. In this paper, we propose a near-Vt low-power RRAM-based FPGA where both delay and power reductions are achieved. Experimental results demonstrate that a near-Vt RRAM-based FPGA design leads to a 15% area shrink, a 10% delay reduction, and a 65% power improvement, compared to a conventional FPGA design for a given technology node. To achieve low on-resistance values, RRAMs typically require high programming currents. In other words, they need relatively large programming transistors, potentially resulting in area, delay and power inefficiencies. We also present a design methodology to properly size the programming transistors of RRAMs in order to further improve the area efficiency. Experimental results show that a correct programming transistor sizing strategy contributes a further 18% area and 2% delay reduction, compared to the initial near-Vt RRAM-based FPGA.
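
The abstract above reports two stages of improvement: the near-Vt RRAM design relative to a conventional FPGA, and transistor sizing relative to that initial near-Vt design. As a rough illustration only, the short Python sketch below compounds the two stages. It assumes both stages were measured on the same benchmarks and that the percentages multiply cleanly, which the abstract does not state; the helper name `compound_reduction` is hypothetical.

```python
def compound_reduction(*reductions):
    """Combine successive fractional reductions.

    Each reduction is assumed to be relative to the result of the
    previous stage (an assumption, not something the paper states).
    """
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining


# Figures quoted in the abstract: 15% then 18% area, 10% then 2% delay.
area_total = compound_reduction(0.15, 0.18)
delay_total = compound_reduction(0.10, 0.02)

print(f"combined area reduction:  {area_total:.1%}")
print(f"combined delay reduction: {delay_total:.1%}")
```

Under these assumptions, the combined figures come to roughly 30% in area and 12% in delay relative to the conventional FPGA. These are back-of-the-envelope estimates, not results reported by the authors.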

IEEE Transactions on Very Large Scale Integration Systems | 2015

A Novel FPGA Architecture Based on Ultrafine Grain Reconfigurable Logic Cells

Pierre-Emmanuel Gaillardon; Xifan Tang; Gain Kim; Giovanni De Micheli

In this paper, we investigate the opportunity brought by controllable-polarity transistors to design efficient reconfigurable circuits. Controllable-polarity transistors are devices whose polarity can be electrostatically programmed to be either n- or p-type. Such devices are used to build ultrafine grain computation cells. These cells are arranged into regular matrices, called MClusters, with a fixed and incomplete interconnection pattern, employed to minimize the reconfigurable interconnection overhead. We subsequently use them in field-programmable gate arrays (FPGAs). To assess this architectural scheme in an efficient and objective manner, we present a complete benchmarking tool flow and focus on the packing algorithm developed to handle the architecture. We finally perform the evaluation with widely used benchmark circuits. Leveraging the compactness of the ultrafine grain cells from a system-level perspective, we show that FPGAs exploiting MClusters demonstrate average savings of 43% and 23% in area and delay, respectively, as compared with the CMOS lookup table FPGA counterpart at the 22-nm technology node.

IEEE Transactions on Very Large Scale Integration Systems | 2013

Timing Uncertainty in 3-D Clock Trees Due to Process Variations and Power Supply Noise

Hu Xu; Vasilis F. Pavlidis; Xifan Tang; Wayne Burleson; Giovanni De Micheli

Clock distribution networks are affected by different sources of variations. The resulting clock uncertainty significantly affects the frequency of a circuit. To support this analysis, a statistical model of skitter, which consists of clock skew and jitter, for 3-D clock trees is introduced. The effect of skitter on both the setup and hold time slacks is modeled. The variation of skitter is shown to be underestimated by up to 36% if process variations and dynamic power supply noise are considered separately, which highlights the importance of this unified treatment. Potential scenarios of supply noise in 3-D integrated circuits (ICs) are investigated. 3-D circuits generated from industrial benchmarks are simulated to show the skitter under these scenarios. The mean and standard deviation of skitter can vary by up to 60% and 51%, respectively, due to the different amplitudes and phases of supply noise. The tradeoff between skitter and the power consumed by clock trees is also shown. A set of guidelines is presented to decrease skitter in 3-D ICs. By applying these guidelines to industrial benchmarks, simulations show a decrease in the mean skitter of up to 31%.

international symposium on circuits and systems | 2014

TSPC Flip-Flop circuit design with three-independent-gate silicon nanowire FETs

Xifan Tang; Jian Zhang; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

True Single-Phase Clock (TSPC) Flip-Flops, based on dynamic logic implementation, are area-saving and high-speed compared to standard static flip-flops. Furthermore, logic gates can be embedded into TSPC flip-flops, which significantly improves performance. As a promising approach to keep pace with Moore's Law, functionality-enhanced devices with multiple independent gates have drawn much recent interest. In particular, Three-Independent-Gate Silicon Nanowire FETs (TIG SiNWFETs) can realize the functionality of two serial transistors in a single device. Therefore, they open new opportunities for compact designs in both arithmetic and control circuits. In this paper, we propose a TSPC flip-flop implementation with asynchronous set and reset that exploits the compactness of TIG SiNWFETs. Electrical simulations show that the TIG SiNWFET-based TSPC flip-flop improves area, delay and leakage power by nearly 20%, 30% and 7%, respectively, as compared to its LSTP FinFET counterpart at 22 nm.

international conference on computer design | 2015

FPGA-SPICE: A simulation-based power estimation framework for FPGAs

Xifan Tang; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

Mainstream Field Programmable Gate Array (FPGA) power estimation tools are based on probabilistic activity estimation and analytical power models. The power consumption of the programmable resources of FPGAs is highly sensitive to their configurations. Because FPGAs are highly flexible, the configurations of FPGA routing multiplexers or Look Up Tables (LUTs) differ considerably from one design to another, but current analytical power models cannot accurately capture the associated power differences. In this paper, we introduce a simulation-based power estimation framework for FPGAs, called FPGA-SPICE, which supports any FPGA architecture that can be described with an architectural description language. Our power estimation engine automatically generates accurate SPICE netlists according to the FPGA configurations and enables precise power analysis of FPGA architectures. SPICE testbenches can be generated at different levels of complexity, denoted as full-chip-level, grid-level and component-level testbenches. Full-chip-level testbenches dump the netlists associated with the complete FPGA fabric. To reduce simulation time, FPGA-SPICE can split the full-chip-level testbenches into grid-level testbenches, each consisting of a complete logic block netlist, or component-level testbenches, which consider individual circuit elements, i.e., multiplexers, LUTs, flip-flops, etc., separately. We show that the grid/component-level approach can achieve a 14× speed-up with a moderate 14% accuracy loss, compared to the full-chip level. We also use FPGA-SPICE to study the power characteristics of a commercial FPGA architecture at different technology nodes. Experimental results show that the global routing architecture consumes 50% of the total power, the local routing architecture accounts for 40% of the total power, and the remaining 10% comes from the LUTs and flip-flops.

field programmable logic and applications | 2015

Accurate power analysis for near-Vt RRAM-based FPGA

Xifan Tang; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

Resistive Random Access Memory (RRAM)-based FPGA architectures employ RRAMs not only as memories to store the configuration but also embed them in the datapaths of programmable routing resources to propagate signals with improved performance. Sources of power consumption have been intensively studied for conventional Static Random Access Memory (SRAM)-based FPGAs. However, very few works have so far focused on studying the power characteristics of RRAM-based FPGAs. In this paper, we first analyze the power characteristics of RRAM-based multiplexers at the circuit level and then use electrical simulations to study the power consumption of RRAM-based FPGA architectures. Experimental results show that RRAM-based FPGAs achieve a Power-Delay Product reduced by 50% compared to an SRAM-based FPGA at nominal voltage and by 20% compared to a near-Vt SRAM-based FPGA.

design, automation, and test in europe | 2015

A ultra-low-power FPGA based on monolithically integrated RRAMs

Pierre-Emmanuel Gaillardon; Xifan Tang; Jury Sandrini; Maxime Thammasack; Somayyeh Rahimian Omam; Davide Sacchetto; Yusuf Leblebici; Giovanni De Micheli

Field Programmable Gate Arrays (FPGAs) rely heavily on complex routing architectures. The routing structures use programmable switches and account for a significant share of the total area, delay and power consumption. With their ability to be monolithically integrated with CMOS chips, Resistive Random Access Memories (RRAMs) enable high-performance routing architectures through the replacement of Static Random Access Memory (SRAM)-based programming switches. Exploiting the very low on-resistance state achievable by RRAMs as well as the improved tolerance to power supply reduction, RRAM-based routing multiplexers can be used to significantly reduce the power consumption of FPGA systems with no performance compromises. By evaluating the opportunities of ultra-low-power RRAM-based FPGAs at the system level, we see an improvement of 12%, 26% and 81% in area, delay and power consumption, respectively, at a mature technology node.

latin american symposium on circuits and systems | 2015

A study on buffer distribution for RRAM-based FPGA routing structures

Somayyeh Rahimian Omam; Xifan Tang; Pierre-Emmanuel Gaillardon; Giovanni De Micheli

Compared to Application-Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) provide reconfigurability at the cost of lower performance and higher power consumption. Because they use a large number of programmable switches, routing structures are mainly responsible for the performance limitation. Hence, employing more efficient switches can drastically improve the performance and reduce the power consumption of the FPGA. Resistive Random Access Memory (RRAM)-based switches are one of the most promising candidates to improve the FPGA routing architecture thanks to their low on-resistance and non-volatility. The lower RC delay of RRAM-based routing multiplexers, as compared to CMOS-based routing structures, encourages us to reconsider the buffer distribution in FPGAs. This paper proposes an approach to reduce the number of buffers in the routing path of RRAM-based FPGAs. Our architectural simulations show that the use of RRAM switches improves the critical path delay by 56% as compared to CMOS switches in standard FPGA circuits at the 45-nm technology node while, at the same time, the area and power are reduced by 17% and 9%, respectively. By adapting the buffering scheme, an additional 9% delay reduction, 5% power reduction and 16% area reduction can be obtained, as compared to the conventional buffering approach for RRAM-based FPGAs.

latin american symposium on circuits and systems | 2017

Optimization opportunities in RRAM-based FPGA architectures

Xifan Tang; Giovanni De Micheli; Pierre-Emmanuel Gaillardon

Static Random Access Memory (SRAM)-based routing multiplexers, whatever structure is employed, share a common limitation: their area, delay and power increase linearly with the input size. As a result, most SRAM-based FPGA architectures avoid the use of large multiplexers. Resistive Random Access Memory (RRAM)-based multiplexers, built with a one-level structure, have a unique advantage over SRAM-based multiplexers: their ideal delay is independent of the input size. This property allows RRAM-based FPGA architectures to use larger multiplexers than their SRAM-based counterparts, without generating any delay overhead. In this paper, by carefully considering the properties of RRAM multiplexers, we find that current state-of-the-art architectural parameters for SRAM-based FPGAs cannot preserve optimality in the context of RRAM-based FPGAs. As a result, we propose that in RRAM-based FPGAs, (a) the routing tracks should be interconnected to Look-Up Table (LUT) inputs via a one-level crossbar, instead of through Connection Blocks and local routing; (b) the Switch Blocks should employ larger multiplexers; (c) length-2 wires should be used instead of length-4 wires. When operated at nominal voltage, the proposed RRAM-based FPGA architecture reduces area by 26%, delay by 39% and channel width by 13%, as compared to an SRAM-based FPGA with a classical architecture. When operated in the near-Vt regime, the proposed RRAM-based FPGA architecture improves Area-Delay Product by 42% and Power-Delay Product by 5× as compared to a classical SRAM-based FPGA at nominal voltage.
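
The scaling argument in the abstract above can be pictured with a toy model. The Python sketch below is not the authors' delay model: it simply encodes the two qualitative statements (SRAM-based multiplexer delay grows linearly with input count, while the ideal delay of a one-level RRAM-based multiplexer does not) using a hypothetical `unit_delay` in arbitrary units.

```python
def sram_mux_delay(n_inputs, unit_delay=1.0):
    """Toy model: delay grows linearly with the number of inputs.

    unit_delay is an arbitrary, hypothetical per-input cost.
    """
    return unit_delay * n_inputs


def rram_mux_delay(n_inputs, unit_delay=1.0):
    """Toy model: ideal one-level RRAM mux delay ignores input count.

    Real designs add parasitics that this sketch deliberately omits.
    """
    return unit_delay


# Compare how the two toy models scale as the multiplexer gets larger.
for n in (4, 16, 32):
    print(n, sram_mux_delay(n), rram_mux_delay(n))
```

In this idealized picture, enlarging a multiplexer costs delay only in the SRAM case, which is the intuition behind proposals (a) and (b) in the abstract.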
