Benjamin S. Devlin
Xilinx
Publication
Featured research published by Benjamin S. Devlin.
Field Programmable Logic and Applications | 2016
Ilya K. Ganusov; Benjamin S. Devlin
This paper presents enhancements to the Xilinx UltraScale+ clocking architecture to support fine-grain time borrowing. Time borrowing improves performance by redistributing timing slack between fast and slow paths. The UltraScale+ architecture introduces programmable hardware delays and pulse generators embedded in the clocking tree to support time borrowing based on both clock skew scheduling and pulsed latches. This programmable hardware allows borrowing from a few picoseconds to multiple nanoseconds between sequential pipeline stages without any changes to RTL, placement, or routing. Vivado algorithms automatically determine when to skew flip-flop clocks or convert flip-flops to pulsed latches to achieve the highest possible performance. Using the default Vivado flow, this programmable time-borrowing platform delivers a 5.5% average Fmax increase over a suite of 89 industrial designs. It is especially effective on high-speed applications, delivering up to a 13.7% Fmax increase on individual designs. We also demonstrate that using non-default features, such as delay cascades or increased hold margin, can raise average performance gains to 7.4% and 8.5%, respectively. This platform incurs minimal area overhead (less than 0.1% of total chip area) while staying robust in the presence of tight hold constraints and increasing process variation.
Asian Solid-State Circuits Conference | 2013
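The slack redistribution described in this abstract can be illustrated with a toy model. The sketch below is not the Vivado algorithm or the UltraScale+ hardware model; it assumes a hypothetical two-stage pipeline A -> B -> C, ignores setup time, clock-to-output delay, and hold constraints, and uses made-up function names (`min_period`, `best_skew`). Delaying the clock of the middle register B by `skew` gives the first stage that much extra time and takes it from the second.

```python
def min_period(d1: float, d2: float, skew: float = 0.0) -> float:
    """Smallest clock period (ps) for a two-stage pipeline A -> B -> C.

    d1 and d2 are the combinational delays of stages A->B and B->C.
    skew is how much later the clock reaches register B than A and C.
    Toy model: setup, clock-to-output, and hold effects are ignored.
    """
    # Stage 1 may finish `skew` later; stage 2 must finish `skew` earlier.
    return max(d1 - skew, d2 + skew)


def best_skew(d1: float, d2: float, max_skew: float) -> float:
    """Skew that balances the two stages, clamped to what the
    (hypothetical) programmable delay can provide in either direction."""
    ideal = (d1 - d2) / 2.0
    return max(-max_skew, min(max_skew, ideal))


if __name__ == "__main__":
    d1, d2 = 1200, 800            # stage delays in picoseconds (invented)
    s = best_skew(d1, d2, max_skew=500)
    print("period without skew:", min_period(d1, d2))
    print("chosen skew:", s)
    print("period with skew:", min_period(d1, d2, s))
```

In this model the slow stage borrows time from the fast one, so the achievable period moves from the slowest stage delay toward the average of the two. A real flow also has to respect hold constraints, which the abstract names as a limiting factor.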
Benjamin S. Devlin; Makoto Ikeda; Hiroshi Ueki; Kazuhiko Fukushima
We have designed and measured a completely self-synchronous 1024-bit RSA crypt-engine, fabricated in 40 nm CMOS. We implemented two modular exponentiation algorithms, high-to-low (HTL) and Montgomery power ladder (MPL), to demonstrate the performance of self-synchronous, gate-level pipelined architectures. Both implementations employ identical datapaths and use 804k transistors, differing only in the controller. Two interleaved 1024-bit cryptographic operations take from 6.1 ms to 3.1 ms for HTL and 6.0 ms for MPL at the nominal supply voltage of 1.1 V.
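The two exponentiation algorithms named in this abstract can be sketched in software. The code below is a minimal illustration of the textbook forms, assuming plain integer modular arithmetic; it does not model the chip's datapath, controller, or interleaving, and the function names are hypothetical. The high-to-low method scans exponent bits from the most significant end and multiplies only on 1 bits, while the ladder performs one multiplication and one squaring for every bit regardless of its value.

```python
def modexp_htl(base: int, exp: int, mod: int) -> int:
    """High-to-low (left-to-right) binary square-and-multiply."""
    result = 1 % mod
    for i in range(exp.bit_length() - 1, -1, -1):
        result = (result * result) % mod      # always square
        if (exp >> i) & 1:
            result = (result * base) % mod    # multiply only on 1 bits
    return result


def modexp_ladder(base: int, exp: int, mod: int) -> int:
    """Montgomery ladder: same operation count for 0 and 1 bits.

    Invariant after each step: r1 == r0 * base (mod `mod`).
    """
    r0, r1 = 1 % mod, base % mod
    for i in range(exp.bit_length() - 1, -1, -1):
        if (exp >> i) & 1:
            r0 = (r0 * r1) % mod
            r1 = (r1 * r1) % mod
        else:
            r1 = (r0 * r1) % mod
            r0 = (r0 * r0) % mod
    return r0


if __name__ == "__main__":
    # Cross-check both sketches against Python's built-in pow().
    b, e, m = 4, 13, 497
    print(modexp_htl(b, e, m), modexp_ladder(b, e, m), pow(b, e, m))
```

Because the high-to-low method does work that depends on the exponent bits and the ladder does not, one would expect the former to have data-dependent run time and the latter a fixed one. Whether that explains the timing figures quoted above is an assumption, not something the abstract states.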
Archive | 2016
Benjamin S. Devlin; Ilya K. Ganusov
Archive | 2015
Ilya K. Ganusov; Benjamin S. Devlin
Archive | 2018
Benjamin S. Devlin
Archive | 2017
Ilya K. Ganusov; Benjamin S. Devlin
Archive | 2017
Benjamin S. Devlin; Rafael C. Camarota
Archive | 2017
Jindrich Zejda; Atul Srinivasan; Ilya K. Ganusov; Walter A. Manaker; Benjamin S. Devlin; Satish Sivaswamy
Archive | 2017
Benjamin S. Devlin; Rafael C. Camarota
Archive | 2017
Ilya K. Ganusov; Benjamin S. Devlin