Valavan Manohararajah
Altera
Publication
Featured research published by Valavan Manohararajah.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006
Valavan Manohararajah; Stephen Dean Brown; Zvonko G. Vranesic
In this paper, an iterative technology-mapping tool called IMap is presented. It supports depth-oriented (area is a secondary objective), area-oriented (depth is a secondary objective), and duplication-free mapping modes. The edge-delay model (as opposed to the more commonly used unit-delay model) is used throughout. Two new heuristics are used to obtain area reductions over previously published methods. The first heuristic predicts the effects of various mapping decisions on the area of the final solution, and the second heuristic bounds the depth of the mapping solution at each node. In depth-oriented mode, when targeting five-input lookup tables (5-LUTs), IMap obtains depth-optimal solutions that are 44.4%, 19.4%, and 5% smaller than those produced by FlowMap, CutMap, and DAOMap, respectively. Targeting the same LUT size in area-oriented mode, IMap obtains solutions that are 17.5% and 9.4% smaller than those produced by duplication-free mapping and ZMap, respectively. IMap is also shown to be highly efficient. Runtime improvements of between 2.3× and 82× are obtained over existing algorithms when targeting 5-LUTs. Area and runtime results comparing IMap to the other mappers when targeting 4-LUTs and 6-LUTs are also presented.
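The difference between the unit-delay and edge-delay models mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from IMap: the network, node names, and delay values are hypothetical, and the code only computes longest-path arrival times under each model.

```python
from typing import Dict, List, Tuple

# Hypothetical mapped network: each node lists its fanins as (source, edge_delay).
# Primary inputs have no fanins. Names and delays are illustrative assumptions.
Network = Dict[str, List[Tuple[str, int]]]


def arrival_times(net: Network, unit_delay: bool = False) -> Dict[str, int]:
    """Longest-path arrival time at every node, via memoised depth-first search.

    With unit_delay=True every edge costs 1 (each LUT level adds one unit).
    Otherwise each edge contributes its own delay (edge-delay model).
    """
    memo: Dict[str, int] = {}

    def visit(node: str) -> int:
        if node not in memo:
            fanins = net[node]
            if not fanins:
                memo[node] = 0  # primary input
            else:
                memo[node] = max(
                    visit(src) + (1 if unit_delay else delay)
                    for src, delay in fanins
                )
        return memo[node]

    for name in net:
        visit(name)
    return memo


example: Network = {
    "a": [], "b": [], "c": [],
    "g1": [("a", 1), ("b", 3)],
    "g2": [("g1", 1), ("c", 2)],
}

unit = arrival_times(example, unit_delay=True)
edge = arrival_times(example)
print(unit["g2"], edge["g2"])
```

In this toy network the two models disagree about the depth of `g2`, which is why a mapper that optimizes depth has to commit to one model consistently.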
design automation conference | 2005
Deshanand P. Singh; Valavan Manohararajah; Stephen Dean Brown
In this paper, the authors present a new linear-time retiming algorithm that produces near-optimal results. The implementation is specifically targeted at Altera's Stratix FPGA-based designs, although the techniques described are general enough for any implementation medium. The algorithm is able to handle the architectural constraints of the target device, multiple timing constraints assigned by the user, and implicit legality constraints. It ensures that register moves do not create asynchronous problems such as a glitch on a clock/reset signal.
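As background for the register moves discussed in this abstract, here is a minimal sketch of the classical Leiserson-Saxe retiming formulation. It is not the linear-time algorithm of the paper, and it ignores the architectural and asynchronous-signal constraints the paper handles. The graph, node names, and the `retime` helper are hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def retime(weights: Dict[Edge, int], lag: Dict[str, int]) -> Dict[Edge, int]:
    """Apply a retiming to a circuit graph.

    weights maps each edge (u, v) to the number of registers on it.
    lag maps a node to its retiming value r (nodes not listed have r = 0).
    The retimed register count is w(u, v) + r(v) - r(u); the move is legal
    only if no edge ends up with a negative register count.
    """
    retimed = {
        (u, v): w + lag.get(v, 0) - lag.get(u, 0)
        for (u, v), w in weights.items()
    }
    bad = [e for e, w in retimed.items() if w < 0]
    if bad:
        raise ValueError(f"illegal retiming, negative registers on {bad}")
    return retimed


# One gate g with a register on each input and none on its output.
circuit = {("in1", "g"): 1, ("in2", "g"): 1, ("g", "out"): 0}

# r(g) = -1 moves one register forward across g, from its inputs to its output.
forward = retime(circuit, {"g": -1})
print(forward)
```

Trying to move a second register forward (`{"g": -2}`) raises `ValueError`, because the inputs of `g` only hold one register each.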
system-level interconnect prediction | 2006
Valavan Manohararajah; Gordon Raymond Chiu; Deshanand P. Singh; Stephen Dean Brown
This paper studies the difficulty of predicting interconnect delay in an industrial setting. Fifty industrial circuits, Altera's Quartus II CAD software, and Altera's Stratix and Stratix II FPGA architectures were used in the study. We show that there is a large amount of inherent randomness in a state-of-the-art FPGA placement algorithm. Thus, it is impossible to predict interconnect delay with a high degree of accuracy. Furthermore, we show that a simple timing model can be used to predict some aspects of interconnect timing with just as much accuracy as predictions obtained by running the placement tool itself. Finally, we examine the benefits of using the simple timing model in a timing-driven physical synthesis flow, and attempt to establish an upper bound on these possible gains, given the difficulty of interconnect delay prediction.
custom integrated circuits conference | 2005
Deshanand P. Singh; Valavan Manohararajah; Stephen Dean Brown
This paper presents an overview of an industrial physical synthesis CAD flow for FPGAs. The flow provides a performance speedup of 10%-15% for most circuits, and a significant number of circuits show a speedup of 20%-180%. We describe the algorithms used to achieve this result including: incremental retiming, BDD-based resynthesis, local rewiring, and logic replication. The effectiveness of these operations depends on the ability to accurately determine which portions of logic are timing critical at a stage of the CAD flow where there is still freedom to perform logic restructuring. We show how this problem can be effectively solved by inserting prediction and restructuring operations at multiple points of the FPGA CAD flow.
field programmable gate arrays | 2016
David Lewis; Gordon Raymond Chiu; Jeffrey Christopher Chromczak; David Galloway; Ben Gamsa; Valavan Manohararajah; Ian Milton; Tim Vanderhoek; John Curtis Van Dyken
This paper describes architectural enhancements in the Altera Stratix 10 HyperFlex FPGA architecture, fabricated in the Intel 14nm FinFET process. Stratix 10 includes ubiquitous flip-flops in the routing to enable a high degree of pipelining. In contrast to the earlier architectural exploration of pipelining in pass-transistor based architectures, the direct drive routing fabric in Stratix-style FPGAs enables an extremely low-cost pipeline register. The presence of ubiquitous flip-flops simplifies circuit retiming and improves performance. The availability of predictable retiming affects all stages of the cluster, place and route flow. Ubiquitous flip-flops require a low-cost clock network with sufficient flexibility to enable pipelining of dozens of clock domains. Different cost/performance tradeoffs in a pipelined fabric and use of a 14nm process lead to other modifications to the routing fabric and the logic element. User modification of the design enables even higher performance, averaging 2.3X faster in a small set of designs.
IEEE Transactions on Very Large Scale Integration Systems | 2007
Valavan Manohararajah; Gordon Raymond Chiu; Deshanand P. Singh; Stephen Dean Brown
This paper studies the prediction of interconnect delay in an industrial setting. Industrial circuits and two industrial field-programmable gate-array (FPGA) architectures were used in this paper. We show that there is a large amount of inherent randomness in a state-of-the-art FPGA placement algorithm. Thus, it is impossible to predict interconnect delay with a high degree of accuracy. Furthermore, we show that a simple timing model can be used to predict some aspects of interconnect timing with just as much accuracy as predictions obtained by running the placement tool itself. Using this simple timing model in a two-phase timing driven physical synthesis flow can both improve quality of results and decrease runtime. Next, we present a metric for predicting the accuracy of our interconnect delay model and show how this metric can be used to reduce the runtime of a timing driven physical synthesis flow. Finally, we examine the benefits of using the simple timing model in a timing driven physical synthesis flow, and attempt to establish an upper bound on these possible gains, given the difficulty of interconnect delay prediction.
field-programmable logic and applications | 2005
Valavan Manohararajah; Deshanand P. Singh; Stephen Dean Brown
This work explores the effect of adding a timing-driven functional decomposition step to the traditional field programmable gate array (FPGA) CAD flow. Once placement has completed, alternative decompositions of the logic on the critical path are examined for potential delay improvements. The placed circuit is then modified to use the best decompositions found. Any placement illegalities introduced by the new decompositions are resolved by an incremental placement step. Experiments conducted on a large set of industrial designs indicate that this functional decomposition technique can provide average performance improvements of 6.1% and 5.6% on Altera's Stratix and Stratix II device families, respectively.
international conference on computer aided design | 2006
Gordon Raymond Chiu; Deshanand P. Singh; Valavan Manohararajah; Stephen Dean Brown
This work describes a new mapping technique, RAM-MAP, that identifies parts of circuits that can be efficiently mapped into the synchronous embedded memories found on field programmable gate arrays (FPGAs). Previous techniques developed for mapping into asynchronous embedded memories cannot be used because modern FPGAs do not have asynchronous embedded memories. After technology mapping, an area-prediction cost function is used to guide the selection of logic cones to be placed in embedded memories. Extra logic is added to compensate for missing asynchronous functionality on the synchronous memories. Experiments conducted on Altera's Stratix device family indicate that this embedded memory mapping technique can provide an average area reduction of 6.2% and up to 32.5% on a large set of industrial designs. A small architecture change that increases the size of the FPGA fabric by 0.05% can increase the average area reduction to 14.1% and up to 59.1% on the same design set.
field-programmable logic and applications | 2006
Valavan Manohararajah; Stephen Dean Brown; Zvonko G. Vranesic
This paper presents preliminary work exploring adaptive field programmable gate arrays (AFPGAs). An AFPGA is adaptive in the sense that the functionality of subcircuits placed on the chip can change in response to changes observed on certain control signals. We describe the high-level architecture, which adds control logic and SRAM bits to a traditional FPGA to produce an AFPGA. We also describe a synthesis method that identifies and resynthesizes mutually exclusive pieces of logic so that they may share the resources available in an AFPGA. The architectural feature and its associated synthesis method help reduce circuit size by 28% on average and up to 40% on select circuits.
Archive | 2007
Ivan Blunno; Gordon Raymond Chiu; Deshanand P. Singh; Valavan Manohararajah; Stephen Dean Brown