Kasyab Parmesh Subramaniyan

Publication

Featured research published by Kasyab Parmesh Subramaniyan.

application specific systems architectures and processors | 2010

Design space exploration for an embedded processor with flexible datapath interconnect

Tung Thanh Hoang; Ulf Jälmbrant; Erik der Hagopian; Kasyab Parmesh Subramaniyan; Magnus Själander; Per Larsson-Edefors

The design of an embedded processor is dependent on the application domain. Traditionally, design solutions specific to an application domain have been available in three forms: VLIW-based DSP processors, ASICs and FPGAs; each respectively offering generality of application domain, energy efficiency and flexibility. However, while matching the application domain to the resources needed, the design space becomes huge. We present FlexTools, a tool framework built around the FlexCore architecture to evaluate performance and energy efficiency for different applications. Here we demonstrate FlexTools for design space exploration with a focus on the data-routing flexibility of the FlexCore processor, in search of energy-efficient interconnect configurations that are both cycle-count and hardware efficient. Evaluation results suggest that a well-optimized instance of a 65-nm multiplier-extended FlexCore processor datapath, obtained using FlexTools, executes nine integer EEMBC benchmarks with a 15% cycle count reduction and dissipates 17% less energy than a reference MIPS datapath.

international conference on electronics, circuits, and systems | 2009

Layout exploration of geometrically accurate arithmetic circuits

Kasyab Parmesh Subramaniyan; Emil Axelsson; Per Larsson-Edefors; Mary Sheeran

High-performance arithmetic circuits are critical to overall design performance and are therefore designed using full-custom design techniques. However, this is a time-consuming and error-prone task. We present a novel layout exploration methodology for designing arithmetic circuits using standard-cell techniques that retains competitive performance while allowing almost custom-design-level control over the layout. It uses an unconventional approach with a Haskell-based front-end in the Wired system, designed to produce logically and topologically accurate circuit descriptions and at the same time be parameterizable. A further goal of the system is to keep implementation time as low as possible. We demonstrate this methodology on HPM multipliers that exhibit a high degree of layout regularity.

ieee computer society annual symposium on vlsi | 2010

Generation and Exploration of Layouts for Area-Efficient Barrel Shifters

Alen Bardizbanyan; Kasyab Parmesh Subramaniyan; Per Larsson-Edefors

Good layout quality is very important in order to obtain efficient integrated circuits, and custom design methods are thus considered when speed, power, and area requirements are very strict. But since custom design styles require extensive and specialized development resources, automated, less optimal design methods are often chosen. Alternate methods to create efficient layouts may prove useful, especially since custom layout in future technology nodes is associated with prohibitive nonrecurring engineering (NRE) costs. The prototype layout generation environment shown in this paper allows us to define, evaluate and modify fine-grained cell placement strategies for barrel shifters in a quick manner. The three different 90-nm shifter circuit implementations demonstrated here show a performance that is on par with circuits harnessing the capabilities offered by conventional tools. Furthermore, this performance is achieved using the least possible die area. For example, a 32-bit fan-out split shifter conventionally laid out and clocked at 1.11 GHz, dissipates 0.37 mW of switching power and occupies an area of 5698 μm². The same shifter circuit placed using our environment and routed conventionally, equivalently dissipates 0.34 mW, but occupies only 4711 μm².
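The relative savings implied by the shifter figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the four numbers stated above; the helper name `relative_saving` is a hypothetical name introduced for illustration.

```python
# Figures quoted in the abstract for the 32-bit fan-out split shifter.
CONVENTIONAL = {"power_mw": 0.37, "area_um2": 5698}
GENERATED = {"power_mw": 0.34, "area_um2": 4711}


def relative_saving(before, after):
    """Fraction by which `after` is smaller than `before` (hypothetical helper)."""
    return (before - after) / before


area_saving = relative_saving(CONVENTIONAL["area_um2"], GENERATED["area_um2"])
power_saving = relative_saving(CONVENTIONAL["power_mw"], GENERATED["power_mw"])

print(f"area saving:  {area_saving:.1%}")   # about 17.3%
print(f"power saving: {power_saving:.1%}")  # about 8.1%
```

Under these figures, the generated placement occupies roughly 17% less area and dissipates roughly 8% less switching power than the conventionally laid-out shifter.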

asia symposium on quality electronic design | 2012

On regularity and integrated DFM metrics

Kasyab Parmesh Subramaniyan; Per Larsson-Edefors

Transistor geometries are well into the nanometer regime, in keeping with Moore's Law. With this scaling in geometry, problems not significant in the larger geometries have come to the fore. These problems, collectively termed variability, stem from second-order effects due to the small geometries themselves and engineering limitations in creating the small geometries. The engineering obstacles have a few solutions which are yet to be widely adopted due to cost limitations in deploying them. Addressing and mitigating variability due to second-order effects comes largely under the purview of device engineers and, to a smaller extent, design practices. Passive layout measures that ease these manufacturing limitations by regularizing the different layout pitches have been explored in the past. However, the question of the best design practice to combat systematic variations is still open. In this work we explore considerations for the regular layout of the exclusive-OR gate, the half-adder and full-adder cells implemented with varying degrees of regularity. Tradeoffs like complete interconnect unidirectionality and the inevitable introduction of vias are qualitatively analyzed and some factors affecting the analysis are presented. Finally, results from the Calibre Critical Feature Analysis (CFA) of the cells are used to evaluate the qualitative analysis.

2014 5th European Workshop on CMOS Variability (VARI) | 2014

MIDAS: Model for IP-inclusive DFM assessment of system manufacturability

Kasyab Parmesh Subramaniyan; Per Larsson-Edefors

Complex system implementations combined with the latest technology nodes allow us to implement hardware for versatile applications. The ever-increasing demand for quick time-to-market has led to the widespread use of Intellectual Property (IP) in ASIC design methodologies. These developments, in addition to manufacturing limitations, make early prediction of manufacturability for complete systems challenging. We present MIDAS: a scalable, IP-inclusive model to predict system manufacturability. Results from applying MIDAS to an embedded processor system reveal that several useful insights can be gained towards realizing yield budgets for complex systems, allowing quicker co-optimization of all implementation goals.

ieee computer society annual symposium on vlsi | 2011

Application-Specific Energy Optimization of General-Purpose Datapath Interconnect

Babak Hidaji; Salar Alipour; Kasyab Parmesh Subramaniyan; Per Larsson-Edefors

A general-purpose data path is designed for efficient execution of diverse applications. An embedded processor, typically working with a limited application domain, does not necessarily utilize the fixed, general-purpose data path interconnect efficiently. If we consider the interconnect to be a flexible resource, the data path can be fine-tuned to an application domain. The addition of an interconnect link between two data path units has the potential to reduce execution time, while the removal of an unused link can save area and power dissipation. Finding the most energy-efficient data path interconnect configuration for a software application domain is a time-consuming process, since it involves rescheduling of the targeted application(s) on different data path implementations. We present an automated optimization engine that is based on a genetic algorithm. This engine aids the designer in finding the most energy-efficient interconnect configuration of a simple processor data path. We show that an optimized data path interconnect can offer an energy saving of 38% with respect to a general-purpose data path reference, if the interconnect links are matched to the needs of one application.
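To make the idea of a genetic search over interconnect configurations concrete, here is a minimal sketch. It is not the engine described in the abstract: the bit-vector encoding, the toy energy model (cycle count times power), every constant, and the names `energy` and `evolve` are illustrative assumptions. In particular, the real flow reschedules the application for each candidate data path, which the sketch replaces with a fixed per-link cycle saving.

```python
import random

# Hypothetical toy model: every number below is an illustrative assumption,
# not data from the paper.
NUM_LINKS = 12
BASE_CYCLES = 1000.0   # cycle count with no optional links present
BASE_POWER = 1.0       # arbitrary power units for the bare data path
LINK_POWER = 0.02      # extra power drawn by each link that is present
# Cycles saved when link i is present (0 means the application never uses it).
CYCLE_SAVING = [120, 0, 80, 0, 0, 60, 0, 40, 0, 0, 30, 0]


def energy(config):
    """Toy energy estimate (cycles x power) for a bit-vector of present links."""
    cycles = BASE_CYCLES - sum(s for s, bit in zip(CYCLE_SAVING, config) if bit)
    power = BASE_POWER + LINK_POWER * sum(config)
    return cycles * power


def evolve(generations=60, pop_size=30, mutation_rate=0.05, seed=0):
    """Search for a low-energy link configuration with a simple genetic algorithm."""
    rng = random.Random(seed)
    # Start from the fully connected reference plus random configurations.
    pop = [[1] * NUM_LINKS]
    pop += [[rng.randint(0, 1) for _ in range(NUM_LINKS)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=energy)
        parents = pop[: pop_size // 2]          # truncation selection, keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, NUM_LINKS)   # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=energy)


best = evolve()
reference = [1] * NUM_LINKS
print("best configuration:", best)
print("energy relative to fully connected reference:",
      energy(best) / energy(reference))
```

Because the best half of each generation is carried over unchanged and the fully connected reference is part of the initial population, the result can never be worse than that reference under this toy model.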

international conference on electronics, circuits, and systems | 2009

Custom layout strategy for rectangle-shaped log-depth multiplier reduction tree

Patrik Kimfors; Niklas Broman; Andreas Haraldsson; Kasyab Parmesh Subramaniyan; Magnus Själander; Henrik Eriksson; Per Larsson-Edefors

Multiplier reduction trees that have a logarithmic logic depth generally exhibit poor regularity, in terms of how the gates are interconnected. Consequently, it is well known that partial-product reduction trees (PPRTs), such as Wallace, Dadda and TDM, are very difficult to lay out in a custom design flow. However, our previously proposed HPM scheme enabled the implementation of a PPRT that consistently uses regular interconnect patterns, without sacrificing the performance originating from the logarithmic logic depth. The original HPM layout is perfectly regular; however, it has a PPRT in the shape of a wide triangle. We now propose a custom layout strategy for the HPM scheme that leads to rectangle-shaped PPRTs that are more straightforward to reconcile with the need for a rectangular multiplier footprint. The proposed layout strategy is implemented and evaluated for a 16-bit 90-nm design.

conference on ph.d. research in microelectronics and electronics | 2011

FlexDEF: Development framework for processor architecture implementation and evaluation

Kasyab Parmesh Subramaniyan; Erik J Ryman; Magnus Själander; Tung Thanh Hoang; Mafijul Md. Islam; Per Larsson-Edefors

Designing a processor is a complex task that uses multiple and varied tools. The complete development cycle spans software as well as hardware design and verification. More often than not, in spite of the close dependencies between hardware and software, there is no common platform for quick and accurate testing of these dependencies. Though such systems are often employed in industry, it is not common for end-to-end frameworks to be deployed in educational and research settings. We present the FlexCore Design Exploration Framework (FlexDEF), an end-to-end tool-chain used to develop the FlexCore processor and its accompanying cache system. The tool-chain is a hierarchically linked system that spans the various development phases involved in design and verification. The processor system is intended to be a model for use in research-oriented projects where both the software and hardware are in a constant state of flux. We discuss the complete framework and the advantages in each context. Finally, we summarize the developments and discuss the future of the FlexDEF tool-chain.

international symposium on quality electronic design | 2013