Salil Raje
Northwestern University
Publication
Featured research published by Salil Raje.
international symposium on low power electronics and design | 1995
Salil Raje; Majid Sarrafzadeh
This paper presents a low-power design technique at the behavioral synthesis stage. A scheduling technique for low power is studied and a theoretical foundation is established. The equation for dynamic power, P_dyn = V_dd^2 * C_load * f_switch, is used as a basis. The voltage applied to the functional units is varied, slowing down functional-unit throughput and reducing power while still meeting the throughput constraint for the entire system. The input to our problem is an unscheduled data flow graph with a timing constraint. The goal is to establish a voltage value at which each operation of the data flow graph is performed, thereby fixing the latency of the operation, such that the total timing constraint for the system is met. We give an exact algorithm to minimize the system's power. The timing constraint for our system can be any value greater than or equal to the critical path. Experimental results for several high-level synthesis benchmarks show considerable reduction in power consumption. For tighter timing constraints, the maximum reduction is about 40% using supply voltages of 5 V and 3 V, and about 46% using supply voltages of 5 V, 3 V, and 2.4 V. For looser timing constraints, the maximum reduction is about 64% using 5 V and 3 V, and about 74% using 5 V, 3 V, and 2.4 V.
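The quadratic dependence of dynamic power on supply voltage is what makes the paper's multi-voltage scheduling worthwhile. A minimal sketch (illustrative capacitance and frequency values, not figures from the paper) showing how much a single operation saves when moved from 5 V to 3 V:

```python
# Dynamic power P_dyn = V_dd^2 * C_load * f_switch: running an operation
# at 3 V instead of 5 V cuts its power by 1 - (3/5)^2 = 64%, at the cost
# of a slower functional unit.

def dynamic_power(vdd, c_load, f_switch):
    """Dynamic power in watts for supply voltage vdd (V),
    load capacitance c_load (F), and switching frequency f_switch (Hz)."""
    return vdd ** 2 * c_load * f_switch

p_5v = dynamic_power(5.0, 1e-12, 100e6)  # operation run at 5 V
p_3v = dynamic_power(3.0, 1e-12, 100e6)  # same operation slowed to 3 V
print(f"power reduction: {1 - p_3v / p_5v:.0%}")  # 64%
```

The scheduler's job is then to spend the slack between the critical path and the timing constraint on as many such voltage reductions as possible.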
international conference on computer aided design | 2003
Maogang Wang; Abhishek Ranjan; Salil Raje
The recent past has seen a tremendous increase in the size of design circuits that can be implemented in a single FPGA. These large design sizes significantly impact cycle time due to design automation software runtimes and an increased number of performance-based iterations. New FPGA physical design approaches need to be utilized to alleviate some of these problems. Hierarchical approaches to divide and conquer the design, early estimation tools for design exploration, and physical optimizations are some of the key methodologies that have to be introduced in FPGA physical design tools. This paper investigates the loss/benefit in quality of results due to hierarchical approaches and compares and contrasts some of the design automation problem formulations and solutions needed for FPGAs versus known standard-cell ASIC approaches.
custom integrated circuits conference | 1993
Jun Dong Cho; Salil Raje; Majid Sarrafzadeh; M. Sriram; Sung-Mo Kang
A novel layer assignment algorithm for high-performance multilayer packages, such as multichip modules (MCMs), is proposed. The focus is on assigning nets to layers to minimize the crosstalk between nets, while simultaneously minimizing the number of vias and layers. A novel net interference measure based on potential crosstalk and planarity is used to construct a net interference graph (NIG), and a new graph coloring and permutation algorithm is used to find an interference-minimized subset in each layer and a minimum crosstalk between layers. Theoretical and experimental results on this multilayer assignment approach are presented. The proposed maximum linear permutation heuristic is very robust and allows the incorporation of various design constraints (e.g., crosstalk, crossover, and critical area) and cost criteria.
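The core idea of coloring a net interference graph so that interfering nets land on different layers can be illustrated with a simple greedy coloring (this is a generic sketch, not the paper's maximum linear permutation heuristic; net names are hypothetical):

```python
# Build a net-interference graph where an edge marks a net pair with
# potential crosstalk, then greedily color it: each color is a layer,
# and no two interfering nets may share a layer.

def assign_layers(nets, interferes):
    """nets: iterable of net ids; interferes: set of frozensets {a, b}.
    Returns {net: layer} with interfering nets on different layers."""
    layer_of = {}
    for net in nets:
        # layers already taken by interfering, already-placed nets
        used = {layer_of[other] for other in layer_of
                if frozenset((net, other)) in interferes}
        layer = 0
        while layer in used:
            layer += 1
        layer_of[net] = layer
    return layer_of

conflicts = {frozenset(p) for p in [("n1", "n2"), ("n2", "n3")]}
print(assign_layers(["n1", "n2", "n3"], conflicts))
# n1 and n3 can share layer 0; n2 is forced onto layer 1
```

The paper's contribution is in how the interference measure is defined (combining crosstalk and planarity) and in finding large interference-free subsets per layer, which a plain greedy coloring does not attempt.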
international symposium on circuits and systems | 1999
Majid Sarrafzadeh; Salil Raje
This paper presents a low-power design technique at the behavioral synthesis stage. Multiple voltages are used to run the functional units. The input to our problem is an unscheduled data flow graph with a timing constraint and a resource constraint. The resource constraint is given as the number and type of each of the functional units to be used and the supply voltage at which each would run. The goal is to maximize the number of operation nodes in the flow graph that are mapped to functional units running at the lower supply voltage while still satisfying the timing constraints. We propose two algorithms: the first is a dynamic programming algorithm that gives optimal (i.e., minimum power consumption) results for the two-voltage problem; the second is a heuristic geometric scheduling algorithm. Experimental results for several high-level synthesis benchmarks show a 13% to 32% reduction in power consumption.
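The flavor of the two-voltage dynamic program can be conveyed with a much-simplified stand-in: here the data flow graph is reduced to a single chain of operations (real inputs are DAGs), and each operation takes more cycles at the low voltage. All delays and the deadline below are made-up values.

```python
# Simplified two-voltage scheduling DP for a chain of operations:
# each op takes d_hi cycles at the high voltage or d_lo (> d_hi) cycles
# at the low voltage; maximize the number of low-voltage ops while the
# chain still finishes within the deadline.

def max_low_voltage_ops(delays, deadline):
    """delays: list of (d_hi, d_lo) per operation; deadline: cycle budget.
    Returns the max number of ops that can run at the low voltage."""
    # best[t] = max low-voltage ops using exactly t cycles (-1 = unreachable)
    best = [-1] * (deadline + 1)
    best[0] = 0
    for d_hi, d_lo in delays:
        nxt = [-1] * (deadline + 1)
        for t, count in enumerate(best):
            if count < 0:
                continue
            if t + d_hi <= deadline:                     # run at high voltage
                nxt[t + d_hi] = max(nxt[t + d_hi], count)
            if t + d_lo <= deadline:                     # run at low voltage
                nxt[t + d_lo] = max(nxt[t + d_lo], count + 1)
        best = nxt
    return max(best)

# slack of 3 cycles over the all-high-voltage schedule lets 2 of 3 ops slow down
print(max_low_voltage_ops([(1, 2), (1, 3), (2, 4)], deadline=7))  # 2
```

The paper's actual algorithm additionally handles DAG precedence constraints and per-unit resource limits, which this chain version deliberately omits.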
international symposium on physical design | 2004
Taraneh Taghavi; Soheil Ghiasi; Abhishek Ranjan; Salil Raje; Majid Sarrafzadeh
The recent past has seen a tremendous increase in the size of design circuits that can be implemented in a single FPGA. The size and complexity of modern FPGAs has far outpaced the innovations in FPGA physical design. The problems faced by FPGA designers are similar in nature to those that preoccupy ASIC designers, namely, interconnect delays and design management. However, this paper will show that a simple re-targeting of ASIC physical design methodologies and algorithms to the FPGA domain will not suffice. We will show that several well-researched problems in the ASIC world need new problem formulations and algorithms research to be useful for today's FPGAs. Partitioning, floorplanning, placement, and delay estimation schemes are only some of the topics that need a complete overhaul. We will give problem formulations, motivated by experimental results, for some of these topics as applicable in the FPGA domain.
european design and test conference | 1996
Reinaldo A. Bergamaschi; Salil Raje
One of the main problems in high-level synthesis has been the lack of verification techniques for checking the equivalence between the behavioral specification and the scheduled implementation. Due to scheduling, it may not be possible to compare simulation results before and after high-level synthesis using the same simulation drivers. Given that simulation is the most time-consuming step in the design process, this severely reduces the advantages of high-level synthesis. This paper presents techniques and algorithms for comparing simulation results using the same simulation drivers. The approach is based on creating special hardware structures in the implementation and comparing the simulations only at synchronization points called observable time windows.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2002
Padmini Gopalakrishnan; Altan Odabasioglu; Lawrence T. Pileggi; Salil Raje
Traditional integrated-circuit (IC) design methodologies have used wire-load models during logic synthesis to estimate the expected impact of the metal wiring on the gate delays. These models are based on wire-length statistics from legacy designs to facilitate a top-down IC design flow process. Recently, there has been increased concern regarding the efficacy of wire-load models as deep-submicrometer (DSM) interconnect parasitics begin to dominate the delay of digital IC logic gates. Some technology projections (Sylvester and Keutzer, 1998) have suggested that wire-load models will remain effective to block sizes on the order of 50 000 gates. This suggests that existing top-down synthesis methodologies will not have to be changed substantially since this is approximately the maximum size for which logic synthesis is effective. However, our analyses on production designs show that the problem is not quite so straightforward and the efficacy of synthesis using wire-load models depends upon technology data as well as specific characteristics of the design and the granularity of available physical information. We analyze these effects and dependencies in detail in this paper and draw some conclusions regarding the future challenges associated with top-down IC design and block synthesis, in particular, in the DSM design era.
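A wire-load model of the kind the paper analyzes maps a net's fanout to an expected wire length via statistics from past designs, then converts length to parasitics. A hedged sketch of that mechanism, with an entirely made-up fanout table and unit capacitance (not data from the paper or any real library):

```python
# Illustrative wire-load model: expected wire length is looked up by
# fanout from statistics over legacy designs of similar block size,
# then converted to an estimated net capacitance.

FANOUT_TO_LENGTH_UM = {1: 20.0, 2: 45.0, 3: 75.0, 4: 110.0}  # assumed stats
CAP_PER_UM_FF = 0.2                                          # assumed unit cap

def estimated_wire_cap_ff(fanout):
    """Estimated wire capacitance (fF) for a net of the given fanout."""
    if fanout in FANOUT_TO_LENGTH_UM:
        length = FANOUT_TO_LENGTH_UM[fanout]
    else:
        # extrapolate beyond the table using the slope of its last segment
        length = 110.0 + (fanout - 4) * (110.0 - 75.0)
    return length * CAP_PER_UM_FF

print(estimated_wire_cap_ff(2))  # 9.0 fF from the table
print(estimated_wire_cap_ff(6))  # 36.0 fF by extrapolation
```

The paper's point is that such statistical estimates diverge from actual post-layout parasitics in deep-submicrometer technologies, and that how badly they diverge depends on the technology and the design itself.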
international symposium on physical design | 2001
Padmini Gopalakrishnan; Altan Odabasioglu; Lawrence T. Pileggi; Salil Raje
The advent of deep sub-micron technologies has created a number of problems for existing design methodologies. Most prominent among them is the problem of timing closure, whereby design time is dramatically increased due to iterations between gate-level synthesis and physical design. It is well known that the heart of this problem lies in the use of wireload models based on wirelength statistics from legacy designs. Some technology projections have suggested that wireload models will remain effective to block sizes on the order of 50k gates. This suggests that synthesis will not have to be changed much, since this is approximately the maximum size for which logic synthesis is effective. However, our analyses on production designs show that the problem is not quite so straightforward, and the efficacy of synthesis using wireload models depends upon technology data as well as specific characteristics of the design. We analyze these effects and dependencies in detail in this paper, and draw some conclusions about the amount of physical information that is required for synthesis to be effective. Finally, we discuss the implications on hierarchical design flows, and propose a solution via physical prototyping.
field programmable gate arrays | 2004
Navaratnasothie Selvakkumaran; Abhishek Ranjan; Salil Raje; George Karypis
As chip densities increase, modern FPGAs offer large capacity and increasingly provide heterogeneous units such as multipliers, processor/DSP cores, and RAM blocks for efficient execution of crucial functions of the design. Hypergraph partitioning algorithms are generally used as a divide-and-conquer strategy during synthesis and placement. Partitioning algorithms for designs with heterogeneous resources need to not only minimize the cut but also balance the individual types of resources. Unfortunately, the state-of-the-art multilevel hypergraph partitioning algorithms (hMetis, MLPart) are not capable of distinguishing the types of cells. To overcome this problem, we developed a new set of multilevel hypergraph partitioning algorithms that are aware of multiple resources and are guaranteed to balance the utilization of different resources. By evaluating these algorithms on large benchmarks, we found that it is possible to achieve such feasible partitions while incurring only a slightly higher cut (3.3%-5.7%) compared to the infeasible partitions generated by hMetis.
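The feasibility condition the paper's partitioner enforces, as distinct from ordinary single-balance partitioning, is that every resource type must be balanced separately. A minimal sketch of that check (cell names, resource types, and the tolerance are hypothetical, and this checks a partition rather than producing one):

```python
from collections import Counter

# A bipartition of an FPGA netlist is feasible only if *each* resource
# type (LUTs, multipliers, RAM blocks, ...) is split evenly on its own,
# not just the total cell count.

def is_balanced(cells, part, tolerance=0.1):
    """cells: {cell: resource_type}; part: {cell: 0 or 1}.
    True iff every resource type splits within `tolerance` of 50/50."""
    totals = Counter(cells.values())
    side0 = Counter(rtype for cell, rtype in cells.items() if part[cell] == 0)
    return all(abs(side0[r] / totals[r] - 0.5) <= tolerance for r in totals)

cells = {"l1": "lut", "l2": "lut", "m1": "mult", "m2": "mult"}
print(is_balanced(cells, {"l1": 0, "l2": 1, "m1": 0, "m2": 1}))  # True
print(is_balanced(cells, {"l1": 0, "l2": 1, "m1": 0, "m2": 0}))  # False: both mults on side 0
```

A type-blind partitioner like plain hMetis can pass the total-count balance while failing this per-type check, which is exactly the infeasibility the paper's algorithms are designed to avoid.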
european symposium on algorithms | 1994
Jun Dong Cho; Salil Raje; Majid Sarrafzadeh
Given arbitrary positive weights associated with edges, the maximum cut problem is to find a cut of maximum cardinality (or weight, in general) that partitions the graph G into X and X̄. Our maxcut approximation algorithm runs in O(e+n) sequential time, yielding a node-balanced maxcut of size at least ⌊(e+e/n)/2⌋, improving the time complexity of O(e log e) known before. Employing a height-balanced binary decomposition, an O(e+n log k) time algorithm is devised for the maxcut k-coloring problem, which always finds a k-partition of vertices such that the number of bad edges (or "defected" edges, with the same color on both endpoints) does not exceed ⌈(e/k)((n−1)/n)h⌉, where h=⌈log2 k⌉, thus improving both the time complexity O(enk) and the bound ⌊e/k⌋ known before. The bound on maxcut k-coloring is also extended to find an approximation bound for the maximum k-covering problem. The relative simplicity of the algorithms and their computational economy are both keys to their practical applications. The proposed algorithms have a number of applications, for example, in VLSI design…
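The idea of a node-balanced maxcut can be illustrated with a simple greedy pass over the vertices (a generic sketch, not the paper's O(e+n) algorithm or its ⌊(e+e/n)/2⌋ guarantee):

```python
# Greedy node-balanced maxcut: place each vertex on the side where more
# of its already-placed neighbours are on the opposite side, while
# keeping the two sides within one vertex of each other in size.

def greedy_maxcut(n, edges):
    """n vertices 0..n-1; edges: list of (u, v) pairs.
    Returns (side, cut_size) with abs(|X| - |X̄|) <= 1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = [-1] * n
    counts = [0, 0]
    for v in range(n):
        # already-placed neighbours on each side
        to = [sum(1 for u in adj[v] if side[u] == s) for s in (0, 1)]
        # prefer the side opposite more neighbours, subject to balance
        pick = 0 if to[1] >= to[0] else 1
        if counts[pick] > counts[1 - pick]:
            pick = 1 - pick
        side[v] = pick
        counts[pick] += 1
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut

side, cut = greedy_maxcut(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(cut)  # 4: the 4-cycle is bipartite, so every edge can be cut
```

Unlike this heuristic, the paper's algorithm carries a provable lower bound on the cut size while keeping the same linear-time flavor.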