
Andrew B. Kahng

University of California, San Diego


Publication

Featured research published by Andrew B. Kahng.

intelligent robots and systems | 1995

Cooperative mobile robotics: antecedents and directions

Y.U. Cao; Alex S. Fukunaga; Andrew B. Kahng; F. Meng

There has been increased research interest in systems composed of multiple autonomous mobile robots exhibiting cooperative behavior. Groups of mobile robots are constructed, with an aim to studying such issues as group architecture, resource conflict, origin of cooperation, learning, and geometric problems. As yet, few applications of cooperative robotics have been reported, and supporting theory is still in its formative stages. In this paper, we give a critical survey of existing works and discuss open problems in this field, emphasizing the various theoretical issues that arise in the study of cooperative robotics. We describe the intellectual heritages that have guided early research, as well as possible additions to the set of existing motivations.

international conference on computer aided design | 1991

Fast spectral methods for ratio cut partitioning and clustering

Lars W. Hagen; Andrew B. Kahng

Partitioning of circuit netlists in VLSI design is considered. It is shown that the second smallest eigenvalue of a matrix derived from the netlist gives a provably good approximation of the optimal ratio cut partition cost. It is also demonstrated that fast Lanczos-type methods for the sparse symmetric eigenvalue problem are a robust basis for computing heuristic ratio cuts based on the eigenvector of this second eigenvalue. Effective clustering methods are an immediate by-product of the second eigenvector computation and are very successful on the difficult input classes proposed in the CAD literature. The intersection graph representation of the circuit netlist is also considered as a basis for partitioning, and a heuristic based on spectral ratio cut partitioning of the netlist intersection graph is proposed. The partitioning heuristics were tested on industry benchmark suites, and the results were good in terms of both solution quality and runtime. Several types of algorithmic speedups and directions for future work are discussed.
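To make the idea in this abstract concrete, here is a minimal sketch of a spectral ratio cut heuristic, under several assumptions that are not from the paper: the netlist is simplified to a small undirected graph given as an adjacency matrix, the second eigenvector is approximated with plain power iteration rather than a Lanczos-type method, and the function names `fiedler_vector` and `spectral_ratio_cut` are hypothetical. The ratio cut cost of a split (U, W) is taken as cut(U, W) / (|U| · |W|).

```python
import math

def fiedler_vector(adj, iters=2000):
    """Approximate the eigenvector of the second-smallest eigenvalue of the
    graph Laplacian L = D - A. Assumption: power iteration on (shift*I - L)
    with the constant vector projected out, not a Lanczos solver."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    shift = 2.0 * max(deg) + 1.0  # exceeds the largest eigenvalue of L
    x = [math.sin(i + 1.0) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]  # remove the constant (eigenvalue 0) component
        # y = (shift*I - L) x = (shift - deg) * x + A x
        y = [(shift - deg[i]) * x[i] + sum(adj[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

def spectral_ratio_cut(adj):
    """Sort vertices by their eigenvector entry and return the prefix split
    with the smallest ratio cut cost, as (set_of_vertices, cost)."""
    n = len(adj)
    vec = fiedler_vector(adj)
    order = sorted(range(n), key=lambda i: vec[i])
    best_cost, best_side = None, None
    for k in range(1, n):
        side = set(order[:k])
        cut = sum(adj[i][j] for i in side for j in range(n) if j not in side)
        cost = cut / (k * (n - k))
        if best_cost is None or cost < best_cost:
            best_cost, best_side = cost, side
    return best_side, best_cost

# Two triangles joined by a single edge (2-3).
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
side, cost = spectral_ratio_cut(adj)
print(sorted(side), cost)
```

On this toy graph the sweep separates the two triangles, cutting only the bridging edge. A real netlist is a hypergraph and would need a graph model (for example a clique or intersection graph representation) before this kind of sketch applies.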

design, automation, and test in europe | 2009

ORION 2.0: a fast and accurate NoC power and area model for early-stage design space exploration

Andrew B. Kahng; Bin Li; Li-Shiuan Peh; Kambiz Samadi

As industry moves towards many-core chips, networks-on-chip (NoCs) are emerging as the scalable fabric for interconnecting the cores. With power now the first-order design constraint, early-stage estimation of NoC power has become crucially important. ORION was amongst the first NoC power models released, and has since been fairly widely used for early-stage power estimation of NoCs. However, when validated against recent NoC prototypes - the Intel 80-core Teraflops chip and the Intel scalable communications core (SCC) chip - we saw significant deviation that can lead to erroneous NoC design choices. This prompted our development of ORION 2.0, an extensive enhancement of the original ORION models which includes completely new subcomponent power models, area models, as well as improved and updated technology models. Validation against the two Intel chips confirms a substantial improvement in accuracy over the original ORION. A case study with these power models plugged within the COSI-OCC NoC design space exploration tool confirms the need for, and value of, accurate early-stage NoC power estimation. To ensure the longevity of ORION 2.0, we will be releasing it wrapped within a semi-automated flow that automatically updates its models as new technology files become available.

Integration | 1995

Recent directions in netlist partitioning: a survey

Charles J. Alpert; Andrew B. Kahng

This survey describes research directions in netlist partitioning during the past two decades in terms of both problem formulations and solution approaches. We discuss the traditional min-cut and ratio cut bipartitioning formulations along with multi-way extensions and newer problem formulations, e.g., constraint-driven partitioning (for FPGAs) and partitioning with module replication. Our discussion of solution approaches is divided into four major categories: move-based approaches, geometric representations, combinatorial formulations, and clustering approaches. Move-based algorithms iteratively explore the space of feasible solutions according to a neighborhood operator; such methods include greedy, iterative exchange, simulated annealing, and evolutionary algorithms. Algorithms based on geometric representations embed the circuit netlist in some type of “geometry”, e.g., a 1-dimensional linear ordering or a multi-dimensional vector space; the embeddings are commonly constructed using spectral methods. Combinatorial methods transform the partitioning problem into another type of optimization, e.g., based on network flows or mathematical programming. Finally, clustering algorithms merge the netlist modules into many small clusters; we discuss methods which combine clustering with existing algorithms (e.g., two-phase partitioning). The paper concludes with a discussion of benchmarking in the VLSI CAD partitioning literature and some perspectives on more promising directions for future work.

design automation conference | 2000

Can recursive bisection alone produce routable placements?

Andrew Caldwell; Andrew B. Kahng; Igor L. Markov

This work focuses on congestion-driven placement of standard cells into rows in the fixed-die context. We summarize the state-of-the-art after two decades of research in recursive bisection placement and implement a new placer, called Capo, to empirically study the achievable limits of the approach. From among recently proposed improvements to recursive bisection, Capo incorporates a leading-edge multilevel min-cut partitioner [7], techniques for partitioning with small tolerance [8], optimal min-cut partitioners and end-case min-wirelength placers [5], previously unpublished partitioning tolerance computations, and block splitting heuristics. On the other hand, our “good enough” implementation does not use “overlapping” [17], multi-way partitioners [17, 20], analytical placement, or congestion estimation [24, 35]. In order to run on recent industrial placement instances, Capo must take into account fixed macros, power stripes and rows with different allowed cell orientations. Capo reads industry-standard LEF/DEF, as well as formats of the GSRC bookshelf for VLSI CAD algorithms [6], to enable comparisons on available placement instances in the fixed-die regime. Capo clearly demonstrates that despite a potential mismatch of objectives, improved mincut bisection can still lead to improved placement wirelength and congestion. Our experiments on recent industrial benchmarks fail to give a clear answer to the question in the title of this paper. However, they validate a series of improvements to recursive bisection and point out a need for transparent congestion management techniques that do not worsen the wirelength of already routable placements. Our experimental flow, which validates fixed-die placement results by violation-free detailed auto-routability, provides a new norm for comparison of VLSI placement implementations.

IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 1992

Zero skew clock routing with minimum wirelength

Ting-hai Chao; Yu-Chin Hsu; Jan-Ming Ho; Andrew B. Kahng

The deferred-merge embedding (DME) algorithm, which embeds any given connection topology to create a clock tree with zero skew while minimizing total wirelength, is presented. The algorithm always yields exact zero skew trees with respect to the appropriate delay model. Experimental results show an 8% to 15% wire length reduction over some previous constructions. The DME algorithm may be applied to either the Elmore or linear delay model, and yields optimal total wirelength for linear delay. DME is a very fast algorithm, running in time linear in the number of synchronizing elements. A unified BB+DME algorithm, which constructs a clock tree topology using a top-down balanced bipartition (BB) approach and then applies DME to that topology, is also presented. The experimental results indicate that both the topology generation and embedding components of the methodology are necessary for effective clock tree construction.
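As a small illustration of the zero-skew condition under the linear delay model, the sketch below balances the two wire lengths when two subtrees are merged. It is not the DME algorithm itself, which works with merging segments in the Manhattan plane; it assumes delay equals wire length, treats the two subtree roots as a distance `d` apart, and uses the hypothetical helper name `zero_skew_split`.

```python
def zero_skew_split(t1, t2, d):
    """Return wire lengths (e1, e2) from a merge point to two subtree roots
    so that t1 + e1 == t2 + e2 under a linear (length-proportional) delay
    model. t1, t2 are the subtree delays and d is the distance between roots.
    Assumption: unit delay per unit wire length."""
    if abs(t1 - t2) <= d:
        # The merge point lies between the roots: e1 + e2 == d.
        e1 = (d + t2 - t1) / 2.0
        return e1, d - e1
    # The delays differ by more than d: place the merge point at the slower
    # root and lengthen (detour) the wire to the faster subtree.
    if t1 > t2:
        return 0.0, float(t1 - t2)
    return float(t2 - t1), 0.0

e1, e2 = zero_skew_split(2, 4, 6)
print(e1, e2)  # both sinks see the same total delay: 2 + e1 == 4 + e2
```

When the delay imbalance exceeds the distance between the roots, the total wire used is larger than `d`, which is the case where extra wire is needed to keep skew at zero.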

design automation conference | 2004

Selective gate-length biasing for cost-effective runtime leakage control

Puneet Gupta; Andrew B. Kahng; Puneet Sharma; Dennis Sylvester

With process scaling, leakage power reduction has become one of the most important design concerns. Multi-threshold techniques have been used to reduce runtime leakage power without sacrificing performance. In this paper, we propose small biases of transistor gate-length to further minimize power in a manufacturable manner. Unlike multi-Vth techniques, gate-length biasing requires no additional masks and may be performed at any stage in the design process. Our results show that gate-length biasing effectively reduces leakage power by up to 25% with less than 4% delay penalty. We show the feasibility of the technique in terms of manufacturability and pin-compatibility for post-layout power optimization. We also show up to 54% reduction in leakage uncertainty due to inter-die process variation in circuits when biased gate-lengths, versus only unbiased ones, are used. Selectively biased circuits show much less sensitivity to both intra-die and inter-die variations.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1997

An analytical delay model for RLC interconnects

Andrew B. Kahng; Sudhakar Muddu

Elmore delay has been widely used to estimate interconnect delays in the performance-driven synthesis and layout of very-large-scale-integration (VLSI) routing topologies. For typical RLC interconnections, however, Elmore delay can deviate significantly from SPICE-computed delay, since it is independent of inductance of the interconnect and rise time of the input signal. Here, we develop an analytical delay model based on first and second moments to incorporate inductance effects into the delay estimate for interconnection lines under step input. Delay estimates using our analytical model are within 15% of SPICE-computed delay across a wide range of interconnect parameter values. We also extend our delay model for estimation of source-sink delays in arbitrary interconnect trees. We observe significant improvement in the accuracy of delay estimates for interconnect trees when compared to the Elmore model, yet our estimates are as easy to compute as Elmore delay. Evaluation of our analytical models is several orders of magnitude faster than simulation using SPICE. We also illustrate the application of our model in controlling response undershoot/overshoot and reducing interconnect delay through constraints on the moments.
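For reference, the Elmore delay that this abstract uses as its baseline can be sketched for an RC tree as follows. This is only the baseline, not the paper's two-moment RLC model; the tree encoding (a `parent` array with the driver at node 0 and every node numbered after its parent) and the function name `elmore_delays` are assumptions made for the example.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the root (node 0) to every node of an RC tree.
    parent[i]: index of node i's parent (None for the root);
    res[i]:    resistance of the wire from parent[i] to i (res[0] unused);
    cap[i]:    capacitance lumped at node i.
    Assumption: nodes are numbered so that parent[i] < i."""
    n = len(parent)
    # Total capacitance in the subtree rooted at each node.
    down = list(cap)
    for i in range(n - 1, 0, -1):
        down[parent[i]] += down[i]
    # Each wire contributes its resistance times the capacitance it drives.
    delay = [0.0] * n
    for i in range(1, n):
        delay[i] = delay[parent[i]] + res[i] * down[i]
    return delay

# Node 0 drives node 1, which branches to nodes 2 and 3.
parent = [None, 0, 1, 1]
res = [0.0, 1.0, 2.0, 3.0]
cap = [0.0, 1.0, 1.0, 1.0]
print(elmore_delays(parent, res, cap))  # [0.0, 3.0, 5.0, 6.0]
```

The result depends only on resistances and capacitances, which is the limitation the abstract points to: inductance and input rise time never enter the estimate.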

design automation conference | 1997

Multilevel circuit partitioning

Charles J. Alpert; Jen-Hsin Huang; Andrew B. Kahng

Recent work has illustrated the promise of multilevel approaches for partitioning large circuits. Multilevel partitioning recursively clusters the instance until its size is smaller than a given threshold, then unclusters the instance while applying a partitioning refinement algorithm. Our multilevel partitioner uses a new technique to control the number of levels in the matching-based clustering phase and also exploits recent innovations in classic iterative partitioning. Our heuristic outperforms numerous existing bipartitioning heuristics, with improvements ranging from 6.9% to 27.9% for 100 runs and 3.0% to 20.6% for just 10 runs (while also using less CPU time).
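One level of matching-based clustering, the coarsening step this abstract refers to, might look roughly like the sketch below. It assumes a weighted graph rather than a hypergraph netlist, pairs each unmatched vertex with its heaviest-edge unmatched neighbor, and does not reproduce the paper's technique for controlling the number of levels; `match_clusters` is a hypothetical name.

```python
def match_clusters(n, edges):
    """One coarsening level: greedily pair each unmatched vertex with the
    unmatched neighbor connected by the heaviest edge.
    edges: dict mapping (u, v) vertex pairs to positive weights.
    Returns a list giving the cluster id of every vertex."""
    nbrs = {v: [] for v in range(n)}
    for (u, v), w in edges.items():
        nbrs[u].append((w, v))
        nbrs[v].append((w, u))
    cluster = [None] * n
    next_id = 0
    for u in range(n):
        if cluster[u] is not None:
            continue  # already absorbed into an earlier pair
        cluster[u] = next_id
        free = [(w, v) for w, v in nbrs[u] if cluster[v] is None]
        if free:
            _, v = max(free)  # heaviest edge to a still-unmatched neighbor
            cluster[v] = next_id
        next_id += 1
    return cluster

edges = {(0, 1): 1, (0, 2): 5, (1, 3): 2, (2, 3): 1}
print(match_clusters(4, edges))  # [0, 1, 0, 1]
```

Applying this repeatedly to the clustered graph until it falls below a size threshold, then partitioning and refining while unclustering, gives the overall multilevel flow the abstract describes.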

IEEE Transactions on Very Large Scale Integration Systems | 2012

ORION 2.0: A Power-Area Simulator for Interconnection Networks

Andrew B. Kahng; Bin Li; Li-Shiuan Peh; Kambiz Samadi

As industry moves towards multicore chips, networks-on-chip (NoCs) are emerging as the scalable fabric for interconnecting the cores. With power now the first-order design constraint, early-stage estimation of NoC power has become crucially important. In this work, we present ORION 2.0, an enhanced NoC power and area simulator, which offers significant accuracy improvement relative to its predecessor, ORION 1.0.
