Xueqian Zhao

Michigan Technological University

Publication

Featured research published by Xueqian Zhao.

International Conference on Computer-Aided Design | 2011

Power grid analysis with hierarchical support graphs

Xueqian Zhao; Jia Wang; Zhuo Feng; Shiyan Hu

It is increasingly challenging to analyze present-day large-scale power delivery networks (PDNs) due to the drastically growing complexity of power grid design. To achieve greater runtime and memory efficiency, a variety of preconditioned iterative algorithms have been investigated in the past few decades with promising performance, while incremental power grid analysis has also become popular for fast re-simulation of corrected designs. Although existing preconditioned solvers, such as those using incomplete matrix factor-based preconditioners, are usually memory-efficient, their convergence behavior is not always satisfactory. In this work, we present a novel hierarchical support-graph preconditioned iterative algorithm that constructs preconditioners by generating spanning trees in power supply networks for fast power grid analysis. The support-graph preconditioner handles complex power grid structures (regular or irregular grids) efficiently and can facilitate very fast incremental analysis. Our experimental results on IBM power grid benchmarks show that, compared with the best direct or iterative solvers, the proposed support-graph preconditioned iterative solver achieves up to 3.6X speedups for DC analysis and up to 22X speedups for incremental analysis, while reducing memory consumption by a factor of four.
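The general idea of a spanning-tree (support-graph) preconditioner can be illustrated with a small sketch. This is not the paper's hierarchical algorithm: the 6 x 6 mesh, the edge conductances, the pad conductances, the load currents, and all function names below are assumptions made for illustration, and the tree system is solved with a dense Cholesky factorization only to keep the example short (a practical solver would exploit the tree structure to solve it in near-linear time).

```python
def grid_edges(n):
    """Edges (u, v, conductance) of an n x n resistive mesh with non-uniform weights."""
    edges = []
    for r in range(n):
        for c in range(n):
            u = r * n + c
            if c + 1 < n:
                edges.append((u, u + 1, 1.0 + u % 3))
            if r + 1 < n:
                edges.append((u, u + n, 1.0 + u % 2))
    return edges

def max_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm with union-find, preferring high-conductance edges."""
    parent = list(range(num_nodes))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

def build_matrix(num_nodes, edges, pads):
    """Dense conductance matrix: graph Laplacian plus pad conductances to ground."""
    A = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v, w in edges:
        A[u][u] += w
        A[v][v] += w
        A[u][v] -= w
        A[v][u] -= w
    for node, w in pads.items():
        A[node][node] += w
    return A

def cholesky(A):
    """Lower-triangular L with A = L L^T (A must be symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s ** 0.5 if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve L L^T x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(A, v):
    return [dot(row, v) for row in A]

def pcg(A, b, solve_M, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient; solve_M(r) applies the inverse preconditioner."""
    x = [0.0] * len(b)
    r = b[:]
    z = solve_M(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        z = solve_M(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

n = 6
N = n * n
edges = grid_edges(n)
pads = {0: 10.0, N - 1: 10.0}           # assumed pad conductances at two corners
A = build_matrix(N, edges, pads)        # full power grid matrix
tree = max_spanning_tree(N, edges)      # support graph: N - 1 of the original edges
L = cholesky(build_matrix(N, tree, pads))
b = [1e-3] * N                          # assumed uniform load currents
x, iters = pcg(A, b, lambda r: chol_solve(L, r))
residual = max(abs(ax - bi) for ax, bi in zip(matvec(A, x), b))
print(iters, residual)
```

Because the preconditioner differs from the full grid matrix only by the off-tree edges, its quality depends on which edges the tree keeps; the sketch simply keeps the highest-conductance ones.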

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Robust Parallel Preconditioned Power Grid Simulation on GPU With Adaptive Runtime Performance Modeling and Optimization

Zhuo Feng; Xueqian Zhao; Zhiyu Zeng

Leveraging the power of today's graphics processing units (GPUs) for robust power grid simulation remains a challenging task. Existing preconditioned iterative methods that require incomplete matrix factorizations cannot be effectively accelerated on a GPU because of its limited hardware resources and its data-parallel computing model. This paper presents an efficient GPU-based multigrid preconditioning algorithm for robust power grid analysis. By combining the fast geometric multigrid solver with the robust Krylov-subspace iterative solver, power grid DC and transient analysis can be performed efficiently on GPU without loss of accuracy (largest errors <0.5 mV). Unlike previous GPU-based algorithms that rely on good power grid regularity, the proposed algorithm can be applied to more general power grid structures. We also propose an accuracy-aware GPU performance modeling and optimization framework to automatically obtain the best power grid simulation configurations. Experimental results show that DC and transient analysis on GPU can achieve more than 25X speedups over the best available CPU-based solvers.

Design Automation Conference | 2011

Fast multipole method on GPU: tackling 3-D capacitance extraction on massively parallel SIMD platforms

Xueqian Zhao; Zhuo Feng

To facilitate full-chip capacitance extraction, field solvers are typically deployed for characterizing capacitance libraries for various interconnect structures and configurations. In the past decades, various algorithms for accelerating boundary element methods (BEM) have been developed to improve the efficiency of field solvers for capacitance extraction. This paper presents the first massively parallel capacitance extraction algorithm, FMMGpu, which accelerates the well-known fast multipole methods (FMM) on modern graphics processing units (GPUs). We propose GPU-friendly data structures and SIMD parallel algorithm flows to facilitate FMM-based 3-D capacitance extraction on GPU. Effective GPU performance modeling methods are also proposed to properly balance the workload of each critical kernel in our FMMGpu implementation by taking advantage of the latest Fermi GPUs' concurrent kernel execution on streaming multiprocessors (SMs). Our experimental results show that FMMGpu brings 22X to 30X speedups in capacitance extraction for various test cases. We also show that even for small test cases that may not fully utilize GPU hardware resources, the proposed cube clustering and workload balancing techniques can bring 20% to 60% extra performance improvements.

Design Automation Conference | 2013

TinySPICE: a parallel SPICE simulator on GPU for massively repeated small circuit simulations

Lengfei Han; Xueqian Zhao; Zhuo Feng

In today's variation-aware IC designs, cell characterization and SRAM yield analysis require many thousands or even millions of repeated SPICE simulations of relatively small nonlinear circuits. In this work, we present a massively parallel SPICE simulator on GPU, TinySPICE, for efficiently analyzing small nonlinear circuits, such as standard cell designs and SRAMs. To gain high accuracy and efficiency, we present GPU-based parametric three-dimensional (3D) LUTs for fast device evaluations. A series of GPU-friendly data structures and algorithm flows have been proposed in TinySPICE to fully utilize the GPU hardware resources and minimize data communication between the GPU and CPU. Our GPU implementation runs a large number of small circuit simulations in the GPU's shared memory, using novel circuit linearization and matrix solution techniques, and eliminates most of the GPU device memory accesses during the Newton-Raphson (NR) iterations, which enables extremely high-throughput SPICE simulations on GPU. Compared with the CPU-based TinySPICE simulator, GPU-based TinySPICE achieves up to 138X speedups for parametric SRAM yield analysis without loss of accuracy.
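To make the combination of lookup-table device evaluation and Newton-Raphson iteration concrete, here is a minimal CPU-side sketch. It is a deliberately simplified stand-in, not TinySPICE: it uses a 1-D table instead of parametric 3-D LUTs, a single diode in series with a resistor instead of a transistor circuit, and plain Python instead of GPU kernels. The device parameters, the table step, and all names are assumptions chosen for illustration.

```python
import math

IS, VT = 1e-14, 0.02585      # assumed diode saturation current (A) and thermal voltage (V)
VS, R = 1.0, 1000.0          # assumed source voltage (V) and series resistance (ohm)
V_MAX, STEP = 1.2, 1e-3      # assumed table range (V) and grid spacing (V)

def diode_exact(v):
    """Diode current and small-signal conductance from the exponential model."""
    e = math.exp(v / VT)
    return IS * (e - 1.0), IS * e / VT

# Precompute the current at evenly spaced voltages (the lookup table).
TABLE = [diode_exact(k * STEP)[0] for k in range(int(round(V_MAX / STEP)) + 1)]

def diode_lut(v):
    """Current and conductance by linear interpolation in TABLE (clamped at the ends)."""
    k = min(max(int(v / STEP), 0), len(TABLE) - 2)
    slope = (TABLE[k + 1] - TABLE[k]) / STEP
    return TABLE[k] + slope * (v - k * STEP), slope

def newton_solve(device, v=0.0, tol=1e-12, max_iter=200):
    """Solve KCL at the diode node: (v - VS) / R + i_diode(v) = 0."""
    for _ in range(max_iter):
        i_d, g_d = device(v)
        f = (v - VS) / R + i_d
        dv = -f / (1.0 / R + g_d)   # linearized circuit: 1x1 Jacobian
        v += dv
        if abs(dv) < tol:
            return v
    raise RuntimeError("Newton-Raphson did not converge")

v_exact = newton_solve(diode_exact)
v_lut = newton_solve(diode_lut)
print(v_exact, v_lut, abs(v_exact - v_lut))
```

The table replaces the exponential evaluation inside every iteration with an index computation and one multiply-add, which is the kind of trade that favors throughput when the same device model is evaluated across a very large number of small circuits.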

International Conference on Computer-Aided Design | 2012

GPSCP: a general-purpose support-circuit preconditioning approach to large-scale SPICE-accurate nonlinear circuit simulations

Xueqian Zhao; Zhuo Feng

To improve the efficiency of direct solution methods in SPICE-accurate nonlinear circuit simulations, preconditioned iterative solution techniques have been widely studied in the past decades. However, it remains extremely challenging to develop general-purpose preconditioning methods that can deal with various large-scale nonlinear circuit simulations. In this work, a novel circuit-oriented, general-purpose support-circuit preconditioning technique (GPSCP) is proposed to significantly reduce the matrix solving time and the memory consumption of large-scale nonlinear circuit simulations. We show that by decomposing the system Jacobian matrix at a given solution point into a graph Laplacian matrix and a matrix including all voltage and controlled sources, and subsequently sparsifying the graph Laplacian matrix based on support graph theory, the general-purpose support-circuit preconditioning matrix can be efficiently obtained, thereby serving as a very effective and efficient preconditioner in solving the original Jacobian matrix through Krylov-subspace iterations. Additionally, a novel critical node selection method and an energy-based spanning-graph scaling method are proposed to further improve the quality of the ultra-sparsifier support graph. To gain higher computational efficiency during transient circuit analysis, a dynamic support-circuit preconditioner updating approach has also been investigated. Our experimental results for a variety of large-scale nonlinear circuit designs show that the proposed technique can achieve up to 14.0X runtime speedups and 6.7X memory reduction in DC and transient simulations.

Design Automation Conference | 2012

Towards efficient SPICE-accurate nonlinear circuit simulation with on-the-fly support-circuit preconditioners

Xueqian Zhao; Zhuo Feng

SPICE-accurate simulation of present-day large-scale nonlinear integrated circuit (IC) systems with millions of linear/nonlinear components can be prohibitively expensive, and thus extremely challenging. In this paper, we present a novel support-circuit preconditioning (SCP) technique for tackling large-scale nonlinear circuit simulations by exploiting sparsified graphs of a given circuit network. By extracting support graphs (SGs) from the original linear circuit networks and combining them with nonlinear devices, the support-circuit preconditioner can be efficiently computed using existing matrix solvers, allowing for on-the-fly updates during transient simulations when adopted in Krylov-subspace iterative solvers. Experimental results for a variety of large-scale circuit designs show that the proposed method achieves up to 22X speedups in solving the matrices involved in DC and transient (TR) simulations, and up to 8X reduction in memory usage, when compared with a simulator powered by the state-of-the-art direct solver KLU.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

Hierarchical Cross-Entropy Optimization for Fast On-Chip Decap Budgeting

Xueqian Zhao; Yonghe Guo; Xiaodao Chen; Zhuo Feng; Shiyan Hu

Decoupling capacitors (decaps) have been widely used to effectively reduce dynamic power supply noise. Traditional decap budgeting algorithms usually rely on sensitivity-based nonlinear optimization or conjugate gradient (CG) methods, which can be prohibitively expensive for large-scale decap budgeting problems and cannot be easily parallelized. In this paper, we propose a hierarchical cross-entropy based optimization technique which is more efficient and parallel-friendly. Cross-entropy (CE) is an advanced optimization framework which exploits rare-event probability theory and importance sampling. To achieve high efficiency, a sensitivity-guided cross-entropy (SCE) algorithm is introduced which integrates CE with a partitioning-based sampling strategy to effectively reduce the solution space in solving large-scale decap budgeting problems. Compared to the improved CG method and the conventional CE method, SCE with Latin hypercube sampling (SCE-LHS) can provide 2× speedups, while achieving up to 25% improvement in power supply noise. To further improve decap optimization solution quality, SCE with sequential importance sampling (SCE-SIS) is also studied and implemented. Compared to SCE-LHS, in similar runtime, SCE-SIS can lead to a 16.8% further reduction in the total power supply noise.
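The basic cross-entropy loop (sample candidates, keep an elite fraction, refit the sampling distribution to the elites) can be sketched on a toy budgeting problem. This is plain CE with independent Gaussian sampling, not the paper's SCE-LHS or SCE-SIS variants, and the noise model, weights, budget, and function names are all hypothetical. The toy objective was chosen because its optimum is known in closed form: allocation (1, 3, 5, 7) with a noise value of 5.0.

```python
import random
import statistics

LOADS = [1.0, 4.0, 9.0, 16.0]   # hypothetical per-site noise weights
C0, BUDGET = 1.0, 16.0          # hypothetical intrinsic capacitance and total decap budget

def noise(alloc):
    """Toy noise model: each site's noise falls as its capacitance grows."""
    return sum(a / (C0 + c) for a, c in zip(LOADS, alloc))

def project(raw):
    """Force a sample to be non-negative and to use exactly the full budget."""
    clipped = [max(x, 1e-9) for x in raw]
    scale = BUDGET / sum(clipped)
    return [x * scale for x in clipped]

def cross_entropy_budget(samples=200, elites=20, iterations=40, seed=0):
    rng = random.Random(seed)
    dim = len(LOADS)
    mu = [BUDGET / dim] * dim        # start from a uniform allocation
    sigma = [BUDGET / dim] * dim     # wide initial spread
    best = project(mu)
    for _ in range(iterations):
        # Draw candidate allocations from the current sampling distribution.
        pop = [project([rng.gauss(m, s) for m, s in zip(mu, sigma)])
               for _ in range(samples)]
        pop.sort(key=noise)
        elite = pop[:elites]
        if noise(elite[0]) < noise(best):
            best = elite[0]
        # Refit the distribution to the elite samples.
        mu = [statistics.mean(col) for col in zip(*elite)]
        sigma = [statistics.pstdev(col) + 1e-6 for col in zip(*elite)]
    return best

best_alloc = cross_entropy_budget()
print([round(c, 2) for c in best_alloc], round(noise(best_alloc), 4))
```

Each candidate is evaluated independently of the others, which is what makes this family of methods easy to parallelize.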

International Conference on Computer-Aided Design | 2014

An efficient spectral graph sparsification approach to scalable reduction of large flip-chip power grids

Xueqian Zhao; Zhuo Feng; Cheng Zhuo

Existing state-of-the-art realizable RC reduction methods may not be suitable for scalable power grid reduction due to the fast-growing computational complexity and the large number of ports. In this work, we present a scalable power grid reduction method for reducing large-scale flip-chip power grids based on recent spectral graph sparsification techniques. The first step of the proposed approach aggressively reduces the large power grid blocks into much smaller power grid blocks by properly matching the effective resistances of the original power grid networks. Next, an efficient spectral graph sparsification scheme is introduced to dramatically sparsify the relatively dense power grid blocks that are generated during the previous step. Finally, an effective grid compensation scheme is proposed to further improve the model accuracy of the reduced and sparsified power grid. Since the reduction of each power grid block can be performed independently, our method can be easily accelerated on parallel computers, and is therefore expected to be capable of handling large power grid designs as well as incremental designs. Extensive experimental results show that our method can scale linearly with power grid size and efficiently reduce industrial power grid sizes by 20X without much loss of accuracy in both DC and transient analysis.

International Conference on Computer-Aided Design | 2013

An efficient graph sparsification approach to scalable harmonic balance (HB) analysis of strongly nonlinear RF circuits

Lengfei Han; Xueqian Zhao; Zhuo Feng

In the past decades, harmonic balance (HB) has been widely used for computing steady-state solutions of nonlinear radio-frequency (RF) and microwave circuits. However, using HB for simulating strongly nonlinear RF circuits remains a very challenging task. Although direct solution methods can be adopted to handle moderate to strong nonlinearities in HB analysis, such methods do not scale efficiently to large-scale problems due to excessively long simulation time and huge memory consumption. In this work, we present a novel graph sparsification approach for generating preconditioners that can be efficiently applied to simulating strongly nonlinear RF circuits. Our approach first sparsifies RF circuit matrices, which can subsequently be leveraged for sparsifying the entire HB Jacobian matrix. We show that the resultant sparsified Jacobian matrix can be used as a robust yet efficient preconditioner in HB analysis. Our experimental results show that, compared with existing state-of-the-art direct solvers, the proposed HB solver can more efficiently handle moderate to strong nonlinearities during the HB analysis of RF circuits, achieving more than 10X speedups and 8X memory reductions.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2015

An Adaptive Graph Sparsification Approach to Scalable Harmonic Balance Analysis of Strongly Nonlinear Post-Layout RF Circuits

Lengfei Han; Xueqian Zhao; Zhuo Feng

In the past decades, harmonic balance (HB) has been widely used for computing steady-state solutions of nonlinear radio-frequency (RF) and microwave circuits. However, using HB for simulating strongly nonlinear post-layout RF circuits remains a very challenging task. Although direct solution methods can be adopted to handle moderate to strong nonlinearities in HB analysis, such methods do not scale efficiently to large-scale problems due to excessively long simulation time and prohibitively large memory consumption. In this paper, we present a novel graph sparsification approach for automatically generating preconditioners that can be efficiently applied to simulating strongly nonlinear post-layout RF circuits. Our approach sparsifies time-domain circuit modified nodal analysis matrices, which can subsequently be leveraged for sparsifying the entire HB Jacobian matrix. We show that the resultant sparsified Jacobian matrix can be used as a robust yet efficient preconditioner in HB analysis. Our experimental results show that, compared with the prior state-of-the-art direct solution method, the proposed solver can more efficiently handle moderate to strong nonlinearities during the HB analysis of RF circuits, achieving up to 20× speedups and 6× memory reductions.