Xue-Xin Liu
University of California, Riverside
Publications
Featured research published by Xue-Xin Liu.
ACM Transactions on Design Automation of Electronic Systems | 2012
Fang Gong; Xue-Xin Liu; Hao Yu; Sheldon X.-D. Tan; Junyan Ren; Lei He
Performance failure has become a significant threat to the reliability and robustness of analog circuits. In this article, we first develop an efficient non-Monte-Carlo (NMC) transient mismatch analysis, in which the transient response is represented by a stochastic orthogonal polynomial (SOP) expansion under PVT variations and the probabilistic distribution of the transient response is solved. We further define performance yield and derive stochastic sensitivities for yield within the SOP framework, and finally develop a gradient-based multiobjective optimization to improve yield while satisfying other performance constraints. Extensive experiments show that, compared to Monte Carlo-based yield estimation, our NMC method achieves up to 700X speedup while maintaining 98% accuracy. Furthermore, the multiobjective optimization not only improves yield by up to 95.3% under performance constraints, but also provides better efficiency than existing methods.
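The 700X speedup above is measured against plain Monte Carlo yield estimation. As context, a minimal sketch of that baseline, where `perf_fn`, the spec value, and the variation model are illustrative placeholders rather than the paper's actual circuit metric:

```python
import numpy as np

def monte_carlo_yield(perf_fn, spec, n_samples=10000, sigma=0.05, seed=0):
    """Estimate yield = P(performance meets spec) by sampling process
    variations. perf_fn maps a variation sample to a performance value;
    both perf_fn and spec stand in for a real circuit metric."""
    rng = np.random.default_rng(seed)
    dp = rng.normal(0.0, sigma, size=n_samples)   # parameter perturbations
    perf = np.array([perf_fn(d) for d in dp])
    return np.mean(perf <= spec)                  # fraction of passing samples

# Toy example: a "delay" that grows linearly with the perturbation; spec is 1.1.
yield_rate = monte_carlo_yield(lambda d: 1.0 + d, spec=1.1)
```

The cost of this estimator scales with `n_samples` (each sample is a full transient simulation in practice), which is exactly what the NMC/SOP approach avoids.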
International Conference on Computer-Aided Design | 2013
Xue-Xin Liu; Hai Wang; Sheldon X.-D. Tan
In this paper, we propose an efficient parallel dynamic linear solver, called GPU-GMRES, for transient analysis of large power grid networks. The new method is based on the preconditioned generalized minimum residual (GMRES) iterative method implemented on heterogeneous CPU-GPU platforms. The new solver is very robust and can be applied to power grids with different structures and to other applications such as thermal analysis. The proposed GPU-GMRES solver adopts the general and robust incomplete LU (ILU) preconditioner. We show that by properly selecting the amount of fill-in in the incomplete LU factors, a good trade-off between GPU efficiency and GMRES convergence rate can be achieved for the best overall performance. This tunable feature makes the algorithm highly adaptive to different problems. Furthermore, we partition the major computing tasks in the GMRES solver to minimize data traffic between CPU and GPU, which further boosts the performance of the proposed method. Experimental results on the set of published IBM benchmark circuits and mesh-structured power grid networks show that the GPU-GMRES solver delivers orders of magnitude speedup over the direct LU solver UMFPACK, and a 3-10× speedup over the CPU implementation of the same GMRES method on transient analysis.
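The core numerical recipe, ILU-preconditioned GMRES with a tunable fill-in level, can be sketched on the CPU with SciPy; the tridiagonal test matrix below is only a stand-in for a real power-grid conductance matrix, and `fill_factor`/`drop_tol` play the role of the fill-in knob discussed above:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_with_ilu_gmres(A, b, fill_factor=10.0, drop_tol=1e-4):
    """Solve A x = b with GMRES preconditioned by an incomplete LU factorization.
    fill_factor controls how much fill-in the ILU keeps: more fill-in gives a
    stronger preconditioner (fewer GMRES iterations) at a higher setup cost,
    mirroring the efficiency/convergence trade-off described in the paper."""
    ilu = spla.spilu(A.tocsc(), fill_factor=fill_factor, drop_tol=drop_tol)
    M = spla.LinearOperator(A.shape, ilu.solve)  # preconditioner as an operator
    x, info = spla.gmres(A, b, M=M)              # info == 0 means converged
    return x, info

# A small grid-like SPD system standing in for a power-grid conductance matrix.
n = 100
main = 4.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], [-1, 0, 1], format="csr")
b = np.ones(n)
x, info = solve_with_ilu_gmres(A, b)
```

On a GPU the same structure holds, but the sparse matrix-vector products and triangular solves move to device kernels, which is where the CPU-GPU task partitioning matters.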
International Symposium on Quality Electronic Design | 2012
Xue-Xin Liu; Zao Liu; Sheldon X.-D. Tan; Joseph A. Gordon
Cooling and related thermal problems are among the principal challenges facing 3D integrated circuits (3D-ICs). Active cooling techniques such as integrated inter-tier liquid cooling are promising alternatives to traditional fan-based cooling, which is insufficient for 3D-ICs. In this regard, fast full-chip transient thermal modeling and simulation techniques are required to design efficient and cost-effective cooling solutions for optimal performance, cost, and reliability of packages and 3D-ICs. In this paper, we propose an efficient finite-difference-based full-chip simulation algorithm for 3D-ICs using the GMRES method on CPU platforms. Unlike existing fast thermal analysis methods, the new method starts from the physics-based heat equations to model 3D-ICs with inter-tier liquid-cooling microchannels and directly solves the resulting partial differential equations using GMRES. To speed up the simulation, we further develop a preconditioned GPU-accelerated GMRES solver, GPU-GMRES, to solve the resulting thermal equations on top of published sparse numerical routines. Experimental results show the proposed GPU-GMRES solver is up to 4.3× faster than parallel CPU-GMRES for DC analysis, 2.3× faster than parallel LU decomposition, and one to two orders of magnitude faster than single-threaded CPU-GMRES for transient analysis on a number of thermal circuits and other published problems.
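The "finite difference on the heat equation" step can be illustrated in its simplest steady-state form: discretizing -k∇²T = q on a 2D grid with the standard 5-point stencil and solving the resulting sparse system. The grid size, conductivity, and uniform power density below are illustrative, not values from the paper:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def fd_laplacian_2d(nx, ny, k=1.0, h=1.0):
    """Assemble the 5-point finite-difference operator for steady-state heat
    conduction -k * laplacian(T) = q on an nx-by-ny grid, with zero-temperature
    (Dirichlet) boundaries folded into the operator."""
    dx = sp.diags([-np.ones(nx - 1), 2.0 * np.ones(nx), -np.ones(nx - 1)],
                  [-1, 0, 1])
    dy = sp.diags([-np.ones(ny - 1), 2.0 * np.ones(ny), -np.ones(ny - 1)],
                  [-1, 0, 1])
    # 2D operator as a Kronecker sum of the two 1D second-difference operators.
    return (k / h**2) * (sp.kron(sp.identity(ny), dx)
                         + sp.kron(dy, sp.identity(nx))).tocsc()

# Uniform unit power density; solve for the temperature-rise field.
nx = ny = 20
A = fd_laplacian_2d(nx, ny)
q = np.ones(nx * ny)
T = spla.spsolve(A, q)
```

The full-chip model in the paper adds tiers, material interfaces, and convective terms for the microchannels, but the assembled system has this same sparse structure, which is why GMRES applies directly.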
Asia and South Pacific Design Automation Conference | 2010
Hao Yu; Xue-Xin Liu; Hai Wang; Sheldon X.-D. Tan
To cope with the increasing complexity of analyzing analog mismatch in sub-90nm designs, this paper presents a fast non-Monte-Carlo method to calculate mismatch in the time domain. The local random mismatch is described by a noise source with an explicit dependence on geometric parameters, and is further expanded by stochastic orthogonal polynomials (SOPs). This forms a stochastic differential-algebraic equation (SDAE). To deal with large-scale problems, the SDAE is linearized at a number of snapshots along the nominal transient trajectory, and hence is naturally embedded into trajectory-piecewise-linear (TPWL) macromodeling. The TPWL is improved with a novel incremental aggregation of the subspaces identified at those snapshots. Experiments show that the proposed method, isTPWL, is hundreds of times faster than the Monte Carlo method with similar accuracy. In addition, our macromodel further reduces runtime by up to 25X, and is faster to build and more accurate to simulate than existing approaches.
ACM Transactions on Design Automation of Electronic Systems | 2013
Xue-Xin Liu; Sheldon X.-D. Tan; A. A. Palma-Rodriguez; Esteban Tlelo-Cuautle; Guoyong Shi
In this article, we propose a new performance bound analysis of analog circuits considering process variations. We model the variations of component values as intervals measured from tested chips and manufacture processes. The new method first applies a graph-based analysis approach to generate the symbolic transfer function of a linear(ized) analog circuit. Then the frequency response bounds (maximum and minimum) are obtained by performing nonlinear constrained optimization in which magnitude or phase of the transfer function is the objective function to be optimized subject to the ranges of process variational parameters. The response bounds given by the optimization-based method are very accurate and do not have the over-conservativeness issues of existing methods. Based on the frequency-domain bounds, we further develop a method to calculate the time-domain response bounds for any arbitrary input stimulus. Experimental results from several analog benchmark circuits show that the proposed method gives the correct bounds verified by Monte Carlo analysis while it delivers one order of magnitude speedup over Monte Carlo for both frequency-domain and time-domain bound analyses. We also show analog circuit yield analysis as an application of the frequency-domain variational bound analysis.
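The bound computation above is a constrained optimization of the transfer-function magnitude (or phase) over interval-valued parameters. A minimal sketch, using a first-order RC low-pass as a hypothetical stand-in for the graph-derived symbolic transfer function, with ±20% parameter intervals:

```python
import numpy as np
from scipy.optimize import minimize

def magnitude(params, omega):
    """|H(jw)| for a first-order RC low-pass, H = 1 / (1 + jw*R*C).
    This simple stage is only an illustration; the paper's method works on
    symbolic transfer functions generated from the circuit graph."""
    R, C = params
    return 1.0 / np.sqrt(1.0 + (omega * R * C) ** 2)

def response_bounds(omega, bounds):
    """Maximum/minimum magnitude over the parameter box, via bounded
    nonlinear optimization (the bounds encode the process-variation intervals)."""
    x0 = [0.5 * (lo + hi) for lo, hi in bounds]
    low = minimize(lambda p: magnitude(p, omega), x0, bounds=bounds)
    high = minimize(lambda p: -magnitude(p, omega), x0, bounds=bounds)
    return magnitude(high.x, omega), magnitude(low.x, omega)

# +/-20% variation around R = 1 kOhm, C = 1 uF, evaluated at 1 krad/s.
bmax, bmin = response_bounds(1e3, [(800.0, 1200.0), (0.8e-6, 1.2e-6)])
```

Sweeping the frequency and repeating the two optimizations at each point traces out the upper and lower response envelopes; because the optimizer searches the whole box rather than propagating intervals, the bounds are tight rather than over-conservative.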
International Symposium on Quality Electronic Design | 2012
Ruijing Shen; Sheldon X.-D. Tan; Xue-Xin Liu
In this paper, we propose a new voltage binning technique to improve yield. Voltage binning assigns different supply voltages to different chips in order to improve yield. We propose a novel valid voltage segment concept, determined by the timing and power constraints of each chip. We then develop a formulation to predict the maximum number of bins required under the uniform binning scheme from the distribution of valid supply voltage segment lengths. With this concept, finding an optimal binning scheme can be modeled as a set-cover problem, and we develop a greedy algorithm to solve it incrementally. The new method also extends to a range of working supply voltages for dynamic voltage scaling under different operation modes (such as low-power and high-performance modes). Experimental results on benchmarks in a 45nm technology show that the proposed method correctly predicts the upper bound on the number of bins required. The optimal binning scheme leads to significant savings in the number of bins compared to the uniform scheme at the same yield, with very small CPU cost.
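The set-cover formulation can be sketched directly: each chip contributes a valid voltage segment [lo, hi], each candidate supply voltage "covers" the chips whose segment contains it, and the greedy heuristic repeatedly picks the voltage covering the most uncovered chips. The segments and candidate voltages below are made-up illustrative values:

```python
def greedy_bin_cover(chip_segments, candidate_voltages):
    """Greedy set cover: choose supply voltages (bins) until every chip's
    valid voltage segment [lo, hi] contains at least one chosen voltage."""
    uncovered = set(range(len(chip_segments)))
    bins = []
    while uncovered:
        # Pick the candidate voltage that covers the most remaining chips.
        best_v = max(candidate_voltages,
                     key=lambda v: sum(1 for i in uncovered
                                       if chip_segments[i][0] <= v <= chip_segments[i][1]))
        covered = {i for i in uncovered
                   if chip_segments[i][0] <= best_v <= chip_segments[i][1]}
        if not covered:
            break  # remaining chips match no candidate voltage (yield loss)
        bins.append(best_v)
        uncovered -= covered
    return bins

# Four chips with overlapping valid segments; five candidate supply voltages.
segments = [(0.9, 1.0), (0.95, 1.05), (1.0, 1.1), (1.05, 1.15)]
bins = greedy_bin_cover(segments, [0.9, 0.95, 1.0, 1.05, 1.1])
```

The classical greedy heuristic gives a logarithmic approximation guarantee for set cover, which is why it is a natural fit for an incremental binning scheme.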
Asia and South Pacific Design Automation Conference | 2013
Xue-Xin Liu; A. A. Palma-Rodriguez; Santiago Rodriguez-Chavez; Sheldon X.-D. Tan; Esteban Tlelo-Cuautle; Yici Cai
Yield estimation for analog integrated circuits is crucial for analog circuit design and optimization in the presence of process variations. In this paper, we present a novel analog yield estimation method based on a performance bound analysis technique in the frequency domain. The new method first derives the transfer functions of linear (or linearized) analog circuits via a graph-based symbolic analysis method. Then frequency response bounds of the transfer functions, in terms of magnitude and phase, are obtained by a nonlinear constrained optimization technique. To predict the yield rate, the bound information is used to calculate Gaussian distribution functions. Experimental results show that the new method achieves similar accuracy while delivering a 20× speedup over Monte Carlo simulation with HSPICE on typical analog circuits.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2015
Xue-Xin Liu; Kuangya Zhai; Zao Liu; Kai He; Sheldon X.-D. Tan; Wenjian Yu
In this brief, we propose an efficient parallel finite-difference-based thermal simulation algorithm for 3-D integrated circuits (ICs) using a generalized minimum residual (GMRES) solver on CPU-graphics processing unit (GPU) platforms. First, the new method starts from basic physics-based heat equations to model 3-D ICs with intertier liquid-cooling microchannels and directly solves the resulting partial differential equations. Second, we develop a new parallel GPU-GMRES solver to compute the resulting thermal systems on a CPU-GPU platform. We also explore different preconditioners (implicit and explicit) and study their performance on thermal circuits and other types of matrices. Experimental results show the proposed GPU-GMRES solver can deliver orders of magnitude speedup over the parallel LU-based solver and up to 4× speedup over CPU-GMRES for both dc and transient thermal analyses on a number of thermal circuits and other published problems.
Design, Automation, and Test in Europe | 2013
Hai Wang; Sheldon X.-D. Tan; Sahana Swarup; Xue-Xin Liu
On-chip physical thermal sensors play a vital role in accurately estimating the full-chip thermal profile. How to place physical sensors such that both the number of thermal sensors and the temperature estimation errors are minimized is important for on-chip dynamic thermal management of today's high-performance microprocessors. In this paper, we present a new systematic thermal sensor placement algorithm. Unlike traditional thermal sensor placement algorithms, where only temperature information is explored, the new placement method takes advantage of functional-unit power information by exploiting the correlation of power estimation errors among functional blocks. The new power-driven placement algorithm applies correlation clustering to determine both the locations and the number of sensors automatically, such that the temperature estimation errors are minimized. Experimental results on a dual-core architecture show that the new thermal sensor placements yield more accurate full-chip temperature estimation than the uniform and k-means-based placement approaches.
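For context, the k-means baseline the method is compared against can be sketched in a few lines: cluster hotspot coordinates and place one sensor at each centroid. The die coordinates and cluster layout below are invented for illustration; the paper's power-driven method replaces this geometric clustering with correlation clustering over power-estimation errors:

```python
import numpy as np

def kmeans_placement(hotspots, n_sensors, n_iter=50, seed=0):
    """Baseline k-means sensor placement (Lloyd's algorithm): cluster hotspot
    coordinates and put one sensor at each cluster centroid."""
    rng = np.random.default_rng(seed)
    centers = hotspots[rng.choice(len(hotspots), n_sensors, replace=False)]
    for _ in range(n_iter):
        # Assign each hotspot to its nearest sensor candidate.
        d = np.linalg.norm(hotspots[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each sensor to the centroid of its cluster (keep it if empty).
        centers = np.array([hotspots[labels == k].mean(axis=0)
                            if np.any(labels == k) else centers[k]
                            for k in range(n_sensors)])
    return centers

# Two well-separated hotspot clusters on a 10x10 mm die (coordinates in mm).
pts = np.vstack([np.random.default_rng(1).normal([2, 2], 0.3, (20, 2)),
                 np.random.default_rng(2).normal([8, 8], 0.3, (20, 2))])
sensors = kmeans_placement(pts, n_sensors=2)
```

The limitation this baseline exposes is that purely geometric clustering ignores which functional blocks' power estimates are correlated, which is the information the proposed method exploits.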
Design, Automation, and Test in Europe | 2012
Xue-Xin Liu; Sheldon X.-D. Tan; Hai Wang; Hao Yu
In this paper, we propose a new envelope-following parallel transient analysis method for general switching power converters. The new method first exploits the parallelism in the envelope-following method and parallelizes the Newton-update solve, the most computationally expensive step, on GPU platforms to boost simulation performance. To further speed up the iterative GMRES solve of the Newton update equation in the envelope-following method, we apply the matrix-free Krylov basis generation technique previously used in RF simulation. Finally, the new method applies the more robust Gear-2 integration to compute the sensitivity matrix instead of traditional integration methods. Experimental results from several integrated on-chip power converters show that the proposed GPU envelope-following algorithm is about 10× faster than its CPU counterpart and 100× faster than traditional envelope-following methods, while keeping similar accuracy.
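The matrix-free Krylov idea is that GMRES never needs the Jacobian itself, only Jacobian-vector products, which can be approximated by finite differences of the residual. A minimal Jacobian-free Newton-Krylov sketch with a toy residual (the residual, step size, and sizes here are illustrative, not the paper's converter equations):

```python
import numpy as np
import scipy.sparse.linalg as spla

def jacobian_free_gmres(F, x, rhs, eps=1e-7):
    """Solve J(x) dx = rhs without forming the Jacobian J explicitly:
    each Krylov basis vector requires only a product J*v, approximated by a
    finite difference of the residual F. This mirrors the matrix-free Krylov
    basis generation used for the Newton update equation above."""
    n = x.size
    def Jv(v):
        return (F(x + eps * v) - F(x)) / eps   # directional derivative ~= J v
    J = spla.LinearOperator((n, n), matvec=Jv)
    dx, info = spla.gmres(J, rhs)              # info == 0 means converged
    return dx, info

# Toy nonlinear residual F(x) = x^3 - 1, so J(x) = 3*diag(x^2); Newton step
# solves J dx = -F(x).
x0 = np.full(4, 2.0)
F = lambda x: x**3 - 1.0
dx, info = jacobian_free_gmres(F, x0, rhs=-F(x0))
```

The payoff in envelope-following is that assembling and storing the dense sensitivity Jacobian is avoided entirely; only residual evaluations are needed, and those are exactly the operations that parallelize well on a GPU.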