Jeng-Hau Lin

University of California, San Diego

Publication

Featured research published by Jeng-Hau Lin.

field programmable gate arrays | 2017

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs

Ritchie Zhao; Weinan Song; Wentao Zhang; Tianwei Xing; Jeng-Hau Lin; Mani B. Srivastava; Rajesh K. Gupta; Zhiru Zhang

Convolutional neural networks (CNNs) are the current state-of-the-art for many computer vision tasks. CNNs outperform older methods in accuracy, but require vast amounts of computation and memory. As a result, existing CNN applications are typically run on clusters of CPUs or GPUs. Studies into the FPGA acceleration of CNN workloads have achieved reductions in power and energy consumption. However, large GPUs outperform modern FPGAs in throughput, and the existence of compatible deep learning frameworks gives GPUs a significant advantage in programmability. Recent research in machine learning demonstrates the potential of very low precision CNNs -- i.e., CNNs with binarized weights and activations. Such binarized neural networks (BNNs) appear well suited for FPGA implementation, as their dominant computations are bitwise logic operations and their memory requirements are reduced. A combination of low-precision networks and high-level design methodology may help address the performance and productivity gap between FPGAs and GPUs. In this paper, we present the design of a BNN accelerator that is synthesized from C++ to FPGA-targeted Verilog. The accelerator outperforms existing FPGA-based CNN accelerators in GOPS as well as energy and resource efficiency.
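
To illustrate why binarized networks map well onto bitwise logic, the following is a minimal Python sketch of a dot product between two {+1, -1} vectors computed with XNOR and a population count. The bit encoding (+1 stored as 1, -1 stored as 0) and the helper names `pack_bits` and `binary_dot` are assumptions made for this illustration; they are not details of the accelerator described in the paper.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {+1, -1} vectors stored as packed bits."""
    mask = (1 << n) - 1
    # XNOR marks positions where the signs agree (product +1).
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, so every multiply-accumulate in the ordinary dot product is replaced by one XNOR plus a bit count.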

IEEE Transactions on Advanced Packaging | 2009

Fast Methodology for Determining Eye Diagram Characteristics of Lossy Transmission Lines

Wei-Da Guo; Jeng-Hau Lin; Chien-Min Lin; Tian Wei Huang; Ruey-Beei Wu

As the speed of signals through an interconnection increases toward the multigigabit range, the effects of lossy transmission lines on the signal quality of printed circuit boards become a critical issue. To evaluate the eye diagram and thus the signal integrity in modern digital systems, this paper proposes a fast methodology that employs only two anti-polarity one-bit data patterns instead of the pseudo-random bit sequence as input sources to simulate the worst-case eye diagram. Analytic expressions are derived for the impulse response of the lossy transmission lines due to the skin-effect loss, while the Kramers-Kronig relations are employed to deal with the noncausal problem related to the dielectric loss. Two design graphs that can be used to rapidly predict the eye diagram characteristics versus the conductive and dielectric losses are then constructed, from which the maximum usable length of transmission lines under a given signal specification can be easily obtained. Finally, time-domain simulations and experiments are carried out to verify the accuracy of the proposed concept.

design automation conference | 2014

MATEX: A Distributed Framework for Transient Simulation of Power Distribution Networks

Hao Zhuang; Shih-Hung Weng; Jeng-Hau Lin; Chung-Kuan Cheng

We propose MATEX, a distributed framework for transient simulation of power distribution networks (PDNs). MATEX utilizes a matrix exponential kernel with Krylov subspace approximations to solve the differential equations of linear circuits. First, the whole simulation task is divided into subtasks based on decompositions of current sources, in order to reduce the computational overhead. Then these subtasks are distributed to different computing nodes and processed in parallel. Within each node, after the matrix factorization at the beginning of the simulation, the adaptive time-stepping solver runs without extra matrix re-factorizations. MATEX overcomes the stiffness limitation of previous matrix exponential-based circuit simulators by using the rational Krylov subspace method, which leads to larger step sizes with smaller dimensions of Krylov subspace bases and greatly accelerates the whole computation. MATEX outperforms both traditional fixed and adaptive time-stepping methods, e.g., achieving around a 13X speedup over the trapezoidal framework with fixed time step on the IBM power grid benchmarks.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2016

Simulation Algorithms With Exponential Integration for Time-Domain Analysis of Large-Scale Power Delivery Networks

Hao Zhuang; Wenjian Yu; Shih-Hung Weng; Ilgweon Kang; Jeng-Hau Lin; Xiang Zhang; Ryan Coutts; Chung-Kuan Cheng

We design an algorithmic framework using matrix exponentials for time-domain simulation of power delivery networks (PDNs). Our framework can reuse factorized matrices to simulate the large-scale linear PDN system with variable step sizes. In contrast, conventional PDN simulation solvers have to use a fixed step-size approach in order to reuse the factorized matrices generated by the expensive matrix decomposition. Based on the proposed exponential integration framework, we design a PDN solver, R-MATEX, with flexible time-stepping capability. The key operation, the product of the matrix exponential and a vector, is computed by the rational Krylov subspace method. To further improve the runtime, we also propose a distributed computing framework, DR-MATEX. DR-MATEX reduces the Krylov subspace generations caused by frequent breakpoints from a large number of current sources during simulation. By virtue of the superposition property of linear systems and the scaling invariance property of Krylov subspaces, DR-MATEX can divide the whole simulation task into subtasks based on the alignments of breakpoints among those sources. The subtasks are processed in parallel at different computing nodes without any communication during the computation of the transient simulation. The final result is obtained by summing up the partial results from all the computing nodes after they finish their assigned subtasks. Our computation model therefore belongs to the category known as embarrassingly parallel. Experimental results show that R-MATEX and DR-MATEX can achieve up to around 14.4× and 98.0× runtime speedups, respectively, over a traditional trapezoidal integration-based solver with a fixed time-step approach.
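
The superposition idea in this abstract can be sketched on a toy problem. The Python example below assumes a scalar linear system x' = a·x + u(t) with piecewise-constant sources, for which an exponential step is exact for any step size. It is not the R-MATEX or DR-MATEX solver (there is no Krylov subspace and no matrix factorization here), and the names `exp_step`, `simulate`, `src1`, and `src2` are hypothetical.

```python
import math


def exp_step(x, a, u, h):
    """Advance x' = a*x + u exactly over a step h, with u held constant (a != 0)."""
    e = math.exp(a * h)
    return e * x + (e - 1.0) / a * u


def simulate(a, source, breakpoints, x0=0.0):
    """Step from one breakpoint to the next; the source is constant in between."""
    x = x0
    for t0, t1 in zip(breakpoints, breakpoints[1:]):
        x = exp_step(x, a, source(t0), t1 - t0)
    return x


a = -2.0
# Two piecewise-constant sources whose breakpoints do not line up.
src1 = lambda t: 1.0 if t < 1.0 else 0.0
src2 = lambda t: 0.0 if t < 0.5 else (2.0 if t < 1.5 else 0.5)

# Single task: every breakpoint of every source forces a step.
combined = simulate(a, lambda t: src1(t) + src2(t), [0.0, 0.5, 1.0, 1.5, 2.0])

# Independent subtasks: each one steps only at its own breakpoints,
# and the partial results are summed at the end.
partial = simulate(a, src1, [0.0, 1.0, 2.0]) + simulate(a, src2, [0.0, 0.5, 1.5, 2.0])
```

Up to floating-point rounding, `combined` and `partial` are equal. Each subtask takes fewer steps than the single combined run and needs no communication with the other until the final sum.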

2015 IEEE Symposium on Electromagnetic Compatibility and Signal Integrity | 2015

Dynamic analysis of power delivery network with nonlinear components using matrix exponential method

Hao Zhuang; Ilgweon Kang; Xinan Wang; Jeng-Hau Lin; Chung-Kuan Cheng

In this work, we propose a matrix exponential-based time-integration algorithm for dynamic analysis of power delivery networks (PDNs) with nonlinear components. The presented method is an explicit method and is very competitive compared to traditional low-order approximation methods, such as the backward Euler method with Newton-Raphson iterations (BE). First, the proposed method takes a comparable number of time steps to complete the whole simulation. Second, the method takes only one LU decomposition per time step, while BE requires at least two LU decompositions for the convergence check of solutions of the nonlinear system. Moreover, our method does not need to repeat expensive LU decomposition operations when the length of the time steps is adjusted for error control. The experimental results validate our method's efficiency. We observe reductions in the total number of LU operations and in the simulation runtime.

electrical performance of electronic packaging | 2007

Fast Algorithm for Determining Eye-Diagram Characteristics of Lossy Transmission Lines

Jeng-Hau Lin; Wei-Da Guo; Guang-Hwa Shiue; Chien-Min Lin; Tian Wei Huang; Ruey-Beei Wu

A novel algorithm for quickly and accurately determining the height and width of eye diagrams at the receiving ends of transmission lines is proposed. Once the two parameters that capture the conductive and dielectric losses in response to the impulse stimulus are derived, the transfer function associated with the propagation coefficient, which represents the signaling mechanism behind the eye diagram, can be developed. A systematic flow is implemented to predict eye diagrams that agree well with the results of a time-domain circuit simulator for varying design geometries.

biomedical circuits and systems conference | 2015

An interdigitated non-contact ECG electrode for impedance compensation and signal restoration

Jeng-Hau Lin; Hao Liu; Chia-Hung Liu; Phillip Lam; Gung-Yu Pan; Hao Zhuang; Ilgweon Kang; Patrick P. Mercier; Chung-Kuan Cheng

This paper presents a non-contact electrocardiogram (ECG) measurement platform that compensates for motion-induced impedance changes via interdigitated electrode channels in concert with software reconstruction algorithms. Specifically, the impedance of the non-contact electrode is non-invasively acquired in real-time by exploiting a custom electrode designed with two independent channels featuring independent transfer functions that are used to reconstruct motion-compensated ECG waveforms. The developed platform is validated on human subjects, illustrating up to a 76.3% improvement over conventional approaches, paving the path towards comfortable, convenient, and robust non-contact electrophysiological sensing.

2015 IEEE Symposium on Electromagnetic Compatibility and Signal Integrity | 2015

Impulse response generation from S-parameters for power delivery network simulation

Ilgweon Kang; Xinan Wang; Jeng-Hau Lin; Ryan Coutts; Chung-Kuan Cheng

Accurate analysis of the power delivery network is indispensable for assessing VLSI packages and interconnection networks. Given the S-parameters that characterize the linear packaging system, we derive the transient response of power delivery networks. We utilize the compressed sensing technique to generate a sparse impulse response that fits the S-parameters. Our method shows accurate, concise, and stable results.

international conference on communications circuits and systems | 2013

A non-contact biopotential sensing system with motion artifact suppression

Haibing Su; Hao Liu; Shih-Hung Weng; Hui Wang; Aliasgar Presswala; Hao Zhuang; Jeng-Hau Lin; Patrick P. Mercier; Chung-Kuan Cheng

This paper describes a wearable sensing system to monitor biopotentials via non-contact capacitive sensors that are suitable for long-term and ambulatory monitoring applications. To overcome motion-induced measurement artifacts typically encountered in such systems, a motion artifact suppression technique is introduced. Specifically, a sensor that consists of a pair of physically interleaved capacitive channels is designed to have different amounts of parasitic input capacitance, creating channel-specific outputs that depend on the input coupling capacitance itself. Differences in the channel outputs can then be passed through a digital reconstruction filter to re-create the original biopotential with attenuated motion artifacts. To validate the system concept, a wireless ECG sensing system is designed. Simulation results indicate that motion-induced signal distortion is reduced by over 14X after reconstruction.

computer vision and pattern recognition | 2017

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

Jeng-Hau Lin; Tianwei Xing; Ritchie Zhao; Zhiru Zhang; Mani B. Srivastava; Zhuowen Tu; Rajesh K. Gupta

State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution. Such networks strain the computational capabilities and energy available to embedded and mobile processing platforms, restricting their use in many important applications. In this paper, we propose binarized convolutional neural networks (BCNN) with Separable Filters (BCNNw/SF), which applies Singular Value Decomposition (SVD) on BCNN kernels to further reduce computational and storage complexity. We provide a closed form of the gradient over SVD to calculate the exact gradient with respect to every binarized weight in backward propagation. We verify BCNNw/SF on the MNIST, CIFAR-10, and SVHN datasets, and implement an accelerator for CIFAR-10 on FPGA hardware. Our BCNNw/SF accelerator realizes memory savings of 17% and an execution time reduction of 31.3% compared to BCNN, with only minor accuracy sacrifices.
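
As a rough illustration of why separable filters reduce computation, the Python sketch below assumes a kernel that is already rank-1, i.e. the outer product of a column vector and a row vector. The SVD step that the paper uses to obtain such factors is omitted, and the function names and sample data are hypothetical. A k×k filter then costs 2k multiplies per output pixel instead of k².

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation (no padding), as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def separable_conv2d_valid(image, col, row):
    """Apply a rank-1 kernel as a vertical pass followed by a horizontal pass."""
    vertical = conv2d_valid(image, [[v] for v in col])  # kh x 1 kernel
    return conv2d_valid(vertical, [row])                # 1 x kw kernel


# A binarized rank-1 kernel built from +1/-1 column and row vectors.
col = [1, -1, 1]
row = [1, 1, -1]
kernel = [[c * r for r in row] for c in col]

image = [[1, 2, 0, 3],
         [4, 1, 1, 0],
         [2, 2, 5, 1],
         [0, 3, 1, 2]]

full = conv2d_valid(image, kernel)
separable = separable_conv2d_valid(image, col, row)
```

Both `full` and `separable` hold the same 2×2 output, which shows that the two 1D passes reproduce the full 2D filter when the kernel is rank-1.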
