Publication


Featured research published by Lurng-Kuo Liu.


IEEE Transactions on Circuits and Systems for Video Technology | 1996

A block-based gradient descent search algorithm for block motion estimation in video coding

Lurng-Kuo Liu; Ephraim Feig

A block-based gradient descent search (BBGDS) algorithm is proposed in this paper to perform block motion estimation in video coding. The BBGDS evaluates the values of a given objective function starting from a small centralized checking block. The minimum within the checking block is found, and the gradient descent direction where the minimum is expected to lie is used to determine the search direction and the position of the new checking block. The BBGDS is compared with full search (FS), three-step search (TSS), one-at-a-time search (OTS), and new three-step search (NTSS). Experimental results show that the proposed technique provides competitive performance with reduced computational complexity.
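The search loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 3x3 checking block, the sum of absolute differences (SAD) as the objective function, the default block size and search range, and the names `sad` and `bbgds` are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def bbgds(cur, ref, bx, by, n=4, search_range=7):
    """Return (dx, dy, cost) found by a gradient-descent style block search.

    Assumes the zero-displacement block lies inside `ref`.
    """
    height, width = len(ref), len(ref[0])

    def valid(dx, dy):
        # Candidate must stay inside the search range and the reference frame.
        return (abs(dx) <= search_range and abs(dy) <= search_range
                and 0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height)

    cx, cy = 0, 0  # centre of the current checking block
    best = sad(cur, ref, bx, by, cx, cy, n)
    while True:
        best_dx, best_dy = cx, cy
        # Evaluate the (assumed) 3x3 checking block around the centre.
        for oy in (-1, 0, 1):
            for ox in (-1, 0, 1):
                dx, dy = cx + ox, cy + oy
                if (ox, oy) == (0, 0) or not valid(dx, dy):
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best:
                    best, best_dx, best_dy = cost, dx, dy
        if (best_dx, best_dy) == (cx, cy):
            # The minimum is at the centre: no descent direction remains.
            return cx, cy, best
        # Re-centre the checking block on the minimum and search again.
        cx, cy = best_dx, best_dy
```

Because the cost strictly decreases at every move, the loop always terminates. For small motion it evaluates far fewer candidates than a full search over the same range.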


International Parallel and Distributed Processing Symposium | 2008

Financial modeling on the cell broadband engine

Virat Agarwal; Lurng-Kuo Liu; David A. Bader

High performance computing is critical for financial markets, where analysts seek to accelerate complex optimizations such as pricing engines to maintain a competitive edge. In this paper we investigate the performance of financial workloads on the Sony-Toshiba-IBM Cell Broadband Engine, a heterogeneous multicore chip architected for intensive gaming applications and high performance computing. We analyze the use of Monte Carlo techniques for financial workloads and design efficient parallel implementations of different high performance pseudo and quasi random number generators as well as normalization techniques. Our implementation of the Mersenne Twister pseudo random number generator outperforms current Intel and AMD architectures by over an order of magnitude. Using these new routines, we optimize European option (EO) and collateralized debt obligation (CDO) pricing algorithms. Our Cell-optimized EO pricing achieves a speedup of over 2 compared with the RapidMind SDK for Cell, a speedup of 1.26 compared with the RapidMind SDK for GPU (NVIDIA GeForce 8800), and a speedup of 1.51 over the NVIDIA GeForce 8800 using CUDA. Our detailed analyses and performance results demonstrate that the Cell/B.E. processor is well suited for financial workloads and Monte Carlo simulation.
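For readers unfamiliar with Monte Carlo option pricing, the kind of workload this abstract refers to can be sketched as below. This is a textbook single-threaded estimator under geometric Brownian motion, not the Cell-optimized code from the paper. The function names and parameters are hypothetical. The Black-Scholes closed form is included only as a sanity check.

```python
import math
import random


def european_call_mc(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Monte Carlo estimate of a European call price (illustrative only)."""
    rng = random.Random(seed)  # CPython's generator is a Mersenne Twister
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)               # standard normal draw
        s_t = s0 * math.exp(drift + vol * z)  # terminal asset price
        payoff_sum += max(s_t - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def european_call_bs(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes call price, used here as a reference."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike)
          + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t

    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)
```

Each path is independent of the others, which is why this workload parallelizes well and why the throughput of the random number generator and of the normal transform tends to dominate the run time.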


International Parallel and Distributed Processing Symposium | 2012

Reducing Data Movement Costs: Scalable Seismic Imaging on Blue Gene

Michael P. Perrone; Lurng-Kuo Liu; Ligang Lu; Karen A. Magerlein; Changhoan Kim; Irina Fedulova; Artyom Semenikhin

We present an optimized Blue Gene/P implementation of Reverse Time Migration, a seismic imaging algorithm widely used in the petroleum industry today. Our implementation is novel in that it uses large communication bandwidth and low latency to convert an embarrassingly parallel problem into one that can be efficiently solved using massive domain partitioning. The success of this seemingly counterintuitive approach is the result of several key aspects of the imaging problem, including very regular and local communication patterns, balanced compute and communication requirements, scratch data handling, multiple-pass approaches, and most importantly, the fact that partitioning the problem allows each sub-problem to fit in cache, dramatically increasing locality and bandwidth and reducing latency. This approach can be easily extended to next-generation imaging algorithms currently being developed. In this paper we present details of our implementation, including application-scaling results on Blue Gene/P.


International Conference on Multimedia and Expo | 2007

Digital Media Indexing on the Cell Processor

Lurng-Kuo Liu; Qiang Liu; Apostol Natsev; Kenneth A. Ross; John R. Smith; Ana Lucia Varbanescu

We present a case study of developing a digital media indexing application, code-named MARVEL, on the STI Cell Broadband Engine (CBE) processor. There are two aspects of the target application that require significant computing power: image analysis for feature extraction, and support vector machine (SVM) based pattern classification for concept detection. We discuss the mapping of a large application like MARVEL onto a multicore processor, and show how feature extraction and concept detection can be implemented on the CBE. We discuss how the synergistic processing units of a CBE can be used to gain dramatic performance improvements. The empirical results of our experiments, conducted on a Cell blade running at 3.2 GHz, show that the CBE provides a significant performance speed-up in our digital media indexing application.


International Conference on Multimedia and Expo | 2006

Video Analysis and Compression on the STI Cell Broadband Engine Processor

Lurng-Kuo Liu; Sreeni Kesavarapu; Jonathan H. Connell; Ashish Jagmohan; Lark-hoon Leem; Brent Paulovicks; Vadim Sheinin; Lijung Tang; Hangu Yeo

With increased concern for physical security, video surveillance is becoming an important business area. Similar camera-based systems can also be used in such diverse applications as retail-store shopper motion analysis and casino behavioral policy monitoring. There are two aspects of video surveillance that require significant computing power: image analysis for detecting objects, and video compression for digital storage. The new STI Cell Broadband Engine (CBE) processor is an appealing platform for such applications because it incorporates 8 separate high-speed processing cores with an aggregate performance of 256 Gflops. Moreover, this chip is the heart of the new Sony PlayStation 3 and can be expected to be relatively inexpensive due to the high volume of production. In this paper we show how object detection and compression can be implemented on the CBE, discuss the difficulties encountered in porting the code, and provide performance results demonstrating significant speed-up.


International Conference on Parallel Processing | 2007

An Effective Strategy for Porting C++ Applications on Cell

Ana Lucia Varbanescu; Henk J. Sips; Kenneth A. Ross; Qiang Liu; Lurng-Kuo Liu; Apostol Natsev; John R. Smith

In this paper we present a solution for efficiently porting sequential C++ applications to the Cell B.E. processor. We present our step-by-step approach, focusing on its generality; we provide a set of code templates and optimization guidelines to support the porting; and we include a set of equations to estimate the performance gain of the new application. As a case study, we show the use of our solution on a multimedia content analysis application named MARVEL. The results of our experiments with MARVEL show a significant performance increase for the application running on Cell compared with the reference implementation.


Visual Communications and Image Processing | 1998

Low-delay MPEG-2 video coding

Tri D. Tran; Lurng-Kuo Liu; Peter Westerink

High-quality and low-delay MPEG-2 video coding can be achieved by avoiding the use of intra (I) and bidirectional prediction (B) pictures. Such coding requires intra macroblock refresh techniques for resilience to channel error propagation and for compliance with the accuracy requirements of the MPEG-2 standard. This paper describes some of these techniques and presents software simulation results of their performance in terms of image quality and robustness to transmission channel errors.


International Symposium on Multimedia | 2002

An integrated live interactive content insertion system for digital TV commerce

Liang-Jie Zhang; Jen-Yao Chung; Lurng-Kuo Liu; James S. Lipscomb; Qun Zhou

We present an advanced architecture for selectively inserting interactive content into a live TV broadcast and tracking the usage of the inserted content by client viewers. Specifically, we propose a multi-level interactive content preview mechanism, an agent-based information exchange mechanism, and E-commerce enablement technology powered by an Interactive Content Creation Engine (ICCE). Also, we introduce two types of mapping lists for scheduling and for an intelligent decision maker based on the transaction monitoring and multi-level content preview. We propose that the improved live interactive content insertion method and system will be especially useful for building interactive TV e-commerce solutions.


Multimedia Signal Processing | 1997

Dynamic search range motion estimation for video coding

Lurng-Kuo Liu

A fast algorithm for motion estimation in interframe video coding is proposed in this paper. In contrast to previously proposed fast algorithms, which use a limited number of check points in a constant search range, the proposed algorithm performs its search in a dynamic search range. It provides better estimation accuracy than previously proposed fast algorithms that use limited search points. Experimental results show that the proposed algorithm provides performance comparable to that of full search but with reduced computational complexity.


International Conference on Image Processing | 1997

Rate-constrained motion estimation algorithm for video coding

Lurng-Kuo Liu

We propose a rate-distortion-constrained motion estimation algorithm that leads to improvements in motion-compensated video coding. This is especially true in low bit rate video coding applications such as video conferencing. We introduce the concept of coding efficiency, and the rate-distortion optimization process is formulated as a problem of maximizing coding efficiency. Based on the coding efficiency, the proposed motion estimation algorithm measures the effect of choosing different motion vectors on the overall bit rate and reconstruction distortion. The main advantages of the algorithm are that it employs a more effective measure of performance and it is computationally simple. Our experimental results show that our proposed rate-constrained motion estimation algorithm yields better rate-distortion performance when compared with conventional motion estimation algorithms.
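The abstract does not define its coding-efficiency measure, so the sketch below substitutes the widely used Lagrangian cost J = D + λR to illustrate how accounting for motion vector rate can change the chosen vector. The signed Exp-Golomb bit count is an assumed stand-in for a real entropy coder, and the names `mv_bits` and `rate_constrained_choice` are hypothetical.

```python
def mv_bits(component):
    """Bits to code one motion vector difference component with a signed
    Exp-Golomb code (an assumed rate model, not the paper's)."""
    code_num = 2 * abs(component) - (1 if component > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def rate_constrained_choice(candidates, predicted_mv, lam):
    """Pick the motion vector minimising distortion + lam * rate.

    candidates: dict mapping (dx, dy) -> distortion (for example, SAD).
    predicted_mv: (dx, dy) predictor; only the difference is coded.
    """
    best_mv, best_cost = None, float("inf")
    for (dx, dy), distortion in candidates.items():
        rate = mv_bits(dx - predicted_mv[0]) + mv_bits(dy - predicted_mv[1])
        cost = distortion + lam * rate
        if cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

With `lam = 0` this reduces to conventional distortion-only matching. Larger values of `lam` favor vectors close to the predictor, which matters most at low bit rates, where motion vectors take up a larger share of the bit budget.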
