Yeong-Kang Lai | Researchain

Publication

Featured researches published by Yeong-Kang Lai.

IEEE Transactions on Circuits and Systems for Video Technology | 1998

A data-interlacing architecture with two-dimensional data-reuse for full-search block-matching algorithm

Yeong-Kang Lai; Liang-Gee Chen

This paper describes a data-interlacing architecture with two-dimensional (2-D) data-reuse for the full-search block-matching algorithm. Based on a one-dimensional processing element (PE) array and two data-interlacing shift-register arrays, the proposed architecture can efficiently reuse data to decrease external memory accesses and reduce the pin count. It also achieves 100% hardware utilization and a high throughput rate. In addition, the same chips can be cascaded for different block sizes, search ranges, and pixel rates.
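For readers unfamiliar with the algorithm this architecture accelerates, the following is a minimal software sketch of full-search block matching, assuming a sum-of-absolute-differences (SAD) cost. The function names and the synthetic frames are illustrative assumptions and are not taken from the paper, which describes a hardware design.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n current block at (bx, by) and the reference
    block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Evaluate every displacement in [-p, p] x [-p, p] that keeps the
    candidate block inside the reference frame.
    Returns (best_dx, best_dy, best_sad)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            if not (0 <= bx + dx <= width - n):
                continue
            if not (0 <= by + dy <= height - n):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical test frames: a synthetic reference, and a current frame whose
# content is the reference shifted by (dx, dy) = (2, 1).
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]

result = full_search(cur, ref, bx=2, by=2, n=4, p=2)
print(result)
```

Neighbouring candidate blocks overlap in all but one row or column, which is the redundancy a data-reuse architecture can exploit to cut external memory traffic.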

IEEE Transactions on Consumer Electronics | 2009

A high-performance and memory-efficient VLSI architecture with parallel scanning method for 2-D lifting-based discrete wavelet transform

Yeong-Kang Lai; Lien-Fei Chen; Yui-Chih Shih

In this paper, we present a high-performance and memory-efficient pipelined architecture with a parallel scanning method for the 2-D lifting-based DWT in JPEG2000 applications. The proposed 2-D DWT architecture is composed of two 1-D DWT cores and a 2×2 transposing register array. The proposed 1-D DWT core consumes two input data and produces two output coefficients per cycle, and its critical path takes only one multiplier delay. Moreover, we use the parallel scanning method instead of the line-based scanning method to reduce the internal buffer size. For an N×N tile image with one-level 2-D DWT decomposition, only 4N temporal memory and the 2×2 register array are required for the 9/7 filter to store the intermediate coefficients in the column 1-D DWT core, and the column-processed data can be rearranged in the transposing array. According to the comparison results, the hardware cost of the 1-D DWT core and the internal memory requirements of the proposed 2-D DWT architecture are smaller than those of other familiar architectures at the same throughput rate. The implementation results show that the proposed 2-D DWT architecture can process 1080p HDTV pictures with five-level decomposition at 30 frames/sec.
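As background for the lifting scheme mentioned above, here is a rough one-level 1-D sketch of the 9/7 lifting steps in software. The constants are the commonly published 9/7 lifting coefficients; they, the symmetric boundary extension, and the scaling convention are assumptions not stated in the abstract, and the function names are hypothetical. The even/odd split mirrors the idea of taking two input samples and emitting one low-pass and one high-pass coefficient per step.

```python
# Commonly published 9/7 lifting constants (assumed, not from the abstract).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398


def dwt97_forward(x):
    """One-level 1-D 9/7 lifting on an even-length sequence.
    Returns (low, high) coefficient lists."""
    s, d = list(x[0::2]), list(x[1::2])  # even and odd samples
    n = len(d)
    for i in range(n):  # predict 1 (right edge mirrored)
        d[i] += ALPHA * (s[i] + s[min(i + 1, n - 1)])
    for i in range(n):  # update 1 (left edge mirrored)
        s[i] += BETA * (d[max(i - 1, 0)] + d[i])
    for i in range(n):  # predict 2
        d[i] += GAMMA * (s[i] + s[min(i + 1, n - 1)])
    for i in range(n):  # update 2
        s[i] += DELTA * (d[max(i - 1, 0)] + d[i])
    return [v / K for v in s], [v * K for v in d]


def dwt97_inverse(low, high):
    """Undo dwt97_forward by running the lifting steps in reverse order."""
    s, d = [v * K for v in low], [v / K for v in high]
    n = len(d)
    for i in range(n):
        s[i] -= DELTA * (d[max(i - 1, 0)] + d[i])
    for i in range(n):
        d[i] -= GAMMA * (s[i] + s[min(i + 1, n - 1)])
    for i in range(n):
        s[i] -= BETA * (d[max(i - 1, 0)] + d[i])
    for i in range(n):
        d[i] -= ALPHA * (s[i] + s[min(i + 1, n - 1)])
    out = []
    for even, odd in zip(s, d):
        out.extend([even, odd])
    return out


signal = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
low, high = dwt97_forward(signal)
restored = dwt97_inverse(low, high)
```

Because every lifting step is undone by subtracting what was added, the inverse reconstructs the input up to floating-point error regardless of the boundary or scaling convention chosen.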

International Symposium on Circuits and Systems | 2008

A high-speed 2-D transform architecture with unique kernel for multi-standard video applications

Chong-Yu Huang; Lien-Fei Chen; Yeong-Kang Lai

In this paper, a high-speed two-dimensional (2-D) transform architecture with a unique kernel for multi-standard video applications is proposed. On the basis of the new distributed arithmetic algorithm (NEDA), we use the recursive discrete cosine transform (DCT) algorithm to reduce the computational complexity of the NEDA and propose the unique kernel framework to unify the adder matrix design for different coefficient requirements. Owing to the proposed unique kernel framework, the adder matrices, which have only 13 adders, are independent of the coefficients of the 2-D transform. Therefore, many 2-D transform matrices can be easily realized via the proposed unique kernel and an efficient routing network to meet multi-standard video coding requirements.

IEEE/OSA Journal of Display Technology | 2011

Content-Based LCD Backlight Power Reduction With Image Contrast Enhancement Using Histogram Analysis

Yeong-Kang Lai; Yu-Fan Lai; Peng-Yu Chen

In recent years, low-power technology has had a significant impact on portable electronic devices; for mobile devices, low-power circuit design has become the primary issue. At present, the thin-film transistor liquid crystal display (TFT LCD) is widely used in handheld mobile devices. The TFT LCD consumes 20%-45% of total system power, depending on the application. The backlight dominates the power consumption of the display; controlling the backlight current to reduce the brightness and the contrast of the LCD can reduce the overall power consumption. However, this may cause significant changes in visual perception. To reduce power consumption without visible changes, the issue becomes how to reduce the current by adjusting brightness and contrast in accordance with the current image. Based on content analysis, this paper proposes two new algorithms: the new backlight-dimming algorithm (NBDA) and the new image enhancement algorithm (NIEA). The proposed methods can, on average, simultaneously reduce power consumption by 47% and improve the image enhancement ratio by 6.8%. Moreover, the structural-similarity index metric (SSIM) is used to evaluate image quality.
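The abstract does not give the details of NBDA or NIEA, so the sketch below is not the authors' method. It only illustrates the general idea of histogram-driven backlight dimming with pixel compensation, assuming a simple clipping rule: pick the darkest backlight level at which no more than a chosen fraction of pixels saturates, then scale pixel values up to compensate. The function name and the `clip_fraction` parameter are hypothetical.

```python
def dim_backlight(pixels, clip_fraction=0.02, levels=256):
    """Generic histogram-based dimming sketch (not NBDA/NIEA).
    pixels: iterable of integer grey levels in [0, levels - 1].
    Returns (backlight_factor, compensated_pixels)."""
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Walk down from the brightest level while the number of pixels that
    # would saturate stays within the allowed budget.
    allowed = clip_fraction * len(pixels)
    clipped = 0
    threshold = levels - 1
    while threshold > 1 and clipped + hist[threshold] <= allowed:
        clipped += hist[threshold]
        threshold -= 1

    top = levels - 1
    backlight = threshold / top  # relative backlight level in (0, 1]
    # Integer scaling with rounding, so that backlight * compensated pixel
    # approximates the original pixel for every unclipped value.
    compensated = [min(top, (p * top + threshold // 2) // threshold)
                   for p in pixels]
    return backlight, compensated


backlight, out = dim_backlight([0, 50, 100, 100], clip_fraction=0.0)
print(backlight, out)
```

With `clip_fraction=0.0` the brightest pixel in the image sets the backlight level, so nothing saturates; larger values trade some clipping in highlights for a dimmer backlight.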

International Symposium on Circuits and Systems | 2005

A simple and cost effective video encoder with memory-reducing CAVLC

Yeong-Kang Lai; Chih-Chung Chou; Yu-Chieh Chung

In this paper, a simple and cost-effective video encoder with a memory-efficient context-adaptive variable length coder (CAVLC) is proposed for low-cost multimedia applications. With the proposed memory reduction architecture, three coding level variables (prefix, length, and codeword) can be calculated on the fly to eliminate seven (level-VLCN, N=0 to 6) 28×64 k bit coding table memories. We implemented the design on a Xilinx FPGA prototyping board. Its maximum working frequency is 28 MHz, and the gate count is 9171 (NAND2) in TSMC 0.35 μm technology (video encoder only). The results show that a low-cost encoder is feasible, and the memory size of the proposed architecture is smaller than that of other designs.

International Conference on Consumer Electronics | 1997

A Novel MPEG-2 Audio Decoder With Efficient Data Arrangement And Memory Configuration

Tsung-Han Tsai; Liang-Gee Chen; Yeong-Kang Lai; Po-Cheng Wu

The paper describes a novel MPEG-2 audio decoder with a new modified scheme. With intelligent data arrangement techniques, the complexity of multichannel decoding can be largely reduced. In the modified decoding scheme, the bottleneck computation module can be reduced to a quarter of the size of the original. Also, the major memory storage requires only half the size of the standard synthesis subband filterbank.

IEEE/OSA Journal of Display Technology | 2013

An Effective Hybrid Depth-Generation Algorithm for 2D-to-3D Conversion in 3D Displays

Yeong-Kang Lai; Yu-Fan Lai; Ying-Chang Chen

In recent years, 3D display technology has been receiving increasing attention. The most intuitive 3D method is to use two temporally synchronized video streams for the left and right eyes, respectively. However, traditional 2D video content is captured by one camera, and depth map information is required to synthesize the left and right views as two cameras would capture them. In this paper, we propose a hybrid algorithm for 2D-to-3D conversion in 3D displays, which addresses the problem of generating 3D effects from traditional 2D video content. We choose three depth cues for depth estimation: motion information, linear perspective, and texture characteristics. Moreover, we adopt a bilateral filter for depth map smoothing and noise removal. In the experimental results, execution time is reduced by 25%-35% and the depth perception score is between 75 and 85. Thus, the human eye cannot perceive noticeable differences in the final 3D rendering, and the proposed hybrid algorithm is well suited to 2D-to-3D conversion in 3D displays.
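The bilateral filtering step mentioned above can be illustrated with a small, unoptimized sketch. This is a textbook bilateral filter applied to a depth map stored as a list of rows; the window radius and the two sigma parameters are assumed values, not the settings used in the paper.

```python
import math


def bilateral_filter(depth, radius=1, sigma_s=1.0, sigma_r=5.0):
    """Smooth a 2-D depth map while preserving depth edges.
    Each output value is a weighted mean of its neighbours, with weights
    that fall off with spatial distance (sigma_s) and with depth
    difference (sigma_r)."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            centre = depth[y][x]
            acc = 0.0
            norm = 0.0
            for j in range(max(0, y - radius), min(h, y + radius + 1)):
                for i in range(max(0, x - radius), min(w, x + radius + 1)):
                    spatial = (i - x) ** 2 + (j - y) ** 2
                    rng = (depth[j][i] - centre) ** 2
                    wgt = math.exp(-spatial / (2 * sigma_s ** 2)
                                   - rng / (2 * sigma_r ** 2))
                    acc += wgt * depth[j][i]
                    norm += wgt
            out[y][x] = acc / norm  # norm >= 1: the centre has weight 1
    return out


# Hypothetical depth map with a sharp vertical edge between 0 and 100.
depth_map = [[0, 0, 100, 100] for _ in range(4)]
smoothed = bilateral_filter(depth_map)
```

Because neighbours on the far side of a large depth step receive a near-zero range weight, noise within a region is averaged out while object boundaries in the depth map stay sharp.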

International Conference on Consumer Electronics | 2001

A memory efficient motion estimator for three step search block-matching algorithm

Yeong-Kang Lai

This paper describes a memory-efficient array architecture with data-rings for the 3-step hierarchical search block-matching algorithm (3SHS). With the efficient data-rings and memory organization, the regular raster-scanned data flow and the comparator-tree structure can be used to simplify the control scheme and reduce latency, respectively. In addition, we use the three-half-search-area scheme and a circular addressing method to reduce external memory access and memory size, respectively. The results demonstrate that the array architecture with a memory-efficient scheme requires a smaller memory size and lower I/O bandwidth. It also provides a high normalized throughput solution for the 3SHS.
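As a software illustration of the search strategy this architecture targets, the sketch below implements the classic three-step search with a SAD cost, assuming step sizes 4, 2, 1 (a ±7 search range). The function names and the synthetic frames are hypothetical, and none of the paper's data-ring or memory organization is modelled.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n current block at (bx, by) and the reference
    block displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def three_step_search(cur, ref, bx, by, n, first_step=4):
    """Check the 3 x 3 grid of candidates around the current centre,
    move the centre to the best one, halve the step, and repeat.
    Returns (dx, dy, sad)."""
    height, width = len(ref), len(ref[0])
    cx, cy = 0, 0
    best = sad(cur, ref, bx, by, 0, 0, n)
    step = first_step
    while step >= 1:
        best_dx, best_dy = cx, cy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                dx, dy = cx + ox, cy + oy
                inside = (0 <= bx + dx <= width - n
                          and 0 <= by + dy <= height - n)
                if not inside:
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best:
                    best, best_dx, best_dy = cost, dx, dy
        cx, cy = best_dx, best_dy
        step //= 2
    return cx, cy, best


# Hypothetical frames: the current frame is the reference shifted by (6, -4).
ref = [[x + 32 * y for x in range(24)] for y in range(24)]
cur = [[(x + 6) + 32 * (y - 4) for x in range(24)] for y in range(24)]

motion = three_step_search(cur, ref, bx=10, by=10, n=4)
print(motion)
```

The search evaluates at most 25 candidates instead of the 225 a ±7 full search would need, at the cost of possibly settling on a local minimum when the error surface is not smooth.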

IEEE Transactions on Consumer Electronics | 2010

Hybrid parallel motion estimation architecture based on fast top-winners search algorithm

Yeong-Kang Lai; Lien-Fei Chen; Shien-Yu Huang

In this paper, a hybrid parallel motion estimation architecture based on the fast top-winners search algorithm is proposed. First, the fast top-winners search algorithm is discussed; it is based on the pel-subsampling technique, which reduces the computation required for the sum of absolute differences (SAD). Moreover, four-parallel spiral scanning (4PSP) with the partial distortion elimination (PDE) mechanism is used to terminate unnecessary SAD computations early. Therefore, the proposed fast algorithm can not only avoid becoming trapped in local minima but also save computational operations with little performance degradation. For the proposed algorithm, a 4×4 processing element (PE) array and a dual-mode SAD tree are proposed to efficiently compute the SAD and the Sub-SAD, which is accumulated based on pel-subsampling. To reduce the system memory bandwidth and the frequency of memory access, a local memory configuration and a novel memory interleaving organization are proposed to arrange the current data and reference pixels easily, to access the image pixels efficiently, and to achieve the Level C (Lv. C) data reuse scheme.
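The partial distortion elimination idea can be shown with a short single-threaded sketch: candidates are visited from the centre outward, and a SAD computation is abandoned as soon as its partial sum can no longer beat the best cost found so far. This is a generic illustration under assumed names; it does not model the top-winners selection, pel-subsampling, or the four-parallel scanning described in the abstract.

```python
def spiral_offsets(p):
    """Displacements in [-p, p] x [-p, p], ordered ring by ring from the
    centre outward (a simple stand-in for a spiral scan)."""
    offsets = [(0, 0)]
    for r in range(1, p + 1):
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if max(abs(dx), abs(dy)) == r:
                    offsets.append((dx, dy))
    return offsets


def sad_with_pde(cur, ref, bx, by, dx, dy, n, limit):
    """Row-by-row SAD that returns None once the partial sum reaches
    limit, since the candidate can no longer improve on the best."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        if total >= limit:
            return None
    return total


def pde_search(cur, ref, bx, by, n, p):
    """Full-range search with spiral ordering and early termination.
    Returns (best_dx, best_dy, best_sad)."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, float("inf"))
    for dx, dy in spiral_offsets(p):
        if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
            continue
        cost = sad_with_pde(cur, ref, bx, by, dx, dy, n, best[2])
        if cost is not None:
            best = (dx, dy, cost)
    return best


# Hypothetical frames: the current frame is the reference shifted by (2, 1).
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[7 * (x + 2) + 13 * (y + 1) for x in range(8)] for y in range(8)]

found = pde_search(cur, ref, bx=2, by=2, n=4, p=2)
print(found)
```

Early termination never changes the winning candidate; it only skips arithmetic for candidates that have already lost, which is why visiting likely winners first (near the centre) makes it more effective.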

International Conference on Consumer Electronics | 1996

A multimedia video conference system: using region base hybrid coding

Hsu-Tung Chen; Po-Cheng Wu; Yeong-Kang Lai; Liang-Gee Chen

In this paper, a video coding algorithm suitable for video conference applications and an investigation of a video conference system over a LAN are presented. The proposed video coding algorithm is called the region base hybrid coding algorithm. This method performs foreground and background segmentation, exploiting the fact that the background is unchanged in most video conference applications. This property can reduce both the data rate and the computation load. The proposed algorithm is verified on a video conference system over a LAN, where it works successfully.
