Philip P. Dang
STMicroelectronics
Publication
Featured research published by Philip P. Dang.
Journal of Real-time Image Processing | 2006
Philip P. Dang
This paper discusses the challenges of the design of real-time image and video processing systems and reviews some practical design approaches for these systems.
IEEE Transactions on Very Large Scale Integration Systems | 2008
Philip P. Dang
This paper presents an efficient architecture for an application-specific processor (ASP) designed for the deblocking filter algorithm of the H.264 video compression standard. Several optimization techniques at different design levels, such as vector registers, pipelined processing, a very long instruction word (VLIW) architecture, and predication, are utilized in this design. The proposed ASP can meet the real-time constraint of the deblocking filter algorithm for the 16:9 video format (4096 × 2304) at 30 frames per second (fps) at a 200-MHz clock rate.
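As a rough check on the real-time claim, the average cycle budget per macroblock can be worked out from the figures in the abstract. The sketch below assumes 16 × 16 macroblocks and takes the frame size as 4096 × 2304, the size consistent with the stated 16:9 ratio. The function name is illustrative and does not come from the paper.

```python
def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Average clock cycles available per macroblock (illustrative helper)."""
    # Number of macroblocks in one frame, assuming the frame is a
    # whole number of macroblocks in each dimension.
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_second = mbs_per_frame * fps
    return clock_hz / mbs_per_second


# Figures from the abstract: 30 fps at a 200-MHz clock.
# The 4096 x 2304 frame size is an assumption (see lead-in).
budget = cycles_per_macroblock(4096, 2304, 30, 200_000_000)
print(f"{budget:.1f} cycles per macroblock")
```

Under these assumptions the processor has on the order of 180 cycles to filter all edges of one macroblock, which suggests why vector registers and VLIW-style parallel issue are needed to stay within budget.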
international conference on consumer electronics | 2005
Philip P. Dang
This work presents an efficient architecture for implementing the in-loop deblocking filter of the H.264 video compression standard. The proposed solution is based on a very long instruction word (VLIW) architecture, pipelined processing, and predication.
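To illustrate what predication means for a deblocking filter, the sketch below filters one pair of samples across a block edge without branching on the filter decision: the correction is always computed and then multiplied by a 0/1 predicate. It covers only the p0/q0 update of the normal-strength H.264 filter. The thresholds `alpha`, `beta` and the clipping bound `tc` are passed in as plain parameters rather than derived from the quantization parameter, and the function names are hypothetical. This is not the paper's implementation.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def filter_edge_sample(p1, p0, q0, q1, alpha, beta, tc):
    """Filter the two samples nearest a block edge (p0 | q0).

    alpha, beta and tc are assumed to be supplied by the caller;
    in a real codec they come from lookup tables indexed by QP.
    """
    # Predicate: 1 when the edge looks like a blocking artifact, else 0.
    filter_on = int(
        abs(p0 - q0) < alpha and abs(p1 - p0) < beta and abs(q1 - q0) < beta
    )
    # The correction is computed unconditionally, as predicated hardware would.
    delta = clip3(-tc, tc, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)
    # Predicated select: the correction is zeroed instead of branching around it.
    delta *= filter_on
    return clip3(0, 255, p0 + delta), clip3(0, 255, q0 - delta)


print(filter_edge_sample(60, 60, 80, 80, alpha=40, beta=10, tc=4))
print(filter_edge_sample(60, 60, 80, 80, alpha=10, beta=10, tc=4))
```

The first call smooths the step across the edge; the second leaves the samples untouched because the step exceeds `alpha` and is treated as a real image edge.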
electronic imaging | 2008
Philip P. Dang
This paper presents an efficient VLSI architecture for the intra prediction of the H.264 video compression standard. To address the computational complexity issue, we propose a dedicated processor that can compute multiple intra prediction modes in parallel. The proposed architecture accelerates the intra coding process. It can support large video formats at high frame rates in real time.
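The idea of evaluating several intra prediction modes for the same block can be sketched in software. The example below builds three of the H.264 4 × 4 luma modes (vertical, horizontal, DC) from the neighboring samples and picks the one with the lowest sum of absolute differences (SAD). The SAD cost and all function names are assumptions made for illustration. The modes are evaluated one after another here, whereas the abstract describes hardware that computes them in parallel.

```python
def intra4x4_candidates(top, left):
    """Build vertical, horizontal and DC predictions for a 4x4 block.

    top:  the 4 reconstructed samples above the block
    left: the 4 reconstructed samples to the left of the block
    """
    vertical = [list(top) for _ in range(4)]          # copy the row above downward
    horizontal = [[left[r]] * 4 for r in range(4)]    # copy the left column rightward
    dc = (sum(top) + sum(left) + 4) >> 3              # rounded mean of the 8 neighbors
    dc_block = [[dc] * 4 for _ in range(4)]
    return {"vertical": vertical, "horizontal": horizontal, "dc": dc_block}


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for brow, prow in zip(block, pred)
               for b, p in zip(brow, prow))


def best_mode(block, top, left):
    """Return the lowest-cost mode name and the cost of every candidate."""
    costs = {name: sad(block, pred)
             for name, pred in intra4x4_candidates(top, left).items()}
    return min(costs, key=costs.get), costs


block = [[10, 20, 30, 40] for _ in range(4)]
mode, costs = best_mode(block, top=[10, 20, 30, 40], left=[50, 50, 50, 50])
print(mode, costs)
```

Because each candidate depends only on the shared neighbor samples, the three predictions are independent of each other, which is what makes a parallel hardware evaluation possible.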
electronic imaging | 2005
Philip P. Dang
H.264 is the latest video compression standard. Its rate-distortion performance is greatly improved compared to MPEG-1, MPEG-2, MPEG-4, H.261 and H.263. Among the many features of H.264, sub-pixel motion compensation is one of the factors that make it a better coding scheme. H.264 implements both half-pixel and quarter-pixel interpolation, so the computational complexity of sub-pixel motion compensation is high. This paper presents an efficient VLSI architecture for fast implementation of the sub-pixel interpolation of H.264. Several techniques are designed to reduce the number of memory accesses and accelerate the interpolation computations.
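For orientation, the half-pixel and quarter-pixel luma computations can be sketched in one dimension. H.264 derives half-sample values with the 6-tap filter (1, -5, 20, 20, -5, 1) and quarter-sample values by a rounded average of neighboring samples. The code below is a minimal horizontal-only sketch: the edge clamping and the function names are assumptions, and the 2-D center position and the paper's memory-access optimizations are not modeled.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264 luma half-sample filter taps


def clip255(x):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, x))


def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1]."""
    last = len(row) - 1
    # Six integer samples centered on the half position; indices are
    # clamped at the row ends (edge padding is an assumption here).
    acc = sum(t * row[min(max(i - 2 + k, 0), last)] for k, t in enumerate(TAPS))
    return clip255((acc + 16) >> 5)  # taps sum to 32, so round and divide by 32


def quarter_pel(row, i):
    """Quarter-sample value between row[i] and the adjacent half-sample."""
    return (row[i] + half_pel(row, i) + 1) >> 1


edge = [0, 0, 0, 255, 255, 255]
print(half_pel(edge, 2), quarter_pel(edge, 2))
```

Each half-sample needs six integer samples, and adjacent half-samples reuse five of them, which is the kind of overlap that a hardware design can exploit to cut memory traffic.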
electronic imaging | 2004
Philip P. Dang; Truong Q. Nguyen; Trac D. Tran
This paper presents an efficient VLSI architecture and a low-complexity implementation of a BinDCT coprocessor for wireless video applications. The coprocessor architecture was implemented in VHDL and synthesized with 0.18 μm CMOS technology. The footprint of the 2-D BinDCT coprocessor, which includes a memory buffer, is 0.1173 mm². The BinDCT coprocessor can process video in CIF format at 30 frames per second at a 5 MHz clock rate with a 1.55-volt power supply. The BinDCT coprocessor dissipates 12.05 mW. With its fast transform, compact size and low power consumption, the BinDCT coprocessor is an excellent candidate for DCT-based wireless multimedia coding systems.
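The throughput and power figures above can be turned into a per-block cycle budget and a per-frame energy estimate. The sketch below assumes CIF luma resolution of 352 × 288, 8 × 8 transform blocks, and 4:2:0 chroma subsampling. None of these assumptions is stated in the abstract, and the function name is illustrative.

```python
def cif_transform_budget(clock_hz=5_000_000, fps=30, power_w=12.05e-3):
    """Estimate blocks per frame, cycles per 8x8 block, and energy per frame."""
    width, height = 352, 288                            # CIF luma resolution
    luma_blocks = (width // 8) * (height // 8)
    chroma_blocks = 2 * (width // 16) * (height // 16)  # 4:2:0 assumed
    blocks_per_frame = luma_blocks + chroma_blocks
    cycles_per_block = clock_hz / (blocks_per_frame * fps)
    energy_per_frame_j = power_w / fps                  # joules per frame
    return blocks_per_frame, cycles_per_block, energy_per_frame_j


blocks, cycles, energy = cif_transform_budget()
print(blocks, round(cycles, 1), f"{energy * 1e3:.3f} mJ")
```

Under these assumptions the coprocessor has about 70 cycles per 8 × 8 block and spends roughly 0.4 mJ per frame on the transform.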
international conference on consumer electronics | 2006
Philip P. Dang
This paper presents an efficient VLSI architecture to implement subpixel interpolation for the H.264 video compression standard.
electronic imaging | 2006
Philip P. Dang
This paper introduces an adaptive approach for image scaling. In addition, we present an efficient VLSI architecture to implement the proposed algorithm in hardware. The proposed architecture is designed to address the real-time constraints of high-performance consumer products. A case study for a printer application is presented.
electronic imaging | 2004
Philip P. Dang
This paper presents a VLSI architecture and an efficient implementation of an embedded transform coprocessor for the H.264 video compression standard. The proposed coprocessor was designed to work with an ARM946E-S processor. To enhance performance, both data parallelism and a pipelined architecture are utilized in the design. In this study, the coprocessor was synthesized with 0.18 μm CMOS technology and its footprint is only 0.0838 mm². The coprocessor can compute the 2-D transform for a macroblock in 30 clock cycles. The 2-D transform coprocessor dissipates 529 μW with a 1.55-volt power supply at a 10 MHz clock rate.
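As background, the core of the H.264 forward transform for a 4 × 4 residual block is the integer matrix product Y = C X Cᵀ, where C contains only ±1 and ±2. The sketch below computes it with plain matrix multiplication. The helper names are illustrative, the scaling that H.264 folds into quantization is omitted, and nothing here reflects the coprocessor's internal datapath.

```python
# H.264 4x4 forward core transform matrix (entries are only +-1 and +-2).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transpose(m):
    """Transpose a square matrix."""
    return [list(col) for col in zip(*m)]


def forward_core_transform(block):
    """Return CF * block * CF^T for a 4x4 residual block."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[1] * 4 for _ in range(4)]
print(forward_core_transform(flat))
```

A flat block puts all of its energy into the top-left (DC) coefficient. Because every multiplier is 1 or 2, a hardware version needs only adders and one-bit shifts, and the row and column passes can be pipelined.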
color imaging conference | 2003
Philip P. Dang
In this paper, we present an architecture for a color halftoning coprocessor. The design is based on a software/hardware co-design approach that combines the flexibility and adaptability of a programmable processor with the high performance and low power of an ASIC design. We employ the concurrency and locality concepts of computer architecture to address the computationally intensive and data-intensive aspects of the color halftoning algorithm. Both instruction parallelism and data parallelism are exploited to improve performance. In addition, fine-grain and middle-grain instruction-level parallelism (ILP) are utilized to accelerate the computation in the color error diffusion halftoning process.
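The abstract does not say which error diffusion kernel is used, so the sketch below substitutes the well-known Floyd-Steinberg weights (7/16, 3/16, 5/16, 1/16) to show the basic computation. Each color plane is halftoned independently, which is a simplification, and all function names are hypothetical.

```python
def error_diffuse(plane, threshold=128):
    """Halftone one 8-bit plane to {0, 255} with Floyd-Steinberg diffusion."""
    h, w = len(plane), len(plane[0])
    work = [[float(v) for v in row] for row in plane]   # running values with error added
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255 if old >= threshold else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error onto unprocessed neighbors.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def halftone_color(planes):
    """Halftone each color plane independently (a simplifying assumption)."""
    return [error_diffuse(p) for p in planes]


gradient = [[x * 17 for x in range(16)] for _ in range(4)]
result = halftone_color([gradient, gradient, gradient])
print(result[0][0])
```

The per-plane loop exposes data parallelism across color channels, while the four neighbor updates for each pixel are independent of one another and are the kind of work that instruction-level parallelism can overlap.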