Derek Pang
Stanford University
Publication
Featured research published by Derek Pang.
2010 18th International Packet Video Workshop | 2010
Aditya Mavlankar; Piyush Agrawal; Derek Pang; Sherif A. Halawa; Ngai-Man Cheung; Bernd Girod
ClassX is an interactive online lecture viewing system developed at Stanford University. Unlike existing solutions that restrict the user to watching only a pre-defined view, ClassX allows interactive pan/tilt/zoom while watching the video. The interactive video streaming paradigm avoids sending the entire field-of-view in the recorded high resolution, thus reducing the required data rate. To alleviate the navigation burden on the part of the online viewer, ClassX offers automatic tracking of the lecturer. ClassX also employs slide recognition technology, which allows automatic synchronization of digital presentation slides with those appearing in the lecture video. This paper presents a design overview of the ClassX system and the evaluation results of a 3-month pilot deployment at Stanford University. The results demonstrate that our system is a low-cost, efficient and pragmatic solution to interactive online lecture viewing.
acm multimedia | 2011
Sherif A. Halawa; Derek Pang; Ngai-Man Cheung; Bernd Girod
The ClassX open source project is a free experimental interactive video streaming platform designed for educators, researchers and software developers. With minimal infrastructure set-up, ClassX offers educational communities a cost-effective solution for online lecture delivery. Our goal is to encourage contributions from other researchers, developers and educators in building an open, cost-effective and state-of-the-art online education video viewing system for the general public.
acm multimedia | 2011
Derek Pang; Sherif A. Halawa; Ngai-Man Cheung; Bernd Girod
Small screen sizes, limited bandwidth, and low computational power often prohibit streaming of high-resolution videos to mobile devices over a wireless network. Recent advances in interactive region-of-interest (IRoI) video streaming technology enable users to interactively control pan/tilt/zoom, while providing bit-rate and complexity savings. One recent application of IRoI video streaming is ClassX, developed at Stanford University. It offers an open-source experimental platform for interactive online lecture streaming. In this technical demonstration, we present ClassX Mobile, which extends the current ClassX system and delivers high-quality interactive video to smartphones and tablets with multi-touch screens.
acm multimedia | 2011
Derek Pang; Sherif A. Halawa; Ngai-Man Cheung; Bernd Girod
Small screen sizes, limited bandwidth, and low computational power often prohibit streaming of high-resolution videos to mobile devices over a wireless network. Recent advances in interactive region-of-interest (IRoI) video streaming technology allow users to interactively control pan/tilt/zoom, while providing bit-rate and complexity savings. In this paper, we present a mobile IRoI video streaming system that delivers high-quality interactive video to smartphones and tablets with multi-touch screens. One of the challenges in IRoI video streaming is to enable low-latency interaction when a user switches between different RoIs. We propose a crowd-driven RoI prediction scheme to prefetch future selected regions. Different from previous approaches that extrapolate past user inputs or perform video semantic analysis, our proposed scheme exploits user viewing statistics collected at the server to make RoI predictions. Our experiments show that a crowd-driven prefetching scheme can substantially reduce average RoI switching delays compared to a system without prefetching.
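The crowd-driven prefetching idea in the abstract above can be illustrated with a minimal Python sketch. It assumes the server simply counts which RoI each past viewer watched during each video segment and that the client prefetches the most popular RoIs for the upcoming segment. The class name `CrowdRoIPredictor`, the segment indexing, and the RoI identifiers are hypothetical and are not taken from the paper, whose actual prediction scheme may differ.

```python
from collections import Counter, defaultdict


class CrowdRoIPredictor:
    """Hypothetical server-side store of per-segment RoI viewing counts."""

    def __init__(self):
        # Maps segment index -> Counter of {RoI id: number of viewers}.
        self._views = defaultdict(Counter)

    def record_view(self, segment, roi_id):
        """Log that one viewer watched `roi_id` during `segment`."""
        self._views[segment][roi_id] += 1

    def predict(self, segment, k=1):
        """Return up to k RoI ids most often watched in `segment`.

        Returns an empty list when no statistics exist for the segment,
        in which case a client would have nothing to prefetch.
        """
        counts = self._views.get(segment)
        if not counts:
            return []
        return [roi for roi, _ in counts.most_common(k)]


if __name__ == "__main__":
    predictor = CrowdRoIPredictor()
    for roi in ["slide", "slide", "lecturer", "slide", "board", "lecturer"]:
        predictor.record_view(segment=5, roi_id=roi)
    # Candidate regions a client could prefetch before segment 5 starts.
    print(predictor.predict(segment=5, k=2))
```

Prefetching more than one candidate (k > 1) trades extra bit-rate for a higher chance that the region the user switches to is already buffered.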
international conference on multimedia and expo | 2009
Derek Pang; Xiaoyu Xiu; Jie Liang
Current view synthesis prediction (VSP) techniques for multiview video coding (MVC) rely on disparity-based view interpolation or depth-based 3D warping. The former cannot be applied to every camera view, whereas the latter may require coding of the depth information of a scene. To avoid these constraints, we propose an improved VSP-based MVC scheme based on the following three techniques: 1) view extrapolation, which allows VSP to be applicable to almost all camera views, 2) projective rectification, which improves the synthesis quality when neighboring camera planes are not parallel, and 3) synthesis bias correction, which uses the past synthesis biases to improve the synthesis quality of the current frame. Experimental results demonstrate that our scheme offers PSNR gains of up to 1.6 dB compared to the current MVC standard.
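To put the reported PSNR gain in perspective, the following Python sketch uses the standard definition PSNR = 10 log10(peak^2 / MSE) to convert a gain in dB into the equivalent reduction in mean squared error. The 8-bit peak value of 255 and the function names are assumptions made for illustration; the abstract does not state the bit depth or how the gain was measured.

```python
import math


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by `gain_db` dB."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    ratio = mse_ratio_for_gain(1.6)
    # A 1.6 dB gain corresponds to roughly a 31% lower MSE.
    print(f"MSE ratio: {ratio:.3f}")
```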
IEEE Transactions on Circuits and Systems for Video Technology | 2011
Xiaoyu Xiu; Derek Pang; Jie Liang
In this paper, we first develop improved projective rectification-based view interpolation and extrapolation methods, and apply them to view synthesis prediction-based multiview video coding (MVC). A geometric model for these view synthesis methods is then developed. We also propose an improved model to study the rate-distortion (R-D) performances of various practical MVC schemes, including the current joint multiview video coding standard. Experimental results show that our schemes achieve superior view synthesis results, and can lead to better R-D performance in MVC. Simulation results with the theoretical models help explain the experimental results.
international conference on pattern recognition | 2008
Akisato Kimura; Derek Pang; Tatsuto Takeuchi; Junji Yamato; Kunio Kashino
This report proposes a new stochastic model of visual attention to predict the likelihood of where humans typically focus on a video scene. The proposed model is composed of a dynamic Bayesian network that simulates and combines a person's visual saliency response and eye movement patterns to estimate the most probable regions of attention. Dynamic Markov random field (MRF) models are newly introduced to include spatiotemporal relationships of visual saliency responses. Experimental results have revealed that the proposed model outperforms the previous deterministic model and the stochastic model without dynamic MRF in predicting human visual attention.
international conference on image processing | 2011
Mina Makar; Yao-Chung Lin; Ngai-Man Cheung; Derek Pang; Bernd Girod
Multiview video coding systems are characterized by their high encoding complexity. In this paper, we propose to reduce the encoding complexity by omitting frames at the encoder in a pattern that keeps the ability to reconstruct an interpolated version of each omitted frame, by motion-compensated and inter-view interpolation. Since our goal is maintaining good visual quality, we propose a method to estimate the quality of the interpolated frames at the decoder by transmitting a small amount of error control information in lieu of an omitted frame. This information is obtained from a projection of the frame on a suitable low-dimensional basis. The projection coefficients can be compressed by conventional techniques or, more efficiently, by Slepian-Wolf coding. Based on the quality estimate, the decoder can adaptively detect the interpolation technique that results in better visual quality. Moreover, it can suppress occasional frames for which interpolation yields noticeable artifacts. Experimental results demonstrate that our approach eliminates most interpolation artifacts and achieves much better visual quality at a very small increase in bit-rate.
asilomar conference on signals, systems and computers | 2010
Mina Makar; Derek Pang; Yao-Chung Lin; Bernd Girod
Low-complexity video encoding is important for mobile and power-sensitive applications. One way to reduce encoding complexity is to drop frames at the encoder and perform motion-compensated frame interpolation at the decoder. We propose a method to estimate the quality of the interpolated frames at the decoder by transmitting a small amount of error control information in lieu of an omitted frame. Typically, this information is obtained from a projection of the frame on a suitable low-dimensional basis. The projection coefficients can be compressed by conventional techniques or, more efficiently, by Slepian-Wolf coding. Based on the quality estimate, the decoder can recognize and suppress occasional frames for which motion-compensated interpolation does not yield satisfactory picture quality. Experimental results demonstrate that our approach eliminates most interpolation artifacts and achieves much better visual quality at a negligible increase in bit-rate.
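The projection-based quality estimate described above can be sketched in Python under simplifying assumptions. The sketch assumes an orthonormal basis, uncompressed coefficients (Slepian-Wolf coding is not modeled), and error energy spread evenly across dimensions, so that the energy seen in k coefficients divided by k approximates the per-pixel MSE. The function names and the threshold rule are hypothetical and are not taken from the paper.

```python
def project(frame, basis):
    """Inner product of a flattened frame with each basis vector."""
    return [sum(b * x for b, x in zip(vec, frame)) for vec in basis]


def estimate_mse(sent_coeffs, interpolated, basis):
    """Estimate per-pixel MSE of an interpolated frame at the decoder.

    `sent_coeffs` are the encoder's projections of the omitted frame.
    Assumption: `basis` is orthonormal and the interpolation error is
    spread evenly over all dimensions, so the error energy captured by
    the k basis vectors, divided by k, approximates the per-pixel MSE.
    """
    local = project(interpolated, basis)
    err_energy = sum((s - l) ** 2 for s, l in zip(sent_coeffs, local))
    return err_energy / len(basis)


def should_suppress(sent_coeffs, interpolated, basis, mse_threshold):
    """Hypothetical decision rule: drop frames whose estimate is too high."""
    return estimate_mse(sent_coeffs, interpolated, basis) > mse_threshold


if __name__ == "__main__":
    # Two orthonormal vectors in a 4-pixel toy "frame".
    basis = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
    original = [10, 20, 30, 40]
    coeffs = project(original, basis)          # computed at the encoder
    interpolated = [12, 22, 32, 42]            # produced at the decoder
    print(estimate_mse(coeffs, interpolated, basis))
    print(should_suppress(coeffs, interpolated, basis, mse_threshold=5.0))
```

In this toy case the error lies entirely along the first basis vector, so the estimate overshoots the true MSE; the even-spread assumption only holds approximately for a well-chosen basis.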
arXiv: Computer Vision and Pattern Recognition | 2010
Akisato Kimura; Derek Pang; Tatsuto Takeuchi; Kouji Miyazato; Junji Yamato; Kunio Kashino