Jordi Ribas-Corbera
Microsoft
Publications
Featured research published by Jordi Ribas-Corbera.
IEEE Transactions on Circuits and Systems for Video Technology | 1999
Jordi Ribas-Corbera; Shaw-Min Lei
An important motivation for the development of the emerging H.263+ and MPEG-4 coding standards is to enhance the quality of highly compressed video for two-way, real-time communications. In these applications, the delay produced by bits accumulated in the encoder buffer must be very small, typically below 100 ms, and the rate control strategy is responsible for encoding the video with high quality and maintaining a low buffer delay. In this work, we present a simple rate control technique that achieves these two objectives by smartly selecting the values of the quantization parameters in typical discrete cosine transform video coders. To do this, we derive models for bit rate and distortion in this type of coder, in terms of the quantization parameters. Using Lagrange optimization, we minimize distortion subject to the target bit constraint, and obtain formulas that indicate how to choose the quantization parameters. We implement our technique in H.263 and MPEG-4 coders, and compare its performance to TMN7 and VM7 rate control when the encoder buffer is small, for a variety of video sequences and bit rates. This new method has been adopted as a rate control tool in the test model TMN8 of H.263+ and (with some modifications) in the verification model VM8 of MPEG-4.
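The Lagrange step described in this abstract can be illustrated with a minimal sketch. The abstract does not state the rate and distortion models, so the ones used here are assumptions chosen only to make the optimization concrete: block i costs `a_pixels * k * sigma_i**2 / Q_i**2` bits and contributes `Q_i**2 / 12` distortion. The names `a_pixels`, `k`, and all function names are hypothetical and are not taken from the paper.

```python
import math

# Assumed models (NOT quoted from the paper):
#   bits_i       = a_pixels * k * sigma_i**2 / Q_i**2
#   distortion_i = Q_i**2 / 12
# Minimizing sum(distortion_i) subject to sum(bits_i) == target_bits with a
# Lagrange multiplier gives Q_i = sqrt(a_pixels * k * sigma_i * sum(sigma) / target_bits).

def lagrange_quantizer_steps(sigmas, target_bits, a_pixels=256, k=1.0):
    """Closed-form step sizes under the assumed models (sigmas must be > 0)."""
    total_sigma = sum(sigmas)
    return [math.sqrt(a_pixels * k * s * total_sigma / target_bits) for s in sigmas]

def uniform_step(sigmas, target_bits, a_pixels=256, k=1.0):
    """Single step size for all blocks that spends the same bit budget."""
    return math.sqrt(a_pixels * k * sum(s * s for s in sigmas) / target_bits)

def total_bits(sigmas, steps, a_pixels=256, k=1.0):
    """Bits spent by the assumed rate model."""
    return sum(a_pixels * k * s * s / (q * q) for s, q in zip(sigmas, steps))

def total_distortion(steps):
    """Distortion under the assumed uniform-quantizer model."""
    return sum(q * q / 12.0 for q in steps)

sigmas = [2.0, 5.0, 9.0]   # hypothetical per-block standard deviations
target = 600.0             # hypothetical frame bit budget
steps = lagrange_quantizer_steps(sigmas, target)
flat = [uniform_step(sigmas, target)] * len(sigmas)
print(total_bits(sigmas, steps), total_distortion(steps), total_distortion(flat))
```

Under these assumed models both allocations spend the same budget, and the Lagrange allocation gives lower total distortion than a single shared step size whenever the block variances differ.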
IEEE Transactions on Circuits and Systems for Video Technology | 2003
Jordi Ribas-Corbera; Philip A. Chou; Shankar Regunathan
In video coding standards, a compliant bit stream must be decoded by a hypothetical decoder that is conceptually connected to the output of an encoder and consists of a decoder buffer, a decoder, and a display unit. This virtual decoder is known as the hypothetical reference decoder (HRD) in H.263 and the video buffering verifier in MPEG. The encoder must create a bit stream so that the hypothetical decoder buffer does not overflow or underflow. These previous decoder models assume that a given bit stream will be transmitted through a channel of a known bit rate and will be decoded (after a given buffering delay) by a device of some given buffer size. Therefore, these models are quite rigid and do not address the requirements of many of today's important video applications such as broadcasting video live or streaming pre-encoded video on demand over network paths with various peak bit rates to devices with various buffer sizes. In this paper, we present a new HRD for H.264/AVC that is more general and flexible than those defined in prior standards and provides significant additional benefits.
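The overflow and underflow conditions described in this abstract can be sketched as a simplified constant-rate buffer check. This is an illustration only, not the H.264/AVC HRD: it assumes bits arrive at a constant rate, each frame is removed from the buffer instantaneously at its decode time, and the function name and all parameter names are hypothetical.

```python
def check_decoder_buffer(frame_bits, rate_bps, buffer_bits, initial_delay_s, fps):
    """Return 'ok', 'underflow', or 'overflow' for a constant-rate channel.

    Simplifying assumptions: bits arrive at rate_bps from time 0 until the
    stream ends, and frame i is removed all at once at initial_delay_s + i / fps.
    """
    total = float(sum(frame_bits))
    removed = 0.0
    eps = 1e-6  # tolerance for floating-point rounding
    for i, bits in enumerate(frame_bits):
        t = initial_delay_s + i / fps
        arrived = min(rate_bps * t, total)
        fullness = arrived - removed      # buffer level just before removal
        if fullness > buffer_bits + eps:
            return "overflow"
        if fullness < bits - eps:
            return "underflow"            # frame not fully received in time
        removed += bits
    return "ok"

frames = [40000, 10000, 10000, 10000]     # hypothetical frame sizes in bits
print(check_decoder_buffer(frames, 300000, 64000, 0.2, 30))  # ok
print(check_decoder_buffer(frames, 300000, 64000, 0.1, 30))  # underflow
print(check_decoder_buffer(frames, 300000, 50000, 0.2, 30))  # overflow
print(check_decoder_buffer(frames, 600000, 64000, 0.1, 30))  # ok
```

The last call shows the same stream passing with a shorter delay once the channel rate is higher, which is the kind of rate, buffer size, and delay trade-off that a single fixed set of parameters cannot express.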
Signal Processing-image Communication | 2004
Sridhar Srinivasan; Pohsiang Hsu; Tom Holcomb; Kunal Mukerjee; Shankar Regunathan; Bruce Lin; Jie Liang; Ming-Chieh Lee; Jordi Ribas-Corbera
Microsoft® Windows Media 9 Series is a set of technologies that enables rich digital media experiences across many types of networks and devices. These technologies are widely used in the industry for media delivery over the internet and other media, and are also applied to broadcast, high definition DVDs, and digital projection in theaters. At the core of these technologies is a state-of-the-art video codec called Windows Media Video 9 (WMV-9), which provides highly competitive video quality for reasonable computational complexity. WMV-9 is currently under standardization by the Society of Motion Picture and Television Engineers (SMPTE) and the spec is at the CD (Committee Draft) stage. This paper includes a brief introduction to Windows Media technologies and their applications, with a focus on the compression algorithms used in WMV-9. We present analysis, experimental results, and independent studies that demonstrate quality benefits of WMV-9 over a variety of codecs, including optimized implementations of MPEG-2, MPEG-4, and H.264/AVC. We also discuss the complexity advantages of WMV-9 over H.264/AVC.
IEEE Transactions on Circuits and Systems for Video Technology | 2000
Jordi Ribas-Corbera; Shaw-Min Lei
In typical block-based video coding, the rate-control scheme allocates a target number of bits to each frame of a video sequence and selects the block quantization parameters to meet the frame targets. In this work, we present a new technique for assigning such targets. This method has been adopted in the test model TMN10 of H.263+, but it is applicable to any video coder and is particularly useful for those that use B frames. Our approach selects the frame targets using formulas that result from combining an analytical rate-distortion optimization and a heuristic technique that compensates for the distortion dependency among frames. The method does not require pre-analyses, and encodes each frame only once; hence, it is geared toward low-complexity real-time video coding. We compare this new frame-layer bit allocation in TMN10 to that in MPEG-2's TM5 for a variety of bit rates and video sequences.
international conference on image processing | 1998
Scott J. Daly; Kristine E. Matthews; Jordi Ribas-Corbera
We have developed a coding technique which exploits characteristics of the human visual system to allocate more bits to the region in which a viewer is most likely, and assumed, to be looking. We focus on applications, such as video phone and video teleconferencing, in which the regions of interest are those containing human faces. Our approach encodes the video frames more efficiently than uniform quantization and can be used to significantly reduce the encoding bit rate or to improve the perceived image quality for a given bit budget. The method consists of the following: face detection and tracking, local visual-sensitivity determination, and quantizer control.
IEEE Transactions on Circuits and Systems for Video Technology | 2001
Jordi Ribas-Corbera; David L. Neuhoff
In block-based motion-compensated video coders, all motion-vectors are encoded with the same fixed accuracy, typically 1/2-pixel accuracy, but the best motion-vector accuracies are not known. We present a theoretical framework to find the motion-vector accuracies that minimize the total encoding rate with this type of coder, for the classical case where all motion-vectors are encoded with the same accuracy, and for new cases where the accuracy is adapted on a frame-by-frame or block-by-block basis. To do this, we analytically model the effect of motion-vector accuracy and show that the energy in a block of the difference frame is approximately quadratic in the accuracy of the block's motion-vector. This energy-accuracy model is then used to obtain expressions for the total bit rate (motion rate plus difference frame rate) in terms of the blocks' motion accuracies and other key parameters. Minimizing these expressions leads to simple formulas that indicate how to choose the best motion-vector accuracies for this type of coder. These formulas also show that the motion accuracy must increase where more texture is present and decrease when there is much scene noise or when the level of compression is high. We implement several entropy and MPEG-like video coders based on our analysis and present experimental results on synthetic and real video sequences. These results suggest that our formulas are accurate and that significant bit rate savings can be achieved when our optimization procedures are used.
visual communications and image processing | 1997
Jordi Ribas-Corbera; David L. Neuhoff
In block-based video coding, the current frame to be encoded is decomposed into blocks of the same size, and a motion vector is used to improve the prediction for each block. The motion vectors and the difference frame, which contains the blocks' prediction errors, must be encoded with bits. Typically, choosing a smaller block size will improve the prediction and hence decrease the number of difference frame bits, but it will increase the number of motion bits since more motion vectors need to be encoded. Not surprisingly, there must be some value for the block size that optimizes the tradeoff between motion and difference frame bits, and thus minimizes their sum. Despite the widespread experience with block-based video coders, there is little analysis or theory that quantitatively explains the effect of block size on encoding bit rate, and ordinarily the block size for a coder is chosen based on empirical experiments on video sequences of interest. In this work, we derive a procedure to determine the optimal block size that minimizes the encoding rate for a typical block-based video coder. To do this, we analytically model the effect of block size and derive expressions for the encoding rates for both motion vectors and difference frames, as functions of block size. Minimizing these expressions leads to a simple formula that indicates how to choose the block size in these types of coders. This formula also shows that the best block size is a function of the accuracy with which the motion vectors are encoded and several parameters related to key characteristics of the video scene, such as image texture, motion activity, interframe noise, and coding distortion. We implement the video coder and use our analysis to optimize and explain its performance on real video frames.
Journal of Electronic Imaging | 1998
Jordi Ribas-Corbera; David L. Neuhoff
Despite the widespread experience with block-based video coders, there is little analysis or theory that quantitatively explains the effect of block size on encoding bit rate, and ordinarily the block size for a coder is chosen based on empirical experiments on video sequences of interest. In this work, we derive a procedure to determine the optimal block size that minimizes the encoding rate for a typical block-based video coder. To do this, we analytically model the effect of block size and derive expressions for the encoding rates for both motion vectors and difference frames as functions of block size. Minimizing these expressions leads to a simple formula that indicates how to choose the block size in these types of coders. This formula also shows that the best block size is a function of the accuracy with which the motion vectors are encoded and several parameters related to key characteristics of the video scene, such as image texture, motion activity, interframe noise, and coding distortion. We implement the video coder and use our analysis to optimize and explain its performance on real video frames.
international conference on image processing | 2000
Jiandong Shen; Jordi Ribas-Corbera
In video coding standards such as H.263 and MPEG-4, the motion vectors are restricted to have the same accuracy for the whole video sequence. In this paper, we explore the benefits of adapting motion vector accuracy at each macroblock of the video frames. We use a Lagrange criterion to select the best motion vectors and their accuracies, and encode them using an adaptive VLC table. We implemented full-search and fast-search versions of our technique in the test model TML-1 of the emerging video coding standard H.26L. In comparison to the original TML-1, which uses fixed 1/3-pixel motion accuracy for all the macroblocks, the proposed adaptive scheme can provide significant coding gains at the cost of slightly increasing computational complexity.
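A Lagrange criterion of the kind mentioned in this abstract is commonly written as J = D + lambda * R, where D is the prediction distortion and R the bits needed to code the choice. The sketch below applies that generic criterion to one macroblock. The `Candidate` type, the distortion and bit values, and the multiplier values are all hypothetical; the paper's actual distortion measure and adaptive VLC table are not given in the abstract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    mv: tuple          # motion vector in pixels (hypothetical values)
    accuracy: str      # accuracy label, e.g. "1", "1/2", "1/3"
    distortion: float  # prediction error for this candidate
    bits: int          # bits to code the vector and its accuracy

def select_candidate(candidates, lam):
    """Pick the candidate minimizing the cost J = distortion + lam * bits."""
    return min(candidates, key=lambda c: c.distortion + lam * c.bits)

# Hypothetical candidates: finer accuracy predicts better but costs more bits.
candidates = [
    Candidate(mv=(1.0, 0.0), accuracy="1", distortion=120.0, bits=4),
    Candidate(mv=(1.5, 0.0), accuracy="1/2", distortion=100.0, bits=7),
    Candidate(mv=(4 / 3, 0.0), accuracy="1/3", distortion=95.0, bits=10),
]

for lam in (0.5, 2.0, 10.0):
    print(lam, select_candidate(candidates, lam).accuracy)
```

With these made-up numbers, a small multiplier favors the finest accuracy and a large multiplier, which corresponds to heavier compression, favors the coarsest, so the chosen accuracy adapts per macroblock.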
Journal of Electronic Imaging | 2001
Scott J. Daly; Kristine E. Matthews; Jordi Ribas-Corbera
Sensitivity and resolution reduction as a function of eccentricity account for one of the largest sources of compression in vision. However, utilization of this visual property has been limited to systems that directly measure the viewer’s gaze position. We have applied visual eccentricity models to videophone compression applications without using eye tracking by combining the visual model with a face tracking algorithm. In lieu of a gaze detector, we assume the gaze will be directed to the faces appearing in images. The incorporation of resolution as well as sensitivity-based eccentricity models in a low bit rate video-compression standard (H.263) will be discussed. For videophone applications, we achieve up to a 50% reduction in bit rate while retaining similar image quality. Problems arising from the increased temporal sensitivity of the periphery, despite its reduced spatial bandwidth, are also discussed.