Keman Yu
Microsoft
Publication
Featured research published by Keman Yu.
international symposium on circuits and systems | 2005
Jiefu Zhai; Keman Yu; Jiang Li; Shipeng Li
In low bit-rate video communication, temporal subsampling is usually used due to limited available bandwidth. Motion compensated frame interpolation (MCFI) techniques are often employed in the decoder to restore the original frame rate and enhance the temporal quality. In this paper, we propose a low-complexity, high-efficiency MCFI method. It first examines the motion vectors embedded in the bit-stream, then carries out overlapped block bi-directional motion estimation on those blocks whose embedded motion vectors are regarded as not accurate enough. Finally, it utilizes motion vector post-processing and overlapped block motion compensation to generate interpolated frames and further reduce blocking artifacts. Experimental results show that the proposed algorithm outperforms other methods in both PSNR and visual quality, while its complexity is lower than that of other methods.
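The interpolation step described above can be illustrated with a minimal sketch: each block of the missing frame is averaged from motion-compensated positions in the two decoded neighbouring frames. This is not the authors' implementation; the function names, the fixed 16x16 block size, and the simple averaging (no overlapped-block weighting) are illustrative assumptions.

```python
import numpy as np

def interpolate_block(prev_f, next_f, y, x, mv, bs=16):
    """Bi-directionally interpolate one block of the missing frame.

    mv = (dy, dx) is the motion between prev_f and next_f for this block;
    half of it is applied backward and half forward, and the two
    motion-compensated blocks are averaged."""
    h, w = prev_f.shape
    dy, dx = mv
    # Clamp the motion-compensated block origins to the frame.
    py = min(max(y + dy // 2, 0), h - bs)
    px = min(max(x + dx // 2, 0), w - bs)
    ny = min(max(y - dy // 2, 0), h - bs)
    nx = min(max(x - dx // 2, 0), w - bs)
    b_prev = prev_f[py:py + bs, px:px + bs].astype(np.float32)
    b_next = next_f[ny:ny + bs, nx:nx + bs].astype(np.float32)
    return ((b_prev + b_next) / 2.0).astype(np.uint8)

def interpolate_frame(prev_f, next_f, mv_field, bs=16):
    """Build the interpolated frame block by block from a per-block
    motion-vector field (frame dimensions assumed multiples of bs)."""
    out = np.zeros_like(prev_f)
    for y in range(0, prev_f.shape[0], bs):
        for x in range(0, prev_f.shape[1], bs):
            out[y:y + bs, x:x + bs] = interpolate_block(
                prev_f, next_f, y, x, mv_field[y // bs][x // bs], bs)
    return out
```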
IEEE Transactions on Circuits and Systems for Video Technology | 2005
Libo Yang; Keman Yu; Jiang Li; Shipeng Li
The H.264 video coding standard provides considerably higher coding efficiency than previous standards, but its complexity is significantly higher as well. In an H.264 encoder, the most time-consuming component is variable block-size motion estimation. To reduce the complexity of motion estimation, an early termination algorithm is proposed in this paper. It predicts the best motion vector by examining only one search point. With the proposed method, some of the motion searches can be stopped early, so a large number of search points can be skipped. The proposed method can work with any fast motion estimation algorithm. Experiments are carried out with a fast motion estimation algorithm that has been adopted by H.264. Results show that significant complexity reduction is achieved while the degradation in video quality is negligible.
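A minimal sketch of the early-termination idea, under the assumption that the cost measure is the sum of absolute differences and that the threshold is a tunable parameter: the predicted motion vector is evaluated first, and the remaining search is skipped whenever its cost is already low enough. Here `fast_search` stands in for whichever fast ME algorithm the encoder uses; nothing in the sketch is taken from the paper itself.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def search_with_early_termination(cur_blk, ref, y, x, pred_mv, threshold, fast_search):
    """Evaluate only the predicted MV first; if its SAD is below the
    threshold, stop and skip every remaining search point.
    Bounds checking of the reference window is omitted for brevity."""
    bs = cur_blk.shape[0]
    dy, dx = pred_mv
    ref_blk = ref[y + dy:y + dy + bs, x + dx:x + dx + bs]
    cost = sad(cur_blk, ref_blk)
    if cost < threshold:
        return pred_mv, cost                         # early termination
    return fast_search(cur_blk, ref, y, x, pred_mv)  # fall back to full fast ME
```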
international conference on acoustics, speech, and signal processing | 2005
Libo Yang; Keman Yu; Jiang Li; Shipeng Li
In an H.264 video encoder, motion estimation (ME) is the most time-consuming component. The ME process consists of two stages, integer pixel search and fractional pixel search. Since the complexity of integer pixel search has been greatly reduced by numerous fast ME algorithms, the computation overhead required by fractional pixel ME has become relatively significant. To reduce the complexity of fractional pixel ME, we propose a prediction-based directional fractional pixel ME algorithm. We utilize more accurate motion vector predictions and directional search to achieve better computation reduction. We further propose an early termination method to decrease the amount of search. Experimental results show that, compared to the full search sub-pel ME and the fast sub-pel ME proposed in H.264, the proposed method can reduce up to 84% and 74% of fractional pixel search points respectively, with a negligible degradation in quality.
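A rough sketch of a prediction-guided directional refinement, assuming quarter-pel MV units and a generic cost function; the candidate pattern and early-stop rule are illustrative, not the paper's exact algorithm.

```python
def fractional_pel_search(cost, center, pred_frac, early_stop_thresh=None):
    """Directional fractional-pel refinement sketch.

    cost(mv) returns the matching cost of a candidate MV in quarter-pel units.
    Instead of testing all half-pel and quarter-pel neighbours, only the centre
    and the candidates lying in the direction suggested by the fractional part
    of the predicted MV (pred_frac) are checked."""
    best_mv, best_cost = center, cost(center)
    if early_stop_thresh is not None and best_cost < early_stop_thresh:
        return best_mv, best_cost              # early termination
    dy = 1 if pred_frac[0] >= 0 else -1        # search direction from the prediction
    dx = 1 if pred_frac[1] >= 0 else -1
    for step in (2, 1):                        # half-pel pass, then quarter-pel pass
        for cand in ((best_mv[0] + dy * step, best_mv[1]),
                     (best_mv[0], best_mv[1] + dx * step),
                     (best_mv[0] + dy * step, best_mv[1] + dx * step)):
            c = cost(cand)
            if c < best_cost:
                best_mv, best_cost = cand, c
        # the quarter-pel pass refines around the best half-pel candidate
    return best_mv, best_cost
```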
international conference on acoustics, speech, and signal processing | 2004
Cuizhu Shi; Keman Yu; Jiang Li; Shipeng Li
In videoconferencing, the image quality is significantly affected by the illumination condition. Unsatisfactory illumination conditions may lead to underexposure or overexposure of the area of interest, in particular a human face. To resolve this issue, we propose a solution to improve image quality automatically by correcting exposure and enhancing contrast. Our work is characterized by a method for automatically building a skin-color model and a novel contrast enhancement approach. Some techniques that can reduce the computational cost are also introduced. Experimental results show that obvious improvement in image quality is achieved while the computation overhead is very small. The proposed solution can be integrated into videoconferencing systems and is especially suitable for scenarios where low-complexity computing is required.
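The exposure-correction part can be sketched as a simple gamma adjustment driven by the mean luminance of the detected skin region. The gamma formula, the mid-gray target of 0.5, and the function names are illustrative assumptions, not the paper's method; building the skin-color model itself is omitted.

```python
import numpy as np

def correct_exposure(luma, skin_mask, target=0.5):
    """Gamma-correct a frame so that the mean luminance of the skin (face)
    region moves toward a mid-gray target.

    luma: float array in [0, 1]; skin_mask: boolean array of the same shape
    marking pixels classified as skin by the color model."""
    face_mean = float(np.clip(luma[skin_mask].mean(), 1e-3, 1.0 - 1e-3))
    # Solve face_mean ** gamma == target  =>  gamma = log(target) / log(face_mean)
    gamma = np.log(target) / np.log(face_mean)
    return np.clip(luma ** gamma, 0.0, 1.0)
```

An underexposed face (mean below 0.5) yields a gamma below 1 and brightens the frame; an overexposed face yields a gamma above 1 and darkens it.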
acm multimedia | 2001
Jiang Li; Gang Chen; Jizheng Xu; Yong Wang; Hanning Zhou; Keman Yu; King To Ng; Heung-Yeung Shum
The rapid development of wired and wireless networks tremendously facilitates communications between people. However, most of the current wireless networks still work in low bandwidths, and mobile devices still suffer from weak computational power, short battery lifetime and limited display capability. We developed a very low bit-rate bi-level video coding technique, which can be used in video communications almost anywhere, anytime on any device. The spirit of this method is that rather than giving highest priority to the basic colors of an image as in conventional DCT-based compression methods, we give preference to the outline features of scenes when we have limited bandwidths. These features can be represented by bi-level image sequences that are converted from gray-scale image sequences. By analyzing the temporal correlation between successive frames and flexibilities in the scene presentation using bi-level images, we achieve very high ratios with our bi-level video compression scheme. Experiments show that in low bandwidths, our method provides clearer shape, smoother motion, shorter initial latency and much cheaper computational cost than do DCT-based methods. Our method is especially suitable for small mobile devices such as handheld PCs, palm-size PCs and mobile phones that possess small display screens and light computational power, and work in low bandwidth wireless networks. We have built PC and Pocket PC versions of bi-level video phone systems, which typically provide QCIF-size video with a frame rate of 5-15 fps for a 9.6 Kbps bandwidth.
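The two core ideas, thresholding gray-scale frames to bi-level and exploiting temporal correlation between successive bi-level frames, can be illustrated with a toy sketch. The real codec uses adaptive thresholding and context-based arithmetic coding; the fixed threshold and the XOR-plus-run-length scheme below are illustrative stand-ins.

```python
import numpy as np

def to_bilevel(gray, threshold=128):
    """Convert an 8-bit gray-scale frame to a bi-level (black/white) frame."""
    return (gray >= threshold).astype(np.uint8)

def encode_bilevel_frame(cur, prev=None):
    """Keep only the pixels that changed since the previous bi-level frame
    (an XOR map) and run-length encode the resulting sparse bitmap."""
    diff = cur if prev is None else np.bitwise_xor(cur, prev)
    flat = diff.ravel()
    runs, value, count = [], int(flat[0]), 0
    for bit in flat:
        if int(bit) == value:
            count += 1
        else:
            runs.append((value, count))
            value, count = int(bit), 1
    runs.append((value, count))
    return runs
```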
IEEE Transactions on Circuits and Systems for Video Technology | 2003
Jiang Li; Keman Yu; Tielin He; Yunfeng Lin; Shipeng Li; Ya-Qin Zhang
Wireless networks have been developing rapidly in recent years. General Packet Radio Service (GPRS) and Code Division Multiple Access (CDMA 1X) for wide areas, and 802.11 and Bluetooth for local areas, have already emerged. Broadband wireless networks urgently call for rich content for consumers. Among various possible applications, video communication is one of the most promising for mobile devices on wireless networks. This paper describes the generation, coding, and transmission of an effective video form, scalable portrait video, for mobile video communication. As an extension of bi-level video, portrait video is composed of more gray levels and therefore possesses higher visual quality while maintaining a low bit rate and low computational cost. Portrait video is scalable in that each video at a higher level always contains all the information of the video at a lower level. The bandwidths of 2-4-level portrait videos fit into the 20-40 kbps range that GPRS and CDMA 1X can stably provide; therefore, portrait video is very promising for video broadcast and communication on 2.5G wireless networks. With portrait video technology, we are the first to enable two-way video communication on Pocket PCs and handheld PCs.
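The scalability property described above can be mimicked with plain bit planes of the luminance: the 2-level (bi-level) video is the most significant bit plane, and each additional plane refines it to 4, 8, and more gray levels, so a higher-level video always contains every lower-level one. This is an illustrative construction, not the paper's codec.

```python
import numpy as np

def portrait_layers(gray, num_levels=3):
    """Split an 8-bit gray frame into its top num_levels bit planes
    (most significant first)."""
    return [((gray >> (7 - i)) & 1).astype(np.uint8) for i in range(num_levels)]

def reconstruct(planes):
    """Rebuild an approximate gray frame from the first k bit planes."""
    out = np.zeros_like(planes[0], dtype=np.uint8)
    for i, p in enumerate(planes):
        out |= p << (7 - i)
    return out
```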
international conference on multimedia and expo | 2003
Keman Yu; Jiangbo Lv; Jiang Li; Shipeng Li
Real-time software-based video codecs are widely used on PCs, which have relatively strong computing capability. However, mobile devices, such as Pocket PCs and handheld PCs, still suffer from weak computational power, short battery lifetime and limited display capability. We developed a practical low-complexity real-time video codec for mobile devices. Several methods that significantly reduce the computational cost are adopted in this codec and described in this paper, including a predictive algorithm for motion estimation, the integer discrete cosine transform (IntDCT), and a DCT/quantizer bypass technique. A real-time video communication implementation of the proposed codec is also introduced. Experiments show that substantial computation reduction is achieved while the loss in video quality is negligible. The proposed codec is very suitable for scenarios where low-complexity computing is required.
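The DCT/quantizer bypass idea can be sketched as follows: when a residual block's energy is so low that every quantized coefficient would be zero anyway, the transform and quantization are skipped and an all-zero block is signalled. The SAD-based test, the threshold, and the use of SciPy's DCT are illustrative assumptions rather than the codec's actual design.

```python
import numpy as np
from scipy.fft import dctn

def encode_residual_block(residual, quant_step, bypass_thresh):
    """Return quantized DCT coefficients for one residual block, or None
    when the block is cheap enough to be signalled as all-zero."""
    if np.abs(residual).sum() < bypass_thresh:   # DCT/quantizer bypass
        return None
    coeffs = dctn(residual.astype(np.float32), norm="ortho")
    return np.round(coeffs / quant_step).astype(np.int16)
```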
pacific rim conference on multimedia | 2002
Yong Li; Jiang Li; Keman Yu; Kaibo Wang; Shipeng Li; Ya-Qin Zhang
The demand for communicating anywhere, anytime on any device is growing with the development of modern information technology. Current instant messaging services enable people to be aware of each other's presence and to exchange text information. However, almost all of these services are based on a client/server architecture: if the servers crash, all connections are lost. As an alternative, peer-to-peer architecture relies primarily on clients, but existing systems usually emphasize sharing of computing resources and files. In this paper, we propose a peer-to-peer communication system that combines the advantages of instant messaging services and peer-to-peer networking. In the system, in addition to being aware of each other's presence, users can simultaneously attend multiple multimedia meetings, with each meeting allowing multiple attendees. Users have full control over the exchange of real-time text/voice/video information in these meetings.
acm multimedia | 2001
Jiang Li; Keman Yu; Gang Chen; Yong Wang; Hanning Zhou; Jizheng Xu; King To Ng; Kaibo Wang; Lijie Wang; Heung-Yeung Shum
As the Internet and wireless networks develop rapidly, the demand for communicating anywhere, anytime on any device emerges. However, most of the current wireless networks still work in low bandwidths, and mobile devices still suffer from weak computational power, short battery lifetime and limited display capability. We developed portrait video phone systems that can run on PCs and Pocket PCs at very low bit rates through the Internet. The core technology that portrait video phones employ is the so-called portrait video (or bi-level video) codec. The portrait video codec first converts a full-color video into a black/white image sequence and then compresses it into a black/white portrait-like video. Portrait video possesses clearer shape, smoother motion, shorter initial latency, and cheaper computational cost than MPEG2, MPEG4 and H.263 at low bandwidths. Typically the portrait video phone provides QCIF-size video with a frame rate of 5-15 fps for a 9.6 Kbps video bandwidth. The portrait video is so small that it can even be transmitted through an HTTP proxy as text. Experiments show that the portrait video phones work well on ordinary GSM wireless telecommunication networks.
IEEE Transactions on Circuits and Systems for Video Technology | 2005
Keman Yu; Jiang Li; Cuizhu Shi; Shipeng Li
The rapid development of wireless networks and mobile devices has made mobile video communication a particularly promising service. We previously proposed an effective video form, scalable portrait video. In low-bandwidth conditions, portrait video possesses clearer shape, smoother motion, and much cheaper computational cost than discrete cosine transform (DCT)-based schemes. However, the bit rate of portrait video cannot be accurately modeled by a rate-distortion function as in DCT-based schemes. How to effectively control the bit rate is a hard challenge for portrait video. In this paper, we propose a novel model-based rate-control method. Although the coding parameters cannot be directly calculated from the target bit rate, we build a model between the bit-rate reduction and the percentage of less probable symbols (LPS) based on the principle of entropy coding, which is referred to as the LPS-rate model. We use this model to obtain the desired coding parameters. Experimental results show that the proposed method not only effectively controls the bit rate, but also significantly reduces the number of skipped frames. The principle of this method can also be applied to general bit plane coding in other image processing and video compression technologies.
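The modelling idea, relating the rate spent on binary symbols to the percentage of less probable symbols, can be sketched with the binary entropy function: an ideal entropy coder spends about N · H(p) bits on N symbols with LPS probability p, so a target rate can be inverted to the LPS percentage the encoder should aim for. This is only the general principle, not the paper's calibrated LPS-rate model; the bisection routine and its bounds are illustrative.

```python
import math

def binary_entropy(p):
    """Bits per symbol of a binary source with LPS probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def lps_for_target_bits(num_symbols, target_bits, tol=1e-4):
    """Find, by bisection, the LPS percentage at which an ideal entropy coder
    would spend roughly target_bits on num_symbols binary symbols; the encoder
    then tunes its coding parameters until the measured LPS share matches."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if num_symbols * binary_entropy(mid) < target_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```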