Yangzhou Du
Intel
Publication
Featured research published by Yangzhou Du.
computer vision and pattern recognition | 2007
Yanjun Zhao; Tao Wang; Peng Wang; Wei Hu; Yangzhou Du; Yimin Zhang; Guangyou Xu
For video summarization and retrieval, one of the important modules is to group temporally and spatially coherent shots into high-level semantic video clips, namely scene segmentation. In this paper, we propose a novel scene segmentation and categorization approach using normalized graph cuts (NCuts). Starting from a set of shots, we first calculate shot similarity from shot key frames. Then, modeling scene segmentation as a graph partition problem in which each node is a shot and each edge weight represents the similarity between two shots, we employ NCuts to find the optimal scene segmentation and automatically determine the optimal scene number via a Q function. To discover more useful information from scenes, we analyze the temporal layout patterns of shots and automatically categorize scenes into two types, i.e., parallel event scenes and serial event scenes. Extensive experiments are conducted on movies and TV series. The promising results demonstrate that the proposed NCuts-based scene segmentation and categorization methods are effective in practice.
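The graph-partition step above can be sketched via the standard spectral relaxation of NCuts. This is a minimal illustration, not the authors' implementation: it takes the scene count as an input (omitting the Q-function model selection) and clusters the spectral embedding with a small deterministic k-means.

```python
import numpy as np

def ncuts_scene_segmentation(similarity, num_scenes):
    """Spectral relaxation of NCuts: partition shots (graph nodes) into
    scenes, given a symmetric (n_shots, n_shots) similarity matrix."""
    d = similarity.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(d)) - d_inv_sqrt @ similarity @ d_inv_sqrt
    # Eigenvectors of the smallest eigenvalues embed the partition.
    _, vecs = np.linalg.eigh(lap)
    embedding = vecs[:, :num_scenes]
    # k-means on the embedding, with deterministic farthest-point seeding.
    centers = [embedding[0]]
    for _ in range(1, num_scenes):
        dists = np.min([((embedding - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(embedding[int(np.argmax(dists))])
    centers = np.array(centers)
    for _ in range(20):
        labels = np.argmin(
            ((embedding[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for k in range(num_scenes):
            if np.any(labels == k):
                centers[k] = embedding[labels == k].mean(axis=0)
    return labels
```

With a similarity matrix that has two clearly coherent shot groups, the two groups come back as the two scenes.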
conference on image and video retrieval | 2007
Yong Gao; Tao Wang; Jianguo Li; Yangzhou Du; Wei Hu; Yimin Zhang; Haizhou Ai
Cast indexing is an important video mining technique which gives audiences the capability to efficiently retrieve scenes, events, and stories of interest from a long video. This paper proposes a novel cast indexing approach based on Normalized Graph Cuts (NCuts) and PageRank. The system first adopts a face tracker to group the face images in each shot into face sets, and then extracts local SIFT features as the feature representation. There are two key problems for cast indexing. One is to find an optimal partition that clusters face sets into the main cast. The other is how to exploit the latent relationships among characters to provide more accurate cast ranking. For the first problem, we model each face set as a graph node and adopt Normalized Graph Cuts (NCuts) to realize an optimal graph partition. A novel local neighborhood distance, robust to outliers, is proposed to measure the distance between face sets for NCuts. For the second problem, we build a relation graph of characters from their co-occurrence information and then adopt the PageRank algorithm to estimate the Important Factor (IF) of each character. The PageRank IF is fused with the content-based retrieval score for final ranking. Extensive experiments are carried out on movies, TV series, and home videos. Promising results demonstrate the effectiveness of the proposed methods.
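The PageRank step described above can be sketched as a plain power iteration over the character co-occurrence graph. This is a generic PageRank sketch, not the paper's code; the damping factor and iteration count are illustrative defaults.

```python
import numpy as np

def pagerank_importance(cooccurrence, damping=0.85, iters=100):
    """Estimate a PageRank-style Important Factor (IF) per character.

    cooccurrence: (n, n) matrix counting how often two characters
    appear together; returns a rank vector summing to 1."""
    n = len(cooccurrence)
    w = np.asarray(cooccurrence, dtype=float)
    np.fill_diagonal(w, 0.0)
    out = w.sum(axis=1, keepdims=True)
    out[out == 0] = 1.0            # guard dangling (isolated) characters
    transition = w / out           # row-stochastic transition matrix
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1 - damping) / n + damping * transition.T @ rank
    return rank / rank.sum()
```

A character who co-occurs frequently with everyone else ends up with the highest IF, which is then fused with the retrieval score.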
conference on image and video retrieval | 2007
Yangzhou Du; Wenyuan Bi; Tao Wang; Yimin Zhang; Haizhou Ai
Facial expressions are often classified into one of several basic emotion categories. This categorical approach seems ill-suited to faces with blended emotions, and it makes the intensity of an emotion hard to measure. In this paper, facial expressions are evaluated with a dimensional approach to affect originally introduced in psycho-physiological studies. An expressional face can be represented as a point in a two-dimensional (2-D) emotional space characterized by arousal and valence factors. To link low-level face features with emotional factors, we propose a simple method that builds an emotional mapping by coarsely labeling the Cohn-Kanade database and linearly fitting the labeled data. Our preliminary experimental results show that the proposed emotional mapping can be used to visualize the distribution of affective content in a large face set and, further, to retrieve expressional face images or relevant video shots by specifying a region in the 2-D emotional space.
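The linear fitting step amounts to a least-squares map from face features to the (valence, arousal) plane. The sketch below assumes generic feature vectors and coarse 2-D labels as inputs; the actual face features used in the paper are not reproduced here.

```python
import numpy as np

def fit_emotional_mapping(features, va_labels):
    """Least-squares linear map from face features to (valence, arousal).

    features: (n_faces, n_dims); va_labels: (n_faces, 2) coarse labels.
    Returns a (n_dims + 1, 2) coefficient matrix including a bias row."""
    X = np.hstack([features, np.ones((len(features), 1))])  # append bias
    coeffs, *_ = np.linalg.lstsq(X, va_labels, rcond=None)
    return coeffs

def map_to_emotion_space(features, coeffs):
    """Project faces into the 2-D emotional space with a fitted mapping."""
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ coeffs
```

Once fitted, every face becomes a point in the 2-D space, so retrieval by emotional region reduces to a rectangle query over the projected points.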
international conference on multimedia and expo | 2012
Eric Q. Li; Bin Wang; Liu Yang; Ya-Ti Peng; Yangzhou Du; Yimin Zhang; Yi-Jen Chiu
With the inclusion of GPU cores within the same CPU die, the performance of Intel's processor graphics has improved significantly over earlier generations of integrated graphics, and the need to efficiently harness the computational power of the GPU on the same die is greater than ever. This paper presents a highly optimized Haar-based face detector which efficiently exploits both CPU and GPU computing power on the latest Sandy Bridge processor. The classification procedure of the Haar-based cascade detector is partitioned into two phases in order to leverage both thread-level and data-level parallelism in the GPU. Image downscaling and integral image calculation run on the CPU cores in parallel with the GPU. Compared to a CPU-only implementation, our experiments show that the proposed GPU-accelerated implementation achieves a 3.07x speedup with more than 50% power reduction on the latest Sandy Bridge processor. Our implementation is also more efficient than a CUDA implementation on the NVidia GT430 card in terms of both performance and power. In addition, the proposed method presents a general approach for task partitioning between CPU and GPU, and is thus beneficial not only for face detection but also for other multimedia and computer vision techniques.
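The two-phase cascade partition can be sketched abstractly: run the first few cheap stages over every candidate window (the data-parallel phase a GPU handles well), then run the remaining expensive stages only on the survivors. The stage predicates below are toy stand-ins for Haar stage classifiers, and the actual parallel dispatch is elided.

```python
def cascade_detect(windows, stages, split=3):
    """Two-phase evaluation of a cascade classifier.

    Phase 1 pushes all windows through the first `split` cheap stages
    (data-level parallelism); phase 2 pushes only the few survivors
    through the remaining stages (thread-level parallelism).
    `stages` is a list of predicates: window -> bool."""
    # Phase 1: every window, cheap early stages.
    survivors = [w for w in windows
                 if all(stage(w) for stage in stages[:split])]
    # Phase 2: survivors only, expensive late stages.
    return [w for w in survivors
            if all(stage(w) for stage in stages[split:])]
```

The point of the split is that phase 1 rejects the vast majority of windows, so phase 2's irregular, divergent work touches only a small remainder.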
international symposium on microarchitecture | 2008
Eric Q. Li; Wenlong Li; Xiaofeng Tong; Jianguo Li; Yurong Chen; Tao Wang; Patricia P. Wang; Wei Hu; Yangzhou Du; Yimin Zhang; Yen-Kuang Chen
Emerging video-mining applications such as image and video retrieval and indexing will require real-time processing capabilities. A many-core architecture with 64 small, in-order, general-purpose cores as the accelerator can help meet the necessary performance goals and requirements. The key video-mining modules can achieve parallel speedups of 19x to 62x from 64 cores and get an extra 2.3x speedup from 128-bit SIMD vectorization on the proposed architecture.
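The spread from 19x to 62x on the same 64-core design is what Amdahl's law predicts for modules with different serial fractions. As a back-of-envelope illustration (the percentage below is derived here, not taken from the paper), a module that reaches only 19x on 64 cores is roughly 96% parallelizable:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup on `cores` cores when `parallel_fraction`
    of the single-core runtime is parallelizable."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# amdahl_speedup(0.962, 64) is about 19x; a near-fully-parallel module
# approaches the 62x end of the reported range.
```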
international conference on image processing | 2011
Ang Liu; Yangzhou Du; Tao Wang; Jianguo Li; Eric Q. Li; Yimin Zhang; Yong Zhao
Facial landmark detection is an essential module in many face-related applications, and it is often the most time-consuming part of the face processing pipeline. This paper proposes a fast and effective method for facial landmark detection using Haar cascade classifiers and a simple 3D head model, which not only detects the positions of landmark points but also estimates head pose, such as yaw and pitch angles. To reduce the amount of computation, only 7 landmark points are detected (4 eye corners, 2 mouth corners, 1 nose tip), which generally meets the requirements of face alignment and face recognition. Experiments on multiple datasets show that our algorithm provides sufficient accuracy of facial landmark localization while running in super real-time on an Intel Atom 1.3 GHz embedded processor.
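To see how a handful of landmarks can yield a pose angle: the paper fits a 3D head model, but as a much cruder stand-in (the weak-perspective geometry below is a hypothetical approximation, not the authors' method), yaw can be read off the normalized horizontal offset of the nose tip from the eye-corner midline.

```python
import numpy as np

def estimate_yaw(landmarks):
    """Rough yaw (degrees) from 2D landmarks; a simplified stand-in
    for a 3D-model fit. `landmarks` maps 'left_eye_outer',
    'right_eye_outer', and 'nose_tip' to (x, y) points."""
    left = np.asarray(landmarks['left_eye_outer'], float)
    right = np.asarray(landmarks['right_eye_outer'], float)
    nose = np.asarray(landmarks['nose_tip'], float)
    mid = (left + right) / 2.0
    eye_span = np.linalg.norm(right - left)
    # Under weak perspective, the horizontal nose offset (normalized by
    # half the inter-ocular distance) grows roughly as sin(yaw).
    offset = (nose[0] - mid[0]) / (eye_span / 2.0)
    return np.degrees(np.arcsin(np.clip(offset, -1.0, 1.0)))
```

A frontal face gives zero yaw; as the head turns, the nose tip drifts toward one eye corner and the estimate grows accordingly.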
international conference on image processing | 2013
Lin Xu; Yangzhou Du; Yimin Zhang
Cosmetic makeup is a common part of daily life that enhances women's beauty and attractiveness, but it is difficult for ordinary users to achieve makeup as polished as a cover girl's. Moreover, when you are bare-faced and want to share a better look with your friends, the fastest and easiest way is virtual makeup. However, existing makeup software requires many user inputs to adjust face landmarks, which hurts the user experience, and it cannot remove skin flaws as well as real cosmetic makeup can. In this paper, we describe an automatic framework that applies cosmetic makeup and skin beautification to a face, with the style selected from many example made-up face images. Our method detects the face landmarks with an existing algorithm and refines them using segmentation based on a skin-color Gaussian Mixture Model. The skin region is then separated into three layers, and makeup is transferred to each layer with a different method. The results look quite natural for a range of input face images.
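The per-layer transfer idea can be illustrated on a grayscale skin patch with a two-layer split (base color vs. fine detail); the paper uses three layers, and a real system would use edge-preserving filtering where this sketch uses a plain box blur. All names and parameters below are illustrative.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box blur; a crude stand-in for the edge-preserving
    smoothing used to split skin into layers."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    pad = np.pad(img, radius, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, 'valid'), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, 'valid'), 0, rows)

def transfer_makeup(subject, example, blend=0.8, radius=4):
    """Layered makeup transfer on aligned grayscale patches in [0, 1]:
    keep the subject's detail layer (texture, identity) and blend in
    the example's base layer (the makeup appearance)."""
    subj_base = box_blur(subject, radius)
    subj_detail = subject - subj_base      # fine texture to preserve
    ex_base = box_blur(example, radius)    # smooth makeup appearance
    new_base = (1 - blend) * subj_base + blend * ex_base
    return np.clip(new_base + subj_detail, 0.0, 1.0)
```

Because only the base layer is replaced, blemish-free makeup color is transferred while the subject's own skin texture survives, which is the intuition behind treating the layers differently.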
Proceedings of the 2nd ACM TRECVid Video Summarization Workshop | 2008
Tao Wang; Shangping Feng; Patricia P. Wang; Wei Hu; Shuang Zhang; Wei Zhang; Yangzhou Du; Jianguo Li; Jianmin Li; Yimin Zhang
Video summarization is an active research field that helps users grasp a whole video's content for efficient browsing and editing. In this paper, we describe our THU-Intel rushes summarization system for TRECVID 2008. In our approach, we first extract low-level audiovisual features and parse the video into shots, sub-shots, and 1-second video clips. Then we remove junk video clips containing color bars, near-uniform-color frames, clapboard frames, etc. To select video clips with the main objects and events, we evaluate each clip's representativeness score using multimodal features of color, edge, motion, audio, etc. Finally, we construct the rushes video summary by iteratively selecting the most representative video clips and removing similar ones. Extensive experiments are carried out on 40 test rushes videos. Good results demonstrate the effectiveness of the proposed method.
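The final select-and-remove loop can be sketched as a greedy procedure: repeatedly take the highest-scoring clip, then drop anything too similar to it. The scoring and similarity functions here are placeholders for the multimodal features described above.

```python
def summarize(clips, scores, sim, budget, redundancy=0.7):
    """Greedy rushes summarization: pick the clip with the highest
    representativeness score, then discard remaining clips whose
    similarity to it exceeds `redundancy`; repeat up to `budget`.

    clips: clip ids; scores: id -> score; sim(a, b) in [0, 1]."""
    remaining = sorted(clips, key=lambda c: scores[c], reverse=True)
    summary = []
    while remaining and len(summary) < budget:
        best = remaining.pop(0)
        summary.append(best)
        remaining = [c for c in remaining if sim(best, c) < redundancy]
    return summary
```

The redundancy threshold trades coverage against length: a lower value prunes more aggressively and yields a shorter, more diverse summary.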
acm multimedia | 2011
Patricia P. Wang; Xiaofeng Tong; Yangzhou Du; Jianguo Li; Wei Hu; Yimin Zhang
An avatar is the virtual representation of a user's facial, body, and motion characteristics in computer games, social networks, and augmented reality. Facial modeling requires enormous effort to achieve an immersive experience in applications such as avatar chatting or online makeover. Great challenges exist in robustly detecting 2D facial prominent points and mapping them to 3D models in a parameterized manner. Another challenge is how to characterize the semantic components of eyes, mouth, nose, and cheeks rather than low-level mesh geometry. In this paper, we propose an augmented makeover framework to deal with these challenges. Aiming to provide amateurs with flexible customization, a morphable model is constructed from a set of scanned 3D face data. Appearance personalization is carried out in an offline phase, where the single-image and multi-view cases are discussed respectively, to generate a deformable shape in a progressive manner. Augmentation is implemented in an online phase, where fast and robust 3D tracking balances the trade-off between accuracy and real-time requirements. By this means, immersive human-computer interaction such as virtual makeover and photo-realistic avatar chatting can be achieved.
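A morphable model built from scanned faces is conventionally a linear (PCA-style) model: a new face is the mean shape plus a weighted sum of basis deformations, and the weights are the per-user parameters. The synthesis step, in that standard formulation (not the paper's specific model), is one line:

```python
import numpy as np

def synthesize_face(mean_shape, basis, coeffs):
    """Linear morphable model: mean shape plus a weighted sum of
    deformation modes learned from scanned 3D faces.

    mean_shape: (n_vertices, 3); basis: (n_modes, n_vertices, 3);
    coeffs: (n_modes,) personalization parameters."""
    return mean_shape + np.tensordot(coeffs, basis, axes=1)
```

Personalization then reduces to finding the `coeffs` whose projected shape best matches the detected 2D prominent points, which is what makes the parameterized mapping tractable.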
parallel computing | 2012
Jiangbin Feng; Yurong Chen; Eric Q. Li; Yangzhou Du; Yimin Zhang
While 3D TVs become widely available in the market, consumers face a serious shortage of 3D video content. Given the difficulty of 3D video capture and production, automatic conversion from 2D serves as an important way of producing 3D content. However, 2D-to-3D video conversion is a compute-intensive task, and real-time processing speed is required for online playback. With multi-core processors now mainstream, 2D-to-3D video conversion can be accelerated by fully utilizing the computing power of the available cores. In this paper, we take a typical algorithm for automatic 2D-to-3D video conversion as a reference and present optimization techniques that improve its performance. The result shows our optimized implementation converts an average of 36 frames per second on an Intel Core i7 2.3 GHz processor, which meets the real-time processing requirement. We also conduct a scalability analysis on the multi-core system to identify the causes of bottlenecks, and make suggestions for optimizing this workload on large-scale multi-core systems.
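The coarse-grained parallelization opportunity is that depth estimation and stereo rendering are independent per frame, so frames can be distributed across cores. The skeleton below uses hypothetical stage functions passed in by the caller; a real pipeline would also exploit intra-frame (data-level) parallelism within each stage.

```python
from concurrent.futures import ThreadPoolExecutor

def convert_video(frames, estimate_depth, render_stereo, workers=4):
    """Frame-parallel skeleton of 2D-to-3D conversion: each frame gets
    a depth map, then a stereo pair, independently of other frames.
    Output order matches input order."""
    def convert(frame):
        return render_stereo(frame, estimate_depth(frame))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert, frames))
```

Scalability analysis then asks why throughput stops growing with `workers`, e.g. memory bandwidth saturation or serial stages, which is the bottleneck question the paper studies.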