Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Song-Hai Zhang is active.

Publication


Featured research published by Song-Hai Zhang.


IEEE Transactions on Visualization and Computer Graphics | 2009

Vectorizing Cartoon Animations

Song-Hai Zhang; Tao Chen; Yi-Fei Zhang; Shi-Min Hu; Ralph Robert Martin

We present a system for vectorizing 2D raster format cartoon animations. The output animations are visually flicker free, smaller in file size, and easy to edit. We identify decorative lines separately from colored regions. We use an accurate and semantically meaningful image decomposition algorithm, supporting an arbitrary color model for each region. To ensure temporal coherence in the output, we reconstruct a universal background for all frames and separately extract foreground regions. Simple user assistance is required to complete the background. Each region and decorative line is vectorized and stored together with their motions from frame to frame. The contributions of this paper are: 1) the new trapped-ball segmentation method, which is fast, supports nonuniformly colored regions, and allows robust region segmentation even in the presence of imperfectly linked region edges, 2) the separate handling of decorative lines as special objects during image decomposition, avoiding results containing multiple short, thin oversegmented regions, and 3) extraction of a single patch-based background for all frames, which provides a basis for consistent, flicker-free animations.
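
The trapped-ball idea lends itself to a compact illustration. Below is a minimal sketch, assuming OpenCV and NumPy, of the core mechanism: flood filling a free-space mask eroded by a ball, so the fill cannot leak through gaps narrower than the ball's diameter. It is a simplification for illustration, not the authors' implementation.

```python
import cv2
import numpy as np

def trapped_ball_fill(line_art, radius=3):
    """line_art: uint8 mask, 0 on decorative lines, 255 elsewhere."""
    ball = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,
                                     (2 * radius + 1, 2 * radius + 1))
    # Free space the ball's centre can occupy: a ball of this radius cannot
    # pass through gaps narrower than its diameter, so flood fills do not
    # leak through small breaks in the line art.
    free = cv2.erode(line_art, ball)
    labels = np.zeros(line_art.shape, np.int32)
    next_label = 1
    # Exhaustive seed scan, kept simple for clarity.
    for y, x in zip(*np.where(free == 255)):
        if labels[y, x] or free[y, x] != 255:
            continue
        mask = np.zeros((free.shape[0] + 2, free.shape[1] + 2), np.uint8)
        cv2.floodFill(free, mask, (int(x), int(y)), 0)
        labels[mask[1:-1, 1:-1] == 1] = next_label
        next_label += 1
    # Grow each region back by the ball radius to recover its full extent.
    grown = np.zeros_like(labels)
    for lbl in range(1, next_label):
        region = cv2.dilate((labels == lbl).astype(np.uint8), ball)
        grown[(region > 0) & (grown == 0)] = lbl
    return grown
```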


Computer Vision and Pattern Recognition | 2016

Traffic-Sign Detection and Classification in the Wild

Zhe Zhu; Dun Liang; Song-Hai Zhang; Xiaolei Huang; Baoli Li; Shi-Min Hu

Although promising results have been achieved in the areas of traffic-sign detection and classification, few works have provided simultaneous solutions to these two tasks for realistic real-world images. We make two contributions to this problem. Firstly, we have created a large traffic-sign benchmark from 100,000 Tencent Street View panoramas, going beyond previous benchmarks. It provides 100,000 images containing 30,000 traffic-sign instances. These images cover large variations in illuminance and weather conditions. Each traffic-sign in the benchmark is annotated with a class label, its bounding box, and a pixel mask. We call this benchmark Tsinghua-Tencent 100K. Secondly, we demonstrate how a robust end-to-end convolutional neural network (CNN) can simultaneously detect and classify traffic-signs. Most previous CNN image processing solutions target objects that occupy a large proportion of an image, and such networks do not work well for target objects occupying only a small fraction of an image, like the traffic-signs here. Experimental results show the robustness of our network and its superiority to alternatives. The benchmark, source code, and CNN model introduced in this paper are publicly available.
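
As a rough illustration of the end-to-end design described above, the following PyTorch sketch pairs a shared convolutional backbone with per-cell detection and classification heads. The layer sizes and class count are illustrative assumptions, not the paper's architecture.

```python
import torch.nn as nn

class SignNet(nn.Module):
    def __init__(self, num_classes=45):  # assumed class count, not from the paper
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Dense per-cell predictions keep spatial resolution, which matters
        # when the target objects occupy only a small fraction of the image.
        self.detect = nn.Conv2d(128, 5, 1)              # objectness + box (x, y, w, h)
        self.classify = nn.Conv2d(128, num_classes, 1)  # per-cell class scores

    def forward(self, x):
        feat = self.backbone(x)
        return self.detect(feat), self.classify(feat)
```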


IEEE Transactions on Multimedia | 2011

Online Video Stream Abstraction and Stylization

Song-Hai Zhang; Xian-Ying Li; Shi-Min Hu; Ralph Robert Martin

This paper gives an automatic method for online video stream abstraction, producing a temporally coherent output video stream, in a style with large regions of constant color and highlighted bold edges. Our system includes two novel components. Firstly, to provide coherent and simplified output, we segment frames, and use optical flow to propagate segmentation information from frame to frame; an error control strategy is used to help ensure that the propagated information is reliable. Secondly, to achieve coherent and attractive coloring of the output, we use a color scheme replacement algorithm specifically designed for an online video stream. We demonstrate real-time performance for CIF videos, allowing our approach to be used for live communication and other related applications.
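
The propagation step can be sketched in a few lines. The snippet below, assuming OpenCV with Farneback flow and a simple intensity-difference check standing in for the paper's error-control strategy, warps the previous frame's segmentation labels into the current frame and marks unreliable pixels for resegmentation.

```python
import cv2
import numpy as np

def propagate_labels(prev_gray, cur_gray, prev_labels, err_thresh=10.0):
    """prev_gray, cur_gray: uint8 frames; prev_labels: uint8 segment labels."""
    # Flow from the current frame back to the previous one, so each current
    # pixel can look up the label it came from (backward warping).
    flow = cv2.calcOpticalFlowFarneback(cur_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = cur_gray.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    map_x, map_y = xs + flow[..., 0], ys + flow[..., 1]
    labels = cv2.remap(prev_labels, map_x, map_y, cv2.INTER_NEAREST)
    # Error control (simplified): distrust pixels where the warped previous
    # frame disagrees strongly with the current frame; 0 marks "resegment".
    warped = cv2.remap(prev_gray, map_x, map_y, cv2.INTER_LINEAR)
    bad = np.abs(warped.astype(np.float32) - cur_gray.astype(np.float32)) > err_thresh
    labels[bad] = 0
    return labels
```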


IEEE Transactions on Visualization and Computer Graphics | 2013

Timeline Editing of Objects in Video

Shao-Ping Lu; Song-Hai Zhang; Jin Wei; Shi-Min Hu; Ralph Robert Martin

We present a video editing technique based on changing the timelines of individual objects in video, which leaves them in their original places but puts them at different times. This allows the production of object-level slow motion effects, fast motion effects, or even time reversal. This is more flexible than simply applying such effects to whole frames, as new relationships between objects can be created. As we restrict object interactions to the same spatial locations as in the original video, our approach can produce high-quality results using only coarse matting of video objects. Coarse matting can be done efficiently using automatic video object segmentation, avoiding tedious manual matting. To design the output, the user interactively indicates the desired new life spans of objects, and may also change the overall running time of the video. Our method rearranges the timelines of objects in the video whilst applying appropriate object interaction constraints. We demonstrate that, while this editing technique is somewhat restrictive, it still allows many interesting results.
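
The core operation, remapping each object's timeline independently while compositing it at its original spatial location, can be modelled as a toy in a few lines. The data layout and function names below are my assumptions, not the authors' system.

```python
import numpy as np

def composite_retimed(background, objects, out_frames):
    """objects: list of (mattes, remap) pairs, where mattes[t] = (mask, rgb)
    is a coarse matte for source frame t, and remap maps an output frame
    index to a source frame index (e.g. lambda t: t // 2 for slow motion,
    lambda t: T - 1 - t for time reversal)."""
    video = []
    for t in range(out_frames):
        frame = background.copy()
        for mattes, remap in objects:
            src = int(np.clip(remap(t), 0, len(mattes) - 1))
            mask, rgb = mattes[src]
            frame[mask] = rgb[mask]  # the object stays in place; only its time changes
        video.append(frame)
    return video
```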


Science in China Series F: Information Sciences | 2009

Video-based running water animation in Chinese painting style

Song-Hai Zhang; Tao Chen; Yi-Fei Zhang; Shi-Min Hu; Ralph Robert Martin

This paper presents a novel algorithm for synthesizing animations of running water, such as waterfalls and rivers, in the style of Chinese paintings, for applications such as cartoon making. All video frames are first registered in a common coordinate system, simultaneously segmenting the water from background and computing optical flow of the water. Taking artists' advice into account, we produce a painting structure to guide painting of brush strokes. Flow lines are placed in the water following an analysis of variance of optical flow, to cause strokes to be drawn where the water is flowing smoothly, rather than in turbulent areas: this allows a few moving strokes to depict the trends of the water flows. A variety of brush strokes is then drawn using a template determined from real Chinese paintings. The novel contributions of this paper are: a method for painting structure generation for flows in videos, and a method for stroke placement, with the necessary temporal coherence.
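
The stroke-placement criterion, drawing where the flow is smooth rather than turbulent, amounts to a local variance test on the optical flow. A minimal sketch follows, with the window size and threshold as assumptions.

```python
import cv2
import numpy as np

def smooth_flow_seeds(flow, win=15, var_thresh=0.5):
    """flow: HxWx2 float32 optical-flow field. Returns a boolean map that is
    True where local flow variance is low (smoothly flowing water), i.e.
    where flow lines for brush strokes should be placed."""
    mean = cv2.blur(flow, (win, win))
    mean_sq = cv2.blur(flow * flow, (win, win))
    variance = (mean_sq - mean * mean).sum(axis=2)  # total variance of (u, v)
    return variance < var_thresh
```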


Non-Photorealistic Animation and Rendering | 2011

Hidden images

Qiang Tong; Song-Hai Zhang; Shi-Min Hu; Ralph Robert Martin

A hidden image is a form of artistic expression in which one or more secondary objects (or scenes) are hidden within a primary image. Features of the primary image, especially its edges and texture, are used to portray a secondary object. People can recognize both the primary and secondary intent in such pictures, although the time taken to do so depends on the prior experience of the viewer and the strength of the clues. Here, we present a system for creating such images. It relies on the ability of human perception to recognize an object, e.g. a human face, from incomplete edge information within its interior, rather than its outline. Our system detects edges of the object to be hidden, and then finds a place where it can be embedded within the scene, together with a suitable transformation for doing so, by optimizing an energy based on edge differences. Embedding is performed using a modified Poisson blending approach, which strengthens matched edges of the host image using edges of the object being embedded. We show various hidden images generated by our system.
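
The placement search can be illustrated with a brute-force version of the edge-difference energy: scan candidate positions and keep the one where the host's edge map best matches the object's. The exhaustive grid and Canny edge detector below are my simplifications of the paper's optimisation over transformations.

```python
import cv2
import numpy as np

def best_placement(host_gray, obj_gray, stride=8):
    """host_gray, obj_gray: uint8 grayscale images, object smaller than host."""
    host_edges = cv2.Canny(host_gray, 50, 150).astype(np.float32) / 255.0
    obj_edges = cv2.Canny(obj_gray, 50, 150).astype(np.float32) / 255.0
    oh, ow = obj_edges.shape
    best_energy, best_xy = np.inf, (0, 0)
    for y in range(0, host_edges.shape[0] - oh, stride):
        for x in range(0, host_edges.shape[1] - ow, stride):
            # Edge-difference energy: low where host edges can portray the object.
            energy = np.abs(host_edges[y:y + oh, x:x + ow] - obj_edges).sum()
            if energy < best_energy:
                best_energy, best_xy = energy, (x, y)
    return best_xy, best_energy
```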


Journal of Computer Science and Technology | 2017

Intelligent Visual Media Processing: When Graphics Meets Vision

Ming-Ming Cheng; Qibin Hou; Song-Hai Zhang; Paul L. Rosin

The computer graphics and computer vision communities have been working closely together in recent years, and a variety of algorithms and applications have been developed to analyze and manipulate the visual media around us. There are three major driving forces behind this phenomenon: 1) the availability of big data from the Internet has created a demand for dealing with the ever-increasing, vast amount of resources; 2) powerful processing tools, such as deep neural networks, provide effective ways for learning how to deal with heterogeneous visual data; 3) new data capture devices, such as the Kinect, bridge the gap between algorithms for 2D image understanding and 3D model analysis. These driving forces have emerged only recently, and we believe that the computer graphics and computer vision communities are still at the beginning of their honeymoon phase. In this work we survey recent research on how computer vision techniques benefit computer graphics techniques and vice versa, and cover research on analysis, manipulation, synthesis, and interaction. We also discuss existing problems and suggest possible further research directions.


Journal of Computer Science and Technology | 2016

Multi-Task Learning for Food Identification and Analysis with Deep Convolutional Neural Networks

Xi-Jin Zhang; Yi-Fan Lu; Song-Hai Zhang

In this paper, we propose a multi-task system that can identify dish types, food ingredients, and cooking methods from food images with deep convolutional neural networks. We built a dataset of 360 classes of different foods, with at least 500 images for each class. To reduce noise in the data, which was collected from the Internet, outlier images were detected and eliminated using a one-class SVM trained with deep convolutional features. We simultaneously trained a dish identifier, a cooking-method recognizer, and a multi-label ingredient detector; they share a few low-level layers in the deep network architecture. The proposed framework shows higher accuracy than traditional methods with handcrafted features, and the cooking-method recognizer and ingredient detector can be applied to dishes that are not included in the training dataset, providing reference information for users.
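
The shared-trunk design described above can be sketched directly in PyTorch: one backbone feeding three heads, with the ingredient head treated as multi-label. The dish count comes from the abstract; the other sizes are illustrative assumptions.

```python
import torch.nn as nn

class FoodMultiTask(nn.Module):
    def __init__(self, n_dishes=360, n_methods=20, n_ingredients=100):
        super().__init__()                # n_methods / n_ingredients are assumed
        self.shared = nn.Sequential(      # shared low-level layers
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.dish = nn.Linear(128, n_dishes)              # softmax classification
        self.method = nn.Linear(128, n_methods)           # softmax classification
        self.ingredients = nn.Linear(128, n_ingredients)  # sigmoid/BCE, multi-label

    def forward(self, x):
        f = self.shared(x)
        return self.dish(f), self.method(f), self.ingredients(f)
```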


Computational Visual Media | 2016

Comfort-driven disparity adjustment for stereoscopic video

Miao Wang; Xi-Jin Zhang; Jun-Bang Liang; Song-Hai Zhang; Ralph Robert Martin

Pixel disparity—the offset of corresponding pixels between left and right views—is a crucial parameter in stereoscopic three-dimensional (S3D) video, as it determines the depth perceived by the human visual system (HVS). Unsuitable pixel disparity distribution throughout an S3D video may lead to visual discomfort. We present a unified and extensible stereoscopic video disparity adjustment framework which improves the viewing experience for an S3D video by keeping the perceived 3D appearance as unchanged as possible while minimizing discomfort. We first analyse disparity and motion attributes of S3D video in general, then derive a wide-ranging visual discomfort metric from existing perceptual comfort models. An objective function based on this metric is used as the basis of a hierarchical optimisation method to find a disparity mapping function for each input video frame. Warping-based disparity manipulation is then applied to the input video to generate the output video, using the desired disparity mappings as constraints. Our comfort metric takes into account disparity range, motion, and stereoscopic window violation; the framework could easily be extended to use further visual comfort models. We demonstrate the power of our approach using both animated cartoons and real S3D videos.
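
As a toy version of the disparity-mapping step: compress disparities so they fit an assumed comfort zone while keeping the screen plane (zero disparity) fixed. The paper optimises a much richer discomfort metric hierarchically; the linear map and comfort zone below are illustrative only.

```python
import numpy as np

def comfort_map(disparity, comfort=(-20.0, 30.0)):
    """disparity: per-pixel disparity in pixels; comfort is an assumed zone."""
    lo, hi = comfort
    d_min, d_max = float(disparity.min()), float(disparity.max())
    # Shrink the observed range into the comfort zone, preserving zero
    # disparity (the screen plane) so perceived depth changes minimally.
    scale = min(1.0, (hi - lo) / max(d_max - d_min, 1e-6))
    return np.clip(disparity * scale, lo, hi)
```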


Computer Graphics Forum | 2014

Learning Natural Colors for Image Recoloring

Hao-Zhi Huang; Song-Hai Zhang; Ralph Robert Martin; Shi-Min Hu

We present a data-driven method for automatically recoloring a photo to enhance its appearance or change a viewer's emotional response to it. A compact representation called a RegionNet summarizes color and geometric features of image regions, and the geometric relationships between them. Correlations between color property distributions and geometric features of regions are learned from a database of well-colored photos. A probabilistic factor graph model is used to summarize distributions of color properties and generate an overall probability distribution for color suggestions. Given a new input image, we can generate multiple recolored results which, unlike previous automatic results, are both natural and artistic, and compatible with their spatial arrangements.
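
A RegionNet-like structure, per-region colour features plus adjacency between regions, can be sketched as below, assuming scikit-image is available. SLIC superpixels stand in for the paper's region decomposition, and mean colour is a placeholder for its richer feature set.

```python
import numpy as np
from skimage.segmentation import slic

def region_net(image):
    """image: HxWx3 float array in [0, 1]. Returns (features, adjacency)."""
    labels = slic(image, n_segments=200, start_label=0)
    n = labels.max() + 1
    # Placeholder per-region feature: mean colour of the region.
    features = np.array([image[labels == i].mean(axis=0) for i in range(n)])
    adjacency = {i: set() for i in range(n)}
    # Neighbouring pixels with different labels define a region adjacency edge.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        for i, j in zip(a[diff], b[diff]):
            adjacency[int(i)].add(int(j))
            adjacency[int(j)].add(int(i))
    return features, adjacency
```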

Collaboration


Dive into Song-Hai Zhang's collaborations.

Top Co-Authors

Shao-Ping Lu

Vrije Universiteit Brussel
