Publication


Featured research published by Yi-Hsuan Tsai.


Computer Vision and Pattern Recognition | 2016

Video Segmentation via Object Flow

Yi-Hsuan Tsai; Ming-Hsuan Yang; Michael J. Black

Video object segmentation is challenging due to fast moving objects, deforming shapes, and cluttered backgrounds. Optical flow can be used to propagate an object segmentation over time but, unfortunately, flow is often inaccurate, particularly around object boundaries. Such boundaries are precisely where we want our segmentation to be accurate. To obtain accurate segmentation across time, we propose an efficient algorithm that considers video segmentation and optical flow estimation simultaneously. For video segmentation, we formulate a principled, multiscale, spatio-temporal objective function that uses optical flow to propagate information between frames. For optical flow estimation, particularly at object boundaries, we compute the flow independently in the segmented regions and recompose the results. We call the process object flow and demonstrate the effectiveness of jointly optimizing optical flow and video segmentation using an iterative scheme. Experiments on the SegTrack v2 and YouTube-Objects datasets show that the proposed algorithm performs favorably against other state-of-the-art methods.
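
The alternation at the heart of object flow can be illustrated with a short sketch: estimate flow separately inside and outside the current object mask, recompose the two fields, and use the composite flow to carry the mask to the next frame. The snippet below is a simplified, single-scale illustration, not the authors' implementation; OpenCV's Farneback flow stands in for the paper's flow estimator, and the helper names (region_flow, warp_mask, object_flow_step) are hypothetical.

```python
# Minimal sketch of the "object flow" idea: estimate optical flow separately
# inside and outside the current object mask, recompose the two fields, and
# use the composite flow to propagate the mask. Inputs are 8-bit grayscale
# frames; OpenCV's Farneback flow stands in for the paper's flow estimator.
import cv2
import numpy as np

def region_flow(prev_gray, next_gray, mask):
    """Estimate flow on a frame pair with one region emphasized (hypothetical helper)."""
    p, n = prev_gray.copy(), next_gray.copy()
    p[~mask] = 0  # suppress the other region so the estimator focuses on this one
    n[~mask] = 0
    return cv2.calcOpticalFlowFarneback(p, n, None, 0.5, 3, 15, 3, 5, 1.2, 0)

def warp_mask(mask, flow):
    """Roughly propagate a binary mask along the flow field (approximate warping)."""
    h, w = mask.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    warped = cv2.remap(mask.astype(np.float32), map_x, map_y, cv2.INTER_LINEAR)
    return warped > 0.5

def object_flow_step(prev_gray, next_gray, mask, n_iters=3):
    """Alternate between per-region flow estimation and mask propagation."""
    for _ in range(n_iters):
        fg_flow = region_flow(prev_gray, next_gray, mask)
        bg_flow = region_flow(prev_gray, next_gray, ~mask)
        flow = np.where(mask[..., None], fg_flow, bg_flow)  # recompose the two fields
        mask = warp_mask(mask, flow)
    return mask, flow
```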


European Conference on Computer Vision | 2016

Semantic Co-segmentation in Videos

Yi-Hsuan Tsai; Guangyu Zhong; Ming-Hsuan Yang

Discovering and segmenting objects in videos is a challenging task due to large variations in object appearance, deformed shapes, and cluttered backgrounds. In this paper, we propose to segment objects and understand their visual semantics from a collection of videos that link to each other, which we refer to as semantic co-segmentation. Without any prior knowledge of the videos, we first extract semantic objects and utilize a tracking-based approach to generate multiple object-like tracklets across each video. Each tracklet maintains temporally connected segments and is associated with a predicted category. To exploit rich information from other videos, we collect tracklets that are assigned to the same category from all videos, and co-select tracklets that belong to true objects by solving a submodular function. This function accounts for object properties such as appearance, shape, and motion, and hence facilitates the co-segmentation process. Experiments on three video object segmentation datasets show that the proposed algorithm performs favorably against other state-of-the-art methods.
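
The tracklet co-selection step can be illustrated with a toy sketch: a facility-location function over pairwise tracklet similarities stands in for the paper's submodular objective (which also accounts for shapes and motions), and greedy selection with a budget is the standard approximate maximizer for monotone submodular functions. The similarity matrix and all names below are illustrative, not the authors' code.

```python
# Minimal sketch of co-selecting tracklets with a submodular objective.
import numpy as np

def facility_location(similarity, selected):
    """F(S) = sum_i max_{j in S} sim(i, j); rewards covering all tracklets well."""
    if not selected:
        return 0.0
    return similarity[:, selected].max(axis=1).sum()

def greedy_select(similarity, budget):
    """Greedily pick tracklets with the largest marginal gain in F."""
    selected = []
    for _ in range(budget):
        candidates = [j for j in range(similarity.shape[0]) if j not in selected]
        gains = [facility_location(similarity, selected + [j]) -
                 facility_location(similarity, selected) for j in candidates]
        selected.append(candidates[int(np.argmax(gains))])
    return selected

# Toy usage: 6 tracklets of one category, described by appearance features only.
feats = np.random.rand(6, 16)
sim = feats @ feats.T  # hypothetical similarity; the paper also uses shape and motion cues
print(greedy_select(sim, budget=2))
```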


International Conference on Computer Graphics and Interactive Techniques | 2016

Sky is not the limit: semantic-aware sky replacement

Yi-Hsuan Tsai; Xiaohui Shen; Zhe Lin; Kalyan Sunkavalli; Ming-Hsuan Yang

Skies are common backgrounds in photos but are often less interesting due to the time at which the photo was taken. Professional photographers correct this by using sophisticated tools with painstaking effort that is beyond the command of ordinary users. In this work, we propose an automatic background replacement algorithm that can generate realistic, artifact-free images with diverse styles of skies. The key idea of our algorithm is to utilize visual semantics to guide the entire process, including sky segmentation, search, and replacement. First, we train a deep convolutional neural network for semantic scene parsing, which is used as a visual prior to segment sky regions in a coarse-to-fine manner. Second, in order to find proper skies for replacement, we propose a data-driven sky search scheme based on the semantic layout of the input image. Finally, to re-compose the stylized sky with the original foreground naturally, an appearance transfer method is developed to match statistics locally and semantically. We show that the proposed algorithm can automatically generate a set of visually pleasing results. In addition, we demonstrate the effectiveness of the proposed algorithm with extensive user studies.
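
A rough sketch of the final compositing step is shown below: the new sky is pasted into the masked region and the foreground's color statistics are nudged toward the new sky so the composite looks consistent. This is a coarse, global version of the paper's local and semantic appearance transfer; the function names and the blending weight are assumptions made for illustration.

```python
# Minimal sketch of sky replacement plus a global statistics-matching step.
# Images are assumed to be uint8 RGB arrays of the same size.
import numpy as np

def match_statistics(src, ref):
    """Shift/scale src per channel to match ref's mean and std (Reinhard-style)."""
    out = np.empty_like(src, dtype=np.float64)
    for c in range(3):
        s_mu, s_std = src[..., c].mean(), src[..., c].std() + 1e-6
        r_mu, r_std = ref[..., c].mean(), ref[..., c].std() + 1e-6
        out[..., c] = (src[..., c] - s_mu) / s_std * r_std + r_mu
    return np.clip(out, 0, 255)

def replace_sky(image, sky_mask, new_sky):
    """Composite a new sky into the masked region and harmonize the foreground."""
    fg = image.astype(np.float64)
    harmonized_fg = match_statistics(fg, new_sky.astype(np.float64))
    blend = 0.3  # hypothetical strength of the appearance transfer
    fg = (1 - blend) * fg + blend * harmonized_fg
    out = np.where(sky_mask[..., None], new_sky.astype(np.float64), fg)
    return out.astype(np.uint8)
```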


Computer Vision and Pattern Recognition | 2017

Deep Image Harmonization

Yi-Hsuan Tsai; Xiaohui Shen; Zhe Lin; Kalyan Sunkavalli; Xin Lu; Ming-Hsuan Yang

Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of foreground and background need to be adjusted to make them compatible. Previous approaches to harmonize composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable especially when the contents in the two layers are vastly different. In this work, we propose an end-to-end deep convolutional neural network for image harmonization, which can capture both the context and semantic information of the composite images during harmonization. We also introduce an efficient way to collect large-scale and high-quality training data that can facilitate the training process. Experiments on the synthesized dataset and real composite images show that the proposed network outperforms previous state-of-the-art methods.
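
The overall shape of such a harmonization network can be sketched in a few lines of PyTorch: the composite image is concatenated with the foreground mask to form a 4-channel input, and an encoder-decoder predicts the harmonized RGB output. The toy module below only illustrates this input/output structure; it is not the paper's architecture, which is substantially deeper and also exploits semantic information, as the abstract notes.

```python
# Minimal sketch of an encoder-decoder that maps (composite, mask) -> harmonized image.
import torch
import torch.nn as nn

class HarmonizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, composite, mask):
        x = torch.cat([composite, mask], dim=1)  # N x 4 x H x W
        return self.decoder(self.encoder(x))

# Toy usage: one 256x256 composite and its foreground mask.
net = HarmonizationNet()
out = net(torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256))
print(out.shape)  # torch.Size([1, 3, 256, 256])
```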


International Conference on Computer Vision | 2013

Exemplar Cut

Jimei Yang; Yi-Hsuan Tsai; Ming-Hsuan Yang

We present a hybrid parametric and nonparametric algorithm, exemplar cut, for generating class-specific object segmentation hypotheses. For the parametric part, we train a pylon model on a hierarchical region tree as the energy function for segmentation. For the nonparametric part, we match the input image with each exemplar by using regions to obtain a score which augments the energy function from the pylon model. Our method thus generates a set of highly plausible segmentation hypotheses by solving a series of exemplar augmented graph cuts. Experimental results on the Graz and PASCAL datasets show that the proposed algorithm achieves favorable segmentation performance against the state-of-the-art methods in terms of visual quality and accuracy.
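
A toy version of one exemplar-augmented cut is sketched below: unary costs from a parametric model are lowered wherever an exemplar matches well, and a standard s-t min-cut over a 4-connected grid yields one segmentation hypothesis. networkx stands in for the specialized max-flow solvers used in practice, a flat pixel grid replaces the paper's hierarchical region tree, and all energies are synthetic.

```python
# Minimal sketch of an exemplar-augmented binary graph cut on a tiny grid.
import networkx as nx
import numpy as np

def exemplar_cut(fg_cost, bg_cost, exemplar_score, pairwise=1.0, alpha=0.5):
    """One graph-cut hypothesis; exemplar_score lowers the foreground cost."""
    h, w = fg_cost.shape
    G = nx.DiGraph()
    for y in range(h):
        for x in range(w):
            p = (y, x)
            # Terminal edges encode the (exemplar-augmented) unary energies.
            G.add_edge('s', p, capacity=float(bg_cost[y, x]))
            G.add_edge(p, 't', capacity=float(max(fg_cost[y, x] - alpha * exemplar_score[y, x], 0.0)))
            # Pairwise smoothness edges to right/bottom neighbours.
            for q in [(y, x + 1), (y + 1, x)]:
                if q[0] < h and q[1] < w:
                    G.add_edge(p, q, capacity=pairwise)
                    G.add_edge(q, p, capacity=pairwise)
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    mask = np.zeros((h, w), dtype=bool)
    for p in source_side:
        if p != 's':
            mask[p] = True  # pixels on the source side are labeled foreground
    return mask

# Toy usage on a 5x5 grid with one strong exemplar response in the centre.
rng = np.random.default_rng(0)
fg, bg = rng.random((5, 5)), rng.random((5, 5))
score = np.zeros((5, 5)); score[1:4, 1:4] = 1.0
print(exemplar_cut(fg, bg, score))
```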


Computer Vision and Pattern Recognition | 2015

Adaptive region pooling for object detection

Yi-Hsuan Tsai; Onur C. Hamsici; Ming-Hsuan Yang

Learning models for object detection is a challenging problem due to the large intra-class variability of objects in appearance, viewpoints, and rigidity. We address this variability by a novel feature pooling method that is adaptive to segmented regions. The proposed detection algorithm automatically discovers a diverse set of exemplars and their distinctive parts which are used to encode the region structure by the proposed feature pooling method. Based on each exemplar and its parts, a regression model is learned with samples selected by a coarse region matching scheme. The proposed algorithm performs favorably on the PASCAL VOC 2007 dataset against existing algorithms. We demonstrate the benefits of our feature pooling method when compared to conventional spatial pyramid pooling features. We also show that object information can be transferred through exemplars for detected objects.
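
The contrast between fixed spatial-pyramid pooling and pooling that adapts to a segmented region can be shown with a small sketch: features are aggregated only where the region mask is on, so the descriptor follows the object's shape rather than a rigid grid. The feature map and mask below are synthetic stand-ins, and the functions are illustrative rather than the paper's exact pooling scheme.

```python
# Minimal sketch contrasting spatial-pyramid pooling with region-adaptive pooling.
import numpy as np

def spatial_pyramid_pool(feat, levels=(1, 2)):
    """Max-pool a C x H x W feature map over fixed grids (1x1, 2x2, ...)."""
    c, h, w = feat.shape
    pooled = []
    for l in levels:
        for i in range(l):
            for j in range(l):
                cell = feat[:, i * h // l:(i + 1) * h // l, j * w // l:(j + 1) * w // l]
                pooled.append(cell.reshape(c, -1).max(axis=1))
    return np.concatenate(pooled)

def region_adaptive_pool(feat, region_mask):
    """Max-pool the same feature map, but only inside the segmented region."""
    c = feat.shape[0]
    inside = feat[:, region_mask]  # C x (#pixels in region)
    return inside.max(axis=1) if inside.size else np.zeros(c)

feat = np.random.rand(8, 16, 16)  # hypothetical conv feature map
mask = np.zeros((16, 16), dtype=bool); mask[4:12, 6:14] = True
print(spatial_pyramid_pool(feat).shape, region_adaptive_pool(feat, mask).shape)
```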


International Conference on Image Processing | 2014

Locality preserving hashing

Yi-Hsuan Tsai; Ming-Hsuan Yang

The spectral hashing algorithm relaxes and solves an objective function for generating hash codes such that data similarity is preserved in the Hamming space. However, the assumption of a uniform global data distribution limits its applicability. In this paper, we introduce locality preserving projection to determine the data distribution adaptively, and a spectral method is adopted to estimate the eigenfunctions of the underlying graph Laplacian. Furthermore, pairwise label similarity can be incorporated in the weight matrix to bridge the semantic gap between data and hash codes. Experiments on three benchmark datasets show that the proposed algorithm performs favorably against state-of-the-art hashing methods.
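
The spectral flavor of this approach can be illustrated compactly: build a neighborhood affinity so that the weights reflect the local data distribution, take the smallest non-trivial eigenvectors of the graph Laplacian, and threshold them into bits so that nearby points tend to share codes. The sketch below is a simplified stand-in for the paper's locality preserving projection and eigenfunction estimation, and it omits the label-similarity term.

```python
# Minimal sketch: binary codes from graph Laplacian eigenvectors.
import numpy as np
from scipy.linalg import eigh

def laplacian_hash_codes(X, n_bits=4, sigma=1.0):
    """X: n x d data matrix; returns n x n_bits binary codes."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))   # affinity reflecting local distances
    np.fill_diagonal(W, 0)
    D = np.diag(W.sum(axis=1))
    L = D - W                                  # unnormalised graph Laplacian
    eigvals, eigvecs = eigh(L)                 # eigenvalues in ascending order
    Y = eigvecs[:, 1:n_bits + 1]               # skip the trivial constant eigenvector
    return (Y > np.median(Y, axis=0)).astype(np.uint8)

# Toy usage: two well-separated clusters should receive distinct codes.
X = np.vstack([np.random.randn(20, 8), np.random.randn(20, 8) + 4.0])
codes = laplacian_hash_codes(X)
print(codes[:3], codes[-3:])
```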


Asian Conference on Computer Vision | 2016

Weakly-Supervised Video Scene Co-parsing

Guangyu Zhong; Yi-Hsuan Tsai; Ming-Hsuan Yang

In this paper, we propose a scene co-parsing framework to assign pixel-wise semantic labels in weakly-labeled videos, i.e., only video-level category labels are given. To exploit rich semantic information, we first collect all videos that share the same video-level labels and segment them into supervoxels. We then select representative supervoxels for each category via a supervoxel ranking process. This ranking problem is formulated with a submodular objective function, and a scene-object classifier is incorporated to distinguish scenes from objects. To assign each supervoxel a semantic label, we match each supervoxel to these selected representatives in the feature domain. Each supervoxel is then associated with a series of category potentials and is assigned the semantic label with the maximum potential. The proposed co-parsing framework extends scene parsing from single images to videos and exploits mutual information among a video collection. Experimental results on the Wild-8 and SUNY-24 datasets show that the proposed algorithm performs favorably against state-of-the-art approaches.
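
The label-transfer step can be sketched as follows: each supervoxel is compared to the selected representatives of every category in the feature domain, the similarities are accumulated into category potentials, and the supervoxel takes the category with the maximum potential. The features, representatives, and category names below are synthetic placeholders.

```python
# Minimal sketch of assigning semantic labels by matching to representatives.
import numpy as np

def assign_labels(supervoxel_feats, representatives):
    """representatives: dict category -> (k x d) array of selected supervoxels."""
    categories = sorted(representatives)
    labels = []
    for f in supervoxel_feats:
        potentials = []
        for c in categories:
            reps = representatives[c]
            # Cosine similarity to every representative; keep the best match.
            sims = reps @ f / (np.linalg.norm(reps, axis=1) * np.linalg.norm(f) + 1e-8)
            potentials.append(sims.max())
        labels.append(categories[int(np.argmax(potentials))])
    return labels

# Toy usage with two hypothetical categories and 4-D features.
reps = {'road': np.random.rand(3, 4), 'car': np.random.rand(3, 4) + 1.0}
print(assign_labels(np.random.rand(5, 4) + 1.0, reps))
```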


International Conference on Intelligent Transportation Systems | 2016

Learning to tell brake lights with convolutional features

Guangyu Zhong; Yi-Hsuan Tsai; Yi-Ting Chen; Xue Mei; Danil V. Prokhorov; Michael R. James; Ming-Hsuan Yang

In this paper, we present a learning-based brake light classification algorithm for intelligent driver-assistance systems. State-of-the-art approaches apply different image processing techniques with hand-crafted features to determine whether brake lights are on or off. In contrast, we learn a brake light classifier based on discriminative color descriptors and convolutional features fine-tuned for traffic scenes. We show how brake light regions can be segmented and classified in one framework. Numerous experimental results show that the proposed algorithm performs well against state-of-the-art alternatives in real-world scenes.
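
The classification step can be sketched with a small example: a color descriptor (a plain RGB histogram here) is concatenated with a convolutional feature vector and fed to a linear classifier that decides whether the brake lights are on or off. The CNN features are mocked with random vectors; in the paper they come from a network fine-tuned on traffic scenes, and the histogram below is a simplification of the discriminative color descriptors used there.

```python
# Minimal sketch: color descriptor + conv features -> linear brake-light classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def color_descriptor(patch, bins=8):
    """Per-channel histogram of a rear-light patch (H x W x 3, uint8)."""
    hists = [np.histogram(patch[..., c], bins=bins, range=(0, 255), density=True)[0]
             for c in range(3)]
    return np.concatenate(hists)

def features(patch, conv_feat):
    """Concatenate the hand-designed color descriptor with CNN features."""
    return np.concatenate([color_descriptor(patch), conv_feat])

# Toy training data: "on" patches are made redder; conv features are placeholders.
rng = np.random.default_rng(1)
X, y = [], []
for label in (0, 1):
    for _ in range(50):
        patch = rng.integers(0, 120, (16, 16, 3), dtype=np.uint8)
        if label == 1:
            patch[..., 0] = rng.integers(180, 255, (16, 16), dtype=np.uint8)  # bright red channel
        X.append(features(patch, rng.random(32)))
        y.append(label)
clf = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))
print(clf.score(np.array(X), np.array(y)))
```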


International Conference on Computer Vision | 2017

SegFlow: Joint Learning for Video Object Segmentation and Optical Flow

Jingchun Cheng; Yi-Hsuan Tsai; Shengjin Wang; Ming-Hsuan Yang

Collaboration


Dive into Yi-Hsuan Tsai's collaborations.

Top Co-Authors

Wei-Chih Hung
University of California

Guangyu Zhong
Dalian University of Technology

Sifei Liu
University of California