Seunghoon Hong
Pohang University of Science and Technology
Publications
Featured research published by Seunghoon Hong.
International Conference on Computer Vision | 2015
Hyeonwoo Noh; Seunghoon Hong; Bohyung Han
We propose a novel semantic segmentation algorithm by learning a deep deconvolution network. We learn the network on top of the convolutional layers adopted from the VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixelwise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of existing methods based on fully convolutional networks: by integrating the deep deconvolution network with proposal-wise prediction, our segmentation method identifies detailed structures and handles objects at multiple scales naturally. Our network demonstrates outstanding performance on the PASCAL VOC 2012 dataset, and we achieve the best accuracy (72.5%) among methods trained without the Microsoft COCO dataset through an ensemble with the fully convolutional network.
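A minimal PyTorch sketch of the unpooling-plus-deconvolution decoder idea described above; this is not the authors' implementation, and the single pooling stage and layer widths are illustrative only:

```python
import torch
import torch.nn as nn

class TinyDeconvNet(nn.Module):
    """Illustrative encoder-decoder: max-pooling indices from the encoder
    drive unpooling in the decoder, followed by (transposed) convolutions."""
    def __init__(self, num_classes=21):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)   # keep switch locations
        self.unpool = nn.MaxUnpool2d(2, stride=2)                    # place activations back
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1))                           # per-pixel class scores

    def forward(self, x):
        f = self.conv(x)
        p, idx = self.pool(f)            # encoder: downsample, remember pooling indices
        u = self.unpool(p, idx)          # decoder: unpool with the saved indices
        return self.deconv(u)            # dense per-pixel class scores

scores = TinyDeconvNet()(torch.randn(1, 3, 64, 64))   # -> (1, 21, 64, 64)
```

The saved pooling indices let the decoder restore activations to their original spatial positions before the transposed convolution densifies them, which is what recovers fine structure.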
Computer Vision and Pattern Recognition | 2016
Seunghoon Hong; Junhyuk Oh; Honglak Lee; Bohyung Han
We propose a novel weakly-supervised semantic segmentation algorithm based on a Deep Convolutional Neural Network (DCNN). Contrary to existing weakly-supervised approaches, our algorithm exploits auxiliary segmentation annotations available for different categories to guide segmentation on images with only image-level class labels. To make segmentation knowledge transferable across categories, we design a decoupled encoder-decoder architecture with an attention model. In this architecture, the model generates spatial highlights of each category present in an image using the attention model, and subsequently performs binary segmentation for each highlighted region using the decoder. Combined with the attention model, the decoder trained with segmentation annotations from different categories boosts the accuracy of weakly-supervised semantic segmentation. The proposed algorithm demonstrates substantially improved performance compared to state-of-the-art weakly-supervised techniques on the PASCAL VOC 2012 dataset when our model is trained with annotations for 60 exclusive categories from the Microsoft COCO dataset.
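A toy sketch of the decoupled design under stated assumptions: a per-category attention map gates shared encoder features, and a single category-agnostic decoder predicts a binary mask for the attended region. Shapes and layer choices are hypothetical, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttendThenDecode(nn.Module):
    """Illustrative decoupled model: per-category attention gates shared
    encoder features; one binary decoder segments the attended region."""
    def __init__(self, num_classes=20, channels=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        # one 1x1 filter per category produces a spatial attention map
        self.attention = nn.Conv2d(channels, num_classes, 1)
        # category-agnostic decoder: foreground vs. background only
        self.decoder = nn.Conv2d(channels, 1, 1)

    def forward(self, image, category):
        feats = self.encoder(image)                          # shared features
        attn = torch.softmax(
            self.attention(feats)[:, category].flatten(1), dim=1
        ).view(feats.size(0), 1, *feats.shape[2:])           # normalized spatial attention
        return self.decoder(feats * attn)                    # binary mask logits

mask_logits = AttendThenDecode()(torch.randn(1, 3, 64, 64), category=7)
```

Because the decoder only ever sees attended features and predicts foreground versus background, it can in principle be trained on annotations from one set of categories and reused for others.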
European Conference on Computer Vision | 2014
Hyeonseob Nam; Seunghoon Hong; Bohyung Han
Tracking by sequential Bayesian filtering relies on a graphical model with a temporally ordered linear structure based on the temporal smoothness assumption. This framework is convenient for propagating the posterior through a first-order Markov chain. However, density propagation from a single immediately preceding frame may be unreliable, especially in challenging situations such as abrupt appearance changes, fast motion, and occlusion. We propose a visual tracking algorithm based on more general graphical models, where multiple previous frames contribute to computing the posterior in the current frame and edges between frames are created based on inter-frame trackability. Such a data-driven graphical model reflects sequence structure as well as target characteristics, and is better suited to implementing a robust tracking algorithm. The proposed tracking algorithm runs online and achieves outstanding performance compared to state-of-the-art trackers. We illustrate the quantitative and qualitative performance of our algorithm on all sequences in the tracking benchmark and on other challenging videos.
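The key departure from first-order filtering, letting several earlier frames contribute to the current posterior with weights tied to inter-frame trackability, can be illustrated with a simple weighted mixture; the trackability scores and discrete state densities below are toy stand-ins for the paper's density propagation:

```python
import numpy as np

def fuse_posteriors(propagated, trackability):
    """Mix densities propagated from several earlier frames into the
    current-frame posterior, weighting each by inter-frame trackability."""
    w = np.asarray(trackability, dtype=float)
    w = w / w.sum()                               # normalize edge weights
    mix = sum(wi * p for wi, p in zip(w, propagated))
    return mix / mix.sum()                        # renormalize to a distribution

# three earlier frames propagate densities over 5 candidate target states
densities = [np.array([.10, .60, .20, .05, .05]),   # reliable frame, sharp density
             np.array([.20, .20, .20, .20, .20]),   # occluded frame, uninformative
             np.array([.05, .70, .15, .05, .05])]
posterior = fuse_posteriors(densities, trackability=[0.9, 0.1, 0.8])
```

Frames that are hard to relate to the current one (the flat density above) contribute little, so a single bad preceding frame no longer dominates the estimate.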
International Conference on Computer Vision | 2013
Seunghoon Hong; Suha Kwak; Bohyung Han
We propose a novel offline tracking algorithm based on model-averaged posterior estimation through patch matching across frames. Contrary to existing online and offline tracking methods, our algorithm is not based on temporally ordered estimates of the target state; instead, it selects easy-to-track frames first from the remaining ones, without exploiting the temporal coherency of the target. The posterior of the selected frame is estimated by propagating densities from the already tracked frames in a recursive manner. The density propagation across frames is implemented by an efficient patch matching technique, which is useful for our algorithm since it does not require a motion smoothness assumption. We also present a hierarchical approach, where a small set of key frames is tracked first and non-key frames are handled by local key frames. Our tracking algorithm is conceptually well suited to sequences with abrupt motion, shot changes, and occlusion. We compare our tracking algorithm with existing techniques on real videos with such challenges and illustrate its superior performance qualitatively and quantitatively.
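A rough sketch of the easy-frames-first idea: given a matrix of inter-frame matching confidences (a hypothetical stand-in for the patch-matching scores), repeatedly track whichever remaining frame links most confidently to an already-tracked frame, regardless of temporal order.

```python
import numpy as np

def easy_first_order(confidence, first=0):
    """Greedy frame ordering: starting from a seed frame, repeatedly pick the
    remaining frame whose best matching confidence to an already-tracked
    frame is highest.  `confidence` is a symmetric inter-frame score matrix."""
    n = confidence.shape[0]
    tracked, order = {first}, [first]
    while len(tracked) < n:
        remaining = [f for f in range(n) if f not in tracked]
        # score each remaining frame by its strongest link into the tracked set
        nxt = max(remaining, key=lambda f: max(confidence[f, t] for t in tracked))
        tracked.add(nxt)
        order.append(nxt)
    return order

rng = np.random.default_rng(0)
C = rng.random((6, 6)); C = (C + C.T) / 2          # toy symmetric confidences
print(easy_first_order(C))                          # frames in easy-to-track-first order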
Computer Vision and Pattern Recognition | 2017
Seunghoon Hong; Donghun Yeo; Suha Kwak; Honglak Lee; Bohyung Han
We propose a novel algorithm for weakly supervised semantic segmentation based on image-level class labels only. In the weakly supervised setting, it is commonly observed that the trained model focuses overly on discriminative parts rather than the entire object area. Our goal is to overcome this limitation without additional human intervention by retrieving videos relevant to the target class labels from a web repository and generating segmentation labels from the retrieved videos to simulate strong supervision for semantic segmentation. During this process, we take advantage of image classification with a discriminative localization technique to reject false alarms in the retrieved videos and to identify relevant spatio-temporal volumes within them. Although the entire procedure does not require any additional supervision, the segmentation annotations obtained from videos are sufficiently strong to learn a model for semantic segmentation. The proposed algorithm substantially outperforms existing methods based on the same level of supervision and is even competitive with approaches relying on extra annotations.
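The discriminative localization step can be sketched in the spirit of class activation mapping: project a class's classifier weights onto the final convolutional features to obtain a spatial relevance map, then keep only video frames with enough confidently activated area. The tensor shapes and thresholds below are illustrative assumptions, not the paper's settings.

```python
import torch

def class_activation_map(features, fc_weights, class_idx):
    """Discriminative localization sketch: weight the final conv features by
    one class's classifier weights to get a spatial relevance map."""
    # features: (C, H, W) conv activations; fc_weights: (num_classes, C)
    cam = torch.einsum('c,chw->hw', fc_weights[class_idx], features)
    cam = cam.clamp(min=0)
    return cam / (cam.max() + 1e-8)                 # normalize to [0, 1]

# toy example: keep a retrieved video frame only if the map shows a confident region
features = torch.randn(64, 8, 8)
fc_weights = torch.randn(20, 64)
cam = class_activation_map(features, fc_weights, class_idx=3)
keep_frame = (cam > 0.5).float().mean() > 0.05      # enough confidently-activated area
```

Frames that never activate for the target class are rejected as false alarms, and the surviving activated regions delimit the spatio-temporal volumes used to mine segmentation labels.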
European Conference on Computer Vision | 2014
Seunghoon Hong; Bohyung Han
Probabilistic tracking algorithms typically rely on graphical models based on the first-order Markov assumption. Although such linear-structure models are simple and reasonable, they are not appropriate for persistent tracking, since temporal failures caused by short-term occlusion, shot changes, and appearance changes may impair the remaining frames significantly. More general graphical models may be useful to exploit the intrinsic structure of the input video and improve tracking performance. Hence, we propose a novel offline tracking algorithm that identifies a tree-structured graphical model, where we formulate a unified framework to optimize the tree structure and track a target in a principled way based on MCMC sampling. To reduce computational cost, we also introduce a technique that finds the optimal tree for a small number of key frames first and employs a semi-supervised manifold alignment technique to construct the tree for all frames. We evaluated our algorithm on many challenging videos and obtained outstanding results compared to state-of-the-art techniques, quantitatively and qualitatively.
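As a much simplified illustration of a tree-structured graph over frames, the sketch below connects frames by a maximum spanning tree on pairwise similarity; the paper instead optimizes the tree jointly with tracking via MCMC sampling, so this is only a stand-in for the kind of structure being estimated.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def frame_tree(similarity):
    """Spanning tree connecting frames through their strongest pairwise
    similarities (a simple stand-in for the MCMC-optimized tree)."""
    S = np.asarray(similarity, dtype=float)
    # turn similarities into positive costs so the *minimum* spanning tree
    # keeps the strongest links; zero entries (the diagonal) mean "no edge"
    cost = np.where(S > 0, S.max() + 1e-6 - S, 0.0)
    mst = minimum_spanning_tree(cost)
    rows, cols = mst.nonzero()
    return list(zip(rows.tolist(), cols.tolist()))   # tree edges between frames

rng = np.random.default_rng(1)
S = rng.random((5, 5)); S = (S + S.T) / 2; np.fill_diagonal(S, 0)
print(frame_tree(S))                                  # list of (frame, frame) edges
```

A tree like this lets a frame inherit its posterior from the most similar tracked frame rather than strictly from its temporal predecessor, which is what makes the model robust to shot changes and occlusion.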
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016
Seunghoon Hong; Jonghyun Choi; Jan Feyereisl; Bohyung Han; Larry S. Davis
We propose a novel algorithm to cluster and annotate a set of input images jointly, where the images are clustered into several discriminative groups and each group is identified with representative labels automatically. For these purposes, each input image is first represented by a distribution of candidate labels based on its similarity to images in a labeled reference image database. The set of these label-based representations is then refined collectively through non-negative matrix factorization with sparsity and orthogonality constraints; the refined representations are employed to cluster and annotate the input images jointly. The proposed approach demonstrates performance improvements in image clustering over existing techniques and achieves competitive image labeling accuracy in both quantitative and qualitative evaluation. In addition, we extend our joint clustering and labeling framework to the weakly-supervised image classification problem and obtain promising results.
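A compact sketch of the collective refinement step using plain multiplicative-update non-negative matrix factorization over the label-based representations; the sparsity and orthogonality penalties the paper adds are omitted, so this shows only the basic factorization and how clusters and representative labels could be read off the factors.

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9):
    """Plain multiplicative-update NMF: V (labels x images) ~ W @ H with
    non-negative factors.  The paper adds sparsity and orthogonality
    constraints on top of this basic scheme."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)      # update image-side factors
        W *= (V @ H.T) / (W @ H @ H.T + eps)      # update label-side factors
    return W, H

# toy: 50 candidate labels x 30 images, assumed to form 4 groups
V = np.random.default_rng(1).random((50, 30))
W, H = nmf(V, rank=4)
clusters = H.argmax(axis=0)                       # cluster assignment per image
top_labels = W.argmax(axis=0)                     # representative label per cluster
```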
International Conference on Computer Vision | 2013
Taegyu Lim; Seunghoon Hong; Bohyung Han; Joon Hee Han
We propose an online algorithm to extract a human by foreground/background segmentation and to estimate the pose of the human from videos captured by moving cameras. We claim that a virtuous cycle can be created by appropriate interactions between the two modules to solve the individual problems. This joint estimation problem is divided into two subproblems, foreground/background segmentation and pose tracking, which alternate iteratively for optimization: the segmentation step generates a foreground mask for human pose tracking, and the pose tracking step provides a foreground response map for segmentation. The final solution is obtained when the iterative procedure converges. We evaluate our algorithm quantitatively and qualitatively on real videos involving various challenges, and demonstrate its outstanding performance compared to state-of-the-art techniques for segmentation and pose estimation.
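A toy, self-contained illustration of the alternating scheme under stated assumptions: segmentation and pose tracking are reduced to trivial stand-ins (thresholding and a centroid) purely to show how each step feeds the other until the loop stabilizes.

```python
import numpy as np

def segment(frame, prior):
    """Toy segmentation: threshold intensity, biased by the pose response prior."""
    return (frame + prior) > 1.0

def track_pose(frame, mask):
    """Toy pose tracker: centroid of the foreground mask as a stand-in pose."""
    ys, xs = np.nonzero(mask)
    return (ys.mean(), xs.mean()) if len(ys) else (frame.shape[0] / 2, frame.shape[1] / 2)

def response_map(pose, shape, sigma=5.0):
    """Foreground response around the estimated pose (a Gaussian bump)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return np.exp(-((yy - pose[0]) ** 2 + (xx - pose[1]) ** 2) / (2 * sigma ** 2))

frame = np.random.default_rng(2).random((32, 32))
prior = np.zeros_like(frame)
for _ in range(5):                       # alternate the two modules
    mask = segment(frame, prior)         # segmentation uses the pose response
    pose = track_pose(frame, mask)       # pose tracking uses the mask
    prior = response_map(pose, frame.shape)
```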
Workshop on Applications of Computer Vision | 2017
Kayoung Park; Seunghoon Hong; Mooyeol Baek; Bohyung Han
We propose an image aesthetic quality assessment algorithm that considers personal taste in addition to generally perceived preference. This problem is formulated as a combination of two different learning frameworks based on support vector machines: Support Vector Regression (SVR) and Ranking SVM (R-SVM), where SVR learns a general model from public datasets and R-SVM adjusts the model to accommodate personal preference obtained from user interactions. The combined framework, called R-SVR, is represented by a single objective function, which is optimized jointly to learn a model for personalized image aesthetic quality assessment. For the optimization, we use only a small subset of the public dataset identified by k-nearest-neighbor search instead of all available training data. This strategy is useful in practice because it reduces training time significantly and alleviates the data imbalance problem between regression and ranking. The proposed algorithm is tested through simulation and a user study, and we show that our interactive learning algorithm based on R-SVR is effective in increasing user satisfaction and improving prediction performance.
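A hedged sketch of what such a single combined objective could look like, with an epsilon-insensitive regression loss on the retrieved public-data neighbors and a pairwise ranking hinge loss on the user's comparisons; the exact formulation and regularization in the paper may differ.

```latex
\min_{\mathbf{w},\, b}\;\;
\frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\;+\; C_{r} \sum_{i \in \mathcal{D}_{\mathrm{knn}}}
      \max\!\bigl(0,\; \lvert y_{i} - \mathbf{w}^{\top}\phi(x_{i}) - b \rvert - \epsilon \bigr)
\;+\; C_{p} \sum_{(j,k) \in \mathcal{P}_{\mathrm{user}}}
      \max\!\bigl(0,\; 1 - \mathbf{w}^{\top}\bigl(\phi(x_{j}) - \phi(x_{k})\bigr)\bigr)
```

Here D_knn is the k-nearest-neighbor subset of the public dataset with aesthetic scores y_i, P_user holds user-provided pairs in which x_j is preferred over x_k, and the constants C_r and C_p trade the regression term off against the ranking term.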
International Conference on Machine Learning | 2015
Seunghoon Hong; Tackgeun You; Suha Kwak; Bohyung Han