Publication


Featured research published by Donggeun Yoo.


International Conference on Computer Vision | 2015

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

Donggeun Yoo; Sunggyun Park; Joon-Young Lee; Anthony S. Paek; In So Kweon

We present a novel detection method using a deep convolutional neural network (CNN), named AttentionNet. We cast the object detection problem as an iterative classification problem, the form most suitable for a CNN. AttentionNet provides quantized weak directions pointing to a target object, and the ensemble of these iterative predictions converges to an accurate bounding box. Since AttentionNet is a unified network for object detection, it detects objects without any separate models for object proposal or post-hoc bounding-box regression. We evaluate AttentionNet on a human detection task and achieve state-of-the-art performance of 65% AP on PASCAL VOC 2007/2012 with only an 8-layer architecture.
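
To make the iterative-classification idea concrete, here is a minimal sketch of the refinement loop. The classify_directions callable is a hypothetical stand-in for the trained network, and the step size and move set are illustrative assumptions, not the paper's settings.

```python
# Sketch of AttentionNet-style iterative box refinement (illustrative only).
# `classify_directions` stands in for the CNN: given an image crop, it returns
# a quantized direction for each box corner, or "stop" when that corner is
# judged to lie on the object boundary. `image` is assumed to be a numpy array.

STEP = 8  # pixels moved per iteration, an assumed hyperparameter

MOVES = {
    "right": (STEP, 0), "left": (-STEP, 0),
    "down": (0, STEP), "up": (0, -STEP),
    "stop": (0, 0),
}

def refine_box(image, box, classify_directions, max_iters=50):
    """Iteratively adjust a box until both corners predict 'stop'."""
    x1, y1, x2, y2 = box
    for _ in range(max_iters):
        crop = image[y1:y2, x1:x2]
        tl_dir, br_dir = classify_directions(crop)  # two weak directions
        if tl_dir == "stop" and br_dir == "stop":
            break  # the sequence of predictions has converged on the object
        dx1, dy1 = MOVES[tl_dir]
        dx2, dy2 = MOVES[br_dir]
        x1, y1, x2, y2 = x1 + dx1, y1 + dy1, x2 + dx2, y2 + dy2
    return x1, y1, x2, y2
```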


Computer Vision and Pattern Recognition | 2015

Multi-scale pyramid pooling for deep convolutional representation

Donggeun Yoo; Sunggyun Park; Joon-Young Lee; In So Kweon

Compared to image representations based on low-level local descriptors, the deep neural activations of Convolutional Neural Networks (CNNs) are richer in mid-level representation but poorer in geometric invariance. In this paper, we present a straightforward framework for better image representation that combines the two approaches. To take advantage of both, we extract a large number of multi-scale dense local activations from a pre-trained CNN. We then aggregate the activations with the Fisher kernel framework, modified with a simple scale-wise normalization that is essential to make it suitable for CNN activations. Our representation achieves new state-of-the-art performance on three public datasets: 80.78% (Acc.) on MIT Indoor 67, 83.20% (mAP) on PASCAL VOC 2007, and 91.28% (Acc.) on Oxford 102 Flowers. The results suggest that our proposal can serve as a primary image representation for better performance across a wide range of visual recognition tasks.
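
As a rough illustration of the aggregation pipeline, the sketch below applies a scale-wise normalization and then a simplified Fisher vector (mean-gradient terms only) using scikit-learn's GaussianMixture. The function names and the exact normalization are assumptions, not the paper's implementation.

```python
# Simplified sketch of scale-wise normalization followed by Fisher-vector
# aggregation of dense CNN activations. Only the gradient w.r.t. the GMM
# means is computed; the full Fisher kernel also includes variance terms.
import numpy as np
from sklearn.mixture import GaussianMixture

def scale_wise_normalize(acts_per_scale):
    """L2-normalize the activation set of each scale independently,
    so no single scale dominates the aggregated representation."""
    out = []
    for acts in acts_per_scale:          # acts: (n_patches, dim) at one scale
        out.append(acts / (np.linalg.norm(acts) + 1e-12))
    return np.vstack(out)

def fisher_vector(local_feats, gmm):
    """Mean-gradient part of the Fisher vector for a set of local features."""
    q = gmm.predict_proba(local_feats)                 # (n, K) soft assignments
    diff = local_feats[:, None, :] - gmm.means_[None]  # (n, K, dim)
    fv = (q[:, :, None] * diff / np.sqrt(gmm.covariances_[None])).sum(0)
    fv /= local_feats.shape[0] * np.sqrt(gmm.weights_)[:, None]
    fv = fv.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))             # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)           # L2 normalization

# usage sketch: activations extracted at three scales from a pre-trained CNN
acts = [np.random.randn(50, 64), np.random.randn(20, 64), np.random.randn(5, 64)]
X = scale_wise_normalize(acts)
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(X)
image_repr = fisher_vector(X, gmm)
```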


European Conference on Computer Vision | 2016

Pixel-Level Domain Transfer

Donggeun Yoo; Namil Kim; Sunggyun Park; Anthony S. Paek; In So Kweon

We present an image-conditional image generation model. The model transfers an input domain to a target domain at the semantic level and generates the target image at the pixel level. To generate realistic target images, we employ a real/fake discriminator as in Generative Adversarial Nets, but also introduce a novel domain discriminator that keeps the generated image relevant to the input image. We verify our model on the challenging task of generating a piece of clothing from an input image of a dressed person. We present a high-quality clothing dataset containing the two domains and demonstrate decent results.
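
A minimal PyTorch sketch of the two-discriminator objective described above. The tiny placeholder networks, layer sizes, and the use of shuffled targets as irrelevant pairs are illustrative assumptions, and source and target images are assumed to share spatial dimensions.

```python
# Sketch of a converter C trained against a real/fake discriminator D_R and a
# domain discriminator D_A that judges whether a (source, target) pair is
# associated. Architectures here are placeholders, not the paper's networks.
import torch
import torch.nn as nn

def disc(in_ch):  # placeholder discriminator: conv -> global pool -> sigmoid
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1), nn.Sigmoid())

C = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Tanh())  # placeholder converter
D_R, D_A = disc(3), disc(6)
bce = nn.BCELoss()
opt_c = torch.optim.Adam(C.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(list(D_R.parameters()) + list(D_A.parameters()), lr=2e-4)

def train_step(src, tgt):
    """One adversarial step: src is a dressed-person image, tgt the clothing."""
    fake = C(src)
    real_lbl = torch.ones(src.size(0), 1)
    fake_lbl = torch.zeros(src.size(0), 1)

    # discriminators: real vs. generated target, associated vs. irrelevant pair
    shuffled = tgt[torch.randperm(tgt.size(0))]  # approximate irrelevant pairs
    d_loss = (bce(D_R(tgt), real_lbl) + bce(D_R(fake.detach()), fake_lbl)
              + bce(D_A(torch.cat([src, tgt], 1)), real_lbl)
              + bce(D_A(torch.cat([src, shuffled], 1)), fake_lbl)
              + bce(D_A(torch.cat([src, fake.detach()], 1)), fake_lbl))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # converter: fool both discriminators
    g_loss = (bce(D_R(fake), real_lbl)
              + bce(D_A(torch.cat([src, fake], 1)), real_lbl))
    opt_c.zero_grad(); g_loss.backward(); opt_c.step()

train_step(torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64))
```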


International Conference on Computer Vision | 2015

Learning a Deep Convolutional Network for Light-Field Image Super-Resolution

Youngjin Yoon; Hae-Gon Jeon; Donggeun Yoo; Joon-Young Lee; In So Kweon

Commercial light-field cameras provide spatial and angular information, but their limited resolution is a significant problem in practical use. In this paper, we present a novel method for light-field image super-resolution (SR) via a deep convolutional neural network. Rather than using a conventional optimization framework, we adopt a data-driven learning method to simultaneously up-sample both the angular and spatial resolution of a light-field image. We first augment the spatial resolution of each sub-aperture image with a spatial SR network to enhance details. Novel views between the sub-aperture images are then generated by an angular super-resolution network. These networks are trained independently and finally fine-tuned via end-to-end training. The proposed method shows state-of-the-art performance on the HCI synthetic dataset and is further evaluated on challenging real-world applications, including refocusing and depth map estimation.
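
The two-stage structure can be sketched as follows, assuming PyTorch; the SRCNN-like layer configuration and the two-view angular input are illustrative simplifications, not the paper's networks.

```python
# Stage 1: a spatial SR network up-samples each sub-aperture view.
# Stage 2: an angular SR network synthesizes a novel view between two
# super-resolved neighbors. Layer sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialSR(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 1, 5, padding=2))
    def forward(self, x, scale=2):
        x = F.interpolate(x, scale_factor=scale, mode="bicubic")
        return x + self.body(x)   # residual detail enhancement

class AngularSR(nn.Module):
    """Predicts the in-between view from two neighboring sub-aperture views."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(2, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 1, 5, padding=2))
    def forward(self, left, right):
        return self.body(torch.cat([left, right], 1))

spatial, angular = SpatialSR(), AngularSR()
views = torch.randn(2, 1, 32, 32)    # two neighboring sub-aperture views
sr = spatial(views)                  # stage 1: spatial super-resolution
novel = angular(sr[0:1], sr[1:2])    # stage 2: novel-view synthesis
```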


IEEE Signal Processing Letters | 2017

Light-Field Image Super-Resolution Using Convolutional Neural Network

Youngjin Yoon; Hae-Gon Jeon; Donggeun Yoo; Joon-Young Lee; In So Kweon

Commercial light field cameras provide spatial and angular information, but their limited resolution is a significant problem in practical use. In this letter, we present a novel method for light field image super-resolution (SR) that simultaneously up-samples both the spatial and angular resolution of a light field image via a deep convolutional neural network. We first augment the spatial resolution of each sub-aperture image with a spatial SR network; novel views between the super-resolved sub-aperture images are then generated by three different angular SR networks according to the novel view locations. We improve both training efficiency and the quality of angular SR results through weight sharing. In addition, we provide a new light field image dataset for training and validating the network. We train the whole network end-to-end and show state-of-the-art performance in quantitative and qualitative evaluations.
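
One way to realize the weight sharing described above is a shared trunk with location-specific heads, sketched below in PyTorch. Feeding all three view locations a four-view input is an assumed simplification, since the paper's horizontal and vertical networks see different neighbor sets.

```python
# Sketch: three angular SR "networks" realized as one shared feature trunk
# plus three small heads, one per novel-view location. Sizes are illustrative.
import torch
import torch.nn as nn

class SharedAngularSR(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(          # weights shared by all three heads
            nn.Conv2d(4, 64, 9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU())
        self.heads = nn.ModuleDict({         # one head per novel-view location
            loc: nn.Conv2d(32, 1, 5, padding=2)
            for loc in ("horizontal", "vertical", "central")})

    def forward(self, views, location):
        # views: super-resolved sub-aperture neighbors stacked as channels
        return self.heads[location](self.trunk(views))

net = SharedAngularSR()
neighbors = torch.randn(1, 4, 64, 64)        # (batch, views, H, W)
novel_center = net(neighbors, "central")
```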


Pacific-Rim Symposium on Image and Video Technology | 2017

Intelligent Assistant for People with Low Vision Abilities

Oleksandr Bogdan; Oleg Yurchenko; Oleksandr Bailo; Francois Rameau; Donggeun Yoo; In So Kweon

This paper proposes a wearable system that visually impaired people can use to obtain extensive feedback about their surrounding environment. Our system consists of a stereo camera and smartglasses communicating with a smartphone, which serves as an intermediary computational device. The system is further connected to a server where all the expensive computations are executed. The whole setup can detect obstacles in the immediate surroundings, recognize faces and facial expressions, read text, and provide a generic description and question answering for a given input image. In addition, we propose a novel depth question-answering system that estimates object size and relative position in an unconstrained environment, in near real time and fully automatically, requiring only a stereo image pair and a voice request as input. We have conducted a series of experiments to evaluate the feasibility and practicality of the proposed system, which shows promising results for assisting visually impaired people.
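
For the depth question-answering component, the underlying stereo geometry reduces to two standard pinhole relations, sketched below with made-up calibration numbers; this is a back-of-the-envelope illustration, not the system's actual estimator.

```python
# Stereo geometry behind depth question answering: disparity gives metric
# depth, and depth plus the object's pixel extent gives physical size.
FOCAL_PX = 700.0     # focal length in pixels (assumed calibration)
BASELINE_M = 0.12    # stereo baseline in meters (assumed)

def depth_from_disparity(disparity_px):
    """Standard pinhole stereo relation: Z = f * B / d."""
    return FOCAL_PX * BASELINE_M / disparity_px

def object_size(depth_m, extent_px):
    """Physical extent from similar triangles: S = Z * s_px / f."""
    return depth_m * extent_px / FOCAL_PX

z = depth_from_disparity(35.0)   # a 35 px disparity -> 2.40 m away
print(f"depth: {z:.2f} m, height: {object_size(z, 220.0):.2f} m")
```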


International Conference on Ubiquitous Robots and Ambient Intelligence | 2016

Sentence learning on deep convolutional networks for image Caption Generation

Dong-Jin Kim; Donggeun Yoo; Bonggeun Sim; In So Kweon

We propose a method to train image representations that improve the performance of image caption generation. We apply transfer learning to a CNN using sentence data and extract image representations with a deep Fisher kernel. With this representation, we generate sentences with gLSTM and show improved performance. In the experiments, we validate each step of our method by comparing against previous state-of-the-art models.
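
The gLSTM guidance mechanism can be sketched as a standard LSTM cell whose input is concatenated with a fixed image-derived guidance vector at every time step. The PyTorch module below is an illustrative reconstruction with assumed dimensions, not the authors' code.

```python
# Sketch of a guided LSTM (gLSTM): the guidance vector g (here, the image
# representation) is appended to the word embedding at each decoding step.
import torch
import torch.nn as nn

class GuidedLSTM(nn.Module):
    def __init__(self, word_dim=128, guide_dim=256, hidden=512, vocab=10000):
        super().__init__()
        self.cell = nn.LSTMCell(word_dim + guide_dim, hidden)
        self.embed = nn.Embedding(vocab, word_dim)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, words, g):
        """words: (T, batch) token ids; g: (batch, guide_dim) image guidance."""
        h = torch.zeros(g.size(0), self.cell.hidden_size)
        c = torch.zeros_like(h)
        logits = []
        for t in range(words.size(0)):
            x = torch.cat([self.embed(words[t]), g], dim=1)  # guidance every step
            h, c = self.cell(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits)

net = GuidedLSTM()
logits = net(torch.randint(0, 10000, (7, 2)), torch.randn(2, 256))
```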


International Conference on Ubiquitous Robots and Ambient Intelligence | 2015

Relative attributes with deep Convolutional Neural Network

Dong-Jin Kim; Donggeun Yoo; Sunghoon Im; Namil Kim; Tharatch Sirinukulwattana; In So Kweon

Our work builds on the idea of relative attributes, which aims to provide more descriptive information about images. We propose a model that integrates the relative-attribute framework with a deep Convolutional Neural Network (CNN) to increase the accuracy of attribute comparison. In addition, we analyze the role of each network layer in the process. Our model uses features extracted from the CNN and is trained with the Rank SVM method on these feature vectors. As a result, our model outperforms the original relative-attribute model with a significant improvement in accuracy.
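
The ranking step can be illustrated with the classic RankSVM reduction, which trains a linear SVM on feature differences; the sketch below uses scikit-learn and randomly generated stand-ins for the CNN features.

```python
# RankSVM reduction: train a linear SVM on feature differences, so that
# w . (x_i - x_j) > 0 means image i exhibits the attribute more than image j.
import numpy as np
from sklearn.svm import LinearSVC

def train_rank_svm(feats, ordered_pairs):
    """feats: (n, d) CNN features; ordered_pairs: list of (i, j) with i > j."""
    diffs = np.array([feats[i] - feats[j] for i, j in ordered_pairs])
    X = np.vstack([diffs, -diffs])                 # symmetrize the problem
    y = np.concatenate([np.ones(len(diffs)), -np.ones(len(diffs))])
    svm = LinearSVC(C=1.0).fit(X, y)
    return svm.coef_.ravel()                       # ranking direction w

feats = np.random.randn(20, 64)                    # stand-in for CNN features
pairs = [(0, 1), (2, 3), (5, 4)]                   # "left image has more of it"
w = train_rank_svm(feats, pairs)
scores = feats @ w                                 # higher = stronger attribute
```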


International Conference on Ubiquitous Robots and Ambient Intelligence | 2015

Rich feature hierarchies from omni-directional RGB-DI information for pedestrian detection

Seokju Lee; Sungsik Huh; Donggeun Yoo; In So Kweon; David Hyunchul Shim

In this paper, we propose an omni-directional pedestrian detection method that uses color, depth, and laser intensity (RGB-DI) information by fusing two different sensors: a catadioptric camera and a 3D LiDAR scanner. Our method builds on Regions with Convolutional Neural Network features (R-CNN), the state-of-the-art object detection method at the time. The problem with R-CNN is that it incurs long computation times over omni-directional searches. By fusing the two sensors, we reduce the number of candidate regions, cut the overall computation time to less than half, and achieve better performance in an outdoor environment.
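
The candidate-pruning idea can be sketched as a simple density test on LiDAR points projected into the omnidirectional image; the projection itself and the point threshold are assumed, and this is only an illustration of the fusion step.

```python
# Prune region proposals before the expensive R-CNN stage: only regions
# containing enough projected LiDAR returns are passed to the classifier.
import numpy as np

def prune_proposals(proposals, lidar_px, min_points=20):
    """proposals: (N, 4) boxes as x1,y1,x2,y2; lidar_px: (M, 2) projected points."""
    kept = []
    for x1, y1, x2, y2 in proposals:
        inside = ((lidar_px[:, 0] >= x1) & (lidar_px[:, 0] <= x2) &
                  (lidar_px[:, 1] >= y1) & (lidar_px[:, 1] <= y2))
        if inside.sum() >= min_points:   # enough 3D evidence -> worth running CNN
            kept.append((x1, y1, x2, y2))
    return np.array(kept)
```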


International World Wide Web Conference | 2014

PRISM: a system for weighted multi-color browsing of fashion products

Donggeun Yoo; Kyunghyun Paeng; Sunggyun Park; Jungin Lee; Seungwook Paek; Sung-Eui Yoon; In So Kweon

Multiple-color search technology helps users find fashion products in a more intuitive manner. Although fashion product images can be represented not only by a set of dominant colors but also by the relative ratios of those colors, current online fashion shopping malls often provide rather simple color filters. In this demo, we present PRISM (Perceptual Representation of Image SiMilarity), a weighted multi-color browsing system for fashion product retrieval. Our system combines widely accepted back-end web service stacks with various computer vision techniques, including product-area parsing and a compact yet effective multi-color description. Finally, we demonstrate the benefits of PRISM via a web service in which users freely browse fashion products.
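
A plausible sketch of a weighted multi-color description: k-means yields dominant colors and their pixel ratios, and two signatures are compared with a weight-aware distance. This illustrates the general idea only and is not PRISM's actual descriptor.

```python
# Dominant colors + pixel ratios as a compact color signature, compared by a
# weight-weighted average of pairwise color distances (a crude EMD stand-in).
import numpy as np
from sklearn.cluster import KMeans

def color_signature(pixels_lab, k=5):
    """pixels_lab: (n, 3) Lab pixels from the parsed product area."""
    km = KMeans(n_clusters=k, n_init=10).fit(pixels_lab)
    weights = np.bincount(km.labels_, minlength=k) / len(pixels_lab)
    return km.cluster_centers_, weights            # dominant colors + ratios

def signature_distance(sig_a, sig_b):
    (ca, wa), (cb, wb) = sig_a, sig_b
    cost = np.linalg.norm(ca[:, None] - cb[None], axis=2)  # pairwise color dist
    return (wa[:, None] * wb[None] * cost).sum()           # weight-aware distance

query = color_signature(np.random.rand(1000, 3))
item = color_signature(np.random.rand(1000, 3))
print(signature_distance(query, item))
```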
