Hyungtae Lee
University of Maryland, College Park
Publications
Featured research published by Hyungtae Lee.
computer vision and pattern recognition | 2011
Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms, with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not transfer effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions performed naturally by non-actors in continuously captured videos of the real world. The dataset includes large numbers of instances for 23 event types distributed throughout 29 hours of video. The data is accompanied by detailed annotations, including both moving object tracks and event examples, which provide a solid basis for large-scale evaluation. Additionally, we propose different evaluation modes for visual recognition tasks and evaluation metrics, along with our preliminary experimental results. We believe that this dataset will stimulate diverse aspects of computer vision research and help advance CVER tasks in the years ahead.
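To make the evaluation setting concrete, the sketch below outlines one plausible event-detection metric of the kind used for such benchmarks: predicted event intervals are greedily matched to ground-truth intervals of the same type by temporal overlap, and precision and recall are reported. The interval format, the overlap threshold, and the greedy matching rule are illustrative assumptions, not the exact evaluation modes or metrics defined in the paper.

```python
# Hypothetical sketch of a temporal-overlap evaluation for continuous event
# recognition; not the paper's metric.

def temporal_iou(a, b):
    """Intersection-over-union of two (start, end) frame intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def evaluate_events(predictions, ground_truth, iou_thresh=0.5):
    """predictions / ground_truth: lists of (event_type, start_frame, end_frame)."""
    matched_gt = set()
    true_pos = 0
    for p_type, p_start, p_end in predictions:
        # Greedily match each prediction to the best unmatched ground-truth event.
        best_iou, best_idx = 0.0, None
        for i, (g_type, g_start, g_end) in enumerate(ground_truth):
            if i in matched_gt or g_type != p_type:
                continue
            iou = temporal_iou((p_start, p_end), (g_start, g_end))
            if iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx is not None and best_iou >= iou_thresh:
            matched_gt.add(best_idx)
            true_pos += 1
    precision = true_pos / len(predictions) if predictions else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Example: one correct detection out of two predictions against two annotated events.
preds = [("person_loading_vehicle", 100, 220), ("person_running", 400, 450)]
gt = [("person_loading_vehicle", 110, 230), ("vehicle_turning", 500, 560)]
print(evaluate_events(preds, gt))  # -> (0.5, 0.5)
```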
Optical Engineering | 2009
Hyungtae Lee; Pyeong Gang Heo; Jung-Yeop Suk; Bo-Yeoun Yeou; HyunWook Park
Object tracking with the scale-invariant feature transform (SIFT) handles rotated or scaled targets and maintains good performance under occlusion or intensity changes. However, the SIFT algorithm has high computational complexity, and the template must be sufficiently large to extract enough features for matching. This paper proposes a scale-invariant object tracking method using strong corner points in the scale domain. The proposed method can track a smaller object than the SIFT tracker by extracting relatively more features from a target image. In the proposed method, strong features of the template image, which correspond to strong corner points in the scale domain, are selected. These strong template features are then matched with all features of the target image, and the matched features are used to estimate the relation between the template and target images. Experimental results show that the proposed method outperforms the existing SIFT tracker.
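As a rough illustration of the pipeline such a tracker builds on, the following sketch uses OpenCV's SIFT implementation to detect features, keep only the strongest template keypoints, match them against all target features, and estimate a homography from the matches. Keeping the highest-response keypoints is only a stand-in for the paper's selection of strong corner points in the scale domain; the exact selection criterion is an assumption here.

```python
import cv2
import numpy as np

def track_template(template_gray, target_gray, keep_strongest=100):
    sift = cv2.SIFT_create()
    kp_t, des_t = sift.detectAndCompute(template_gray, None)
    kp_s, des_s = sift.detectAndCompute(target_gray, None)
    if des_t is None or des_s is None or len(kp_s) < 2:
        return None  # not enough features to work with

    # Keep only the strongest template keypoints (a stand-in for the paper's
    # scale-domain corner selection).
    order = np.argsort([-k.response for k in kp_t])[:keep_strongest]
    kp_t = [kp_t[i] for i in order]
    des_t = des_t[order]

    # Match strong template features against all target features (Lowe's ratio test).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = [m for m, n in matcher.knnMatch(des_t, des_s, k=2)
               if m.distance < 0.75 * n.distance]
    if len(matches) < 4:
        return None  # too few matches to estimate the transform

    # Estimate the geometric relation between template and target from the matches.
    src = np.float32([kp_t[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_s[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # homography mapping template coordinates into the target image
```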
asian conference on computer vision | 2012
Hyungtae Lee; Vlad I. Morariu; Larry S. Davis
We present a discriminative deformable part model for the recovery of qualitative pose, inferring coarse pose labels (e.g., left, front-right, back), a task which we expect to be more robust to common confounding factors that hinder the inference of exact 2D or 3D joint locations. Our approach automatically selects parts that are predictive of qualitative pose and trains their appearance and deformation costs to best discriminate between qualitative poses. Unlike previous approaches, our parts are both selected and trained to improve qualitative pose discrimination and are shared by all the qualitative pose models. This leads to both increased accuracy and higher efficiency, since fewer part models are evaluated for each image. In comparisons with two state-of-the-art approaches on a public dataset, our model shows superior performance.
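The sketch below is a schematic of how a shared-part model can score coarse pose labels: every label reuses the same part responses but applies its own appearance weights and deformation costs, and the predicted label is the highest-scoring one. The feature layout, parameters, and toy inputs are placeholders, not the trained model from the paper.

```python
import numpy as np

def score_pose(part_features, anchors, placements, weights, deform_costs):
    """part_features: (P, D) responses of P shared part filters at their placements.
    anchors, placements: (P, 2) expected and observed part locations.
    weights: (P, D) per-label appearance weights; deform_costs: (P,) penalties."""
    appearance = np.sum(weights * part_features)
    displacement = np.sum(deform_costs * np.sum((placements - anchors) ** 2, axis=1))
    return appearance - displacement

def predict_label(part_features, placements, models):
    """models: dict mapping label -> (anchors, weights, deform_costs)."""
    scores = {label: score_pose(part_features, a, placements, w, d)
              for label, (a, w, d) in models.items()}
    return max(scores, key=scores.get)

# Toy example: two shared parts with three-dimensional appearance responses.
feats = np.array([[0.9, 0.1, 0.3], [0.2, 0.8, 0.5]])
placed = np.array([[10.0, 20.0], [30.0, 25.0]])
models = {
    "left":  (placed + 1.0, np.ones_like(feats), np.full(2, 0.01)),
    "front": (placed + 5.0, np.ones_like(feats), np.full(2, 0.01)),
}
print(predict_label(feats, placed, models))  # "left": smaller displacement penalty
```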
computer vision and pattern recognition | 2014
Hyungtae Lee; Vlad I. Morariu; Larry S. Davis
We propose the use of a robust pose feature based on part-based human detectors (Poselets) for the task of action recognition in relatively unconstrained videos, i.e., collected from the web. This feature, based on the original poselet activation vector, coarsely models pose and its transitions over time. Our main contributions are that we improve the original feature's compactness and discriminability by greedy set cover over subsets of joint configurations, and incorporate it into a unified video-based action recognition framework. Experiments show that the pose feature alone is extremely informative, yielding performance that matches most state-of-the-art approaches, but only when using our proposed improvements to its compactness and discriminability. By combining our pose feature with motion and shape, we outperform state-of-the-art approaches on two public datasets.
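For illustration, here is a minimal greedy set-cover routine of the kind that could be used to pick a compact subset of poselets: at each step, select the poselet whose covered joint configurations add the most uncovered elements. The universe and per-poselet coverage sets are toy placeholders, not the paper's actual data or objective.

```python
def greedy_set_cover(universe, coverage):
    """universe: set of joint-configuration ids to cover.
    coverage: dict mapping poselet id -> set of configuration ids it covers."""
    uncovered = set(universe)
    selected = []
    while uncovered:
        # Choose the poselet covering the most still-uncovered configurations.
        best = max(coverage, key=lambda p: len(coverage[p] & uncovered))
        gain = coverage[best] & uncovered
        if not gain:
            break  # remaining configurations cannot be covered
        selected.append(best)
        uncovered -= gain
    return selected

# Toy example: 5 joint configurations, 4 candidate poselets.
universe = {0, 1, 2, 3, 4}
coverage = {"p0": {0, 1}, "p1": {1, 2, 3}, "p2": {3, 4}, "p3": {0, 4}}
print(greedy_set_cover(universe, coverage))  # ['p1', 'p3'] covers everything
```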
advanced video and signal based surveillance | 2011
Sangmin Oh; Anthony Hoogs; A. G. Amitha Perera; Naresh P. Cuntoor; Chia-Chih Chen; Jong Taek Lee; Saurajit Mukherjee; Jake K. Aggarwal; Hyungtae Lee; Larry S. Davis; Eran Swears; Xiaoyang Wang; Qiang Ji; Kishore K. Reddy; Mubarak Shah; Carl Vondrick; Hamed Pirsiavash; Deva Ramanan; Jenny Yuen; Antonio Torralba; Bi Song; Anesco Fong; Amit K. Roy-Chowdhury; Mita Desai
Summary form only given. We present a concept for automatic construction site monitoring that takes into account 4D information (3D over time) acquired from highly overlapping digital aerial images. On the one hand, the maturity of today's micro aerial vehicles (MAVs) enables low-cost, efficient acquisition of high-quality imagery that maps construction sites entirely from many varying viewpoints. On the other hand, thanks to low-noise sensors and high redundancy in the image data, recent developments in 3D reconstruction workflows support the automatic computation of accurate and dense 3D scene information. Having both inexpensive high-quality image acquisition and an efficient 3D analysis workflow enables monitoring, documentation, and visualization of observed sites over time at short intervals. Relating the acquired 4D site observations, composed of color, texture, and geometry over time, largely supports automated methods toward full scene understanding and the capture of both scene change and construction site progress.
computer vision and pattern recognition | 2017
Christopher Reale; Hyungtae Lee; Heesung Kwon
In this work, we present three methods to improve a deep convolutional neural network approach to near-infrared heterogeneous face recognition. We first present a method to distill extra information from a pre-trained visible face network through its output logits. Next, we put forth an altered contrastive loss function that uses the ℓ1 norm instead of the ℓ2 norm as a distance metric. Finally, we propose to improve the initialization network by training it for more iterations. We present the results of experiments with these methods on two widely used near-infrared heterogeneous face recognition datasets and compare them to the state-of-the-art.
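As a sketch of the second idea, the PyTorch snippet below implements a contrastive loss that measures pair distance with the ℓ1 norm rather than the ℓ2 norm; the margin value, the squared penalty terms, and the mean reduction are assumptions for illustration rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def l1_contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """emb_a, emb_b: (N, D) embeddings of paired faces (e.g., visible / near-infrared).
    same_identity: (N,) float tensor, 1 if the pair shares an identity, else 0."""
    dist = torch.sum(torch.abs(emb_a - emb_b), dim=1)              # l1 distance per pair
    pos_term = same_identity * dist.pow(2)                         # pull matching pairs together
    neg_term = (1 - same_identity) * F.relu(margin - dist).pow(2)  # push mismatched pairs apart
    return 0.5 * (pos_term + neg_term).mean()
```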
british machine vision conference | 2015
Sungmin Eum; Hyungtae Lee; David S. Doermann
Imagine being in an art museum where paintings or pictures are held inside glass frames for protection. There are pieces you wish to capture with a camera, but you struggle to avoid the highlights generated by indoor lighting reflecting off the glossy surfaces. Similar problems occur when capturing content from whiteboards, documents printed on glossy paper, or objects such as books or CDs with plastic covers. In this work, we address the problem of removing unwanted highlight regions in images generated by reflections of light sources on glossy surfaces. Although there have been efforts to synthetically fill in the missing regions using neighboring patterns with methods such as inpainting [3, 4], it is impossible to recover the missing information in completely saturated regions. Therefore, we need to use multiple images in which the corresponding regions are not covered by the saturated highlights. Unlike other methods, our method uses the relationship between the highlight regions, resulting in more robust removal of saturated highlights.

Overview of our method. Our method is motivated by a widely acknowledged physical phenomenon referred to as motion parallax. Without loss of generality, we can view the relationship between the desired content (e.g., a painting) and the highlights in the same way. Since the highlights caused by the light source are the result of reflection on the glossy surface before reaching the camera, the light source can be modeled as virtually existing on the other side of the content. Note that the distance from the light source is always larger than the distance from the content (D > d, in Figure 1).
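To make the multi-image idea tangible, the following deliberately simplified sketch assumes several already-aligned photographs of the same glass-framed content, flags near-saturated pixels in each shot, and rebuilds the result from whichever shots are clean at each location. It does not implement the parallax-based reasoning about highlight regions that the paper relies on, and the saturation threshold is an illustrative assumption.

```python
import numpy as np

def remove_highlights(aligned_images, saturation_thresh=240):
    """aligned_images: list of HxWx3 uint8 arrays of the same scene (pre-registered)."""
    stack = np.stack(aligned_images).astype(np.float32)            # (K, H, W, 3)
    # A pixel is treated as a highlight if all channels are near saturation.
    highlight = np.all(stack >= saturation_thresh, axis=-1)        # (K, H, W)
    weights = (~highlight).astype(np.float32)[..., None]           # prefer clean observations
    # If a pixel is saturated in every shot, fall back to averaging all shots.
    weights = np.where(weights.sum(axis=0, keepdims=True) == 0, 1.0, weights)
    # Weighted average over the stack, ignoring saturated observations.
    result = (stack * weights).sum(axis=0) / weights.sum(axis=0)
    return result.astype(np.uint8)
```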
Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII | 2018
Hyungtae Lee; Sungmin Eum; Heesung Kwon
Terror attacks are often targeted at civilians gathered in one location (e.g., the Boston Marathon bombing). Distinguishing such 'malicious' scenes from 'normal' ones, which are semantically different, is a difficult task because both contain large groups of people with high visual similarity. To overcome this difficulty, previous methods exploited various contextual information, such as language-driven keywords or relevant objects. Although useful, these require additional human effort or datasets. In this paper, we show that using more sophisticated and deeper convolutional neural networks (CNNs) can achieve better classification accuracy even without using any additional information outside the image domain. We conducted a comparative study in which we train and compare seven different CNN architectures (AlexNet, VGG-M, VGG16, GoogLeNet, ResNet-50, ResNet-101, and ResNet-152). Based on the experimental analyses, we found that deeper networks typically show better accuracy, and that GoogLeNet is the most favorable among the seven architectures for the task of malicious event classification.
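As an example of how one of the compared architectures could be adapted to this two-class task, the sketch below loads an ImageNet-pretrained ResNet-50 from torchvision, replaces its classification head, and runs one training step on a dummy batch; the weights variant, optimizer settings, and batch contents are assumptions rather than the exact protocol of the study.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes=2):
    net = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    net.fc = nn.Linear(net.fc.in_features, num_classes)  # replace the ImageNet head
    return net

model = build_classifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One illustrative training step on a dummy batch (real data loading omitted).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```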
workshop on applications of computer vision | 2015
Hyungtae Lee; Vlad I. Morariu; Larry S. Davis
arXiv: Computer Vision and Pattern Recognition | 2016
Joel Levis; Hyungtae Lee; Heesung Kwon; James Michaelis; Michael Kolodny; Sungmin Eum