Tae Eun Choe
University of Southern California
Publications
Featured research published by Tae Eun Choe.
IEEE MultiMedia | 2014
Kewei Tu; Meng Meng; Mun Wai Lee; Tae Eun Choe; Song-Chun Zhu
This article proposes a multimedia analysis framework to process video and text jointly for understanding events and answering user queries. The framework produces a parse graph that represents the compositional structures of spatial information (objects and scenes), temporal information (actions and events), and causal information (causalities between events and fluents) in the video and text. The knowledge representation of the framework is based on a spatial-temporal-causal AND-OR graph (S/T/C-AOG), which jointly models possible hierarchical compositions of objects, scenes, and events as well as their interactions and mutual contexts, and specifies the prior probabilistic distribution of the parse graphs. The authors present a probabilistic generative model for joint parsing that captures the relations between the input video/text, their corresponding parse graphs, and the joint parse graph. Based on the probabilistic model, the authors propose a joint parsing system consisting of three modules: video parsing, text parsing, and joint inference. Video parsing and text parsing produce two parse graphs from the input video and text, respectively. The joint inference module produces a joint parse graph by performing matching, deduction, and revision on the video and text parse graphs. The proposed framework has the following objectives: to provide deep semantic parsing of video and text that goes beyond the traditional bag-of-words approaches; to perform parsing and reasoning across the spatial, temporal, and causal dimensions based on the joint S/T/C-AOG representation; and to show that deep joint parsing facilitates subsequent applications such as generating narrative text descriptions and answering queries in the form of who, what, when, where, and why. The authors empirically evaluated the system based on comparison against ground truth as well as accuracy of query answering and obtained satisfactory results.
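As a rough illustration of the data structures involved, the sketch below shows a minimal parse-graph node and a toy matching step between video and text parse graphs in Python. The class names, labels, and label-equality matching rule are simplified assumptions for illustration only, not the paper's actual S/T/C-AOG machinery, which also performs deduction and revision during joint inference.

```python
# Minimal sketch of a joint parse graph; node kinds and labels are
# illustrative assumptions, not the paper's S/T/C-AOG implementation.
from dataclasses import dataclass, field

@dataclass
class ParseNode:
    label: str                       # e.g. "person" (spatial) or "enter" (temporal)
    kind: str                        # "spatial", "temporal", or "causal"
    children: list = field(default_factory=list)  # AND-OR decompositions

def match_graphs(video_nodes, text_nodes):
    """Toy joint-inference step: pair up video/text nodes whose labels and
    kinds agree (the paper additionally performs deduction and revision)."""
    return [(v, t) for v in video_nodes for t in text_nodes
            if v.label == t.label and v.kind == t.kind]

video = [ParseNode("person", "spatial"), ParseNode("enter", "temporal")]
text = [ParseNode("person", "spatial"), ParseNode("exit", "temporal")]
print(match_graphs(video, text))     # only the "person" nodes match
```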
international conference on pattern recognition | 2006
Tae Eun Choe; Isaac Cohen; Mun Wai Lee; Gérard G. Medioni
We present a method to construct a mosaic from multiple color and fluorescein retinal images. A set of images taken from different views at different times is difficult to register sequentially due to variations in color and intensity across images. We propose a method to register images globally in order to minimize the registration error and to find optimal registration pairs. The reference frame that gives the minimum registration error is found by the Floyd-Warshall all-pairs shortest-path algorithm, and all other images are registered to this reference frame using an affine transformation model. We present experimental results to validate the proposed method.
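The reference-frame selection step can be sketched compactly. Assuming a precomputed matrix err[i][j] of pairwise registration errors (np.inf where a pair cannot be registered directly), this minimal Python sketch relaxes the errors with Floyd-Warshall and picks the frame with the smallest accumulated error; the toy error values and the sum criterion are illustrative assumptions.

```python
# Reference-frame selection via Floyd-Warshall; err values are toy inputs.
import numpy as np

def pick_reference(err):
    n = err.shape[0]
    d = err.copy()
    np.fill_diagonal(d, 0.0)
    for k in range(n):                       # Floyd-Warshall relaxation
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    total = d.sum(axis=0)                    # accumulated error to each frame
    return int(np.argmin(total)), d

err = np.array([[0.0,    1.0, np.inf],
                [1.0,    0.0, 0.5],
                [np.inf, 0.5, 0.0]])
ref, dist = pick_reference(err)
print(ref)   # -> 1: the middle frame minimizes the accumulated error
```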
computer vision and pattern recognition | 2010
Tae Eun Choe; Mun Wai Lee; Niels Haering
We propose a new traffic analysis framework using existing traffic camera networks. The framework integrates vehicle detection and image-based matching methods with geographic context to match vehicles across different views and analyze traffic. This is a challenging problem due to the low frame rate of traffic cameras and the large distance between views. A vehicle may not always appear in a camera because of the large inter-frame interval or occlusion between vehicles. We applied the proposed method to a traffic camera network to detect and track key vehicles and analyze traffic conditions. Vehicles are detected using a multi-view approach. By integrating camera calibration information and GIS data, we extract traffic lane information and prior knowledge of the expected vehicle orientation and image size at each image location. This improves detection speed and reduces false alarms by discarding detections with unlikely scales and orientations. Subsequently, detected vehicles are matched across cameras using a view-invariant appearance model. For more accurate vehicle matching, traffic patterns observed at two sets of cameras are temporally aligned. Finally, key vehicles are globally tracked across cameras using the max-flow/min-cut network tracking algorithm. Traffic conditions at each camera location are presented on a map.
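The calibration/GIS prior can be pictured as a simple gate on raw detections. The Python sketch below keeps only detections whose size and orientation are close to the priors at their image location; the box format, prior callables, and tolerances are hypothetical stand-ins for the paper's actual priors.

```python
# Scale/orientation gating with per-location priors; names and
# thresholds are illustrative assumptions.
def gate_detections(detections, expected_height, expected_angle,
                    scale_tol=0.5, angle_tol=30.0):
    """detections: list of (x, y, h, angle_deg) boxes; expected_height
    and expected_angle: callables mapping (x, y) to priors."""
    kept = []
    for (x, y, h, ang) in detections:
        h0 = expected_height(x, y)
        a0 = expected_angle(x, y)
        if abs(h - h0) / h0 <= scale_tol and abs(ang - a0) % 180.0 <= angle_tol:
            kept.append((x, y, h, ang))
    return kept

# Toy flat-ground prior: apparent height grows with image row y
dets = [(100, 400, 60, 5.0), (100, 400, 6, 5.0)]   # second box is far too small
print(gate_detections(dets, lambda x, y: y * 0.15, lambda x, y: 0.0))
```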
computer vision and pattern recognition | 2006
Tae Eun Choe; Isaac Cohen; Gérard G. Medioni
We present a method for 3-D shape reconstruction of the retinal fundus from fluorescein images. Our method extracts the locations of vessel bifurcations as reliable features for estimating the epipolar geometry using a plane-and-parallax approach. The proposed solution robustly estimates the fundamental matrix for nearly planar surfaces, such as the retinal fundus. We propose the use of a mutual information criterion for accurate estimation of the disparity maps, where the matched Y-features are used to automatically estimate the bounds of the disparity search range. Our experimental results validate the proposed method on sets of difficult fluorescein image pairs.
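The mutual information criterion itself is easy to sketch. Assuming 8-bit image patches and a simple joint-histogram estimator (an illustrative choice, not necessarily the paper's exact estimator), a minimal version in Python:

```python
# Mutual information between two patches from a joint histogram;
# the 32-bin quantization is an assumed, illustrative choice.
import numpy as np

def mutual_information(patch_a, patch_b, bins=32):
    joint, _, _ = np.histogram2d(patch_a.ravel(), patch_b.ravel(),
                                 bins=bins, range=[[0, 256], [0, 256]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                   # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

a = np.random.randint(0, 256, (21, 21))
print(mutual_information(a, a))                                    # high: identical patches
print(mutual_information(a, np.random.randint(0, 256, (21, 21))))  # low: unrelated patches
```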
International Journal of Pattern Recognition and Artificial Intelligence | 2010
Xiaochun Cao; Lin Wu; Zeeshan Rasheed; Haiying Liu; Tae Eun Choe; Feng Guo; Niels Haering
This paper proposes a new solution for geo-registering nearly featureless maritime video feeds. We detect the horizon using sizable or uniformly moving vessels, and estimate the vertical apex using water reflections of street lamps. The computed horizon and apex provide a metric rectification that removes the affine distortions and reduces the search space for geo-registration. Geo-registration is obtained by searching for the orientation at which the estimated water masks in satellite images and camera views best match. The proposed solution makes the following contributions: First, water and coastlines are used as features for registration between horizontally looking maritime views and satellite images. Second, water reflections are proposed for estimating the vertical vanishing point. Third, we give algorithms for the detection of water areas in both satellite images and camera views. Experimental results and applications to cross-camera tracking are demonstrated. We also discuss several observations, as well as limitations of the proposed approach.
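One building block of the rectification can be sketched under standard projective-geometry assumptions: a homography that sends the detected horizon (a vanishing line) to the line at infinity removes the projective component of the distortion, in the usual Hartley-Zisserman construction; the paper's full metric rectification additionally uses the vertical apex. The horizon coordinates below are toy values.

```python
# Send the horizon line to the line at infinity; toy horizon values.
import numpy as np

def projective_rectify_h(horizon):
    """horizon: homogeneous line (l1, l2, l3) with l3 != 0."""
    l1, l2, l3 = np.asarray(horizon, dtype=float) / horizon[2]
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [l1,  l2,  l3]])   # maps the horizon to (0, 0, 1)^T

H = projective_rectify_h((0.0, 1.0, -240.0))   # horizontal horizon at row 240
p = H @ np.array([100.0, 200.0, 1.0])          # warp an image point
print(p / p[2])                                # points on the horizon map to infinity
```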
international conference on computer vision | 2011
Tae Eun Choe; Zeeshan Rasheed; Geoffrey Taylor; Niels Haering
We propose a general framework for multiple-target tracking across multiple cameras using max-flow networks. The framework integrates target detection, tracking, and classification from each camera and obtains the cross-camera trajectory of each target. The global data association problem is formulated as a maximum a posteriori (MAP) problem and represented by a flow network. Similarities of time, location, size, and appearance (classification and color histogram) of targets across cameras are provided as inputs to the network, and each target's optimal cross-camera trajectory is found using the max-flow algorithm. The implemented system is designed for real-time processing of high-resolution video (10 MB per frame). The framework is validated on high-resolution camera networks with both overlapping and non-overlapping fields of view in urban scenes.
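As a hedged illustration of the flow-network formulation, the toy sketch below uses networkx (an assumed stand-in for the authors' implementation) and solves a min-cost max-flow, one common way such MAP association problems are optimized. Edge weights stand in for negative log-similarities of time, location, size, and appearance; all node names and costs are made up.

```python
# Toy cross-camera association as min-cost max-flow; costs are invented.
import networkx as nx

G = nx.DiGraph()
G.add_edge("S", "camA_obs1", capacity=1, weight=0)
G.add_edge("S", "camA_obs2", capacity=1, weight=0)
G.add_edge("camA_obs1", "camB_obs1", capacity=1, weight=2)   # similar target
G.add_edge("camA_obs1", "camB_obs2", capacity=1, weight=9)   # dissimilar
G.add_edge("camA_obs2", "camB_obs2", capacity=1, weight=3)
G.add_edge("camB_obs1", "T", capacity=1, weight=0)
G.add_edge("camB_obs2", "T", capacity=1, weight=0)

flow = nx.max_flow_min_cost(G, "S", "T")
links = [(u, v) for u in flow for v, f in flow[u].items() if f > 0]
print(links)   # obs1->camB_obs1 and obs2->camB_obs2 form the two trajectories
```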
Medical Image Analysis | 2008
Tae Eun Choe; Gérard G. Medioni; Isaac Cohen; Alexander C. Walsh; Srinivas R. Sadda
This study presents methods for 2-D registration of retinal image sequences and 3-D shape inference from fluorescein images. The Y-feature is a robust geometric entity that is largely invariant across modalities as well as across the temporal grey-level variations induced by the propagation of the dye in the vessels. We first present a Y-feature extraction method that finds a set of Y-feature candidates using local image gradient information. A gradient-based approach is then used to align an articulated model of the Y-feature to the candidates more accurately while optimizing a cost function. Using mutual information, fitted Y-features are subsequently matched across images, including color and fluorescein angiographic frames, for registration. To reconstruct the retinal fundus in 3-D, the extracted Y-features are used to estimate the epipolar geometry with a plane-and-parallax approach. The proposed solution provides a robust estimation of the fundamental matrix suitable for plane-like surfaces, such as the retinal fundus. The mutual information criterion is used to accurately estimate the dense disparity map. Our experimental results validate the proposed method on a set of difficult fluorescein image pairs.
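The candidate-finding step might be approximated as follows. This rough Python sketch flags windows whose gradient orientations show three or more dominant directions, a simplified stand-in for the paper's articulated-model fitting; the window size, binning, and vote thresholds are assumptions.

```python
# Rough Y-feature candidate detector from local gradient orientations;
# a simplified stand-in, not the paper's articulated-model alignment.
import numpy as np

def y_candidates(img, win=15, modes=3, min_votes=20):
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = (np.degrees(np.arctan2(gy, gx)) + 180.0) % 180.0   # undirected edges
    cands = []
    h, w = img.shape
    for y in range(0, h - win, win):
        for x in range(0, w - win, win):
            m = mag[y:y + win, x:x + win].ravel()
            a = ang[y:y + win, x:x + win].ravel()
            hist, _ = np.histogram(a[m > m.mean()], bins=12, range=(0, 180))
            if (hist >= min_votes).sum() >= modes:   # >= 3 dominant directions
                cands.append((x + win // 2, y + win // 2))
    return cands

# e.g. cands = y_candidates(angiogram)  # `angiogram`: a grayscale array
```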
medical image computing and computer assisted intervention | 2006
Tae Eun Choe; Isaac Cohen; Gérard G. Medioni; Alexander C. Walsh; Srinivas R. Sadda
We present a method for the 3-D shape reconstruction of the retinal fundus from stereo image pairs. Detection of retinal elevation plays a critical role in the diagnosis and management of many retinal diseases. However, since the ocular fundus is nearly planar, its 3-D depth range is very narrow. Therefore, we use the locations of vascular bifurcations and a plane-plus-parallax approach to provide a robust estimation of the epipolar geometry. Matching is then performed using a mutual information algorithm for accurate estimation of the disparity maps. To validate our results in the absence of camera calibration, we compared them with measurements from the current clinical gold standard, optical coherence tomography (OCT).
international conference on pattern recognition | 2008
Tae Eun Choe; Krishnan Ramnath; Mun Wai Lee; Niels Haering
We propose a new method for warping high-resolution images to efficiently track objects on the ground plane in real time. Recently, the emergence of high-resolution video cameras (> 5 megapixels) has enabled surveillance over a much larger area using only a single camera. However, real-time processing of high-resolution video for automatic detection and tracking of multiple targets is a challenge. When a surveillance camera covers ground regions over a great range of depths, the image size of a target varies significantly with the distance between the camera and the target because of perspective effects. In this study, we propose a framework that transforms high-resolution images into warped images using a plane homography so that the target size is uniform regardless of position. The method not only reduces the number of pixels to be processed, for a speed-up, but also improves tracking performance. We provide experimental results on object tracking in high-resolution maritime videos to demonstrate the validity of our method.
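The warping step maps directly onto a standard OpenCV call. In the hedged sketch below, H_ground is a made-up homography standing in for the calibration-derived image-to-ground mapping the paper assumes, and the frame is synthetic.

```python
# Warp a frame with a ground-plane homography so target scale is uniform.
import cv2
import numpy as np

H_ground = np.array([[1.2, 0.3,   -40.0],    # illustrative values only; a real
                     [0.0, 2.5,  -300.0],    # H comes from camera calibration
                     [0.0, 0.002,   1.0]])

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in for a video frame
warped = cv2.warpPerspective(frame, H_ground, (1024, 768))
# detection and tracking then run on `warped`, where a target's pixel
# size no longer depends on its distance from the camera
print(warped.shape)                               # (768, 1024, 3)
```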
international conference on distributed smart cameras | 2008
Zeeshan Rasheed; Xiaochun Cao; Khurram Shafique; Haiying Liu; Li Yu; Mun Wai Lee; Krishnan Ramnath; Tae Eun Choe; Omar Javed; Niels Haering
Modern automated video analysis systems consist of large networks of heterogeneous sensors. These systems must extract, integrate, and present relevant information from the sensors in real time. This paper addresses some of the major challenges such systems face: efficient video processing for high-resolution sensors; data fusion across multiple modalities; robustness to changing environmental conditions and video processing errors; and intuitive user interfaces for visualization and analysis. The paper discusses enabling technologies to overcome these challenges and presents a case study of a wide-area video analysis system deployed at a port in the state of Florida, USA. The components of the system are also detailed and justified using quantitative and qualitative results.