David M. Chen | Researchain

Publication

Featured researches published by David M. Chen.

IEEE Signal Processing Magazine | 2011

Mobile Visual Search

Bernd Girod; Vijay Chandrasekhar; David M. Chen; Ngai-Man Cheung; Radek Grzeszczuk; Yuriy Reznik; Gabriel Takacs; Sam S. Tsai; Ramakrishna Vedantham

Mobile phones have evolved into powerful image and video processing devices equipped with high-resolution cameras, color displays, and hardware-accelerated graphics. They are also increasingly equipped with a global positioning system and connected to broadband wireless networks. All this enables a new class of applications that use the camera phone to initiate search queries about objects in visual proximity to the user (Figure 1). Such applications can be used, e.g., for identifying products, comparison shopping, finding information about movies, compact disks (CDs), real estate, print media, or artworks.

International Conference on Image Processing | 2011

Robust text detection in natural images with edge-enhanced Maximally Stable Extremal Regions

Huizhong Chen; Sam S. Tsai; Georg Schroth; David M. Chen; Radek Grzeszczuk; Bernd Girod

Detecting text in natural images is an important prerequisite. In this paper, we propose a novel text detection algorithm, which employs edge-enhanced Maximally Stable Extremal Regions as basic letter candidates. These candidates are then filtered using geometric and stroke width information to exclude non-text objects. Letters are paired to identify text lines, which are subsequently separated into words. We evaluate our system using the ICDAR competition dataset and our mobile document database. The experimental results demonstrate the excellent performance of the proposed method.

Computer Vision and Pattern Recognition | 2011

City-scale landmark identification on mobile devices

David M. Chen; Georges Baatz; Kevin Köser; Sam S. Tsai; Ramakrishna Vedantham; Timo Pylvänäinen; Kimmo Roimela; Xin Chen; Jeff Bach; Marc Pollefeys; Bernd Girod; Radek Grzeszczuk

With recent advances in mobile computing, the demand for visual localization or landmark identification on mobile devices is gaining interest. We advance the state of the art in this area by fusing two popular representations of street-level image data (facade-aligned and viewpoint-aligned) and show that they contain complementary information that can be exploited to significantly improve the recall rates on the city scale. We also improve feature detection in low contrast parts of the street-level data, and discuss how to incorporate priors on a user's position (e.g., given by noisy GPS readings or network cells), which previous approaches often ignore. Finally, and maybe most importantly, we present our results according to a carefully designed, repeatable evaluation scheme and make publicly available a set of 1.7 million images with ground truth labels, geotags, and calibration data, as well as a difficult set of cell phone query images. We provide these resources as a benchmark to facilitate further research in the area.

Computer Vision and Pattern Recognition | 2009

CHoG: Compressed histogram of gradients, a low bit-rate feature descriptor

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Radek Grzeszczuk; Bernd Girod

Establishing visual correspondences is an essential component of many computer vision problems, and is often done with robust, local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile distributed camera networks and large indexing problems. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate. The framework is low complexity and has significant speed-up in the matching stage. We represent gradient histograms as tree structures which can be efficiently compressed. We show how to efficiently compute distances between descriptors in their compressed representation eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, and other low bit-rate descriptors and show that our proposed CHoG descriptor outperforms existing schemes.
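The idea of comparing descriptors without decoding them can be illustrated with a lookup table. The following is a minimal sketch, not the CHoG codec itself: the tiny codebook, the symmetric KL divergence, and the names `CODEBOOK`, `quantize`, and `compressed_distance` are all assumptions made for illustration. Each cell histogram is replaced by a codebook index, and distances are read from a table precomputed over codebook entries.

```python
import math

# Hypothetical 4-entry codebook of 3-bin gradient histograms (illustrative only).
CODEBOOK = [
    (0.8, 0.1, 0.1),
    (0.1, 0.8, 0.1),
    (0.1, 0.1, 0.8),
    (1 / 3, 1 / 3, 1 / 3),
]


def kl(p, q):
    """Kullback-Leibler divergence; both histograms must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def sym_kl(p, q):
    """Symmetric KL divergence, used here as an assumed distance measure."""
    return kl(p, q) + kl(q, p)


def quantize(hist):
    """Return the index of the codebook entry closest to hist."""
    return min(range(len(CODEBOOK)), key=lambda i: sym_kl(hist, CODEBOOK[i]))


# Distances between all pairs of codebook entries, computed once up front.
TABLE = [[sym_kl(a, b) for b in CODEBOOK] for a in CODEBOOK]


def compressed_distance(codes_a, codes_b):
    """Distance between two descriptors given only their per-cell code indices."""
    return sum(TABLE[i][j] for i, j in zip(codes_a, codes_b))


# Two descriptors, each made of two cell histograms, compared via indices only.
codes_a = [quantize(h) for h in [(0.7, 0.2, 0.1), (0.3, 0.3, 0.4)]]
codes_b = [quantize(h) for h in [(0.1, 0.7, 0.2), (0.3, 0.4, 0.3)]]
print(codes_a, codes_b, compressed_distance(codes_a, codes_b))
```

In this sketch the matching cost is one table lookup per cell, which is why no histogram has to be reconstructed at query time.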

International Journal of Computer Vision | 2012

Compressed Histogram of Gradients: A Low-Bitrate Descriptor

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Yuriy Reznik; Radek Grzeszczuk; Bernd Girod

Establishing visual correspondences is an essential component of many computer vision problems, which is often done with local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile visual search applications. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate compared to state-of-the-art descriptors. The framework offers low complexity and has significant speed-up in the matching stage. We show how to efficiently compute distances between descriptors in the compressed domain eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, BRIEF, MPEG-7 image signatures and other low bit-rate descriptors and show that our proposed CHoG descriptor outperforms existing schemes significantly over a wide range of bitrates. We implement the descriptor in a mobile image retrieval system and for a database of 1 million CD, DVD and book covers, we achieve 96% retrieval accuracy using only 4 KB of data per query image.

Computer Vision and Pattern Recognition | 2010

Unified Real-Time Tracking and Recognition with Rotation-Invariant Fast Features

Gabriel Takacs; Vijay Chandrasekhar; Sam S. Tsai; David M. Chen; Radek Grzeszczuk; Bernd Girod

We present a method that unifies tracking and video content recognition with applications to Mobile Augmented Reality (MAR). We introduce the Radial Gradient Transform (RGT) and an approximate RGT, yielding the Rotation-Invariant, Fast Feature (RIFF) descriptor. We demonstrate that RIFF is fast enough for real-time tracking, while robust enough for large scale retrieval tasks. At 26× the speed, our tracking scheme obtains a more accurate global affine motion model than the Kanade-Lucas-Tomasi (KLT) tracker. The same descriptors can achieve 94% retrieval accuracy from a database of 10^4 images.
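A radial gradient transform can be sketched as re-expressing each gradient in a local frame made of the radial direction (from the patch center to the pixel) and the tangential direction perpendicular to it. The code below is a minimal illustration under that assumption; the function names are hypothetical, and it covers only the exact transform, not the approximate RGT or the rest of the RIFF descriptor.

```python
import math


def radial_gradient_transform(point, center, gradient):
    """Express a gradient in the (radial, tangential) frame at point."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("radial frame is undefined at the patch center")
    r = (dx / norm, dy / norm)   # unit radial direction
    t = (-r[1], r[0])            # unit tangential direction (r rotated by 90 degrees)
    radial = gradient[0] * r[0] + gradient[1] * r[1]
    tangential = gradient[0] * t[0] + gradient[1] * t[1]
    return radial, tangential


def rotate(v, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


# Rotating the patch rotates both pixel positions and gradients about the center,
# yet the transformed gradient keeps the same coordinates.
center, point, grad = (0.0, 0.0), (3.0, 4.0), (1.0, 2.0)
theta = 0.7
before = radial_gradient_transform(point, center, grad)
after = radial_gradient_transform(rotate(point, theta), center, rotate(grad, theta))
print(before, after)
```

Because the frame rotates together with the patch, histograms of the transformed gradients need no explicit orientation normalization in this sketch.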

Data Compression Conference | 2009

Tree Histogram Coding for Mobile Image Matching

David M. Chen; Sam S. Tsai; Vijay Chandrasekhar; Gabriel Takacs; Jatinder Pal Singh; Bernd Girod

For mobile image matching applications, a mobile device captures a query image, extracts descriptive features, and transmits these features wirelessly to a server. The server recognizes the query image by comparing the extracted features to its database and returns information associated with the recognition result. For slow links, query feature compression is crucial for low-latency retrieval. Previous image retrieval systems transmit compressed feature descriptors, which is well suited for pairwise image matching. For fast retrieval from large databases, however, scalable vocabulary trees are commonly employed. In this paper, we propose a rate-efficient codec designed for tree-based retrieval. By encoding a tree histogram, our codec can achieve a more than 5× rate reduction compared to sending compressed feature descriptors. By discarding the order amongst a list of features, histogram coding requires 1.5× lower rate than sending a tree node index for every feature. A statistical analysis is performed to study how the entropy of encoded symbols varies with tree depth and the number of features.
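Why discarding feature order saves bits can be seen with a simple counting argument. The sketch below is not the paper's codec: it assumes fixed-length node indices and an idealized enumeration of all possible histograms, and the feature and node counts are hypothetical. An ordered list of N indices over K tree nodes has K^N possibilities, while an unordered histogram has only C(N + K - 1, N).

```python
import math


def bits_index_per_feature(num_features, num_nodes):
    """Bits to send one fixed-length node index per feature (order preserved)."""
    return num_features * math.log2(num_nodes)


def bits_histogram(num_features, num_nodes):
    """Bits to enumerate every histogram of num_features counts over num_nodes bins."""
    return math.log2(math.comb(num_features + num_nodes - 1, num_features))


# Hypothetical query: 500 features quantized to a tree with one million leaf nodes.
n, k = 500, 10**6
ordered = bits_index_per_feature(n, k)
unordered = bits_histogram(n, k)
print(f"ordered: {ordered:.0f} bits, histogram: {unordered:.0f} bits, "
      f"ratio: {ordered / unordered:.2f}")
```

The gap between the two counts grows with the number of features, since more features means more orderings that the histogram never has to distinguish.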

IEEE Signal Processing Magazine | 2011

Mobile Visual Location Recognition

Georg Schroth; Robert Huitl; David M. Chen; Mohammad Abu-Alqumsan; Anas Al-Nuaimi; Eckehard G. Steinbach

With recent advances in content-based image retrieval (CBIR), mobile visual location recognition becomes feasible. Using video recordings of a mobile device as a visual fingerprint of the environment and matching them to a georeferenced database provides pose information in a very natural way. Hence, location-based services (LBSs) can be provided without complex infrastructure in areas where the accuracy and availability of GPS are limited. This includes indoor environments where georeferenced data are just about to become publicly available.

Signal Processing: Image Communication | 2008

Wyner-Ziv coding of video with unsupervised motion vector learning

David P. Varodayan; David M. Chen; Markus Flierl; Bernd Girod

Distributed source coding theory has long promised a new method of encoding video that is much lower in complexity than conventional methods. In the distributed framework, the decoder is tasked with exploiting the redundancy of the video signal. Among the difficulties in realizing a practical codec has been the problem of motion estimation at the decoder. In this paper, we propose a technique for unsupervised learning of forward motion vectors during the decoding of a frame with reference to its previous reconstructed frame. The technique, described for both pixel-domain and transform-domain coding, is an instance of the expectation maximization algorithm. The performance of our transform-domain motion learning video codec improves as GOP size grows. It is better than using motion-compensated temporal interpolation by 0.5 dB when GOP size is 2, and by even more when GOP size is larger. It performs within about 0.25 dB of a codec that knows the motion vectors through an oracle, but is hundreds of orders of magnitude less complex than a corresponding brute-force decoder motion search approach would be.

Visual Communications and Image Processing | 2009

Transform Coding of Image Feature Descriptors

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Jatinder Pal Singh; Bernd Girod

We investigate transform coding to efficiently store and transmit SIFT and SURF image descriptors. We show that image and feature matching algorithms are robust to significantly compressed features. We achieve near-perfect image matching and retrieval for both SIFT and SURF using ~2 bits/dimension. When applied to SIFT and SURF, this provides a 16× compression relative to conventional floating point representation. We establish a strong correlation between MSE and matching error for feature points and images. Feature compression enables many applications that may not otherwise be possible, especially on mobile devices.
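The 16× figure is consistent with 32-bit floats reduced to about 2 bits per dimension (32 / 2 = 16). The sketch below illustrates only that rate arithmetic and a plain uniform scalar quantizer. It assumes 32-bit storage as the floating point baseline and a value range of [0, 1], and it omits the transform and any entropy coding, so it is not the scheme evaluated in the paper. All function names are hypothetical.

```python
def quantize_uniform(values, bits, lo, hi):
    """Map each value in [lo, hi] to one of 2**bits integer codes."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    codes = []
    for v in values:
        idx = int((v - lo) / step)
        codes.append(min(max(idx, 0), levels - 1))  # clamp to the valid code range
    return codes


def dequantize_uniform(codes, bits, lo, hi):
    """Reconstruct each code as the midpoint of its quantization cell."""
    step = (hi - lo) / 2 ** bits
    return [lo + (c + 0.5) * step for c in codes]


def compression_ratio(bits_per_dim, float_bits=32):
    """Ratio of an assumed float_bits-per-dimension baseline to the coded rate."""
    return float_bits / bits_per_dim


descriptor = [0.0, 0.3, 0.6, 1.0]            # toy 4-dimensional descriptor
codes = quantize_uniform(descriptor, 2, 0.0, 1.0)
approx = dequantize_uniform(codes, 2, 0.0, 1.0)
print(codes, approx, compression_ratio(2))
```

With 2 bits there are only four reconstruction levels per dimension, so the reconstruction error in this sketch is at most half a quantization step.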
