
Publications

Featured research published by Sam S. Tsai.


IEEE Signal Processing Magazine | 2011

Mobile Visual Search

Bernd Girod; Vijay Chandrasekhar; David M. Chen; Ngai-Man Cheung; Radek Grzeszczuk; Yuriy Reznik; Gabriel Takacs; Sam S. Tsai; Ramakrishna Vedantham

Mobile phones have evolved into powerful image and video processing devices equipped with high-resolution cameras, color displays, and hardware-accelerated graphics. They are also increasingly equipped with a global positioning system and connected to broadband wireless networks. All this enables a new class of applications that use the camera phone to initiate search queries about objects in visual proximity to the user (Figure 1). Such applications can be used, e.g., for identifying products, comparison shopping, finding information about movies, compact disks (CDs), real estate, print media, or artworks.


International Conference on Image Processing | 2011

Robust text detection in natural images with edge-enhanced Maximally Stable Extremal Regions

Huizhong Chen; Sam S. Tsai; Georg Schroth; David M. Chen; Radek Grzeszczuk; Bernd Girod

Detecting text in natural images is an important prerequisite for many content-based image analysis tasks. In this paper, we propose a novel text detection algorithm which employs edge-enhanced Maximally Stable Extremal Regions as basic letter candidates. These candidates are then filtered using geometric and stroke-width information to exclude non-text objects. Letters are paired to identify text lines, which are subsequently separated into words. We evaluate our system using the ICDAR competition dataset and our mobile document database. The experimental results demonstrate the excellent performance of the proposed method.
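The filtering stage described above can be sketched in a few lines: candidate regions (which would come from an MSER detector in the actual system) are rejected when their shape or stroke-width statistics look unlike printed letters. The thresholds and region statistics below are illustrative assumptions, not values from the paper.

```python
# Sketch of the geometric / stroke-width filtering stage. Region stats
# (bounding box, sampled stroke widths) would come from an MSER detector;
# the values below are hypothetical, for illustration only.

def is_letter_candidate(width, height, stroke_widths,
                        max_aspect=8.0, max_sw_cv=0.5):
    """Reject regions whose shape or stroke-width statistics are unlike
    printed letters. Thresholds are illustrative."""
    aspect = max(width, height) / max(1, min(width, height))
    if aspect > max_aspect:          # long thin bars, borders, etc.
        return False
    mean = sum(stroke_widths) / len(stroke_widths)
    var = sum((s - mean) ** 2 for s in stroke_widths) / len(stroke_widths)
    cv = (var ** 0.5) / mean         # coefficient of variation
    return cv <= max_sw_cv           # letters have near-uniform strokes

# A compact letter-like region passes; a thin border-like region fails.
print(is_letter_candidate(20, 30, [3, 3, 4, 3, 3]))   # True
print(is_letter_candidate(200, 4, [1, 5, 2, 9, 1]))   # False
```

The key intuition, as in the paper, is that text strokes have nearly constant width, so the coefficient of variation of stroke width separates letters from most background clutter.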


Computer Vision and Pattern Recognition | 2011

City-scale landmark identification on mobile devices

David M. Chen; Georges Baatz; Kevin Köser; Sam S. Tsai; Ramakrishna Vedantham; Timo Pylvänäinen; Kimmo Roimela; Xin Chen; Jeff Bach; Marc Pollefeys; Bernd Girod; Radek Grzeszczuk

With recent advances in mobile computing, interest in visual localization and landmark identification on mobile devices is growing. We advance the state of the art in this area by fusing two popular representations of street-level image data, facade-aligned and viewpoint-aligned, and show that they contain complementary information that can be exploited to significantly improve recall rates at city scale. We also improve feature detection in low-contrast parts of the street-level data, and discuss how to incorporate priors on a user's position (e.g., given by noisy GPS readings or network cells), which previous approaches often ignore. Finally, and perhaps most importantly, we present our results according to a carefully designed, repeatable evaluation scheme and make publicly available a set of 1.7 million images with ground-truth labels, geotags, and calibration data, as well as a difficult set of cell-phone query images. We provide these resources as a benchmark to facilitate further research in the area.


Computer Vision and Pattern Recognition | 2009

CHoG: Compressed Histogram of Gradients, a Low Bit-Rate Feature Descriptor

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Radek Grzeszczuk; Bernd Girod

Establishing visual correspondences is an essential component of many computer vision problems, and is often done with robust, local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile distributed camera networks and large indexing problems. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate. The framework is low complexity and has significant speed-up in the matching stage. We represent gradient histograms as tree structures which can be efficiently compressed. We show how to efficiently compute distances between descriptors in their compressed representation eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, and other low bit-rate descriptors and show that our proposed CHoG descriptor outperforms existing schemes.


International Journal of Computer Vision | 2012

Compressed Histogram of Gradients: A Low-Bitrate Descriptor

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Yuriy Reznik; Radek Grzeszczuk; Bernd Girod

Establishing visual correspondences is an essential component of many computer vision problems, which is often done with local feature-descriptors. Transmission and storage of these descriptors are of critical importance in the context of mobile visual search applications. We propose a framework for computing low bit-rate feature descriptors with a 20× reduction in bit rate compared to state-of-the-art descriptors. The framework offers low complexity and has significant speed-up in the matching stage. We show how to efficiently compute distances between descriptors in the compressed domain eliminating the need for decoding. We perform a comprehensive performance comparison with SIFT, SURF, BRIEF, MPEG-7 image signatures and other low bit-rate descriptors and show that our proposed CHoG descriptor outperforms existing schemes significantly over a wide range of bitrates. We implement the descriptor in a mobile image retrieval system and for a database of 1 million CD, DVD and book covers, we achieve 96% retrieval accuracy using only 4 KB of data per query image.
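The compressed-domain matching idea above can be sketched with a toy codebook: each gradient histogram is stored only as a codeword index, and distances between two compressed descriptors are read from a precomputed table, so no decoding is ever needed. The 4-bin codebook below is invented for illustration; CHoG's actual codebooks are learned over soft-binned gradient histograms and entropy-coded.

```python
import numpy as np

# Hypothetical 4-bin gradient-histogram codebook (illustrative only;
# CHoG uses learned codebooks over soft-binned gradient histograms).
codebook = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])

def quantize(hist):
    """Map a normalized gradient histogram to its nearest codeword index."""
    return int(np.argmin(((codebook - hist) ** 2).sum(axis=1)))

# Precompute all pairwise codeword distances once, so two compressed
# descriptors can be compared directly in the compressed domain.
diff = codebook[:, None, :] - codebook[None, :, :]
lut = (diff ** 2).sum(axis=2)        # squared L2 between codeword pairs

a = quantize(np.array([0.80, 0.10, 0.05, 0.05]))
b = quantize(np.array([0.30, 0.30, 0.20, 0.20]))
print(a, b, lut[a, b])               # distance looked up from indices alone
```

Because the lookup table is tiny and fixed, matching cost is independent of the histogram dimension, which is where the speed-up in the matching stage comes from.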


Computer Vision and Pattern Recognition | 2010

Unified Real-Time Tracking and Recognition with Rotation-Invariant Fast Features

Gabriel Takacs; Vijay Chandrasekhar; Sam S. Tsai; David M. Chen; Radek Grzeszczuk; Bernd Girod

We present a method that unifies tracking and video content recognition with applications to Mobile Augmented Reality (MAR). We introduce the Radial Gradient Transform (RGT) and an approximate RGT, yielding the Rotation-Invariant, Fast Feature (RIFF) descriptor. We demonstrate that RIFF is fast enough for real-time tracking, while robust enough for large-scale retrieval tasks. At 26× the speed, our tracking scheme obtains a more accurate global affine motion model than the Kanade-Lucas-Tomasi (KLT) tracker. The same descriptors can achieve 94% retrieval accuracy from a database of 10^4 images.
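The core of the Radial Gradient Transform can be shown in a few lines: the gradient at each pixel is expressed in a local radial/tangential coordinate frame defined by the patch center, and that pair of coordinates is unchanged when the whole patch rotates. The points and gradients below are arbitrary test values.

```python
import numpy as np

def radial_gradient(center, point, grad):
    """Project the gradient at `point` onto the radial/tangential frame
    defined by the patch `center` (the Radial Gradient Transform idea:
    these two coordinates are invariant to patch rotation)."""
    r = np.asarray(point, float) - np.asarray(center, float)
    r = r / np.linalg.norm(r)        # radial direction
    t = np.array([-r[1], r[0]])      # tangential direction
    g = np.asarray(grad, float)
    return float(g @ r), float(g @ t)

# Rotating the point and its gradient by the same angle leaves the
# (radial, tangential) coordinates unchanged.
theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
p, g = np.array([3.0, 1.0]), np.array([0.5, -0.2])
print(radial_gradient([0, 0], p, g))
print(radial_gradient([0, 0], R @ p, R @ g))  # same two values
```

This invariance is why RIFF histograms need no dominant-orientation estimation, which is a large part of its speed advantage for real-time tracking.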


Data Compression Conference | 2009

Tree Histogram Coding for Mobile Image Matching

David M. Chen; Sam S. Tsai; Vijay Chandrasekhar; Gabriel Takacs; Jatinder Pal Singh; Bernd Girod

For mobile image matching applications, a mobile device captures a query image, extracts descriptive features, and transmits these features wirelessly to a server. The server recognizes the query image by comparing the extracted features to its database and returns information associated with the recognition result. For slow links, query feature compression is crucial for low-latency retrieval. Previous image retrieval systems transmit compressed feature descriptors, which is well suited for pairwise image matching. For fast retrieval from large databases, however, scalable vocabulary trees are commonly employed. In this paper, we propose a rate-efficient codec designed for tree-based retrieval. By encoding a tree histogram, our codec can achieve a more than 5x rate reduction compared to sending compressed feature descriptors. By discarding the order amongst a list of features, histogram coding requires 1.5x lower rate than sending a tree node index for every feature. A statistical analysis is performed to study how the entropy of encoded symbols varies with tree depth and the number of features.
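One way to see the rate advantage of histogram coding described above: once the order of the n quantized features is discarded, the encoder saves the bits needed to distinguish the orderings, i.e. log2 of the multinomial count n!/∏(count_k!). This framing and the node IDs below are illustrative, not taken from the paper's codec.

```python
import math
from collections import Counter

# Hypothetical vocabulary-tree leaf indices for one query image.
node_ids = [7, 3, 7, 7, 1, 3, 7, 9, 1, 7]
n = len(node_ids)
hist = Counter(node_ids)             # the tree histogram that gets coded

# Bits saved by sending the unordered histogram instead of an ordered
# per-feature index list: log2( n! / prod(count_k!) ).
saved = math.log2(math.factorial(n))
for c in hist.values():
    saved -= math.log2(math.factorial(c))
print(dict(hist), round(saved, 2))
```

For realistic feature counts (hundreds per image) this order information is substantial, which is consistent with the 1.5× reduction reported over sending a tree node index per feature.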


Visual Communications and Image Processing | 2009

Transform Coding of Image Feature Descriptors

Vijay Chandrasekhar; Gabriel Takacs; David M. Chen; Sam S. Tsai; Jatinder Pal Singh; Bernd Girod

We investigate transform coding to efficiently store and transmit SIFT and SURF image descriptors. We show that image and feature matching algorithms are robust to significantly compressed features. We achieve near-perfect image matching and retrieval for both SIFT and SURF using ~2 bits/dimension. When applied to SIFT and SURF, this provides a 16× compression relative to conventional floating-point representation. We establish a strong correlation between MSE and matching error for feature points and images. Feature compression enables many applications that may not otherwise be possible, especially on mobile devices.
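The transform-coding pipeline the abstract describes can be sketched as: learn a decorrelating (Karhunen-Loève/PCA) transform from training descriptors, uniformly quantize the coefficients at about 2 bits per dimension, and check the reconstruction MSE. The Gaussian data below stands in for real SIFT/SURF descriptors, and the single shared quantizer is a simplification; per-dimension bit allocation differs in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))          # stand-in for SIFT/SURF descriptors

# Karhunen-Loeve / PCA transform learned from the data ...
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
Y = (X - mean) @ Vt.T                  # decorrelated coefficients

# ... followed by uniform quantization at ~2 bits per dimension.
levels = 4                             # 2 bits -> 4 levels
lo, hi = Y.min(), Y.max()
step = (hi - lo) / levels
q = np.clip(np.floor((Y - lo) / step), 0, levels - 1)

# Reconstruct and measure MSE (the paper links MSE to matching error).
Y_hat = lo + (q + 0.5) * step
X_hat = Y_hat @ Vt + mean
print("bits/descriptor:", 2 * X.shape[1], "MSE:", ((X - X_hat) ** 2).mean())
```

Only the small integer codes `q` would be entropy-coded and transmitted; the transform and quantizer parameters are shared by encoder and decoder.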


Signal Processing | 2013

Residual enhanced visual vector as a compact signature for mobile visual search

David M. Chen; Sam S. Tsai; Vijay Chandrasekhar; Gabriel Takacs; Ramakrishna Vedantham; Radek Grzeszczuk; Bernd Girod

Many mobile visual search (MVS) systems transmit query data from a mobile device to a remote server and search a database hosted on the server. In this paper, we present a new architecture for searching a large database directly on a mobile device, which can provide numerous benefits for network-independent, low-latency, and privacy-protected image retrieval. A key challenge for on-device retrieval is storing a large database in the limited RAM of a mobile device. To address this challenge, we develop a new compact, discriminative image signature called the Residual Enhanced Visual Vector (REVV) that is optimized for sets of local features which are fast to extract on mobile devices. REVV outperforms existing compact database constructions in the MVS setting and attains similar retrieval accuracy in large-scale retrieval as a Vocabulary Tree that uses 25x more memory. We have utilized REVV to design and construct a mobile augmented reality system for accurate, large-scale landmark recognition. Fast on-device search with REVV enables our system to achieve latencies around 1s per query regardless of external network conditions. The compactness of REVV allows it to also function well as a low-bitrate signature that can be transmitted to or from a remote server for an efficient expansion of the local database search when required.
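The residual-aggregation core of a REVV-style signature can be sketched as follows: each local feature is assigned to its nearest visual word, the residuals to the word centers are summed per word, and the result is normalized into one compact vector per image. The LDA weighting and sign binarization of the actual REVV construction are omitted, and the codebook and features below are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
centroids = rng.normal(size=(4, 2))    # tiny illustrative visual codebook
feats = rng.normal(size=(50, 2))       # local features of one image

# Assign each feature to its nearest centroid and aggregate residuals
# (the core idea behind residual-based signatures such as REVV).
d2 = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
assign = d2.argmin(axis=1)
sig = np.zeros_like(centroids)
for i, k in enumerate(assign):
    sig[k] += feats[i] - centroids[k]

norm = np.linalg.norm(sig)             # L2-normalize the whole signature
if norm > 0:
    sig /= norm
print(sig.shape)                       # one compact vector per image
```

Because the signature size depends only on the codebook, not on the database, it is what makes fitting a large database into a phone's limited RAM feasible.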


ACM Multimedia | 2010

Mobile product recognition

Sam S. Tsai; David M. Chen; Vijay Chandrasekhar; Gabriel Takacs; Ngai-Man Cheung; Ramakrishna Vedantham; Radek Grzeszczuk; Bernd Girod

We present a mobile product recognition system for the camera-phone. By snapping a picture of a product with a camera-phone, the user can retrieve online information about the product. The product is recognized by an image-based retrieval system located on a remote server. Our database currently comprises more than one million entries, primarily products packaged in rigid boxes with printed labels, such as CDs, DVDs, and books. We extract low bit-rate descriptors from the query image and compress the location of the descriptors using location histogram coding on the camera-phone. We transmit the compressed query features, instead of a query image, to reduce the transmission delay. We use inverted index compression and fast geometric re-ranking on our database to provide a low-delay image recognition response for large-scale databases. Experimental timing results on different parts of the mobile product recognition system are reported in this work.
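The location histogram coding step mentioned above can be sketched as: keypoint (x, y) positions are quantized onto a coarse grid and only per-cell counts are kept, discarding order, which is what makes them cheap to entropy-code. The image size, grid resolution, and coordinates below are illustrative assumptions.

```python
import numpy as np

# Sketch of location histogram coding with made-up values: quantize
# keypoint positions onto an 8x8 grid and keep only the cell counts.
W = H = 640                            # hypothetical image dimensions
cells = 8
xy = np.array([[12, 40], [300, 310], [305, 300], [630, 620]])

gx = np.minimum(xy[:, 0] * cells // W, cells - 1)
gy = np.minimum(xy[:, 1] * cells // H, cells - 1)
hist = np.zeros((cells, cells), dtype=int)
np.add.at(hist, (gy, gx), 1)           # unbuffered add handles repeats

print(hist.sum(), hist.max())          # 4 keypoints; densest cell holds 2
```

Only the sparse count map (plus an entropy code) needs to be transmitted, rather than full-precision coordinates for every feature.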
