Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yi-Min Tsai is active.

Publication


Featured research published by Yi-Min Tsai.


International Conference on Consumer Electronics | 2009

A block-based 2D-to-3D conversion system with bilateral filter

Chao-Chung Cheng; Chung-Te Li; Po-Sen Huang; Tsung-Kai Lin; Yi-Min Tsai; Liang-Gee Chen

Three-dimensional (3D) displays provide a dramatic improvement in visual quality over 2D displays, so converting existing 2D videos to 3D is necessary for multimedia applications. This paper presents an automatic and robust system to convert 2D videos to 3D videos. The proposed 2D-to-3D conversion combines two major depth-generation modules: depth from motion and depth from geometrical perspective. A block-based algorithm cooperates with a bilateral filter to diminish the block effect and generate a comfortable depth map. After the depth map is generated, the multi-view video is rendered for 3D display.
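
As a minimal sketch of the depth-smoothing step (assuming a per-block depth map has already been estimated, e.g. from motion), the snippet below upsamples the coarse map and applies OpenCV's bilateral filter to suppress block artifacts while preserving depth edges; the block grid, filter parameters, and function name are illustrative, not taken from the paper.

```python
import cv2
import numpy as np

def smooth_block_depth(block_depth, frame_shape, d=9,
                       sigma_color=25, sigma_space=25):
    """Upsample a coarse per-block depth map to frame resolution and
    suppress block artifacts with an edge-preserving bilateral filter."""
    h, w = frame_shape
    # Nearest-neighbor upsampling reproduces the blocky depth map.
    dense = cv2.resize(block_depth.astype(np.float32), (w, h),
                       interpolation=cv2.INTER_NEAREST)
    # Bilateral filtering diminishes the block effect while keeping
    # sharp transitions at depth discontinuities.
    return cv2.bilateralFilter(dense, d, sigma_color, sigma_space)

# Example: a 4x4 grid of block depths expanded to a 64x64 frame.
coarse = np.random.rand(4, 4).astype(np.float32)
smooth = smooth_block_depth(coarse, (64, 64))
```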


International Symposium on Intelligent Signal Processing and Communication Systems | 2006

Block-based Vanishing Line and Vanishing Point Detection for 3D Scene Reconstruction

Yi-Min Tsai; Yu-Lin Chang; Liang-Gee Chen

This paper presents a robust and reliable block-based method for vanishing line and vanishing point detection. The proposed method assists in constructing a depth map, which is a necessity for 3D scene reconstruction. The method focuses on analysis of fundamental image structural elements, divided into six successive steps, and replaces complicated mathematical calculation and approximation with six block-based estimation algorithms. Its regular algorithmic structure makes it suitable for VLSI implementation in future 3D applications.
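
The paper's six block-based steps are not reproduced here, so the sketch below illustrates only the underlying principle with a standard alternative: scene lines are detected, and their pairwise intersections vote for the vanishing point; all thresholds are illustrative.

```python
import cv2
import numpy as np

def line_intersection(a, b):
    """Intersection of two segments extended to infinite lines."""
    x1, y1, x2, y2 = a
    x3, y3, x4, y4 = b
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(den) < 1e-9:          # parallel lines never intersect
        return None
    c1 = x1 * y2 - y1 * x2
    c2 = x3 * y4 - y3 * x4
    return ((c1 * (x3 - x4) - (x1 - x2) * c2) / den,
            (c1 * (y3 - y4) - (y1 - y2) * c2) / den)

def vanishing_point(gray):
    """Vote for the vanishing point with pairwise line intersections."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=60,
                            minLineLength=30, maxLineGap=5)
    if lines is None:
        return None
    segs = [l[0] for l in lines]
    votes = []
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            p = line_intersection(segs[i], segs[j])
            if p is not None:
                votes.append(p)
    # The median vote is robust to intersections of unrelated lines.
    return np.median(np.array(votes), axis=0) if votes else None
```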


IEEE Journal of Solid-State Circuits | 2012

A 52 mW Full HD 160-Degree Object Viewpoint Recognition SoC With Visual Vocabulary Processor for Wearable Vision Applications

Yu-Chi Su; Keng-Yen Huang; Tse-Wei Chen; Yi-Min Tsai; Shao-Yi Chien; Liang-Gee Chen

A 1920 × 1080 160° object viewpoint recognition system-on-chip (SoC) is presented in this paper. The SoC design is dedicated to wearable vision applications, and we address several crucial issues, including the low recognition accuracy caused by low-resolution images and dramatic changes in object viewpoints, and the high power consumption caused by the complex computations in existing computer vision object recognition systems. The human-centered design (HCD) mechanism is proposed in order to maintain a high recognition rate in difficult situations. To overcome the degradation of accuracy when dramatic changes to the object viewpoint occur, the object viewpoint prediction (OVP) engine in the HCD provides 160° object viewpoint invariance by synthesizing various object poses from predicted object viewpoints. To achieve low power consumption, the visual vocabulary processor (VVP), which is based on a bag-of-words (BoW) matching algorithm, is used to advance the matching stage from the feature level to the object level, resulting in a 97% reduction in the required memory bandwidth compared to previous recognition systems. Moreover, the matching efficiency of the VVP enables the system to support real-time full HD (1920 × 1080) processing, improving the recognition rate for detecting a traffic light at a distance of 50 m to 95%, compared to the 29% recognition rate for VGA (640 × 480) processing. The real-time 1920 × 1080 visual recognition chip is realized on a 6.38 mm2 die with 65 nm CMOS technology. It achieves an average recognition rate of 94%, a power efficiency of 1.18 TOPS/W, and an area efficiency of 25.9 GOPS/mm2 while dissipating only 52 mW at 1.0 V.
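
A minimal sketch of object-level bag-of-words matching as the abstract describes it: descriptors are quantized to visual words, so two objects are compared through one histogram distance instead of feature by feature. The random vocabulary and the histogram-intersection score are illustrative stand-ins; a real vocabulary is learned offline by clustering training features.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantize each descriptor to its nearest visual word (Euclidean
    distance) and return the normalized word histogram."""
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(np.float64)
    return hist / hist.sum()

# 64-word vocabulary over the 128-D SIFT space (random, for illustration).
vocab = np.random.rand(64, 128)
reference = bow_histogram(np.random.rand(300, 128), vocab)
query = bow_histogram(np.random.rand(280, 128), vocab)
# Histogram intersection scores the whole object in one comparison,
# which is what moves matching from the feature level to the object level.
score = np.minimum(reference, query).sum()
```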


International Conference on Image Processing | 2010

Video stabilization for vehicular applications using SURF-like descriptor and KD-tree

Keng-Yen Huang; Yi-Min Tsai; Chih-Chung Tsai; Liang-Gee Chen

This paper describes a method to stabilize video for vehicular applications based on feature analysis. An investigation of the camera motion model is conducted. Harris features are extracted under the proposed resolution-adaptation scheme and described with a SURF-like descriptor. For feature matching, a KD-tree with best-bin-first search significantly reduces the matching time. A damping filter is utilized to model and predict the unwanted oscillation. A 93.1% correct rate is achieved on average under diverse driving conditions, and only 0.114 seconds are required to process a frame at 1280×960 resolution. The provided benchmark shows that the proposed method outperforms previous approaches.
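
A minimal sketch of the matching stage, assuming OpenCV: SIFT stands in for the paper's SURF-like descriptor, and FLANN's KD-tree index provides the approximate best-bin-first search; the ratio-test threshold and tree parameters are illustrative.

```python
import cv2

def match_features(prev_gray, cur_gray):
    """Match keypoints between frames via a KD-tree (best-bin-first)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(cur_gray, None)
    # FLANN's KD-tree index (algorithm=1) performs approximate
    # best-bin-first search, bounded by the 'checks' parameter.
    matcher = cv2.FlannBasedMatcher(dict(algorithm=1, trees=4),
                                    dict(checks=32))
    matches = matcher.knnMatch(des1, des2, k=2)
    # Lowe's ratio test discards ambiguous correspondences.
    good = [m[0] for m in matches
            if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```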


Proceedings of SPIE | 2009

Hybrid depth cueing for 2D-to-3D conversion system

Chao-Chung Cheng; Chung-Te Li; Yi-Min Tsai; Liang-Gee Chen

Three-dimensional (3D) displays provide a dramatic improvement in visual quality over 2D displays, so converting existing 2D videos to 3D is necessary for multimedia applications. This paper presents a robust system to convert 2D videos to 3D videos. The main concepts are to extract depth information from the motion parallax of moving pictures and from geometrical perspective in non-moving scenes. In the first part, depth-induced motion information is reconstructed by motion-vector-to-disparity mapping. By warping consecutive video frames to a view angle parallel with the current frame, the frame with a suitable baseline is selected to generate depth from the motion parallax information. However, video does not carry depth-induced motion information in every case. For scenes without motion parallax, depth from geometrical perspective is applied to generate the scene depth map, which is assigned depending on the scene mode and the line structure analyzed in the video. Combining these two depth cues enhances the stereo effect and produces a convincing depth map, which is then used to render the multi-view video for 3D display.
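
A minimal sketch of the hybrid idea: optical-flow magnitude stands in for the motion-parallax cue and a vertical ramp for the geometrical-perspective cue; the Farneback flow, the ramp, and the blend weight are illustrative simplifications of the paper's modules.

```python
import cv2
import numpy as np

def hybrid_depth(prev_gray, cur_gray, motion_weight=0.7):
    """Blend a motion-parallax cue with a geometric scene cue.
    Larger values mean nearer to the camera (disparity-like)."""
    h, w = cur_gray.shape
    # Cue 1: apparent motion magnitude; under camera translation,
    # nearer objects move farther between frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    motion = np.linalg.norm(flow, axis=2)
    if motion.max() > 0:
        motion /= motion.max()
    # Cue 2: a bottom-to-top ramp approximates geometrical perspective
    # (ground plane near the camera, horizon far away).
    scene = np.linspace(0.0, 1.0, h)[:, None] * np.ones((1, w))
    return motion_weight * motion + (1.0 - motion_weight) * scene
```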


Electronic Imaging | 2008

Priority depth fusion for the 2D to 3D conversion system

Yu-Lin Chang; Wei-Yin Chen; Jing-Ying Chang; Yi-Min Tsai; Chia-Lin Lee; Liang-Gee Chen

To provide 3D content for upcoming 3D display devices, a real-time automatic 2D-to-3D conversion system with depth fusion is needed on home multimedia platforms. We propose a priority depth fusion algorithm within a 2D-to-3D conversion system that generates depth maps from most commercial video sequences. The results from different kinds of depth reconstruction methods are integrated into one depth map by the proposed priority depth fusion algorithm. The depth map and the original 2D image are then converted to stereo images for 3D display devices. In this paper, a set of 2D-to-3D conversion algorithms is combined with the proposed depth fusion algorithm to show the improved results. As converted 3D content becomes available, demand for 3D display devices will also increase, and as the two technologies evolve together, the 3D-TV era will arrive sooner.
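
A minimal sketch of priority-based fusion as the abstract describes it: per pixel, the highest-priority depth cue that is sufficiently confident wins; the confidence maps, threshold, and two-cue setup are illustrative assumptions.

```python
import numpy as np

def priority_fuse(depth_maps, confidences, threshold=0.5):
    """Fuse depth maps listed from highest to lowest priority: per pixel,
    the highest-priority cue whose confidence clears the threshold wins,
    with the lowest-priority map as the unconditional fallback."""
    fused = depth_maps[-1].copy()
    # Apply cues from lowest to highest priority so that the
    # highest-priority overwrite happens last.
    for depth, conf in zip(depth_maps[-2::-1], confidences[-2::-1]):
        mask = conf >= threshold
        fused[mask] = depth[mask]
    return fused

h, w = 48, 64
depth_motion = np.random.rand(h, w)        # e.g. depth from motion
depth_geometry = np.random.rand(h, w)      # e.g. depth from perspective
conf_motion = np.random.rand(h, w)         # reliability of the motion cue
fused = priority_fuse([depth_motion, depth_geometry],
                      [conf_motion, np.ones((h, w))])
```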


Symposium on VLSI Circuits | 2012

A 69mW 140-meter/60fps and 60-meter/300fps intelligent vision SoC for versatile automotive applications

Yi-Min Tsai; Tien-Ju Yang; Chih-Chung Tsai; Keng-Yen Huang; Liang-Gee Chen

A machine-learning-based intelligent vision SoC implemented on a 9.3 mm2 die in a 40 nm CMOS process is presented. The architecture realizes a 140-meter active distance at 60 fps and 60 meters at 300 fps at Quad-VGA (1280×960) resolution while maintaining an above-90% detection rate for versatile automotive applications. The system supports tracking and prediction of 64 objects. The proposed knowledge-based tracking processor yields a 1.62× improvement in power efficiency and at least a 1.79× increase in frame rate. The chip achieves 354.2 fps/W and 3.01 TOPS/W power efficiency with 69 mW average power consumption.
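
The chip's knowledge-based tracking processor is not specified in the abstract, so the sketch below only illustrates why tracking with prediction saves work: a predicted position narrows the search window for the next frame; constant-velocity motion is an illustrative assumption, not the chip's actual algorithm.

```python
import numpy as np

class Track:
    """One tracked object with constant-velocity prediction."""
    def __init__(self, box):
        self.box = np.asarray(box, dtype=float)   # (x, y, w, h)
        self.velocity = np.zeros(2)

    def predict(self):
        """Expected box in the next frame; this narrows the search
        window so the detector need not rescan the whole image."""
        predicted = self.box.copy()
        predicted[:2] += self.velocity
        return predicted

    def update(self, box):
        """Record a new observation and refresh the velocity estimate."""
        box = np.asarray(box, dtype=float)
        self.velocity = box[:2] - self.box[:2]
        self.box = box

track = Track((100, 50, 40, 30))
track.update((104, 52, 40, 30))     # observed one frame later
print(track.predict())              # -> [108. 54. 40. 30.]
```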


IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2012

Visual Vocabulary Processor Based on Binary Tree Architecture for Real-Time Object Recognition in Full-HD Resolution

Tse-Wei Chen; Yu-Chi Su; Keng-Yen Huang; Yi-Min Tsai; Shao-Yi Chien; Liang-Gee Chen

Feature matching is an indispensable process for object recognition, which is an important function for wearable devices with video analysis capabilities. To implement a low-power SoC for object recognition, the proposed visual vocabulary processor (VVP) is employed to accelerate feature matching. The VVP transforms hundreds of 128-D SIFT vectors into a 64-D histogram for object matching using a binary-tree-based architecture, with 16 Euclidean-distance calculators designed for each of the two processors at each level. A total of 126 visual words can be stored in the six-level hierarchical memory, which instantly supplies the data required for the matching process and saves more than 5× the bandwidth of a non-binary-tree-based architecture. As part of the recognition SoC, the VVP is implemented in 65-nm CMOS technology, and the experimental results show a gate count of 280 K and an average power consumption of 5.6 mW.
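
A minimal sketch of the binary-tree quantization the VVP accelerates: a descriptor descends six levels, comparing Euclidean distances to two children per level, so each of hundreds of 128-D SIFT vectors reaches one of 64 histogram bins after 12 distance computations instead of 64 exhaustive comparisons. The random node vectors are illustrative; a real tree stores 126 learned visual words.

```python
import numpy as np

LEVELS = 6
# Complete binary tree stored flat and 1-indexed: node k has children
# 2k and 2k+1. Nodes 2..127 hold the 126 visual words, and the 64
# leaves (nodes 64..127) index the histogram bins.
tree = np.random.rand(2 ** (LEVELS + 1), 128)   # random, for illustration

def quantize(descriptor):
    """Descend the tree, taking the child with the smaller Euclidean
    distance at each level: 6 levels x 2 distances per descriptor."""
    node = 1
    for _ in range(LEVELS):
        left, right = 2 * node, 2 * node + 1
        if np.linalg.norm(descriptor - tree[left]) <= \
           np.linalg.norm(descriptor - tree[right]):
            node = left
        else:
            node = right
    return node - 2 ** LEVELS        # leaf index in [0, 63]

# Hundreds of 128-D SIFT vectors reduce to one 64-D histogram.
hist = np.zeros(64)
for desc in np.random.rand(200, 128):
    hist[quantize(desc)] += 1
```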


International Symposium on Consumer Electronics | 2010

Feature-based video stabilization for vehicular applications

Keng-Yen Huang; Yi-Min Tsai; Chih-Chung Tsai; Liang-Gee Chen

This paper describes a method to stabilize video for vehicular applications based on Harris features and adaptive resolution. The Lucas-Kanade method is applied to match feature points across consecutive frames and construct the feature motion flow. A damping filter is utilized to model the unwanted motion, and the global motion is separated by extracting the oscillation. A 92% correct rate is achieved at 0.54 seconds per frame. The provided benchmark shows that the proposed method outperforms previous approaches.
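
A minimal sketch of the described flow construction, assuming OpenCV: Harris corners are tracked with pyramidal Lucas-Kanade, and the median displacement serves as the global inter-frame motion; the damping-filter stage that separates the oscillation over time is omitted, and all parameters are illustrative.

```python
import cv2
import numpy as np

def global_motion(prev_gray, cur_gray):
    """Estimate inter-frame global motion from tracked Harris corners."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=8,
                                  useHarrisDetector=True)
    if pts is None:
        return np.zeros(2)
    # Pyramidal Lucas-Kanade tracks each corner into the current frame.
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    ok = status.ravel() == 1
    if not ok.any():
        return np.zeros(2)
    # The median displacement is a robust global-motion estimate; its
    # high-frequency part over time is the oscillation to compensate.
    return np.median((nxt[ok] - pts[ok]).reshape(-1, 2), axis=0)
```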


International Conference on Pattern Recognition | 2010

Learning-Based Vehicle Detection Using Up-Scaling Schemes and Predictive Frame Pipeline Structures

Yi-Min Tsai; Keng-Yen Huang; Chih-Chung Tsai; Liang-Gee Chen

This paper aims at detecting preceding vehicles at a variety of distances. A sub-region up-scaling scheme significantly improves far-distance detection capability, and three frame pipeline structures involving object predictors are explored to further enhance accuracy and efficiency. The proposed methodology achieves a 140-meter detection distance, with a 97.1% detection rate at a 4.2% false alarm rate. Finally, a benchmark of several learning-based vehicle detection approaches is provided.
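
A minimal sketch of the sub-region up-scaling idea: the band around the horizon, where distant vehicles appear only a few pixels tall, is up-scaled before detection so the detector's minimum window can still cover them. The cv2.CascadeClassifier-style detector, band bounds, and scale factor are illustrative stand-ins for the paper's learning-based detector.

```python
import cv2

def detect_with_upscaling(gray, detector, far_band=(0.40, 0.60), scale=2.0):
    """Run the detector on the full frame, then again on an up-scaled
    horizon band so distant vehicles exceed the minimum window size."""
    h, _ = gray.shape
    hits = list(detector.detectMultiScale(gray))      # near vehicles
    y0, y1 = int(h * far_band[0]), int(h * far_band[1])
    band = cv2.resize(gray[y0:y1], None, fx=scale, fy=scale)
    for (x, y, bw, bh) in detector.detectMultiScale(band):
        # Map far-range detections back to original image coordinates.
        hits.append((int(x / scale), int(y / scale) + y0,
                     int(bw / scale), int(bh / scale)))
    return hits
```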

Collaboration


Dive into Yi-Min Tsai's collaborations.

Top Co-Authors

Liang-Gee Chen (National Taiwan University)
Keng-Yen Huang (National Taiwan University)
Chih-Chung Tsai (National Taiwan University)
Tien-Ju Yang (National Taiwan University)
Yu-Lin Chang (National Taiwan University)
Chao-Chung Cheng (National Taiwan University)
Chung-Te Li (National Taiwan University)
Shao-Yi Chien (National Taiwan University)
Jing-Ying Chang (National Taiwan University)
Tse-Wei Chen (National Taiwan University)