Publication


Featured research published by Kota Iwamoto.


international conference on image processing | 2006

Image Signature Robust to Caption Superimposition for Video Sequence Identification

Kota Iwamoto; Eiji Kasutani; Akio Yamada

This paper proposes an image signature robust to caption superimposition for video sequence identification. A new image signature which is a set of local features is developed for a high-speed frame-by-frame matching of video sequences. The signature of a frame is obtained by partitioning the image into blocks and extracting the local feature representing the dominant type of edge direction from each block. The similarity between the signatures is calculated by comparing the edge types of the corresponding blocks, and counting the number of the blocks having the same edge type. A weighting scheme based on the probability of caption superimposition for each block can be applied to the similarity calculation to improve the matching performance. The experimental results of the video sequence identification show that the proposed signature achieves precision of 99.65% and recall of 99.45%, improving both the precision and the recall by more than 30% compared with the conventional signature.
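The similarity calculation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `signature_similarity`, the integer encoding of edge types, and the normalization by the total weight are all assumptions made for the example. The abstract only states that matching blocks are counted and that a per-block weight tied to caption probability may be applied.

```python
from typing import Optional, Sequence


def signature_similarity(
    sig_a: Sequence[int],
    sig_b: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Compare two frame signatures block by block.

    Each signature is a sequence of edge-type labels, one per block
    (the integer encoding is hypothetical). Blocks with equal labels
    contribute their weight to the score. Dividing by the total weight
    is an assumption made here so the result falls in [0, 1].
    """
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must cover the same number of blocks")
    if weights is None:
        # Unweighted case: every block counts equally.
        weights = [1.0] * len(sig_a)
    if len(weights) != len(sig_a):
        raise ValueError("one weight per block is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    matched = sum(w for a, b, w in zip(sig_a, sig_b, weights) if a == b)
    return matched / total
```

Lowering the weight of blocks where captions are likely to appear (for example, the bottom rows of a frame) keeps a superimposed caption from reducing the score of an otherwise identical frame.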


IEEE Transactions on Circuits and Systems for Video Technology | 2012

The MPEG-7 Video Signature Tools for Content Identification

Stavros Paschalakis; Kota Iwamoto; Paul Brasnett; Nikola Sprljan; Ryoma Oami; Toshiyuki Nomura; Akio Yamada; Miroslaw Bober

This paper presents the core technologies of the video signature tools recently standardized by ISO/IEC Moving Picture Experts Group (MPEG) as an amendment to the MPEG-7 Standard (ISO/IEC 15938). The video signature is a high-performance content fingerprint that is suitable for desktop scale to web-scale deployment and provides high levels of robustness to common video editing operations and high temporal localization accuracy at extremely low false alarm rates, achieving a detection rate in the order of 96% at a false alarm rate in the order of five false matches per million comparisons. The applications of the video signature are numerous and include rights management and monetization, distribution management, usage monitoring, metadata association, and corporate or personal database management. In this paper, we review the prior work in the field, explain the standardization process and status, and provide details and evaluation results for the video signature tools.


international conference on image processing | 2013

BRIGHT: A scalable and compact binary descriptor for low-latency and high accuracy object identification

Kota Iwamoto; Ryota Mase; Toshiyuki Nomura

This paper proposes a new scalable and compact binary local descriptor, named the BRIGHT (Binary ResIzable Gradient HisTogram) descriptor, for low-latency and high accuracy identification of real-world objects in images. The BRIGHT descriptor is extracted by first creating a hierarchical HOG (Histogram of Oriented Gradients) of a local patch centered around keypoints detected from an image. The elements of the histogram are then binarized, and the subset of bits is progressively selected forming a progressively scalable descriptor with a size ranging from only 32 bits to 150 bits. Experiment using images with objects taken under various camera viewpoints, lighting conditions, and occlusions, shows that the BRIGHT descriptor can robustly match objects with an identification accuracy comparable with that of SIFT descriptor, but at a descriptor size smaller than 1/10 of SIFT. With the reduced descriptor size, transmission of descriptors from a mobile device to a database server can be dramatically speeded up, enabling low-latency response in mobile search services.
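The idea of a binarized, size-scalable descriptor can be illustrated with a small sketch. Everything specific below is an assumption for illustration only: the abstract does not give the binarization rule (thresholding at the histogram mean is used here), does not say how bits are ordered for progressive selection (a simple prefix is used here), and does not name the matching metric (Hamming distance is used here, as is common for binary descriptors). The function names are hypothetical.

```python
from typing import List, Sequence


def binarize(hist: Sequence[float]) -> List[int]:
    """Turn histogram elements into bits (assumed rule: 1 if above the mean)."""
    mean = sum(hist) / len(hist)
    return [1 if v > mean else 0 for v in hist]


def truncate(bits: Sequence[int], n_bits: int) -> List[int]:
    """Keep the first n_bits bits, assuming bits are ordered by priority."""
    return list(bits[:n_bits])


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

With a prefix ordering like this, a shorter descriptor is always a subset of a longer one, so two descriptors of different sizes can still be compared by truncating the longer one to the shorter length.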


international conference on image processing | 2007

Detection of Wipes and Digital Video Effects Based on a Pattern-Independent Model of Image Boundary Line Characteristics

Kota Iwamoto; Kyoji Hirata

This paper proposes detection of wipes and digital video effects (DVEs) in a video sequence based on a new pattern-independent model. This model is based on the characteristics of image boundary lines dividing the two image regions in the transitional frames. Wipes and DVEs are modeled as frame sequences where either (A) a single boundary line moves continuously in a time sequence, or (B) multiple boundary lines form a quadrilateral within a frame. The model is applied to the image boundary lines extracted from a video sequence to detect wipes and DVEs. An evaluation using news programs containing various patterns of wipes and DVEs shows that the proposed method achieves recall of 91.5 % and precision of 60.7 %, improving the conventional twin-comparison method by 29.6 % in recall and 46.5 % in precision.


asian conference on computer vision | 2014

Local Feature Based Multiple Object Instance Identification Using Scale and Rotation Invariant Implicit Shape Model

Ruihan Bao; Kyota Higa; Kota Iwamoto

In this paper, we propose a Scale and Rotation Invariant Implicit Shape Model (SRIISM), and develop a local feature matching based system using the model to accurately locate and identify large numbers of object instances in an image. Due to repeated instances and cluttered background, conventional methods for multiple object instance identification suffer from poor identification results. In the proposed SRIISM, we model the joint distribution of object centers, scale, and orientation computed from local feature matches in Hough voting, which is not only invariant to scale changes and rotation of objects, but also robust to false feature matches. In the multiple object instance identification system using SRIISM, we apply a fast 4D bin search method in Hough space with complexity \(O(n)\), where \(n\) is the number of feature matches, in order to segment and locate each instance. Furthermore, we apply maximum likelihood estimation (MLE) for accurate object pose detection. In the evaluation, we created datasets simulating various industrial applications such as pick-and-place and inventory management. Experiment results on the datasets show that our method outperforms conventional methods in both accuracy (5 %–30 % gain) and speed (2x speed up).


international conference on image processing | 2013

Multiple object identification using grid voting of object center estimated from keypoint matches

Kyota Higa; Kota Iwamoto; Toshiyuki Nomura

This paper proposes a method to detect and identify multiple objects in an image using grid voting of object center positions estimated from local descriptor keypoint matches. For each keypoint match, the proposed method estimates the object center position using scale and orientation associated with the keypoints. Then, it casts a vote for an image grid where the estimated object center is located. For the grids with high number of votes, geometric verification of the keypoint matches is carried out to accurately localize multiple objects in the image. Since the computational complexity of the grid voting is O(n), where n is the number of estimated object centers, the proposed method runs faster than a conventional method using mean shift clustering with O(n^2) complexity. Experimental results using images with 52 objects show that the proposed method reduces the computational time by approximately 60% compared to the conventional method, while identification accuracy is comparable. With the reduced computational complexity, industrial applications such as an efficient inventory management in retail using images are enabled in practical computational time.
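The grid voting step in this abstract can be sketched as below. This is a hedged illustration under stated assumptions, not the paper's implementation: keypoints are assumed to be `(x, y, scale, angle_in_radians)` tuples, the object center in the reference image is assumed known, and the names `estimate_center`, `grid_vote`, and the `cell` size parameter are hypothetical. Geometric verification of the high-vote cells is omitted.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

Keypoint = Tuple[float, float, float, float]  # x, y, scale, angle (radians)


def estimate_center(
    ref_kp: Keypoint, qry_kp: Keypoint, ref_center: Tuple[float, float]
) -> Tuple[float, float]:
    """Project the reference object center into the query image.

    The offset from the reference keypoint to the center is scaled by the
    scale ratio and rotated by the orientation difference of the match.
    """
    rx, ry, rs, ra = ref_kp
    qx, qy, qs, qa = qry_kp
    s = qs / rs
    d = qa - ra
    dx, dy = ref_center[0] - rx, ref_center[1] - ry
    cx = qx + s * (math.cos(d) * dx - math.sin(d) * dy)
    cy = qy + s * (math.sin(d) * dx + math.cos(d) * dy)
    return cx, cy


def grid_vote(
    matches: Iterable[Tuple[Keypoint, Keypoint]],
    ref_center: Tuple[float, float],
    cell: float = 20.0,
) -> Counter:
    """Cast one vote per match into the grid cell holding its estimated center.

    Each match is handled once, so the cost grows linearly with the
    number of matches.
    """
    votes: Counter = Counter()
    for ref_kp, qry_kp in matches:
        cx, cy = estimate_center(ref_kp, qry_kp, ref_center)
        votes[(int(cx // cell), int(cy // cell))] += 1
    return votes
```

Cells returned by `grid_vote(...).most_common()` with many votes would then be the candidates passed on to geometric verification. Because every match is binned directly, no pairwise distance between estimated centers is ever computed, which is where the linear cost comes from.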


international conference on consumer electronics | 2006

Linking TV programs with Internet contents based on video fingerprinting

Eiji Kasutani; Kota Iwamoto; Akio Yamada

This paper proposes a system that provides TV/Internet converged services by linking video segments in TV programs with their related Internet contents without forcing any changes to TV contents. This system allows users to access the Internet contents related to the video clips currently played back. A robust video fingerprinting method is employed to link video segments with their related contents. Experimental results show that this system makes it possible to create those links fully automatically with no errors.


international conference on consumer electronics | 2013

Visual duplicate based topic linking using a robust video signature

Kota Iwamoto; Takami Sato; Ryoma Oami; Toshiyuki Nomura

This paper proposes a topic linking using a robust video signature to detect visual duplicates for grouping coherent topics in a video archive. The proposed video signature, which was accepted as part of a new ISO/IEC standard “MPEG-7 Video Signature Tools”, is designed for robust and high-speed detection of visual duplicates in a large database. It represents intensity differences between various sub-regions in a frame, which provides robustness to various modifications to videos, including caption overlay and compression. We show that the proposed video signature significantly improves the detection rate of visual duplicate segments, by more than 40% under caption overlay, compared with conventional visual features. We also present our topic linking system with its visual presentation of topic groups for efficient browsing and viewing of videos.


international conference on image processing | 2016

Fast 2D-to-3D matching with camera pose voting for 3D object identification

Ruihan Bao; Kota Iwamoto

In this paper, we propose a fast non-iterative camera pose voting method for 3D object identification. The proposed method improves the accuracy and speed upon the conventional local feature based 2D-to-3D matching between a 2D image and a 3D model reconstructed by the structure-from-motion (SfM) pipeline. Instead of performing iterative RANSAC based method for geometric verification, the proposed method computes a hypothesis of camera pose from each feature correspondence between the query image and the 3D model. The camera pose is computed using scale, orientation and coordinate of the local features, calibrated by the camera matrix of the database image used to construct the 3D model. Then the most likely hypothesis is found by carrying out a two-stage clustering on the estimated camera poses in the parameter space. Experiment results on the 3D machine datasets show that our method improves the identification accuracy from 82.8% to 84.9% when FPR is 1%, compared with conventional RANSAC based method. In addition, the processing speed for the geometric verification is improved up to 25 times compared to the conventional method.


Archive | 2005

Image similarity calculation system, image search system, image similarity calculation method, and image similarity calculation program

Kota Iwamoto
