Publication


Featured research published by Toshihiko Yamasaki.


ACM Workshop on Continuous Archival and Retrieval of Personal Experiences | 2005

Practical experience recording and indexing of Life Log video

Datchakorn Tancharoen; Toshihiko Yamasaki; Kiyoharu Aizawa

This paper presents an experience recording system and proposes practical video retrieval techniques based on Life Log content and context analysis. We summarize our effective indexing methods, including content-based talking-scene detection and context-based key frame extraction from GPS data. Voice annotation and detection are also proposed as a practical indexing method. Moreover, we apply an additional body sensor to record our lifestyle and analyze human physiological data for the Life Log retrieval system. In the experiments, we demonstrate various video indexing results, providing semantic key frames and Life Log interfaces to retrieve and index our life experiences effectively.
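The context-based key frame extraction from GPS data can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration (not the authors' code): it walks a GPS track attached to the Life Log video and emits a key frame each time the wearer has moved more than a threshold distance.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS fixes."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def gps_key_frames(track, min_move_m=50.0):
    """Pick one key frame each time the wearer has moved at least min_move_m.

    track: list of (frame_id, lat, lon) tuples sampled along the Life Log video.
    Returns the frame ids chosen as context-based key frames.
    """
    if not track:
        return []
    keys = [track[0][0]]
    last_lat, last_lon = track[0][1], track[0][2]
    for frame_id, lat, lon in track[1:]:
        if haversine_m(last_lat, last_lon, lat, lon) >= min_move_m:
            keys.append(frame_id)
            last_lat, last_lon = lat, lon
    return keys
```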


ACM Multimedia | 2005

Evaluation of video summarization for a large number of cameras in ubiquitous home

Gamhewage C. de Silva; Toshihiko Yamasaki; Kiyoharu Aizawa

A system for video summarization in a ubiquitous environment is presented. Data from pressure-based floor sensors are clustered to segment the footsteps of different persons. Video handover has been implemented to retrieve a continuous video showing a person moving through the environment. Several methods for extracting key frames from the resulting video sequences have been implemented and evaluated experimentally. We found that most of the key frames the human subjects desired to see could be retrieved using an adaptive algorithm based on camera changes and the number of footsteps within the view of the same camera. The system provides a graphical user interface that can be used to retrieve video summaries interactively using simple queries.
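As a rough illustration of the adaptive key-frame rule described above (a key frame at every camera change, plus extra key frames when many footsteps fall within the same camera's view), the following sketch uses hypothetical inputs and is not the authors' implementation.

```python
def adaptive_key_frames(segments, steps_per_key=10):
    """Toy version of adaptive key frame selection driven by camera handover.

    segments: list of (frame_id, camera_id) pairs, one entry per footstep
              detected by the floor sensors, in temporal order.
    A key frame is emitted whenever the active camera changes, and again
    after every `steps_per_key` footsteps seen through the same camera.
    """
    keys = []
    prev_cam = None
    steps_in_cam = 0
    for frame_id, cam in segments:
        if cam != prev_cam:
            keys.append(frame_id)      # camera handover -> new key frame
            prev_cam, steps_in_cam = cam, 0
        else:
            steps_in_cam += 1
            if steps_in_cam % steps_per_key == 0:
                keys.append(frame_id)  # long stay in one view -> extra key frame
    return keys
```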


IEEE Transactions on Circuits and Systems for Video Technology | 2007

Time-Varying Mesh Compression Using an Extended Block Matching Algorithm

Seung-Ryong Han; Toshihiko Yamasaki; Kiyoharu Aizawa

Time-varying mesh, which is attracting a lot of attention as a new multimedia representation, is a sequence of 3-D models composed of vertices, edges, and attribute components such as color. Among these components, the vertices require a large amount of storage. In conventional 2-D video compression, motion compensation (MC) using a block matching algorithm is frequently employed to reduce temporal redundancy between consecutive frames. However, no such technology has existed for 3-D time-varying mesh so far. Therefore, in this paper, we develop an extended block matching algorithm (EBMA) that reduces the temporal redundancy of the geometry information in the time-varying mesh by extending the idea of 2-D block matching to 3-D space. In our EBMA, a cubic block is used as the matching unit. MC in 3-D space is achieved efficiently by matching the mean normal vectors calculated from the partial surfaces in the cubic blocks, which serve as the matching criterion. After MC, residuals are transformed by the discrete cosine transform, uniformly quantized, and then encoded. The extracted motion vectors are also entropy coded after differential pulse code modulation. In our experiments, 10%-18% compression has been achieved.
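The core of the extended block matching step, matching cubes by their mean normal vectors, can be sketched as follows. This is a simplified, hypothetical illustration under assumed data structures (a dictionary of per-cube mean normals), not the paper's implementation; the DCT, quantization, and entropy-coding stages are omitted.

```python
import numpy as np

def mean_normal(face_normals):
    """Average the surface normals of the partial mesh inside one cubic block."""
    m = np.asarray(face_normals, dtype=float).mean(axis=0)
    n = np.linalg.norm(m)
    return m / n if n > 0 else m

def match_cube(cur_coord, cur_normal, prev_normals, search_range=1):
    """Return the motion vector of the best-matching cube in the previous frame.

    cur_coord:    (i, j, k) integer coordinates of the current cube.
    cur_normal:   mean normal of the current cube, shape (3,).
    prev_normals: dict {(i, j, k): mean normal} for the previous frame.
    The cost is 1 - cos(angle) between mean normals; smaller is better.
    """
    best_mv, best_cost = (0, 0, 0), np.inf
    i, j, k = cur_coord
    r = range(-search_range, search_range + 1)
    for di in r:
        for dj in r:
            for dk in r:
                cand = prev_normals.get((i + di, j + dj, k + dk))
                if cand is None:
                    continue
                cost = 1.0 - float(np.dot(cur_normal, cand))
                if cost < best_cost:
                    best_mv, best_cost = (di, dj, dk), cost
    return best_mv, best_cost
```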


ACM Workshop on Continuous Archival and Retrieval of Personal Experiences | 2005

Experience retrieval in a ubiquitous home

Gamhewage C. de Silva; Byoungjun Oh; Toshihiko Yamasaki; Kiyoharu Aizawa

We present a system for retrieval and summarization of continuously archived multimedia data from a home-like ubiquitous environment. Data from pressure-based floor sensors are analyzed to index video and audio from a large number of sources. Video and audio handover are implemented to retrieve continuous video streams with sound as a person moves through the environment. Key frame extraction is proposed, and several algorithms are implemented to obtain compact summaries corresponding to the activity of each person. Clustering algorithms and image analysis are used to identify actions and events. The system provides a graphical user interface that can be used to retrieve video summaries interactively using simple queries.
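Video handover, i.e., keeping one continuous stream by switching cameras only when the tracked person leaves the current camera's view, can be sketched as below. The data structures (floor positions and per-camera coverage tests) are hypothetical stand-ins for the floor-sensor and camera-calibration data used in the paper.

```python
def video_handover(positions, coverage):
    """Choose which camera's stream to use at each time step.

    positions: list of (x, y) floor coordinates of the tracked person.
    coverage:  dict {camera_id: callable (x, y) -> bool} telling whether a
               point is inside that camera's field of view.
    The current camera is kept as long as it still sees the person, so the
    resulting stream switches only at genuine handover points.
    """
    stream = []
    current = None
    for x, y in positions:
        if current is not None and coverage[current](x, y):
            stream.append(current)          # stay on the same camera
            continue
        # person left the view: hand over to any camera that covers the point
        current = next((c for c, sees in coverage.items() if sees(x, y)), None)
        stream.append(current)
    return stream
```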


Multimedia Tools and Applications | 2012

Determination of emotional content of video clips by low-level audiovisual features

René Marcelino Abritta Teixeira; Toshihiko Yamasaki; Kiyoharu Aizawa

Affective analysis of video content has greatly expanded the ways we perceive and deal with media. Different kinds of strategies have been tried, but the results are still open to improvement. Most of the problems stem from the lack of standardized test sets and real affective models. To cope with these issues, in this paper we describe the results of our work on determining affective models for the evaluation of video clips using low-level audiovisual features. The affective models were developed following two classes of psychological theories of affect: categorical and dimensional. The models were created from real data, acquired through a series of user experiments, and reflect the affective state of a viewer after watching a certain scene from a movie. We evaluate the detection of Pleasure, Arousal, and Dominance coefficients as well as the detection rate of six affective categories. To this end, two Bayesian network topologies are used: a Hidden Markov Model and an Autoregressive Hidden Markov Model. The measurements were performed using audio-only models, video-only models, and fused models. Fusion is done with two different methods: decision-level fusion and feature-level fusion. All tests were conducted using localized affective models, both categorical and dimensional. Results are presented in terms of detection rate and accuracy for affective families, affective dimensions, and probabilistic networks. Arousal was the best detected dimension, followed by Dominance and Pleasure.
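The two fusion strategies compared in the paper can be summarized in a few lines. The sketch below is a generic illustration of feature-level fusion (concatenating audio and video descriptors before modeling) and decision-level fusion (combining the posteriors of the audio-only and video-only models); it is not tied to the specific HMM/AR-HMM models used in the study.

```python
import numpy as np

def feature_level_fusion(audio_feat, video_feat):
    """Concatenate audio and video descriptors into one observation vector."""
    return np.concatenate([audio_feat, video_feat])

def decision_level_fusion(p_audio, p_video, w_audio=0.5):
    """Combine per-class posteriors of the audio-only and video-only models.

    p_audio, p_video: arrays of class probabilities from the two models.
    Returns the weighted average, renormalised to sum to one.
    """
    fused = w_audio * np.asarray(p_audio) + (1.0 - w_audio) * np.asarray(p_video)
    return fused / fused.sum()
```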


Multimedia Tools and Applications | 2017

Sketch-based manga retrieval using manga109 dataset

Yusuke Matsui; Kota Ito; Yuji Aramaki; Azuma Fujimoto; Toru Ogawa; Toshihiko Yamasaki; Kiyoharu Aizawa

Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer very limited search support, i.e., keyword-based search by title or author. To make the manga search experience more intuitive, efficient, and enjoyable, we propose a manga-specific image retrieval system. The proposed system consists of efficient margin labeling, edge orientation histogram feature description with screen tone removal, and approximate nearest-neighbor search using product quantization. For querying, the system provides a sketch-based interface. Based on the interface, two interactive reranking schemes are presented: relevance feedback and query retouch. For evaluation, we built a novel dataset of manga images, Manga109, which consists of 109 comic books of 21,142 pages drawn by professional manga artists. To the best of our knowledge, Manga109 is currently the biggest dataset of manga images available for research. Experimental results showed that the proposed framework is efficient and scalable (70 ms from 21,142 pages using a single computer with 204 MB RAM).
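A simplified version of an edge orientation histogram descriptor, one ingredient of the retrieval pipeline described above, is sketched below. It is a generic EOH implementation with assumed parameters (cell grid and bin count), not the exact descriptor or screen-tone removal used in the paper.

```python
import numpy as np

def edge_orientation_histogram(gray, cells=(8, 8), bins=8):
    """Simplified EOH descriptor for a grayscale page or sketch image.

    The image is split into a cells[0] x cells[1] grid; each cell accumulates
    a gradient-orientation histogram weighted by gradient magnitude, and the
    concatenated descriptor is L2-normalised.
    """
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ori = (np.arctan2(gy, gx) + np.pi) / (2 * np.pi)  # map angles to [0, 1]
    h, w = gray.shape
    desc = []
    for i in range(cells[0]):
        for j in range(cells[1]):
            ys = slice(i * h // cells[0], (i + 1) * h // cells[0])
            xs = slice(j * w // cells[1], (j + 1) * w // cells[1])
            hist, _ = np.histogram(ori[ys, xs], bins=bins, range=(0.0, 1.0),
                                   weights=mag[ys, xs])
            desc.extend(hist)
    desc = np.asarray(desc)
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc
```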


International Conference on Multimedia and Expo | 2010

Image processing based approach to food balance analysis for personal food logging

Keigo Kitamura; Chaminda de Silva; Toshihiko Yamasaki; Kiyoharu Aizawa

Food images have been receiving increased attention in recent dietary control methods. We present the current status of our web-based system that can be used as a dietary management support system by ordinary Internet users. The system analyzes image archives of the user to identify images of meals. Further image analysis determines the nutritional composition of these meals and stores the data to form a Foodlog. The user can view the data in different formats, and also edit the data to correct any mistakes that occurred during image analysis. This paper presents detailed analysis of the performance of the current system and proposes an improvement of analysis by pre-classification and personalization. As a result, the accuracy of food balance estimation is significantly improved.
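The pre-classification idea, letting only meal images contribute to the food-balance record, can be sketched as a two-stage pipeline. The function names and the food-group representation below are hypothetical placeholders, not components of the actual system.

```python
def update_food_log(food_log, image, classify_food, estimate_balance):
    """Two-stage pipeline sketched from the paper's description.

    classify_food(image)    -> True if the image shows a meal (pre-classification).
    estimate_balance(image) -> dict of estimated servings per food group,
                               e.g. {"grain": 1.5, "vegetable": 1.0}.
    Only meal images contribute to the per-day food balance record.
    """
    if not classify_food(image):
        return food_log                      # skip non-food photos
    balance = estimate_balance(image)
    for group, servings in balance.items():
        food_log[group] = food_log.get(group, 0.0) + servings
    return food_log
```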


International Conference on Computer Vision | 2015

PQTable: Fast Exact Asymmetric Distance Neighbor Search for Product Quantization Using Hash Tables

Yusuke Matsui; Toshihiko Yamasaki; Kiyoharu Aizawa

We propose the product quantization table (PQTable), a product quantization-based hash table that is fast and requires neither parameter tuning nor training steps. The PQTable produces exactly the same results as a linear PQ search and is 10² to 10⁵ times faster when tested on the SIFT1B data. In addition, although previous inverted-indexing-based approaches can achieve state-of-the-art performance, such methods require manually designed parameter settings and extensive training, whereas our method is free of both. Therefore, PQTable offers a practical and useful solution for real-world problems.
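The basic idea, using the PQ code itself as a hash-table key and scoring candidates with a per-subspace distance lookup table, can be sketched as follows. This is a much-simplified illustration: it scores every distinct code, whereas the actual PQTable enumerates codes in increasing order of asymmetric distance; the shapes and variable names are assumptions.

```python
import numpy as np
from collections import defaultdict

def build_pq_table(codes):
    """Group database ids by their PQ code, using the code itself as hash key.

    codes: (N, M) array of sub-quantizer indices, one row per database vector.
    """
    table = defaultdict(list)
    for i, code in enumerate(codes):
        table[tuple(code)].append(i)
    return table

def query_pq_table(table, query, codebooks, topk=10):
    """Rank stored ids by asymmetric distance to the query.

    codebooks: (M, K, D/M) array of sub-codebook centroids.
    """
    m, k, d_sub = codebooks.shape
    q_sub = query.reshape(m, d_sub)
    # per-subspace lookup table: lut[j, c] = ||q_j - codebooks[j, c]||^2
    lut = ((codebooks - q_sub[:, None, :]) ** 2).sum(axis=2)
    scored = []
    for code, ids in table.items():
        dist = sum(lut[j, c] for j, c in enumerate(code))
        scored.extend((dist, i) for i in ids)
    scored.sort()
    return scored[:topk]
```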


ACM Multimedia | 2013

Personalized intra- and inter-city travel recommendation using large-scale geotags

Toshihiko Yamasaki; Andrew C. Gallagher; Tsuhan Chen

In this paper, a geotag-based inter- and intra-city travel recommendation system that considers both personal preference and seasonal/temporal popularity is presented. For the inter-city recommendation, a combination of two similarity measures among users is proposed. Accurate intra-city recommendation is achieved by incorporating seasonal and temporal information into a Markov model. The effectiveness of the proposed algorithm is demonstrated experimentally using more than 6 million geotags downloaded from Flickr.
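The intra-city part, a Markov model over landmark-to-landmark transitions conditioned on season, can be sketched as below. The trip representation is a hypothetical simplification of what would be reconstructed from time-ordered geotags; time-of-day conditioning and the personalization step are omitted.

```python
from collections import defaultdict

def train_transitions(trips):
    """Estimate seasonal landmark-to-landmark transition counts.

    trips: list of (season, [landmark_1, landmark_2, ...]) visit sequences
           reconstructed from the time-ordered geotags of one user in one city.
    Returns counts[(season, current_landmark)][next_landmark].
    """
    counts = defaultdict(lambda: defaultdict(int))
    for season, seq in trips:
        for cur, nxt in zip(seq, seq[1:]):
            counts[(season, cur)][nxt] += 1
    return counts

def recommend_next(counts, season, current, topk=3):
    """Rank candidate next landmarks for the given season and current location."""
    cand = counts.get((season, current), {})
    total = sum(cand.values()) or 1
    ranked = sorted(cand.items(), key=lambda kv: kv[1], reverse=True)
    return [(lm, c / total) for lm, c in ranked[:topk]]
```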


Proceedings of the First International Workshop on Internet-Scale Multimedia Management | 2014

Social Popularity Score: Predicting Numbers of Views, Comments, and Favorites of Social Photos Using Only Annotations

Toshihiko Yamasaki; Shumpei Sano; Kiyoharu Aizawa

In this paper, we propose an algorithm to predict the social popularity (i.e., the numbers of views, comments, and favorites) of content on social networking services using only text annotations. Instead of analyzing image/video content, we estimate social popularity from a combination of weight vectors obtained from support vector regression (SVR) and tag frequency. Since the proposed algorithm uses text annotations instead of image/video features, its computational cost is small. As a result, we can estimate social popularity more efficiently than previously proposed methods. Furthermore, tags that significantly affect social popularity can be extracted with our algorithm. Our experiments used one million photos from the social networking website Flickr, and the results showed a high correlation between the actual social popularity and that estimated by our algorithm. Moreover, the proposed algorithm achieves high accuracy in classifying content as popular or unpopular.
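A minimal sketch of the tag-based regression idea is given below, assuming scikit-learn's LinearSVR as a stand-in for the paper's SVR and binary tag-presence features; the tag-frequency weighting described in the paper is omitted.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVR

def fit_tag_popularity_model(tag_strings, log_views):
    """Fit a linear SVR from binary tag-presence features to log view counts.

    tag_strings: list of whitespace-joined tags, one string per photo.
    log_views:   array-like of log-scaled view counts for the same photos.
    Returns the vectorizer, the trained model, and the ten tags whose learned
    weights push the predicted popularity up the most.
    """
    vec = CountVectorizer(binary=True)
    x = vec.fit_transform(tag_strings)
    model = LinearSVR()
    model.fit(x, log_views)
    weights = np.ravel(model.coef_)
    top_tags = sorted(zip(vec.get_feature_names_out(), weights),
                      key=lambda t: t[1], reverse=True)[:10]
    return vec, model, top_tags
```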
