Publication


Featured research published by Takashi Satou.


IEEE Transactions on Multimedia | 2010

Affective Audio-Visual Words and Latent Topic Driving Model for Realizing Movie Affective Scene Classification

Go Irie; Takashi Satou; Akira Kojima; Toshihiko Yamasaki; Kiyoharu Aizawa

This paper presents a novel method for movie affective scene classification that outputs the emotion (in the form of labels) that a scene is likely to arouse in viewers. Since the affective preferences of users play an important role in movie selection, affective scene classification has the potential to enable more attractive user-centric movie search and browsing applications. Two main issues in designing movie affective scene classification are considered. One is how to extract features that are strongly related to viewers' emotions, and the other is how to map the extracted features to the emotion categories. For the former, we propose a method to extract emotion-category-specific audio-visual features named affective audio-visual words (AAVWs). For the latter, we propose a classification model named the latent topic driving model (LTDM). Assuming that viewers' emotions are dynamically changed by the movie scene sequences, LTDM models emotions as Markovian dynamic systems driven by the sequential stimuli of the movie content. Experiments on 206 movie scenes extracted from 24 movie titles, with labels for eight emotion categories given by 16 subjects, show that our method outperforms conventional approaches in terms of the subject agreement rate.
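
The Markovian dynamics at the core of LTDM can be illustrated in a few lines. The sketch below is not the authors' implementation: the eight categories are assumed to be Plutchik's basic emotions (per the companion ACM Multimedia 2009 paper below), and the mixing weight and uniform transition matrix are placeholders for illustration.

```python
import numpy as np

# Assumed: Plutchik's eight basic emotion categories.
EMOTIONS = ["joy", "acceptance", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]

def update_emotion_state(prev_state, stimulus, transition, alpha=0.5):
    """One Markov step: blend the transition-propagated previous state
    with the stimulus evoked by the current shot's content.

    prev_state : (8,) probability vector over emotion categories
    stimulus   : (8,) non-negative evidence from audio-visual features
    transition : (8, 8) row-stochastic emotion transition matrix
    alpha      : dynamics-vs-stimulus mixing weight (assumed value)
    """
    propagated = prev_state @ transition
    mixed = alpha * propagated + (1.0 - alpha) * stimulus
    return mixed / mixed.sum()  # renormalize to a probability distribution

# Toy example: uniform dynamics, a shot whose features evoke "sadness".
transition = np.full((8, 8), 1.0 / 8)
state = np.full(8, 1.0 / 8)
stimulus = np.eye(8)[EMOTIONS.index("sadness")]
state = update_emotion_state(state, stimulus, transition)
print(EMOTIONS[int(np.argmax(state))])  # -> sadness
```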


ACM Multimedia | 2009

Latent topic driving model for movie affective scene classification

Go Irie; Kota Hidaka; Takashi Satou; Akira Kojima; Toshihiko Yamasaki; Kiyoharu Aizawa

This paper proposes the latent topic driving model (LTDM), a novel approach to movie affective scene classification. LTDM is a discriminative model of emotions driven by movie affective content. Unlike existing methods, our approach is based on movie topic extraction via latent Dirichlet allocation (LDA) and emotion dynamics modeling with reference to Plutchik's emotion theory. The classification procedure starts by segmenting movie scenes into shots, each of which is represented by a histogram of quantized affect-related audio-visual features. LDA is applied to detect the topics of each shot. Emotions for the current shot are estimated from both the topics of the shot and emotion transition weights determined by Plutchik's emotion theory. We conduct experiments using 206 movie scenes extracted from 24 movie titles (6 h 20 min 12 s in total), with labels for eight emotion categories collected from 16 subjects. The results show that LTDM outperforms conventional modeling approaches in terms of the subject agreement rate.
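
The front end of this pipeline (per-shot histograms of quantized audio-visual features, followed by LDA topic inference) can be sketched with off-the-shelf tools. The codebook size, topic count, and random stand-in histograms below are assumptions; the topic-to-emotion mapping is only indicated in a comment.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

n_codewords, n_topics = 200, 10   # codebook and topic sizes (assumed)
rng = np.random.default_rng(0)

# Stand-in for real data: each movie shot is a "document" whose words
# are quantized audio-visual features, summarized as a count histogram.
shot_histograms = rng.poisson(2.0, size=(50, n_codewords))

lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
shot_topics = lda.fit_transform(shot_histograms)  # (50, 10) topic mixtures

# A (hypothetical) topic-to-emotion map would then convert each shot's
# topic mixture into an emotion stimulus, which drives the Markov
# dynamics sketched under the TMM 2010 paper above.
print(shot_topics[0].round(3))
```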


International Conference on Multimedia and Expo | 2009

Affective video segment retrieval for consumer generated videos based on correlation between emotions and emotional audio events

Go Irie; Kota Hidaka; Takashi Satou; Toshihiko Yamasaki; Kiyoharu Aizawa

A novel affective video segment retrieval method based on the correlation between emotions and emotional audio events (EAEs) is presented. The proposed method retrieves three types of affective video segments, joy, sadness, and excitement, by exploiting correlations between emotions and EAEs; this correlation is established through a subjective evaluation. The method detects EAEs using the generalized state-space model (GSSM) and low-level audio features, and rates each EAE in terms of emotion levels. Experiments conducted on consumer generated videos (CGVs) show that the proposed EAE detection outperforms conventional HMM- and GMM-based methods in terms of accuracy, and the agreement rate of the retrieved affective video segments reaches 73.3%.
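
A minimal sketch of the retrieval side, assuming the detector already exists: the EAE names, confidences, and correlation weights below are hypothetical stand-ins for the table the paper derives from its subjective evaluation, and the GSSM detector itself is not reimplemented here.

```python
# Hypothetical correlation table between EAEs and emotion levels.
EAE_EMOTION_WEIGHTS = {
    "laughter": {"joy": 0.9, "excitement": 0.4, "sadness": 0.0},
    "crying":   {"joy": 0.0, "excitement": 0.1, "sadness": 0.9},
    "cheering": {"joy": 0.5, "excitement": 0.9, "sadness": 0.0},
}

def score_segment(detected_eaes, emotion):
    """Sum the emotion-level contributions of the EAEs found in a
    segment. detected_eaes: list of (eae_name, confidence) pairs from
    a detector such as the paper's GSSM-based one (omitted here)."""
    return sum(conf * EAE_EMOTION_WEIGHTS[name][emotion]
               for name, conf in detected_eaes)

segments = {
    "seg1": [("laughter", 0.8)],
    "seg2": [("crying", 0.7), ("cheering", 0.2)],
}
# Retrieve segments ranked by "sadness" level.
ranked = sorted(segments, key=lambda s: score_segment(segments[s], "sadness"),
                reverse=True)
print(ranked)  # ['seg2', 'seg1']
```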


Conference on Multimedia Modeling | 2012

Improving item recommendation based on social tag ranking

Taiga Yoshida; Go Irie; Takashi Satou; Akira Kojima; Suguru Higashino

Content-based filtering is a popular framework for item recommendation. Typical methods determine the items to be recommended by measuring the similarity between items based on the tags provided by users. However, because the usefulness of tags depends on annotators' skills, vocabulary, and feelings, many tags are irrelevant, which degrades the accuracy of simple content-based recommendation methods. To tackle this issue, this paper enhances content-based filtering by introducing tag ranking, a state-of-the-art framework that ranks tags according to their relevance levels. We conduct experiments on videos from a video-sharing site. The results show that tag ranking significantly improves item recommendation performance, despite its simplicity.
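
The core idea, weighting tags by their relevance rank before measuring item similarity, can be sketched as follows. The 1/rank weighting and the toy tags are assumptions for illustration, not the paper's exact scheme.

```python
import math

def tag_weights(ranked_tags):
    """Map a relevance-ordered tag list (most relevant first) to weights
    that decay with rank; plain unranked tags would all get weight 1."""
    return {tag: 1.0 / (rank + 1) for rank, tag in enumerate(ranked_tags)}

def weighted_cosine(a, b):
    """Cosine similarity between two weighted tag vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Tags are assumed pre-ranked by a tag-ranking method.
video_a = tag_weights(["cat", "funny", "cute", "2009"])
video_b = tag_weights(["cat", "kitten", "cute"])
video_c = tag_weights(["2009", "vacation", "beach"])

print(weighted_cosine(video_a, video_b))  # high: shared relevant tags
print(weighted_cosine(video_a, video_c))  # low: only a low-ranked tag shared
```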


International Conference on Multimedia and Expo | 2009

A degree-of-edit ranking for consumer generated video retrieval

Go Irie; Kota Hidaka; Takashi Satou; Toshihiko Yamasaki; Kiyoharu Aizawa

We introduce degree-of-edit (DoE) ranking, which focuses on how much a consumer generated video (CGV) is edited, as a ranking measure for CGV retrieval, and propose a method to estimate it. In the proposed method, the DoE score of a CGV is estimated from low-level features such as the number of shot boundaries and the time ratio of music. We evaluate the rank correlation between DoE rankings determined by subjects and by our method. To demonstrate its performance in a practical scenario, a user test is performed on over 22,000 CGVs in the context of CGV search. The results show that our method significantly improves conventional CGV ranking in terms of the availability of interesting, high-quality CGVs.
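
A minimal sketch of a DoE estimator along these lines: the paper names shot-boundary counts and the time ratio of music as cues; the third feature and all weights below are invented for illustration.

```python
# Hypothetical feature weights; the paper derives its own estimator.
FEATURE_WEIGHTS = {
    "shot_boundaries_per_min": 0.5,   # frequent cuts suggest heavy editing
    "music_time_ratio": 0.3,          # added background music
    "caption_time_ratio": 0.2,        # assumed additional cue
}

def doe_score(features):
    """Linear DoE estimate from per-video feature values normalized to
    roughly [0, 1]; higher means 'more edited'."""
    return sum(FEATURE_WEIGHTS[k] * features.get(k, 0.0)
               for k in FEATURE_WEIGHTS)

videos = {
    "raw_clip":    {"shot_boundaries_per_min": 0.1, "music_time_ratio": 0.0},
    "edited_vlog": {"shot_boundaries_per_min": 0.8, "music_time_ratio": 0.9,
                    "caption_time_ratio": 0.6},
}
# Rank CGVs by estimated degree of edit, most edited first.
for name in sorted(videos, key=lambda v: doe_score(videos[v]), reverse=True):
    print(name, round(doe_score(videos[name]), 2))
```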


ACM Multimedia | 2010

Automatic trailer generation

Go Irie; Takashi Satou; Akira Kojima; Toshihiko Yamasaki; Kiyoharu Aizawa

This paper presents a content-based movie trailer generation method named Vid2Trailer (V2T). Since trailers are intended to advertise movies, they must show specific symbols such as the title logo and the main theme music; moreover, a trailer is expected to attract viewers with its visual and audio content. V2T satisfies both requirements when creating a trailer from the original movie content. First, the title logo and the main theme music are extracted. Second, impressive speech and video segments are extracted using an affective content analysis technique. Third, all of the extracted components are concatenated into the form of a trailer; to realize this, we propose a method that estimates the affective impact of shot sequences and introduce an algorithm that arranges a set of shots so as to maximize the affective impact of the sequence. Experiments show that V2T is more appropriate for trailer generation than conventional techniques.
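
The shot-arrangement step can be illustrated with a toy search. The impact values and the sequence-scoring function below (penalizing drops in affective intensity between consecutive shots) are assumptions, not the paper's estimator; for a handful of shots, exhaustive permutation search suffices.

```python
from itertools import permutations

def sequence_impact(order, impact):
    """Total affective impact of a shot ordering: sum of per-shot impact
    minus a penalty for drops in intensity between consecutive shots."""
    total = sum(impact[s] for s in order)
    for a, b in zip(order, order[1:]):
        total -= 0.5 * max(0.0, impact[a] - impact[b])
    return total

# Hypothetical per-shot affective impact estimates.
impact = {"title_logo": 0.2, "chase": 0.9, "dialog": 0.4, "explosion": 1.0}

best = max(permutations(impact), key=lambda o: sequence_impact(o, impact))
print(best)  # builds from low to high impact: logo, dialog, chase, explosion
```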


ACM Multimedia | 2011

Image collection summarization for search result overviewing on mobile devices

Go Irie; Takashi Satou; Akira Kojima; Toshihiko Yamasaki; Kiyoharu Aizawa

Because of the small displays of mobile devices, overviewing an image search result that contains many and varied images is difficult. To provide an overview of thousands of images, recent studies have tried to develop a framework for image collection summarization that extracts a smaller set of representative images from the original set. Most existing methods take (a) relevance and (b) coverage of each image into account. However, for use on mobile devices, an important issue remains: generated summaries must be compact enough to fit small mobile displays while remaining sufficiently legible. Our focus in this paper is to extend the framework of image collection summarization to the context of overviewing image search results on mobile devices. The key advance of this paper is to introduce two additional factors, (c) compactness and (d) legibility, when generating summaries. Our solution is a two-stage optimization method. Given a keyword query and a display size, the first stage ranks the images by taking (a) relevance and (b) coverage into account. The second stage takes (c) compactness and (d) legibility into account and determines the number and sizes of the images included in the final summary so as to satisfy the display size constraint. Experiments conducted on over 240,000 images demonstrate the effectiveness of our method.
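
The second optimization stage can be sketched as a simple search over grid layouts. The display dimensions, legible size range, and the linear coverage/legibility objective below are assumptions for illustration, not the paper's formulation.

```python
DISPLAY_W, DISPLAY_H = 320, 480   # mobile display, pixels (assumed)
MIN_SIDE, MAX_SIDE = 60, 160      # legible thumbnail side range (assumed)

def best_layout(n_ranked, legibility_weight=0.4):
    """Try square-grid layouts over the allowed thumbnail sizes and keep
    the one with the best assumed coverage + legibility objective.
    Stage one is assumed to have already ranked n_ranked images by
    relevance and coverage."""
    best = None
    for side in range(MIN_SIDE, MAX_SIDE + 1, 10):
        capacity = (DISPLAY_W // side) * (DISPLAY_H // side)
        shown = min(capacity, n_ranked)
        coverage = shown / n_ranked        # more images shown
        legibility = side / MAX_SIDE       # larger thumbnails
        score = ((1 - legibility_weight) * coverage
                 + legibility_weight * legibility)
        if best is None or score > best[0]:
            best = (score, side, shown)
    return best

score, side, shown = best_layout(n_ranked=50)
print(f"show top {shown} images at {side}x{side} px (score {score:.2f})")
```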


Conference on Human Interface | 2007

A video digest and delivery system: ChocoParaTV

Kota Hidaka; Naoya Miyashita; Masaru Fujikawa; Masahiro Yuguchi; Takashi Satou; Katsuhiko Ogawa

This paper proposes a new video digest and delivery system, ChocoParaTV, for applications such as blogs that include video content and CGM sites. The video digests are generated by automatically extracting emphasized speech portions. To deliver the digest information and the digest itself to audiences, we also propose a new video digest description, CH-RSS, which can generate digested content dynamically to match the user's requested digest time. With the ChocoParaTV browser, users first view the digest and then visit the original web site if they want to watch the original content. The effectiveness of the proposed system was confirmed by a field experiment: about 40% of the users accessed the original web sites through the ChocoParaTV browser digests, indicating that our method is effective in making video content attractive to audiences.
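
A minimal sketch of assembling a digest to a requested length, as CH-RSS is described to allow: emphasized-speech segments are assumed to be pre-extracted with emphasis scores, and the greedy fit below is an illustrative stand-in for the system's actual selection logic.

```python
def build_digest(segments, requested_sec):
    """Greedily keep the most emphasized segments that fit the requested
    digest time, then restore their original (chronological) order.
    segments: list of (start_sec, duration_sec, emphasis_score)."""
    chosen, used = [], 0.0
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        if used + seg[1] <= requested_sec:
            chosen.append(seg)
            used += seg[1]
    return sorted(chosen)  # chronological playback order

# Hypothetical pre-extracted emphasized-speech segments.
segments = [(12.0, 8.0, 0.9), (40.0, 15.0, 0.6),
            (75.0, 5.0, 0.8), (120.0, 10.0, 0.4)]
for start, dur, score in build_digest(segments, requested_sec=20.0):
    print(f"play {start:.0f}s for {dur:.0f}s (emphasis {score:.1f})")
```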


The Journal of The Institute of Image Information and Television Engineers | 2010

R: A Statistical Analysis Software

Go Irie; Takashi Satou


Proceedings of the 34th Annual Conference of the Institute of Image Electronics Engineers of Japan (Media Computing Conference 2006) | 2006

ChocoParaTV: A Video Digesting and Delivery System

Naoya Miyashita; Masahiro Yuguchi; Kota Hidaka; Takashi Satou; Kenichiro Shimokura

Collaboration


Dive into Takashi Satou's collaborations.

Top Co-Authors

Go Irie (Nippon Telegraph and Telephone)
Kota Hidaka (Nippon Telegraph and Telephone)
Naoya Miyashita (Nippon Telegraph and Telephone)
Masaru Fujikawa (Nippon Telegraph and Telephone)
Katsuhiko Ogawa (Nippon Telegraph and Telephone)
Taiga Yoshida (Nippon Telegraph and Telephone)