Katsuhiko Ishiguro
Nippon Telegraph and Telephone
Publications
Featured research published by Katsuhiko Ishiguro.
international conference on pervasive computing | 2010
Takuya Maekawa; Yutaka Yanagisawa; Yasue Kishino; Katsuhiko Ishiguro; Koji Kamei; Yasushi Sakurai; Takeshi Okadome
This paper describes how we recognize activities of daily living (ADLs) with our designed sensor device, which is equipped with heterogeneous sensors such as a camera, a microphone, and an accelerometer and attached to a user's wrist. Specifically, capturing the space around the user's hand by employing the camera on the wrist-mounted device enables us to recognize ADLs that involve the manual use of objects, such as making tea or coffee and watering plants. Existing wearable sensor devices equipped only with a microphone and an accelerometer cannot recognize these ADLs without object-embedded sensors. We also propose an ADL recognition method that takes privacy issues into account, because the camera and microphone can capture aspects of a user's private life. We confirmed experimentally that the incorporation of a camera could significantly improve the accuracy of ADL recognition.
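The fusion of heterogeneous wearable sensors can be sketched in a minimal form: concatenate per-modality feature vectors and classify with a nearest-centroid rule. This is an illustration only, not the paper's method; the feature values, activity labels, and centroids below are all invented.

```python
# Hypothetical sketch: fusing camera, microphone, and accelerometer
# features by concatenation, then labeling the activity with a
# nearest-centroid rule. All numbers and labels here are invented.

def fuse(camera_feat, audio_feat, accel_feat):
    """Concatenate per-modality feature vectors into one descriptor."""
    return camera_feat + audio_feat + accel_feat

def nearest_centroid(x, centroids):
    """Return the label of the closest class centroid (squared L2)."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: d2(x, centroids[label]))

centroids = {
    "making_tea": [0.9, 0.1, 0.2, 0.8, 0.1, 0.3],
    "watering_plants": [0.2, 0.8, 0.7, 0.1, 0.9, 0.4],
}
x = fuse([0.85, 0.15], [0.25, 0.75], [0.12, 0.35])
print(nearest_centroid(x, centroids))  # prints "making_tea"
```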
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Takuma Otsuka; Katsuhiko Ishiguro; Hiroshi Sawada; Hiroshi G. Okuno
Sound source localization and separation from a mixture of sounds are essential functions for computational auditory scene analysis. The main challenges are designing a unified framework for joint optimization and estimating the sound sources under auditory uncertainties such as reverberation or unknown number of sounds. Since sound source localization and separation are mutually dependent, their simultaneous estimation is required for better and more robust performance. A unified model is presented for sound source localization and separation based on Bayesian nonparametrics. Experiments using simulated and recorded audio mixtures show that a method based on this model achieves state-of-the-art sound source separation quality and has more robust performance on the source number estimation under reverberant environments.
international conference on data mining | 2013
Koh Takeuchi; Ryota Tomioka; Katsuhiko Ishiguro; Akisato Kimura; Hiroshi Sawada
Non-negative Tensor Factorization (NTF) is a widely used technique for decomposing a non-negative value tensor into sparse and reasonably interpretable factors. However, NTF performs poorly when the tensor is extremely sparse, which is often the case with real-world data and higher-order tensors. In this paper, we propose Non-negative Multiple Tensor Factorization (NMTF), which factorizes the target tensor and auxiliary tensors simultaneously. Auxiliary data tensors compensate for the sparseness of the target data tensor. The factors of the auxiliary tensors also allow us to examine the target data from several different aspects. We experimentally confirm that NMTF performs better than NTF in terms of reconstructing the given data. Furthermore, we demonstrate that the proposed NMTF can successfully extract spatio-temporal patterns of people's daily lives, such as leisure, drinking, and shopping activities, by analyzing several tensors extracted from online review data sets.
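The core idea of joint factorization can be sketched in the simplest (order-2) case: a sparse target matrix X and an auxiliary matrix Y share one factor W, so observations in Y compensate for sparsity in X. This is a generic multiplicative-update sketch of that idea, not the paper's NMTF algorithm; all toy data is invented.

```python
# Sketch (not the paper's algorithm) of joint non-negative factorization:
# X ~ W H and Y ~ W G with a shared row factor W, fit by Lee-Seung-style
# multiplicative updates for the summed squared error.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def update(F, num, den, eps=1e-9):
    """One multiplicative update: F <- F * num / den, elementwise."""
    return [[f * n / (d + eps) for f, n, d in zip(fr, nr, dr)]
            for fr, nr, dr in zip(F, num, den)]

def joint_nmf(X, Y, W, H, G, iters=200):
    """Minimize ||X - WH||^2 + ||Y - WG||^2 over non-negative factors."""
    for _ in range(iters):
        H = update(H, matmul(transpose(W), X),
                   matmul(matmul(transpose(W), W), H))
        G = update(G, matmul(transpose(W), Y),
                   matmul(matmul(transpose(W), W), G))
        num = [[a + b for a, b in zip(r1, r2)]
               for r1, r2 in zip(matmul(X, transpose(H)),
                                 matmul(Y, transpose(G)))]
        den = matmul(W, [[a + b for a, b in zip(r1, r2)]
                         for r1, r2 in zip(matmul(H, transpose(H)),
                                           matmul(G, transpose(G)))])
        W = update(W, num, den)
    return W, H, G

# invented toy data: X is the target, Y the auxiliary observation
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 1.0], [0.0, 2.0]]
W, H, G = joint_nmf(X, Y,
                    W=[[0.6, 0.3], [0.2, 0.7]],
                    H=[[0.5, 0.3], [0.2, 0.6]],
                    G=[[0.4, 0.5], [0.3, 0.6]])
```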
international conference on data mining | 2012
Katsuhiko Ishiguro; Akisato Kimura; Koh Takeuchi
The amount and variety of multimedia data such as images, movies and music available on social networks are increasing rapidly. However, the ability to analyze and exploit these unorganized multimedia data remains inadequate, even with state-of-the-art media processing techniques. Our finding in this paper is that the emerging social curation service is a promising information source for the automatic understanding and mining of images distributed and exchanged via social media. One remarkable virtue of social curation service datasets is that they are weakly supervised: the content in the service is manually collected, selected and maintained by users. This is very different from other social information sources, and we can utilize this characteristic for media content mining without expensive media processing techniques. In this paper we present a machine learning system for predicting view counts of images in social curation data as the first step toward automatic image content evaluation. Our experiments confirm that the simple features extracted from a social curation corpus are far better for count prediction than the gold-standard image features of computer vision research.
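The prediction setup can be illustrated with a deliberately tiny stand-in: regress log view count on a single curation-side signal. The single feature (times an image was curated), the toy data, and the closed-form linear fit below are all invented for illustration; the paper's system uses richer curation features.

```python
# Illustrative sketch only: predicting an image's view count from one
# hypothetical curation signal via ordinary least squares on log counts.
import math

def fit_line(xs, ys):
    """Closed-form OLS for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# invented training pairs: (times curated, log view count)
curated = [1, 2, 4, 8, 16]
log_views = [math.log(v) for v in (30, 60, 130, 240, 500)]
a, b = fit_line(curated, log_views)

def predict_views(times_curated):
    return math.exp(a * times_curated + b)
```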
acm multimedia | 2013
Akisato Kimura; Katsuhiko Ishiguro; Makoto Yamada; Alejandro Marcos Alvarez; Kaori Kataoka; Kazuhiko Murasaki
This paper proposes a novel method of discovering a set of image contents sharing a specific context (attributes or implicit meaning) with the help of image collections obtained from social curation platforms. Socially curated contents are promising for analyzing various kinds of multimedia information, since they are manually filtered and organized based on specific individual preferences, interests or perspectives. Our proposed method fully exploits the process of social curation: (1) how image contents are manually grouped together by users, and (2) how image contents are distributed in the platform. Our method reveals that image contents with a specific context are naturally grouped together and that every image content includes diverse contexts that cannot necessarily be verbalized in text. A preliminary experiment with a collection of a million images yields a promising result.
british machine vision conference | 2010
Katsuhiko Ishiguro; Hiroshi Sawada; Hitoshi Sakano
Consider the problem of driver behavior recognition from images captured by a camera installed in a vehicle [4]. Recognition of driver behavior is crucial for driver assistance systems that make driving comfortable and safe. One notable requirement for real applications is that we would like to predict and classify a behavior as quickly as possible: if we detect a sign of dangerous movements, such as mobile phone use while driving, we would like to warn the driver quickly before the behavior causes any accidents. This kind of classification task is called "early classification (recognition)," and is important for many practical problems including on-line handwritten character recognition and speech recognition systems. In this paper, we focus on one of the most famous discriminative models, Adaboost [1, 2], and extend it for early classification of sequences. While existing studies (e.g. [5, 6]) have considered only the binary classification problem, we present a multi-class extension of Adaboost for early classification, called Earlyboost.MH (Fig. 1). Specifically, we propose an efficient multi-class Adaboost for early classification by combining multi-class Adaboost.MH [3] with the early classification boosting method Earlyboost [6].
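The early-classification decision rule can be sketched as follows: per-frame weak scorers (stand-ins for trained AdaBoost.MH weak learners) accumulate votes for each class, and the classifier commits as soon as the leading class's margin over the runner-up exceeds a threshold, instead of waiting for the whole sequence. The scorer, class names, and input sequence below are invented; this omits the boosting-based training entirely.

```python
# Hedged sketch of early classification: accumulate weighted per-frame
# class scores and stop at the earliest step with a confident margin.

def early_classify(sequence, weak_scorers, margin=1.0):
    """Return (label, frames_used) at the earliest confident step."""
    classes = {c for scorer in weak_scorers for c in scorer(sequence[0])}
    totals = {c: 0.0 for c in classes}
    for t, frame in enumerate(sequence, start=1):
        for scorer in weak_scorers:
            for c, s in scorer(frame).items():
                totals[c] += s
        ranked = sorted(totals.values(), reverse=True)
        if ranked[0] - ranked[1] >= margin:
            break  # confident enough: commit early
    return max(totals, key=totals.get), t

# invented stand-in for a trained weak learner over scalar "frames"
driving = lambda f: ({"phone_use": 1.0, "normal": 0.0} if f > 0.5
                     else {"phone_use": 0.0, "normal": 1.0})
label, t = early_classify([0.9, 0.8, 0.2, 0.9, 0.7], [driving], margin=2.0)
print(label, t)  # prints "phone_use 2": decided after only 2 of 5 frames
```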
computer vision and pattern recognition | 2008
Katsuhiko Ishiguro; Takeshi Yamada; Naonori Ueda
In this paper, we present a novel on-line probabilistic generative model that simultaneously deals with both the clustering and the tracking of an unknown number of moving objects. The proposed model assumes that i) time series data are composed of a time-varying number of objects and that ii) each object is governed by a mixture of an unknown number of different patterns of dynamics. The problem of learning patterns of dynamics is formulated as the clustering of tracked objects based on a nonparametric Bayesian model with conjugate priors, and this clustering in turn improves the tracking. We present a particle filter for posterior estimation of simultaneous clustering and tracking. Through experiments with synthetic and real movie data, we confirmed that the proposed model successfully learned the hidden cluster patterns and obtained better tracking results than conventional models without clustering.
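The posterior-estimation machinery the paper builds on can be illustrated with a generic bootstrap particle filter for 1-D tracking. This toy omits the paper's key contribution (nonparametric Bayesian clustering of objects by their dynamics, and the varying object count); the dynamics and noise levels are assumptions chosen for illustration.

```python
# Generic bootstrap particle filter for a 1-D random-walk target:
# predict with the dynamics model, weight by the observation likelihood,
# then resample proportionally to the weights.
import math
import random

def particle_filter(observations, n_particles=500,
                    proc_std=1.0, obs_std=1.0, seed=0):
    rng = random.Random(seed)
    particles = [rng.gauss(observations[0], obs_std)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # predict: propagate each particle through the dynamics model
        particles = [p + rng.gauss(0.0, proc_std) for p in particles]
        # weight: Gaussian likelihood of the observation for each particle
        weights = [math.exp(-((z - p) ** 2) / (2 * obs_std ** 2))
                   for p in particles]
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]
        # posterior-mean state estimate at this step
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # resample: multinomial resampling proportional to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

track = particle_filter([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```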
Publications of the Astronomical Society of Japan | 2016
Mikio Morii; Shiro Ikeda; Nozomu Tominaga; Masaomi Tanaka; Katsuhiko Ishiguro; Junji Yamato; Naonori Ueda; Naotaka Suzuki; Naoki Yasuda; Naoki Yoshida
We present an application of machine-learning (ML) techniques to source selection in the optical transient survey data with Hyper Suprime-Cam (HSC) on the Subaru telescope. Our goal is to select real transient events accurately and in a timely manner out of a large number of false candidates obtained with the standard difference-imaging method. We have developed a transient selector based on majority voting among three ML classifiers: AUC boosting, random forest, and a deep neural network. We applied it to our observing runs of Subaru-HSC in 2015 May and August, and showed it to be efficient in selecting optical transients. The false positive rate was 1.0% at a true positive rate of 90% in the magnitude range of 22.0-25.0 mag for the former data. For the latter run, we successfully detected and reported ten supernova candidates within the same day as the observation. From these runs, we learned the following lessons: (1) training with artificial objects is effective in filtering out false candidates, especially for faint objects, and (2) combining ML classifiers by majority voting is advantageous.
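The ensemble rule itself is simple and can be sketched directly; the three base learners below are hypothetical stand-in functions, not the trained machines, and the candidate features (`snr`, `fwhm_ratio`) and thresholds are invented.

```python
# Sketch of the majority-voting rule only: a candidate is kept as a real
# transient if at least 2 of the 3 classifiers vote "real".

def majority_vote(classifiers, candidate):
    votes = sum(1 for clf in classifiers if clf(candidate) == "real")
    return "real" if votes >= 2 else "bogus"

# invented stand-ins for the three trained machines
auc_boost = lambda c: "real" if c["snr"] > 5 else "bogus"
rand_forest = lambda c: "real" if c["fwhm_ratio"] < 1.5 else "bogus"
deep_net = lambda c: ("real" if c["snr"] > 4 and c["fwhm_ratio"] < 2
                      else "bogus")

machines = [auc_boost, rand_forest, deep_net]
print(majority_vote(machines, {"snr": 6.0, "fwhm_ratio": 1.2}))  # real
```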
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Takuma Otsuka; Katsuhiko Ishiguro; Takuya Yoshioka; Hiroshi Sawada; Hiroshi G. Okuno
Multichannel signal processing using a microphone array provides fundamental functions for coping with multi-source situations, such as sound source localization and separation, that are needed to extract the auditory information for each source. Auditory uncertainties about the degree of reverberation and the number of sources are known to degrade performance or limit the practical application of microphone array processing. Such uncertainties must therefore be overcome to realize general and robust microphone array processing. These uncertainty issues have been only partly addressed: existing methods focus on either source number uncertainty or the reverberation issue, and joint separation and dereverberation has been achieved only for overdetermined conditions. This paper presents an all-round method that achieves source separation and dereverberation for an arbitrary number of sources, including underdetermined conditions. Our method uses Bayesian nonparametrics, whose infinitely extensible modeling flexibility bypasses the model selection problem in separation and dereverberation caused by source number uncertainty. Evaluation using a dereverberation and separation task with various numbers of sources, including underdetermined conditions, demonstrates that (1) our method is applicable to the separation and dereverberation of underdetermined mixtures, and that (2) the source extraction performance is comparable to that of a state-of-the-art method suitable only for overdetermined conditions.
intelligent robots and systems | 2012
Takuma Otsuka; Katsuhiko Ishiguro; Hiroshi Sawada; Hiroshi G. Okuno
Existing auditory functions for robots, such as sound source localization and separation, have been implemented in a cascaded framework whose overall performance may be degraded by any failure in its subsystems. These approaches often require careful, environment-dependent tuning of each subsystem to achieve better performance. This paper presents a unified framework for sound source localization and separation in which the whole system is integrated as a Bayesian topic model. This method improves both localization and separation with a common configuration under various environments through iterative inference using Gibbs sampling. Experimental results from three environments with different reverberation times confirm that our method outperforms state-of-the-art sound source separation methods, especially in reverberant environments, and shows localization performance comparable to that of the existing robot audition system.