Tetsuro Morimura | Researchain

Publication

Featured research published by Tetsuro Morimura.

Neural Computation | 2010

Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning

Tetsuro Morimura; Eiji Uchibe; Junichiro Yoshimoto; Jan Peters; Kenji Doya

Most conventional policy gradient reinforcement learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the policy parameter. That term involves the derivative of the stationary state distribution, which corresponds to the sensitivity of that distribution to changes in the policy parameter. Although the bias introduced by this omission can be reduced by setting the forgetting rate for the value functions close to 1, these algorithms do not permit the forgetting rate to be set exactly at 1. In this article, we propose a method for estimating the log stationary state distribution derivative (LSD) as a useful form of the derivative of the stationary state distribution through a backward Markov chain formulation and a temporal difference learning framework. A new policy gradient (PG) framework with an LSD is also proposed, in which the average reward gradient can be estimated with the forgetting rate set to 0, so it becomes unnecessary to learn the value functions. We also test the performance of the proposed algorithms using simple benchmark tasks and show that they can improve the performance of existing PG methods.
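The role of the stationary distribution term can be made concrete with a short derivation. The notation below is an assumption introduced for illustration and is not taken from the abstract: d_θ(s) is the stationary state distribution, π_θ(a|s) the policy, r(s,a) the reward, and η(θ) the average reward. Applying the product rule and the identity ∇f = f ∇ln f gives:

```latex
\begin{align}
\eta(\theta) &= \sum_{s}\sum_{a} d_{\theta}(s)\,\pi_{\theta}(a \mid s)\,r(s,a), \\
\nabla_{\theta}\eta(\theta) &= \sum_{s}\sum_{a} d_{\theta}(s)\,\pi_{\theta}(a \mid s)
  \bigl[\nabla_{\theta}\ln d_{\theta}(s) + \nabla_{\theta}\ln \pi_{\theta}(a \mid s)\bigr]\,r(s,a).
\end{align}
```

The first bracketed term is the log stationary distribution derivative; it is the term that the abstract says most conventional algorithms do not use explicitly.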

International Conference on Robotics and Automation | 2009

Least absolute policy iteration for robust value function approximation

Masashi Sugiyama; Hirotaka Hachiya; Hisashi Kashima; Tetsuro Morimura

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss for enhancing robustness and reliability. The proposed method is formulated as a linear programming problem which can be solved efficiently by standard optimization software, so the computational advantage is not sacrificed for gaining robustness and reliability. We demonstrate the usefulness of the proposed approach through simulated robot-control tasks.
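As a hedged sketch of why an absolute loss leads to a linear program, consider a generic least absolute deviation fit. The symbols are assumptions for illustration (feature vectors ψ_i, targets r_i, parameters θ, slack variables b_i), and the paper's exact formulation may differ:

```latex
\begin{align}
\min_{\theta}\ \sum_{i=1}^{N} \bigl|\theta^{\top}\psi_i - r_i\bigr|
\quad\Longleftrightarrow\quad
\min_{\theta,\,b}\ \sum_{i=1}^{N} b_i
\quad \text{s.t.}\quad -b_i \le \theta^{\top}\psi_i - r_i \le b_i,\quad i = 1,\dots,N.
\end{align}
```

At the optimum each slack b_i equals the absolute residual, so the objective and constraints are all linear in (θ, b). Because each residual contributes linearly rather than quadratically, a single outlying reward has less influence on the fit than under a squared loss.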

IEEE Transactions on Intelligent Transportation Systems | 2017

City-Wide Traffic Flow Estimation From a Limited Number of Low-Quality Cameras

Tsuyoshi Idé; Takayuki Katsuki; Tetsuro Morimura; Robert J. T. Morris

We present a new approach to lightweight intelligent transportation systems. Our approach does not rely on traditional expensive infrastructures, but rather on advanced machine learning algorithms. It takes images from traffic cameras at a limited number of locations and estimates the traffic over the entire road network. Our approach features two main algorithms. The first is a probabilistic vehicle counting algorithm from low-quality images that falls into the category of unsupervised learning. The other is a network inference algorithm based on an inverse Markov chain formulation that infers the traffic at arbitrary links from a limited number of observations. We evaluated our approach on two different traffic data sets, one acquired in Nairobi, Kenya, and the other in Kyoto, Japan.

International Conference on Pattern Recognition | 2016

Unsupervised object counting without object recognition

Takayuki Katsuki; Tetsuro Morimura; Tsuyoshi Idé

This paper addresses the problem of object counting, which is to estimate the number of objects of interest from an input observation. We formalize the problem as a posterior inference of the count by introducing a particular type of Gaussian mixture for the input observation, whose mixture indexes correspond to the count. Unlike existing approaches in image analysis, which typically perform explicit object detection using labeled training images, our approach does not need any labeled training data. Our idea is to use the stick-breaking process as a constraint to make it possible to interpret the mixture indexes as the count. We apply our method to the problem of counting vehicles in real-world web camera images and demonstrate that the accuracy and robustness of the proposed approach without any labeled training data are comparable to those of supervised alternatives.
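For readers unfamiliar with the stick-breaking process mentioned above, the following is a minimal sketch of the generic construction only. How the paper uses it as a constraint on the mixture indexes is not specified in the abstract, and the function name and the fixed break fractions are hypothetical (in practice the fractions are usually random draws, for example from a Beta distribution).

```python
def stick_breaking_weights(fractions):
    """Turn break fractions v_k in [0, 1] into mixture weights.

    Weight k is v_k times the stick length left after the first k-1 breaks:
    w_k = v_k * prod_{j<k} (1 - v_j). The unbroken remainder is returned too.
    """
    weights = []
    remaining = 1.0  # length of stick not yet assigned to any component
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("each fraction must lie in [0, 1]")
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


# Hypothetical fixed fractions, chosen so the arithmetic is easy to follow.
weights, leftover = stick_breaking_weights([0.5, 0.5, 0.5])
print(weights, leftover)
```

The weights and the leftover always sum to 1, so truncating after finitely many breaks still yields a valid (sub)distribution over component indexes.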

Artificial Life and Robotics | 2008

Natural actor-critic with baseline adjustment for variance reduction

Tetsuro Morimura; Eiji Uchibe; Kenji Doya

In this study, we discuss a baseline function for the estimation of a natural policy gradient with respect to variance, and demonstrate a condition under which an optimal baseline function that reduces the variance is equivalent to the state value function. Outside of this condition, however, the state value could be considerably different from the optimal baseline. For such cases, an extended version of the NTD algorithm is proposed, in which an auxiliary function is estimated to adjust the baseline (the state value estimate in the original NTD version) toward the optimal baseline. The proposed algorithm is applied to simple MDPs and a challenging pendulum swing-up problem.
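The effect of a baseline on variance can be illustrated with a deliberately small example. This sketch uses the ordinary (not natural) policy gradient on a hypothetical two-armed bandit with a sigmoid policy; it is not the NTD algorithm from the paper, and all names and reward values are assumptions.

```python
import math


def gradient_stats(theta, rewards, baseline):
    """Exact mean and variance of the score-function gradient estimator
    g(a) = d/dtheta log pi(a) * (r_a - baseline) for a two-armed bandit
    whose policy picks arm 1 with probability sigmoid(theta)."""
    p1 = 1.0 / (1.0 + math.exp(-theta))
    probs = [1.0 - p1, p1]
    scores = [-p1, 1.0 - p1]  # derivative of log pi(a) w.r.t. theta
    samples = [scores[a] * (rewards[a] - baseline) for a in (0, 1)]
    mean = sum(probs[a] * samples[a] for a in (0, 1))
    var = sum(probs[a] * (samples[a] - mean) ** 2 for a in (0, 1))
    return mean, var


rewards = [1.0, 3.0]          # hypothetical deterministic rewards
value = 0.5 * 1.0 + 0.5 * 3.0  # expected reward at theta = 0
mean_no_b, var_no_b = gradient_stats(0.0, rewards, baseline=0.0)
mean_val_b, var_val_b = gradient_stats(0.0, rewards, baseline=value)
print(mean_no_b, var_no_b, mean_val_b, var_val_b)
```

In this symmetric setting the expected gradient is the same with either baseline, while the variance drops when the baseline equals the expected reward. The abstract's point is that this equivalence between the value and the optimal baseline holds only under a condition, so other settings may behave differently.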

Winter Simulation Conference | 2014

Frugal signal control using low resolution web-camera and traffic flow estimation

Kumiko Maeda; Tetsuro Morimura; Takayuki Katsuki; Masayoshi Teraguchi

Due to rapid urbanization, large cities in developing countries have problems with heavy traffic congestion. International aid is being provided to construct modern traffic signal infrastructure, but such infrastructure often does not work well because of high operating and maintenance costs and the limited knowledge of local engineers. In this paper, we propose a frugal signal control framework that uses image analysis to estimate traffic flows. It requires only low-cost Web cameras to support a signal control strategy based on the current traffic volume. We can estimate the traffic volumes of the roads near the traffic signals from a few observed points and then adjust the signal control. Through numerical experiments, we confirmed that the proposed framework can reduce the average travel time by 20.6% compared to fixed-time signal control, even though the Web cameras are located 500 m away from intersections.

IEEE Transactions on Intelligent Transportation Systems | 2017

Traffic Velocity Estimation From Vehicle Count Sequences

Takayuki Katsuki; Tetsuro Morimura; Masato Inoue

Traffic velocity is a fundamental metric for inferring traffic conditions. This paper proposes a new approach for estimating velocity from temporal sequences of vehicle counts that does not require tracking any vehicles or using any labeled data. It is useful for measuring traffic velocities with low-quality, inexpensive sensors such as web cameras in general use. We formalize the task as a density estimation problem by introducing a new model for temporal sequences of vehicle counts wherein the correlation between the sequences is directly related to the traffic velocity. We also derive a sampling-based algorithm for the density estimation. We show the effectiveness of our method on artificial and real-world data sets.
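The intuition that correlation between count sequences carries velocity information can be sketched with a simple cross-correlation heuristic. This is not the paper's density-estimation model or its sampling algorithm; the function names, the two-sensor setup, the sensor spacing, and the sampling interval are all assumptions for illustration.

```python
def best_lag(upstream, downstream, max_lag):
    """Return the positive lag (in samples) at which the mean-centered
    cross-correlation between the two count sequences is largest."""
    if len(upstream) != len(downstream):
        raise ValueError("sequences must have equal length")
    mu_u = sum(upstream) / len(upstream)
    mu_d = sum(downstream) / len(downstream)
    best, best_score = None, float("-inf")
    for lag in range(1, max_lag + 1):
        score = sum(
            (upstream[t] - mu_u) * (downstream[t + lag] - mu_d)
            for t in range(len(upstream) - lag)
        )
        if score > best_score:
            best, best_score = lag, score
    return best


def estimate_velocity(upstream, downstream, distance_m, dt_s, max_lag):
    """Velocity = sensor spacing / travel time implied by the best lag."""
    lag = best_lag(upstream, downstream, max_lag)
    return distance_m / (lag * dt_s)


# Hypothetical counts: the downstream sequence is the upstream one delayed by 2 samples.
upstream = [0, 5, 1, 0, 4, 0, 0, 2, 0, 0]
downstream = [0, 0, 0, 5, 1, 0, 4, 0, 0, 2]
lag = best_lag(upstream, downstream, max_lag=4)
velocity = estimate_velocity(upstream, downstream, distance_m=100.0, dt_s=5.0, max_lag=4)
print(lag, velocity)
```

A point estimate like this is fragile on noisy counts, which is one reason a probabilistic model over the sequences may be preferable.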

International Conference on Pattern Recognition | 2016

Automated help system for novice older users from touchscreen gestures

Daisuke Sato; Tetsuro Morimura; Takayuki Katsuki; Yosuke Toyota; Tsuneo Kato; Hironobu Takagi

Older adults who have never used a smartphone often struggle to get used to smartphone gestures because they lack basic knowledge of or skills with recent technologies such as gesture-oriented touchscreens. In this paper, we propose a user modeling method for inferring the problems novice smartphone users face from their touchscreen gestures. The output of the user model is used by an automated help system that enables them to acquire touchscreen gestures. We apply a feature extraction approach based on frequent pattern mining of gesture sequences to the user modeling. The learned user model detects types of problems in real time and is used for automated help. To optimize the timing and selection of instructions, we use a Bayesian reinforcement learning approach, which balances the exploration-exploitation trade-off. We evaluate the effectiveness of the method by using a prototype assistant system for a map application. The evaluation with older (60+) novice users showed positive results. The performance of the prototype system and the potential for further application are discussed.

Winter Simulation Conference | 2014

A multi-objective genetic algorithm using intermediate features of simulations

Hidemasa Muta; Rudy Raymond; Satoshi Hara; Tetsuro Morimura

This paper proposes using intermediate features of traffic simulations in a genetic algorithm designed to find the best scenarios for regulating traffic with multiple objectives. A challenge in genetic algorithms for multi-objective optimization is how to find various optimal scenarios within a limited decision time. Typical evolutionary algorithms maintain a population of diversified scenarios whose diversity is measured only by the final objectives available at the end of their simulations. We propose also measuring the diversity by the time series of the objectives during the simulations. The intuition is that simulation scenarios with similar final objective values may contain different series of discrete events that, when combined, can result in better scenarios. We provide empirical evidence from experiments with agent-based traffic simulations showing the superiority of the proposed genetic algorithm over standard approaches in approximating Pareto fronts.
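Since the abstract evaluates how well Pareto fronts are approximated, a minimal sketch of extracting the non-dominated set may help. It assumes every objective is minimized and uses hypothetical two-objective scores; it does not reproduce the genetic algorithm or the diversity measure proposed in the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (travel time, emissions) scores for five scenarios.
scenarios = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(scenarios)
print(front)
```

This pairwise filter is quadratic in the number of points, which is fine for small populations but may need a faster method for large ones.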

European Conference on Artificial Intelligence | 2014

Probabilistic two-level anomaly detection for correlated systems

Bin Tong; Tetsuro Morimura; Einoshin Suzuki; Tsuyoshi Idé

We propose a novel probabilistic semi-supervised anomaly detection framework for multi-dimensional systems with high correlation among variables. Our method is able to identify both abnormal instances and abnormal variables of an instance.
