Ryohei Fujimaki
University of Tokyo
Publications
Featured research published by Ryohei Fujimaki.
IEEE International Conference on Space Mission Challenges for Information Technology | 2006
Takehisa Yairi; Yoshinobu Kawahara; Ryohei Fujimaki; Yuichi Sato; Kazuo Machida
For any space mission, safety and reliability are the most important issues. To address this problem, we have studied anomaly detection and fault diagnosis methods for spacecraft systems based on machine learning (ML) and data mining (DM) technology. In these methods, the knowledge or model necessary for monitoring a spacecraft system is (semi-)automatically acquired from spacecraft telemetry data. In this paper, we first give an overview of the anomaly detection/diagnosis problem in spacecraft systems and of conventional techniques such as limit checking, expert systems, and model-based diagnosis. We then explain the concept of the ML/DM-based approach to this problem and introduce several anomaly detection/diagnosis methods that we have developed.
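As a point of reference for the conventional approach mentioned above, a limit check simply compares each telemetry channel against fixed engineering thresholds. The minimal sketch below illustrates the idea; the channel names and limits are made up for illustration.

```python
# Minimal illustration of the conventional limit-check baseline: each telemetry
# channel is compared against fixed lower/upper limits set by engineers.
# Channel names and thresholds are hypothetical examples.
def limit_check(sample, limits):
    """Return the channels whose values fall outside their allowed range."""
    return [name for name, value in sample.items()
            if not (limits[name][0] <= value <= limits[name][1])]

limits = {"battery_voltage": (24.0, 32.0), "panel_temp": (-40.0, 85.0)}
sample = {"battery_voltage": 23.1, "panel_temp": 20.5}
print(limit_check(sample, limits))  # -> ['battery_voltage']
```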
Knowledge Discovery and Data Mining | 2005
Ryohei Fujimaki; Takehisa Yairi; Kazuo Machida
This paper proposes a novel anomaly detection system for spacecraft based on data mining techniques. It constructs a nonlinear probabilistic model of spacecraft behavior by applying relevance vector regression and autoregression to massive telemetry data, and then monitors the online telemetry data using the model and detects anomalies. A major advantage over conventional anomaly detection methods is that this approach requires little a priori knowledge of the system.
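The general recipe described here (fit a probabilistic regression model to telemetry in an autoregressive fashion, then flag points the model finds unlikely) can be sketched as follows. This is not the paper's implementation: scikit-learn's BayesianRidge stands in for relevance vector regression, the telemetry is a synthetic sinusoid, and the four-standard-deviation threshold is an arbitrary choice.

```python
# Hedged sketch: autoregressive anomaly detection on a single telemetry channel.
import numpy as np
from sklearn.linear_model import BayesianRidge

def lag_matrix(series, order=5):
    """Build autoregressive features: predict x[t] from x[t-order], ..., x[t-1]."""
    X = np.column_stack([series[i:len(series) - order + i] for i in range(order)])
    y = series[order:]
    return X, y

rng = np.random.default_rng(0)
train = np.sin(np.linspace(0, 20, 500)) + 0.05 * rng.standard_normal(500)
test = np.sin(np.linspace(20, 30, 250)) + 0.05 * rng.standard_normal(250)
test[120:130] += 1.0  # injected anomaly

model = BayesianRidge()
X_tr, y_tr = lag_matrix(train)
model.fit(X_tr, y_tr)

X_te, y_te = lag_matrix(test)
mean, std = model.predict(X_te, return_std=True)
# Flag samples whose observed value lies far outside the predictive distribution.
anomalous = np.abs(y_te - mean) > 4.0 * std
print("anomalous indices:", np.flatnonzero(anomalous))
```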
Knowledge Discovery and Data Mining | 2015
Jialei Wang; Ryohei Fujimaki; Yosuke Motohashi
Model interpretability has been recognized to play a key role in practical data mining. Interpretable models provide significant insights into data and model behavior and may convince end-users to employ certain models. In return for these advantages, however, there is generally a sacrifice in accuracy: the flexibility of the model representation (e.g., linear, rule-based, etc.) and the model complexity need to be restricted so that users can understand the results. This paper proposes oblique treed sparse additive models (OT-SpAMs). Our main focus is on developing a model that sacrifices a certain degree of interpretability but achieves accuracy competitive with fully non-linear models such as kernel support vector machines (SVMs). OT-SpAMs are instances of region-specific predictive models. They divide feature spaces into regions with sparse oblique tree splitting and assign local sparse additive experts to individual regions. In order to maintain OT-SpAM interpretability, the overall model structure has to be kept simple, which produces simultaneous model selection issues for the sparse oblique region structure and the sparse local experts. We address this problem by extending factorized asymptotic Bayesian inference. We demonstrate, on simulation, benchmark, and real-world datasets, that in terms of accuracy OT-SpAMs outperform state-of-the-art interpretable models and perform competitively with kernel SVMs, while still providing results that are highly understandable.
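The region-specific idea can be illustrated with a deliberately simplified sketch: partition the feature space with a shallow tree and fit a sparse linear model in each region. This is not OT-SpAM itself; the splits here are axis-aligned rather than oblique, the local experts are plain Lasso models rather than sparse additive models, and no factorized asymptotic Bayesian model selection is performed.

```python
# Hedged sketch of region-specific predictive modeling:
# shallow tree partition + one sparse (L1) local expert per region.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Lasso

def fit_region_experts(X, y, max_leaf_nodes=4, alpha=0.01):
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes).fit(X, y)
    leaves = tree.apply(X)
    experts = {}
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        experts[leaf] = Lasso(alpha=alpha).fit(X[mask], y[mask])
    return tree, experts

def predict(tree, experts, X):
    leaves = tree.apply(X)
    return np.array([experts[l].predict(x[None, :])[0] for l, x in zip(leaves, X)])

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 5))
y = np.where(X[:, 0] > 0, 2.0 * X[:, 1], -3.0 * X[:, 2]) + 0.1 * rng.standard_normal(400)
tree, experts = fit_region_experts(X, y)
print(predict(tree, experts, X[:3]))
```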
Journal of Visualization | 2015
Yunzhu Zheng; Haruka Suematsu; Takayuki Itoh; Ryohei Fujimaki; Satoshi Morinaga; Yoshinobu Kawahara
Multi-dimensional data visualization is an important research topic that has been receiving increasing attention. Several techniques that apply scatterplot matrices have been proposed to represent multi-dimensional data as a collection of two-dimensional visualization spaces. The scatterplot-based approach typically makes it easier to understand relations between particular pairs of dimensions, but it often requires a display space too large to show all possible scatterplots. This paper presents a technique to display meaningful sets of scatterplots generated from high-dimensional datasets. Our technique first evaluates all possible scatterplots generated from a high-dimensional dataset and selects meaningful sets. It then calculates the similarity between arbitrary pairs of the selected scatterplots and places relevant scatterplots closer together in the display space without letting them overlap. This design policy makes it easier for users to visually compare relevant sets of scatterplots. The paper presents algorithms that place the scatterplots by combining ideal-position calculation with rectangle-packing algorithms, along with two examples demonstrating the effectiveness of the presented technique.
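The selection step can be sketched as follows, using absolute Pearson correlation as a stand-in score for how "meaningful" a scatterplot is; the paper's actual evaluation measures and its packing-based layout are not reproduced here.

```python
# Hedged sketch: score every dimension pair of a high-dimensional dataset and
# keep the k most "meaningful" scatterplots (here: highest absolute correlation).
import numpy as np
from itertools import combinations

def select_scatterplots(data, k=10):
    corr = np.corrcoef(data, rowvar=False)
    pairs = list(combinations(range(data.shape[1]), 2))
    scored = sorted(pairs, key=lambda p: abs(corr[p[0], p[1]]), reverse=True)
    return scored[:k]

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 12))
data[:, 1] = 0.8 * data[:, 0] + 0.2 * rng.standard_normal(500)  # correlated pair
print(select_scatterplots(data, k=3))
```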
Knowledge Discovery and Data Mining | 2017
Shinji Ito; Ryohei Fujimaki
This paper addresses a novel data science problem, prescriptive price optimization, which derives the optimal price strategy for maximizing future profit/revenue on the basis of massive predictive formulas produced by machine learning. Prescriptive price optimization first builds sales forecast formulas for multiple products on the basis of historical data; these formulas reveal complex relationships between sales and prices, such as price elasticity of demand and cannibalization. It then constructs a mathematical optimization problem on the basis of those predictive formulas. We show that the optimization problem can be formulated as an instance of binary quadratic programming (BQP). Although BQP problems are NP-hard in general and computationally intractable, we propose a fast approximation algorithm using a semi-definite programming (SDP) relaxation. Our experiments on simulation and real retail datasets show that our prescriptive price optimization simultaneously derives the optimal prices of tens to hundreds of products in practical computational time, potentially improving the gross profit of those products by approximately 30%.
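To make the structure of the problem concrete, the toy sketch below lets each product choose one of a few candidate prices, forecasts sales as a linear function of all chosen prices (so cross-price effects such as cannibalization appear), and maximizes revenue, which is quadratic in the binary choice variables. The coefficients are invented, and the toy enumerates all combinations rather than solving the BQP via the SDP relaxation used in the paper.

```python
# Hedged sketch of the prescriptive-price-optimization structure (toy instance).
import itertools
import numpy as np

candidate_prices = np.array([[9.0, 10.0, 11.0],   # product 0
                             [4.0, 5.0, 6.0]])    # product 1
# Assumed forecast model: sales_i = a_i + sum_j b_ij * price_j
a = np.array([100.0, 80.0])
b = np.array([[-5.0, 1.5],    # own-price elasticity and cross effect
              [2.0, -8.0]])

best = None
for choice in itertools.product(range(3), repeat=2):
    prices = candidate_prices[np.arange(2), list(choice)]
    sales = a + b @ prices
    revenue = float(prices @ sales)       # quadratic in the price choices
    if best is None or revenue > best[0]:
        best = (revenue, prices)
print("best revenue %.1f at prices %s" % best)
```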
Asian Conference on Machine Learning | 2009
Ryohei Fujimaki; Satoshi Morinaga; Michinari Momma; Kenji Aoki; Takayuki Nakata
Our main contribution is to propose a novel model selection methodology, expectation minimization of description length (EMDL), based on the minimum description length (MDL) principle. EMDL addresses the combinatorial scalability issue that arises in model selection for mixture models whose components can be of different types. The goal of such problems is to optimize the types of the components as well as their number. One key idea in EMDL is to iterate between calculating the posterior of the latent variables and minimizing the expected description length of both the observed data and the latent variables. This enables EMDL to compute the optimal model in time linear in both the number of components and the number of available component types, despite the fact that the number of model candidates increases exponentially with these numbers. We prove that EMDL is compliant with the MDL principle and enjoys its statistical benefits.
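Schematically, and with notation assumed here rather than taken from the paper (latent variables $Z$, parameters $\theta$, component-type configuration $m$), the iteration described above alternates

$$
q^{(t)}(Z) = p\bigl(Z \mid X, \theta^{(t)}, m^{(t)}\bigr),
\qquad
\bigl(\theta^{(t+1)}, m^{(t+1)}\bigr) = \arg\min_{\theta,\,m}\ \mathbb{E}_{q^{(t)}}\!\bigl[\mathrm{DL}(X, Z \mid \theta, m)\bigr],
$$

where $\mathrm{DL}$ denotes the description length of the observed data and latent variables under the chosen model.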
International Joint Conference on Artificial Intelligence | 2017
Akihiro Yabe; Shinji Ito; Ryohei Fujimaki
The goal of price optimization is to maximize total revenue by adjusting the prices of products, on the basis of predicted sales numbers that are functions of the pricing strategy. Recent advances in demand modeling using machine learning raise a new challenge in price optimization, namely how to manage statistical errors in estimation. In this paper, we show that the uncertainty in recently proposed prescriptive price optimization frameworks can be represented by a matrix normal distribution. For this particular form of uncertainty, we propose novel robust quadratic programming algorithms for conservative lower-bound maximization. We offer an asymptotic probabilistic guarantee of the conservativeness of our formulation. Our experiments on both artificial and actual price data show that our robust price optimization allows users to determine the best risk-return trade-offs and to explore safe, profitable price strategies.
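As a schematic illustration (notation assumed here, not taken from the paper): if revenue is quadratic in a decision vector $z$, say $f(z) = z^\top Q z$, and the estimated coefficient matrix follows a matrix normal distribution $Q \sim \mathcal{MN}(M, U, V)$, then $f(z)$ is Gaussian with mean $z^\top M z$ and variance $(z^\top U z)(z^\top V z)$, so a conservative lower-bound objective takes the form

$$
\max_{z}\ \ z^\top M z \;-\; \kappa\,\sqrt{(z^\top U z)(z^\top V z)},
$$

where $\kappa$ sets the desired level of conservativeness. The paper's exact formulation and its robust quadratic programming algorithms differ in detail.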
Knowledge Discovery and Data Mining | 2016
Haichuan Yang; Ryohei Fujimaki; Yukitaka Kusumura; Ji Liu
This paper considers the feature selection scenario in which only a few features are accessible at any time point. For example, features may be generated sequentially and become visible one by one, so one has to make online decisions to identify key features after scanning all features only once or twice. Optimization-based approaches are a powerful tool for online feature selection. However, most existing optimization-based algorithms explicitly or implicitly adopt L1-norm regularization to identify important features and suffer from two main disadvantages: 1) the penalty for the L1-norm term is hard to choose; and 2) the memory usage is hard to control or predict. To overcome these two drawbacks, this paper proposes a limited-memory, model-parameter-free online feature selection algorithm, namely the online substitution (OS) algorithm. To improve selection efficiency, an asynchronous parallel extension of OS (Asy-OS) is also proposed. Convergence guarantees are provided for both algorithms. An empirical study suggests that the performance of OS and Asy-OS is comparable to that of the benchmark algorithm Grafting, while requiring much less memory and extending easily to a parallel implementation.
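The "substitution" idea of keeping only a small working set of features in memory can be sketched as below. This is a schematic stand-in rather than the paper's OS algorithm: a new feature is admitted only if swapping it for the currently weakest kept feature improves a simple refit score.

```python
# Hedged sketch of limited-memory feature selection by substitution:
# keep at most k features; swap a streamed-in feature for a kept one
# only if the swap improves the R^2 of a least-squares refit.
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_score(X_subset, y):
    return LinearRegression().fit(X_subset, y).score(X_subset, y)  # R^2

def online_substitution(feature_stream, y, k=3):
    kept_ids, kept_cols = [], []
    for fid, col in feature_stream:                  # features arrive one by one
        if len(kept_ids) < k:
            kept_ids.append(fid); kept_cols.append(col)
            continue
        base = fit_score(np.column_stack(kept_cols), y)
        best_gain, best_j = 0.0, None
        for j in range(k):                           # try substituting feature j
            trial = kept_cols[:j] + kept_cols[j + 1:] + [col]
            gain = fit_score(np.column_stack(trial), y) - base
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is not None:
            kept_ids[best_j], kept_cols[best_j] = fid, col
    return kept_ids

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
y = 3 * X[:, 4] - 2 * X[:, 11] + X[:, 17] + 0.1 * rng.standard_normal(200)
stream = ((i, X[:, i]) for i in range(20))
print(online_substitution(stream, y, k=3))
```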
International Conference on Artificial Intelligence and Statistics | 2012
Ryohei Fujimaki; Satoshi Morinaga