
Publication


Featured research published by Takafumi Kanamori.


Neural Computation | 2004

Information geometry of U-Boost and Bregman divergence

Noboru Murata; Takafumi Kanamori; Shinto Eguchi

We aim at an extension of AdaBoost to U-Boost, in the paradigm of building a stronger classification machine from a set of weak learning machines. A geometric understanding of the Bregman divergence defined by a generic convex function U leads to the U-Boost method in the framework of information geometry, extended to the space of finite measures over a label set. We propose two versions of the U-Boost learning algorithm, according to whether or not the domain is restricted to the space of probability functions. In the sequential step, we observe that the two adjacent classifiers and the initial classifier are associated with a right triangle, in the scale given by the Bregman divergence, which is called the Pythagorean relation. This leads to a mild convergence property of the U-Boost algorithm, as seen in the expectation-maximization algorithm. Statistical discussions of consistency and robustness elucidate the properties of the U-Boost methods under a stochastic assumption on the training data.
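
For orientation, a sketch of the central quantity: the standard pointwise definition of the Bregman divergence generated by a differentiable convex function U (this is not text from the paper, which works with its extension to finite measures over a label set; AdaBoost is commonly associated with the exponential choice of U).

```latex
% Bregman divergence generated by a differentiable convex function U
% (pointwise form; the paper extends it to finite measures over a label set):
D_U(a, b) \;=\; U(a) - U(b) - U'(b)\,(a - b) \;\ge\; 0 .
```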


Knowledge and Information Systems | 2011

Statistical outlier detection using direct density ratio estimation

Shohei Hido; Yuta Tsuboi; Hisashi Kashima; Masashi Sugiyama; Takafumi Kanamori

We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on the training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score. This approach is expected to have better performance even in high-dimensional problems since methods for directly estimating the density ratio without going through density estimation are available. Among various density ratio estimation methods, we employ the method called unconstrained least-squares importance fitting (uLSIF) since it is equipped with natural cross-validation procedures, allowing us to objectively optimize the value of tuning parameters such as the regularization parameter and the kernel width. Furthermore, uLSIF offers a closed-form solution as well as a closed-form formula for the leave-one-out error, so it is computationally very efficient and is scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach.
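
A minimal Python sketch of the idea, not the authors' implementation: fit a linear-in-parameters kernel model of the ratio w(x) = p_train(x)/p_test(x) by unconstrained least squares and use the fitted ratio at each test point as the outlier score. The kernel width and regularization parameter are fixed here for brevity, whereas the paper selects them by cross-validation; all names are illustrative.

```python
import numpy as np

def gaussian_kernel(X, C, sigma):
    """Gaussian kernel matrix between the rows of X and the centers C."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def ulsif_outlier_scores(X_train, X_test, sigma=1.0, lam=0.1, n_centers=100):
    """Estimate w(x) = p_train(x) / p_test(x) at each test point.

    Low scores flag test points that are unlikely under the (inlier-only)
    training distribution.  Minimal uLSIF-style estimator: a least-squares
    fit of a kernel model of the ratio with an L2 penalty, solved in closed
    form.  sigma and lam would normally be chosen by cross-validation.
    """
    rng = np.random.default_rng(0)
    idx = rng.choice(len(X_train), size=min(n_centers, len(X_train)), replace=False)
    centers = X_train[idx]
    Phi_te = gaussian_kernel(X_test, centers, sigma)   # denominator (test) samples
    Phi_tr = gaussian_kernel(X_train, centers, sigma)  # numerator (train) samples
    H = Phi_te.T @ Phi_te / len(X_test)
    h = Phi_tr.mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return np.maximum(Phi_te @ theta, 0.0)  # ratio estimates, clipped at zero

# Toy usage: the last test point is far from the inlier cloud.
rng = np.random.default_rng(1)
X_tr = rng.normal(size=(200, 2))
X_te = np.vstack([rng.normal(size=(50, 2)), [[6.0, 6.0]]])
scores = ulsif_outlier_scores(X_tr, X_te)
print(scores[-1] < np.median(scores))  # expected: True (outlier gets a low score)
```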


Book | 2012

Density Ratio Estimation in Machine Learning

Masashi Sugiyama; Taiji Suzuki; Takafumi Kanamori

Machine learning is an interdisciplinary field of science and engineering that studies mathematical theories and practical applications of systems that learn. This book introduces the theory, methods, and applications of density ratio estimation, a newly emerging paradigm in the machine learning community. Various machine learning problems, such as non-stationarity adaptation, outlier detection, dimensionality reduction, independent component analysis, clustering, classification, and conditional density estimation, can be systematically solved via the estimation of probability density ratios. The authors offer a comprehensive introduction to various density ratio estimators, including methods based on density estimation, moment matching, probabilistic classification, density fitting, and density ratio fitting, and describe how these can be applied to machine learning. The book also provides mathematical theory for density ratio estimation, including parametric and non-parametric convergence analyses and numerical stability analysis, to complete the first definitive treatment of the entire framework of density ratio estimation in machine learning.
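
As a one-line reminder (a standard identity, not quoted from the book) of why a single estimated ratio is reusable across so many of the tasks listed above:

```latex
% The central object: the ratio of two probability densities, together with
% the importance-weighting identity that makes one ratio estimate reusable
% across tasks (e.g. covariate-shift adaptation):
r(x) \;=\; \frac{p_{\mathrm{te}}(x)}{p_{\mathrm{tr}}(x)},
\qquad
\mathbb{E}_{x \sim p_{\mathrm{te}}}\bigl[\ell(x)\bigr]
  \;=\; \mathbb{E}_{x \sim p_{\mathrm{tr}}}\bigl[r(x)\,\ell(x)\bigr].
```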


BMC Bioinformatics | 2009

Mutual information estimation reveals global associations between stimuli and biological processes

Taiji Suzuki; Masashi Sugiyama; Takafumi Kanamori; Jun Sese

Background: Although microarray gene expression analysis has become popular, it remains difficult to interpret the biological changes caused by stimuli or variation of conditions. Clustering genes and associating each group with biological functions are commonly used methods, but such methods only detect partial changes within cell processes. Herein, we propose a method for discovering global changes within a cell by associating observed conditions of gene expression with gene functions.

Results: To elucidate the associations, we introduce a novel feature selection method called Least-Squares Mutual Information (LSMI), which computes mutual information without density estimation and can therefore detect nonlinear associations within a cell. We demonstrate the effectiveness of LSMI through comparison with existing methods. Application to yeast microarray datasets reveals that non-natural stimuli affect various biological processes, whereas others have no significant relation to specific cell processes. Furthermore, we discover that biological processes can be categorized into four types according to their responses to various stimuli: DNA/RNA metabolism, gene expression, protein metabolism, and protein localization.

Conclusion: We proposed a novel feature selection method called LSMI and applied it to mining the associations between conditions of yeast and biological processes through microarray datasets. LSMI allows us to elucidate the global organization of cellular process control.
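
For reference, a sketch of the quantity LSMI targets, written here in the common squared-loss mutual information formulation; the notation is mine, not the paper's.

```latex
% Squared-loss mutual information, written via the density ratio
% r(x, y) = p(x, y) / (p(x) p(y)); it vanishes exactly when X and Y are
% independent, and LSMI estimates the ratio directly by least squares,
% avoiding density estimation:
\mathrm{SMI}(X, Y)
  \;=\; \frac{1}{2} \iint p(x)\, p(y)
        \left( \frac{p(x, y)}{p(x)\, p(y)} - 1 \right)^{\!2} \mathrm{d}x\, \mathrm{d}y .
```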


International Conference on Data Mining | 2008

Inlier-Based Outlier Detection via Direct Density Ratio Estimation

Shohei Hido; Yuta Tsuboi; Hisashi Kashima; Masashi Sugiyama; Takafumi Kanamori

We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on the training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score; we estimate the ratio directly in a semi-parametric fashion without going through density estimation. Thus our approach is expected to have better performance in high-dimensional problems. Furthermore, the applied algorithm for density ratio estimation is equipped with a natural cross-validation procedure, allowing us to objectively optimize the value of tuning parameters such as the regularization parameter and the kernel width. The algorithm offers a closed-form solution as well as a closed-form formula for the leave-one-out error. Thanks to this, the proposed outlier detection method is computationally very efficient and is scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach.
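
A sketch of the closed form the abstract refers to, assuming a linear-in-parameters model of the density ratio; the symbols are mine, and the paper should be consulted for the exact leave-one-out formula.

```latex
% Closed form for a linear-in-parameters model
% w_theta(x) = sum_l theta_l phi_l(x) of the ratio p_tr(x) / p_te(x):
\widehat{H} \;=\; \frac{1}{n_{\mathrm{te}}} \sum_{j=1}^{n_{\mathrm{te}}}
                  \varphi(x^{\mathrm{te}}_j)\, \varphi(x^{\mathrm{te}}_j)^{\top},
\qquad
\widehat{h} \;=\; \frac{1}{n_{\mathrm{tr}}} \sum_{i=1}^{n_{\mathrm{tr}}}
                  \varphi(x^{\mathrm{tr}}_i),
\qquad
\widehat{\theta} \;=\; \bigl( \widehat{H} + \lambda I \bigr)^{-1} \widehat{h} .
```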


Journal of Statistical Planning and Inference | 2003

Active learning algorithm using the maximum weighted log-likelihood estimator

Takafumi Kanamori; Hidetoshi Shimodaira

We study the problem of constructing designs for regression. Our aim is to estimate the mean value of the response variable. The distribution of the independent variable is chosen from among the continuous designs so as to decrease the integrated mean square error (IMSE) of the fitted values. When we use such a design, we face the obstacle that the true regression function may not belong to the statistical model, that is, the model is misspecified. In the case of misspecification, the estimate of the mean value of the response variable based on the design is biased. We suggest a new method for constructing a design that avoids this bias even when the statistical model is misspecified. The standard construction of the design uses the maximum log-likelihood estimator (mle); instead, we use the maximum weighted log-likelihood estimator (mwle). The design with the mle increases the bias in the case of misspecification, whereas the mwle corrects the mle and decreases the bias term of the IMSE. We give numerical experiments that illustrate the efficiency of the proposed methods.
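
A sketch of the estimator's general form as I read the abstract; p(x) denotes the input density under which predictions are evaluated and q(x) the design density used for sampling, both illustrative symbols.

```latex
% General form of the maximum weighted log-likelihood estimator (mwle):
% observations whose inputs are drawn from the design density q(x) are
% re-weighted by p(x)/q(x), where p(x) is the input density under which
% the fit is evaluated, which removes the misspecification bias of the mle.
\widehat{\theta}_{\mathrm{mwle}}
  \;=\; \arg\max_{\theta} \sum_{i=1}^{n}
        \frac{p(x_i)}{q(x_i)} \, \log f(y_i \mid x_i;\, \theta) .
```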


Neural Computation | 2009

Nonparametric conditional density estimation using piecewise-linear solution path of kernel quantile regression

Ichiro Takeuchi; Kaname Nomura; Takafumi Kanamori

The goal of regression analysis is to describe the stochastic relationship between an input vector x and a scalar output y. This can be achieved by estimating the entire conditional density p(y | x). In this letter, we present a new approach for nonparametric conditional density estimation. We develop a piecewise-linear path-following method for kernel-based quantile regression. It enables us to estimate the cumulative distribution function of p(y | x) in piecewise-linear form for all x in the input domain. Theoretical analyses and experimental results are presented to show the effectiveness of the approach.
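
For context, a sketch of the kernel quantile regression building block whose solution path the paper follows; the check-loss formulation below is standard, and the exact regularization form is an assumption on my part.

```latex
% Kernel quantile regression with the check (pinball) loss, fit for each
% quantile level tau in (0, 1); sweeping tau yields estimates of the
% conditional quantiles, i.e. an inverse of the conditional CDF of p(y | x):
\rho_\tau(r) \;=\;
  \begin{cases}
    \tau\, r,       & r \ge 0, \\
    (\tau - 1)\, r, & r < 0,
  \end{cases}
\qquad
\widehat{f}_\tau \;=\; \arg\min_{f \in \mathcal{H}}
  \sum_{i=1}^{n} \rho_\tau\bigl(y_i - f(x_i)\bigr)
  \;+\; \frac{\lambda}{2}\, \|f\|_{\mathcal{H}}^{2} .
```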


Machine Learning | 2012

Statistical analysis of kernel-based least-squares density-ratio estimation

Takafumi Kanamori; Taiji Suzuki; Masashi Sugiyama

The ratio of two probability densities can be used for solving various machine learning tasks such as covariate shift adaptation (importance sampling), outlier detection (likelihood-ratio test), feature selection (mutual information), and conditional probability estimation. Several methods of directly estimating the density ratio have recently been developed, e.g., moment matching estimation, maximum-likelihood density-ratio estimation, and least-squares density-ratio fitting. In this paper, we propose a kernelized variant of the least-squares method for density-ratio estimation, which is called kernel unconstrained least-squares importance fitting (KuLSIF). We investigate its fundamental statistical properties including a non-parametric convergence rate, an analytic-form solution, and a leave-one-out cross-validation score. We further study its relation to other kernel-based density-ratio estimators. In experiments, we numerically compare various kernel-based density-ratio estimation methods, and show that KuLSIF compares favorably with other approaches.
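
A sketch of the least-squares objective the abstract describes, written over an RKHS; the sample notation is mine, and the analytic solution and leave-one-out score studied in the paper follow from this kind of objective.

```latex
% Least-squares objective over an RKHS H for the ratio w(x) = p(x)/q(x),
% with numerator samples {x^p_i} ~ p and denominator samples {x^q_j} ~ q:
\widehat{w} \;=\; \arg\min_{w \in \mathcal{H}}
  \; \frac{1}{2 n_q} \sum_{j=1}^{n_q} w(x^{q}_j)^2
  \;-\; \frac{1}{n_p} \sum_{i=1}^{n_p} w(x^{p}_i)
  \;+\; \frac{\lambda}{2}\, \|w\|_{\mathcal{H}}^{2} .
```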


IPSJ Transactions on Computer Vision and Applications | 2009

A Density-ratio Framework for Statistical Data Processing

Masashi Sugiyama; Takafumi Kanamori; Taiji Suzuki; Shohei Hido; Jun Sese; Ichiro Takeuchi; Liwei Wang

In statistical pattern recognition, it is important to avoid density estimation, since density estimation is often more difficult than pattern recognition itself. Following this idea, known as Vapnik's principle, a statistical data processing framework that employs the ratio of two probability density functions has been developed recently and is attracting a lot of attention in the machine learning and data mining communities. The purpose of this paper is to introduce to the computer vision community recent advances in density ratio estimation methods and their use in various statistical data processing tasks such as non-stationarity adaptation, outlier detection, feature selection, and independent component analysis.
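
A minimal Python sketch of the covariate-shift (non-stationarity adaptation) use case listed above, assuming density-ratio values have already been obtained from some direct estimator; the linear model, function names, and placeholder ratio values are all illustrative, not the paper's own method.

```python
import numpy as np

def importance_weighted_risk(theta, X, y, ratio):
    """Empirical squared-loss risk on training data, re-weighted by the
    estimated density ratio p_test(x) / p_train(x) at each training point."""
    residuals = X @ theta - y
    return float(np.mean(ratio * residuals ** 2))

def fit_weighted_ridge(X, y, ratio, lam=1e-3):
    """Closed-form minimizer of the weighted risk plus an L2 penalty
    (an illustrative linear model)."""
    W = np.diag(ratio)
    d = X.shape[1]
    return np.linalg.solve(X.T @ W @ X + lam * np.eye(d), X.T @ W @ y)

# Toy usage with placeholder ratio values (all ones = no shift correction);
# in practice the ratio would come from a direct density-ratio estimator.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
ratio = np.ones(len(X))
theta = fit_weighted_ridge(X, y, ratio)
print(importance_weighted_risk(theta, X, y, ratio))
```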


Neural Information Processing Systems | 2012

Density-Difference Estimation

Masashi Sugiyama; Takafumi Kanamori; Taiji Suzuki; Marthinus Christoffel du Plessis; Song Liu; Ichiro Takeuchi

We address the problem of estimating the difference between two probability densities. A naive approach is a two-step procedure of first estimating the two densities separately and then computing their difference. However, this procedure does not necessarily work well because the first step is performed without regard to the second, so a small estimation error in the first stage can cause a large error in the second. In this paper, we propose a single-shot procedure for directly estimating the density difference without separately estimating the two densities. We derive a nonparametric finite-sample error bound for the proposed single-shot density-difference estimator and show that it achieves the optimal convergence rate. We then show how the proposed density-difference estimator can be used in L2-distance approximation. Finally, we experimentally demonstrate the usefulness of the proposed method in robust distribution comparison tasks such as class-prior estimation and change-point detection.
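
A sketch of the single-shot least-squares idea as I read the abstract; the model g and the expansion below are standard, and the exact estimator in the paper may differ in its details.

```latex
% Single-shot least-squares criterion for the density difference
% f(x) = p(x) - q(x): expand the squared L2 error of a model g and drop the
% term that does not depend on g,
\int \bigl( g(x) - f(x) \bigr)^2 \mathrm{d}x
  \;=\; \int g(x)^2 \, \mathrm{d}x
        \;-\; 2 \Bigl( \mathbb{E}_{p}\bigl[g(x)\bigr] - \mathbb{E}_{q}\bigl[g(x)\bigr] \Bigr)
        \;+\; \mathrm{const},
% so g can be fit from samples of p and q alone, with empirical means in
% place of the expectations; the fitted g in turn yields an approximation
% of the L2 distance between p and q.
```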

Collaboration


An overview of Takafumi Kanamori's collaborations.

Top Co-Authors

Taiji Suzuki
Tokyo Institute of Technology

Ichiro Takeuchi
Nagoya Institute of Technology

Hironori Fujisawa
Graduate University for Advanced Studies

Hirotaka Hachiya
Tokyo Institute of Technology

Jun Sese
National Institute of Advanced Industrial Science and Technology