Publication


Featured research published by Shinto Eguchi.


Neural Computation | 2004

Information geometry of U-Boost and Bregman divergence

Noboru Murata; Takafumi Kanamori; Shinto Eguchi

We propose an extension of AdaBoost to U-Boost, in the paradigm of building a stronger classification machine from a set of weak learning machines. A geometric understanding of the Bregman divergence defined by a generic convex function U leads to the U-Boost method in the framework of information geometry, extended to the space of finite measures over a label set. We propose two versions of the U-Boost learning algorithm, according to whether the domain is restricted to the space of probability functions. In the sequential step, we observe that two adjacent classifiers and the initial classifier are associated with a right triangle in the scale via the Bregman divergence, called the Pythagorean relation. This leads to a mild convergence property of the U-Boost algorithm, as seen in the expectation-maximization algorithm. Statistical discussions of consistency and robustness elucidate the properties of the U-Boost methods under a stochastic assumption on the training data.
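As an illustrative sketch (not the paper's construction), the Bregman divergence generated by a convex function U can be computed componentwise, and the classical choice U(z) = z log z − z recovers the extended Kullback–Leibler divergence:

```python
import math

def bregman(U, dU, p, q):
    """Componentwise Bregman divergence:
    D_U(p, q) = sum_i [ U(p_i) - U(q_i) - U'(q_i) * (p_i - q_i) ]."""
    return sum(U(pi) - U(qi) - dU(qi) * (pi - qi) for pi, qi in zip(p, q))

# Illustrative generator: U(z) = z*log(z) - z yields the extended KL divergence
U = lambda z: z * math.log(z) - z
dU = lambda z: math.log(z)

p = [0.2, 0.3, 0.5]
q = [0.25, 0.25, 0.5]
ekl = sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))
print(abs(bregman(U, dU, p, q) - ekl) < 1e-12)  # True
```

Other convex generators U give other divergences over the same space, which is what parametrizes the U-Boost family.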


Neural Computation | 2002

Robust blind source separation by beta divergence

Minami Mihoko; Shinto Eguchi

Blind source separation aims to recover original independent signals when only their linear mixtures are observed. Various methods for estimating a recovering matrix have been proposed and applied to data in many fields, such as biological signal processing, communication engineering, and financial market data analysis. One problem with these methods is that they are often too sensitive to outliers: the presence of a few outliers can change the estimate drastically. In this article, we propose a robust method of blind source separation based on the β-divergence. Shift parameters are explicitly included in our model, instead of the conventional assumption that the original signals have zero mean. The estimator gives smaller weights to possible outliers, so that their influence on the estimate is weakened. Simulation results show that the proposed estimator significantly improves performance over existing methods when outliers exist, and matches their performance otherwise.
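The downweighting mechanism behind β-divergence estimation can be sketched in isolation: each observation is weighted by its model density raised to the power β, so low-density points (likely outliers) contribute little. The Gaussian model and the value β = 0.5 below are illustrative choices, not the paper's setup:

```python
import math

def gaussian_density(x, mu=0.0, sigma=1.0):
    """Standard normal density (the assumed working model in this sketch)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def beta_weights(xs, beta=0.5):
    """In beta-divergence estimation each observation is weighted by f(x)^beta,
    so points in the tails of the model density are effectively ignored."""
    return [gaussian_density(x) ** beta for x in xs]

w_inlier, w_outlier = beta_weights([0.0, 8.0])
print(w_inlier > 100 * w_outlier)  # True: the outlier at 8 is heavily downweighted
```

As β → 0 the weights become uniform and the estimator approaches maximum likelihood, trading robustness for efficiency.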


Journal of the Royal Statistical Society, Series B (Statistical Methodology) | 2001

Local sensitivity approximations for selectivity bias

John Copas; Shinto Eguchi

Observational data analysis is often based on tacit assumptions of ignorability or randomness. The paper develops a general approach to local sensitivity analysis for selectivity bias, which aims to study the sensitivity of inference to small departures from such assumptions. If M is a model assuming ignorability, we surround M by a small neighbourhood N defined in the sense of Kullback–Leibler divergence and then compare the inference for models in N with that for M. Interpretable bounds for such differences are developed. Applications to missing data and to observational comparisons are discussed. Local approximations to sensitivity analysis are model robust and can be applied to a wide range of statistical problems.
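A minimal formalization of the neighbourhood construction, assuming the Kullback–Leibler ball is scaled as ε²/2 (the paper's exact parametrization may differ):

```latex
% M assumes ignorability, with density f_M. Surround it by a small
% Kullback--Leibler neighbourhood of radius \epsilon^2/2:
N_\epsilon(M) \;=\;
  \Bigl\{\, g \;:\; \mathrm{KL}(g \,\|\, f_M)
  = \int g \log \frac{g}{f_M} \,\le\, \tfrac{1}{2}\epsilon^2 \Bigr\}.
% Local sensitivity analysis compares inference under any g \in N_\epsilon(M)
% with inference under f_M, expanding to first order in \epsilon.
```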


Neural Computation | 2004

Robustifying AdaBoost by Adding the Naive Error Rate

Shinto Eguchi

AdaBoost can be derived by sequential minimization of the exponential loss function. It implements the learning process by exponentially reweighting examples according to the classification results. However, the weights are often too sharply tuned, so that AdaBoost suffers from nonrobustness and overlearning. We propose a new boosting method that is a slight modification of AdaBoost. The loss function is defined by a mixture of the exponential loss and the naive error loss function. As a result, the proposed method incorporates an effect of forgetfulness into AdaBoost. The statistical significance of our method is discussed, and simulations are presented for confirmation.
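One way to picture the "forgetfulness" effect is to blend AdaBoost's exponential example weights with uniform weights, so that no single badly misclassified example can dominate a round. This is an illustrative simplification of the idea, not the paper's exact update (which is derived from the mixed loss):

```python
import math

def adaboost_weights(margins):
    """AdaBoost example weights: proportional to exp(-margin)."""
    w = [math.exp(-m) for m in margins]
    s = sum(w)
    return [wi / s for wi in w]

def forgetful_weights(margins, eta=0.2):
    """Hypothetical 'forgetful' variant: a convex blend of the exponential
    weights with uniform weights, bounding any one example's dominance."""
    n = len(margins)
    return [(1 - eta) * wi + eta / n for wi in adaboost_weights(margins)]

margins = [-30.0, 1.0, 1.0, 1.0]      # one extreme outlier
print(adaboost_weights(margins)[0])   # ~1.0: the outlier absorbs nearly all weight
print(forgetful_weights(margins)[0])  # ~0.85: its dominance is tempered
```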


Journal of the Royal Statistical Society, Series B (Statistical Methodology) | 1998

A class of local likelihood methods and near‐parametric asymptotics

Shinto Eguchi; John Copas

The local maximum likelihood estimate θ̂_t of a parameter in a statistical model f(x, θ) is defined by maximizing a weighted version of the likelihood function which gives more weight to observations in the neighbourhood of t. The paper studies the sense in which f(t, θ̂_t) is closer to the true distribution g(t) than the usual estimate f(t, θ̂) is. Asymptotic results are presented for the case in which the model misspecification becomes vanishingly small as the sample size tends to ∞. In this setting, the relative entropy risk of the local method is better than that of maximum likelihood. The form of optimum weights for the local likelihood is obtained and illustrated for the normal distribution.
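For the normal location model, maximizing the kernel-weighted log-likelihood at a target point reduces to a kernel-weighted mean. A minimal sketch, assuming a Gaussian weight function with bandwidth h (an illustrative choice, not the paper's optimal weights):

```python
import math

def local_mean(xs, t, h=1.0):
    """Local likelihood estimate of a normal mean at target point t:
    the maximizer of the kernel-weighted log-likelihood of the normal
    location model is the kernel-weighted average of the data."""
    w = [math.exp(-0.5 * ((x - t) / h) ** 2) for x in xs]
    return sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)

# Observations far from t barely influence the local fit
print(local_mean([0.0, 10.0], t=0.0))  # ~0.0, not the global mean 5.0
```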


BMC Bioinformatics | 2010

A boosting method for maximizing the partial area under the ROC curve

Osamu Komori; Shinto Eguchi

Background: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance of not only a single marker but also a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures its intrinsic ability to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because a suitable range of the false positive rate can be focused on according to the clinical situation. However, existing pAUC-based methods handle only a few markers and do not take nonlinear combinations of markers into consideration.

Results: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentwise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to whether the marker values are continuous or discrete. We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with those of existing methods and demonstrate its utility using real data sets. As a result, we obtain much better discrimination performance in the sense of the pAUC in both simulation studies and real data analysis.

Conclusions: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in a high-dimensional setting. Hence, it provides a consistent pAUC-based way of analyzing data, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is known to be necessary in general for maximizing the pAUC. The method also puts importance on the accuracy of classification as well as the interpretability of the association, by offering simple and smooth score plots for each marker.
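The empirical pAUC being maximized can be sketched as a restricted Mann–Whitney statistic: count concordant (positive, negative) score pairs, but only over the top-scoring negatives that generate false positive rates within the targeted range. A simplified version (function and variable names are ours):

```python
import math

def partial_auc(pos, neg, fpr_max=0.5):
    """Empirical partial AUC (unnormalized): the Mann-Whitney pair count
    restricted to the top ceil(fpr_max * len(neg)) scoring negatives,
    i.e. the negatives that produce false positive rates <= fpr_max."""
    k = math.ceil(fpr_max * len(neg))
    top_neg = sorted(neg, reverse=True)[:k]
    pairs = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in top_neg)
    return pairs / (len(pos) * len(neg))

# A perfectly separating score attains pAUC = fpr_max (before normalization)
print(partial_auc([2.0, 3.0], [0.0, 1.0], fpr_max=0.5))  # 0.5
```

Dividing the result by fpr_max rescales it to [0, 1], which is a common normalization when comparing different ranges.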


International Geoscience and Remote Sensing Symposium | 2005

Supervised image classification by contextual AdaBoost based on posteriors in neighborhoods

Ryuei Nishii; Shinto Eguchi

AdaBoost, a machine learning technique, is employed for supervised classification of land-cover categories in geostatistical data. We introduce contextual classifiers based on neighboring pixels. First, posterior probabilities are calculated at all pixels. Then, averages of the log posteriors are calculated over different neighborhoods and used as contextual classification functions. Weights for the classification functions are determined by minimizing the multiclass empirical risk. Finally, a convex combination of the classification functions is obtained. The classification is performed by a noniterative maximization procedure. The proposed method is applied to artificial multispectral images and benchmark datasets. Its performance is excellent and similar to that of the Markov-random-field-based classifier, which requires an iterative maximization procedure.
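The core contextual quantity, the average log posterior over a square neighborhood of each pixel, can be sketched as follows (clipping at the image borders is our assumption; the paper considers several neighborhood configurations):

```python
def contextual_log_posterior(log_post, i, j, radius=1):
    """Average the log posteriors of one class over the (2*radius+1)-square
    neighborhood of pixel (i, j), clipped at the image borders. This average
    serves as the contextual classification function for that class."""
    h, w = len(log_post), len(log_post[0])
    vals = [log_post[a][b]
            for a in range(max(0, i - radius), min(h, i + radius + 1))
            for b in range(max(0, j - radius), min(w, j + radius + 1))]
    return sum(vals) / len(vals)

# On a constant field the contextual value equals the pixelwise log posterior
grid = [[-0.693, -0.693, -0.693]] * 3
print(abs(contextual_log_posterior(grid, 1, 1) - (-0.693)) < 1e-12)  # True
```

Averaging log posteriors (rather than posteriors) corresponds to a geometric mean of the neighborhood's probabilities, which smooths noisy pixelwise decisions.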


Neural Computation | 2009

Robust kernel principal component analysis

Su-Yun Huang; Yi-Ren Yeh; Shinto Eguchi

This letter discusses the robustness issue of kernel principal component analysis. A class of new robust procedures is proposed based on eigenvalue decomposition of weighted covariance. The proposed procedures will place less weight on deviant patterns and thus be more resistant to data contamination and model deviation. Theoretical influence functions are derived, and numerical examples are presented as well. Both theoretical and numerical results indicate that the proposed robust method outperforms the conventional approach in the sense of being less sensitive to outliers. Our robust method and results also apply to functional principal component analysis.
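The weighted-covariance idea can be sketched in the plain (non-kernel) setting: iterate between a weighted mean and weights that shrink for deviant points, then eigendecompose the resulting weighted covariance. The Gaussian weight exp(-d²/c²) and the fixed-point iteration are illustrative choices, not the paper's exact scheme (which operates in a kernel feature space):

```python
import numpy as np

def weighted_pca(X, n_iter=5, c=2.0):
    """Robust PCA sketch: eigendecomposition of a weighted covariance in which
    points far from the weighted mean receive small weights."""
    n = len(X)
    w = np.ones(n) / n
    for _ in range(n_iter):
        mu = w @ X                           # weighted mean
        d2 = ((X - mu) ** 2).sum(axis=1)     # squared distances to the mean
        w = np.exp(-d2 / c ** 2)             # downweight deviant patterns
        w /= w.sum()
    cov = (X - mu).T @ np.diag(w) @ (X - mu)
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, ::-1], w                  # columns by decreasing eigenvalue

# Data along the x-axis plus one gross outlier: the outlier is downweighted,
# so the leading component stays aligned with the x-axis.
X = np.array([[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0], [-2.0, 0.0], [0.0, 10.0]])
components, weights = weighted_pca(X)
print(abs(components[0, 0]) > 0.99, weights[-1] < 1e-6)  # True True
```

Plain (unweighted) PCA on the same data would tilt its leading component toward the outlier's direction.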


Neural Computation | 2007

Robust Loss Functions for Boosting

Takafumi Kanamori; Shinto Eguchi; Noboru Murata

Boosting is known as a gradient descent algorithm over loss functions. It is often pointed out that the typical boosting algorithm, AdaBoost, is highly affected by outliers. In this letter, loss functions for robust boosting are studied. Based on the concept of robust statistics, we propose a transformation of loss functions that makes boosting algorithms robust against extreme outliers. Next, the truncation of loss functions is applied to contamination models that describe the occurrence of mislabels near decision boundaries. Numerical experiments illustrate that the proposed loss functions derived from the contamination models are useful for handling highly noisy data in comparison with other loss functions.
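The truncation idea can be illustrated on the exponential loss: cap the penalty so that an example with an extremely negative margin contributes a bounded loss. The transformations studied in the paper differ in detail; this is only a sketch:

```python
import math

def truncated_exp_loss(z, cap=5.0):
    """Illustrative truncation of the exponential loss: beyond margin -cap the
    penalty stops growing, so extreme outliers contribute a bounded loss."""
    return min(math.exp(-z), math.exp(cap))

print(truncated_exp_loss(2.0) == math.exp(-2.0))   # True: ordinary region unchanged
print(truncated_exp_loss(-50.0) == math.exp(5.0))  # True: capped for extreme outliers
```

Because the loss is flat beyond the cap, its gradient (and hence the example's boosting weight) vanishes there, which is exactly the bounded-influence property sought in robust statistics.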


Bioinformatics | 2004

Genotyping of single nucleotide polymorphism using model-based clustering

Hironori Fujisawa; Shinto Eguchi; Masaru Ushijima; Satoshi Miyata; Yoshio Miki; Tetsuichiro Muto; Masaaki Matsuura

MOTIVATION: Single nucleotide polymorphisms have been investigated as biological markers, and the representative high-throughput genotyping method is a combination of the Invader assay and a statistical clustering method. A typical statistical clustering method is the k-means method, but it often fails because of its lack of flexibility. An alternative fast and reliable method is therefore desirable. RESULTS: This paper proposes a model-based clustering method using a normal mixture model and a well-conceived penalized likelihood. The proposed method can flag unclear genotypings for re-examination and works well even when the number of clusters is unknown. Illustrative results show satisfactory genotypings, even in cases where the conventional maximum likelihood method and the typical k-means clustering method fail.
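The model-based clustering idea can be sketched with a minimal EM loop for a one-dimensional normal mixture; the responsibilities give soft cluster assignments (genotype calls). The penalized likelihood and the rule for flagging unclear genotypes described in the paper are omitted from this sketch:

```python
import math

def normal_pdf(x, mu, s):
    """Normal density with mean mu and standard deviation s."""
    return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def em_mixture(xs, mus, sigma=0.5, n_iter=100):
    """Minimal EM for a 1-D normal mixture with fixed common sigma."""
    k = len(mus)
    pi = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        resp = []
        for x in xs:
            p = [pi[j] * normal_pdf(x, mus[j], sigma) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: update component means and mixing proportions
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            pi[j] = nj / len(xs)
    return mus, pi, resp

# Two well-separated intensity clusters: the means converge to the centers
mus, pi, resp = em_mixture([0.0, 0.1, -0.1, 5.0, 5.1, 4.9], [0.5, 4.5])
```

Unlike k-means, the soft responsibilities quantify how ambiguous each call is, which is what makes it natural to flag borderline genotypes for re-examination.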

Collaboration


Dive into Shinto Eguchi's collaborations.

Top Co-Authors

Hironori Fujisawa

Graduate University for Advanced Studies

Masaaki Matsuura

Japanese Foundation for Cancer Research

Masaru Ushijima

Japanese Foundation for Cancer Research

Akifumi Notsu

Oita University of Nursing and Health Sciences
