Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Marthinus Christoffel du Plessis is active.

Publication


Featured research published by Marthinus Christoffel du Plessis.


Neural Networks | 2014

Semi-supervised learning of class balance under class-prior change by distribution matching

Marthinus Christoffel du Plessis; Masashi Sugiyama

In real-world classification problems, the class balance in the training dataset does not necessarily reflect that of the test dataset, which can cause significant estimation bias. If the class ratio of the test dataset is known, instance re-weighting or resampling allows systematical bias correction. However, learning the class ratio of the test dataset is challenging when no labeled data is available from the test domain. In this paper, we propose to estimate the class ratio in the test dataset by matching probability distributions of training and test input data. We demonstrate the utility of the proposed approach through experiments.
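The distribution-matching idea can be previewed with a minimal sketch: model the test input density as the mixture pi * p(x|+1) + (1 - pi) * p(x|-1) of kernel density estimates and grid-search the pi that minimizes the squared mismatch. The function names, bandwidth, and Gaussian toy data below are illustrative choices, not the paper's estimator or experimental setup.

```python
import numpy as np

def gaussian_kde(samples, grid, bandwidth=0.3):
    """Fixed-bandwidth Gaussian kernel density estimate evaluated on a grid."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

def estimate_class_prior(x_pos, x_neg, x_test, grid):
    """Grid-search the mixing weight pi whose class-conditional mixture
    best matches the unlabeled test input density in squared distance."""
    p_pos, p_neg = gaussian_kde(x_pos, grid), gaussian_kde(x_neg, grid)
    p_test = gaussian_kde(x_test, grid)
    dx = grid[1] - grid[0]
    candidates = np.linspace(0.0, 1.0, 101)
    losses = [np.sum((pi * p_pos + (1 - pi) * p_neg - p_test) ** 2) * dx
              for pi in candidates]
    return candidates[int(np.argmin(losses))]

rng = np.random.default_rng(0)
x_pos = rng.normal(-2.0, 1.0, 500)    # labeled positives
x_neg = rng.normal(+2.0, 1.0, 500)    # labeled negatives
# unlabeled test data whose class prior has drifted to 0.7
x_test = np.concatenate([rng.normal(-2.0, 1.0, 700),
                         rng.normal(+2.0, 1.0, 300)])
grid = np.linspace(-6.0, 6.0, 400)
pi_hat = estimate_class_prior(x_pos, x_neg, x_test, grid)
```

Once the test-set class ratio is estimated, the instance re-weighting or resampling mentioned in the abstract can use it to correct the training bias.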


Neural Information Processing Systems | 2012

Density-Difference Estimation

Masashi Sugiyama; Takafumi Kanamori; Taiji Suzuki; Marthinus Christoffel du Plessis; Song Liu; Ichiro Takeuchi

We address the problem of estimating the difference between two probability densities. A naive approach is a two-step procedure of first estimating two densities separately and then computing their difference. However, this procedure does not necessarily work well because the first step is performed without regard to the second step, and thus a small estimation error incurred in the first stage can cause a big error in the second stage. In this letter, we propose a single-shot procedure for directly estimating the density difference without separately estimating two densities. We derive a nonparametric finite-sample error bound for the proposed single-shot density-difference estimator and show that it achieves the optimal convergence rate. We then show how the proposed density-difference estimator can be used in L2-distance approximation. Finally, we experimentally demonstrate the usefulness of the proposed method in robust distribution comparison such as class-prior estimation and change-point detection.
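A single-shot estimator of this kind can be sketched in one dimension with a Gaussian kernel model f(x) = sum_j theta_j k(x, c_j): the ridge-regularized least-squares fit of f to p - q has a closed-form solution, since the cross-term integral of two Gaussian kernels is analytic. The bandwidth, centers, and toy data are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def lsdd(x_p, x_q, centers, sigma=0.7, lam=1e-3):
    """Least-squares density-difference sketch: fit f(x) = sum_j theta_j k(x, c_j)
    directly to p(x) - q(x) without estimating p and q separately."""
    # U_jl = integral of k(x, c_j) k(x, c_l) dx has a closed form for Gaussians
    d2 = (centers[:, None] - centers[None, :]) ** 2
    U = np.sqrt(np.pi) * sigma * np.exp(-d2 / (4 * sigma**2))
    def k(x, c):
        return np.exp(-(x[:, None] - c[None, :]) ** 2 / (2 * sigma**2))
    # h_j = empirical mean of k(., c_j) under p minus that under q
    h = k(x_p, centers).mean(axis=0) - k(x_q, centers).mean(axis=0)
    theta = np.linalg.solve(U + lam * np.eye(len(centers)), h)
    f = lambda x: k(np.atleast_1d(x), centers) @ theta
    l2_estimate = 2 * theta @ h - theta @ U @ theta  # plug-in L2-distance estimate
    return f, l2_estimate

rng = np.random.default_rng(1)
x_p = rng.normal(0.0, 1.0, 2000)   # samples from p
x_q = rng.normal(1.0, 1.0, 2000)   # samples from q
centers = np.linspace(-3.0, 4.0, 30)
f, l2 = lsdd(x_p, x_q, centers)
```

The fitted f should be positive where p exceeds q and negative where q exceeds p, and the same theta yields the L2-distance approximation mentioned in the abstract.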


Machine Learning | 2017

Class-prior estimation for learning from positive and unlabeled data

Marthinus Christoffel du Plessis; Gang Niu; Masashi Sugiyama

We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized L1-distance gives a computationally efficient algorithm with an analytic solution. The consistency, stability, and estimation error are theoretically analyzed. Finally, we experimentally demonstrate the usefulness of the proposed method.
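The identifiability behind this result can be previewed with a deliberately crude sketch (not the paper's penalized-divergence estimator): since p(x) = pi * p(x|+1) + (1 - pi) * p(x|-1), the ratio p(x)/p(x|+1) equals pi plus a non-negative term, so its minimum over x approaches the class prior wherever the negative class vanishes. All names and toy data below are illustrative.

```python
import numpy as np

def kde(samples, grid, bw=0.3):
    """Fixed-bandwidth Gaussian kernel density estimate on a grid."""
    z = (grid[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z**2).mean(axis=1) / (bw * np.sqrt(2 * np.pi))

def pu_class_prior(x_pos, x_unlabeled, grid, support_cutoff=0.05):
    """Crude ratio-based illustration: p(x) >= pi * p(x|+1) everywhere,
    so min_x p(x) / p(x|+1) upper-bounds and (here) attains the prior."""
    p_u = kde(x_unlabeled, grid)
    p_p = kde(x_pos, grid)
    mask = p_p > support_cutoff        # only where p(x|+1) is well estimated
    return float(np.min(p_u[mask] / p_p[mask]))

rng = np.random.default_rng(2)
x_pos = rng.normal(-2.0, 1.0, 1000)                    # positive samples
x_unl = np.concatenate([rng.normal(-2.0, 1.0, 300),    # true prior = 0.3
                        rng.normal(+2.0, 1.0, 700)])
grid = np.linspace(-6.0, 6.0, 300)
pi_hat = pu_class_prior(x_pos, x_unl, grid)
```

The paper's penalized divergences make this kind of estimate principled and cancel the bias that plain density-ratio plug-ins incur; the sketch above only shows why positive and unlabeled data suffice.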


International Conference on Technologies and Applications of Artificial Intelligence | 2013

Clustering Unclustered Data: Unsupervised Binary Labeling of Two Datasets Having Different Class Balances

Marthinus Christoffel du Plessis; Gang Niu; Masashi Sugiyama

We consider the unsupervised learning problem of assigning labels to unlabeled data. A naive approach is to use clustering methods, but this works well only when data is properly clustered and each cluster corresponds to an underlying class. In this paper, we first show that this unsupervised labeling problem in balanced binary cases can be solved if two unlabeled datasets having different class balances are available. More specifically, estimation of the sign of the difference between probability densities of two unlabeled datasets gives the solution. We then introduce a new method to directly estimate the sign of the density difference without density estimation. Finally, we demonstrate the usefulness of the proposed method against several clustering methods on various toy problems and real-world datasets.

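The labeling idea behind this paper (assigning binary labels from the sign of the difference between two unlabeled datasets' densities) can be previewed with a naive sketch that estimates each density separately; the paper itself estimates the sign directly, avoiding this intermediate density-estimation step. Data and bandwidth below are illustrative.

```python
import numpy as np

def kde(samples, points, bw=0.4):
    """Fixed-bandwidth Gaussian kernel density estimate at given points."""
    z = (points[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z**2).mean(axis=1) / (bw * np.sqrt(2 * np.pi))

def label_by_density_difference(x1, x2, points):
    """Binary labels from sign(p1(x) - p2(x)): the class over-represented
    in dataset 1 dominates wherever the difference is positive."""
    return np.where(kde(x1, points) - kde(x2, points) > 0, 1, -1)

rng = np.random.default_rng(4)
comp_a = lambda n: rng.normal(-2.0, 1.0, n)   # underlying class A
comp_b = lambda n: rng.normal(+2.0, 1.0, n)   # underlying class B
x1 = np.concatenate([comp_a(800), comp_b(200)])   # class balance 0.8 / 0.2
x2 = np.concatenate([comp_a(200), comp_b(800)])   # class balance 0.2 / 0.8
test_points = np.array([-3.0, -2.0, 2.0, 3.0])
labels = label_by_density_difference(x1, x2, test_points)
```

Because the two datasets weight the same two components differently, the sign of the density difference separates the components without any labels.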

Neural Computation | 2018

Bias Reduction and Metric Learning for Nearest-Neighbor Estimation of Kullback-Leibler Divergence

Yung-Kyun Noh; Masashi Sugiyama; Song Liu; Marthinus Christoffel du Plessis; Frank C. Park; Daniel D. Lee

Nearest-neighbor estimators for the Kullback-Leibler (KL) divergence that are asymptotically unbiased have recently been proposed and demonstrated in a number of applications. However, with a small number of samples, nonparametric methods typically suffer from large estimation bias due to the nonlocality of information derived from nearest-neighbor statistics. In this letter, we show that this estimation bias can be mitigated by modifying the metric function, and we propose a novel method for learning a locally optimal Mahalanobis distance function from parametric generative models of the underlying density distributions. Using both simulations and experiments on a variety of data sets, we demonstrate that this interplay between approximate generative models and nonparametric techniques can significantly improve the accuracy of nearest-neighbor-based estimation of the KL divergence.

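The asymptotically unbiased k-nearest-neighbor KL estimator that this paper improves on can be sketched in one dimension, before any metric learning: KL(P||Q) is estimated as (d/n) * sum_i log(nu_k(i) / rho_k(i)) + log(m / (n - 1)), where rho_k is the k-NN distance within the P sample (self excluded) and nu_k the k-NN distance to the Q sample. The brute-force distance computation and Gaussian toy data are illustrative.

```python
import numpy as np

def knn_kl_divergence(x, y, k=1):
    """k-nearest-neighbor KL divergence estimator in one dimension."""
    n, m, d = len(x), len(y), 1
    dxx = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(dxx, np.inf)                     # exclude the point itself
    rho = np.partition(dxx, k - 1, axis=1)[:, k - 1]  # k-NN distance within x
    dxy = np.abs(x[:, None] - y[None, :])
    nu = np.partition(dxy, k - 1, axis=1)[:, k - 1]   # k-NN distance to y
    return d / n * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, 2000)   # samples from P = N(0, 1)
y = rng.normal(1.0, 1.0, 2000)   # samples from Q = N(1, 1); true KL = 0.5
kl_hat = knn_kl_divergence(x, y)
```

In higher dimensions and with few samples, the nonlocality of these nearest-neighbor statistics produces the bias that the paper's learned Mahalanobis metric is designed to reduce.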

Neural Computation | 2015

Online direct density-ratio estimation applied to inlier-based outlier detection

Marthinus Christoffel du Plessis; Hiroaki Shiino; Masashi Sugiyama

Many machine learning problems, such as nonstationarity adaptation, outlier detection, dimensionality reduction, and conditional density estimation, can be effectively solved by using the ratio of probability densities. Since the naive two-step procedure of first estimating the probability densities and then taking their ratio performs poorly, methods to directly estimate the density ratio from two sets of samples without density estimation have been extensively studied recently. However, these methods are batch algorithms that use the whole data set to estimate the density ratio, and they are inefficient in the online setup, where training samples are provided sequentially and solutions are updated incrementally without storing previous samples. In this letter, we propose two online density-ratio estimators based on the adaptive regularization of weight vectors. Through experiments on inlier-based outlier detection, we demonstrate the usefulness of the proposed methods.

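The online flavor of direct density-ratio estimation can be sketched with stochastic gradient steps on a squared-loss objective, consuming one sample pair per update. This is a simplification: the paper builds on adaptive regularization of weight vectors rather than plain SGD. Kernel centers, learning rate, and toy data are illustrative assumptions.

```python
import numpy as np

def rbf(x, centers, sigma=0.7):
    """Gaussian kernel features of a scalar input at the given centers."""
    return np.exp(-(x - centers) ** 2 / (2 * sigma**2))

def online_density_ratio(inliers, stream, centers, lr=0.1, lam=1e-3, epochs=5):
    """Fit r(x) = alpha . k(x) ~ p_inlier(x) / p_stream(x) by stochastic
    gradient steps on J = 0.5*E_stream[r^2] - E_inlier[r] + 0.5*lam*|alpha|^2,
    updating incrementally without storing previous samples."""
    alpha = np.zeros(len(centers))
    rng = np.random.default_rng(5)
    for _ in range(epochs):
        for x_in, x_st in zip(rng.permutation(inliers), rng.permutation(stream)):
            k_in, k_st = rbf(x_in, centers), rbf(x_st, centers)
            alpha -= lr * (k_st * (k_st @ alpha) - k_in + lam * alpha)
    return lambda x: rbf(x, centers) @ alpha

rng = np.random.default_rng(6)
inliers = rng.normal(0.0, 1.0, 1000)                   # clean reference data
stream = np.concatenate([rng.normal(0.0, 1.0, 900),    # mostly inliers...
                         rng.normal(5.0, 0.3, 100)])   # ...plus an outlier clump
centers = np.linspace(-2.0, 2.0, 10)
ratio = online_density_ratio(inliers, stream, centers)
# inlier-based outlier detection: low ratio values flag outliers
score_in, score_out = ratio(0.0), ratio(5.0)
```

Points well supported by the inlier distribution get ratio values near one, while the outlier clump scores near zero, which is the detection signal the abstract describes.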

Neural Information Processing Systems | 2014

Analysis of Learning from Positive and Unlabeled Data

Marthinus Christoffel du Plessis; Gang Niu; Masashi Sugiyama



IEICE Transactions on Information and Systems | 2014

Class Prior Estimation from Positive and Unlabeled Data

Marthinus Christoffel du Plessis; Masashi Sugiyama



International Conference on Machine Learning | 2012

Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching

Marthinus Christoffel du Plessis; Masashi Sugiyama



Neural Information Processing Systems | 2016

Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning

Gang Niu; Marthinus Christoffel du Plessis; Tomoya Sakai; Yao Ma; Masashi Sugiyama


Collaboration


Dive into Marthinus Christoffel du Plessis's collaboration.

Top Co-Authors

Gang Niu (Tokyo Institute of Technology)
Song Liu (Tokyo Institute of Technology)
Tuan Duong Nguyen (Tokyo Institute of Technology)
Hiroaki Shiino (Tokyo Institute of Technology)
Ichiro Takeuchi (Nagoya Institute of Technology)
Taiji Suzuki (Tokyo Institute of Technology)