Yuta Tsuboi | Researchain

Publication

Featured research published by Yuta Tsuboi.

Knowledge and Information Systems | 2011

Statistical outlier detection using direct density ratio estimation

Shohei Hido; Yuta Tsuboi; Hisashi Kashima; Masashi Sugiyama; Takafumi Kanamori

We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on the training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score. This approach is expected to have better performance even in high-dimensional problems since methods for directly estimating the density ratio without going through density estimation are available. Among various density ratio estimation methods, we employ the method called unconstrained least-squares importance fitting (uLSIF) since it is equipped with natural cross-validation procedures, allowing us to objectively optimize the value of tuning parameters such as the regularization parameter and the kernel width. Furthermore, uLSIF offers a closed-form solution as well as a closed-form formula for the leave-one-out error, so it is computationally very efficient and is scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach.
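The closed-form solution mentioned in the abstract can be illustrated with a minimal sketch, assuming one-dimensional data, Gaussian kernels, and fixed values for the kernel width and regularization parameter. The function names (`ulsif_fit`, `density_ratio`, `solve_linear`) and the toy data are hypothetical, and the cross-validation step described in the abstract is omitted. A low score marks a test point as a likely outlier.

```python
import math

def gaussian_kernel(x, c, sigma):
    """Gaussian basis function centered at c (1-D inputs assumed)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def ulsif_fit(train, test, centers, sigma, lam):
    """Estimate coefficients of the ratio p_train(x) / p_test(x).

    Solves (H + lam * I) alpha = h in closed form, then clips
    negative coefficients to zero.
    """
    b = len(centers)
    phi_te = [[gaussian_kernel(x, c, sigma) for c in centers] for x in test]
    phi_tr = [[gaussian_kernel(x, c, sigma) for c in centers] for x in train]
    # H: second moment of the basis functions over the test (denominator) samples.
    H = [[sum(p[i] * p[j] for p in phi_te) / len(test) for j in range(b)]
         for i in range(b)]
    # h: mean of the basis functions over the training (numerator) samples.
    h = [sum(p[i] for p in phi_tr) / len(train) for i in range(b)]
    for i in range(b):
        H[i][i] += lam
    alpha = solve_linear(H, h)
    return [max(0.0, a) for a in alpha]

def density_ratio(x, centers, alpha, sigma):
    """Outlier score: estimated p_train(x) / p_test(x)."""
    return sum(a * gaussian_kernel(x, c, sigma) for a, c in zip(alpha, centers))

# Toy data (an assumption for illustration): inliers on a grid in [-2, 2],
# and a test set that contains one far-away point at 6.0.
train = [i / 10.0 - 2.0 for i in range(41)]
test = train[::2] + [6.0]
centers = train[::4]
alpha = ulsif_fit(train, test, centers, sigma=1.0, lam=0.1)
scores = [density_ratio(x, centers, alpha, sigma=1.0) for x in test]
```

In this sketch the kernel centers are taken from the training samples, and `sigma` and `lam` are fixed by hand; in practice they would be chosen by the cross-validation procedure the abstract refers to.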

Journal of Information Processing | 2009

Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation

Yuta Tsuboi; Hisashi Kashima; Shohei Hido; Steffen Bickel; Masashi Sugiyama

Covariate shift is a situation in supervised learning where training and test inputs follow different distributions even though the functional relation remains unchanged. A common approach to compensating for the bias caused by covariate shift is to reweight the loss function according to the importance, which is the ratio of test and training densities. We propose a novel method that allows us to directly estimate the importance from samples without going through the hard task of density estimation. An advantage of the proposed method is that the computation time is nearly independent of the number of test input samples, which is highly beneficial in recent applications with large numbers of unlabeled samples. We demonstrate through experiments that the proposed method is computationally more efficient than existing approaches with comparable accuracy. We also describe a promising result for large-scale covariate shift adaptation in a natural language processing task.
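The reweighting idea in the abstract can be sketched as importance-weighted least squares for a one-dimensional linear model. This is an illustration only: the function name `importance_weighted_fit` is hypothetical, and the weights are assumed to be given (for example, by some density ratio estimator), so the paper's own estimation method is not reproduced here.

```python
def importance_weighted_fit(xs, ys, ws):
    """Fit y = a * x + b by minimizing sum_i w_i * (y_i - a * x_i - b)^2.

    ws[i] is assumed to approximate p_test(x_i) / p_train(x_i),
    the importance of training sample i.
    """
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # Closed-form solution of the weighted normal equations.
    a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    b = (sy - a * sx) / sw
    return a, b

# Toy usage: exactly linear data, so any positive weights recover the same line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = importance_weighted_fit(xs, ys, [1.0, 1.0, 1.0, 1.0])
```

With noisy data and a misspecified model, larger weights pull the fit toward regions where test inputs are dense, which is the bias correction the abstract describes.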

International Conference on Data Mining | 2008

Inlier-Based Outlier Detection via Direct Density Ratio Estimation

Shohei Hido; Yuta Tsuboi; Hisashi Kashima; Masashi Sugiyama; Takafumi Kanamori

We propose a new statistical approach to the problem of inlier-based outlier detection, i.e., finding outliers in the test set based on the training set consisting only of inliers. Our key idea is to use the ratio of training and test data densities as an outlier score; we estimate the ratio directly in a semi-parametric fashion without going through density estimation. Thus our approach is expected to have better performance in high-dimensional problems. Furthermore, the applied algorithm for density ratio estimation is equipped with a natural cross-validation procedure, allowing us to objectively optimize the value of tuning parameters such as the regularization parameter and the kernel width. The algorithm offers a closed-form solution as well as a closed-form formula for the leave-one-out error. Thanks to this, the proposed outlier detection method is computationally very efficient and is scalable to massive datasets. Simulations with benchmark and real-world datasets illustrate the usefulness of the proposed approach.

International Conference on Computational Linguistics | 2008

Training Conditional Random Fields Using Incomplete Annotations

Yuta Tsuboi; Hisashi Kashima; Shinsuke Mori; Hiroki Oda; Yuji Matsumoto

We address corpus-building situations in which completely annotating the whole corpus is time-consuming and unrealistic. In such situations, annotation is done only on crucial parts of sentences, or contains unresolved label ambiguities. We propose a parameter estimation method for Conditional Random Fields (CRFs) that enables us to use such incomplete annotations. We show promising results of our method as applied to two types of NLP tasks: a domain adaptation task of Japanese word segmentation using partial annotations, and a part-of-speech tagging task using ambiguous tags in the Penn Treebank corpus.

International Conference on Machine Learning | 2004

Kernel-based discriminative learning algorithms for labeling sequences, trees, and graphs

Hisashi Kashima; Yuta Tsuboi

We introduce a new perceptron-based discriminative learning algorithm for labeling structured data such as sequences, trees, and graphs. Since it is fully kernelized and uses pointwise label prediction, large features, including an arbitrary number of hidden variables, can be incorporated with polynomial time complexity. This is in contrast to existing labelers that can handle only features of a small number of hidden variables, such as Maximum Entropy Markov Models and Conditional Random Fields. We also introduce several kernel functions for labeling sequences, trees, and graphs, along with efficient algorithms for them.

North American Chapter of the Association for Computational Linguistics | 2003

Learning sequence-to-sequence correspondences from parallel corpora via sequential pattern mining

Kaoru Yamamoto; Taku Kudo; Yuta Tsuboi; Yuji Matsumoto

We present an unsupervised extraction of sequence-to-sequence correspondences from parallel corpora by sequential pattern mining. The main characteristics of our method are two-fold. First, we propose a systematic way to enumerate all possible translation pair candidates of rigid and gapped sequences without falling into combinatorial explosion. Second, our method uses an efficient data structure and algorithm for calculating frequencies in a contingency table for each translation pair candidate. Our method is empirically evaluated using English-Japanese parallel corpora of 6 million words. Results indicate that it works well for multi-word translations, giving 56-84% accuracy at 19% token coverage and 11% type coverage.

Empirical Methods in Natural Language Processing | 2014

Neural Networks Leverage Corpus-wide Information for Part-of-speech Tagging

Yuta Tsuboi

We propose a neural network approach to benefit from the non-linearity of corpus-wide statistics for part-of-speech (POS) tagging. We investigated several types of corpus-wide information for the words, such as word embeddings and POS tag distributions. Since these statistics are encoded as dense continuous features, it is not trivial to combine them with sparse discrete features. Our tagger is designed as a combination of a linear model for discrete features and a feed-forward neural network that captures the non-linear interactions among the continuous features. By using several recent advances in the activation functions for neural networks, the proposed method marks new state-of-the-art accuracies for English POS tagging tasks.

Empirical Methods in Natural Language Processing | 2016

Addressee and Response Selection for Multi-Party Conversation

Hiroki Ouchi; Yuta Tsuboi

To create conversational systems working in actual situations, it is crucial to assume that they interact with multiple agents. In this work, we tackle addressee and response selection for multi-party conversation, in which systems are expected to select whom they address as well as what they say. The key challenge of this task is to jointly model who is talking about what in a previous context. For the joint modeling, we propose two modeling frameworks: 1) static modeling and 2) dynamic modeling. To show benchmark results of our frameworks, we created a multi-party conversation corpus. Our experiments on the dataset show that the recurrent neural network based models of our frameworks robustly predict addressees and responses in conversations with a large number of agents.

Empirical Methods in Natural Language Processing | 2014

Learning from a Neighbor: Adapting a Japanese Parser for Korean Through Feature Transfer Learning

Hiroshi Kanayama; Youngja Park; Yuta Tsuboi; Dongmook Yi

We present a new dependency parsing method for Korean applying cross-lingual transfer learning and domain adaptation techniques. Unlike existing transfer learning methods relying on aligned corpora or bilingual lexicons, we propose a feature transfer learning method with minimal supervision, which adapts an existing parser to the target language by transferring the features for the source language to the target language. Specifically, we utilize the Triplet/Quadruplet Model, a hybrid parsing algorithm for Japanese, and apply a delexicalized feature transfer for Korean. Experiments with the Penn Korean Treebank show that using only the transferred features from Japanese achieves a high accuracy (81.6%) for Korean dependency parsing. Further improvements were obtained when a small annotated Korean corpus was combined with the Japanese training corpus, confirming that efficient cross-lingual transfer learning can be achieved without expensive linguistic resources.

International Conference on Pattern Recognition | 2008

A new objective function for sequence labeling

Yuta Tsuboi; Hisashi Kashima

We propose a new loss function for discriminative learning of Markov random fields, which is an intermediate loss function between the sequential loss and the pointwise loss. We show that this loss function has a "Markov property", that is, the importance of correct labeling for a particular position depends on the number of correct labels around it. This property works to keep local consistencies among the assigned labels, and is useful for optimizing systems identifying structural segments, such as information extraction systems.
