Publication


Featured research published by Akinori Fujino.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008

Semisupervised Learning for a Hybrid Generative/Discriminative Classifier based on the Maximum Entropy Principle

Akinori Fujino; Naonori Ueda; Kazumi Saito

This paper presents a method for designing semisupervised classifiers trained on labeled and unlabeled samples. We focus on a probabilistic semisupervised classifier design for multiclass and single-labeled classification problems and propose a hybrid approach that takes advantage of generative and discriminative approaches. In our approach, we first consider a generative model trained by using labeled samples and introduce a bias correction model, where these models belong to the same model family but have different parameters. Then, we construct a hybrid classifier by combining these models based on the maximum entropy principle. To enable us to apply our hybrid approach to text classification problems, we employed naive Bayes models as the generative and bias correction models. Our experimental results for four text data sets confirmed that the generalization ability of our hybrid classifier was much improved by using a large number of unlabeled samples for training when there were too few labeled samples to obtain good performance. We also confirmed that our hybrid approach significantly outperformed the generative and discriminative approaches when the performance of the generative and discriminative approaches was comparable. Moreover, we examined the performance of our hybrid classifier when the labeled and unlabeled data distributions were different.
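The maximum entropy combination at the core of the approach can be sketched as follows. This is a minimal illustration assuming multinomial naive Bayes component models with given log word-probability tables and given combination weights `lam` (which the method itself learns from labeled data); the function names are hypothetical.

```python
import numpy as np

def nb_log_lik(X, log_cond):
    """Per-class log p(x|y) for count vectors X under a naive Bayes
    word-probability table log_cond of shape (n_classes, n_words)."""
    return X @ log_cond.T  # (n_samples, n_classes)

def hybrid_posterior(X, log_cond_gen, log_cond_bias, lam, log_prior):
    """Maximum-entropy combination of a generative model and a bias
    correction model: p(y|x) ∝ exp(lam[0]*log p_gen(x|y)
    + lam[1]*log p_bias(x|y) + log p(y))."""
    score = (lam[0] * nb_log_lik(X, log_cond_gen)
             + lam[1] * nb_log_lik(X, log_cond_bias)
             + log_prior)
    score -= score.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(score)
    return p / p.sum(axis=1, keepdims=True)
```

With `lam = (1, 0)` the classifier reduces to the generative model alone; the weights interpolate between the two component models.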


European Conference on Machine Learning | 2015

Convex Factorization Machines

Mathieu Blondel; Akinori Fujino; Naonori Ueda

Factorization machines are a generic framework that can mimic many factorization models simply through feature engineering. In this way, they combine the high predictive accuracy of factorization models with the flexibility of feature engineering. Unfortunately, factorization machines involve a non-convex optimization problem and are thus subject to bad local minima. In this paper, we propose a convex formulation of factorization machines based on the nuclear norm. Our formulation imposes fewer restrictions on the learned model and is thus more general than the original formulation. To solve the corresponding optimization problem, we present an efficient globally convergent two-block coordinate descent algorithm. Empirically, we demonstrate that our approach achieves comparable or better predictive accuracy than the original factorization machines on four recommendation tasks and scales to datasets with 10 million samples.
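To illustrate the difference between the two formulations: the original FM parameterizes its quadratic term with a rank-k factorization V Vᵀ, while the convex variant uses a full symmetric matrix Z penalized by its nuclear norm. A minimal sketch of the prediction and penalty (hypothetical function names, not the paper's solver):

```python
import numpy as np

def fm_predict(x, w0, w, Z):
    """Prediction with a full symmetric interaction matrix Z in place of
    the original FM's rank-k factorization: the quadratic term sums
    Z[i, j] * x[i] * x[j] over pairs i < j."""
    quad = 0.5 * (x @ Z @ x - np.sum(np.diag(Z) * x**2))
    return w0 + w @ x + quad

def nuclear_norm(Z):
    """Nuclear norm (sum of singular values), the convex surrogate that
    encourages Z to be low rank."""
    return np.sum(np.linalg.svd(Z, compute_uv=False))
```

Penalizing `nuclear_norm(Z)` plays the role that fixing the rank k plays in the original model, but keeps the training objective convex.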


Information Processing and Management | 2007

A hybrid generative/discriminative approach to text classification with additional information

Akinori Fujino; Naonori Ueda; Kazumi Saito

This paper presents a classifier for text data samples consisting of main text and additional components, such as Web pages and technical papers. We focus on multiclass and single-labeled text classification problems and design the classifier based on a hybrid composed of probabilistic generative and discriminative approaches. Our formulation considers individual component generative models and constructs the classifier by combining these trained models based on the maximum entropy principle. We use naive Bayes models as the component generative models for the main text and additional components such as titles, links, and authors, so that we can apply our formulation to document and Web page classification problems. Our experimental results for four test collections confirmed that our hybrid approach effectively combined main text and additional components and thus improved classification performance.


Knowledge and Information Systems | 2013

Adaptive semi-supervised learning on labeled and unlabeled data with different distributions

Akinori Fujino; Naonori Ueda; Masaaki Nagata

Developing methods for designing good classifiers from labeled samples whose distribution differs from that of the test samples is an important and challenging research issue in machine learning and its applications. This paper focuses on designing semi-supervised classifiers with a high generalization ability by using unlabeled samples drawn from the same distribution as the test samples, and presents a semi-supervised learning method based on a hybrid discriminative and generative model. Although JESS-CM is one of the most successful semi-supervised classifier design frameworks based on a hybrid approach, it suffers from overfitting in the task setting that we consider in this paper. We propose an objective function that utilizes both labeled and unlabeled samples for the discriminative training of hybrid classifiers, and which we expect to mitigate the overfitting problem. We show the effect of the objective function through theoretical analysis and empirical evaluation. Our experimental results for text classification using four typical benchmark test collections confirmed that, in our task setting, the proposed method outperformed the JESS-CM framework in most cases. We also confirmed experimentally that the proposed method was useful for obtaining better performance when classifying data samples into known classes (those included in the given labeled samples) and unknown classes (those that are not).


Conference on Information and Knowledge Management | 2010

A robust semi-supervised classification method for transfer learning

Akinori Fujino; Naonori Ueda; Masaaki Nagata

The transfer learning problem of designing good classifiers with a high generalization ability by using labeled samples whose distribution differs from that of the test samples is an important and challenging research issue in the fields of machine learning and data mining. This paper focuses on designing a semi-supervised classifier trained on unlabeled samples drawn from the same distribution as the test samples, and presents a semi-supervised classification method for the transfer learning problem based on a hybrid discriminative and generative model. Although JESS-CM is one of the most successful semi-supervised classifier design frameworks and has achieved the best published results in NLP tasks, it suffers from overfitting in the transfer learning settings that we consider in this paper. We expect the overfitting problem to be mitigated by the proposed method, which utilizes both labeled and unlabeled samples for the discriminative training of classifiers. We also present a refined objective that formalizes the training algorithm and classifier form. Our experimental results for text classification using three typical benchmark test collections confirmed that the proposed method outperformed the JESS-CM framework in most transfer learning settings.


Information Processing and Management | 2012

Flexible sample selection strategies for transfer learning in ranking

Kevin Duh; Akinori Fujino

Ranking is a central component in information retrieval systems; as such, many machine learning methods for building rankers have been developed in recent years. An open problem is transfer learning, i.e., how labeled training data from one domain/market can be used to build rankers for another. We propose a flexible transfer learning strategy based on sample selection. Source domain training samples are selected if the functional relationship between features and labels does not deviate much from that of the target domain. This is achieved through a novel application of recent advances in density ratio estimation. The approach is flexible, scalable, and modular; it allows many existing supervised rankers to be adapted to the transfer learning setting. Results on two datasets (Yahoo's Learning to Rank Challenge and Microsoft's LETOR data) show that the proposed method gives robust improvements.
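One standard way to estimate a density ratio, sketched below under the assumption of a classifier-based estimator (the paper's exact estimator may differ), is to train a probabilistic source-vs-target domain classifier and convert its posterior into the ratio p_target(x)/p_source(x); the source samples with the highest estimated ratio are then kept. Function names are hypothetical.

```python
import numpy as np

def density_ratio_weights(src, tgt, steps=500, lr=0.1):
    """Estimate w(x) = p_tgt(x) / p_src(x) via a logistic domain
    classifier: w(x) ∝ P(target | x) / P(source | x), trained here by
    plain gradient descent."""
    X = np.vstack([src, tgt])
    X = np.hstack([X, np.ones((len(X), 1))])          # bias feature
    y = np.r_[np.zeros(len(src)), np.ones(len(tgt))]  # 0 = source, 1 = target
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ theta, -30, 30)))
        theta -= lr * X.T @ (p - y) / len(X)
    Xs = np.hstack([src, np.ones((len(src), 1))])
    ps = 1.0 / (1.0 + np.exp(-np.clip(Xs @ theta, -30, 30)))
    ps = np.clip(ps, 1e-9, 1 - 1e-9)
    return (len(src) / len(tgt)) * ps / (1.0 - ps)

def select_samples(src, tgt, keep=0.5):
    """Indices of the source samples whose estimated ratio is highest,
    i.e. those most consistent with the target distribution."""
    w = density_ratio_weights(src, tgt)
    k = max(1, int(keep * len(src)))
    return np.argsort(-w)[:k]
```

The selected subset can then be fed to any existing supervised ranker, which is what makes the strategy modular.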


International Conference on Data Mining | 2016

A Semi-Supervised AUC Optimization Method with Generative Models

Akinori Fujino; Naonori Ueda

This paper presents a semi-supervised learning method for improving the performance of AUC-optimized classifiers by using both labeled and unlabeled samples. In actual binary classification tasks, there is often an imbalance between the numbers of positive and negative samples. For such imbalanced tasks, the area under the ROC curve (AUC) is an effective measure with which to evaluate binary classifiers. The proposed method utilizes generative models to assist the incorporation of unlabeled samples in AUC-optimized classifiers. The generative models provide prior knowledge that helps learn the distribution of unlabeled samples. To evaluate the proposed method in text classification, we employed naive Bayes models as the generative models. Our experimental results using three test collections confirmed that the proposed method provided better classifiers for imbalanced tasks than supervised AUC-optimized classifiers and semi-supervised classifiers trained to maximize the classification accuracy of labeled samples. Moreover, the proposed method improved the effect of using unlabeled samples for AUC optimization especially when we used appropriate generative models.
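For context, the objective underlying AUC optimization is pairwise: a scorer is trained so that positive samples outrank negative ones. The sketch below shows only this supervised pairwise surrogate with a linear scorer; it omits the paper's generative-model assistance for unlabeled samples, and the names are hypothetical.

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def train_auc_linear(Xp, Xn, steps=200, lr=0.5):
    """Linear scorer minimizing the pairwise logistic surrogate
    sum_{i,j} log(1 + exp(-(w·xp_i - w·xn_j))), a smooth proxy for 1 - AUC."""
    w = np.zeros(Xp.shape[1])
    for _ in range(steps):
        diff = (Xp @ w)[:, None] - (Xn @ w)[None, :]      # (n_pos, n_neg)
        s = 1.0 / (1.0 + np.exp(np.clip(diff, -30, 30)))  # sigma(-diff)
        grad = -(s.sum(axis=1) @ Xp - s.sum(axis=0) @ Xn) / s.size
        w -= lr * grad
    return w
```

Because every positive is paired with every negative, this objective is insensitive to class imbalance in a way that plain accuracy is not, which is why AUC is the natural target for imbalanced tasks.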


International Conference on Pattern Recognition | 2014

Large-Scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex

Mathieu Blondel; Akinori Fujino; Naonori Ueda

Dual decomposition methods are the current state-of-the-art for training multiclass formulations of Support Vector Machines (SVMs). At every iteration, dual decomposition methods update a small subset of dual variables by solving a restricted optimization problem. In this paper, we propose an exact and efficient method for solving the restricted problem. In our method, the restricted problem is reduced to the well-known problem of Euclidean projection onto the positive simplex, which we can solve exactly in expected O(k) time, where k is the number of classes. We demonstrate that our method empirically achieves state-of-the-art convergence on several large-scale high-dimensional datasets.
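The projection subproblem is classical. A sort-based variant is sketched below; it runs in O(k log k) rather than the expected O(k) of the pivot-based version described in the paper, but computes the same projection.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex
    {x : x >= 0, sum(x) = 1}: find the threshold theta such that
    x = max(v - theta, 0) sums to one."""
    u = np.sort(v)[::-1]                       # sort descending
    css = np.cumsum(u)
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / ks > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)
```

In the dual decomposition setting, each restricted subproblem over one example's k dual variables reduces to exactly this projection, so a fast projection routine directly bounds the per-iteration cost.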


Archive | 2014

Robust Naive Bayes Combination of Multiple Classifications

Naonori Ueda; Yusuke Tanaka; Akinori Fujino

When we face new complex classification tasks, it is difficult to design a good feature set for the observed raw data, so we often obtain an unsatisfactorily biased classifier. Namely, the trained classifier can only successfully classify certain classes of samples owing to its poor feature set. To tackle this problem, we propose a robust naive Bayes combination scheme in which we effectively combine predictions obtained from different classifiers and/or different feature sets. Since we assume that the multiple classifier predictions are given, any type of classifier and any feature set can be used in our scheme. In our combination scheme, each prediction is regarded as an independent realization of a categorical random variable (i.e., a class label), and a naive Bayes model is trained on a set of the predictions within a supervised learning framework. The key feature of our scheme is the introduction of a class-specific variable selection mechanism to avoid overfitting to poor classifier predictions. We demonstrate the practical benefit of our simple combination scheme on both synthetic and real data sets, and show that it can achieve much higher classification accuracy than conventional ensemble classifiers.
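A minimal sketch of the basic combination idea, omitting the paper's class-specific variable selection mechanism: treat each classifier's predicted label as a categorical feature and fit a naive Bayes model over the prediction matrix (function names are hypothetical).

```python
import numpy as np

def train_nb_combiner(preds, y, n_classes, alpha=1.0):
    """Naive Bayes over classifier outputs: preds[i, m] is classifier m's
    predicted label for sample i.  Learns P(y) and P(pred_m = c | y)
    with Laplace smoothing alpha."""
    n, M = preds.shape
    prior = np.bincount(y, minlength=n_classes) + alpha
    prior = prior / prior.sum()
    cond = np.full((M, n_classes, n_classes), alpha)  # [m, true y, predicted c]
    for i in range(n):
        for m in range(M):
            cond[m, y[i], preds[i, m]] += 1.0
    cond /= cond.sum(axis=2, keepdims=True)
    return np.log(prior), np.log(cond)

def combine(preds_row, log_prior, log_cond):
    """MAP class under the naive Bayes model, given one row of predictions."""
    score = log_prior.copy()
    for m, c in enumerate(preds_row):
        score = score + log_cond[m, :, c]
    return int(np.argmax(score))
```

Because the conditionals P(pred_m = c | y) are learned per classifier, an unreliable classifier's votes are automatically discounted, which is the robustness the scheme aims for.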


Asia Information Retrieval Symposium | 2005

A classifier design based on combining multiple components by maximum entropy principle

Akinori Fujino; Naonori Ueda; Kazumi Saito

Designing high performance classifiers for structured data consisting of multiple components is an important and challenging research issue in the field of machine learning. Although the main component of structured data plays an important role when designing classifiers, additional components may contain beneficial information for classification. This paper focuses on a probabilistic classifier design for multiclass classification based on the combination of main and additional components. Our formulation separately considers component generative models and constructs the classifier by combining these trained models based on the maximum entropy principle. We use naive Bayes models as the component generative models for text and link components so that we can apply our classifier design to document and web page classification problems. Our experimental results for three test collections confirmed that the proposed method effectively combined the main and additional components to improve classification performance.

Collaboration


Dive into Akinori Fujino's collaborations.

Top Co-Authors

Naonori Ueda, Nippon Telegraph and Telephone
Masaaki Nagata, Nippon Telegraph and Telephone
Hiroyuki Shindo, Nara Institute of Science and Technology
Hideki Isozaki, Nippon Telegraph and Telephone
Jun Suzuki, Nippon Telegraph and Telephone
Kevin Duh, Nara Institute of Science and Technology
Mathieu Blondel, Nippon Telegraph and Telephone
Hirotoshi Taira, Nippon Telegraph and Telephone