
Publication


Featured research published by Hidekazu Oiwa.


European Conference on Machine Learning | 2011

Frequency-aware truncated methods for sparse online learning

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

Online supervised learning with L1-regularization has gained attention recently because it generally requires less computation time and lower space complexity than batch learning methods. However, simple L1-regularization used in an online setting has the side effect that rare features tend to be truncated more than necessary, even though feature frequency is highly skewed in many applications. We developed a new family of L1-regularization methods based on the previous updates for loss minimization in linear online learning settings. Our methods can identify and retain low-frequency but informative features at the same computational cost and convergence rate as previous work. Moreover, we combined our methods with a cumulative penalty model to derive models that are more robust to noisy data. We applied our methods to several datasets and empirically evaluated their performance. Experimental results showed that our frequency-aware truncated models improved prediction accuracy.
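
The mechanism can be illustrated with a minimal sketch of online L1-regularized logistic regression in which each coordinate's truncation threshold is scaled by how often that feature has appeared so far. The function name, the logistic loss, and the counts / t scaling are illustrative assumptions, not the paper's exact update rules.

```python
# Minimal sketch of frequency-aware truncation for online L1-regularized
# logistic regression (illustrative assumptions, not the published algorithm).
import numpy as np

def frequency_aware_sgd_l1(stream, n_features, eta=0.1, lam=0.01):
    """stream yields (x, y) pairs, x a dense np.ndarray and y in {-1, +1}."""
    w = np.zeros(n_features)
    counts = np.zeros(n_features)          # how often each feature has appeared
    for t, (x, y) in enumerate(stream, start=1):
        counts += (x != 0)
        margin = y * w.dot(x)
        grad = -y * x / (1.0 + np.exp(margin))   # logistic-loss subgradient
        w -= eta * grad
        # Per-feature truncation: rare features receive a proportionally
        # smaller L1 threshold, so they are not wiped out between updates.
        threshold = lam * eta * (counts / t)
        w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)
    return w
```

With plain truncation the threshold would be the constant lam * eta for every coordinate; scaling it by the empirical frequency counts / t keeps rarely observed coordinates from being shrunk faster than they can be learned.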


International Conference on Data Mining | 2012

Healing Truncation Bias: Self-Weighted Truncation Framework for Dual Averaging

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

We propose a new truncation framework for online supervised learning. Learning a compact predictive model in an online setting has recently attracted a great deal of attention. The combination of online learning with sparsity-inducing regularization enables faster learning with a smaller memory footprint than a conventional learning framework. However, a simple combination of the two truncates the weights of features that rarely appear, even if those features are crucial for prediction, and it is difficult to emphasize such features in advance while preserving the advantages of online learning. We develop an extended truncation framework for Dual Averaging that retains rarely occurring but informative features. The proposed framework integrates information from all previous subgradients of the loss functions into the regularization term. This enhancement of conventional L1-regularization adjusts each feature's truncation automatically, which lets us identify and retain rare but informative features without preprocessing. In addition, our framework achieves the same computational complexity and regret bound as standard Dual Averaging. Experiments demonstrated that our framework outperforms other sparse online learning algorithms.
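
As a rough illustration, the sketch below extends L1-regularized Dual Averaging with a per-feature threshold built from the accumulated subgradient magnitudes. The hinge loss and the normalization of lam_j are assumptions made for illustration, not the paper's exact formulation.

```python
# Minimal sketch of Dual Averaging with a self-weighted, per-feature
# L1 threshold (the weighting scheme below is an illustrative assumption).
import numpy as np

def self_weighted_rda(stream, n_features, gamma=1.0, lam=0.05):
    """stream yields (x, y) with y in {-1, +1}; hinge-loss subgradients."""
    g_sum = np.zeros(n_features)        # sum of subgradients
    g_abs_sum = np.zeros(n_features)    # sum of |subgradients|, drives the weighting
    w = np.zeros(n_features)
    for t, (x, y) in enumerate(stream, start=1):
        g = -y * x if y * w.dot(x) < 1.0 else np.zeros(n_features)
        g_sum += g
        g_abs_sum += np.abs(g)
        g_bar = g_sum / t
        # Features with little accumulated evidence get a smaller threshold
        # than the uniform lam of standard L1-RDA, so they are retained.
        lam_j = lam * g_abs_sum / (g_abs_sum.max() + 1e-12)
        shrink = np.maximum(np.abs(g_bar) - lam_j, 0.0)
        w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink
    return w
```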


Science in China Series F: Information Sciences | 2014

Feature-aware regularization for sparse online learning

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

Learning a compact predictive model in an online setting has recently gained a great deal of attention. The combination of online learning with sparsity-inducing regularization enables faster learning with a smaller memory footprint than previous learning frameworks, and many optimization methods and learning algorithms have been developed on the basis of online learning with L1-regularization. However, L1-regularization tends to truncate some types of parameters, such as those that rarely occur or have a small range of values, unless they are emphasized in advance, and adding a pre-processing step would make it very difficult to preserve the advantages of online learning. We propose a new regularization framework for sparse online learning. We focus on the regularization term and enhance the state-of-the-art regularization approach by integrating information from all previous subgradients of the loss function into it. The resulting algorithms adjust the intensity of each feature's truncation during online learning without pre-processing and eventually eliminate the bias of L1-regularization. We show theoretical properties of our framework, namely its computational complexity and an upper bound on its regret. Experiments demonstrated that our algorithms outperformed previous methods in many classification tasks.
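
The central object is a regularization term whose per-feature weight is assembled from all previous subgradients of the loss. Roughly, and with the normalization left as an assumption, it has the following shape:

```latex
% Illustrative form of a feature-aware L1 term; the normalization is assumed.
R_t(w) \;=\; \sum_{j=1}^{d} \lambda_{t,j}\,\lvert w_j \rvert,
\qquad
\lambda_{t,j} \;\propto\; \lambda \sum_{s=1}^{t} \lvert g_{s,j} \rvert
```

Here g_{s,j} denotes the j-th coordinate of the loss subgradient at round s, so coordinates with little accumulated evidence receive a smaller effective penalty.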


Journal of Information Processing | 2018

Mining Words in the Minds of Second Language Learners for Learner-specific Word Difficulty

Yo Ehara; Issei Sato; Hidekazu Oiwa; Hiroshi Nakagawa

While there have been many studies on measuring the size of learners' vocabulary or the vocabulary they should learn, there have been few studies on which words learners actually know. We therefore investigated theoretically and practically important models for predicting second language learners' vocabulary and propose a new model for this vocabulary prediction task. In current models, the same word difficulty measure is shared by all learners. This is unrealistic because some learners have special interests: a learner interested in music may know specialized music-related terms regardless of their difficulty. To solve this problem, our model can define a learner-specific word difficulty measure. Our model is also an extension of the current models in the sense that they are special cases of it. In a qualitative evaluation, we defined a measure of how learner-specific a word is. Interestingly, the word with the highest learner-specificity was "twitter": although "twitter" is a difficult English word, some low-ability learners presumably knew it through the famous micro-blogging service. Our qualitative evaluation successfully extracted such interesting and suggestive examples, and our model achieved accuracy competitive with the current models.
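
A minimal sketch of the contrast described above: a Rasch-style model with one shared difficulty per word versus an extension in which a learner-word interaction term shifts the effective difficulty per learner. The interaction form u_i . v_j is an assumed illustration, not the authors' exact parameterization.

```python
# Minimal sketch: shared word difficulty vs. a learner-specific extension.
# The interaction-term formulation is an illustrative assumption.
import numpy as np

def p_knows_shared(theta_i, d_j):
    """P(learner i knows word j) with one difficulty d_j shared by all learners."""
    return 1.0 / (1.0 + np.exp(-(theta_i - d_j)))

def p_knows_learner_specific(theta_i, d_j, u_i, v_j):
    """A learner-word interaction shifts the effective difficulty, so a music
    fan can know rare music terms despite their nominal difficulty."""
    return 1.0 / (1.0 + np.exp(-(theta_i - d_j + u_i.dot(v_j))))

# Example: an average learner facing a nominally difficult word
theta, d = 0.0, 2.0
u = np.array([1.5, 0.0])   # learner's interest vector (assumed values)
v = np.array([2.0, 0.0])   # word's topic vector (assumed values)
print(p_knows_shared(theta, d))                   # low under the shared model
print(p_knows_learner_specific(theta, d, u, v))   # much higher with the interaction
```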


European Conference on Machine Learning | 2014

Robust distributed training of linear classifiers based on divergence minimization principle

Junpei Komiyama; Hidekazu Oiwa; Hiroshi Nakagawa

We study distributed training of a linear classifier in which the data is separated into many shards and each worker has access only to its own shard. The goal of this distributed training is to use the data of all shards to obtain a well-performing linear classifier. The iterative parameter mixture (IPM) framework (Mann et al., 2009) is a state-of-the-art distributed learning framework with a strong theoretical guarantee when the data is clean. However, contamination of shards, which sometimes arises in real-world environments, severely degrades the performance of distributed training. To remedy the negative effect of contamination, we propose a divergence minimization principle for determining the mixture weights in IPM. From this principle we naturally derive the Beta-IPM scheme, which leverages robust estimation based on the beta divergence. A mistake/loss bound analysis indicates the advantage of Beta-IPM in contaminated environments, and experiments with various datasets revealed that Beta-IPM suppresses the influence of contamination even when 80% of the shards are contaminated.
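
The overall shape of iterative parameter mixing with robust mixture weights can be sketched as follows. The Gaussian-style reweighting stands in for the paper's beta-divergence-based estimation and is an assumption for illustration only.

```python
# Minimal sketch of iterative parameter mixing (IPM) with robust mixture
# weights that down-weight outlying shards. The reweighting rule is an
# illustrative stand-in for beta-divergence-based robust estimation.
import numpy as np

def robust_parameter_mixture(shard_models, n_iter=20, bandwidth=1.0):
    """shard_models: array of shape (n_shards, n_features), one linear model
    trained per shard. Returns a robust weighted average of the models."""
    W = np.asarray(shard_models, dtype=float)
    mix = W.mean(axis=0)                      # plain IPM average as a start
    for _ in range(n_iter):
        # Shards whose models sit far from the current mixture get small
        # weight, which suppresses contaminated shards.
        dists = np.linalg.norm(W - mix, axis=1)
        alpha = np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2))
        alpha /= alpha.sum()
        mix = alpha @ W
    return mix
```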


Empirical Methods in Natural Language Processing | 2014

Formalizing Word Sampling for Vocabulary Prediction as Graph-based Active Learning

Yo Ehara; Yusuke Miyao; Hidekazu Oiwa; Issei Sato; Hiroshi Nakagawa

Predicting the vocabulary of second language learners is essential for supporting their language learning; however, because language vocabularies are large, we cannot collect information on the entire vocabulary. For practical measurement, we need to sample a small portion of words from the entire vocabulary and predict the rest. In this study, we propose a novel framework for this sampling. Current methods rely on simple heuristics that involve inflexible manual tuning by educational experts. We formalize these heuristics as graph-based non-interactive active learning applied to a special graph, and we show that by extending the graph we can support additional functionality such as incorporating domain specificity and sampling from multiple corpora. In our experiments, our extended methods outperform other methods in terms of vocabulary prediction accuracy when the number of samples is small.
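
As a rough picture of the sampling step, the sketch below greedily picks words from a word similarity graph so that the sampled set covers the remaining words well. The greedy coverage objective is an assumed stand-in for the paper's graph-based active learning formulation.

```python
# Minimal sketch of sampling a small word set from a similarity graph so that
# labels on the sample cover the rest (greedy coverage as an assumed stand-in).
import numpy as np

def greedy_graph_sampling(adjacency, n_samples):
    """adjacency: (n_words, n_words) nonnegative similarity matrix.
    Greedily picks words that cover the most not-yet-covered neighbors."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    covered = np.zeros(n, dtype=bool)
    chosen = []
    for _ in range(n_samples):
        gains = A[:, ~covered].sum(axis=1)   # similarity to uncovered words
        for c in chosen:                     # never pick the same word twice
            gains[c] = -np.inf
        best = int(np.argmax(gains))
        chosen.append(best)
        covered |= A[best] > 0               # its neighbors are now covered
        covered[best] = True
    return chosen
```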


Neural Information Processing Systems | 2014

Partition-wise Linear Models

Hidekazu Oiwa; Ryohei Fujimaki


Pacific-Basin Finance Journal | 2012

The Economic Impact of Herd Behavior in the Japanese Loan Market

Ryuichi Nakagawa; Hidekazu Oiwa; Fumiko Takeda


Graph-Based Methods for Natural Language Processing | 2013

Understanding seed selection in bootstrapping

Yo Ehara; Issei Sato; Hidekazu Oiwa; Hiroshi Nakagawa


International Conference on Computational Linguistics | 2012

Mining Words in the Minds of Second Language Learners: Learner-Specific Word Difficulty

Yo Ehara; Issei Sato; Hidekazu Oiwa; Hiroshi Nakagawa

Collaboration


Top co-authors of Hidekazu Oiwa:

Yo Ehara

National Institute of Information and Communications Technology


Yusuke Miyao

National Institute of Informatics
