
Publication


Featured research published by Shin Matsushima.


knowledge discovery and data mining | 2012

Linear support vector machines via dual cached loops

Shin Matsushima; S. V. N. Vishwanathan; Alexander J. Smola

Modern computer hardware offers an elaborate hierarchy of storage subsystems with different speeds, capacities, and costs associated with them. Furthermore, processors are now inherently parallel, executing several diverse threads simultaneously. This paper proposes StreamSVM, the first algorithm for training linear Support Vector Machines (SVMs) that takes advantage of these properties by integrating caching with optimization. StreamSVM works by performing updates in the dual, thus obviating the need to rebalance frequently visited examples. Furthermore, we trade off file I/O with data expansion on the fly by generating features on demand. This significantly increases throughput. Experiments show that StreamSVM outperforms other linear SVM solvers, including the award-winning work of [38], by orders of magnitude and produces more accurate solutions within a shorter amount of time.
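The dual update at the heart of such a solver follows the well-known dual coordinate descent scheme for linear SVMs. The sketch below illustrates that scheme over a small in-memory cache of examples; it assumes a hinge loss and a fixed random cache, and does not reproduce StreamSVM's actual caching or on-demand feature-expansion policy.

    import numpy as np

    def dcd_svm_cached(X, y, C=1.0, cache_size=1000, epochs=10, seed=0):
        """Dual coordinate descent for a linear SVM (hinge loss),
        sweeping repeatedly over a cached subset of examples.
        Illustrative sketch only; not the StreamSVM caching policy."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        alpha = np.zeros(n)                  # dual variables, one per example
        w = np.zeros(d)                      # primal weights, kept consistent with alpha
        sq_norms = (X ** 2).sum(axis=1)
        cache = rng.choice(n, size=min(cache_size, n), replace=False)
        for _ in range(epochs):
            for i in rng.permutation(cache):
                g = y[i] * (X[i] @ w) - 1.0                   # gradient of the dual objective
                new_alpha = np.clip(alpha[i] - g / max(sq_norms[i], 1e-12), 0.0, C)
                w += (new_alpha - alpha[i]) * y[i] * X[i]     # incremental primal update
                alpha[i] = new_alpha
        return w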


european conference on machine learning | 2011

Frequency-aware truncated methods for sparse online learning

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

Online supervised learning with L1-regularization has gained attention recently because it generally requires less computational time and smaller space complexity than batch-type learning methods. However, a simple L1-regularization method used in an online setting has the side effect that rare features tend to be truncated more than necessary. In fact, feature frequency is highly skewed in many applications. We developed a new family of L1-regularization methods based on previous updates for loss minimization in linear online learning settings. Our methods can identify and retain rarely occurring but informative features at the same computational cost and convergence rate as previous work. Moreover, we combined our methods with a cumulative penalty model to derive more robust models on noisy data. We applied our methods to several datasets and empirically evaluated the performance of our algorithms. Experimental results showed that our frequency-aware truncated models improved the prediction accuracy.
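The idea of truncating rare features less aggressively can be illustrated with a small sketch in the style of truncated gradient for online L1-regularized learning. Scaling the truncation threshold by the observed feature frequency, as done below, is an assumption made for illustration and is not the exact rule from the paper.

    import numpy as np

    def frequency_aware_truncated_sgd(stream, dim, eta=0.1, lam=0.01):
        """Online logistic regression with per-feature L1 truncation.
        Features seen less often receive a smaller truncation threshold.
        Illustrative sketch only."""
        w = np.zeros(dim)
        counts = np.zeros(dim)                     # how often each feature has been active
        for t, (x, y) in enumerate(stream, start=1):   # y in {-1, +1}, x is a dense vector
            counts += (x != 0)
            p = 1.0 / (1.0 + np.exp(-y * (w @ x)))
            w += eta * (1.0 - p) * y * x           # gradient step on the logistic loss
            freq = counts / t                      # empirical frequency of each feature
            thresh = eta * lam * freq              # rarer features are truncated less
            w = np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)
        return w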


international conference on data mining | 2012

Healing Truncation Bias: Self-Weighted Truncation Framework for Dual Averaging

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

We propose a new truncation framework for online supervised learning. Learning a compact predictive model in an online setting has recently attracted a great deal of attention. The combination of online learning with sparsity-inducing regularization enables faster learning with a smaller memory space than a conventional learning framework. However, a simple combination of these triggers the truncation of weights whose corresponding features rarely appear, even if these features are crucial for prediction. Furthermore, it is difficult to emphasize these features in advance while preserving the advantages of online learning. We develop a truncation framework that extends Dual Averaging and retains rarely occurring but informative features. Our proposed framework integrates information on all previous subgradients of the loss functions into a regularization term. Our enhancement of conventional L1-regularization automatically adjusts each feature's truncation. This extension enables us to identify and retain rare but informative features without preprocessing. In addition, our framework achieves the same computational complexity and regret bound as standard Dual Averaging. Experiments demonstrated that our framework outperforms other sparse online learning algorithms.
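The mechanism can be sketched on top of standard Regularized Dual Averaging (RDA), where the uniform L1 weight is replaced by a per-feature weight driven by accumulated subgradients. The particular weighting below (accumulated absolute subgradient mass per feature) is an illustrative assumption, not the paper's exact rule.

    import numpy as np

    def self_weighted_rda(stream, dim, lam=0.01, gamma=1.0):
        """Regularized Dual Averaging with a per-feature L1 weight.
        Illustrative sketch; the per-feature weight is an assumption."""
        g_sum = np.zeros(dim)        # running sum of loss subgradients
        abs_sum = np.zeros(dim)      # running sum of |subgradient| per feature
        w = np.zeros(dim)
        for t, (x, y) in enumerate(stream, start=1):    # y in {-1, +1}
            margin = y * (w @ x)
            g = -y * x if margin < 1.0 else np.zeros(dim)   # hinge-loss subgradient
            g_sum += g
            abs_sum += np.abs(g)
            g_bar = g_sum / t
            lam_j = lam * abs_sum / t        # frequently updated features are penalized more
            shrink = np.maximum(np.abs(g_bar) - lam_j, 0.0)
            w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink   # closed-form RDA step
        return w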


Science in China Series F: Information Sciences | 2014

Feature-aware regularization for sparse online learning

Hidekazu Oiwa; Shin Matsushima; Hiroshi Nakagawa

Learning a compact predictive model in an online setting has recently gained a great deal of attention. The combination of online learning with sparsity-inducing regularization enables faster learning with a smaller memory space than previous learning frameworks. Many optimization methods and learning algorithms have been developed on the basis of online learning with L1-regularization. L1-regularization tends to truncate some types of parameters, such as those that rarely occur or have a small range of values, unless they are emphasized in advance. However, the inclusion of a pre-processing step would make it very difficult to preserve the advantages of online learning. We propose a new regularization framework for sparse online learning. We focus on regularization terms, and we enhance the state-of-the-art regularization approach by integrating information on all previous subgradients of the loss function into a regularization term. The resulting algorithms enable online learning to adjust the intensity of each feature’s truncation without pre-processing, eventually eliminating the bias of L1-regularization. We show theoretical properties of our framework, including its computational complexity and a regret upper bound. Experiments demonstrated that our algorithms outperformed previous methods in many classification tasks.


ieee international conference on data science and advanced analytics | 2016

Web Behavior Analysis Using Sparse Non-Negative Matrix Factorization

Akihiro Demachi; Shin Matsushima; Kenji Yamanishi

We are concerned with the issue of discovering behavioral patterns on the web. Given a large amount of web access logs, we are interested in how they are categorized and how they relate to activities in real life. In order to conduct that analysis, we develop a novel algorithm for sparse non-negative matrix factorization (SNMF), which can discover patterns of web behavior. Although a number of SNMF variants exist, our algorithm is novel in that it updates parameters in a multiplicative way with a performance guarantee, and thereby works more robustly than existing methods, even when the rank of the factorized matrices is large. We demonstrate the effectiveness of our algorithm using artificial data sets. We then apply our algorithm to large-scale web log data obtained from 70,000 monitors to discover meaningful relations between web behavioral patterns and real-life activities. We employ an information-theoretic measure to demonstrate that our algorithm extracts more significant relations between web behavior patterns and real-life activities than competing methods.
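A generic form of sparse NMF with multiplicative updates is sketched below: the factorization V ≈ WH with an L1 penalty on the coefficient matrix H. This is the standard Lee–Seung-style update with a sparsity term, shown only to make the setting concrete; it does not reproduce the paper's particular update rule or its guarantee.

    import numpy as np

    def sparse_nmf(V, rank, lam=0.1, iters=200, eps=1e-9, seed=0):
        """Non-negative matrix factorization V ~= W H with an L1 penalty on H,
        using standard multiplicative updates. Generic sketch only."""
        rng = np.random.default_rng(seed)
        n, m = V.shape
        W = rng.random((n, rank))
        H = rng.random((rank, m))
        for _ in range(iters):
            H *= (W.T @ V) / (W.T @ W @ H + lam + eps)   # sparsity-promoting update for H
            W *= (V @ H.T) / (W @ H @ H.T + eps)         # standard update for W
        return W, H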


european conference on machine learning | 2017

Distributed Stochastic Optimization of Regularized Risk via Saddle-Point Problem

Shin Matsushima; Hyokun Yun; Xinhua Zhang; S. V. N. Vishwanathan

Many machine learning algorithms minimize a regularized risk, and stochastic optimization is widely used for this task. When working with massive data, it is desirable to perform stochastic optimization in parallel. Unfortunately, many existing stochastic optimization algorithms cannot be parallelized efficiently. In this paper we show that one can rewrite the regularized risk minimization problem as an equivalent saddle-point problem, and propose an efficient distributed stochastic optimization (DSO) algorithm. We prove the algorithm’s rate of convergence; remarkably, our analysis shows that the algorithm scales almost linearly with the number of processors. We also verify with empirical evaluations that the proposed algorithm is competitive with other parallel, general-purpose stochastic and batch optimization algorithms for regularized risk minimization.
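For linear models, the reformulation mentioned in the abstract can be written using the Fenchel conjugate of the loss. The following generic identity illustrates where the saddle point comes from; it is a standard rewriting and is not claimed to be the paper's exact objective:

    \min_{w} \; \lambda \Omega(w) + \frac{1}{m} \sum_{i=1}^{m} \ell(\langle w, x_i \rangle, y_i)
      \;=\; \min_{w} \max_{\alpha} \; \lambda \Omega(w)
            + \frac{1}{m} \sum_{i=1}^{m} \bigl( \alpha_i \langle w, x_i \rangle - \ell^{*}(\alpha_i, y_i) \bigr),

where \ell^{*} denotes the convex conjugate of the loss in its first argument. Stochastic updates of the saddle-point objective touch only one (example, coordinate) pair at a time, which is what makes near-linear scaling across processors plausible.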


european conference on machine learning | 2016

Asynchronous Feature Extraction for Large-Scale Linear Predictors

Shin Matsushima

Learning from datasets with a massive number of possible features to obtain more accurate predictors is being intensively studied. In this paper, we aim to solve L1-regularized risk minimization problems effectively with respect to both time and space computational resources. This is accomplished by concentrating on the effective features among a large number of unnecessary ones. To achieve this, we propose a multithreaded scheme that simultaneously runs a process that extracts seemingly important features into main memory and a process that updates parameters over only those features. We verified our method through computational experiments, showing that the proposed scheme can handle terabyte-scale optimization problems on a single machine.
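One way to picture the two concurrent processes is sketched below: one thread pulls candidate feature columns into shared memory while another repeatedly applies proximal coordinate updates to an L1-regularized squared-loss model over whatever features are currently loaded. The names `feature_stream` and `async_feature_learning` are hypothetical placeholders, and the update rule is a generic choice, not the paper's algorithm.

    import threading
    import numpy as np

    def async_feature_learning(feature_stream, labels, n_rounds=100, lam=0.1, eta=0.1):
        """Hypothetical sketch: `feature_stream` yields (feature_id, column) pairs
        (assumed finite here); one thread loads them, the other optimizes."""
        loaded = {}                            # feature_id -> column (shared memory)
        weights = {}                           # feature_id -> weight
        lock = threading.Lock()
        y = np.asarray(labels, dtype=float)

        def extractor():
            for fid, col in feature_stream:    # generate/load features on demand
                with lock:
                    loaded[fid] = np.asarray(col, dtype=float)

        def optimizer():
            for _ in range(n_rounds):
                with lock:
                    items = list(loaded.items())
                pred = sum(weights.get(fid, 0.0) * col for fid, col in items) \
                       if items else np.zeros_like(y)
                for fid, col in items:         # one pass of proximal coordinate updates
                    grad = col @ (pred - y) / len(y)
                    w_old = weights.get(fid, 0.0)
                    w_new = w_old - eta * grad
                    w_new = np.sign(w_new) * max(abs(w_new) - eta * lam, 0.0)  # soft threshold
                    pred += (w_new - w_old) * col
                    weights[fid] = w_new

        t1 = threading.Thread(target=extractor)
        t2 = threading.Thread(target=optimizer)
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        return weights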


cross-language evaluation forum | 2010

ITC-UT: Tweet Categorization by Query Categorization for On-line Reputation Management.

Minoru Yoshida; Shin Matsushima; Shingo Ono; Issei Sato; Hiroshi Nakagawa


siam international conference on data mining | 2010

Exact Passive-Aggressive Algorithm for Multiclass Classification Using Support Class.

Shin Matsushima; Nobuyuki Shimizu; Kazuhiro Yoshida; Takashi Ninomiya; Hiroshi Nakagawa


ieee international conference on data science and advanced analytics | 2015

Traffic risk mining from heterogeneous road statistics

Koichi Moriya; Shin Matsushima; Kenji Yamanishi
