Kazuto Kubota
Toshiba
Publication
Featured research published by Kazuto Kubota.
IEEE International Conference on High Performance Computing Data and Analytics | 2000
Kazuto Kubota; Akihiko Nakase; Hiroshi Sakai; Shigeru Oyanagi
Data mining is a typical application of high-performance computing in the business field, and an efficient data mining system that can handle huge amounts of data is desired. This paper describes the parallel generation of decision trees, a typical algorithm for classifying large databases. C4.5, a freely available program, is parallelized for an SMP machine using a thread library. Parallelism in generating a decision tree can be classified into intra-node parallelism and inter-node parallelism. Intra-node parallelism can be further classified into record parallelism, attribute parallelism, and their combination. We implemented these four kinds of parallelization methods and evaluated their effects on four kinds of test data. The results show that the effect of each parallelization method is related to the characteristics of the data, and that combining multiple parallelization methods is the most effective approach.
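To make attribute parallelism concrete, the following is a minimal sketch in which each candidate attribute of one tree node is scored by a separate thread. It is not the paper's C4.5 implementation: the function names, the dictionary record layout, and the use of plain information gain (C4.5 itself uses gain ratio) are all assumptions made for illustration. In CPython, threads do not speed up CPU-bound work, so the sketch shows only the structure of the parallel split.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(records, labels, attr):
    """Information gain of splitting on one categorical attribute."""
    groups = {}
    for rec, label in zip(records, labels):
        groups.setdefault(rec[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute_parallel(records, labels, attrs, workers=4):
    """Attribute parallelism (hypothetical helper): one task per attribute."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        gains = list(pool.map(lambda a: information_gain(records, labels, a), attrs))
    best = max(range(len(attrs)), key=lambda i: gains[i])
    return attrs[best], gains[best]


# Toy data: attribute "a" separates the classes, attribute "b" does not.
records = [
    {"a": "x", "b": "p"},
    {"a": "x", "b": "q"},
    {"a": "y", "b": "p"},
    {"a": "y", "b": "q"},
]
labels = [0, 0, 1, 1]
best_attr, best_gain = best_attribute_parallel(records, labels, ["a", "b"])
```

Record parallelism would instead split `records` into chunks, count class frequencies per chunk in parallel, and merge the counts before computing the gain.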
Knowledge Discovery and Data Mining | 2002
Shigeru Oyanagi; Kazuto Kubota; Akihiko Nakase
Sequence pattern mining is one of the most important methods for mining WWW access logs. The Apriori algorithm is well known as a typical algorithm for sequence pattern mining. However, it suffers from inherent difficulties in finding long sequential patterns and in extracting interesting patterns from a huge number of results. This article proposes a new method for finding generalized sequence patterns by matrix clustering. The method decomposes a sequence into a set of sequence elements, each of which corresponds to an ordered pair of items. Matrix clustering is then applied to extract a cluster of similar sequences, and the resulting sequence elements are composed into a generalized sequence. The method is evaluated on a real WWW access log, and the results show that it is practically useful for finding long sequences and for presenting the generalized sequence as a graph.
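The decomposition step can be sketched as follows. This is an illustration only: it assumes that a sequence element is an ordered pair of consecutive items (the abstract does not say whether non-adjacent pairs are included), and the function names are hypothetical. The matrix clustering step itself is not shown; the sketch only builds the binary sequence-by-element matrix that such a step would take as input.

```python
def sequence_elements(sequence):
    """Decompose a sequence into its set of ordered pairs of consecutive items.

    Assumption: only adjacent items form a sequence element.
    """
    return {(a, b) for a, b in zip(sequence, sequence[1:])}


def build_matrix(sequences):
    """Build a binary matrix: one row per sequence, one column per element."""
    element_sets = [sequence_elements(s) for s in sequences]
    columns = sorted(set().union(*element_sets))
    matrix = [[1 if col in elems else 0 for col in columns] for elems in element_sets]
    return columns, matrix


# Toy access sequences (page identifiers are made up for illustration).
sequences = [["A", "B", "C"], ["A", "B", "D"]]
columns, matrix = build_matrix(sequences)
```

Rows with many shared 1-entries correspond to similar sequences, and the shared columns of such a cluster are the ordered pairs that could be chained into a generalized sequence and drawn as a graph.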
International Parallel and Distributed Processing Symposium | 2001
Kazuto Kubota; Akihiko Nakase; Shigeru Oyanagi
This paper proposes a parallel data-mining algorithm and its implementation on a PC cluster. The decision tree is a widely used data-mining algorithm for classifying records in a database. Simple parallelization of decision tree generation is inefficient because of the load imbalance caused by the shape of the generated tree. The SPRINT algorithm solves this problem by grouping the nodes at the same level of the tree and balancing the load; however, it requires frequent disk access when the data size exceeds the memory size. We propose an improved parallel version of SPRINT that incorporates dynamic scheduling. Dynamic scheduling is effective in reducing the amount of disk access needed to store intermediate results; however, it may cause an imbalance in the data distribution across PEs (Processing Elements). We solve this problem by incorporating data redistribution. The evaluation results show that, compared with SPRINT, our method achieves a speedup of 3.5 times in the best case and equal performance even in the worst case. We also discuss how performance might be enhanced further by improving communication performance.
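The load imbalance caused by the shape of the tree can be illustrated with a generic comparison of static and dynamic task assignment. This is not the scheduling scheme of the paper: it does not model disk access, SPRINT's level-wise grouping, or data redistribution. The task costs, function names, and the rule "give the next task to whichever PE becomes idle first" are all assumptions for illustration.

```python
import heapq


def static_makespan(task_costs, num_pes):
    """Assign tasks round-robin in advance; return the finishing time."""
    loads = [0] * num_pes
    for i, cost in enumerate(task_costs):
        loads[i % num_pes] += cost
    return max(loads)


def dynamic_makespan(task_costs, num_pes):
    """Give each task to the PE that becomes idle first; return the finishing time."""
    finish_times = [0] * num_pes  # a list of equal values is already a valid heap
    for cost in task_costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Hypothetical subtree costs for a skewed tree: two large subtrees, four small ones.
costs = [8, 1, 8, 1, 1, 1]
static_time = static_makespan(costs, 2)
dynamic_time = dynamic_makespan(costs, 2)
```

With these made-up costs the round-robin assignment places both large subtrees on the same PE, while the dynamic assignment spreads them out, which is the kind of imbalance the abstract attributes to simple parallelization.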
Archive | 2001
Shigeru Oyanagi; Hiroshi Sakai; Kazuto Kubota
Archive | 2001
Shigeru Oyanagi; Kazuto Kubota; Akihiko Nakase
Archive | 2013
Kyosuke Katayama; Kazuto Kubota; Takahisa Wada; Kiyotaka Matsue; Akihiro Suyama; Tomohiko Tanimoto; Hiroshi Taira
Archive | 2001
Shigeru Koyanagi; Kazuto Kubota; Akihiko Nakase
Archive | 2014
Kyosuke Katayama; Kazuto Kubota; Takahisa Wada; Kiyotaka Matsue; Akihiro Suyama; Tomohiko Tanimoto; Hiroshi Taira
Archive | 2010
Yoshiyuki Hondo; Shuichiro Imahara; Kazuto Kubota; Toshimitsu Kumazawa
Archive | 2009
Shuichiro Imahara; Kazuto Kubota; Mototaka Kanematsu; Akiko Matsukawa