Publications
Featured research published by Hirofumi Matsuzawa.
IBM Systems Journal | 2004
Hirofumi Matsuzawa; Tohru Nagano; Akiko Murakami; Hironori Takeuchi; Koichi Takeda
This paper describes the application of IBM TAKMI® for Biomedical Documents to facilitate knowledge discovery from the very large text databases characteristic of life science and healthcare applications. This set of tools, designated MedTAKMI, is an extension of the TAKMI (Text Analysis and Knowledge MIning) system originally developed for text mining in customer-relationship-management applications. MedTAKMI dynamically and interactively mines a collection of documents to obtain characteristic features within them. By using multifaceted mining of these documents together with biomedically motivated categories for term extraction and a series of drill-down queries, users can obtain knowledge about a specific topic after seeing only a few key documents. In addition, the use of natural language techniques makes it possible to extract deeper relationships among biomedical concepts. The MedTAKMI system is capable of mining the entire MEDLINE® database of 11 million biomedical journal abstracts. It is currently running at a customer site.
Knowledge Discovery and Data Mining | 2008
Shohei Hido; Tsuyoshi Idé; Hisashi Kashima; Harunobu Kubo; Hirofumi Matsuzawa
We propose a formulation of a new problem, which we call change analysis, and a novel method for solving the problem. In contrast to the existing methods of change (or outlier) detection, the goal of change analysis goes beyond detecting whether or not any changes exist. Its ultimate goal is to find the explanation of the changes. While change analysis falls in the category of unsupervised learning in nature, we propose a novel approach based on supervised learning to achieve the goal. The key idea is to use a supervised classifier for interpreting the changes. A classifier should be able to discriminate between the two data sets if they actually come from two different data sources. In other words, we use a hypothetical label to train the supervised learner, and exploit the learner for interpreting the change. Experimental results using real data show the proposed approach is promising in change analysis as well as concept drift analysis.
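The hypothetical-label idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the one-feature threshold classifier (a decision stump) and the name `best_stump` are assumptions chosen to keep the example dependency-free. Rows from the first data set get label 0, rows from the second get label 1, and the feature the stump selects suggests where the change lies.

```python
from typing import Sequence, Tuple

def best_stump(old: Sequence[Sequence[float]],
               new: Sequence[Sequence[float]]) -> Tuple[int, float, float]:
    """Return (feature index, threshold, accuracy) of the best one-feature split.

    Hypothetical helper: a stand-in for whatever supervised learner is used.
    """
    # Hypothetical labels: 0 for rows from `old`, 1 for rows from `new`.
    rows = [(r, 0) for r in old] + [(r, 1) for r in new]
    n_features = len(rows[0][0])
    best = (0, 0.0, 0.0)
    for j in range(n_features):
        values = sorted({r[j] for r, _ in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            correct = sum((r[j] > t) == bool(y) for r, y in rows)
            # The split may separate the sets in either direction.
            acc = max(correct, len(rows) - correct) / len(rows)
            if acc > best[2]:
                best = (j, t, acc)
    return best

old_data = [[1.0, 5.0], [2.0, 6.0], [1.5, 5.5]]
new_data = [[1.2, 9.0], [1.8, 10.0], [1.4, 9.5]]
feature, threshold, accuracy = best_stump(old_data, new_data)
```

An accuracy close to 0.5 would suggest the two sets are hard to tell apart on any single feature, while an accuracy close to 1.0 points at the returned feature and threshold as a candidate explanation of the change.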
Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2000
Hirofumi Matsuzawa; Takeshi Fukuda
We consider the data-mining problem of discovering structured association patterns from large databases. A structured association pattern is a set of sets of items that can represent a two level structure in some specified set of target data. Although the structure is very simple, it cannot be extracted by conventional pattern discovery algorithms. We present an algorithm that discovers all frequent structured association patterns. We were motivated to consider the problem by a specific text mining application, but our method is applicable to a broad range of data mining applications. Experiments with synthetic and real data show that our algorithm efficiently discovers structured association patterns in a large volume of data.
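To make the two-level structure concrete, here is a small sketch of counting the support of one structured pattern by brute force. The matching rule is an assumption, not taken from the abstract: a pattern is taken to match a transaction when each of its item sets is contained in a different group of the transaction. The names `matches` and `support` are hypothetical, and this does not reproduce the paper's discovery algorithm.

```python
from itertools import permutations
from typing import FrozenSet, List

Group = FrozenSet[str]

def matches(pattern: List[Group], transaction: List[Group]) -> bool:
    """True if every pattern group fits inside a distinct transaction group."""
    if len(pattern) > len(transaction):
        return False
    # Brute force: try every ordered choice of distinct transaction groups.
    return any(
        all(p <= g for p, g in zip(pattern, chosen))
        for chosen in permutations(transaction, len(pattern))
    )

def support(pattern: List[Group], transactions: List[List[Group]]) -> int:
    """Number of transactions that the structured pattern matches."""
    return sum(matches(pattern, t) for t in transactions)

transactions = [
    [frozenset({"install", "software"}), frozenset({"crash", "screen"})],
    [frozenset({"install", "software", "crash"})],
    [frozenset({"install", "driver"}), frozenset({"crash"})],
]
pattern = [frozenset({"install"}), frozenset({"crash"})]
structured_support = support(pattern, transactions)

# A flat itemset count ignores the grouping and also counts the second transaction.
flat_items = frozenset().union(*pattern)
flat_support = sum(flat_items <= frozenset().union(*t) for t in transactions)
```

Under this assumed rule, the second transaction holds both items in a single group, so the structured pattern does not match it even though a conventional itemset count would.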
Knowledge Discovery and Data Mining | 2009
Shohei Hido; Hirofumi Matsuzawa; Fumihiko Kitayama; Masayuki Numao
Hierarchical structures of components often appear in industry, such as the components of cars. We focus on association mining from the hierarchically assembled data items that are characterized with identity labels such as lot numbers. Massive and physically distributed product databases make it difficult to directly find the associations of deep-level items. We propose a top-down algorithm using virtual lot numbers to mine association rules from the hierarchical databases. Virtual lot numbers delegate the identity information of the subcomponents to upper-level lot numbers without modifications to the databases. Our pruning method reduces the number of enumerated items and avoids redundant access to the databases. Experiments show that the algorithm works an order of magnitude faster than a naive approach.
Extending Database Technology | 1998
Takeshi Fukuda; Hirofumi Matsuzawa
Decision support systems that include on-line analytical processing and data mining have recently attracted research attention. Such applications treat data in very large databases as multidimensional data cubes. Each cell of a data cube typically is some aggregation, such as total sales volume, that is of interest to analysts. Since it may be necessary to compute many cells, and the performance is critical, we propose parallel algorithms that compute multiple aggregate queries in data cubes on a shared-nothing multiprocessor with high-bandwidth communication facilities. We evaluate the algorithms on the basis of analytical modeling and an implementation on an IBM SP2 system.
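The general shape of answering several aggregate queries over partitioned data can be sketched as follows. This is an illustration only and not one of the paper's algorithms: the partitions stand in for the nodes of a shared-nothing machine but are processed sequentially in one process, and the names `local_aggregate` and `merge`, the column names, and the sample rows are all assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Query = Tuple[str, ...]  # the dimensions to group by; () means the grand total

def local_aggregate(rows: List[Row], queries: List[Query], measure: str):
    """Scan one partition once, updating the cells of every query."""
    cells = {q: defaultdict(float) for q in queries}
    for row in rows:
        for q in queries:
            key = tuple(row[d] for d in q)
            cells[q][key] += row[measure]
    return cells

def merge(partials, queries: List[Query]):
    """Combine per-partition cells into the final answer for each query."""
    total = {q: defaultdict(float) for q in queries}
    for part in partials:
        for q in queries:
            for key, value in part[q].items():
                total[q][key] += value
    return total

partitions = [
    [{"region": "east", "product": "a", "sales": 10.0},
     {"region": "west", "product": "a", "sales": 5.0}],
    [{"region": "east", "product": "b", "sales": 7.0}],
]
queries = [("region",), ("product",), ("region", "product"), ()]
partials = [local_aggregate(p, queries, "sales") for p in partitions]
cube = merge(partials, queries)
```

Each partition is read once for all queries, and only the much smaller cell tables need to be exchanged in the merge step, which is where communication bandwidth would matter on a real multiprocessor.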
Archive | 2000
Takeshi Fukuda; Hirofumi Matsuzawa
Archive | 1998
Hirofumi Matsuzawa; Takeshi Fukuda
IBM Systems Journal | 2004
Robert L. Mack; S. Mukherjea; A. Soffer; Eric W. Brown; Anni Coden; James W. Cooper; Akihiro Inokuchi; B. Iyer; Y. Mass; Hirofumi Matsuzawa; L. V. Subramaniam
Very Large Data Bases | 1998
Yasuhiko Morimoto; Takeshi Fukuda; Hirofumi Matsuzawa; Takeshi Tokuyama; Kunikazu Yoda
Archive | 2002
Akiko Murakami; Hirofumi Matsuzawa; Tetsuya Nasukawa