Takashi Washio | Researchain

Publication

Featured research published by Takashi Washio.

European Conference on Principles of Data Mining and Knowledge Discovery | 2000

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data

Akihiro Inokuchi; Takashi Washio; Hiroshi Motoda

This paper proposes a novel approach, named AGM, for efficiently mining association rules among the substructures that appear frequently in a given graph data set. A graph transaction is represented by an adjacency matrix, and the frequent patterns appearing in the matrices are mined through an extended basket-analysis algorithm. Its performance has been evaluated on artificial simulation data and on the carcinogenesis data of Oxford University and NTP. Its high efficiency has been confirmed on a problem of real-world size.
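
As a rough illustration of the representation described above, the following sketch stores each graph transaction as an adjacency matrix and counts, across transactions, how many graphs contain each labeled edge. It is not the AGM algorithm itself: it covers only single-edge patterns, and the names `adjacency_matrix`, `labeled_edges`, `frequent_edges`, and the toy data are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: a graph is (vertex labels, list of undirected edges).
Graph = Tuple[List[str], List[Tuple[int, int]]]


def adjacency_matrix(graph: Graph) -> List[List[int]]:
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    labels, edges = graph
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def labeled_edges(graph: Graph) -> Set[Tuple[str, str]]:
    """Read the distinct labeled edges off the upper triangle of the matrix."""
    labels, _ = graph
    matrix = adjacency_matrix(graph)
    found = set()
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if matrix[i][j]:
                found.add(tuple(sorted((labels[i], labels[j]))))
    return found


def frequent_edges(transactions: List[Graph], min_support: int) -> Dict[Tuple[str, str], int]:
    """Count in how many transactions each labeled edge occurs; keep the frequent ones."""
    counts: Dict[Tuple[str, str], int] = {}
    for graph in transactions:
        for edge in labeled_edges(graph):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c for edge, c in counts.items() if c >= min_support}


# Toy graph transactions (illustrative only, not data from the paper).
transactions = [
    (["C", "C", "O"], [(0, 1), (1, 2)]),
    (["C", "O", "N"], [(0, 1), (0, 2)]),
    (["C", "C", "O"], [(0, 1), (0, 2)]),
]
result = frequent_edges(transactions, min_support=2)
```

A full frequent-subgraph miner would grow these single-edge patterns level by level and handle the fact that one graph has many equivalent adjacency matrices; the sketch leaves both out.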

SIGKDD Explorations | 2003

State of the art of graph-based data mining

Takashi Washio; Hiroshi Motoda

The need for mining structured data has increased in the past few years. Graphs are among the best-studied data structures in computer science and discrete mathematics. It is therefore no surprise that graph-based data mining has become quite popular in the last few years. This article introduces the theoretical basis of graph-based data mining and surveys its state of the art. Brief descriptions of some representative approaches are provided as well.

Machine Learning | 2003

Complete Mining of Frequent Patterns from Graphs: Mining Graph Data

Akihiro Inokuchi; Takashi Washio; Hiroshi Motoda

Basket analysis, a standard method for data mining, derives frequent itemsets from a database. However, its mining ability is limited to transaction data consisting of items. In reality, there are many applications where data are described in a more structural way, e.g. chemical compounds and Web browsing history. In the field of machine learning, there are a few approaches that can discover characteristic patterns from graph-structured data. However, almost none of them is suitable for applications that require a complete search for all frequent subgraph patterns in the data. In this paper, we propose a novel principle and its algorithm that derive the characteristic patterns which frequently appear in graph-structured data. Our algorithm can derive all frequent induced subgraphs from both directed and undirected graph-structured data having loops (including self-loops) with labeled or unlabeled nodes and links. Its performance is evaluated through applications to Web browsing pattern analysis and chemical carcinogenesis analysis.
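
For readers unfamiliar with basket analysis, the sketch below shows the classic level-wise (Apriori-style) search for frequent itemsets on plain item transactions, which is the setting the abstract says its method generalizes. It is a minimal illustration, not the graph algorithm proposed in the paper; the function name `apriori` and the toy baskets are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(transactions: Iterable[Iterable[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least `min_support` transactions."""
    baskets: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    def support(itemset: FrozenSet[str]) -> int:
        # Number of baskets that contain the whole itemset.
        return sum(1 for basket in baskets if itemset <= basket)

    # Level 1: frequent single items.
    items = {item for basket in baskets for item in basket}
    level = {frozenset([item]) for item in items}
    level = {s for s in level if support(s) >= min_support}

    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


# Toy transactions (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = apriori(baskets, min_support=2)
```

With these four baskets and a minimum support of 2, all single items and all pairs are frequent, while the three-item set appears in only one basket and is rejected.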

Archive | 2007

New Frontiers in Artificial Intelligence

Takao Terano; Yukio Ohsawa; Toyoaki Nishida; Akira Namatame; Syusaku Tsumoto; Takashi Washio

Neg-Raising (NR) verbs form a class of verbs with a clausal complement that show the following behavior: when a negation syntactically attaches to the matrix predicate, it can semantically attach to the embedded predicate. This paper presents an account of NR predicates within Tree Adjoining Grammar (TAG). We propose a lexical semantic interpretation that heavily relies on a Montague-like semantics for TAG and on higher-order types.

Knowledge Discovery and Data Mining | 2004

Density-based spam detector

Kenichi Yoshida; Fuminori Adachi; Takashi Washio; Hiroshi Motoda; Teruaki Homma; Akihiro Nakashima; Hiromitsu Fujikawa; Katsuyuki Yamazaki

The volume of mass unsolicited electronic mail, often known as spam, has recently increased enormously and has become a serious threat not only to the Internet but also to society. This paper proposes a new spam detection method which uses document space density information. Although it requires extensive e-mail traffic to acquire the necessary information, an unsupervised learning engine with a short white list can achieve a 98% recall rate and 100% precision. A direct-mapped cache method contributes to the handling of over 13,000 e-mails per second. Experimental results, obtained using over 50 million actual e-mails of traffic, are also reported in this paper.
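
The abstract does not describe how its direct-mapped cache is built, so the following is only a generic sketch of the idea: each hashed message maps to exactly one slot, a repeat of the same key increments that slot's counter, and a different key landing on the slot simply overwrites it. The class name `DirectMappedCounter`, the use of SHA-1 over the exact message text, and the slot counts are all assumptions for illustration; the paper works with document similarity rather than exact matches.

```python
import hashlib
from typing import List, Optional, Tuple


class DirectMappedCounter:
    """Fixed-size table where each key can live in exactly one slot."""

    def __init__(self, num_slots: int) -> None:
        self.num_slots = num_slots
        # Each slot holds (key, count), or None while unused.
        self.slots: List[Optional[Tuple[str, int]]] = [None] * num_slots

    @staticmethod
    def _key(text: str) -> str:
        # Hypothetical fingerprint: an exact hash of the message text.
        return hashlib.sha1(text.encode("utf-8")).hexdigest()

    def observe(self, text: str) -> int:
        """Record one message and return how often its key has been seen in its slot."""
        key = self._key(text)
        index = int(key, 16) % self.num_slots
        entry = self.slots[index]
        if entry is not None and entry[0] == key:
            count = entry[1] + 1   # hit: same key already occupies the slot
        else:
            count = 1              # empty slot or collision: overwrite, no chaining
        self.slots[index] = (key, count)
        return count


# Repeated copies of one message raise its count.
counter = DirectMappedCounter(num_slots=1024)
repeats = [counter.observe("buy now") for _ in range(3)]

# With a single slot, every different message evicts the previous one.
tiny = DirectMappedCounter(num_slots=1)
evicted = [tiny.observe(message) for message in ("a", "b", "a")]
```

The appeal of this layout is that every lookup and update touches one slot and does constant work, at the cost of losing counts on collisions. A high count for a key would then be a signal that many near-identical messages are in circulation.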

IEEE Transactions on Knowledge and Data Engineering | 2008

DryadeParent, An Efficient and Robust Closed Attribute Tree Mining Algorithm

Alexandre Termier; Marie Christine Rousset; Michèle Sebag; Kouzou Ohara; Takashi Washio; Hiroshi Motoda

In this paper, we present a new tree mining algorithm, DryadeParent, based on the hooking principle first introduced in DRYADE. In the experiments, we demonstrate that the branching factor and depth of the frequent patterns to find are key factors of complexity for tree mining algorithms, even if often overlooked in previous work. We show that DryadeParent outperforms the current fastest algorithm, CMTreeMiner, by orders of magnitude on data sets where the frequent tree patterns have a high branching factor.

Web Intelligence | 2001

Automatic Web-Page Classification by Using Machine Learning Methods

Makoto Tsukada; Takashi Washio; Hiroshi Motoda

This paper describes automatic Web-page classification using machine learning methods. Portal site services on the World Wide Web, including their search engine functions, have recently grown in importance. In particular, portal sites such as the Yahoo! service, which hierarchically classify Web pages into many categories, are becoming popular. However, assigning Web pages to categories relies on manual work, which costs much time and care. To alleviate this problem, we propose techniques to generate attributes by using co-occurrence analysis and to classify Web pages automatically based on machine learning. We apply these techniques to Web pages on Yahoo! JAPAN and construct decision trees, which determine the appropriate category for each Web page. The performance of the proposed method is evaluated in terms of error rate, recall, and precision. The experimental evaluation demonstrates that this method provides acceptable accuracy in classifying Web pages into the top-level categories on Yahoo! JAPAN.

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2000

Extension of Graph-Based Induction for General Graph Structured Data

Takashi Matsuda; Tadashi Horiuchi; Hiroshi Motoda; Takashi Washio

A machine learning technique called Graph-Based Induction (GBI) efficiently extracts typical patterns from directed graph data by stepwise pair expansion (pairwise chunking). In this paper, we expand the capability of Graph-Based Induction to handle not only tree-structured data but also nodes with multiple inputs and outputs and loop structures (including self-loops), which cannot be treated in the conventional way. The method is verified to work as expected using artificially generated data, and the computation time of the implemented program is evaluated experimentally. We further show the effectiveness of our approach by applying it to two kinds of real-world data: World Wide Web browsing data and DNA sequence data.

Cardiovascular Drugs and Therapy | 2004

A Novel Data Mining Approach to the Identification of Effective Drugs or Combinations for Targeted Endpoints—Application to Chronic Heart Failure as a New Form of Evidence-based Medicine

Jiyoong Kim; Takashi Washio; Masakazu Yamagishi; Yoshio Yasumura; Satoshi Nakatani; Kazuhiko Hashimura; Akihisa Hanatani; Kazuo Komamura; Kunio Miyatake; Soichiro Kitamura; Hitonobu Tomoike; Masafumi Kitakaze

Summary. Background: Data mining is a technique for discovering useful information hidden in a database, which has recently been used by the chemical, financial, pharmaceutical, and insurance industries. It may enable us to detect interesting and hidden data on useful drugs, especially in the field of cardiovascular disease. Methods and Results: We evaluated the current treatments for chronic heart failure (CHF) in our institute using a decision tree method of data mining and compared the results with those of large-scale clinical trials. We enrolled 1,100 patients with CHF (NYHA classes II–IV and EF < 40%) who were hospitalized at the National Cardiovascular Center during the past 31 months. Drugs prescribed at discharge were extracted from the clinical database. Both echocardiograms and plasma BNP level at 6–12 months after discharge were determined prospectively. It was found that beta-blockers, angiotensin converting enzyme inhibitors, and angiotensin II receptor antagonists independently improve both the plasma BNP level and %fractional shortening (FS), while oral inotropic agents increased the plasma BNP level and decreased %FS. These findings agree with evidence accumulated from several large-scale trials. Interestingly, statins, histamine receptor blockers, and alpha-glucosidase inhibitors also attenuated the severity of CHF, suggesting the possibility of new treatment of CHF. Conclusion: Clinical data mining using Japanese CHF patients yielded almost identical data to the results of large-scale trials, and also suggested novel and unexpected candidates for CHF therapy. Further validation of the data mining approach in the cardiovascular field is warranted.

Neural Networks | 2012

Separation of stationary and non-stationary sources with a generalized eigenvalue problem

Satoshi Hara; Yoshinobu Kawahara; Takashi Washio; Paul von Bünau; Terumasa Tokunaga; K. Yumoto

Non-stationary effects are ubiquitous in real-world data. In many settings, the observed signals are a mixture of underlying stationary and non-stationary sources that cannot be measured directly. For example, in EEG analysis, electrodes on the scalp record the activity from several sources located inside the brain, which one could only measure invasively. Discerning stationary and non-stationary contributions is an important step towards uncovering the mechanisms of the data generating system. To that end, in Stationary Subspace Analysis (SSA), the observed signal is modeled as a linear superposition of stationary and non-stationary sources, where the aim is to separate the two groups in the mixture. In this paper, we propose the first SSA algorithm that has a closed form solution. The novel method, Analytic SSA (ASSA), is more than 100 times faster than the state-of-the-art, numerically stable, and guaranteed to be optimal when the covariance between stationary and non-stationary sources is time-constant. In numerical simulations on a wide range of settings, we show that our method yields superior results, even for signals with time-varying group-wise covariance. In an application to geophysical data analysis, ASSA extracts meaningful components that shed new light on the Pi 2 pulsations of the geomagnetic field.
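
The linear superposition described above can be written out as follows. The symbols are assumptions chosen for this note, not notation taken from the abstract: x(t) is the observed signal, s^s(t) and s^n(t) are the stationary and non-stationary sources, and A is an unknown invertible mixing matrix. The second line only recalls the generic form of a generalized eigenvalue problem for a pair of symmetric matrices (M, N); which matrices ASSA actually uses is not stated in the abstract and is left open here.

```latex
% Mixing model (assumed notation): observed signal as a linear
% superposition of stationary and non-stationary sources.
x(t) = A \begin{pmatrix} s^{\mathrm{s}}(t) \\ s^{\mathrm{n}}(t) \end{pmatrix}
     = A^{\mathrm{s}} s^{\mathrm{s}}(t) + A^{\mathrm{n}} s^{\mathrm{n}}(t)

% Generic generalized eigenvalue problem for a matrix pair (M, N);
% M and N are placeholders, not the matrices defined in the paper.
M v = \lambda N v
```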
