Takayoshi Shoudai
Kyushu University
Publication
Featured research published by Takayoshi Shoudai.
knowledge discovery and data mining | 2002
Tetsuhiro Miyahara; Yusuke Suzuki; Takayoshi Shoudai; Tomoyuki Uchida; Kenichi Takahashi; Hiroaki Ueda
Many Web documents such as HTML files and XML files have no rigid structure and are called semistructured data. In general, such semistructured Web documents are represented by rooted trees with ordered children. We propose a new method for discovering frequent tree structured patterns in semistructured Web documents by using a tag tree pattern as a hypothesis. A tag tree pattern is an edge labeled tree with ordered children which has structured variables. An edge label is a tag or a keyword in such Web documents, and a variable can be substituted by an arbitrary tree. A tag tree pattern is therefore suited for representing tree structured patterns in such Web documents. First we show that it is hard to compute the optimum frequent tag tree pattern. We then present an algorithm for generating all maximally frequent tag tree patterns and prove its correctness. Finally, we report some experimental results on our algorithm. Although this algorithm is not efficient, experiments show that we can extract characteristic tree structured patterns in those data.
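As a rough illustration of the tag tree pattern idea in the abstract above, the following Python sketch matches a pattern against ordered, edge-labeled trees and measures how often it occurs in a document set. It is not the authors' algorithm. It assumes a simplified variable: a leaf edge (marked here by `label=None`) that matches exactly one edge with any label, together with the whole subtree below it. The names `Edge`, `matches`, and `frequency` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edge:
    """One labeled edge of an ordered tree plus the subtree below it.

    label=None marks a variable (an assumption of this sketch,
    not notation taken from the paper).
    """
    label: Optional[str]
    children: List["Edge"] = field(default_factory=list)


def matches(pattern: List[Edge], tree: List[Edge]) -> bool:
    """Compare two ordered lists of sibling edges, left to right."""
    if len(pattern) != len(tree):
        return False
    for p, t in zip(pattern, tree):
        if p.label is None:
            # Simplified variable: accept this edge and everything below it.
            continue
        if p.label != t.label or not matches(p.children, t.children):
            return False
    return True


def frequency(pattern: List[Edge], documents: List[List[Edge]]) -> float:
    """Fraction of documents that the pattern matches."""
    if not documents:
        return 0.0
    return sum(matches(pattern, d) for d in documents) / len(documents)


# Pattern: html -> (head -> x, body -> y), where x and y are variables.
pattern = [Edge("html", [Edge("head", [Edge(None)]),
                         Edge("body", [Edge(None)])])]

doc_a = [Edge("html", [Edge("head", [Edge("title")]),
                       Edge("body", [Edge("p", [Edge("b")])])])]
# Same edges as doc_a but with head and body swapped; order matters.
doc_b = [Edge("html", [Edge("body", [Edge("p")]),
                       Edge("head", [Edge("title")])])]

print(matches(pattern, doc_a))             # True
print(matches(pattern, doc_b))             # False
print(frequency(pattern, [doc_a, doc_b]))  # 0.5
```

Under these assumptions, a pattern would count as frequent when `frequency` reaches a chosen threshold. The variables in the paper are more general than the leaf-only variables used here.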
pacific asia conference on knowledge discovery and data mining | 2001
Tetsuhiro Miyahara; Takayoshi Shoudai; Tomoyuki Uchida; Kenichi Takahashi; Hiroaki Ueda
Many documents such as Web documents or XML files have no rigid structure. Such semistructured documents have been rapidly increasing. We propose a new method for discovering frequent tree structured patterns in semistructured Web documents. We consider the data mining problem of finding all maximally frequent tag tree patterns in semistructured data such as Web documents. A tag tree pattern is an edge labeled tree which has hyperedges as variables. An edge label is a tag or a keyword in Web documents, and a variable can be substituted by any tree. A tag tree pattern is therefore suited for representing tree structured patterns in semistructured Web documents. We present an algorithm for finding all maximally frequent tag tree patterns. We also report some experimental results on XML documents obtained with our algorithm.
algorithmic learning theory | 2002
Yusuke Suzuki; Takayoshi Shoudai; Tomoyuki Uchida; Tetsuhiro Miyahara
In the fields of data mining and knowledge discovery, many semistructured data such as HTML/XML files are represented by rooted trees t such that all children of each internal vertex of t are ordered and t has edge labels. In order to represent structural features common to such semistructured data, we propose a linear ordered term tree, which is a rooted tree pattern consisting of ordered tree structures and internal structured variables with distinct variable labels. For a set of edge labels Λ, let OTTΛ be the set of all linear ordered term trees. For a linear ordered term tree t in OTTΛ, the term tree language of t, denoted by LΛ(t), is the set of all ordered trees obtained from t by substituting arbitrary ordered trees for all variables in t. Given a set of ordered trees S, the minimal language problem for OTTLΛ = {LΛ(t) | t ∈ OTTΛ} is to find a linear ordered term tree t in OTTΛ such that LΛ(t) is minimal among all term tree languages which contain all ordered trees in S. We show that the class OTTLΛ is polynomial time inductively inferable from positive data, by giving a polynomial time algorithm for solving the minimal language problem for OTTLΛ.
conference on learning theory | 2002
Yusuke Suzuki; Ryuta Akanuma; Takayoshi Shoudai; Tetsuhiro Miyahara; Tomoyuki Uchida
Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. As a representation of a tree structured pattern in such tree structured data, we propose an ordered tree pattern, called a term tree, which is a rooted tree pattern consisting of ordered children and internal structured variables. A term tree is a generalization of standard tree patterns representing first order terms in formal logic. For a set of edge labels Λ and a term tree t, the term tree language of t, denoted by LΛ(t), is the set of all labeled trees which are obtained from a term tree t by substituting arbitrary labeled trees for all variables in t. In this paper, we propose polynomial time algorithms for the following two problems for two fundamental classes of term trees. The membership problem is, given a term tree t and a tree T, to decide whether or not LΛ(t) includes T. The minimal language problem is, given a set of labeled trees S, to find a term tree t such that LΛ(t) is minimal among all term tree languages which contain all trees in S. Then, by using these two algorithms, we show that the two classes of term trees are polynomial time inductively inferable from positive data.
Machine Learning | 2009
Hitoshi Yamasaki; Yosuke Sasaki; Takayoshi Shoudai; Tomoyuki Uchida; Yusuke Suzuki
Recently, due to the rapid growth of electronic data having graph structures such as HTML and XML texts and chemical compounds, many researchers have been interested in data mining and machine learning techniques for finding useful patterns from graph-structured data (graph data). Since graph data contain a huge number of substructures and it tends to be computationally expensive to decide whether or not such data have given structural features, graph mining problems face computational difficulties. Let $\mathcal{C}$ be a graph class which satisfies a connected hereditary property and contains infinitely many different biconnected graphs, and for which a special kind of the graph isomorphism problem can be computed in polynomial time. In this paper, we consider learning and mining problems for $\mathcal{C}$. Firstly, we define a new graph pattern, which is called a block preserving graph pattern (bp-graph pattern) for $\mathcal{C}$.
fundamentals of computation theory | 2001
Takayoshi Shoudai; Tomoyuki Uchida; Tetsuhiro Miyahara
inductive logic programming | 2003
Kazunori Yamagata; Tomoyuki Uchida; Takayoshi Shoudai; Yasuaki Nakamura
inductive logic programming | 2007
Yosuke Sasaki; Hitoshi Yamasaki; Takayoshi Shoudai; Tomoyuki Uchida
pacific-asia conference on knowledge discovery and data mining | 2004
Tetsuhiro Miyahara; Yusuke Suzuki; Takayoshi Shoudai; Tomoyuki Uchida; Kenichi Takahashi; Hiroaki Ueda
inductive logic programming | 2002
Yusuke Suzuki; Kohtaro Inomae; Takayoshi Shoudai; Tetsuhiro Miyahara; Tomoyuki Uchida