Daiji Fukagawa | Researchain

Publication

Featured research published by Daiji Fukagawa.

Discrete Applied Mathematics | 2012

Inferring a graph from path frequency

Tatsuya Akutsu; Daiji Fukagawa; Jesper Jansson; Kunihiko Sadakane

This paper considers the problem of inferring a graph from the number of occurrences of vertex-labeled paths, which is closely related to the pre-image problem for graphs: to reconstruct a graph from its feature space representation. It is shown that both exact and approximate versions of the problem can be solved in polynomial time in the size of an output graph by using dynamic programming algorithms if the graphs are trees whose maximum degree is bounded by a constant and the lengths of given paths and alphabet size are bounded by constants. On the other hand, it is shown that this problem is strongly NP-hard even for trees of bounded degree if the maximum length of paths is not bounded. The problem of inferring a string from the number of occurrences of fixed size substrings is also studied.
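As a rough illustration of the feature vector this abstract refers to, the sketch below counts how often each label sequence occurs along simple paths of a small vertex-labeled tree. The function name `path_frequencies`, the adjacency-list encoding, and the convention of counting each path once per direction are assumptions made for this example, not details taken from the paper. It covers only the forward direction (graph to frequencies), not the inference problem itself.

```python
from collections import Counter

def path_frequencies(adj, labels, max_len):
    """Count label sequences of simple paths with at most max_len vertices.

    Assumption for this sketch: a path and its reverse are counted separately.
    """
    counts = Counter()

    def extend(path):
        # Record the label sequence of the current path.
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) == max_len:
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:  # keep the path simple
                extend(path + [nxt])

    for start in adj:
        extend([start])
    return counts

# Hypothetical input: a three-vertex tree labeled C - O - C.
adj = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "C", 1: "O", 2: "C"}
freq = path_frequencies(adj, labels, max_len=3)
```

Here `freq[("C", "O")]` is 2, because the sequence is read once starting from each end of the tree.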

BMC Bioinformatics | 2011

A clique-based method for the edit distance between unordered trees and its application to analysis of glycan structures.

Daiji Fukagawa; Takeyuki Tamura; Atsuhiro Takasu; Etsuji Tomita; Tatsuya Akutsu

Background: Measuring similarities between tree-structured data is important for the analysis of RNA secondary structures, phylogenetic trees, glycan structures, and vascular trees. The edit distance is one of the most widely used measures for comparing tree-structured data. However, computing the edit distance for rooted unordered trees is known to be NP-hard. Furthermore, there is almost no available software tool that can compute the exact edit distance for unordered trees.

Results: In this paper, we present a practical method for computing the edit distance between rooted unordered trees. In this method, the edit distance problem for unordered trees is transformed into the maximum clique problem, and then efficient solvers for the maximum clique problem are applied. We applied the proposed method to similar structure search for glycan structures. The result suggests that our proposed method can efficiently compute the edit distance for moderate-size unordered trees. It also suggests that the proposed method has accuracy comparable to that of the edit distance for ordered trees and of an existing method for glycan search.

Conclusions: The proposed method is simple but useful for computing the edit distance between unordered trees. The object code is available upon request.

Journal of Computational Biology | 2012

A Clique-Based Method Using Dynamic Programming for Computing Edit Distance Between Unordered Trees

Tomoya Mori; Takeyuki Tamura; Daiji Fukagawa; Atsuhiro Takasu; Etsuji Tomita; Tatsuya Akutsu

Many kinds of tree-structured data, such as RNA secondary structures, have become available due to the progress of techniques in the field of molecular biology. To analyze the tree-structured data, various measures for computing the similarity between them have been developed and applied. Among them, tree edit distance is one of the most widely used measures. However, the tree edit distance problem for unordered trees is NP-hard. Therefore, it is required to develop efficient algorithms for the problem. Recently, a practical method called clique-based algorithm has been proposed, but it is not fast for large trees. This article presents an improved clique-based method for the tree edit distance problem for unordered trees. The improved method is obtained by introducing a dynamic programming scheme and heuristic techniques to the previous clique-based method. To evaluate the efficiency of the improved method, we applied the method to comparison of real tree structured data such as glycan structures. For large tree-structures, the improved method is much faster than the previous method. In particular, for hard instances, the improved method achieved more than 100 times speed-up.

Algorithmica | 2010

Approximating Tree Edit Distance through String Edit Distance

Tatsuya Akutsu; Daiji Fukagawa; Atsuhiro Takasu

We present an algorithm to approximate the edit distance between two ordered and rooted trees of bounded degree. In this algorithm, each input tree is transformed into a string by computing the Euler string, where the labels of some edges in the input trees are modified so that the structures of small subtrees are reflected in the labels. We show that the edit distance between trees is at least 1/6 and at most O(n^(3/4)) of the edit distance between the transformed strings, where n is the maximum size of the two input trees and we assume unit-cost edit operations for both trees and strings. The algorithm works in O(n^2) time, since computing the edit distance and reconstructing the tree mapping from the string alignment take O(n^2) time, though the transformation itself can be done in O(n) time.
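The two basic ingredients named in this abstract, an Euler string and a quadratic-time string edit distance, can be sketched as follows. This is a simplified illustration under assumptions: trees are encoded as hypothetical `(label, children)` tuples, each edge carries the label of its child node, and the label-modification step that the paper relies on for its approximation bounds is left out. The output should therefore not be read as carrying the stated guarantees.

```python
def euler_string(tree):
    """Euler string of an ordered rooted tree given as (label, children).

    Assumption for this sketch: an edge is labeled by its child node, and the
    return traversal of an edge is marked with a trailing apostrophe.
    """
    out = []

    def visit(node):
        for child in node[1]:
            out.append(child[0])         # descend along the edge
            visit(child)
            out.append(child[0] + "'")   # come back up along the same edge

    visit(tree)
    return out

def string_edit_distance(s, t):
    """Unit-cost edit distance by dynamic programming, O(len(s) * len(t)) time."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1,              # delete a
                           cur[j - 1] + 1,           # insert b
                           prev[j - 1] + (a != b)))  # match or substitute
        prev = cur
    return prev[-1]

# Hypothetical trees: r with children a and b, versus the chain r - a - b.
t1 = ("r", [("a", []), ("b", [])])
t2 = ("r", [("a", [("b", [])])])
s1 = euler_string(t1)
s2 = euler_string(t2)
d = string_edit_distance(s1, s2)
```

For these two trees the Euler strings are `a a' b b'` and `a b b' a'`, and their edit distance is 2.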

Information Processing Letters | 2005

Performance analysis of a greedy algorithm for inferring boolean functions

Daiji Fukagawa; Tatsuya Akutsu

We analyzed average case performance of a known greedy algorithm for inference of a Boolean function from positive and negative examples, and gave a proof to an experimental conjecture that the greedy algorithm works optimally with high probability if both input data and the underlying function are generated uniformly at random.
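The abstract does not spell out the greedy algorithm, so the sketch below is an assumption: it uses a set-cover style heuristic that repeatedly picks the input variable distinguishing the most still-unseparated (positive, negative) example pairs. The function name and the data are hypothetical, and the code is meant only to show what a greedy inference step of this kind can look like.

```python
from itertools import product

def greedy_relevant_variables(examples):
    """Greedily choose variables that separate positive from negative examples.

    examples: list of (input_tuple, output_bit). Assumed set-cover heuristic,
    not necessarily the exact algorithm analyzed in the paper.
    """
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    # Every (positive, negative) pair must differ on some chosen variable.
    uncovered = {(p, q) for p in range(len(pos)) for q in range(len(neg))}
    n = len(examples[0][0])
    chosen = []
    while uncovered:
        # Pick the variable that separates the most remaining pairs.
        best = max(range(n),
                   key=lambda i: sum(pos[p][i] != neg[q][i] for p, q in uncovered))
        covered = {(p, q) for p, q in uncovered if pos[p][best] != neg[q][best]}
        if not covered:
            raise ValueError("examples are inconsistent")
        chosen.append(best)
        uncovered -= covered
    return chosen

# Hypothetical data: all 8 inputs of the made-up target f(x) = x0 AND x2.
examples = [(x, x[0] & x[2]) for x in product((0, 1), repeat=3)]
chosen = greedy_relevant_variables(examples)
```

On this input the heuristic selects variables 0 and 2, the two variables the target function depends on.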

International Conference on Data Mining | 2007

Statistical Learning Algorithm for Tree Similarity

Atsuhiro Takasu; Daiji Fukagawa; Tatsuya Akutsu

Tree edit distance is one of the most frequently used distance measures for comparing trees. When using the tree edit distance, we need to determine the cost of each operation, but this is a labor-intensive and highly skilled task. This paper proposes an algorithm for learning the costs of tree edit operations from training data consisting of pairs of similar trees. To formalize the cost learning problem, we define a probabilistic model for tree alignment that is a variant of tree edit distance. Then, the parameters of the model are estimated using the expectation maximization (EM) technique. In this paper, we develop an algorithm for parameter learning that is polynomial in time (O(m n^2 d^6)) and space (O(n^2 d^4)), where n, d, and m represent the size of the trees, the maximum degree of the trees, and the number of training pairs of trees, respectively.

Conference on Information and Knowledge Management | 2009

Dynamic hyperparameter optimization for bayesian topical trend analysis

Tomonari Masada; Daiji Fukagawa; Atsuhiro Takasu; Tsuyoshi Hamada; Yuichiro Shibata; Kiyoshi Oguri

This paper presents a new Bayesian topical trend analysis. We regard the parameters of the topic Dirichlet priors in latent Dirichlet allocation as a function of document timestamps and optimize the parameters with a gradient-based algorithm. Since our method gives similar hyperparameters to documents with similar timestamps, topic assignment in collapsed Gibbs sampling is affected by timestamp similarities. We compute TF-IDF-based document similarities using the result of collapsed Gibbs sampling and evaluate our proposal on the link detection task of Topic Detection and Tracking.

International Symposium on Algorithms and Computation | 2006

Approximating tree edit distance through string edit distance

Tatsuya Akutsu; Daiji Fukagawa; Atsuhiro Takasu

This paper presents an O(n^2) time algorithm for approximating the unit-cost edit distance for ordered and rooted trees of bounded degree within a factor of O(n^(3/4)), where n is the maximum size of the two input trees. The algorithm is based on the transformation of an ordered and rooted tree into a string.

Information Processing Letters | 2008

Improved approximation of the largest common subtree of two unordered trees of bounded height

Tatsuya Akutsu; Daiji Fukagawa; Atsuhiro Takasu

We present a polynomial time 1.5h-approximation algorithm for the problem of finding the largest common subtree between two rooted, labeled, and unordered trees of height at most h, where a tree S is called a subtree of a tree T if S is obtained from T by deletion of some nodes in T. This result improves the previous 2h-approximation algorithm.

Asia-Pacific Bioinformatics Conference | 2007

Inferring a chemical structure from a feature vector based on frequency of labeled paths and small fragments

Tatsuya Akutsu; Daiji Fukagawa

This paper proposes algorithms for inferring a chemical structure from a feature vector based on the frequency of labeled paths and small fragments, an inference problem with a potential application to drug design. In this paper, chemical structures are modeled as trees or tree-like structures. It is shown that the inference problems for these kinds of structures can be solved in polynomial time using dynamic programming-based algorithms. Since these algorithms are not practical, a branch-and-bound type algorithm is also proposed. The results of computational experiments suggest that the algorithm can solve the inference problem in a few seconds to a few tens of seconds for moderate-size chemical compounds.
