Tatsuya Akutsu | Researchain

Publication

Featured research published by Tatsuya Akutsu.

Pacific Symposium on Biocomputing | 1998

Identification of genetic networks from a small number of gene expression patterns under the Boolean network model.

Tatsuya Akutsu; Satoru Miyano

Liang, Fuhrman and Somogyi (PSB98, 18-29, 1998) have described an algorithm for inferring genetic network architectures from state transition tables which correspond to time series of gene expression patterns, using the Boolean network model. Their computational experiments suggested that a small number of state transition (INPUT/OUTPUT) pairs is sufficient to infer the original Boolean network correctly. This paper gives a mathematical proof for their observation. Precisely, this paper devises a much simpler algorithm for the same problem and proves that, if the indegree of each node (i.e., the number of input nodes to each node) is bounded by a constant, only O(log n) state transition pairs (from 2^n pairs) are necessary and sufficient to identify the original Boolean network of n nodes correctly with high probability. We conducted computational experiments in order to expose the constant factor involved in the O(log n) notation. The computational results show that a Boolean network of size 100,000 can be identified by our algorithm from about 100 INPUT/OUTPUT pairs if the maximum indegree is bounded by 2. A further merit of our algorithm is that it is conceptually so simple that it is extensible to more realistic network models.
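The consistency check behind this kind of identification can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `identify_boolean_network`, the 3-node example network, and the choice to record the observed truth-table entries (instead of enumerating every Boolean function of the chosen inputs) are all introduced here for illustration.

```python
from itertools import combinations, product


def identify_boolean_network(pairs, max_indegree=2):
    """Return, for each node, every input set consistent with all pairs.

    pairs: list of (input_state, output_state) tuples of 0/1 values.
    Result: dict mapping node index -> list of (input_indices, table),
    where table maps observed input patterns to the required output bit.
    """
    n = len(pairs[0][0])
    candidates = {}
    for node in range(n):
        found = []
        for inputs in combinations(range(n), max_indegree):
            table = {}
            consistent = True
            for state, nxt in pairs:
                key = tuple(state[i] for i in inputs)
                # The same input pattern must always produce the same output.
                if table.setdefault(key, nxt[node]) != nxt[node]:
                    consistent = False
                    break
            if consistent:
                found.append((inputs, table))
        candidates[node] = found
    return candidates


# Hypothetical 3-node network, used only to exercise the sketch.
def step(state):
    x0, x1, x2 = state
    return (x1 & x2, x0 | x2, 1 - x0)


# All 2^n pairs for n = 3; the abstract's point is that far fewer suffice.
pairs = [(s, step(s)) for s in product((0, 1), repeat=3)]
result = identify_boolean_network(pairs)
```

In this toy run, nodes 0 and 1 each keep exactly one consistent input set. Node 2 depends only on node 0, so both input sets that contain node 0 remain consistent and describe the same function.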

Discrete Applied Mathematics | 2000

Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots

Tatsuya Akutsu

This paper shows simple dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. For a basic version of the problem (i.e., maximizing the number of base pairs), this paper presents an O(n^4) time exact algorithm and an O(n^{4-δ}) time approximation algorithm. The latter outputs, for most RNA sequences, a secondary structure in which the number of base pairs is at least 1-ε of the optimal, where ε and δ are any constants satisfying 0 < ε, δ < 1. Several related results are shown too. © 2000 Elsevier Science B.V. All rights reserved.
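For orientation, the base-pair maximization objective can be illustrated with the classical pseudoknot-free dynamic program (a Nussinov-style recurrence). This is explicitly not the pseudoknot algorithm of the paper, which the abstract does not spell out; it is a simpler O(n^3) baseline, and the function name `max_base_pairs` and the `min_loop` parameter are assumptions of this sketch.

```python
def max_base_pairs(seq, min_loop=0):
    """Maximum number of nested (pseudoknot-free) Watson-Crick base pairs.

    min_loop is the minimum number of unpaired bases required between
    the two partners of a pair (0 means adjacent bases may pair).
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best score for the subsequence seq[i..j], inclusive.
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:  # case: k pairs with j
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    best = max(best, left + inner + 1)
            dp[i][j] = best
    return dp[0][n - 1]
```

Because every pair found by this recurrence is nested or side by side, crossing pairs (pseudoknots) are never produced; handling them is what raises the exact algorithm's cost to O(n^4) in the paper.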

Journal of Computational Biology | 2000

Algorithms for identifying Boolean networks and related biological networks based on matrix multiplication and fingerprint function.

Tatsuya Akutsu; Satoru Miyano

Due to the recent progress of the DNA microarray technology, a large number of gene expression profile data are being produced. How to analyze gene expression data is an important topic in computational molecular biology. Several studies have been done using the Boolean network as a model of a genetic network. This paper proposes efficient algorithms for identifying Boolean networks of bounded indegree and related biological networks, where identification of a Boolean network can be formalized as a problem of identifying many Boolean functions simultaneously. For the identification of a Boolean network, an O(mn^{D+1}) time naive algorithm and a simple O(mn^D) time algorithm are known, where n denotes the number of nodes, m denotes the number of examples, and D denotes the maximum indegree. This paper presents an improved O(m^{ω-2} n^D + m n^{D+ω-3}) time Monte-Carlo type randomized algorithm, where ω is the exponent of matrix multiplication (currently, ω < 2.376). The algorithm is obtained by combining fast matrix multiplication with the randomized fingerprint function for string matching. Although the algorithm and its analysis are simple, the result is nontrivial and the technique can be applied to several related problems.

Journal of Chemical Information and Modeling | 2005

Graph kernels for molecular structure-activity relationship analysis with support vector machines.

Pierre Mahé; Nobuhisa Ueda; Tatsuya Akutsu; Jean-Luc Perret; Jean-Philippe Vert

The support vector machine algorithm together with graph kernel functions has recently been introduced to model structure-activity relationships (SAR) of molecules from their 2D structure, without the need for explicit molecular descriptor computation. We propose two extensions to this approach with the double goal to reduce the computational burden associated with the model and to enhance its predictive accuracy: description of the molecules by a Morgan index process and definition of a second-order Markov model for random walks on 2D structures. Experiments on two mutagenicity data sets validate the proposed extensions, making this approach a possible complementary alternative to other modeling strategies.

Protein Science | 2005

A novel representation of protein sequences for prediction of subcellular location using support vector machines

Setsuro Matsuda; Jean-Philippe Vert; Hiroto Saigo; Nobuhisa Ueda; Hiroyuki Toh; Tatsuya Akutsu

As the number of complete genomes rapidly increases, accurate methods to automatically predict the subcellular location of proteins are increasingly useful to help their functional annotation. In order to improve the predictive accuracy of the many prediction methods developed to date, a novel representation of protein sequences is proposed. This representation involves local compositions of amino acids and twin amino acids, and local frequencies of distance between successive (basic, hydrophobic, and other) amino acids. For calculating the local features, each sequence is split into three parts: N‐terminal, middle, and C‐terminal. The N‐terminal part is further divided into four regions to consider ambiguity in the length and position of signal sequences. We tested this representation with support vector machines on two data sets extracted from the SWISS‐PROT database. Through fivefold cross‐validation tests, overall accuracies of more than 87% and 91% were obtained for eukaryotic and prokaryotic proteins, respectively. It is concluded that considering the respective features in the N‐terminal, middle, and C‐terminal parts is helpful to predict the subcellular location.
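The idea of computing composition separately for the N-terminal, middle, and C-terminal parts can be sketched as below. This covers only the plain amino acid composition; the twin-amino-acid and distance-frequency features, and the further subdivision of the N-terminal part, are omitted. The segment lengths `n_len=40` and `c_len=20` and all function names are hypothetical, since the abstract does not give them.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Fraction of each of the 20 standard amino acids in a segment."""
    counts = Counter(segment)
    total = len(segment)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def local_composition_features(seq, n_len=40, c_len=20):
    """Concatenate compositions of the N-terminal, middle, and C-terminal parts.

    n_len and c_len are illustrative values, not taken from the paper.
    """
    c_start = max(n_len, len(seq) - c_len)  # keep the parts non-overlapping
    n_term = seq[:n_len]
    middle = seq[n_len:c_start]
    c_term = seq[c_start:]
    return composition(n_term) + composition(middle) + composition(c_term)
```

The resulting 60-dimensional vector could then be passed to any support vector machine implementation as one input example.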

International Conference on Machine Learning | 2004

Extensions of marginalized graph kernels

Pierre Mahé; Nobuhisa Ueda; Tatsuya Akutsu; Jean-Luc Perret; Jean-Philippe Vert

Positive definite kernels between labeled graphs have recently been proposed. They enable the application of kernel methods, such as support vector machines, to the analysis and classification of graphs, for example, chemical compounds. These graph kernels are obtained by marginalizing a kernel between paths with respect to a random walk model on the graph vertices along the edges. We propose two extensions of these graph kernels, with the double goal to reduce their computation time and increase their relevance as measure of similarity between graphs. First, we propose to modify the label of each vertex by automatically adding information about its environment with the use of the Morgan algorithm. Second, we suggest a modification of the random walk model to prevent the walk from coming back to a vertex that was just visited. These extensions are then tested on benchmark experiments of chemical compounds classification, with promising results.
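The first extension, enriching vertex labels with neighbourhood information, can be illustrated with the usual Morgan iteration: every vertex starts with index 1, and at each step its index becomes the sum of its neighbours' indices. The sketch below assumes a graph given as an adjacency dictionary; the function names and the way the index is attached to the label are illustrative choices, not the authors' code.

```python
def morgan_indices(adjacency, iterations):
    """Morgan index of each vertex after the given number of iterations.

    adjacency: dict mapping each vertex to a list of its neighbours.
    """
    index = {v: 1 for v in adjacency}
    for _ in range(iterations):
        # All vertices are updated simultaneously from the previous indices.
        index = {v: sum(index[u] for u in adjacency[v]) for v in adjacency}
    return index


def augmented_labels(labels, adjacency, iterations):
    """Pair each vertex label with its Morgan index (hypothetical helper)."""
    index = morgan_indices(adjacency, iterations)
    return {v: (labels[v], index[v]) for v in adjacency}
```

With more iterations the labels become more specific, so fewer label paths match between two graphs, which is consistent with the stated goal of reducing computation time.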

Bioinformatics | 2010

Cascleave: towards more accurate prediction of caspase substrate cleavage sites

Hao Tan; Hong-Bin Shen; Khalid Mahmood; Sarah E. Boyd; Geoffrey I. Webb; Tatsuya Akutsu; James C. Whisstock

Motivation: The caspase family of cysteine proteases plays essential roles in key biological processes such as programmed cell death, differentiation, proliferation, necrosis and inflammation. The complete repertoire of caspase substrates remains to be fully characterized. Accordingly, systematic computational screening studies of caspase substrate cleavage sites may provide insight into the substrate specificity of caspases and further facilitate the discovery of putative novel substrates. Results: In this article we develop an approach (termed Cascleave) to predict both classical (i.e. following a P1 Asp) and non-typical caspase cleavage sites. When using local sequence-derived profiles, Cascleave successfully predicted 82.2% of the known substrate cleavage sites, with a Matthews correlation coefficient (MCC) of 0.667. We found that prediction performance could be further improved by incorporating information such as predicted solvent accessibility and whether a cleavage sequence lies in a region that is most likely natively unstructured. Novel bi-profile Bayesian signatures were found to significantly improve the prediction performance and yielded the best performance with an overall accuracy of 87.6% and an MCC of 0.747, which is higher accuracy than published methods that essentially rely on amino acid sequence alone. It is anticipated that Cascleave will be a powerful tool for predicting novel substrate cleavage sites of caspases and shedding new insights on the unknown caspase-substrate interactivity relationship. Availability: http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/Cascleave/ Contact: [email protected]; [email protected]; james; [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
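For readers unfamiliar with the Matthews correlation coefficient reported above, it is computed from the four confusion-matrix counts. The helper below is a generic sketch of the standard formula; the function name `mcc` and the convention of returning 0.0 when the denominator vanishes are choices made here, not part of Cascleave.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom
```

The value ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes it more informative than accuracy when cleavage and non-cleavage sites are imbalanced.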

Pacific Symposium on Biocomputing | 1999

Algorithms for inferring qualitative models of biological networks.

Tatsuya Akutsu; Satoru Miyano

Modeling genetic networks and metabolic networks is an important topic in bioinformatics. We propose a qualitative network model which is a combination of the Boolean network and qualitative reasoning, where qualitative reasoning is a kind of reasoning method well-studied in Artificial Intelligence. We also present algorithms for inferring qualitative networks from time series data and an algorithm for inferring S-systems (synergistic and saturable systems) from time series data, where S-systems are based on a particular kind of nonlinear differential equation and have been applied to the analysis of various biological systems.
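For reference, the nonlinear differential equations underlying S-systems are usually written in the following power-law form. The symbols are the conventional ones and are assumed here rather than taken from the paper: X_i is the i-th state variable, α_i and β_i are non-negative rate constants, and g_ij and h_ij are real-valued kinetic orders.

```latex
\frac{dX_i}{dt} \;=\; \alpha_i \prod_{j=1}^{n} X_j^{\,g_{ij}} \;-\; \beta_i \prod_{j=1}^{n} X_j^{\,h_{ij}},
\qquad i = 1, \dots, n
```

Inferring an S-system from time series data then amounts to estimating α_i, β_i, g_ij and h_ij for every variable.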

PLOS ONE | 2012

PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites.

Hao Tan; Andrew J. Perry; Tatsuya Akutsu; Geoffrey I. Webb; James C. Whisstock; Robert N. Pike

The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be utilized in an effective manner in order to efficiently identify protein substrates by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER utilizes established specificity information for these proteases (derived from the MEROPS database) with a machine learning approach to predict protease cleavage sites by using different, but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, pre-prepared tool for use in identifying protein substrates for the enzymes. Systematic prediction analysis for the twenty-four proteases thus far included in the database revealed that the features we have included in the tool strongly improve performance in terms of cleavage site prediction, as evidenced by their contribution to performance improvement in terms of identifying known cleavage sites in substrates for these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage. 
To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate sequence using machine learning techniques. It is freely available at http://lightning.med.monash.edu.au/PROSPER/.

Intelligent Systems in Molecular Biology | 2011

IPknot: fast and accurate prediction of RNA secondary structures with pseudoknots using integer programming.

Kengo Sato; Yuki Kato; Michiaki Hamada; Tatsuya Akutsu; Kiyoshi Asai

Motivation: Pseudoknots found in secondary structures of a number of functional RNAs play various roles in biological processes. Recent methods for predicting RNA secondary structures cover certain classes of pseudoknotted structures, but only a few of them achieve satisfying predictions in terms of both speed and accuracy. Results: We propose IPknot, a novel computational method for predicting RNA secondary structures with pseudoknots based on maximizing expected accuracy of a predicted structure. IPknot decomposes a pseudoknotted structure into a set of pseudoknot-free substructures and approximates a base-pairing probability distribution that considers pseudoknots, leading to the capability of modeling a wide class of pseudoknots and running quite fast. In addition, we propose a heuristic algorithm for refining base-pairing probabilities to improve the prediction accuracy of IPknot. The problem of maximizing expected accuracy is solved by using integer programming with threshold cut. We also extend IPknot so that it can predict the consensus secondary structure with pseudoknots when a multiple sequence alignment is given. IPknot is validated through extensive experiments on various datasets, showing that IPknot achieves better prediction accuracy and faster running time as compared with several competitive prediction methods. Availability: The program of IPknot is available at http://www.ncrna.org/software/ipknot/. IPknot is also available as a web server at http://rna.naist.jp/ipknot/. Contact: [email protected]; [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
