Tetsuo Shibuya | Researchain

Publication

Featured research published by Tetsuo Shibuya.

Proceedings of the National Academy of Sciences of the United States of America | 2003

Reevaluation of human cytomegalovirus coding potential

Eain Murphy; Isidore Rigoutsos; Tetsuo Shibuya; Thomas Shenk

The Bio-Dictionary-based Gene Finder was used to reassess the coding potential of the AD169 laboratory strain of human cytomegalovirus and sequences in the Toledo strain that are missing in the laboratory strain of the virus. The gene-finder algorithm assesses the potential of an ORF to encode a protein based on matches to a database of amino acid patterns derived from a large collection of proteins. The algorithm was used to score all human cytomegalovirus ORFs with the potential to encode polypeptides ≥50 aa in length. As a further test for functionality, the genomes of the chimpanzee, rhesus, and murine cytomegaloviruses were searched for orthologues of the predicted human cytomegalovirus ORFs. The analysis indicates that 37 previously annotated ORFs ought to be discarded, and at least nine previously unrecognized ORFs with relatively strong coding potential should be added. Thus, the human cytomegalovirus genome appears to contain ≈192 unique ORFs with the potential to encode a protein. Support for several of the predictions of our in silico analysis was obtained by sequencing several domains within a clinical isolate of human cytomegalovirus.

Workshop on Algorithms in Bioinformatics | 2003

Match Chaining Algorithms for cDNA Mapping

Tetsuo Shibuya; Igor Kurochkin

We propose a new algorithm called the MCCM (Match Chaining-based cDNA Mapping) algorithm that allows mapping cDNAs to the genomes efficiently and accurately, utilizing local matches called MUMs (maximal unique matches) or MRMs (maximal rare matches) obtained with suffix trees. From the MUMs (or MRMs), our algorithm selects appropriate matches which are related to the cDNA mapping. We call the selection the match chaining problem. Several O(k log k)-time algorithms are known where k is the number of the input matches, but they do not permit overlaps of the matches. We propose a new O(k log k)-time algorithm for the problem with provision for overlaps. Previously, only an O(k²)-time algorithm existed. Furthermore, we also incorporate a restriction on the distances between matches for accurate cDNA mapping. We examine the performance of our algorithm through computational experiments using sequences of the FANTOM mouse cDNA database and the mouse genome. According to the experiments, the MCCM algorithm is not only very fast, but also very accurate: we achieved >95% specificity and >97% sensitivity at the same time against the mapping results of the FANTOM annotators.
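To make the match chaining problem concrete, the following is a minimal sketch of the quadratic-time baseline with overlaps allowed, not the O(k log k) MCCM algorithm itself. The match representation `(cdna_start, genome_start, length)`, the function name `chain_matches`, and the scoring rule (a match contributes its length minus its overlap with the preceding match) are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical match format: (cdna_start, genome_start, length)
Match = Tuple[int, int, int]


def chain_matches(matches: List[Match]) -> Tuple[int, List[Match]]:
    """Quadratic-time chaining that tolerates overlapping matches.

    Match j may follow match i when both its start and its end lie strictly
    after those of i on the cDNA and on the genome. Overlapping bases are
    counted once (an assumed scoring rule, not taken from the paper).
    """
    ms = sorted(matches)
    k = len(ms)
    if k == 0:
        return 0, []
    best = [m[2] for m in ms]  # best chain score ending at each match
    prev = [-1] * k            # back-pointers for chain recovery
    for j in range(k):
        qj, tj, lj = ms[j]
        for i in range(j):
            qi, ti, li = ms[i]
            ordered = (qi < qj and ti < tj
                       and qi + li < qj + lj and ti + li < tj + lj)
            if not ordered:
                continue
            # The ordering test above guarantees overlap < lj, so gain > 0.
            overlap = max(0, qi + li - qj, ti + li - tj)
            gain = lj - overlap
            if best[i] + gain > best[j]:
                best[j] = best[i] + gain
                prev[j] = i
    end = max(range(k), key=best.__getitem__)
    score = best[end]
    chain = []
    while end != -1:
        chain.append(ms[end])
        end = prev[end]
    return score, chain[::-1]


if __name__ == "__main__":
    demo = [(0, 100, 10), (8, 108, 10), (30, 500, 5), (20, 50, 10)]
    print(chain_matches(demo))
```

The double loop is what makes this O(k²); the abstract's contribution is reaching O(k log k) while still permitting overlaps, which this sketch does not attempt.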

International Symposium on Algorithms and Computation | 1999

Constructing the Suffix Tree of a Tree with a Large Alphabet

Tetsuo Shibuya

The problem of constructing the suffix tree of a common suffix tree (CS-tree) is a generalization of the problem of constructing the suffix tree of a string. It has many applications, such as in minimizing the size of sequential transducers and in tree pattern matching. The best-known algorithm for this problem is Breslauer's O(n log |Σ|) time algorithm where n is the size of the CS-tree and |Σ| is the alphabet size, which requires O(n log n) time if |Σ| is large. We improve this bound by giving an O(n log log n) algorithm for integer alphabets. For trees called shallow k-ary trees, we give an optimal linear time algorithm. We also describe a new data structure, the Bsuffix tree, which enables efficient query for patterns of completely balanced k-ary trees from a k-ary tree or forest. We also propose an optimal O(n) algorithm for constructing the Bsuffix tree for integer alphabets.

Symposium on Discrete Algorithms | 1999

Optimal on-line algorithms for an electronic commerce money distribution system

Hiroshi Kawazoe; Tetsuo Shibuya; Takeshi Tokuyama

We consider the money distribution problem for a micro-payment scheme using a distributed server system; in particular, for an automatic charging scheme named PayPerClick that allows Internet users to view Web pages for which access charges are levied without tedious payment procedures. A major bottleneck in the scheme is the network traffic caused by the distribution of electronic money to many different servers. We propose a simple online algorithm for distributing electronic money to servers so that the network traffic is minimized. The algorithm achieves the optimal online competitive ratio. We also consider a weighted version, for which we give an asymptotically optimal online algorithm within a constant factor.

Nucleic Acids Research | 2003

The web server of IBM's Bioinformatics and Pattern Discovery group

Tien Huynh; Isidore Rigoutsos; Laxmi Parida; Daniel E. Platt; Tetsuo Shibuya

We herein present and discuss the services and content which are available on the web server of IBM's Bioinformatics and Pattern Discovery group. The server is operational around the clock and provides access to a variety of methods that have been published by the group's members and collaborators. The available tools correspond to applications ranging from the discovery of patterns in streams of events and the computation of multiple sequence alignments, to the discovery of genes in nucleic acid sequences and the interactive annotation of amino acid sequences. Additionally, annotations for more than 70 archaeal, bacterial, eukaryotic and viral genomes are available on-line and can be searched interactively. The tools and code bundles can be accessed beginning at http://cbcsrv.watson.ibm.com/Tspd.html whereas the genomics annotations are available at http://cbcsrv.watson.ibm.com/Annotations/.

ACM Journal of Experimental Algorithms | 2000

Computing the n × m shortest paths efficiently

Tetsuo Shibuya

Computation of all the shortest paths between multiple sources and multiple destinations on various networks is required in many problems, such as the traveling salesperson problem (TSP) and the vehicle routing problem (VRP). This paper proposes new algorithms that compute the set of shortest paths efficiently by using the A* algorithm. The efficiency and properties of these algorithms are examined by using the results of experiments on an actual road network.
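For orientation, the sketch below shows the straightforward baseline that the n × m setting invites: one independent A* search per source-destination pair. It is not the paper's algorithm, which is designed to be more efficient than this. The names `a_star` and `all_pairs`, the adjacency-list graph format, and the straight-line heuristic (which assumes every edge weight is at least the Euclidean distance between its endpoints, as with road lengths) are illustrative assumptions.

```python
import heapq
import math
from typing import Dict, Iterable, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]   # node -> [(neighbour, weight)]
Coords = Dict[str, Tuple[float, float]]      # node -> (x, y) position


def a_star(graph: Graph, coords: Coords, source: str, target: str) -> float:
    """Shortest-path length from source to target, or inf if unreachable.

    Assumes each edge weight is >= the straight-line distance between its
    endpoints, so the Euclidean heuristic is consistent.
    """
    def h(v: str) -> float:
        return math.dist(coords[v], coords[target])

    dist = {source: 0.0}
    heap = [(h(source), source)]
    closed = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u == target:
            return dist[u]
        if u in closed:
            continue
        closed.add(u)
        for v, w in graph.get(u, []):
            nd = dist[u] + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd + h(v), v))
    return math.inf


def all_pairs(graph: Graph, coords: Coords,
              sources: Iterable[str], targets: Iterable[str]
              ) -> Dict[Tuple[str, str], float]:
    """Naive n x m baseline: run A* once per (source, target) pair."""
    targets = list(targets)
    return {(s, t): a_star(graph, coords, s, t)
            for s in sources for t in targets}


if __name__ == "__main__":
    coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
    graph = {
        "A": [("B", 1.0), ("D", 1.0), ("C", 1.5)],
        "B": [("A", 1.0), ("C", 1.0)],
        "C": [("B", 1.0), ("D", 1.0), ("A", 1.5)],
        "D": [("A", 1.0), ("C", 1.0)],
    }
    print(all_pairs(graph, coords, ["A", "B"], ["C", "D"]))
```

This baseline repeats n × m searches that share no work, which is the cost the abstract's algorithms aim to reduce.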

Bioinformatics | 2004

Efficient filtering methods for clustering cDNAs with spliced sequence alignment

Tetsuo Shibuya; Hisashi Kashima; Akihiko Konagaya

Motivation: Clustering sequences of a full-length cDNA library into alternative splice form candidates is a very important problem. Results: We developed a new efficient algorithm to cluster sequences of a full-length cDNA library into alternative splice form candidates. Current clustering algorithms for cDNAs tend to produce too many clusters containing incorrect splice form candidates. Our algorithm is based on a spliced sequence alignment algorithm that considers splice sites. The spliced sequence alignment algorithm is a variant of an ordinary dynamic programming algorithm, which requires O(nm) time for checking a pair of sequences where n and m are the lengths of the two sequences. Since the time bound is too large to perform all-pair comparison for a large set of sequences, we developed new techniques to reduce the computation time without affecting the accuracy of the output clusters. Our algorithm was applied to 21 076 mouse cDNA sequences of the FANTOM 1.10 database to examine its performance and accuracy. In these experiments, we achieved about a 2- to 12-fold speedup against a method using only a traditional hash-based technique. Moreover, without using any information from the mouse genome sequence data or any gene data in public databases, we succeeded in listing 87-89% of all the clusters that biologists have annotated manually. Availability: We provide a web service for cDNA clustering located at https://access.obigrid.org/ibm/cluspa/, for which registration for the OBIGrid (http://www.obigrid.org) is required.
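The filter-then-align pattern described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: a shared k-mer test stands in for the "traditional hash-based technique", and plain edit distance stands in for the O(nm) dynamic program. The paper's splice-site-aware alignment and its additional filtering techniques are not reproduced, and the names `shares_kmer`, `edit_distance`, and `candidate_pairs` are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def shares_kmer(a: str, b: str, k: int) -> bool:
    """Hash-based prefilter: do the two sequences share any k-mer?"""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def edit_distance(a: str, b: str) -> int:
    """Ordinary O(nm) dynamic program (unit costs), kept to two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]


def candidate_pairs(seqs: List[str], k: int,
                    max_dist: int) -> List[Tuple[int, int]]:
    """Run the expensive DP only on pairs that pass the cheap k-mer filter."""
    return [(i, j)
            for i, j in combinations(range(len(seqs)), 2)
            if shares_kmer(seqs[i], seqs[j], k)
            and edit_distance(seqs[i], seqs[j]) <= max_dist]


if __name__ == "__main__":
    print(candidate_pairs(["ACGTACGT", "ACGTTCGT", "GGGGCCCC"], k=4, max_dist=2))
```

The point of the prefilter is that the quadratic-time alignment runs on far fewer pairs than an all-pair comparison would require.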

Archive | 1998