Takeshi Shinohara | Researchain

Publication

Featured researches published by Takeshi Shinohara.

Proceedings of RIMS Symposium on Software Science and Engineering | 1983

Polynomial Time Inference of Extended Regular Pattern Languages

Takeshi Shinohara

A pattern is a string of constant symbols and variable symbols. The language of a pattern p is the set of all strings obtained by substituting any non-empty constant string for each variable symbol in p. A regular pattern has at most one occurrence of each variable symbol. In this paper, we consider polynomial time inference from positive data for the class of extended regular pattern languages, which are the sets of all strings obtained by substituting any (possibly empty) constant string, instead of a non-empty string, for each variable symbol. Our inference machine uses MINL calculation, which finds a minimal language containing a given finite set of strings. The relation between MINL calculation for the class of extended regular pattern languages and the longest common subsequence problem is also discussed.
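The membership condition described in this abstract can be sketched as follows. This is only an illustration of the definitions, not the paper's MINL procedure. The function name `in_language` and the representation of a regular pattern as the list of its constant segments (the pattern split at its variables) are assumptions made for the example.

```python
def in_language(segments, s, erasing=True):
    """Test whether s belongs to the language of a regular pattern.

    `segments` holds the constant parts of the pattern, split at its
    variables: the pattern "a x b y" becomes ["a", "b", ""].
    With erasing=True a variable may be replaced by the empty string
    (extended regular pattern language); otherwise every variable
    needs at least one symbol.
    """
    gap = 0 if erasing else 1
    if len(segments) == 1:           # no variables: plain string equality
        return s == segments[0]
    first, *middle, last = segments
    end = len(s) - len(last)         # where the final constant must begin
    pos = len(first)                 # where the first variable begins
    if end < pos or not s.startswith(first) or not s.endswith(last):
        return False
    for seg in middle:
        # Leftmost match is sufficient because each variable occurs once.
        idx = s.find(seg, pos + gap, end)
        if idx == -1:
            return False
        pos = idx + len(seg)
    return end - pos >= gap          # room left for the last variable


print(in_language(["a", "b", ""], "ab"))                 # True
print(in_language(["a", "b", ""], "ab", erasing=False))  # False
```

The pattern x0 a x1 c x2, written as `["", "a", "c", ""]`, accepts under erasing substitution exactly the strings that contain "ac" as a subsequence, which gives a feel for why common subsequences of the sample strings are relevant.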

Nucleic Acids Research | 1984

GENAS: a database system for nucleic acid sequence analysis.

Fumihiro Matsuo; Syoichi Futamura; Atsushi Fujita; Takeshi Shinohara; Toshihisa Takagi; Yoshiyuki Sakaki

A database system named GENAS (GENe Analyzing System) for computer analysis of sequences was constructed using Adbis, a relational database management system (1). GENAS enables us to retrieve any sequence data from the EMBL nucleotide sequence data library (2) and to analyze them readily (if necessary, together with private data) with various application programs in an interactive manner. Analysis of the structure of the replication origin of replicons was demonstrated using this system.

Hawaii International Conference on System Sciences | 1992

A learning algorithm for elementary formal systems and its experiments on identification of transmembrane domains

Setsuo Arikawa; Satoru Miyano; Ayumi Shinohara; Takeshi Shinohara

Proposes a method for algorithmic learning of transmembrane domains based on elementary formal systems. An elementary formal system (EFS) is a kind of logic program consisting of if-then rules. With this framework, the authors have implemented the algorithm for identifying transmembrane domains in amino acid sequences. Because of the limitations on computational resources, they restrict candidate hypotheses to EFSs defined by collections of regular patterns. From 70 transmembrane sequences and a similar number of negative examples which are not transmembrane sequences, the algorithm has produced several reasonable hypotheses of small size. Experiments with the database PIR show that one of them recognizes 95% of 689 transmembrane sequences and 95% of 19256 negative examples, which consist of non-transmembrane sequences of length around 30 randomly chosen from PIR.

Hawaii International Conference on System Sciences | 1993

Finding alphabet indexing for decision trees over regular patterns: an approach to bioinformatical knowledge acquisition

Shinichi Shimozono; Ayumi Shinohara; Takeshi Shinohara; Satoru Miyano; Setsuo Arikawa

Considers a transformation from an alphabet to a smaller alphabet that loses no positive or negative information about the original examples. Such a transformation is called an indexing. A method which exploits indexing by a local search technique for learning decision trees over regular patterns is proposed. From positive and negative examples, the system produces, as a hypothesis, an indexing-decision tree pair. The authors also report some experimental results obtained by this machine learning system on two identification problems: transmembrane domains and signal peptides. For transmembrane domains, the system discovered an indexing by two symbols and a decision tree with just three nodes that achieves 92% accuracy. The indexing was almost the same as that based on the hydropathy index of Kyte and Doolittle (1982). For signal peptides, the system also found sufficiently good hypotheses.
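A minimal sketch of what an indexing does, assuming that "losing no information" means the images of the positive and negative examples stay disjoint after the alphabet is reduced. The function names, the toy alphabet, and the example strings are hypothetical, and the local search and decision tree learning from the paper are not reproduced here.

```python
def apply_indexing(indexing, s):
    """Rewrite s symbol by symbol over the smaller alphabet."""
    return "".join(indexing[c] for c in s)


def preserves_examples(indexing, positives, negatives):
    """Check that no positive and negative example collide after indexing."""
    pos_images = {apply_indexing(indexing, s) for s in positives}
    neg_images = {apply_indexing(indexing, s) for s in negatives}
    return pos_images.isdisjoint(neg_images)


# Toy data (hypothetical): five symbols reduced to the two symbols "0" and "1".
two_symbols = {"A": "0", "V": "0", "L": "0", "K": "1", "D": "1"}
one_symbol = {c: "0" for c in "AVLKD"}
positives = ["AVL", "LLA"]
negatives = ["KDK", "AKD"]

print(preserves_examples(two_symbols, positives, negatives))  # True
print(preserves_examples(one_symbol, positives, negatives))   # False
```

A search procedure would try different mappings and keep only those for which this check succeeds.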

New Generation Computing | 1984

A run-time efficient realization of Aho-Corasick pattern matching machines

Setsuo Arikawa; Takeshi Shinohara

Realizations of Aho-Corasick pattern matching machines which deal with several keywords at a time are studied from the viewpoint of run-time and space complexity. New realizations by means of dividing character codes and transition tables are introduced and shown to be efficient. A time-space trade-off in such realizations is pointed out. Experimental results on run-time of our realizations are shown and compared with those of some other well-known pattern matching techniques. Applications of our realizations to sorting and searching for keywords are also discussed.
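For readers unfamiliar with the underlying machine, here is a textbook Aho-Corasick sketch that matches several keywords in one pass over the text. It stores transitions in plain dictionaries and does not reproduce the paper's realization based on dividing character codes and transition tables. The function names `build_automaton` and `search` are chosen for this example.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure and output tables for a set of keywords."""
    goto, fail, out = [{}], [0], [[]]
    for word in keywords:                      # phase 1: keyword trie
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    queue = deque(goto[0].values())            # phase 2: failure links by BFS
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, keyword) for every occurrence in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]                # follow failure links on mismatch
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


machine = build_automaton(["he", "she", "his", "hers"])
print(search("ushers", machine))  # [(1, 'she'), (2, 'he'), (2, 'hers')]
```

Dictionary-based transitions keep the sketch short; a table-based realization trades memory for faster lookups, which is the kind of time-space trade-off the abstract refers to.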

Proceedings of the Second International Workshop on Nonmonotonic and Inductive Logic | 1991

More About Learning Elementary Formal Systems

Setsuo Arikawa; Takeshi Shinohara; Satoru Miyano; Ayumi Shinohara

An elementary formal system (EFS for short) is a kind of logic program that deals directly with character strings. In 1989, we proposed the class of variable-bounded EFSs as a unifying framework for language learning. Several works have since been developed in response to the proposal. In this paper, a brief summary of these works on learning elementary formal systems is presented, covering Shapiro's model inference approach, inductive inference from positive data, Valiant's PAC (probably approximately correct) learning approach, and applications to molecular biology.

International Conference on Data Engineering | 1986

Efficient storage and retrieval of very large document databases

Fumihiro Matsuo; Shouichi Futamura; Takeshi Shinohara

The authors have developed an information retrieval system named AIR (Augmented Information Retrieval system), which might be one of the most efficient systems for very large document databases. AIR can store document data compactly and retrieve them quickly. The techniques behind AIR's efficiency (data compression, a quick keyword index, and automatic keyword selection) are discussed. These techniques, which are based on the statistical properties of word occurrence, are fairly simple, so information retrieval systems employing them can be implemented with ease. The data compression technique reduces English text by a factor of 4. The quick keyword index decreases the average number of disk accesses needed to retrieve a keyword to about 0.3. The automatic keyword selection technique roughly halves both the number of different keywords and the size of the inverted file with only 2% loss of retrieval power.

Proceedings of the International Spring School on Mathematical Methods of Specification and Synthesis of Software Systems '85 | 1985

Some problems on inductive inference from positive data

Takeshi Shinohara

This paper describes some problems on inductive inference of formal languages from positive data: polynomial time inference and its application to practical problems, inference of unions, and inference from negative data.

Algorithmic Learning for Knowledge-Based Systems, GOSLER Final Report | 1995