Keiji Shinzato
Kyoto University
Publication
Featured research published by Keiji Shinzato.
Journal of Information Processing | 2012
Keiji Shinzato; Tomohide Shibata; Daisuke Kawahara; Sadao Kurohashi
Due to the explosive growth in the amount of information over the last decade, it is becoming extremely difficult to obtain necessary information with conventional information access methods. Hence, drastically new technology is needed. Developing such technology requires search engine infrastructures. Although existing search engine APIs can be regarded as such infrastructures, they have several restrictions, such as a limit on the number of API calls. To help the development of new technology, we are running an open search engine infrastructure, TSUBAKI, on a high-performance computing environment. In this paper, we describe the TSUBAKI infrastructure.
Web Intelligence | 2009
Tomohide Shibata; Yasuo Bamba; Keiji Shinzato; Sadao Kurohashi
This paper describes a system that conducts search result clustering for several thousand Web pages and elaborates cluster labels through keyword distillation. Keyword distillation is a method that properly handles spelling variations, transliterations, synonyms, inclusion relations and word ambiguity, using linguistic resources and the context of a user's query. The system provides a clustering result from 1,000 pages in less than one minute by taking advantage of a search engine infrastructure and a grid computing environment. Experimental results show that the system correctly merged synonymous keywords and is useful for finding topics hidden in the lower-ranked pages of a search result.
International Universal Communication Symposium | 2009
Susumu Akamine; Yoshikiyo Kato; Daisuke Kawahara; Keiji Shinzato; Kentaro Inui; Sadao Kurohashi; Yutaka Kidawara
This paper reports the ongoing development of a large-scale Web crawler and search engine infrastructure at the National Institute of Information and Communications Technology. This infrastructure has the following characteristics: (1) It collects one billion Japanese Web pages while keeping them up to date. (2) It selects 100 million pages from among the collected pages and converts them into a standard data format to store the results of morphological analysis, dependency parsing, and synonym augmentation. (3) The selected set of pages is searchable and accessible to users. (4) The scalability of the system is achieved by using a large-scale cluster machine for distributed data processing.
International Joint Conference on Natural Language Processing | 2008
Keiji Shinzato; Tomohide Shibata; Daisuke Kawahara; Chikara Hashimoto; Sadao Kurohashi
Language Resources and Evaluation | 2008
Keiji Shinzato; Daisuke Kawahara; Chikara Hashimoto; Sadao Kurohashi
International Joint Conference on Natural Language Processing | 2013
Keiji Shinzato; Satoshi Sekine
International Joint Conference on Natural Language Processing | 2013
Daisuke Kawahara; Keiji Shinzato; Tomohide Shibata; Sadao Kurohashi
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010) | 2010
Keiji Shinzato; Sadao Kurohashi
Journal of Natural Language Processing | 2016
Keiji Shinzato; Satoshi Sekine; Koji Murakami
Archive | 2013
Daisuke Kawahara; Keiji Shinzato; Tomohide Shibata; Sadao Kurohashi
Collaboration
Dive into Keiji Shinzato's collaborations.
National Institute of Information and Communications Technology