Publication


Featured research published by Keiji Shinzato.


Journal of Information Processing | 2012

TSUBAKI: An Open Search Engine Infrastructure for Developing Information Access Methodology

Keiji Shinzato; Tomohide Shibata; Daisuke Kawahara; Sadao Kurohashi

Due to the explosive growth in the amount of information over the last decade, it has become extremely hard to obtain necessary information with conventional information access methods. Hence, drastically new technology is needed. Developing such technology requires search engine infrastructures. Although existing search engine APIs can be regarded as such infrastructures, they have several restrictions, such as a limit on the number of API calls. To help the development of new technology, we are running an open search engine infrastructure, TSUBAKI, on a high-performance computing environment. In this paper, we describe the TSUBAKI infrastructure.


Web Intelligence | 2009

Web Information Organization Using Keyword Distillation Based Clustering

Tomohide Shibata; Yasuo Bamba; Keiji Shinzato; Sadao Kurohashi

This paper describes a system that conducts search result clustering for several thousand Web pages, and elaborates cluster labels through keyword distillation. Keyword distillation is a method that properly handles spelling variations, transliterations, synonyms, inclusion relations and word ambiguity, using linguistic resources and the context of a user's query. The system provides a clustering result from 1,000 pages in less than one minute by taking advantage of a search engine infrastructure and a grid computing environment. Experimental results show that the system correctly merged synonymous keywords and is useful for finding topics hidden in the lower-ranked pages of a search result.
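The idea of merging spelling variants and synonyms before grouping pages can be illustrated with a minimal sketch. This is not the paper's method: the `CANONICAL` table, the `normalize` and `cluster_by_keyword` helpers, and the sample pages are all hypothetical, and a hand-written lookup table stands in for the linguistic resources the abstract mentions.

```python
from collections import defaultdict

# Hypothetical variant/synonym table (an assumption for illustration only).
# Each surface form maps to one canonical cluster label.
CANONICAL = {
    "color": "colour",
    "colour": "colour",
    "notebook pc": "laptop",
    "laptop": "laptop",
}


def normalize(keyword: str) -> str:
    """Lowercase, collapse whitespace, then map known variants to a canonical label."""
    key = " ".join(keyword.lower().split())
    return CANONICAL.get(key, key)


def cluster_by_keyword(pages: dict[str, list[str]]) -> dict[str, set[str]]:
    """Group page ids under canonical keywords so that variants share one cluster."""
    clusters: dict[str, set[str]] = defaultdict(set)
    for page_id, keywords in pages.items():
        for kw in keywords:
            clusters[normalize(kw)].add(page_id)
    return dict(clusters)


# Made-up sample data: page id -> keywords extracted from that page.
pages = {
    "p1": ["Laptop", "colour"],
    "p2": ["notebook PC"],
    "p3": ["color", "battery"],
}
clusters = cluster_by_keyword(pages)
```

In this sketch, "Laptop" and "notebook PC" end up under a single label, which mirrors the merging of synonymous keywords that the abstract reports; handling transliterations, inclusion relations and word ambiguity would need resources beyond a flat lookup table.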


International Universal Communication Symposium | 2009

Development of a large-scale web crawler and search engine infrastructure

Susumu Akamine; Yoshikiyo Kato; Daisuke Kawahara; Keiji Shinzato; Kentaro Inui; Sadao Kurohashi; Yutaka Kidawara

This paper reports the ongoing development of a large-scale Web crawler and search engine infrastructure at the National Institute of Information and Communications Technology. This infrastructure has the following characteristics: (1) It collects one billion Japanese Web pages while keeping them up-to-date. (2) It selects 100 million pages from among the collected pages and converts them into a standard data format to store the results of morphological analysis, dependency parsing, and synonym augmentation. (3) The selected set of pages is searchable and accessible to users. (4) The scalability of the system is achieved by using a large-scale cluster machine for distributed data processing.


International Joint Conference on Natural Language Processing | 2008

TSUBAKI: An Open Search Engine Infrastructure for Developing New Information Access Methodology

Keiji Shinzato; Tomohide Shibata; Daisuke Kawahara; Chikara Hashimoto; Sadao Kurohashi


Language Resources and Evaluation | 2008

A Large-Scale Web Data Collection as a Natural Language Processing Infrastructure

Keiji Shinzato; Daisuke Kawahara; Chikara Hashimoto; Sadao Kurohashi


International Joint Conference on Natural Language Processing | 2013

Unsupervised Extraction of Attributes and Their Values from Product Description

Keiji Shinzato; Satoshi Sekine


International Joint Conference on Natural Language Processing | 2013

Precise Information Retrieval Exploiting Predicate-Argument Structures

Daisuke Kawahara; Keiji Shinzato; Tomohide Shibata; Sadao Kurohashi


Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010) | 2010

Exploiting Term Importance Categories and Dependency Relations for Natural Language Search

Keiji Shinzato; Sadao Kurohashi


Journal of Natural Language Processing | 2016

Error Analysis on Product Attribute Value Extraction

Keiji Shinzato; Satoshi Sekine; Koji Murakami


Archive | 2013

Exploiting Predicate-Argument Structures

Daisuke Kawahara; Keiji Shinzato; Tomohide Shibata; Sadao Kurohashi

Collaboration


Dive into Keiji Shinzato's collaborations.

Top Co-Authors


Chikara Hashimoto

National Institute of Information and Communications Technology


Susumu Akamine

National Institute of Information and Communications Technology


Yoshikiyo Kato

National Institute of Information and Communications Technology


Yutaka Kidawara

National Institute of Information and Communications Technology
