Publication


Featured research published by Kazuaki Kishida.


Information Processing and Management | 2005

Technical issues of cross-language information retrieval: a review

Kazuaki Kishida

This paper reviews state-of-the-art techniques and methods for enhancing the effectiveness of cross-language information retrieval (CLIR). The following research issues are covered: (1) matching strategies and translation techniques, (2) methods for solving the problem of translation ambiguity, (3) formal models for CLIR such as application of the language model, (4) the pivot language approach, (5) methods for searching multilingual document collections, (6) techniques for combining multiple language resources, etc.


Scientometrics | 1997

International publication patterns in social sciences: a quantitative analysis of the IBSS file

Kazuaki Kishida; Sachiko Matsui

A scientometric analysis of social science literature is attempted using the machine-readable files of the IBSS 1981–1985. This is a comprehensive international bibliography in social sciences including cultural anthropology, economics, political science and sociology. Data used were 40,313 monograph records in the IBSS files. First, the number of scholarly monographs was examined by country. As a result, it is shown that a large number of monographs were published by only a very small number of countries. Second, the number of monographs was examined by language. A pattern similar to that of countries was observed. Third, the relationship between the publishing country and the language used is discussed. It is clarified that some languages, such as English, French and Spanish, are used in many countries because of their historical background such as colonization. Finally, we examined the correlation among the number of published monographs, GDP, population and the number of people attaining a university education. A regression model that incorporates GDP as explanatory variables explains well the variation in the number of monographs by country (R² = 0.77).


Cross-Language Evaluation Forum | 2005

A hybrid approach to query and document translation using a pivot language for cross-language information retrieval

Kazuaki Kishida; Noriko Kando

This paper reports experimental results for cross-language information retrieval (CLIR) from German to French, in which a hybrid approach to query and document translation was attempted, i.e., combining the results of query translation (German to French) and of document translation (French to German). In order to reduce the complexity of computation when translating a large amount of text, we performed pseudo-translation, i.e., a simple replacement of terms by a bilingual dictionary (for query translation, a machine translation system was used). In particular, since English was used as an intermediary language for both translation directions between German and French, English translations at the middle stage were employed as document representations in order to reduce the number of translation steps. By omitting a translation step (English to German), the performance was improved. Unfortunately, our hybrid approach did not show better performance than a simple query translation. This may be due to the low performance of document translation, which was carried out by a simple replacement of terms using a bilingual dictionary with no term disambiguation.


Cross-Language Evaluation Forum | 2003

Two-Stage Refinement of Query Translation in a Pivot Language Approach to Cross-Lingual Information Retrieval: An Experiment at CLEF 2003

Kazuaki Kishida; Noriko Kando

This paper reports experimental results of cross-lingual information retrieval from German to Italian. The authors are concerned with CLIR in cases where available language resources are very limited. Thus transitive translation of queries using English as a pivot language was used to search Italian document collections for German queries without any direct bilingual dictionary or MT system for these two languages. In order to remove irrelevant translations produced by the transitive translation, we propose a new disambiguation technique, in which two stages of refinement of query translation are executed. Basically, this refinement is based on the idea of pseudo relevance feedback. Our experimental results show that the two-stage refinement method is able to significantly improve search performance of bilingual IR using a pivot language.
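
The transitive translation step described in this abstract can be illustrated with a minimal sketch. It assumes two toy bilingual dictionaries (source to pivot, pivot to target); the dictionary contents, the function name `transitive_translate`, and the data layout are hypothetical and are not taken from the paper. The sketch only shows how candidate translations multiply through the pivot, which is the ambiguity the two-stage refinement is meant to reduce; the refinement itself is not implemented here.

```python
from typing import Dict, List


def transitive_translate(
    query_terms: List[str],
    src_to_pivot: Dict[str, List[str]],
    pivot_to_tgt: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each source term to all target terms reachable through the pivot."""
    result: Dict[str, List[str]] = {}
    for term in query_terms:
        candidates = set()
        # Step 1: source -> pivot (e.g. German -> English)
        for pivot_term in src_to_pivot.get(term, []):
            # Step 2: pivot -> target (e.g. English -> Italian)
            candidates.update(pivot_to_tgt.get(pivot_term, []))
        result[term] = sorted(candidates)
    return result


# Hypothetical toy dictionaries, for illustration only.
de_en = {"bank": ["bank", "bench"]}
en_it = {"bank": ["banca", "riva"], "bench": ["panca"]}

translations = transitive_translate(["bank"], de_en, en_it)
print(translations)
```

A single source term with two pivot translations ends up with three target candidates, some of them irrelevant to the query; this is the kind of noise a disambiguation stage has to remove.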


Journal of Information Science | 2009

Translation disambiguation for cross-language information retrieval using context-based translation probability

Kazuaki Kishida; Emi Ishita

Disambiguation between multiple translation choices is very important in dictionary-based cross-language information retrieval. In prior work, disambiguation techniques have used term co-occurrence statistics from the collection being searched. Experimentally these techniques have worked well but are based upon heuristic assumptions. In this paper, a theoretically grounded alternative is proposed, one which uses sense disambiguation based upon context terms within the source text. Specifically this paper introduces the concept of translation probabilities incorporating a context term and extends the IBM Model 1 for estimating context-based translation probabilities from a sentence-aligned bilingual corpus. Experimental results in English to Italian bilingual searches show significant performance improvement of the context-based translation probabilities over the case without any disambiguation.
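
For orientation, the following is a minimal sketch of the standard IBM Model 1 EM procedure for estimating word translation probabilities from a sentence-aligned corpus. It is the plain baseline model (without a NULL word), not the context-based extension proposed in the paper; the toy corpus and the function name `ibm_model1` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SentencePair = Tuple[List[str], List[str]]  # (source words, target words)


def ibm_model1(pairs: List[SentencePair], iterations: int = 10) -> Dict[Tuple[str, str], float]:
    """Estimate t[(f, e)] = P(target word f | source word e) with EM."""
    tgt_vocab = {f for _, tgt in pairs for f in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t: Dict[Tuple[str, str], float] = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count: Dict[Tuple[str, str], float] = defaultdict(float)
        total: Dict[str, float] = defaultdict(float)
        # E-step: distribute each target word over the source words of its sentence.
        for src, tgt in pairs:
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise the expected counts per source word.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


# Hypothetical toy corpus (German source, English target).
corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
t = ibm_model1(corpus)
print(round(t[("house", "haus")], 3), round(t[("the", "haus")], 3))
```

In this toy run the probability mass for "haus" shifts toward "house" because "the" is better explained by "das", which also occurs in a second sentence pair.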


Journal of Information Science | 2011

Double-pass clustering technique for multilingual document collections

Kazuaki Kishida

It is often necessary to categorize automatically multilingual document sets, in which documents written in a variety of languages are included, into topically homogeneous subsets, such as when applying an automatic summarization system for multilingual news articles. However, there have been few studies on multilingual document clustering to date. In particular, it is not known whether clustering techniques are effective in medium- or large-scale multilingual document sets. For scalability, techniques should be based on dictionary-based translation and a single- or double-pass clustering algorithm. This article reports on experiments of applying multilingual document clustering to medium-scale sets of English, French, German and Italian documents (Reuters news articles). The results show that the double-pass algorithm has a positive effect in the case that each document is translated. On the other hand, the cluster translation strategy in which clusters obtained by applying a clustering algorithm to each language document set are translated has almost no effect. Also, translation disambiguation techniques can improve, but only slightly, the effectiveness of clustering.
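
As a rough illustration of a single-pass clustering algorithm of the kind mentioned in this abstract, the sketch below assigns each document to the most similar existing cluster if the cosine similarity reaches a threshold, and otherwise opens a new cluster. The threshold value, the term-weight vectors, and the function names are assumptions; the paper's actual algorithm, its second pass, and its translation steps are not reproduced.

```python
import math
from typing import Dict, List

Vector = Dict[str, float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(doc_vecs: List[Vector], threshold: float = 0.4) -> List[List[int]]:
    """Scan documents once; join the best cluster above the threshold or start a new one."""
    centroids: List[Vector] = []  # summed member vectors (same direction as the mean)
    members: List[List[int]] = []
    for doc_id, vec in enumerate(doc_vecs):
        best, best_sim = None, threshold
        for cluster_id, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = cluster_id, sim
        if best is None:
            centroids.append(dict(vec))
            members.append([doc_id])
        else:
            for term, w in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + w
            members[best].append(doc_id)
    return members


# Hypothetical toy documents, already mapped into one language.
docs = [
    {"oil": 1.0, "price": 1.0},
    {"oil": 1.0, "opec": 1.0},
    {"football": 1.0, "goal": 1.0},
    {"goal": 1.0, "match": 1.0},
]
clusters = single_pass_cluster(docs)
print(clusters)
```

Because each document is compared only against the current cluster centroids, the cost grows with the number of clusters rather than the number of document pairs, which is the scalability argument made in the abstract.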


Information Processing and Management | 2007

Term disambiguation techniques based on target document collection for cross-language information retrieval: an empirical comparison of performance between techniques

Kazuaki Kishida

Dictionary-based query translation for cross-language information retrieval often yields various translation candidates having different meanings for a source term in the query. This paper examines methods for solving the ambiguity of translations based on only the target document collections. First, we discuss two kinds of disambiguation technique: (1) a method using term co-occurrence statistics in the collection, and (2) a technique based on pseudo-relevance feedback (PRF). Next, these techniques are empirically compared using the CLEF 2003 test collection for German to Italian bilingual searches, which are executed using English as a pivot language. The experiments showed that a variation of term co-occurrence based techniques, in which the best sequence algorithm for selecting translations is used with the Cosine coefficient, is dominant, and that the PRF method shows comparably high search performance, although statistical tests did not sufficiently support these conclusions. Furthermore, we repeat the same experiments for the case of French to Italian (pivot) and English to Italian (non-pivot) searches on the same CLEF 2003 test collection in order to verify our findings. Again, similar results were observed, except that the Dice coefficient slightly outperforms the Cosine coefficient in the case of disambiguation based on term co-occurrence for English to Italian searches.
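
The two co-occurrence measures named in this abstract can be sketched as follows, assuming they are computed from document frequencies in the target collection (the common definitions: Cosine = n_xy / sqrt(n_x n_y), Dice = 2 n_xy / (n_x + n_y)). The toy collection and helper names are hypothetical, and the paper's "best sequence" selection algorithm is not implemented here.

```python
import math
from typing import Dict, List, Set


def build_index(docs: List[str]) -> Dict[str, Set[int]]:
    """Map each term to the set of ids of the documents that contain it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def cosine_coefficient(index: Dict[str, Set[int]], x: str, y: str) -> float:
    """n_xy / sqrt(n_x * n_y), where n counts documents."""
    dx, dy = index.get(x, set()), index.get(y, set())
    if not dx or not dy:
        return 0.0
    return len(dx & dy) / math.sqrt(len(dx) * len(dy))


def dice_coefficient(index: Dict[str, Set[int]], x: str, y: str) -> float:
    """2 * n_xy / (n_x + n_y), where n counts documents."""
    dx, dy = index.get(x, set()), index.get(y, set())
    if not dx and not dy:
        return 0.0
    return 2 * len(dx & dy) / (len(dx) + len(dy))


# Hypothetical target-language collection (Italian), for illustration only.
collection = [
    "banca credito",
    "banca credito prestito",
    "riva fiume",
    "credito prestito",
]
index = build_index(collection)

# Score two candidate translations against another translated query term.
cos_banca = cosine_coefficient(index, "banca", "credito")
cos_riva = cosine_coefficient(index, "riva", "credito")
dice_banca = dice_coefficient(index, "banca", "credito")
print(cos_banca, cos_riva, dice_banca)
```

A candidate that co-occurs with the other query translations in the target collection ("banca" with "credito") scores higher than one that never does ("riva"), which is the signal used to pick among dictionary candidates.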


Meeting of the Association for Computational Linguistics | 2003

Pseudo Relevance Feedback Method based on Taylor Expansion of Retrieval Function in NTCIR-3 Patent Retrieval Task

Kazuaki Kishida

Pseudo relevance feedback is empirically known as a useful method for enhancing retrieval performance. For example, we can apply the Rocchio method, a well-known relevance feedback method, to the results of an initial search by assuming that the top-ranked documents are relevant. In this paper, for searching the NTCIR-3 patent test collection through pseudo feedback, we employ two relevance feedback mechanisms: (1) the Rocchio method, and (2) a new method based on the Taylor formula of linear search functions. The test collection consists of nearly 700,000 records including the full text of Japanese patent materials. Unfortunately, the effectiveness of our pseudo feedback methods was not empirically observed at all in the experiment.
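
A minimal sketch of Rocchio-style pseudo relevance feedback, as outlined in this abstract, is given below. It assumes sparse term-weight vectors and uses only the positive feedback term (the top-ranked documents are treated as relevant; no non-relevant set is used). The parameter values, the toy vectors, and the function name `rocchio_prf` are assumptions, and the Taylor-expansion method from the paper is not shown.

```python
from typing import Dict, List

Vector = Dict[str, float]


def rocchio_prf(
    query_vec: Vector,
    ranked_doc_vecs: List[Vector],
    top_k: int = 2,
    alpha: float = 1.0,
    beta: float = 0.5,
) -> Vector:
    """Return alpha * q + beta * centroid of the top_k ranked documents."""
    feedback = ranked_doc_vecs[:top_k]  # assumed relevant (pseudo feedback)
    new_query = {term: alpha * w for term, w in query_vec.items()}
    if not feedback:
        return new_query
    for doc in feedback:
        for term, w in doc.items():
            new_query[term] = new_query.get(term, 0.0) + beta * w / len(feedback)
    return new_query


# Hypothetical initial query and ranked result list.
query = {"battery": 1.0}
ranked = [
    {"battery": 1.0, "lithium": 2.0},
    {"battery": 1.0, "electrode": 2.0},
    {"engine": 3.0},
]
expanded = rocchio_prf(query, ranked)
print(expanded)
```

The expanded query gains terms from the top-ranked documents ("lithium", "electrode") while the lower-ranked document contributes nothing; whether this helps depends on how relevant the top-ranked documents actually are, which is consistent with the negative result reported above.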


Cross-Language Evaluation Forum | 2004

Two-stage refinement of transitive query translation with English disambiguation for cross-language information retrieval: an experiment at CLEF 2004

Kazuaki Kishida; Noriko Kando

This paper reports experimental results of cross-language information retrieval (CLIR) from German to French. The authors focus on CLIR in cases where available language resources are very limited. Thus transitive translation of queries using English as a pivot language was used to search French document collections for German queries without any direct bilingual dictionary or MT system for these two languages. The two-stage refinement of query translations that we proposed at the previous CLEF 2003 campaign is again used for enhancing performance of the pivot language approach. In particular, disambiguation of English terms in the middle stage of transitive translation was attempted as a new experiment. Our results show that the two-stage refinement method is able to significantly improve search performance of bilingual IR using a pivot language, but unfortunately, the English disambiguation has almost no effect.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2017

Report on NTCIR-12: The Twelfth Round of NII Testbeds and Community for Information Access Research

Makoto Kato; Kazuaki Kishida; Noriko Kando; Tetsuya Sakai; Mark Sanderson

This is a report on the NTCIR-12 conference held in June 2016 in Tokyo, Japan. NTCIR-12 is the twelfth sesquiannual research project for evaluating information access technologies; it organizes a diverse set of tasks related to information retrieval, question answering, and natural language processing. The NTCIR-12 conference was a venue in which task organizers and task participants presented their work on the tasks they took part in, and it attracted 236 participants from 21 countries/regions in this round. This report introduces the highlights of the conference, describes the scope and task designs of the nine tasks organized at NTCIR-12, and provides a brief introduction to NTCIR-13, which started in June 2016 and will close in December 2017.

Collaboration

Top Co-Authors

Noriko Kando
National Institute of Informatics

Kazuko Kuriyama
National Institute of Informatics

Koji Eguchi
Hitotsubashi University

Hsin-Hsi Chen
National Taiwan University

Suk-Hoon Lee
Chungnam National University

Aitao Chen
University of California

Hailing Jiang
University of California