Publication


Featured research published by Kiyotaka Uchimoto.


international joint conference on natural language processing | 2009

An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging

Canasai Kruengkrai; Kiyotaka Uchimoto; Jun’ichi Kazama; Yiou Wang; Kentaro Torisawa; Hitoshi Isahara

In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. We describe strategies that yield a good balance in learning the characteristics of known and unknown words, and propose an error-driven policy that delivers such a balance by acquiring examples of unknown words from particular errors in a training corpus. We describe an efficient framework for training our model based on the Margin Infused Relaxed Algorithm (MIRA), evaluate our approach on the Penn Chinese Treebank, and show that it achieves superior performance compared to the state-of-the-art approaches reported in the literature.


Proceedings of the 7th Workshop on Asian Language Resources | 2009

Enhancing the Japanese WordNet

Francis Bond; Hitoshi Isahara; Sanae Fujita; Kiyotaka Uchimoto; Takayuki Kuribayashi; Kyoko Kanzaki

The Japanese WordNet currently has 51,000 synsets with Japanese entries. In this paper, we discuss three methods of extending it: increasing its coverage, linking it to examples in corpora, and linking it to other resources (SUMO and GoiTaikei). In addition, we outline our plans to make it more useful by adding Japanese definition sentences to each synset. Finally, we discuss how releasing the corpus under an open license has led to the construction of interfaces in a variety of programming languages.


empirical methods in natural language processing | 2009

Improving Dependency Parsing with Subtrees from Auto-Parsed Data

Wenliang Chen; Jun’ichi Kazama; Kiyotaka Uchimoto; Kentaro Torisawa

This paper presents a simple and effective approach to improving dependency parsing by using subtrees from auto-parsed data. First, we use a baseline parser to parse large-scale unannotated data. Then we extract subtrees from the dependency parse trees in the auto-parsed data. Finally, we construct new subtree-based features for parsing algorithms. To demonstrate the effectiveness of the proposed approach, we present experimental results on the English Penn Treebank and the Chinese Penn Treebank. These results show that our approach significantly outperforms baseline systems. It also achieves the best accuracy on the Chinese data and an accuracy competitive with the best known systems on the English data.
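The three steps in the abstract above (parse unannotated data, extract subtrees, build subtree-based features) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the input format, the restriction to head-dependent pairs, the function names, and the frequency threshold `high` are all assumptions made for the example, and the tiny `auto_parsed` list stands in for the output of a baseline parser.

```python
from collections import Counter

# Hypothetical input format: a parsed sentence is a list of (word, head_index)
# pairs, where head_index is 1-based into the sentence and 0 marks the root.

def extract_bigram_subtrees(sentence):
    """Return one (head_word, dependent_word, direction) tuple per arc."""
    subtrees = []
    for position, (word, head) in enumerate(sentence, start=1):
        if head == 0:
            continue  # the root token has no head word
        head_word = sentence[head - 1][0]
        # "L" means the dependent stands to the left of its head.
        direction = "L" if position < head else "R"
        subtrees.append((head_word, word, direction))
    return subtrees

def count_subtrees(auto_parsed):
    """Count how often each subtree occurs in the auto-parsed data."""
    counts = Counter()
    for sentence in auto_parsed:
        counts.update(extract_bigram_subtrees(sentence))
    return counts

def subtree_feature(counts, head_word, dep_word, direction, high=2):
    """Map a candidate arc to a coarse frequency feature (assumed buckets)."""
    n = counts.get((head_word, dep_word, direction), 0)
    if n == 0:
        return "SUBTREE=UNSEEN"
    return "SUBTREE=HIGH" if n >= high else "SUBTREE=LOW"

# Stand-in for the output of a baseline parser on unannotated text.
auto_parsed = [
    [("I", 2), ("ate", 0), ("fish", 2)],
    [("We", 2), ("ate", 0), ("fish", 2)],
]
counts = count_subtrees(auto_parsed)
print(subtree_feature(counts, "ate", "fish", "R"))
```

A parser could then add the returned string to the feature set of each candidate arc it scores; how the features are actually defined and bucketed in the paper is not stated in the abstract.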


meeting of the association for computational linguistics | 2000

Named entity extraction based on a maximum entropy model and transformation rules

Kiyotaka Uchimoto; Qing Ma; Masaki Murata; Hiromi Ozaku; Hitoshi Isahara

This paper describes named entity (NE) extraction based on a maximum entropy (ME) model and transformation rules. When focusing on the relationship between morphemes and NEs as defined in the NE task of the IREX competition held in 1999, there are two types of named entities: each NE either consists of one or more morphemes, or includes a substring of a morpheme. We extract the former type of NE by using the ME model. We then extract the latter type of NE by applying transformation rules to the text.


conference of the european chapter of the association for computational linguistics | 1999

Japanese dependency structure analysis based on maximum entropy models

Kiyotaka Uchimoto; Satoshi Sekine; Hitoshi Isahara

This paper describes a dependency structure analysis of Japanese sentences based on maximum entropy models. Our model is created by learning the weights of features from a training corpus to predict the dependency between bunsetsus, or phrasal units. The dependency accuracy of our system is 87.2% on the Kyoto University corpus. We discuss the contribution of each feature set and the relationship between the amount of training data and the accuracy.


meeting of the association for computational linguistics | 2007

A Hybrid Approach to Word Segmentation and POS Tagging

Tetsuji Nakagawa; Kiyotaka Uchimoto

In this paper, we present a hybrid method for word segmentation and POS tagging. The target languages are those in which word boundaries are ambiguous, such as Chinese and Japanese. In this method, word-based and character-based processing are combined, and word segmentation and POS tagging are conducted simultaneously. Experimental results on multiple corpora show that the integrated method has high accuracy.


arXiv: Computation and Language | 2000

Japanese probabilistic information retrieval using location and category information

Masaki Murata; Qing Ma; Kiyotaka Uchimoto; Hiromi Ozaku; Masao Utiyama; Hitoshi Isahara

Robertson's 2-Poisson information retrieval model does not use location and category information. We constructed a framework that uses location and category information in a 2-Poisson model. We submitted two systems based on this framework to the IREX contest, a Japanese-language information retrieval contest held in Japan in 1999. For precision in the A-judgement measure they scored 0.4926 and 0.4827, the highest values among the 15 teams and 22 systems that participated in the IREX contest. We describe our systems and the comparative experiments done when various parameters were changed. These experiments confirmed the effectiveness of using location and category information.


conference on computational natural language learning | 2009

Multilingual Dependency Learning: Exploiting Rich Features for Tagging Syntactic and Semantic Dependencies

Hai Zhao; Wenliang Chen; Jun’ichi Kazama; Kiyotaka Uchimoto; Kentaro Torisawa

This paper describes our system for multilingual syntactic and semantic dependency parsing, built for our participation in the joint task of the CoNLL-2009 shared task. Our system uses rich features and incorporates various integration technologies. The system is evaluated on the in-domain and out-of-domain evaluation data of the closed challenge of the joint task. For in-domain evaluation, our system ranks second in average macro labeled F1 over all seven languages, at 82.52% (only about 0.1% worse than the best system), and first for English, with a macro labeled F1 of 87.69%. For out-of-domain evaluation, our system also ranks second in average score over all three languages.


international conference on computational linguistics | 1994

Thesaurus-based efficient example retrieval by generating retrieval queries from similarities

Takehito Utsuro; Kiyotaka Uchimoto; Mitsutaka Matsumoto; Makoto Nagao

In example-based NLP, the computational cost of example retrieval is a severe problem, since the retrieval time increases in proportion to the number of examples in the database. This paper proposes a novel example retrieval method that avoids full retrieval of examples. The proposed method has the following three features: 1) it generates retrieval queries from similarities, 2) it retrieves examples efficiently through the tree structure of a thesaurus, and 3) it performs binary search along the subsumption ordering of retrieval queries. Example retrieval time decreases drastically with this method.


web intelligence | 2008

Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia

Jong-Hoon Oh; Daisuke Kawahara; Kiyotaka Uchimoto; Jun’ichi Kazama; Kentaro Torisawa

We present a novel method for discovering missing cross-language links between English and Japanese Wikipedia articles. We collect candidates for missing cross-language links, that is, pairs of English and Japanese Wikipedia articles that could be connected by cross-language links. Then we select the correct cross-language links among the candidates by using a classifier trained with various types of features. Our method has three desirable characteristics for discovering missing links. First, our method can discover cross-language links with high accuracy (92% precision at 78% recall). Second, the features used in the classifier are language-independent. Third, without relying on any external knowledge, we generate the features based on resources automatically obtained from Wikipedia. In this work, we discover approximately

Collaboration


Dive into Kiyotaka Uchimoto's collaborations.

Top Co-Authors

Masao Utiyama

National Institute of Information and Communications Technology

Hiromi Ozaku

Ministry of Posts and Telecommunications

Kentaro Torisawa

National Institute of Information and Communications Technology