Kanako Komiya

Tokyo University of Agriculture and Technology

Publication

Featured research published by Kanako Komiya.

SpringerPlus | 2013

Question answering system using Q & A site corpus: Query expansion and answer candidate evaluation

Kanako Komiya; Yuji Abe; Hajime Morita; Yoshiyuki Kotani

Question Answering (QA) is the task of answering natural language questions with adequate sentences. This paper proposes two methods to improve the performance of a QA system using a Q&A site corpus. The first method is for the relevant document retrieval module. We propose a modified mutual information measure for query expansion: it is calculated between two words in each question and a word in its answer in the Q&A site corpus, so that unsuitable words are not chosen. The second method is for the candidate answer evaluation module. We propose evaluating candidate answers using two measures together, i.e., the Web relevance score and the translation probability. The experiments were carried out using a Japanese Q&A site corpus. They revealed that the first proposed method was significantly better than the original method in both accuracy and MRR (Mean Reciprocal Rank), and that the second method was significantly better than the original methods in MRR.
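The MRR measure mentioned in this abstract can be illustrated with a minimal sketch. The function names and the toy rankings below are hypothetical and are not taken from the paper; the sketch only assumes the standard definition, in which each question contributes the reciprocal of the rank of its first correct answer (or 0 if no correct answer is returned).

```python
from typing import Hashable, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], correct: Set[Hashable]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none is ranked."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Set[Hashable]],
) -> float:
    """Average the reciprocal ranks over all questions."""
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, gold))
    return total / len(rankings)


# Toy data (made up): three questions with ranked answer candidates.
toy_rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
toy_gold = [{"a"}, {"z"}, {"r"}]
toy_mrr = mean_reciprocal_rank(toy_rankings, toy_gold)  # (1 + 1/3 + 0) / 3
```

In the toy data the first question is answered at rank 1, the second at rank 3, and the third not at all, so the three reciprocal ranks are 1, 1/3, and 0.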

International Conference on Technologies and Applications of Artificial Intelligence | 2010

Nested Monte-Carlo Search with AMAF Heuristic

Haruhiko Akiyama; Kanako Komiya; Yoshiyuki Kotani

Nested Monte-Carlo Search, which calls Monte-Carlo search in nested calls, has succeeded in the one-person game named Morpion Solitaire. The depth of the nesting is called a level, and the runtime increases exponentially when searching at higher levels. In the present study, the All-Moves-As-First heuristic is incorporated in Nested Monte-Carlo Search and the number of searches is reduced while maintaining a pseudo number of searches, in order to achieve higher-level search. Our system generated a new world record of 146 moves for computer search in the touching version of Morpion Solitaire by this

Meeting of the Association for Computational Linguistics | 2016

Comparison of Annotating Methods for Named Entity Corpora

Kanako Komiya; Masaya Suzuki; Tomoya Iwakura; Minoru Sasaki; Hiroyuki Shinnou

We compared two methods of annotating a corpus with non-expert annotators for the named entity (NE) recognition task: (1) revising the results of an existing NE recognizer and (2) annotating NEs entirely by hand. We investigated the annotation time, the degree of agreement, and the performance against the gold standard. As we had two annotators for each file under each method, we evaluated two kinds of performance: the performance averaged over the two annotators, and the performance when an annotation is deemed correct if either annotator is correct. The experiments revealed that semi-automatic annotation was faster and showed better agreement and higher performance on average. However, they also indicated that fully manual annotation should sometimes be used for texts whose genres are far from the recognizer's training data. In addition, experiments using the corpora annotated semi-automatically and fully manually as training data for machine learning indicated that, for some texts, the F-measures could be better with manual annotation than with semi-automatic annotation.

Proceedings of the Sixth Named Entity Workshop | 2016

Constructing a Japanese Basic Named Entity Corpus of Various Genres

Tomoya Iwakura; Kanako Komiya; Ryuichi Tachibana

This paper introduces a Japanese Named Entity (NE) corpus of various genres. We annotated 136 documents in the Balanced Corpus of Contemporary Written Japanese (BCCWJ) with the eight types of NE tags defined by Information Retrieval and Extraction Exercise. The NE corpus consists of documents from six genres, such as blogs, magazines, and white papers, and contains 2,464 NE tags in total. The corpus can be reproduced with the BCCWJ corpus and the tagging information obtained from https://sites.google.com/site/projectnextnlpne/en/.

International Conference of the Pacific Association for Computational Linguistics | 2015

Active Learning to Remove Source Instances for Domain Adaptation for Word Sense Disambiguation

Hiroyuki Shinnou; Yoshiyuki Onodera; Minoru Sasaki; Kanako Komiya

In this paper, an active learning method for domain adaptation in word sense disambiguation is presented. In general, active learning is an approach in which data with a high learning effect is selected from an unlabeled data set, labeled manually, and added to the training data. However, some data in the source domain (misleading data) can degrade classification precision, which propagates errors into the domain adaptation. When data labeled by active learning is added to the training data, an attempt is made to detect misleading data in the source domain and delete it from the training data. In this way, classification precision is improved compared to standard active learning.

International Joint Conference on Computer Science and Software Engineering | 2011

Classification of Japanese onomatopoeias using hierarchical clustering depending on contexts

Kanako Komiya; Yoshiyuki Kotani

Japanese has thousands of onomatopoeias, and they have recently started to attract considerable attention from researchers in natural language processing. Some onomatopoeias are semantically or phonologically similar to each other, and the choice among them sometimes makes a big difference to a Japanese sentence. In this paper, the authors classify Japanese onomatopoeias using single-link hierarchical clustering based on contexts such as surrounding words, their parts of speech, and their meanings, and explicitly show the relationships among them from the perspective of word senses.
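Single-link hierarchical clustering, the technique named in this abstract, repeatedly merges the two clusters whose closest members are nearest to each other. The following is a minimal sketch of that generic procedure only. The item names, the two-dimensional toy vectors, and the use of Euclidean distance are all assumptions made for illustration; the paper's actual context features and distance measure are not given here.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Set, Tuple


def single_link(
    points: Dict[str, Tuple[float, ...]], n_clusters: int
) -> Tuple[List[Set[str]], List[Tuple[float, List[str], List[str]]]]:
    """Agglomerative clustering where cluster distance is the minimum pairwise distance."""
    clusters: List[Set[str]] = [{name} for name in points]
    merges = []  # history of (distance, left members, right members)
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single link: distance between the closest pair across two clusters.
            d = min(dist(points[a], points[b]) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merges.append((d, sorted(clusters[i]), sorted(clusters[j])))
        clusters[i] = clusters[i] | clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters, merges


# Made-up context vectors for four hypothetical items.
toy_points = {"w1": (0.0, 0.0), "w2": (0.0, 1.0), "w3": (5.0, 5.0), "w4": (5.0, 6.5)}
toy_clusters, toy_merges = single_link(toy_points, n_clusters=2)
```

The recorded merge history is what a dendrogram would be drawn from; stopping at a chosen number of clusters is one simple way to read off groups.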

Knowledge-Based Systems | 2012

Nested Monte-Carlo Search with simulation reduction

Haruhiko Akiyama; Kanako Komiya; Yoshiyuki Kotani

The execution time of Nested Monte-Carlo Search for Morpion Solitaire, a single-player game, increases exponentially with the level of the nested search. We investigated two methods for reducing the execution time in order to enable a deeper nested search: simply reducing the number of lower-level searches at a constant rate, and applying the All-Moves-As-First heuristic to the reduction in the number of lower-level searches. Testing showed that the latter is more effective. Using it, we achieved a new world record of 146 moves for a computer search for the touching version of Morpion Solitaire.
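To make the exponential cost of nesting concrete, here is a minimal sketch of plain Nested Monte-Carlo Search on a toy game. It does not implement the simulation reduction or the All-Moves-As-First heuristic studied in the paper, and it does not model Morpion Solitaire. The toy game (choose a bit at each of DEPTH steps, score is the number of 1s) and all function names are assumptions made for illustration.

```python
import random
from typing import List, Tuple

DEPTH = 8  # number of moves in the toy game (assumption)
State = Tuple[int, ...]


def legal_moves(state: State) -> List[int]:
    return [] if len(state) >= DEPTH else [0, 1]


def play(state: State, move: int) -> State:
    return state + (move,)


def score(state: State) -> int:
    return sum(state)


def playout(state: State, rng: random.Random) -> Tuple[float, List[int]]:
    """Level 0: play random moves to the end and return (score, moves played)."""
    seq: List[int] = []
    while legal_moves(state):
        move = rng.choice(legal_moves(state))
        state = play(state, move)
        seq.append(move)
    return score(state), seq


def nested_search(state: State, level: int, rng: random.Random) -> Tuple[float, List[int]]:
    """At each step, try every move with a (level - 1) search and follow the best sequence seen."""
    if level == 0 or not legal_moves(state):
        return playout(state, rng)
    best_score = float("-inf")
    best_seq: List[int] = []
    played: List[int] = []
    while legal_moves(state):
        for move in legal_moves(state):
            s, seq = nested_search(play(state, move), level - 1, rng)
            if s > best_score:
                best_score, best_seq = s, played + [move] + seq
        move = best_seq[len(played)]  # follow the memorized best sequence
        state = play(state, move)
        played.append(move)
    return best_score, best_seq


toy_score, toy_seq = nested_search((), level=2, rng=random.Random(0))
```

Each level multiplies the work of the level below by roughly the number of steps times the number of moves per step, which is why reducing the number of lower-level searches matters for reaching higher levels.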

International Joint Conference on Computer Science and Software Engineering | 2011

Categorization of product pages depending on information on the Web

Naoto Sato; Kanako Komiya; Koji Fujimoto; Yoshiyuki Kotani

In this paper, the authors categorize product pages on the Web based on their information. We used the naive Bayes and complement naive Bayes classifiers, and tried four kinds of features to categorize the pages: all the words of the titles of the product pages, the nouns extracted from the titles, all the words of the titles and the descriptions of the product pages, and the nouns extracted from them. The experiments show that the product pages can be classified most accurately using only the nouns of the titles of the product pages. Moreover, the complement naive Bayes classifier outperformed the naive Bayes classifier.

ACM Transactions on Asian and Low-Resource Language Information Processing | 2018

Comparison of Methods to Annotate Named Entity Corpora

Kanako Komiya; Masaya Suzuki; Tomoya Iwakura; Minoru Sasaki; Hiroyuki Shinnou

The authors compared two methods for annotating a corpus for the named entity (NE) recognition task using non-expert annotators: (i) revising the results of an existing NE recognizer and (ii) manually annotating the NEs completely. The annotation time, degree of agreement, and performance were evaluated based on the gold standard. Because there were two annotators for one text for each method, two performances were evaluated: the average performance of both annotators and the performance when at least one annotator is correct. The experiments reveal that semi-automatic annotation is faster, achieves better agreement, and performs better on average. However, they also indicate that sometimes, fully manual annotation should be used for some texts whose document types are substantially different from the training data document types. In addition, the machine learning experiments using semi-automatic and fully manually annotated corpora as training data indicate that the F-measures could be better for some texts when manual instead of semi-automatic annotation was used. Finally, experiments using the annotated corpora for training as additional corpora show that (i) the NE recognition performance does not always correspond to the performance of the NE tag annotation and (ii) the system trained with the manually annotated corpus outperforms the system trained with the semi-automatically annotated corpus with respect to newswires, even though the existing NE recognizer was mainly trained with newswires.
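The two evaluation settings described in this abstract (the average over both annotators, and counting an item as correct when at least one annotator is correct) can be sketched as follows. This is only an illustration under stated assumptions: it uses per-token label accuracy for simplicity, whereas the paper's exact scoring against the gold standard is not specified here, and the function names and toy labels are made up.

```python
from typing import List


def accuracy(pred: List[str], gold: List[str]) -> float:
    """Fraction of positions where the predicted label matches the gold label."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def averaged_accuracy(ann_a: List[str], ann_b: List[str], gold: List[str]) -> float:
    """Mean of the two annotators' individual accuracies."""
    return (accuracy(ann_a, gold) + accuracy(ann_b, gold)) / 2


def either_correct_accuracy(ann_a: List[str], ann_b: List[str], gold: List[str]) -> float:
    """A position counts as correct if at least one annotator matches the gold label."""
    return sum(g in (a, b) for a, b, g in zip(ann_a, ann_b, gold)) / len(gold)


# Made-up labels for four tokens.
toy_gold = ["PER", "O", "LOC", "O"]
toy_ann_a = ["PER", "O", "O", "O"]
toy_ann_b = ["O", "O", "LOC", "ORG"]
toy_avg = averaged_accuracy(toy_ann_a, toy_ann_b, toy_gold)
toy_either = either_correct_accuracy(toy_ann_a, toy_ann_b, toy_gold)
```

The either-correct score is always at least as high as each annotator's individual score, so the gap between the two settings indicates how often the annotators' errors fall on different items.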

International Conference on Computational Linguistics | 2017

Domain Adaptation for Word Sense Disambiguation Using Word Embeddings

Kanako Komiya; Shota Suzuki; Minoru Sasaki; Hiroyuki Shinnou; Manabu Okumura

In this paper, we propose domain adaptation in word sense disambiguation (WSD) using word embeddings. The validity of word embeddings from a huge corpus, e.g., Wikipedia, for WSD has already been shown, but their validity in a domain adaptation framework has not been discussed before. In addition, if they are valid, how their effects differ according to the domain of the corpora is still unknown. Therefore, we investigate the performance of domain adaptation in WSD using word embeddings from the source, target, and general corpora and examine (1) whether the word embeddings are valid for domain adaptation of WSD and (2) if they are, how their effects vary with the domain of the corpora. The experiments using Japanese corpora revealed that the accuracy of WSD was highest when we used the word embeddings obtained from the target corpus.
