Masatsugu Tonoike
Kyoto University
Publication
Featured research published by Masatsugu Tonoike.
North American Chapter of the Association for Computational Linguistics | 2009
Eiji Aramaki; Yasuhide Miura; Masatsugu Tonoike; Tomoko Ohkuma; Hiroshi Masuichi; Kazuhiko Ohe
With the rapidly growing use of electronic health records, the possibility of large-scale clinical information extraction has drawn much attention. Extracting this information is not easy, however, because the reports are written in natural language. To address this problem, this paper presents a system that converts a medical text into a table structure. The system's core technologies are (1) medical event recognition modules and (2) a negative event identification module that judges whether an event actually occurred. For the latter module, this paper also proposes an SVM-based classifier that uses syntactic information. Experimental results demonstrate empirically that syntactic information can contribute to the method's accuracy.
International Conference on the Computer Processing of Oriental Languages | 2006
Takehito Utsuro; Mitsuhiro Kida; Masatsugu Tonoike; Satoshi Sato
This paper proposes a method for estimating the domain specificity of technical terms using the Web. The method assumes that, for a given technical domain, a list of known technical terms of the domain is available. Technical documents of the domain are collected through a Web search engine and then used to generate a vector space model for the domain. The domain specificity of a target term is estimated from the distribution of the domains of the target term's sample pages. We apply this technique to the task of discovering novel technical terms that are not included in any existing lexicon of technical terms for the domain. Out of 1,000 randomly selected technical term candidates per domain, we discovered about 100 to 200 novel technical terms.
WAC '06 Proceedings of the 2nd International Workshop on Web as Corpus | 2006
Masatsugu Tonoike; Mitsuhiro Kida; Toshihiro Takagi; Yasuhiro Sasaki; Takehito Utsuro; Satoshi Sato
This paper studies issues related to the compilation of a bilingual lexicon for technical terms. In the task of estimating bilingual term correspondences of technical terms, it is usually rather difficult to find an existing corpus for the domain of such technical terms. In this paper, we adopt an approach of collecting a corpus for the domain of such technical terms from the Web. As a method of translation estimation for technical terms, we employ a compositional translation estimation technique. This paper focuses on quantitatively comparing variations of the components in the scoring functions of compositional translation estimation. Through experimental evaluation, we show that the domain/topic-specific corpus contributes toward improving the performance of the compositional translation estimation.
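The general idea of compositional translation estimation described in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy dictionary and its scores, the function names `compositional_candidates` and `rank_translations`, and the scoring function (product of constituent dictionary scores, multiplied by one plus the candidate's frequency in a domain corpus). The paper compares several variations of the scoring components; this shows only one hypothetical combination.

```python
from itertools import product


def compositional_candidates(constituents, dictionary):
    """Build translation candidates by combining constituent translations.

    `dictionary` maps a source constituent to a list of
    (translation, score) pairs. The scores are illustrative only.
    If any constituent has no entry, no candidate is produced.
    """
    options = [dictionary.get(c, []) for c in constituents]
    candidates = []
    for combo in product(*options):
        words = [word for word, _ in combo]
        score = 1.0
        for _, constituent_score in combo:
            score *= constituent_score
        candidates.append((" ".join(words), score))
    return candidates


def rank_translations(constituents, dictionary, domain_corpus):
    """Rank candidates by dictionary score and domain corpus frequency.

    Hypothetical scoring: dictionary score * (1 + corpus frequency).
    """
    corpus = domain_corpus.lower()
    ranked = []
    for candidate, dict_score in compositional_candidates(constituents, dictionary):
        frequency = corpus.count(candidate.lower())
        ranked.append((candidate, dict_score * (1 + frequency)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Toy data, invented for this example.
toy_dictionary = {
    "応用": [("applied", 0.6), ("application", 0.4)],
    "行動": [("behavior", 0.7), ("action", 0.3)],
}
toy_corpus = "Applied behavior analysis is used in therapy. Applied behavior studies."

ranking = rank_translations(["応用", "行動"], toy_dictionary, toy_corpus)
```

In this sketch the domain corpus acts as a validator: a candidate that actually occurs in domain text is promoted over one that is merely plausible from the dictionary, which mirrors the abstract's claim that a domain/topic-specific corpus helps.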
Asia Information Retrieval Symposium | 2006
Takehito Utsuro; Mitsuhiro Kida; Masatsugu Tonoike; Satoshi Sato
This paper proposes a method for estimating the domain specificity of technical terms using the Web. The method assumes that, for a given technical domain, a list of known technical terms of the domain is available. Technical documents of the domain are collected through a Web search engine and then used to generate a vector space model for the domain. The domain specificity of a target term is estimated from the distribution of the domains of the target term's sample pages. Experimental evaluation shows that the proposed method achieved roughly 90% precision and recall.
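The estimation procedure outlined in the abstract above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's method: the domain model is taken to be a single term-frequency centroid, a sample page counts as in-domain when its cosine similarity to that centroid reaches a hypothetical `threshold`, and the specificity score is the fraction of in-domain sample pages. Web search collection is replaced by plain lists of strings, and all function names are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; real pages need more care)."""
    return text.lower().split()


def build_domain_model(domain_documents):
    """Sum term frequencies over the collected domain documents."""
    model = Counter()
    for document in domain_documents:
        model.update(tokenize(document))
    return model


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def domain_specificity(domain_documents, sample_pages, threshold=0.3):
    """Fraction of a term's sample pages judged to belong to the domain.

    `threshold` is a hypothetical cut-off chosen for this sketch.
    """
    if not sample_pages:
        return 0.0
    model = build_domain_model(domain_documents)
    in_domain = sum(
        1 for page in sample_pages
        if cosine(Counter(tokenize(page)), model) >= threshold
    )
    return in_domain / len(sample_pages)


# Toy data, invented for this example.
toy_domain_docs = ["transistor circuit voltage", "circuit diode voltage current"]
toy_sample_pages = ["voltage circuit design", "cooking pasta recipe"]

specificity = domain_specificity(toy_domain_docs, toy_sample_pages)
```

A term whose sample pages mostly resemble the domain documents gets a score near 1, while a term of general use, whose pages are spread across unrelated topics, gets a score near 0.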
Second International Conference on Informatics Research for Development of Knowledge Society Infrastructure (ICKS'07) | 2007
Takehito Utsuro; Masatsugu Tonoike; Satoshi Sato; Sadao Kurohashi
This paper studies how to exploit the domains of Web texts in the task of automatically compiling a bilingual technical term lexicon. First, we propose how to judge whether each technical term candidate is actually a term of the given domain or a domain-independent term of general use. Next, we propose how to estimate the translation of a technical term, given its domain
Studies in health technology and informatics | 2010
Eiji Aramaki; Yasuhide Miura; Masatsugu Tonoike; Tomoko Ohkuma; Hiroshi Masuichi; Kayo Waki; Kazuhiko Ohe
Conference of the European Chapter of the Association for Computational Linguistics | 2006
Xavier Robitaille; Yasuhiro Sasaki; Masatsugu Tonoike; Satoshi Sato; Takehito Utsuro
ROMAND '04 Proceedings of the 3rd Workshop on RObust Methods in Analysis of Natural Language Data | 2004
Masatsugu Tonoike; Takehito Utsuro; Satoshi B. Sato
Systems and Computers in Japan | 2007
Mitsuhiro Kida; Masatsugu Tonoike; Takehito Utsuro; Satoshi Sato
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010) | 2010
Yasuhide Miura; Eiji Aramaki; Tomoko Ohkuma; Masatsugu Tonoike; Daigo Sugihara; Hiroshi Masuichi; Kazuhiko Ohe