Tadayoshi Hara
National Institute of Informatics
Publication
Featured research published by Tadayoshi Hara.
intelligent user interfaces | 2012
Pascual Martínez-Gómez; Chen Chen; Tadayoshi Hara; Yoshinobu Kano; Akiko Aizawa
Applications using eye-tracking devices need a higher accuracy in recognition when the task reaches a certain complexity. Thus, more sophisticated methods to correct eye-tracking measurement errors are necessary to lower the penetration barrier of eye-trackers in unconstrained tasks. We propose to take advantage of the content or the structure of textual information displayed on the screen to build informed error-correction algorithms that generalize well. The idea is to use feature-based image registration techniques to perform a linear transformation of gaze coordinates to find a good alignment with text printed on the screen. In order to estimate the parameters of the linear transformation, three optimization strategies are proposed to avoid the problem of local minima, namely Monte Carlo, multi-resolution and multi-blur optimization. Experimental results show that a more precise alignment of gaze data with words on the screen can be achieved by using these methods, allowing a more reliable use of eye-trackers in complex and unconstrained tasks.
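As a rough illustration of the Monte Carlo strategy mentioned in this abstract, the sketch below samples candidate linear transformations at random and keeps the one that best aligns gaze points with text lines. It is a simplification under stated assumptions, not the authors' method: only a vertical scale and offset are fitted, the cost is the squared distance to the nearest text line rather than a feature-based image registration score, and the names `alignment_cost` and `monte_carlo_fit` and the sampling ranges are hypothetical.

```python
import random

def alignment_cost(points, line_ys, scale_y, offset_y):
    """Sum of squared vertical distances from each transformed
    gaze point to its nearest text line (assumed cost function)."""
    total = 0.0
    for _, y in points:
        ty = scale_y * y + offset_y
        total += min((ty - ly) ** 2 for ly in line_ys)
    return total

def monte_carlo_fit(points, line_ys, n_samples=2000, seed=0):
    """Random search over (scale_y, offset_y), starting from the identity.
    Sampling ranges are illustrative assumptions."""
    rng = random.Random(seed)
    best = (1.0, 0.0)
    best_cost = alignment_cost(points, line_ys, *best)
    for _ in range(n_samples):
        s = rng.uniform(0.8, 1.2)      # candidate vertical scale
        o = rng.uniform(-50.0, 50.0)   # candidate vertical offset in pixels
        c = alignment_cost(points, line_ys, s, o)
        if c < best_cost:
            best, best_cost = (s, o), c
    return best, best_cost

# Gaze points drift about 12 pixels below the three text lines.
line_ys = [100.0, 130.0, 160.0]
points = [(10.0, 112.0), (50.0, 141.0), (90.0, 173.0)]
best, cost = monte_carlo_fit(points, line_ys)
```

Because samples are drawn independently across the whole parameter range, the search does not get trapped in a local minimum the way a purely local descent can, which is the motivation the abstract gives for this family of strategies.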
pacific rim international conference on artificial intelligence | 2012
Pascual Martínez-Gómez; Tadayoshi Hara; Chen Chen; Kyohei Tomita; Yoshinobu Kano; Akiko Aizawa
Depending on the reading objective or task, text portions with certain linguistic features require more user attention to maximize the level of understanding. The goal is to build a predictor of these text areas. Our strategy consists of synthesizing image representations of linguistic features, which allows us to use natural language processing techniques while preserving the topology of the text. Eye-tracking technology allows us to precisely observe the identity of fixated words on a screen and their fixation duration. We then estimate the scaling factors of a linear combination of image representations of linguistic features that best explain certain gaze evidence, which leads us to a quantification of the influence of linguistic features on reading behavior. Finally, we can compute saliency maps that contain a prediction of the most interesting or cognitively demanding areas along the text. We achieve considerable prediction accuracy for the text areas that require more attention for users to maximize their understanding in certain reading tasks, suggesting that linguistic features are good signals for prediction.
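The estimation of scaling factors for a linear combination of feature maps can be sketched as a small least-squares problem. The abstract does not state which estimator was used, so ordinary least squares is an assumption here, as are the flattened list representation of each feature map and the names `fit_scaling_factors` and `saliency`. The sketch also assumes the feature maps are linearly independent.

```python
def fit_scaling_factors(feature_maps, gaze_map):
    """Weights w minimizing ||sum_k w_k * F_k - G||^2, solved through
    the normal equations with Gaussian elimination (assumed estimator)."""
    k = len(feature_maps)
    # Normal equations A w = b with A[i][j] = <F_i, F_j>, b[i] = <F_i, G>.
    A = [[sum(a * b for a, b in zip(feature_maps[i], feature_maps[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(a * g for a, g in zip(feature_maps[i], gaze_map))
         for i in range(k)]
    # Forward elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * k
    for i in range(k - 1, -1, -1):
        s = sum(A[i][j] * w[j] for j in range(i + 1, k))
        w[i] = (b[i] - s) / A[i][i]
    return w

def saliency(feature_maps, weights):
    """Weighted sum of feature maps, one value per text position."""
    n = len(feature_maps[0])
    return [sum(w * fm[i] for w, fm in zip(weights, feature_maps))
            for i in range(n)]

# Two toy feature maps over four word positions (illustrative values only).
features = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 1.0, 3.0]]
gaze = [2.0, 0.5, 4.5, 1.5]  # e.g. fixation durations per position
weights = fit_scaling_factors(features, gaze)
prediction = saliency(features, weights)
```

The fitted weights play the role of the scaling factors: a larger weight indicates that the corresponding linguistic feature explains more of the observed gaze evidence in this toy setup.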
document engineering | 2017
Kenichi Iwatsuki; Takeshi Sagara; Tadayoshi Hara; Akiko Aizawa
One of the issues in extracting natural language sentences from PDF documents is the identification of non-textual elements in a sentence. In this paper, we report our preliminary results on the identification of in-line mathematical expressions. We first construct a manually annotated corpus and then apply a conditional random field (CRF) to math-zone identification, using both layout features, such as font types, and linguistic features, such as context n-grams, obtained from PDF documents. Although our method is naive and uses a small amount of annotated training data, it achieved an F-measure of 88.95%, compared with 22.81% for existing math OCR software.
international conference on computational linguistics | 2014
Tadayoshi Hara; Goran Topić; Yusuke Miyao; Akiko Aizawa
Most conventional natural language processing (NLP) tools assume plain text as their input, whereas real-world documents display text more expressively, using a variety of layouts, sentence structures, and inline objects, among others. When NLP tools are applied to such text, users must first convert the text into the input/output formats of the tools. Moreover, this awkwardly obtained input typically does not allow the expected maximum performance of the NLP tools to be achieved. This work attempts to raise awareness of this issue using XML documents, where textual composition beyond plain text is given by tags. We propose a general framework for data conversion between XML-tagged text and plain text used as input/output for NLP tools and show that text sequences obtained by our framework can be much more thoroughly and efficiently processed by parsers than naively tag-removed text. These results highlight the significance of bridging real-world documents and NLP technologies.
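To make the conversion problem concrete, the sketch below strips tags from an XML fragment while recording the character span each element covered in the resulting plain text, so that tool output could later be mapped back onto the tags. This is closer to the naive tag-removal baseline with added offset bookkeeping than to the framework proposed in the paper, and the function name `xml_to_plain` is hypothetical.

```python
import xml.etree.ElementTree as ET

def xml_to_plain(xml_string):
    """Return (plain_text, spans), where spans lists (tag, start, end)
    character offsets of every element within plain_text."""
    root = ET.fromstring(xml_string)
    parts = []
    spans = []
    pos = 0

    def walk(elem):
        nonlocal pos
        start = pos
        if elem.text:
            parts.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            # Text following a child still belongs to the current element.
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        spans.append((elem.tag, start, pos))

    walk(root)
    return "".join(parts), spans

text, spans = xml_to_plain("<p>We use <i>deep</i> parsing.</p>")
```

Keeping the spans alongside the plain text means a parser can run on ordinary input while its token or constituent offsets remain traceable to the original markup.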
Trends in Parsing Technology | 2010
Tadayoshi Hara; Yusuke Miyao; Jun’ichi Tsujii
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing | 2012
Tadayoshi Hara; Daichi Mochihashi; Yoshinobu Kano; Akiko Aizawa
international conference on computational linguistics | 2012
Pascual Martínez-Gómez; Tadayoshi Hara; Akiko Aizawa
TAG+ | 2002
Tadayoshi Hara; Yusuke Miyao; Jun’ichi Tsujii
international joint conference on natural language processing | 2011
Tadayoshi Hara; Takuya Matsuzaki; Yusuke Miyao; Jun’ichi Tsujii
Proceedings of the Annual Conference of the Japanese Society for Artificial Intelligence | 2013
Shunsuke Ohashi; Tadayoshi Hara; Akiko Aizawa