Koichi Kise | Researchain

Publication

Featured research published by Koichi Kise.

Computer Vision and Image Understanding | 1998

Segmentation of Page Images Using the Area Voronoi Diagram

Koichi Kise; Akinori Sato; Motoi Iwata

This paper presents a method of page segmentation based on the approximated area Voronoi diagram. The characteristics of the proposed method are as follows: (1) The Voronoi diagram enables us to obtain the candidates of boundaries of document components from page images with non-Manhattan layout and a skew. (2) The candidates are utilized to estimate the intercharacter and interline gaps without the use of domain-specific parameters to select the boundaries. From the experimental results for 128 images with non-Manhattan layout and the skew of 0°~45° as well as 98 images with Manhattan layout, we have confirmed that the method is effective for extraction of body text regions, and it is as efficient as other methods based on connected component analysis.

document analysis systems | 2006

Use of affine invariants in locally likely arrangement hashing for camera-based document image retrieval

Tomohiro Nakai; Koichi Kise; Masakazu Iwamura

Camera-based document image retrieval is a task of searching document images from the database based on query images captured using digital cameras. For this task, it is required to solve the problem of “perspective distortion” of images, as well as to establish a way of matching document images efficiently. To solve these problems we have proposed a method called Locally Likely Arrangement Hashing (LLAH) which is characterized by both the use of a perspective invariant to cope with the distortion and the efficiency: LLAH only requires O(N) time where N is the number of feature points that describe the query image. In this paper, we introduce into LLAH an affine invariant instead of the perspective invariant so as to improve its adjustability. Experimental results show that the use of the affine invariant enables us to improve either the accuracy from 96.2% to 97.8%, or the retrieval time from 112 msec./query to 75 msec./query by selecting parameters of processing.
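
To make the idea of an affine invariant concrete, here is a minimal sketch, not the paper's implementation. It relies on a standard geometric fact: an affine map scales every area by the same factor, so the ratio of two triangle areas built from four coplanar points is unchanged. Which two triangles to use, and the names triangle_area, affine_invariant and apply_affine, are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Unsigned area of triangle abc, computed from the 2D cross product."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2.0


def affine_invariant(a: Point, b: Point, c: Point, d: Point) -> float:
    """Area ratio area(a, c, d) / area(a, b, c) for four coplanar points.

    The choice of triangles is an illustrative assumption; any ratio of
    two triangle areas over the same points is affine invariant.
    """
    return triangle_area(a, c, d) / triangle_area(a, b, c)


def apply_affine(p: Point, m: Sequence[Sequence[float]], t: Point) -> Point:
    """Map p to m * p + t for a 2x2 matrix m and a translation t."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])


# Four feature points and an arbitrary non-singular affine distortion.
points = [(0.0, 0.0), (4.0, 1.0), (3.0, 5.0), (-1.0, 3.0)]
matrix = ((1.5, 0.4), (-0.3, 0.8))
shift = (10.0, -2.0)
warped = [apply_affine(p, matrix, shift) for p in points]

before = affine_invariant(*points)
after = affine_invariant(*warped)
print(before, after)  # the two values agree up to floating-point error
```

Because the value survives the distortion, it can in principle be quantized and used as part of a hash key, so that a distorted query and the stored page fall into the same bin.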

augmented human international conference | 2014

In the blink of an eye: combining head motion and eye blink frequency for activity recognition with Google Glass

Shoya Ishimaru; Kai Kunze; Koichi Kise; Jens Weppner; Andreas Dengel; Paul Lukowicz; Andreas Bulling

We demonstrate how information about eye blink frequency and head motion patterns derived from Google Glass sensors can be used to distinguish different types of high level activities. While it is well known that eye blink frequency is correlated with user activity, our aim is to show that (1) eye blink frequency data from an unobtrusive, commercial platform which is not a dedicated eye tracker is good enough to be useful and (2) that adding head motion patterns information significantly improves the recognition rates. The method is evaluated on a data set from an experiment containing five activity classes (reading, talking, watching TV, mathematical problem solving, and sawing) of eight participants showing 67% recognition accuracy for eye blinking only and 82% when extended with head motion patterns.

international conference on document analysis and recognition | 2013

The Wordometer -- Estimating the Number of Words Read Using Document Image Retrieval and Mobile Eye Tracking

Kai Kunze; Hitoshi Kawaichi; Kazuyo Yoshimura; Koichi Kise

We introduce the Wordometer, a novel method to estimate the number of words a user reads using a mobile eye tracker and document image retrieval. We present a reading detection algorithm which works with over 91% accuracy over 10 test subjects using 10-fold cross validation. We implement two algorithms to estimate the read words using a line break detector. A simple version gives an average error rate of 13.5% for 9 users over 10 documents. A more sophisticated word count algorithm based on support vector regression with an RBF kernel reaches an average error rate of only 8.2% (6.5% if one test subject with abnormal behavior is excluded). The achieved error rates are comparable to pedometers that count our steps in our daily life. Thus, we believe the Wordometer can be used as a step counter for the information we read to make our knowledge life healthier.
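
The abstract does not spell out how the line break detector or the simple estimator work, so the following is only a rough sketch under stated assumptions: a line break is taken to be a large leftward jump between consecutive horizontal gaze positions, and the word count is the number of lines read times an average words-per-line figure. The function names, the min_jump threshold and the words_per_line value are all hypothetical.

```python
from typing import List


def count_line_breaks(fixation_x: List[float], min_jump: float) -> int:
    """Count large leftward jumps between consecutive fixations.

    Assumption: a return sweep to the start of the next line shows up as
    a drop of at least min_jump in the horizontal gaze coordinate.
    """
    return sum(
        1
        for prev, cur in zip(fixation_x, fixation_x[1:])
        if prev - cur >= min_jump
    )


def estimate_words(fixation_x: List[float], min_jump: float,
                   words_per_line: float) -> float:
    """Estimate words read as (line breaks + 1) * average words per line."""
    if not fixation_x:
        return 0.0
    lines_read = count_line_breaks(fixation_x, min_jump) + 1
    return lines_read * words_per_line


# Hypothetical horizontal fixation positions (pixels) over three lines.
xs = [50, 120, 200, 280, 60, 130, 210, 290, 55, 140]
breaks = count_line_breaks(xs, min_jump=150)
words = estimate_words(xs, min_jump=150, words_per_line=9.5)
print(breaks, words)
```

A fixed words-per-line average ignores short lines and skipped text, which is one plausible reason a learned regressor could do better than this kind of simple count.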

international conference on document analysis and recognition | 2011

Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH

Kazutaka Takeda; Koichi Kise; Masakazu Iwamura

This paper presents a real-time document image retrieval method for a large-scale database with Locally Likely Arrangement Hashing (LLAH). In general, when a database is scaled up, a large amount of memory is required and retrieval accuracy drops due to insufficient discrimination power of features. To solve these problems, we propose three improvements: memory reduction by sampling feature points, improvement of discrimination power by increasing the number of feature dimensions and stabilizing features by reducing redundancy. From the experimental results, we have confirmed that the proposed method realizes 50% memory reduction, and achieves 99.4% accuracy and 38 ms processing time for a database of 10 million pages.

human factors in computing systems | 2013

Towards inferring language expertise using eye tracking

Kai Kunze; Hitoshi Kawaichi; Kazuyo Yoshimura; Koichi Kise

We present initial work towards recognizing reading activities. This paper describes our efforts to detect the English skill level of a user and infer which words are difficult for them to understand. We present an initial study of 5 students and show our findings regarding the skill level assessment. We explain a method to spot difficult words. Eye tracking is a promising technology to examine and assess a user's skill level.

international conference on pattern recognition | 1996

Page segmentation based on thinning of background

Koichi Kise; Osamu Yanagida; Shinobu Takamatsu

This paper presents a new method of page segmentation based on the analysis of background (white areas). The proposed method is capable of segmenting pages with non-rectangular layout as well as with various angles of skew. The characteristics of the method are as follows: (1) thinning of the background enables us to represent white areas of any shape as connected thin lines or chains and the robustness for tilted page images is also achieved by the representation; and (2) based on this representation, the task of page segmentation is defined as to find the loops enclosing printed areas. The task is achieved by eliminating unnecessary chains using not only a feature of white areas, but also a feature of black areas divided by a chain. Based on the experimental results and the comparison with previous methods, we discuss the advantages and limitations of the proposed method.

international conference on computer vision | 2013

What is the Most Efficient Way to Select Nearest Neighbor Candidates for Fast Approximate Nearest Neighbor Search?

Masakazu Iwamura; Tomokazu Sato; Koichi Kise

Approximate nearest neighbor search (ANNS) is a basic and important technique used in many tasks such as object recognition. It involves two processes: selecting nearest neighbor candidates and performing a brute-force search of these candidates. Only the former, though, has scope for improvement. In most existing methods, it approximates the space by quantization. It then calculates all the distances between the query and all the quantized values (e.g., clusters or bit sequences), and selects a fixed number of candidates close to the query. The performance of the method is evaluated based on accuracy as a function of the number of candidates. This evaluation seems rational but poses a serious problem; it ignores the computational cost of the process of selection. In this paper, we propose a new ANNS method that takes into account costs in the selection process. Whereas existing methods employ computationally expensive techniques such as comparative sort and heap, the proposed method does not. This realizes a significantly more efficient search. We have succeeded in reducing computation times by one-third compared with the state-of-the-art on an experiment using 100 million SIFT features.
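
The conventional two-stage pipeline that the abstract describes can be sketched as follows. This illustrates the baseline (quantize, rank the quantized values by distance, then brute-force the candidates), not the proposed method, whose details the abstract does not give. The centroids are supplied by hand rather than learned, candidates are selected as a fixed number of clusters, and the names build_index, search and n_probe are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def build_index(data: List[Vector],
                centroids: List[Vector]) -> Dict[int, List[int]]:
    """Quantization step: assign each data vector to its nearest centroid."""
    index: Dict[int, List[int]] = {c: [] for c in range(len(centroids))}
    for i, vec in enumerate(data):
        nearest = min(range(len(centroids)),
                      key=lambda c: math.dist(vec, centroids[c]))
        index[nearest].append(i)
    return index


def search(query: Vector, data: List[Vector], centroids: List[Vector],
           index: Dict[int, List[int]], n_probe: int) -> int:
    """Return the index of the closest vector among the selected candidates."""
    # Stage 1: candidate selection. Ranking every cluster with a comparison
    # sort is the kind of selection cost the abstract says is overlooked.
    ranked = sorted(range(len(centroids)),
                    key=lambda c: math.dist(query, centroids[c]))
    candidates = [i for c in ranked[:n_probe] for i in index[c]]
    if not candidates:
        raise ValueError("no candidates in the probed clusters")
    # Stage 2: brute-force search restricted to the candidates.
    return min(candidates, key=lambda i: math.dist(query, data[i]))


data = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8), (9.0, 0.5)]
centroids = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
index = build_index(data, centroids)
best = search((4.9, 5.2), data, centroids, index, n_probe=1)
print(index, best)
```

Raising n_probe trades speed for accuracy: more clusters are scanned in stage 2, and the sort in stage 1 is paid regardless of how many candidates are finally kept.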

international conference on document analysis and recognition | 2009

Real-Time Retrieval for Images of Documents in Various Languages Using a Web Camera

Tomohiro Nakai; Koichi Kise; Masakazu Iwamura

We propose a real-time retrieval method for document images in various languages. In this method, queries are images of documents captured by a web-camera. The document images corresponding to the queries are retrieved from the document image database in real time. Since we have already proposed a document image retrieval method for English documents, the proposed method is an extension for retrieval of documents in various languages. In the previous English document image retrieval method, only centroids of word regions are used as feature points. Therefore it cannot be applied to some languages including Japanese and Chinese due to no separation between words and periodic arrangements of characters. In the proposed method, additional features are introduced to realize real-time retrieval for document images in various languages.

ubiquitous computing | 2015

Quantifying reading habits: counting how many words you read

Kai Kunze; Katsutoshi Masai; Masahiko Inami; Ömer Sacakli; Marcus Liwicki; Andreas Dengel; Shoya Ishimaru; Koichi Kise

Reading is a very common learning activity; many people do it every day, even while standing in the subway or waiting in the doctor's office. However, we know little about our everyday reading habits. Quantifying them enables us to gain more insight into better language skills, more effective learning and ultimately critical thinking. This paper presents a first contribution towards establishing a reading log, tracking how much reading you are doing at what time. We present an approach capable of estimating the words read by a user and evaluate it in a user-independent approach over 3 experiments with 24 users over 5 different devices (e-ink reader, smartphone, tablet, paper, computer screen). We achieve an error rate as low as 5% (using a medical electrooculography system) or 15% (based on eye movements captured by optical eye tracking) over a total of 30 hours of recording. Our method works for both an optical eye tracking system and an electrooculography system. We provide first indications that the method also works on soon-to-be commercially available smart glasses.
