Publication


Featured research published by Daisuke Takuma.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2013

Faster upper bounding of intersection sizes

Daisuke Takuma; Hiroki Yanagisawa

There is a long history of developing efficient algorithms for set intersection, which is a fundamental operation in information retrieval and databases. In this paper, we describe a new data structure, a Cardinality Filter, to quickly compute an upper bound on the size of a set intersection. Knowing an upper bound on the size can accelerate many applications, such as top-k query processing in text mining. Given finite sets A and B, the expected computation time for the upper bound on the intersection size |A ∩ B| is O((|A| + |B|) / w), where w is the machine word length. This is much faster than the current best algorithm for the exact intersection, which runs in O((|A| + |B|) / √w + |A ∩ B|) expected time. Our performance studies show that our implementations of Cardinality Filters are 2 to 10 times faster than existing set intersection algorithms, and that the time for a top-k query in a text mining application can be reduced by half.
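
To illustrate why an upper bound on an intersection size can be cheaper than the exact intersection, here is a minimal sketch. It is not the paper's Cardinality Filter; it assumes a much simpler scheme in which each set is summarized by per-bucket element counts, and the names `bucket_counts`, `intersection_upper_bound`, and `num_buckets` are hypothetical. Every element shared by both sets lands in the same bucket on both sides, so the sum of per-bucket minimums can never be smaller than the true intersection size.

```python
def bucket_counts(items, num_buckets):
    """Summarize a set by counting how many distinct elements hash to each bucket.

    Illustrative assumption: a plain hash-bucket summary, not the paper's structure.
    """
    counts = [0] * num_buckets
    for x in set(items):  # deduplicate so counts reflect set semantics
        counts[hash(x) % num_buckets] += 1
    return counts


def intersection_upper_bound(counts_a, counts_b):
    """Return an upper bound on |A ∩ B| from two bucket summaries.

    Within one bucket, the shared elements cannot outnumber the elements
    that either set placed there, so min(a, b) bounds that bucket's share.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("summaries must use the same number of buckets")
    return sum(min(a, b) for a, b in zip(counts_a, counts_b))


if __name__ == "__main__":
    a = set(range(0, 100, 2))   # even numbers below 100
    b = set(range(0, 100, 3))   # multiples of 3 below 100
    summary_a = bucket_counts(a, 64)
    summary_b = bucket_counts(b, 64)
    bound = intersection_upper_bound(summary_a, summary_b)
    print("upper bound:", bound, "exact:", len(a & b))
```

The summaries are built once per set and can be reused across queries. In a top-k setting, a candidate whose bound already falls below the current k-th best score can be skipped without computing its exact intersection.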


Meeting of the Association for Computational Linguistics | 2006

Phoneme-to-Text Transcription System with an Infinite Vocabulary

Shinsuke Mori; Daisuke Takuma; Gakuto Kurata

The noisy channel model approach has been successfully applied to various natural language processing tasks. Currently, the main research focus of this approach is on adaptation methods, that is, how to capture the characteristics of words and expressions in a target domain given example sentences in that domain. As a solution, we describe a method that enlarges the vocabulary of a language model to an almost infinite size and captures the context information of the added words. The new method is especially suitable for languages in which words are not delimited by whitespace. We applied our method to a phoneme-to-text transcription task in Japanese and reduced the errors in the results of an existing method by about 10%.


Archive | 2008

Word boundary probability estimating, probabilistic language model building, kana-kanji converting, and unknown word model building

Shinsuke Mori; Daisuke Takuma


Archive | 2010

Creating a terms dictionary with named entities or terminologies included in text data

Hiroki Oya; Daisuke Takuma; Hirobumi Toyoshima


Archive | 2005

Document data retrieval and reporting

Hiroshi Nomiyama; Daisuke Takuma


Archive | 2008

System of effectively searching text for keyword, and method thereof

Daisuke Takuma; Issei Yoshida; Yuta Tsuboi


Archive | 2009

Information search system, method and program

Daisuke Takuma; Yuta Tsuboi


Archive | 2006

Character string processing method, apparatus, and program

Yohei Ikawa; Hiroshi Kanayama; Daisuke Takuma


Archive | 2005

Character string processing method and device, and program

Yohei Ikawa; Hiroshi Kanayama; Daisuke Takuma


Archive | 2008

System, method and program for creating index for database

Daisuke Takuma; Issei Yoshida
