Akiko Aizawa

National Institute of Informatics

Publication

Featured research published by Akiko Aizawa.

Evolutionary Computation | 1994

Scheduling of genetic algorithms in a noisy environment

Akiko Aizawa; Benjamin W. Wah

In this paper, we develop new methods for adjusting configuration parameters of genetic algorithms operating in a noisy environment. Such methods are related to the scheduling of resources for tests performed in genetic algorithms. Assuming that the population size is given, we address two problems related to the design of efficient scheduling algorithms that are specifically important in noisy environments. First, we study the duration-scheduling problem, which is related to dynamically setting the duration of each generation. Second, we study the sample-allocation problem, which entails the adaptive determination of the number of evaluations taken from each candidate in a generation. In our approach, we model the search process as a statistical selection process and derive equations useful for these problems. Our results show that our adaptive procedures improve the performance of genetic algorithms over that of commonly used static ones.

International Workshop on Challenges in Web Information Retrieval and Integration | 2005

A Fast Linkage Detection Scheme for Multi-Source Information Integration

Akiko Aizawa; Keizo Oyama

Record linkage refers to techniques for identifying records associated with the same real-world entities. Record linkage is not only crucial in integrating multi-source databases that have been generated independently, but is also considered to be one of the key issues in integrating heterogeneous Web resources. However, when targeting large-scale data, the cost of enumerating all the possible linkages often becomes impracticably high. Against this background, this paper proposes a fast and efficient method for linkage detection. The proposed approach has two features. First, it exploits a suffix array structure that enables linkage detection using variable-length n-grams. Second, it dynamically generates blocks of possibly associated records using ‘blocking keys’ extracted from already known reliable linkages. The paper also reports results from our preliminary experiments, in which the proposed method was applied to the integration of four bibliographic databases that scale up to more than 10 million records.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000

The feature quantity: an information theoretic perspective of Tfidf-like measures

Akiko Aizawa

The feature quantity, a quantitative representation of specificity introduced in this paper, is based on an information theoretic perspective of co-occurrence events between terms and documents. Mathematically, the feature quantity is defined as a product of probability and information, and maintains a good correspondence with the tfidf-like measures popularly used in today's IR systems. In this paper, we present a formal description of the feature quantity, as well as some illustrative examples of applying such a quantity to different types of information retrieval tasks: representative term selection and text categorization.

Intelligent User Interfaces | 2014

Recognition of understanding level and language skill using measurements of reading behavior

Pascual Martínez-Gómez; Akiko Aizawa

The reading act is an intimate and elusive process that is important to understand. Psycholinguists have long studied the effects of task, personal or document characteristics on reading behavior. An essential factor in the success of those studies lies in the capability of analyzing eye-movements. These studies aim to recognize causal effects on patterns of eye-movements, by contriving variations in task, personal or document characteristics. In this work, we follow the opposite direction. We present a formal framework to recognize readers' level of understanding and language skill given measurements of reading behavior via eye-gaze data. We show significant error reductions in recognizing these attributes and provide a detailed study of the most discriminative features.

IEEE Transactions on Knowledge and Data Engineering | 1995

Genetics-based learning of new heuristics: rational scheduling of experiments and generalization

Benjamin W. Wah; Arthur Ieumwananonthachai; Lon-Chan Chu; Akiko Aizawa

We present new methods for the automated learning of heuristics in knowledge-lean applications and for finding heuristics that can be generalized to unlearned domains. These applications lack domain knowledge for credit assignment; hence, operators for composing new heuristics are generally model free, domain independent, and syntactic in nature. The operators we have used are genetics based; examples include mutation and crossover. Learning is based on a generate-and-test paradigm that maintains a pool of competing heuristics, tests them to a limited extent, creates new ones from those that performed well in the past, and prunes poor ones from the pool. We have studied three important issues in learning better heuristics: anomalies in performance evaluation; rational scheduling of limited computational resources in testing candidate heuristics in single-objective as well as multiobjective learning; and finding heuristics that can be generalized to unlearned domains. We show experimental results in learning better heuristics for: process placement for distributed memory multicomputers, node decomposition in a branch and bound search, generation of test patterns in VLSI circuit testing, and VLSI cell placement and routing.

Journal of Parallel and Distributed Computing | 1992

Intelligent process mapping through systematic improvement of heuristics

Arthur Ieumwananonthachai; Akiko Aizawa; Steven R. Schwartz; Benjamin W. Wah; Jerry C. Yan

In this paper, we present the design of a system for automatically learning and evaluating new heuristic methods that can be used to map a set of communicating processes on a network of computers. Our learning system is based on testing a population of competing heuristic methods within a fixed time constraint. We develop and analyze various resource scheduling strategies based on a statistical model that trades between the number of new heuristic methods considered and the amount of testing performed on each. We implement a prototype learning system (TEACHER 4.1) for learning new heuristic methods used in post-game analysis, a system that iteratively generates and refines mappings of a set of communicating processes on a network of computers. Our performance results show that a significant improvement can be obtained by a systematic exploration of the space of possible heuristic methods.

International Conference on Computational Linguistics | 2000

Automatic thesaurus generation through multiple filtering

Kyo Kageura; Keita Tsuji; Akiko Aizawa

In this paper, we propose a method of generating bilingual keyword clusters or thesauri from parallel or comparable bilingual corpora. The method combines morphological and lexical processing, bilingual word alignment, and graph-theoretic cluster generation. An experiment shows that the method is promising.

Polibits | 2011

Contextual Analysis of Mathematical Expressions for Advanced Mathematical Search

Keisuke Yokoi; Minh-Quoc Nghiem; Yuichiroh Matsubayashi; Akiko Aizawa

We found a way to use mathematical search to provide better navigation for reading papers on computers. Since the superficial information of mathematical expressions is ambiguous, it is necessary to consider not only the mathematical expressions but also the text around them. We show how to extract natural language descriptions, such as variable names or function definitions, that refer to mathematical expressions, and report various experimental results. We first define an extraction task and construct by hand a reference dataset of 100 Japanese scientific papers. We then propose two methods for the extraction task, one based on pattern matching and one on machine learning. The effectiveness of the proposed methods is shown through experiments using the reference set.

World Congress on Computational Intelligence | 1994

Evolving SSE: a Stochastic Schemata Exploiter

Akiko Aizawa

This paper proposes the Stochastic Schemata Exploiter (SSE), a new population-oriented search scheme which employs a schemata processing mechanism similar to the one used in a genetic algorithm (GA). Compared with the basic GA, SSE has the following two features. First, SSE makes greater use of the local search ability of schemata processing, while GA puts more emphasis on global search ability. Second, SSE reduces the number of control parameters, which are in many cases problem dependent and must be determined heuristically. Because of these features, SSE is more suitable than the basic GA for combination with other advanced schemes such as niching, or for incorporation into hybrid search. In the paper, the advantage of SSE is demonstrated using GA-easy and GA-hard test functions.

Journal of the Association for Information Science and Technology | 2014

Adding Twitter-specific features to stylistic features for classifying tweets by user type and number of retweets

Yui Arakawa; Akihiro Kameda; Akiko Aizawa; Takafumi Suzuki

Recently, Twitter has received much attention, both from the general public and researchers, as a new method of transmitting information. Among others, the number of retweets (RTs) and user types are the two important items of analysis for understanding the transmission of information on Twitter. To analyze this point, we applied text classification and feature extraction experiments using random forests machine learning with conventional stylistic and Twitter‐specific features. We first collected tweets from 40 accounts with a high number of followers and created tweet texts from 28,756 tweets. We then conducted 15 types of classification experiments using a variety of combinations of features such as function words, speech terms, Twitter's descriptive grammar, and information roles. We deliberately observed the effects of features on classification performance. The results indicated that classification per user showed the best performance. Furthermore, we observed that certain features had a greater impact on classification. In the experiments that assessed the level of RT quantity, information roles had an impact. In the user experiments, important features, such as the honorific postpositional particle and auxiliary verbs such as “desu” and “masu,” had an impact. This research clarifies the features that are useful for categorizing tweets according to the number of RTs and user types.
