Publication


Featured research published by Keizo Oyama.


International Workshop on Challenges in Web Information Retrieval and Integration | 2005

A Fast Linkage Detection Scheme for Multi-Source Information Integration

Akiko Aizawa; Keizo Oyama

Record linkage refers to techniques for identifying records associated with the same real-world entities. Record linkage is not only crucial in integrating multi-source databases that have been generated independently, but is also considered to be one of the key issues in integrating heterogeneous Web resources. However, when targeting large-scale data, the cost of enumerating all the possible linkages often becomes impracticably high. Based on this background, this paper proposes a fast and efficient method for linkage detection. The features of the proposed approach are: first, it exploits a suffix array structure that enables linkage detection using variable length n-grams. Second, it dynamically generates blocks of possibly associated records using ‘blocking keys’ extracted from already known reliable linkages. The results from our preliminary experiments where the proposed method was applied to the integration of four bibliographic databases, which scale up to more than 10 million records, are also reported in the paper.
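The blocking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain dictionary of fixed-length character n-grams rather than a suffix array, and the function names, the sample records, and the choice of blocking keys are all hypothetical. Records that share a blocking key fall into the same block, and only pairs inside a block are compared, which avoids enumerating every possible pair.

```python
from collections import defaultdict
from itertools import combinations


def ngrams(text, n):
    """Return the set of character n-grams of length n in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def candidate_pairs(records, blocking_keys, n):
    """Group records into blocks by shared blocking keys.

    records: dict mapping record id -> text (hypothetical data layout).
    blocking_keys: set of n-grams of length n, assumed here to have been
        extracted beforehand from already known reliable linkages.
    Returns the set of record-id pairs that share at least one key.
    """
    blocks = defaultdict(set)
    for rec_id, text in records.items():
        for gram in ngrams(text.lower(), n):
            if gram in blocking_keys:
                blocks[gram].add(rec_id)

    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block become linkage candidates.
        pairs.update(combinations(sorted(ids), 2))
    return pairs


records = {
    1: "Fast linkage detection",
    2: "A fast linkage scheme",
    3: "Phrase processing",
}
pairs = candidate_pairs(records, {"linkage"}, 7)
```

Here only records 1 and 2 contain the key, so they form the single candidate pair; record 3 is never compared with anything.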


International ACM SIGIR Conference on Research and Development in Information Retrieval | 1998

Phrase processing methods for Japanese text retrieval

Noriko Kando; Kyo Kageura; Masaharu Yoshioka; Keizo Oyama

This paper examines the effectiveness of different phrase identification and weighting methods for Japanese text retrieval in an operational information retrieval (IR) system called NACSIS-IR. Based on our previous experiments, we used character-based indexing with positional information and word- or phrase-based query processing, which allowed us to implement sophisticated linguistic analysis on large-scale databases while maintaining adequate efficiency. The results of retrieval experiments on a large-scale Japanese test collection showed that the combination of enhanced phrase identification, using patterns defined over part-of-speech tags, and our algorithms Phrase2 and Phrase5 made a significant positive contribution to retrieval effectiveness. The paper also discusses indexing and phrase processing for Japanese and other East Asian languages.
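Character-based indexing with positional information, as mentioned in the abstract above, can be sketched roughly as follows. This is an illustrative assumption about how such an index might look, not a description of NACSIS-IR, and it does not reproduce the Phrase2 or Phrase5 algorithms, which the abstract does not detail. The function names and sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each character to {doc_id: set of positions where it occurs}."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        for pos, ch in enumerate(text):
            index[ch][doc_id].add(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents containing phrase as a contiguous string.

    A document matches when some start position has every character of
    the phrase at the expected consecutive offset.
    """
    if not phrase:
        return set()
    hits = set()
    for doc_id, starts in index.get(phrase[0], {}).items():
        for start in starts:
            if all(
                start + off in index.get(ch, {}).get(doc_id, ())
                for off, ch in enumerate(phrase)
            ):
                hits.add(doc_id)
                break
    return hits


docs = {"d1": "情報検索システム", "d2": "検索情報"}
index = build_index(docs)
result = phrase_search(index, "情報検索")
```

Because the index is built over characters, no word segmentation is needed at indexing time; word or phrase boundaries only matter when the query is processed.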


International Conference on Pattern Recognition | 1996

Approximate matching for OCR-processed bibliographic data

Atsuhiro Takasu; Norio Katayama; Masaki Yamaoka; Osamu Iwaki; Keizo Oyama; Jun Adachi

This paper presents a method for matching bibliographic references, obtained as document images from academic papers, with records in bibliographic databases. Its main subject is handling the erroneous bibliographic data produced by a document understanding methodology. Despite string errors, the presented method can find a candidate record set in the referral databases by approximate matching, which is performed as exact matching of $k$ substrings of length $m$ chosen from the bibliographic strings in the references and in the databases. For OCR accuracy $\alpha$, a theoretical analysis shows that the accuracy of the presented method is $1-(1-\alpha^m)^k$, under the assumption that OCR errors occur randomly and independently in the string. The method is applied to the references of 187 Japanese articles and achieves an accuracy of 94.05%.
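The accuracy estimate quoted in the abstract above can be evaluated directly. The sketch below only computes the stated expression under the abstract's independence assumption; the function name and the sample parameter values are hypothetical and are not taken from the paper.

```python
def match_probability(alpha, m, k):
    """Probability that at least one of k substrings of length m is error-free.

    alpha: per-character OCR accuracy, between 0 and 1.
    A substring of length m is read correctly with probability alpha**m,
    so all k substrings fail with probability (1 - alpha**m)**k.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if m < 1 or k < 1:
        raise ValueError("m and k must be positive integers")
    return 1.0 - (1.0 - alpha ** m) ** k


# Hypothetical settings, chosen only to show the trade-off:
# longer substrings (larger m) lower the estimate, more substrings (larger k) raise it.
for m, k in [(4, 1), (4, 3), (8, 3)]:
    print(m, k, round(match_probability(0.9, m, k), 4))
```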


International Journal of Web Information Systems | 2014

Evaluating credibility of interest reflection on Twitter

Hao Han; Hidekazu Nakawatase; Keizo Oyama

Purpose – The purpose of this article was to confirm whether users’ interests are reflected by tweeted Web pages, and to evaluate the credibility of interest reflection of tweeted Web pages. Design/methodology/approach – Interest reflection on Twitter is investigated based on the context of sharing behavior. A context-oriented approach based on machine learning is proposed to evaluate the interest reflection of tweeted Web pages. Several distribution models of similarity are presented, and whether tweeted Web pages reflect the respective users’ interests is inferred by analyzing user access profiles. Findings – The analysis of browsing behaviors finds that many users partially hide their own concerns, hobbies and interests, and emphasize their concerns about social phenomena. The extensive experimental results showed that the context-oriented approach is effective on real net view data. Originality/value – As the first-of-its-kind study on evaluating the credibility of interest reflection on Twitter, extensive exper...


Hawaii International Conference on System Sciences | 2012

Can We Predict Political Poll Results by Using Blog Entries

Manabu Okumura; Tetsuya Motegi; Tetsuro Kobayashi; Keizo Oyama; Takahisa Suzuki

Blogs have become an important medium for people to publish their opinions and ideas on the Web. However, it is still not clear whether we can analyze political public opinion from blogs. There has been some recent work on political viewpoint classification, but most of it only classified political blog entries or sites into opposing viewpoints such as conservative/liberal or Israeli/Palestinian. To predict a broader range of political opinions, we need to analyze a wide variety of blogs. Therefore, we constructed a dataset of general blogs that are connected to political poll results. With the dataset, we conducted experiments to predict political poll results by using the blog entries. Our prediction methods are based on a supervised learning algorithm, Support Vector Machines (SVM), with features in blog sites. We also attempted manual prediction with three human subjects as the upper bound of the system performance, and found that such a task is rather difficult even for humans and that the system can outperform humans.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2004

An evaluation of the Web retrieval task at the third NTCIR workshop

Koji Eguchi; Keizo Oyama; Emi Ishida; Noriko Kando; Kazuko Kuriyama

We have investigated evaluation methods for measuring the retrieval effectiveness of Web search engine systems, attempting to make them suitable for a real Web environment. With this objective, we conducted the ‘Web Retrieval Task’ at the Third NTCIR Workshop (‘NTCIR-3 WEB’) from 2001 to 2002 [1, 2, 3]. Using NTCIR-3 WEB, we built a re-usable test collection that is suitable for evaluating Web search engine systems, and evaluated the retrieval effectiveness of a certain number of Web search engine systems. The TREC Web Tracks [4] are well-known workshops whose objective is to research the retrieval of large-scale Web document data. Past TREC Web Tracks have used data sets extracted from ‘the Internet Archive’ or pages gathered from the ‘.gov’ domain as document sets. They assessed relevance only on information given in English. NTCIR-3 WEB was another workshop, which used 100-gigabyte and/or 10-gigabyte document data that were mainly gathered from the ‘.jp’ domain. Relevance judgment was performed on retrieved documents written in Japanese or English, partially considering hyperlinks. By considering the hyperlinks, a ‘hub page’ that gives out-links to multiple ‘authority pages’ [5] may be judged as relevant even if it does not itself include sufficient relevant information. Sixteen groups enrolled to participate in NTCIR-3 WEB, and seven of these groups submitted run results.


Web-Age Information Management | 2007

Framework for building a high-quality web page collection considering page group structure

Yuxin Wang; Keizo Oyama

We propose a framework for building a high-quality web page collection considering page group structure in a two-step process: rough filtering and accurate classification. In both steps, we apply the idea of local page group structure. The rough filtering comprehensively gathers all potential homepages from the web with as few noise pages as possible. It uses property-based keyword lists according to four page group models that are based on the page group structure. The accurate classification uses a textual feature set for the support vector machine, which is composed by independently concatenating the feature subsets of the surrounding pages grouped according to the page group structure. Using a combination of a recall-assured classifier and a precision-assured classifier, we build a three-way classifier to accurately select the pages that need manual assessment to assure the required quality. The effectiveness of the proposed method is shown by the experimental results.
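One way to read the three-way classifier described in the abstract above is sketched below. The combination rule is an assumption made for illustration, since the abstract does not spell it out: a page accepted by the precision-assured classifier is kept, a page rejected by the recall-assured classifier is discarded, and everything in between goes to manual assessment. The keyword-based toy classifiers stand in for the support vector machines and are purely hypothetical.

```python
def three_way_classify(page, recall_clf, precision_clf):
    """Combine two binary classifiers into accept / reject / manual.

    recall_clf: tuned to miss few true positives (so its negatives are trusted).
    precision_clf: tuned to produce few false positives (so its positives are trusted).
    """
    if precision_clf(page):
        return "accept"
    if not recall_clf(page):
        return "reject"
    return "manual"


# Toy stand-ins for trained classifiers (hypothetical rules).
def loose(page):
    text = page.lower()
    return "homepage" in text or "profile" in text


def strict(page):
    return "homepage of" in page.lower()


labels = [
    three_way_classify(p, loose, strict)
    for p in ["Homepage of a researcher", "Staff profile", "Press release"]
]
```

Under this reading, the amount of manual work is controlled by how far apart the two classifiers' decision boundaries are.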


Annual ACIS International Conference on Computer and Information Science | 2016

Towards detecting and predicting fall events in elderly care using bidirectional electromyographic sensor network

Hao Han; Xiaojun Ma; Keizo Oyama

Falling is one of the most serious life-threatening events for the elderly, and ICT-based solutions play a key role in addressing this widespread problem. In this paper, four principles are proposed as fundamental criteria for designing a sensor network for elder-oriented fall detection and prediction. According to these criteria, a bidirectional electromyographic sensor network model is experimentally constructed, and a qualitative analysis is conducted to explain why this solution performs more realistically and rationally.


Information Reuse and Integration | 2011

Retrieval, description and security: Towards the large-scale UI component-based reuse and integration

Hao Han; Peng Gao; Keizo Oyama

With the trend toward information and functionality integration, application integration at the presentation and logic layers has become a popular issue. Without open Web service APIs, the integration of traditional Web applications is usually based on the reuse of UI components, which partially represent the interactive functionalities of the applications. In this paper, we present some common problems of current UI component-based reuse and integration, and propose our solution: a security-enhanced “component retrieval and integration description” method. Our purpose is to construct a reliable large-scale reuse and integration system for Web applications.


Systems and Computers in Japan | 2003

Development of an information retrieval system suitable for large-scale scholarly databases

Keizo Oyama; Kyo Kageura; Noriko Kando; Masaru Kimura; Katsumi Maruyama; Masaharu Yoshioka; Kazumichi Takahashi

The demand for full-fledged information retrieval services via the World Wide Web has grown as the Internet has evolved. However, existing information retrieval systems for the World Wide Web have various limitations in their search functions and in the structure and scale of their databases. The information retrieval system described in this paper has been operating as a real information retrieval service, providing high-level search functions via a command line and a Web interface for large-scale scholarly databases in text form with complex structures. In this paper, the authors first outline the system and describe the design of its information retrieval functions, in particular characteristic functions including set operations, thesaurus searches, hierarchical structure searches, and unified searches of multiple databases. Next, they describe the implementation technologies of the system, focusing on the features of the search engine, the session control method, the handling of large-scale databases, and the structure of servers and processes. Then, the authors evaluate the current system based on its real operation, and finally discuss future issues and related research.

Collaboration


Dive into Keizo Oyama's collaborations.

Top Co-Authors

Noriko Kando (National Institute of Informatics)

Akiko Aizawa (National Institute of Informatics)

Haruko Ishikawa (National Institute of Informatics)

Yuxin Wang (Graduate University for Advanced Studies)

Kazuko Kuriyama (National Institute of Informatics)

Hidekazu Nakawatase (National Institute of Informatics)

Masao Takaku (National Institute for Materials Science)