Kenji Hatano | Researchain

Publication

Featured research published by Kenji Hatano.

international world wide web conferences | 2004

Adaptive web search based on user profile constructed without any effort from users

Kazunari Sugiyama; Kenji Hatano; Masatoshi Yoshikawa

Web search engines help users find useful information on the World Wide Web (WWW). However, when the same query is submitted by different users, typical search engines return the same result regardless of who submitted the query. Generally, each user has different information needs for his/her query. Therefore, the search result should be adapted to users with different information needs. In this paper, we first propose several approaches to adapting search results according to each user's need for relevant information without any user effort, and then verify the effectiveness of our proposed approaches. Experimental results show that search systems that adapt to each user's preferences can be achieved by constructing user profiles based on modified collaborative filtering with detailed analysis of the user's browsing history in one day.

acm conference on hypertext | 2003

Refinement of TF-IDF schemes for web pages using their hyperlinked neighboring pages

Kazunari Sugiyama; Kenji Hatano; Masatoshi Yoshikawa; Shunsuke Uemura

In IR (information retrieval) systems based on the vector space model, the TF-IDF scheme is widely used to characterize documents. However, in the case of documents with hyperlink structures such as Web pages, it is necessary to develop a technique for representing the contents of Web pages more accurately by exploiting the contents of their hyperlinked neighboring pages. In this paper, we first propose several approaches to refining the TF-IDF scheme for a target Web page by using the contents of its hyperlinked neighboring pages, and then compare the retrieval accuracy of our proposed approaches. Experimental results show that, generally, more accurate feature vectors of a target Web page can be generated in the case of utilizing the contents of its hyperlinked neighboring pages at levels up to second in the backward direction from the target page.
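The idea of folding neighboring pages into a target page's feature vector can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the rule below (add `alpha` times the mean TF-IDF vector of the in-linking pages) is an assumption made for illustration, as are the function names `tf_idf_vectors` and `refine_with_neighbors` and the toy pages.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def refine_with_neighbors(target_vec, neighbor_vecs, alpha=0.5):
    """Add alpha times the mean neighbor vector to the target vector.

    Assumed refinement rule, not the formula from the paper.
    """
    refined = dict(target_vec)
    if not neighbor_vecs:
        return refined
    for vec in neighbor_vecs:
        for term, weight in vec.items():
            refined[term] = refined.get(term, 0.0) + alpha * weight / len(neighbor_vecs)
    return refined

pages = [
    "xml retrieval system",   # target page
    "xml database query",     # hypothetical page that links to the target
    "cooking recipes pasta",  # unrelated page, not a neighbor
]
vectors = tf_idf_vectors(pages)
refined = refine_with_neighbors(vectors[0], [vectors[1]])
```

In this sketch the refined vector gains weight for terms that appear only in the in-linking page, such as `database`, while terms from unrelated pages stay absent.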

advanced information networking and applications | 2007

Efficient Query Processing for Large XML Data in Distributed Environments

Hiroto Kurita; Kenji Hatano; Jun Miyazaki; Shunsuke Uemura

We propose an efficient distributed query processing method for large XML data by partitioning and distributing XML data to multiple computation nodes. There are several steps involved in this method; however, we focused particularly on XML data partitioning and dynamic relocation of partitioned XML data in our research. Since the efficiency of query processing depends on both XML data size and its structure, these factors should be considered when XML data is partitioned. The partitions of the XML data are distributed to computation nodes so that the CPU load can be balanced. In addition, it is important to take account of the query workload on each of the computation nodes because it is closely related to the query processing cost in distributed environments. In case of load skew among computation nodes, partitioned XML data should be relocated to balance the CPU load. Thus, we implemented an algorithm for relocating partitioned XML data based on the CPU load of query processing. From our experiments, we found that there is a performance advantage in our approach for executing distributed query processing of large XML data.
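Relocating partitions to even out CPU load can be sketched as a greedy loop. The abstract does not describe the authors' relocation algorithm, so the strategy below is an assumption: it repeatedly moves one partition from the busiest node to the idlest node until the gap falls within a tolerance. The name `rebalance`, the `tolerance` parameter, and the sample loads are hypothetical.

```python
def rebalance(node_partitions, partition_load, tolerance=0.1):
    """Greedily move partitions from the busiest node to the idlest one.

    node_partitions: {node: [partition ids]}
    partition_load:  {partition id: measured CPU load}
    Assumed strategy for illustration, not the paper's algorithm.
    """
    placement = {node: list(parts) for node, parts in node_partitions.items()}

    def load(node):
        return sum(partition_load[p] for p in placement[node])

    while True:
        busiest = max(placement, key=load)
        idlest = min(placement, key=load)
        gap = load(busiest) - load(idlest)
        mean = sum(load(n) for n in placement) / len(placement)
        if gap <= tolerance * mean:
            break
        # Only a partition with positive load smaller than the gap
        # narrows the imbalance between the two nodes.
        movable = [p for p in placement[busiest] if 0 < partition_load[p] < gap]
        if not movable:
            break
        # The partition closest to half the gap balances the pair best.
        best = min(movable, key=lambda p: abs(partition_load[p] - gap / 2))
        placement[busiest].remove(best)
        placement[idlest].append(best)
    return placement

nodes = {"n1": ["a", "b", "c"], "n2": ["d"]}
loads = {"a": 5, "b": 3, "c": 2, "d": 1}
balanced = rebalance(nodes, loads)
```

Each move strictly reduces the imbalance between the two nodes involved, so the loop terminates. A real system would also weigh the cost of shipping a partition across the network, which this sketch ignores.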

database systems for advanced applications | 1997

A SOM-Based Information Organizer for Text and Video Data

Kenji Hatano; Qing Qian; Katsumi Tanaka

We propose an information organizer for effective clustering and similarity-based retrieval of text and video data. Instead of giving keywords or authoring them, we use a vector space model and DCT image coding in order to extract characteristics of data. Data are clustered by Kohonen's self-organizing map, and the result is visualized in a 3D form. By this, similarity-based retrieval is achieved. We implemented a prototype system and report experimental results. We consider that our system effectively promotes reuse of distributed text and image data assets.

database systems for advanced applications | 1999

An interactive classification of Web documents by self-organizing maps and search engines

Kenji Hatano; Ryouichi Sano; Yiwei Duan; Katsumi Tanaka

We propose an effective classification view mechanism for hypertext data such as Web documents based on Kohonen's self-organizing map (SOM) and search engines. Web documents collected by search engines are automatically classified by SOM and the obtained SOMs are incrementally modified according to the interaction between users and SOMs. At present, various search engines are designed to retrieve Web documents. When we use search engines to retrieve Web documents we get a lot of answers and have to examine each Web document. Therefore, in order to make up for search engines, we need a function to classify Web documents corresponding to the users' points of view and their purposes. Furthermore, we cannot retrieve pertinent Web documents by conventional search engines when a specific topic is described by more than one Web document. To solve these problems, we exploited a content-based clustering system for Web documents. In this system, Web documents are automatically clustered by their feature vectors produced from Web documents or minimal subgraphs consisting of multiple Web documents, and their overview maps are dynamically generated by SOM. Furthermore, we propose a method by which an obtained SOM is modified by users' interaction such as feedback operations.

database and expert systems applications | 2002

Information Retrieval System for XML Documents

Kenji Hatano; Hiroko Kinutani; Masatoshi Yoshikawa; Shunsuke Uemura

In the research field of document information retrieval, the unit of retrieval results returned by IR systems is a whole document or a document fragment, like a paragraph in passage retrieval. IR systems based on the vector space model compute feature vectors of the units and calculate the similarities between the units and the query. However, these units of retrieval are not suitable for document information retrieval since they are not congruent with the information that users are searching for. Therefore, the unit of retrieval results should be a portion of the XML document, such as a chapter, section, or subsection. That is, we think the most important concern of document information retrieval is to define a unit of retrieval results that is meaningful for users. It is easy to construct the appropriate portion of XML documents as retrieval results because XML is a standard document format on the Internet and because XML documents consist of contents and document structures. In this paper, we propose an effective IR system for XML documents that automatically defines an appropriate unit of retrieval results by analyzing the XML document structure. We performed experimental evaluations and verified the effectiveness of our XML IR system. In addition, we also defined new recall and precision measures for XML information retrieval in order to evaluate our XML IR system.

INEX'05 Proceedings of the 4th international conference on Initiative for the Evaluation of XML Retrieval | 2005

Implementation of a high-speed and high-precision XML information retrieval system on relational databases

Kei Fujimoto; Toshiyuki Shimizu; Norimasa Terada; Kenji Hatano; Yu Suzuki; Toshiyuki Amagasa; Hiroko Kinutani; Masatoshi Yoshikawa

This paper describes an XML information retrieval system that we have developed. It is based on a vector space model, and implemented on top of XRel, a relational XML database system that has been developed in our research group. When a query is processed, a large number of fragments are retrieved, because a single XML document usually contains many XML fragments. Keeping all XML fragments degrades retrieval precision and increases query processing time, because some XML fragments are not appropriate as a query target. In existing methods, retrieval targets are manually selected by human experts when an XML collection is stored in the system. Such manual selection is not feasible when many kinds of XML documents are stored in the system. To cope with the problem, we propose a method for automatically selecting document-centric fragments by introducing three measurements, namely, period ratio, number of different words, and empirical rules. By deleting inappropriate data-centric fragments from the results of keyword queries, we can improve the accuracy and performance of our system. Through performance evaluations, we confirmed the improvement of retrieval precision and query processing speed.
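Filtering out data-centric fragments with simple text statistics might look like the sketch below. The abstract names the measurements but does not define them, so everything here is an assumption: the period ratio is taken to be periods per whitespace-separated word, the thresholds are arbitrary, and the fragment paths and the helper `looks_document_centric` are hypothetical. The paper's empirical rules are not modeled.

```python
def period_ratio(text):
    """Periods per word; an assumed definition, not taken from the paper."""
    words = text.split()
    if not words:
        return 0.0
    return text.count(".") / len(words)

def looks_document_centric(text, min_period_ratio=0.02, min_distinct_words=5):
    """Keep fragments that read like prose: some sentences, varied vocabulary.

    Both thresholds are illustrative values, not tuned or published ones.
    """
    distinct = {w.strip(".,;:").lower() for w in text.split()}
    distinct.discard("")
    return (period_ratio(text) >= min_period_ratio
            and len(distinct) >= min_distinct_words)

# Hypothetical fragments keyed by an XPath-like location.
fragments = {
    "/article/sec/p": "We propose a retrieval method. "
                      "It ranks fragments by similarity to the query.",
    "/article/fm/yr": "2005",
}
kept = [path for path, text in fragments.items() if looks_document_centric(text)]
```

Here the prose paragraph passes both checks, while the bare year value fails them and would be dropped before ranking.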

international conference on conceptual modeling | 2001

Extraction of Partial XML Documents Using IR-Based Structure and Contents Analysis

Kenji Hatano; Hiroko Kinutani; Masatoshi Yoshikawa; Shunsuke Uemura

As Internet technologies develop, XML is becoming widely used as a standard data/document format. Although the use of XML documents has attracted public attention, the application of IR technologies in XML document retrieval is still at an early stage. We foresee that typical XML queries for end-users will be very terse, like those used with current Web search engines. Therefore, an XML search engine should be able to find appropriate retrieval results using only a few keywords. In this paper, we introduce a notion of context nodes. Context nodes are used to automatically extract coherent partial documents without the knowledge of XML document structures. This method is useful because it does not require domain analysts to analyze DTDs and specify candidate partial documents beforehand. We use the term “context search” to represent search methods which employ the notion of context node. As an instantiation of context search methods, we have developed algorithms to identify result partial documents in the vector space model. We conducted a performance evaluation to verify the effectiveness of our method.

hawaii international conference on system sciences | 2008

Cross-Language Information Retrieval by Domain Restriction Using Web Directory Structure

Fuminori Kimura; Akira Maeda; Kenji Hatano; Jun Miyazaki; Shunsuke Uemura

In this paper, we propose a cross-language information retrieval (CLIR) method based on estimating the domains of a query using the hierarchical structures of Web directories. To get the most appropriate translation of the queries, we utilize Web directories written in many different languages as a multilingual corpus for disambiguating the translation of the query and for estimating the domain of search results using the hierarchical structures of Web directories. From experimental evaluations, we found that our proposal for disambiguating translation offers an advantage in retrieval accuracy in CLIR systems. We also found that it is effective to restrict the target fields of the query using lower-level merged categories in order to obtain a suitable translation of the query.

database and expert systems applications | 2002

A Method of Improving Feature Vector for Web Pages Reflecting the Contents of Their Out-Linked Pages

Kazunari Sugiyama; Kenji Hatano; Masatoshi Yoshikawa; Shunsuke Uemura

TF-IDF schemes are popular for generating the feature vectors of documents. These schemes were proposed for characterizing a single document. Therefore, in order to characterize Web pages using TF-IDF schemes, the feature vectors of the Web pages should reflect the contents of the pages they are linked to via hyperlinks. In this paper, we propose three methods of generating feature vectors for linked documents such as Web pages. Moreover, in order to verify the effectiveness of our proposed methods, we compare our methods with current search engines and confirm their retrieval accuracy using recall-precision curves.