Dao Dinh Kha

Nara Institute of Science and Technology

Publication

Featured research published by Dao Dinh Kha.

International Conference on Data Engineering | 2001

An XML indexing structure with relative region coordinate

Dao Dinh Kha; Masatoshi Yoshikawa; Shunsuke Uemura

For most of the index structures for XML data proposed so far, updating is a problem, because an XML element's coordinates are expressed using absolute values. Due to the structural relationship among the elements in XML documents, we have to re-compute these absolute values if the content of the source data is updated. The reconstruction requires the updating of a large portion of the index files, which causes a serious problem, especially when the XML data content is updated frequently. In this paper, we propose an indexing scheme based on relative region coordinates that can effectively deal with the update problem. The main idea is that we express the coordinates of an XML element relative to the region of its parent element. We present an algorithm to construct a tree-structured index in which related coordinates are stored together. As a consequence, our indexing scheme requires the updating of only a small portion of the index file.
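The main idea of this abstract can be illustrated with a minimal sketch. It is not the paper's index structure: the `Node` class, the `grow` helper, and the use of character offsets as coordinates are all assumptions made for illustration. Each element stores its start as an offset from its parent's start, so when one element grows, only that element, its ancestors, and the later siblings along that chain need new stored values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical element record: coordinates are relative to the parent."""
    tag: str
    rel_start: int                      # offset from the parent's start
    length: int                         # size of the element's region
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child

    def abs_start(self) -> int:
        """Recover the absolute start by summing offsets up to the root."""
        node, pos = self, 0
        while node is not None:
            pos += node.rel_start
            node = node.parent
        return pos


def grow(node: Node, delta: int) -> int:
    """Grow `node` by `delta` units and return how many stored
    coordinate entries had to be rewritten."""
    touched = 0
    while node is not None:
        node.length += delta
        touched += 1
        if node.parent is not None:
            # Only siblings that start after this node shift.
            for sib in node.parent.children:
                if sib.rel_start > node.rel_start:
                    sib.rel_start += delta
                    touched += 1
        node = node.parent
    return touched


root = Node("root", 0, 100)
a = root.add(Node("a", 10, 20))
b = root.add(Node("b", 40, 30))
c = b.add(Node("c", 5, 10))

rewritten = grow(a, 5)   # rewrites a, b, and root
# c's stored offset is untouched, yet its absolute position is still
# recoverable from the updated ancestors.
print(rewritten, c.rel_start, c.abs_start())
```

With absolute coordinates, the stored entry for `c` (and for every other descendant of `b`) would also have to be rewritten; here it is left alone.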

International Workshop on Challenges in Web Information Retrieval and Integration | 2005

A Mapping Scheme of XML Documents into Relational Databases using Schema-based Path Identifiers

Kenji Fujimoto; Tsuyoshi Shimizu; Dao Dinh Kha; Masatoshi Yoshikawa; Toshiyuki Amagasa

In this paper, we propose a mapping scheme of XML documents into relational databases. The scheme enables us to store, retrieve and update XML documents efficiently. When storing XML documents in relational databases, XML tree structures must be preserved explicitly. To this end, a label is assigned to each node in the XML tree. In general, document retrieval and update performance is affected by the node labeling scheme. We use SPIDER (Schema based Path IDentifiER), a labeling scheme for XML documents utilizing DTDs that makes retrieval and update more efficient. SPIDER only identifies paths from the root node to a node. Thus, multiple nodes appearing in the same path cannot be distinguished by using SPIDER alone. We introduce Sibling Dewey Order to identify such nodes. Generally, when a new node is inserted into an XML document, some other nodes need to be relabeled to preserve the order of nodes. In our method, only Sibling Dewey Order is relabeled; SPIDER is not affected. Since the range of relabeling is small, it is possible to update documents efficiently. We stored documents utilizing SPIDER in a relational database and then translated various XPath expressions into SQL using SPIDER. We perform experiments and demonstrate that the proposed scheme outperforms conventional methods in both retrieval and update.

Extending Database Technology | 2002

A Structural Numbering Scheme for XML Data

Dao Dinh Kha; Masatoshi Yoshikawa; Shunsuke Uemura

Identifier generation is a common but crucial task in many XML applications. In addition, the structural information of XML data is essential to evaluate XML queries. In order to meet both these requirements, several numbering schemes, including the powerful UID technique, have been proposed. In this paper, we introduce a new numbering scheme based on the UID technique called multilevel recursive UID (rUID). The proposed rUID is robust, scalable and hierarchical. rUID features identifier generation by level and takes into account the XML tree topology. rUID not only enables the computation of the parent node's identifier from the child node's identifier, as in the original UID, but also deals effectively with XML structural update and can be applied to arbitrarily large XML documents. In addition, we investigate the effectiveness of rUID in representing the XPath axes and query processing and briefly discuss other applications of rUID.
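The parent computation that the abstract attributes to the original UID can be sketched as follows. This shows only the classic UID numbering over a complete k-ary tree (root numbered 1), not rUID itself; the function names are chosen here for illustration.

```python
def uid_child(i: int, j: int, k: int) -> int:
    """Identifier of the j-th child (1-based) of node i in a k-ary UID tree."""
    if not 1 <= j <= k:
        raise ValueError("child position must be between 1 and k")
    return k * (i - 1) + j + 1


def uid_parent(i: int, k: int) -> int:
    """Identifier of the parent of node i, computed arithmetically."""
    if i <= 1:
        raise ValueError("the root has no parent")
    return (i - 2) // k + 1


def uid_ancestors(i: int, k: int) -> list:
    """All ancestors of node i, nearest first, without touching the data."""
    out = []
    while i > 1:
        i = uid_parent(i, k)
        out.append(i)
    return out


k = 3                                # assumed maximum fan-out
node = uid_child(uid_child(1, 1, k), 3, k)
print(node, uid_ancestors(node, k))
```

Because identifiers in this numbering grow roughly as k raised to the depth, deep or wide documents quickly need very large values, and a change in fan-out forces renumbering. Those are the kinds of limitations the abstract says rUID is designed to address.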

International XML Database Symposium | 2004

XML Query Processing Using a Schema-Based Numbering Scheme

Dao Dinh Kha; Masatoshi Yoshikawa

Establishing the hierarchical order among XML elements is an essential function of XML query processing techniques. Although most XML documents have an associated DTD or XML schema, the document structure information has not been utilized efficiently in query processing techniques proposed so far. In this paper, we propose a novel technique that uses DTD or XML schema to improve the disk I/O complexity of XML query processing. We present a schema-based numbering scheme called SPIDER that incorporates both structure information and tag names extracted from the document structure descriptions. Given the tag name and the identifier of an element, SPIDER can determine the tag names and the identifiers of the ancestor elements without disk I/O. Based on SPIDER, we designed a mechanism called VirtualJoin that significantly reduces disk I/O workload for processing XML queries. Our experiments indicated that SPIDER outperforms the structural join techniques Stack-Tree and PathStack in XML query processing, especially for XML queries with heavy join workload and large data sets.
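As a rough illustration of how schema information can replace disk lookups, the sketch below numbers every root-to-node path of a small, non-recursive schema and keeps the numbering in memory. It is not the SPIDER algorithm from the paper: the dictionary-based schema, the depth-first numbering, and the function names are all assumptions made for this example.

```python
def build_path_table(schema: dict, root: str) -> dict:
    """Assign an integer id to every root-to-node path of a
    non-recursive schema. Returns {path_id: (tag, parent_path_id)}."""
    table = {}

    def visit(tag, parent_id):
        pid = len(table) + 1          # ids handed out in depth-first order
        table[pid] = (tag, parent_id)
        for child in schema.get(tag, []):
            visit(child, pid)

    visit(root, None)
    return table


def ancestor_paths(table: dict, pid: int) -> list:
    """(path_id, tag) of every ancestor, nearest first, using only
    the in-memory table derived from the schema."""
    out = []
    _, parent = table[pid]
    while parent is not None:
        tag, grandparent = table[parent]
        out.append((parent, tag))
        parent = grandparent
    return out


# Hypothetical DTD-like structure: tag -> allowed child tags.
schema = {
    "book": ["title", "chapter"],
    "chapter": ["title", "section"],
    "section": ["title"],
}
table = build_path_table(schema, "book")
# Path id 6 is book/chapter/section/title in this numbering.
print(ancestor_paths(table, 6))
```

The table depends only on the schema, so it stays small regardless of document size, which is what makes resolving ancestor tag names without reading the document itself plausible.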

IEICE Transactions on Information and Systems | 2006

An Efficient Schema-Based Technique for Querying XML Data

Dao Dinh Kha; Masatoshi Yoshikawa

As data integration over the Web has become an increasing demand, there is a growing desire to use XML as a standard format for data exchange. For sharing their grammars efficiently, most of the XML documents in use are associated with a document structure description, such as DTD or XML schema. However, the document structure information is not utilized efficiently in previously proposed techniques of XML query processing. In this paper, we present a novel technique that reduces the disk I/O complexity of XML query processing. We design a schema-based numbering scheme called SPAR that incorporates both structure information and tag names extracted from DTD or XML schema. Based on SPAR, we develop a mechanism called VirtualJoin that significantly reduces disk I/O workload for processing XML queries. As shown by experiments, VirtualJoin outperforms many prior techniques.

Database and Expert Systems Applications | 2002

Application of rUID in Processing XML Queries on Structure and Keyword

Dao Dinh Kha; Masatoshi Yoshikawa; Shunsuke Uemura

Applying numbering schemes to simulate the structure of XML data is a promising technique for XML query processing. In this paper, we describe SKEYRUS, a system that enables integrated structure-keyword searches on XML data using the rUID numbering scheme. rUID has been designed to be robust under structural update and applicable to arbitrarily large XML documents. SKEYRUS accepts XPath expressions containing word-containment predicates as input, so the query expressiveness is significantly extended. The structural features of rUID and its ability to generate XPath axes are exploited in query processing. Preliminary performance results of SKEYRUS are also reported.

IEICE Transactions on Information and Systems | 2004