Nobuo Ohbo | Researchain

Publication

Featured research published by Nobuo Ohbo.

International Conference on Management of Data | 1993

Evaluation of signature files as set access facilities in OODBs

Yoshiharu Ishikawa; Hiroyuki Kitagawa; Nobuo Ohbo

Object-oriented database systems (OODBs) need efficient support for manipulation of complex objects. In particular, support of queries involving evaluations of set predicates is often required in handling complex objects. In this paper, we propose a scheme to apply signature file techniques, which were originally invented for text retrieval, to the support of set value accesses, and quantitatively evaluate their potential capabilities. Two signature file organizations, the sequential signature file and the bit-sliced signature file, are considered and their performance is compared with that of the nested index for queries involving the set inclusion operator (⊆). We develop a detailed cost model and present analytical results clarifying their retrieval, storage, and update costs. Our analysis shows that the bit-sliced signature file is a very promising set access facility in OODBs.
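
To make the signature-file idea concrete, here is a minimal Python sketch of a sequential signature file used as a filter for the query "find objects whose set attribute T contains every element of Q" (Q ⊆ T). The signature width `F`, the number of bits per element `M`, the SHA-256 based hashing, and all function names are assumptions chosen for illustration; they are not taken from the paper, and the bit-sliced organization and cost model are not shown.

```python
import hashlib

F = 64  # signature width in bits (assumed value)
M = 3   # hash positions set per element (assumed value)


def element_signature(element: str) -> int:
    """Hash one set element to an F-bit pattern with at most M bits set."""
    sig = 0
    for i in range(M):
        digest = hashlib.sha256(f"{i}:{element}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % F)
    return sig


def set_signature(elements) -> int:
    """Superimpose (bitwise OR) the signatures of all elements of a set."""
    sig = 0
    for element in elements:
        sig |= element_signature(element)
    return sig


def search_superset(objects: dict, query: set) -> list:
    """Return ids of objects whose set contains every query element."""
    # Sequential signature file: one signature per stored object.
    signature_file = [(oid, set_signature(s)) for oid, s in objects.items()]
    q_sig = set_signature(query)
    # Filter step: every bit of the query signature must be set.
    candidates = [oid for oid, sig in signature_file if sig & q_sig == q_sig]
    # Resolution step: fetch the candidates and discard false drops.
    return [oid for oid in candidates if query <= objects[oid]]


objects = {1: {"a", "b", "c"}, 2: {"a", "d"}, 3: {"b", "c"}}
print(search_superset(objects, {"b", "c"}))
```

The filter can only produce false positives, never false negatives, which is why the final exact check is sufficient for correctness.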

Web Age Information Management | 2000

MB+Tree: A Dynamically Updatable Metric Index for Similarity Searches

Masahiro Ishikawa; Hanxiong Chen; Kazutaka Furuse; Jeffrey Xu Yu; Nobuo Ohbo

One common query pattern is to find approximate matches to a given query object in a large database. This kind of query processing is referred to as similarity search in a metric space. In this paper, we propose a new metric index, the MB+tree (Metric B+tree), which supports near-neighbour searching in a generic metric space. The MB+tree aims to reduce both the number of I/O accesses and the number of distance calculations for similarity search in large databases, while allowing dynamic data updates. We show that a B+tree, with an auxiliary tree, can be used as a metric index. Unlike other multi-dimensional (spatial) access methods, our approach partitions data into disjoint partitions while building and maintaining the metric index, which can lead to a significant cost reduction since the number of metric sub-spaces to be searched is reduced. To use the MB+tree, a slicing value is proposed. With the slicing value, in addition to space division information, a near-neighbour search can be systematically converted to a range search in the B+tree. Several different slicing values are considered, namely the one-focus-point scheme and the two-focus-point scheme. We also conducted extensive experimental studies using synthetic data, and the results are reported in this paper.
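
The conversion of a near-neighbour search into a one-dimensional range search can be sketched as follows. This is only an illustration of the general one-focus-point idea, under the assumption that the key of an object is its distance to a fixed focus point; a sorted Python list stands in for the B+tree, and the auxiliary tree and space division information mentioned in the abstract are omitted. The class and method names are hypothetical.

```python
import bisect
import math


class OneFocusIndex:
    """Objects keyed by their distance to one focus point (assumed scheme)."""

    def __init__(self, focus):
        self.focus = focus
        self.keys = []     # sorted distances, standing in for B+tree keys
        self.objects = []  # objects stored in the same order as the keys

    def insert(self, obj):
        key = math.dist(self.focus, obj)
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.objects.insert(pos, obj)

    def range_search(self, query, radius):
        """Return all objects within `radius` of `query`."""
        dq = math.dist(self.focus, query)
        # Triangle inequality: any answer o satisfies
        # |d(focus, o) - d(focus, query)| <= radius,
        # so only keys in [dq - radius, dq + radius] need to be examined.
        lo = bisect.bisect_left(self.keys, dq - radius)
        hi = bisect.bisect_right(self.keys, dq + radius)
        return [o for o in self.objects[lo:hi] if math.dist(query, o) <= radius]


index = OneFocusIndex((0.0, 0.0))
for point in [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (5.0, 5.0)]:
    index.insert(point)
print(index.range_search((1.0, 1.0), 1.5))
```

Only the objects whose keys fall in the scanned key range incur a distance calculation, which is the saving such an index is after.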

Database and Expert Systems Applications | 2008

Efficient Bounds in Finding Aggregate Nearest Neighbors

Sansarkhuu Namnandorj; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo

Developed from Nearest Neighbor (NN) queries, Aggregate Nearest Neighbor (ANN) queries return the object that minimizes an aggregate distance function with respect to a set of query points. Because of the multiple query points, ANN queries are much more complex than NN queries. To optimize query processing and improve query efficiency, many ANN query algorithms utilize pruning strategies, with or without an index structure. The pruning effect depends heavily on the tightness of the bound estimation. In this paper, we identify a property of vector spaces and develop efficient bound estimations for the two most popular types of ANN queries. Based on these bounds, we design indexed and non-indexed ANN algorithms and conduct experimental studies. Our algorithms show good performance, especially for high-dimensional queries, on both real and synthetic datasets.
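
The role of a bound in pruning can be illustrated with a small non-indexed scan for the sum aggregate. The bound used here is a standard vector-space fact and is an assumption of this sketch, not necessarily the bound derived in the paper: for n query points with centroid c, the sum of Euclidean distances from p to the query points is at least n times the distance from p to c. The function names are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def sum_ann(data, queries):
    """Linear scan for the point minimizing the sum of distances to queries.

    Returns (best point, its aggregate distance, number of pruned points).
    """
    c = centroid(queries)
    n = len(queries)
    best, best_cost, pruned = None, math.inf, 0
    for p in data:
        # Lower bound: sum_i ||p - q_i|| >= n * ||p - c|| (one distance only).
        if n * math.dist(p, c) >= best_cost:
            pruned += 1
            continue
        cost = sum(math.dist(p, q) for q in queries)
        if cost < best_cost:
            best, best_cost = p, cost
    return best, best_cost, pruned


data = [(0, 0), (5, 5), (10, 10), (2, 2)]
queries = [(1, 1), (3, 3)]
print(sum_ann(data, queries))
```

A pruned point costs one distance computation instead of n, so the tighter the bound, the more of the scan is skipped.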

FODO '93 Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms | 1993

Estimation of False Drops in Set-valued Object Retrieval with Signature Files

Hiroyuki Kitagawa; Yoshiaki Fukushima; Yoshiharu Ishikawa; Nobuo Ohbo

Advanced database systems have to support complex data structures as treated in object-oriented data models and nested relational data models. In particular, efficient processing of set-valued object retrieval (simply, set retrieval) is indispensable for such systems. In the previous paper [6], we proposed the use of signature files as efficient set retrieval facilities and showed their potential capabilities based on a disk page access cost model. Retrieval with signature files is always accompanied by mismatches called false drops, and it is very important in designing signature files to properly control the false drops.
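
For a rough sense of how false drops depend on the signature design parameters, the sketch below evaluates the classic approximation from the text-retrieval literature on superimposed coding. It is not the estimate derived in this paper. It assumes every element sets `m` bit positions chosen independently and uniformly among `F` bits, and that the query asks for target sets containing all query elements; all function and parameter names are hypothetical.

```python
def bit_set_probability(F: int, m: int, D: int) -> float:
    """Probability that a given bit is 1 in the signature of a D-element set."""
    return 1.0 - (1.0 - 1.0 / F) ** (m * D)


def false_drop_probability(F: int, m: int, D_target: int, D_query: int) -> float:
    """Approximate chance that a non-matching target set passes the filter."""
    p = bit_set_probability(F, m, D_target)
    # Expected number of 1 bits in the query signature.
    query_weight = F * bit_set_probability(F, m, D_query)
    # Every 1 bit of the query must also be set in the target signature.
    return p ** query_weight


for F in (64, 128, 256):
    print(F, false_drop_probability(F, m=3, D_target=10, D_query=2))
```

Under these assumptions, widening the signature or enlarging the query set lowers the estimated false drop probability, at the price of more storage or a more selective query.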

Knowledge and Information Systems | 2005

CVA file: an index structure for high-dimensional datasets

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo

Similarity search is important in information-retrieval applications where objects are usually represented as vectors of high dimensionality. This paper proposes a new dimensionality-reduction technique and an indexing mechanism for high-dimensional datasets. For each data vector, the proposed technique drops the dimensions whose coordinates are less than a critical value. This flexible datawise dimensionality reduction contributes to improving indexing mechanisms for high-dimensional datasets that are in skewed distributions in all coordinates. To apply the proposed technique to information retrieval, a CVA file (compact VA file), which is a revised version of the VA file, is developed. By using a CVA file, the size of index files is reduced further, while the tightness of the index bounds is kept as high as possible. The effectiveness is confirmed on synthetic and real data.
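
The datawise reduction described above can be sketched in a few lines. The sketch assumes non-negative coordinates and Euclidean distance, stores the surviving coordinates exactly rather than as VA-style approximations, and uses hypothetical function names; it illustrates the idea rather than the CVA file itself.

```python
import math


def reduce_vector(vec, critical):
    """Keep only coordinates >= critical, as (dimension, value) pairs."""
    return [(i, x) for i, x in enumerate(vec) if x >= critical]


def distance_bounds(reduced, dim, critical, query):
    """Lower and upper bounds on the distance from the original vector to query."""
    kept = dict(reduced)
    lo_sq = hi_sq = 0.0
    for i in range(dim):
        q = query[i]
        if i in kept:
            d = (kept[i] - q) ** 2  # stored coordinate: exact contribution
            lo_sq += d
            hi_sq += d
        else:
            # Dropped coordinate: only known to lie between 0 and critical.
            nearest = min(max(q, 0.0), critical)
            lo_sq += (q - nearest) ** 2
            hi_sq += max(abs(q), abs(critical - q)) ** 2
    return math.sqrt(lo_sq), math.sqrt(hi_sq)


vec = [0.9, 0.05, 0.0, 0.6, 0.02]
query = [0.8, 0.1, 0.1, 0.5, 0.0]
reduced = reduce_vector(vec, critical=0.1)
print(reduced, distance_bounds(reduced, len(vec), 0.1, query))
```

For skewed data, most coordinates of a vector fall below the critical value, so each vector keeps only a few entries while the bounds remain usable for filtering.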

Intelligent Data Engineering and Automated Learning | 2003

Grid-Based Indexing for Large Time Series Databases

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo; Eamonn J. Keogh

Similarity search in large time series databases is an interesting and challenging problem. Because of the high-dimensional nature of the data, the difficulties associated with the curse of dimensionality arise. The most promising solution is to use dimensionality reduction and construct a multi-dimensional index structure for the reduced data. In this work we introduce a new approach called grid-based Datawise Dimensionality Reduction (DDR), which attempts to preserve the characteristics of time series. We then apply quantization to construct an index structure. An experimental comparison with existing techniques demonstrates the utility of our approach.

Web Age Information Management | 2002

C2VA: Trim High Dimensional Indexes

Hanxiong Chen; Jiyuan An; Kazutaka Furuse; Nobuo Ohbo

Classical multi-dimensional indexes are based on data space partitioning. Their effectiveness declines because the number of indexing units grows exponentially as the number of dimensions increases. As a result, using such index structures becomes less effective than a linear scan of all the data. The VA-file proposed a method of coordinate approximation, observing that nearest neighbor search becomes of linear complexity in high-dimensional spaces. In this paper we propose C2VA (Clustered Compact VA) for dimensionality reduction. We investigate and find that real datasets are rarely uniformly distributed, which is the main assumption of the VA-file. Instead of approximating all dimensions, we determine the condition under which less important dimensions can be skipped. This avoids generating a huge index file for a large, high-dimensional dataset and hence saves many I/O accesses when scanning. Moreover, we guarantee that C2VA preserves the precision of bounds as in the VA-file, which maximizes the efficiency gain. Our experimental results confirm these claims.
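
As background for the coordinate approximation that the abstract builds on, the following sketch shows VA-file style quantization and the distance bounds it yields. It assumes coordinates in [0, 1], a uniform grid with `BITS` bits per dimension, and Euclidean distance; the constant and function names are hypothetical, and the dimension-skipping condition of C2VA is not implemented here.

```python
import math

BITS = 2            # bits per dimension (assumed value)
CELLS = 1 << BITS   # number of cells per dimension


def approximate(vec):
    """Replace each coordinate in [0, 1] by the number of the cell it falls in."""
    return [min(int(x * CELLS), CELLS - 1) for x in vec]


def bounds(approx, query):
    """Lower and upper bounds on the distance from the approximated vector to query."""
    lo_sq = hi_sq = 0.0
    for cell, q in zip(approx, query):
        left, right = cell / CELLS, (cell + 1) / CELLS
        if q < left:
            lo = left - q
        elif q > right:
            lo = q - right
        else:
            lo = 0.0  # the query coordinate lies inside the cell
        hi = max(abs(q - left), abs(q - right))
        lo_sq += lo * lo
        hi_sq += hi * hi
    return math.sqrt(lo_sq), math.sqrt(hi_sq)


vec = [0.1, 0.6, 0.95]
query = [0.5, 0.5, 0.5]
print(approximate(vec), bounds(approximate(vec), query))
```

A scan over the compact approximations can discard any vector whose lower bound exceeds the best upper bound seen so far, so only the remaining candidates require reading the full vectors.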

Advanced Data Mining and Applications | 2011

Indexing expensive functions for efficient multi-dimensional similarity search

Hanxiong Chen; Jianquan Liu; Kazutaka Furuse; Jeffrey Xu Yu; Nobuo Ohbo

Similarity search is important in information retrieval applications where objects are usually represented as vectors of high dimensionality. This leads to an increasing need to support the indexing of high-dimensional data. On the other hand, indexing structures based on space partitioning are powerless because of the well-known “curse of dimensionality”. A linear scan of the data with approximation is more efficient for high-dimensional similarity search. However, approaches so far have concentrated on reducing I/O and ignored the computation cost. For an expensive distance function such as the Lp norm with fractional p, the computation cost becomes the bottleneck. We propose a new technique that addresses expensive distance functions by “indexing the function”, that is, by pre-computing some key values of the function once. The values are then used to derive upper and lower bounds of the distance between a data vector and the query vector. The technique is extremely efficient since it avoids most of the distance function computations; moreover, it does not involve any extra secondary storage because no index is constructed and stored. The efficiency is confirmed by cost analysis, as well as by experiments on synthetic and real data.
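
One way to picture "indexing the function" is a small in-memory table of pre-computed powers. The sketch below assumes coordinates in [0, 1], a fractional exponent `P`, and an evenly spaced grid of `STEPS` key values; these choices and all names are hypothetical and are not taken from the paper. Because t ↦ t^P is increasing, the table entries on either side of |x − q| bracket the true term without evaluating a power at query time.

```python
P = 0.5       # fractional exponent of the Lp distance (assumed value)
STEPS = 100   # number of grid intervals on [0, 1] (assumed value)

# Pre-compute t ** P once at the key values t = k / STEPS.
TABLE = [(k / STEPS) ** P for k in range(STEPS + 1)]


def power_sum_bounds(vec, query):
    """Bounds on sum(|x - q| ** P) using table lookups only."""
    lo = hi = 0.0
    for x, q in zip(vec, query):
        t = abs(x - q)                        # assumed to lie in [0, 1]
        k = min(int(t * STEPS), STEPS - 1)    # grid interval containing t
        lo += TABLE[k]
        hi += TABLE[k + 1]
    return lo, hi


vec = [0.913, 0.204, 0.557]
query = [0.1, 0.25, 0.5]
exact = sum(abs(x - q) ** P for x, q in zip(vec, query))
print(power_sum_bounds(vec, query), exact)
```

Since raising the sum to the power 1/P preserves order, bounds on the sum are enough to rank or prune candidates; the exact, expensive computation is then needed only for vectors whose bounds overlap the current best.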

Australasian Database Conference | 2002

The convex polyhedra technique: an index structure for high-dimensional space

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Masahiro Ishikawa; Nobuo Ohbo

This paper proposes a new dimensionality reduction technique and an indexing mechanism for high-dimensional data sets in which data points are not uniformly distributed. The proposed technique decomposes a data space into convex polyhedra, and the dimensionality of each data point is reduced according to which polyhedron includes the data point. One of the advantages of the proposed technique is that it reduces the dimensionality locally. This local dimensionality reduction contributes to improving indexing mechanisms for non-uniformly distributed data sets. To show the applicability and the effectiveness of the proposed technique, this paper describes a new indexing mechanism called the CVA-file (Compact VA-File), which is a revised version of the VA-file. With the proposed dimensionality reduction technique, the size of data points stored in index files can be reduced. Furthermore, it can estimate upper and lower bounds of each entry in index files by using geometric properties of convex polyhedra. Results from experimental simulations show that the CVA-file is better than the VA-file for non-uniformly distributed real data sets.

Computer Software and Applications Conference | 1989

Design data modeling with versioned conceptual configuration

Hiroyuki Kitagawa; Nobuo Ohbo

Control of versions and time-varying configurations of design data is a very important issue in computer-aided design environments. A new data modeling scheme to facilitate their management is proposed. The basic idea is to manage intricately related time-varying design data in terms of well-defined versions and configurations of conceptual design artifacts. The modeling scheme views the design database as a collection of conceptual objects and representation objects. Representation objects are abstractions of conventional design data files, and configurations of conceptual objects are cast on them as the management structure. To facilitate design data management through the conceptual objects, the scheme provides constructs for modeling version-variant and version-invariant configurations and for modular management of large configurations. Operations for design data manipulation and a mechanism for maintaining design database consistency are also presented.