Andreas Henrich
Folkwang University of the Arts
Publication
Featured research published by Andreas Henrich.
international conference on data engineering | 1998
Andreas Henrich
Efficient access structures for similarity queries on feature vectors are an important research topic for application areas such as multimedia databases, molecular biology or time series analysis. Different access structures for high-dimensional feature vectors have been proposed, namely: the SS-tree, the VAMSplit R-tree, the TV-tree, the SR-tree and the X-tree. All these access structures are derived from the R-tree. As a consequence, the fanout of the directory of these access structures decreases drastically for higher dimensions. Therefore we argue that the R-tree is not the best possible starting point for the derivation of an access structure for high-dimensional data. We show that k-d-tree-based access structures are at least as well suited for this application area and we introduce the LSD^h-tree as an example for such a k-d-tree-based access structure for high-dimensional feature vectors. We describe the algorithms for the LSD^h-tree and present experimental results comparing the LSD^h-tree and the X-tree.
Proceedings Software Engineering Environments | 1995
Andreas Henrich
The paper presents P-OQL (Pcte-Object-Query-Language), a domain-oriented query language for Pcte. Pcte is the ECMA standard for a public tool interface (PTI) for system development environments (SDE) and includes as one of its major components a structurally object-oriented object management system (OMS). Whereas the ECMA standard is only concerned with navigational access to the object base, experience shows the need for a domain-oriented query language. P-OQL reflects the whole data model of Pcte especially designed to meet the requirements of system development environments. Special features like attributed links are integrated in a homogeneous way. The language can be used as an interactive query language for the end user, but the main objective is the embedded use in applications, i.e. software development tools, via the API. The embedded use enforces an elaborate integration into the normal (navigational) access to the Pcte-OMS. P-OQL is not designed to substitute the navigational access, but to complement it. Hence, the result of navigational access can be used as the starting point of further domain-oriented access and vice versa. Furthermore, P-OQL allows access to attributes by navigational operations in a domain-oriented query. P-OQL includes a large set of operators and predicates. Nevertheless, extensions can be made with reasonable effort to meet the heterogeneous requirements in SDEs. First ideas to deal with problems caused by the distribution of the object base are presented.
international acm sigir conference on research and development in information retrieval | 1996
Andreas Henrich
Modern system development environments usually deploy the object management facilities of a so-called repository to store the documents created and maintained during system development. PCTE is the ISO and ECMA standard for a public tool interface for an open repository [23]. In this paper we present document retrieval extensions for an OQL-oriented query language for PCTE. The extensions proposed cover (1) pattern matching, (2) term-based document retrieval with automatically generated document description vectors, (3) the flexible definition of what is addressed as a “document” in a given query, and (4) the integration of these facilities into a CASE tool. Whereas the integration of pattern matching facilities into query languages has been addressed by other authors before, the main contribution of our approach is the homogeneous integration of term-based document retrieval and the flexible definition of documents.
conference on information and knowledge management | 1996
Andreas Henrich
In the field of information retrieval the vector space model has been proposed. In this model queries and documents are represented as term vectors where each coefficient represents the relevance of a given term with respect to the document or query. A typical task in this context is to search for the documents most similar to a given query vector. On the other hand, algorithms to perform nearest neighbor and distance scan queries have been proposed for various types of spatial access structures. Unfortunately, these access structures assume implicitly that the number of dimensions is relatively small, which is not the case for document representation vectors. In this paper we discuss the adaptation of spatial access structures for document representation vectors. We describe how some peculiarities of document representation vectors can be exploited to overcome the problems with higher dimensions to a certain extent. We exploit these peculiarities by introducing a new cluster split technique and a sophisticated algorithm to calculate an upper bound for the similarity of the documents located in a subtree of the access structure.
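The basic task named in this abstract, finding the documents most similar to a query vector, can be sketched as follows. This is only an illustration: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the names `cosine_similarity`, `most_similar` and the sample data are hypothetical. The sketch scans all documents linearly; it does not implement the access structure, the cluster split technique or the upper-bound algorithm of the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse term vectors (dict: term -> weight).

    Assumption: cosine is used as the similarity measure; the abstract
    does not name one.
    """
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents, k=1):
    """Return the ids of the k documents most similar to the query.

    Brute-force linear scan over all documents (no index involved).
    """
    ranked = sorted(
        documents,
        key=lambda doc_id: cosine_similarity(query, documents[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical sample data: each document is a sparse term vector.
documents = {
    "d1": {"tree": 0.8, "index": 0.6},
    "d2": {"query": 0.9, "language": 0.4},
    "d3": {"tree": 0.3, "query": 0.5},
}
query = {"tree": 1.0}
top_two = most_similar(query, documents, k=2)
```

Sparse dictionaries are used because document vectors typically have very many dimensions (one per term) but few non-zero coefficients, which is one of the peculiarities the abstract alludes to.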
workshop on graph theoretic concepts in computer science | 1989
Andreas Henrich; Hans Werner Six; Peter Widmayer
We propose the partially paged binary tree principle (PPbin tree principle, for short) for maintaining binary trees which do not fit into core and hence must be (at least partially) paged on secondary storage. The PPbin tree principle can be applied to balanced as well as unbalanced binary trees. Paging a balanced binary tree results in a balanced external binary tree. However, the main advantage of the new principle is that even for unbalanced binary trees it is very unlikely that long external access paths will arise. As an example, we describe the partially paged k-d tree which is used as the directory in a spatial data structure. The analysis of the expected storage utilization and the expected external height proves the efficiency of the new data structure derived from the application of the PPbin tree principle.
international conference on data engineering | 1996
Andreas Henrich
In recent years, various k-d-tree-based multidimensional access structures have been proposed. All these structures share an average bucket utilization of at most ln 2 ≈ 69.3%. We present two algorithms which perform local redistributions of objects to improve the storage utilization of these access structures. We show that under fair conditions a good improvement algorithm can save up to 20% of space and up to 15% of query processing time. On the other hand, we also show that a local redistribution scheme designed without care can improve the storage utilization and at the same time worsen the performance of range queries drastically. Furthermore, we show the dependencies between split strategies and local redistribution schemes and the general limitations which can be derived from these dependencies.
advances in geographic information systems | 1996
Andreas Henrich
With spatial access structures the split strategy, which determines how the objects stored in an overflowing bucket are distributed over two buckets, is crucial for the performance and robustness. Two categories of split strategies can be distinguished: Data dependent split strategies depend only on the objects stored in the bucket to be split. Distribution dependent split strategies choose the split dimension and the split position independently of the objects actually stored in the bucket to be split, based on a hypothesis about the object distribution. Unfortunately, both types of split strategies have specific drawbacks. For a distribution dependent split strategy a sound hypothesis for the spatial distribution of the objects is needed in advance, and with data dependent split strategies the directory tree becomes extremely unbalanced for insertions in a geometrically presorted order. In this paper we present a hybrid split strategy which tries to combine the advantages of both types of split strategies. Normally it adopts the behavior of a data dependent split strategy, but when the risk of a degeneration of the access structure is detected, it mutates to a distribution dependent split strategy. We will show that this hybrid split strategy is robust with respect to skewed distributions and with respect to insertions in presorted order.
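The two categories of split strategies can be contrasted with a minimal sketch. The concrete rules below are assumptions chosen for illustration: the median of the stored coordinates stands in for a data dependent split, and the midpoint of the bucket region (which corresponds to a uniform-distribution hypothesis) stands in for a distribution dependent split. The function names and sample data are hypothetical, and the hybrid strategy itself is not reproduced because the abstract does not say how the risk of degeneration is detected.

```python
def data_dependent_split(points, dim):
    """Split position computed only from the objects stored in the bucket.

    Assumption: the median along `dim` is used, so both new buckets
    receive about half of the objects. Requires at least two points.
    """
    values = sorted(p[dim] for p in points)
    mid = len(values) // 2
    return (values[mid - 1] + values[mid]) / 2.0


def distribution_dependent_split(region, dim):
    """Split position computed only from the bucket region.

    Assumption: objects are hypothesised to be uniformly distributed,
    so the region is halved regardless of the stored objects.
    """
    low, high = region[dim]
    return (low + high) / 2.0


# Hypothetical overflowing bucket covering the unit square; the points
# are skewed towards small x values.
bucket_points = [(0.10, 0.2), (0.15, 0.9), (0.20, 0.4), (0.90, 0.5)]
bucket_region = [(0.0, 1.0), (0.0, 1.0)]

median_position = data_dependent_split(bucket_points, dim=0)
midpoint_position = distribution_dependent_split(bucket_region, dim=0)
```

With the skewed sample data, the median split lands between the second and third x coordinate, whereas the midpoint split ignores the stored points and halves the region, leaving three of the four points on one side.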
SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases | 1995
Andreas Henrich; Jens Möller
In recent years, many access structures have been proposed supporting access to objects via their spatial location. However, additional non-geometric properties are always associated with geometric objects, and in practice it is often necessary to use select conditions based on spatial and standard attributes. An obvious idea to improve the performance of queries with mixed select conditions is to extend spatial access structures with additional dimensions for standard attributes. Whereas this idea seems to be simple and promising at first glance, a closer look brings up serious problems, especially with select conditions containing arithmetic expressions or select conditions for non-point objects and with Boolean operators like “or” and “not”.
database and expert systems applications | 1996
Andreas Henrich; Dirk Däberitz
Modern system development environments usually deploy the object management facilities of a repository to store the documents created and maintained during system development. In this paper we present a pragmatic approach to express various levels of consistency for the data maintained in such a repository. The levels of consistency are achieved by means of a query language. Consistency constraints are stated as queries searching for inconsistent items in the repository. The queries are handled by a generic analyzer, which checks the defined constraints whenever appropriate. Furthermore, we present a general classification schema for consistency constraints and describe how different classes of constraints are handled by our approach.
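The idea of stating consistency constraints as queries that search for inconsistent items can be sketched as follows. The repository layout, the two constraints and the `check` function are hypothetical and written in plain Python; the paper uses a query language over a repository, which is not reproduced here.

```python
# Hypothetical repository: a flat list of objects with optional links.
repository = [
    {"id": 1, "type": "module", "name": "parser", "spec": 10},
    {"id": 2, "type": "module", "name": "lexer", "spec": None},
    {"id": 10, "type": "spec", "name": "parser-spec"},
]


def modules_without_spec(repo):
    """Query for inconsistent items: modules that have no specification."""
    return [o for o in repo if o["type"] == "module" and o.get("spec") is None]


def dangling_spec_links(repo):
    """Query for inconsistent items: spec links pointing to missing objects."""
    known_ids = {o["id"] for o in repo}
    return [
        o for o in repo
        if o.get("spec") is not None and o["spec"] not in known_ids
    ]


# Each constraint is a named query; an empty result means "consistent".
CONSTRAINTS = {
    "modules_without_spec": modules_without_spec,
    "dangling_spec_links": dangling_spec_links,
}


def check(repo, constraints):
    """Run every constraint query and report the ids of violating items."""
    report = {}
    for name, query in constraints.items():
        violations = query(repo)
        if violations:
            report[name] = [o["id"] for o in violations]
    return report


report = check(repository, CONSTRAINTS)
```

Because a constraint is just a query, new consistency levels can be added by registering further queries without changing the checking code, which matches the role of the generic analyzer described above.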
IEEE Transactions on Knowledge and Data Engineering | 2000
Oliver Haase; Andreas Henrich
An important characteristic of distributed object management systems is that due to network or machine failure, the environment may become partitioned into subenvironments that cannot communicate with each other. In some application scenarios, it is important that the subenvironments remain operable even in this case. In particular, queries should be processed in an appropriate way. To this end, the final and all intermediate results of a query in a distributed object management system must be regarded as potentially vague. We propose a hybrid representation for vague sets and vague multisets designed for this application context. The representation consists of an enumerating part, which contains the elements we could access during query processing, and a descriptive part, which describes the relevant elements we could not access. We introduce propagation rules which can be used to minimize the vagueness of a query result represented in this hybrid way. The main advantage of our approach is that the descriptive part of the representation can be used to improve the enumerating part during query processing.
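A hybrid representation with an enumerating part and a descriptive part might look like the sketch below. Everything in it is an assumption made for illustration: the class `VagueSet`, the use of a Python predicate as the descriptive part, and the two operations `select` and `refine` are hypothetical and are not the propagation rules of the paper.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class VagueSet:
    """Hypothetical hybrid representation of a vague query result."""
    # Enumerating part: elements that could be accessed.
    enumerated: FrozenSet[int]
    # Descriptive part: condition that any relevant element we could
    # not access must satisfy.
    description: Callable[[int], bool]


def select(vs, predicate):
    """Apply a selection to both parts of the vague set."""
    return VagueSet(
        enumerated=frozenset(x for x in vs.enumerated if predicate(x)),
        # Inaccessible elements now also have to satisfy the predicate.
        description=lambda x: vs.description(x) and predicate(x),
    )


def refine(vs, recovered):
    """Use the descriptive part to improve the enumerating part.

    `recovered` holds elements that became accessible later; only those
    matching the description belong to the result.
    """
    matching = frozenset(x for x in recovered if vs.description(x))
    return VagueSet(vs.enumerated | matching, vs.description)


# Elements 1..4 were reachable; unreachable partitions may hold any integer.
partial = VagueSet(frozenset({1, 2, 3, 4}), lambda x: isinstance(x, int))
evens = select(partial, lambda x: x % 2 == 0)
improved = refine(evens, [5, 6])
```

Here the selection narrows the description instead of discarding it, so that elements recovered after a partition heals can be filtered against the accumulated description without re-running the whole query.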