
Sophie Cluet

French Institute for Research in Computer Science and Automation



Publication

Featured research published by Sophie Cluet.

international conference on management of data | 1994

From structured documents to novel query facilities

Vassilis Christophides; Serge Abiteboul; Sophie Cluet; Michel Scholl

Structured documents (e.g., SGML) can benefit a lot from database support and more specifically from object-oriented database (OODB) management systems. This paper describes a natural mapping from SGML documents into OODBs and a formal extension of two OODB query languages (one SQL-like and the other calculus) in order to deal with SGML document retrieval. Although motivated by structured documents, the extensions of query languages that we present are general and useful for a variety of other OODB applications. A key element is the introduction of paths as first class citizens. The new features make it possible to query data (and to some extent schema) without exact knowledge of the schema in a simple and homogeneous fashion.

international conference on management of data | 1998

Your mediators need data conversion

Sophie Cluet; Claude Delobel; Jérôme Siméon; Katarzyna Smaga

Due to the development of the World Wide Web, the integration of heterogeneous data sources has become a major concern of the database community. Appropriate architectures and query languages have been proposed. Yet, the problem of data conversion, which is essential for the development of mediator/wrapper architectures, has remained largely unexplored. In this paper, we present the YAT system for data conversion. This system provides tools for the specification and the implementation of data conversions among heterogeneous data sources. It relies on a middleware model, a declarative language, a customization mechanism and a graphical interface. The model is based on named trees with ordered and labeled nodes. Like semistructured data models, it is simple enough to facilitate the representation of any data. Its main originality is that it allows reasoning at various levels of representation. The YAT conversion language (called YATL) is declarative, rule-based and features enhanced pattern matching facilities and powerful restructuring primitives. It allows the order of collections to be preserved or reconstructed. The customization mechanism relies on program instantiations: an existing program may be instantiated into a more specific one, and then easily modified. We also present the architecture, implementation and practical use of the YAT prototype, currently under evaluation within the OPAL project.

international conference on management of data | 2000

On wrapping query languages and efficient XML integration

Vassilis Christophides; Sophie Cluet; Jérôme Siméon

Modern applications (Web portals, digital libraries, etc.) require integrated access to various information sources (from traditional DBMS to semistructured Web repositories), fast deployment and low maintenance cost in a rapidly evolving environment. Because of its flexibility, there is an increasing interest in using XML as a middleware model for such applications. XML enables fast wrapping and declarative integration. However, query processing in XML-based integration systems is still penalized by the lack of an algebra with adequate optimization properties and the difficulty of understanding source query capabilities. In this paper, we propose an algebraic approach to support efficient XML query evaluation. We define a general purpose algebra suitable for semistructured or XML query languages. We show how this algebra can be used, with appropriate type information, to also wrap more structured query languages such as OQL or SQL. Finally, we develop new optimization techniques for XML-based integration systems.

international conference on management of data | 1992

A general framework for the optimization of object-oriented queries

Sophie Cluet; Claude Delobel

The goal of this work is to integrate in a general framework the different query optimization techniques that have been proposed in the object-oriented context. As a first step, we focus essentially on the logical aspect of query optimization. In this paper, we propose a formalism (i) that unifies different rewriting formalisms, (ii) that allows easy and exhaustive factorization of duplicated subqueries, and (iii) that supports heuristics in order to reduce the optimization rewriting phase.

Theoretical Computer Science | 2002

Correspondence and translation for heterogeneous data

Serge Abiteboul; Sophie Cluet; Tova Milo

Data integration often requires a clean abstraction of the different formats in which data are stored, and means for specifying the correspondences/relationships between data in different worlds and for translating data from one world to another. For that, we introduce in this paper a middleware data model that serves as a basis for the integration task, and a declarative rules language for specifying the integration. We show that using the language, correspondences between data elements can be computed in polynomial time in many cases, and may require exponential time only when insensitivity to order or duplicates are considered. Furthermore, we show that in most practical cases the correspondence rules can be automatically turned into translation rules to map data from one representation to another. Thus, a complete integration task (derivation of correspondences, transformation of data from one world to the other, incremental integration of a new bulk of data, etc.) can be specified using a single set of declarative rules.

very large data bases | 2002

Views in a Large Scale XML Repository

Vincent Aguilera; Sophie Cluet; Tova Milo; Pierangelo Veltri; Dan Vodislav

We are interested in defining and querying views in a huge and highly heterogeneous XML repository (Web scale). In this context, view definitions are very large, involving lots of sources, and there is no apparent limitation to their size. This raises interesting problems that we address in the paper: (i) how to distribute views over several machines without having a negative impact on the query translation process; (ii) how to quickly select the relevant part of a view given a query; (iii) how to minimize the cost of communicating potentially large queries to the machines where they will be evaluated. The solution that we propose is based on a simple view definition language that allows for automatic generation of views. The language maps paths in the view abstract DTD to paths in the concrete source DTDs. It enables a distributed implementation of the view system that is scalable both in terms of data and load. In particular, the query translation algorithm is shown to have a good (linear) complexity.
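
To make the idea of mapping abstract paths to concrete source paths more tangible, here is a minimal sketch. It is not the paper's view definition language or translation algorithm; the `VIEW` table, the source names and the `translate` function are all hypothetical, chosen only to show how a path-to-path mapping lets a query over the abstract DTD be rewritten per source with a single pass over the query's paths.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical view definition: each abstract path maps to a list of
# (source name, concrete path) pairs. None of these names come from the paper.
VIEW: Dict[str, List[Tuple[str, str]]] = {
    "culture/painting/title": [
        ("museum_a", "catalog/work/name"),
        ("museum_b", "collection/item/title"),
    ],
    "culture/painting/painter": [
        ("museum_a", "catalog/work/artist"),
        ("museum_b", "collection/item/author"),
    ],
    "culture/painting/year": [
        ("museum_a", "catalog/work/date"),
    ],
}


def translate(query_paths: List[str]) -> Dict[str, Dict[str, str]]:
    """Rewrite abstract query paths into concrete paths, grouped by source.

    Only sources that can answer every path of the query are kept.
    """
    wanted = set(query_paths)
    per_source: Dict[str, Dict[str, str]] = defaultdict(dict)
    for abstract in wanted:
        for source, concrete in VIEW.get(abstract, []):
            per_source[source][abstract] = concrete
    return {s: m for s, m in per_source.items() if len(m) == len(wanted)}


# A query touching title and year can only be sent to museum_a in this toy view.
plan = translate(["culture/painting/title", "culture/painting/year"])
```

Keeping the mapping keyed by abstract path means the relevant part of the view is found by direct lookup, which is the kind of selection problem item (ii) of the abstract refers to.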

international conference on management of data | 1996

Evaluating queries with generalized path expressions

Vassilis Christophides; Sophie Cluet; Guido Moerkotte

In the past few years, query languages featuring generalized path expressions have been proposed. These languages allow the interrogation of both data and structure. They are powerful and essential for a number of applications. However, until now, their evaluation has relied on a rather naive and inefficient algorithm. In this paper, we extend an object algebra with two new operators and present some interesting rewriting techniques for queries featuring generalized path expressions. We also show how a query optimizer can integrate the new techniques.
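
As a rough illustration of what a generalized path expression asks for, the sketch below enumerates every path in a nested Python structure and keeps those ending in a given label. This is an assumed, deliberately naive evaluation strategy of the exhaustive kind the abstract calls inefficient, not the algebraic technique the paper proposes; the function names and the sample document are hypothetical.

```python
from typing import Any, Iterator, List, Tuple

Path = Tuple[str, ...]


def all_paths(node: Any, prefix: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every node reachable from `node`, depth first."""
    yield prefix, node
    if isinstance(node, dict):
        for label, child in node.items():
            yield from all_paths(child, prefix + (label,))
    elif isinstance(node, list):
        # List positions are treated as path steps so that order is visible.
        for index, child in enumerate(node):
            yield from all_paths(child, prefix + (str(index),))


def find_by_last_label(root: Any, label: str) -> List[Tuple[Path, Any]]:
    """Return every (path, value) whose final step equals `label`.

    The caller needs no knowledge of where in the structure `label` occurs.
    """
    return [(p, v) for p, v in all_paths(root) if p and p[-1] == label]


# Hypothetical sample data, standing in for a structured document.
document = {
    "article": {
        "title": "Sample",
        "sections": [
            {"title": "Intro", "body": "..."},
            {"title": "Method", "body": "..."},
        ],
    }
}

# Finds titles at any depth, returning the path to each one as a value.
matches = find_by_last_label(document, "title")
```

The query returns paths alongside values, so structure is interrogated together with data; the cost is a full traversal of the document for every query.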

Information Systems | 1998

Designing OQL: allowing objects to be queried

Sophie Cluet

This paper tells the story of OQL, the standard query language of the Object Database Management Group (ODMG) [30]. The story starts in 1988, at INRIA in the Altair Group. The objective of that group was to develop an object-oriented database system [41]. This objective was reached: in September 1991 the O2 database system started its commercial career as the main product of a company called O2 Technology [6]. As opposed to its competitors, O2 featured a full-fledged query language named O2SQL [22]. The story goes on with the creation of the ODMG in 1991 and the adoption of O2SQL as the standard object query language under its new and final name: OQL. During the following years, OQL went through some modifications, the most important of which resulted in OQL 1.2 that offers some level of compliance with SQL92. On top of providing the expressive power of the SQL92 query language [54], OQL allows objects to be queried. This is a claim also supported by the upcoming SQL3. However, due to its adequacy to the object oriented type system and its functional nature, OQL is much simpler to learn, use and implement. A goal of this paper is to demonstrate this. This paper tells about the mistakes and pertinent choices we made while designing and implementing OQL. I hope it also conveys the great pleasure I had to be part of this adventure.

international conference on database theory | 1997

Correspondence and Translation for Heterogeneous Data

Serge Abiteboul; Sophie Cluet; Tova Milo

We presented a specification of the integration of heterogeneous data based on correspondence rules. We showed how a unique specification can serve many purposes (including two-way translation) assuming some reasonable restrictions. We claim that the framework and restrictions are acceptable in practice, and in particular one can show that all the document-OODB correspondences/translations of [2, 3] are covered. We are currently working on further substantiating this by more experimentation.

Computer Networks | 2002

The Xyleme project

Serge Abiteboul; Sophie Cluet; Guy Ferran; Marie-Christine Rousset

The current development of the Web and the generalization of XML technology [http://www.w3.org/] provide a major opportunity to radically change the management of distributed data. We have developed a prototype of a dynamic warehouse for XML data, namely Xyleme. In the present paper, we briefly present some motivation and important aspects of the work performed in the framework of the Xyleme project. (A short preliminary version of this paper appeared in [IEEE Data Engng. Bull. 24 (2) (2001) 40].) The project was completed at the end of 2000 with the implementation of a prototype. The prototype was then turned into a product by a start-up company also called Xyleme [http://www.xyleme.com/].
