Zoé Lacroix
Arizona State University
Publication
Featured research published by Zoé Lacroix.
conference on information and knowledge management | 2001
Ying Guang Li; Stéphane Bressan; Gillian Dobbie; Zoé Lacroix; Mong Li Lee; Ullas Nambiar; Bimlesh Wadhwa
If XML is to play the critical role of the lingua franca for Internet data interchange that many predict, it is necessary to start designing and adopting benchmarks allowing the comparative performance analysis of the tools being developed and proposed. The effectiveness of existing XML query languages has been studied by many, with a focus on the comparison of linguistic features, implicitly reflecting the fact that most XML tools exist only on paper. In this paper, with a focus on efficiency and concreteness, we propose a pragmatic first step toward the systematic benchmarking of XML query processing platforms, with an initial focus on the data (versus document) point of view. We propose XOO7, an XML version of the OO7 benchmark. We discuss the applicability of XOO7, its strengths, its limitations, and the extensions we are considering. We illustrate its use by presenting and discussing a performance comparison, using XOO7, of three different XML query processing platforms.
international conference of the ieee engineering in medicine and biology society | 2002
Zoé Lacroix
Scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data access, analysis, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data generated by software. We present an approach to wrapping Web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: it first sends a query to the source to retrieve data and, second, builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component, based on an intermediate object view mechanism called search views that maps the source capabilities to attributes, and an Extensible Markup Language (XML) engine, which perform these two tasks respectively. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities, and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of a multidatabase system supporting queries via uniform Object Protocol Model (OPM) interfaces.
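The two wrapper tasks described in this abstract can be illustrated with a minimal sketch; all class and attribute names here are ours, not taken from the paper, and the "source" is a toy in-memory list standing in for a flat file, database, or Web source.

```python
import xml.etree.ElementTree as ET

class SearchViewWrapper:
    """Sketch of a wrapper with a retrieval step and an XML-building step.
    Task 1 exposes the source's limited query capability as attribute
    selection; task 2 shapes the result into the virtual XML structure."""

    def __init__(self, source):
        # `source` stands in for any wrapped data source; here it is
        # just a list of attribute dictionaries.
        self.source = source

    def retrieve(self, **criteria):
        # Task 1: send a "query" to the source to retrieve matching data.
        return [rec for rec in self.source
                if all(rec.get(k) == v for k, v in criteria.items())]

    def to_xml(self, records, root_tag="results"):
        # Task 2: build the expected output w.r.t. the virtual structure.
        root = ET.Element(root_tag)
        for rec in records:
            item = ET.SubElement(root, "record")
            for key, value in rec.items():
                ET.SubElement(item, key).text = str(value)
        return ET.tostring(root, encoding="unicode")

source = [{"gene": "BRCA1", "organism": "human"},
          {"gene": "TP53", "organism": "human"}]
wrapper = SearchViewWrapper(source)
xml_out = wrapper.to_xml(wrapper.retrieve(gene="BRCA1"))
```

The separation mirrors the paper's design: the retrieval component could be swapped per source while the XML engine stays generic.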
advances in geographic information systems | 2002
Omar Boucelma; Mehdi Essid; Zoé Lacroix
The proliferation of spatial data on the Internet is beginning to allow much wider access to data currently available in various Geographic Information Systems (GIS). In order to move to a real Web-based community where geographical data can be accessed and exchanged, we need to provide flexible and powerful GIS data integration solutions. Indeed, GIS are highly heterogeneous: not only do they differ in their data representations, but they also offer radically different query languages. A GIS mediation approach should provide (1) an integrated view of the data supplied by all sources, and (2) a geographical query language to access and manipulate integrated data. In this paper we propose an approach that not only focuses on data integration, but also addresses the integration of the query capabilities available at the sources. A GIS may provide a query capability nonexistent at another GIS, or two query capabilities may be similar but with slightly different semantics. We introduce the notion of derived wrappers, which capture additional query capabilities either to compensate for capabilities lacking at a source, or to adjust an existing capability so as to make it homogeneous with similar capabilities wrapped at other sources. Finally, we describe an implementation of the presented approach that complies with the OpenGIS WFS recommendation.
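The derived-wrapper idea can be sketched in a few lines: when a source lacks a capability, a derived wrapper synthesizes it from the capabilities the source does export. This is an illustrative toy, not the paper's implementation; all names and the point-feature model are assumptions.

```python
class BasicGISWrapper:
    """Wraps a hypothetical source that only exports bounding-box selection."""

    def __init__(self, features):
        self.features = features  # {name: (x, y)} point features

    def in_bbox(self, xmin, ymin, xmax, ymax):
        # The only capability the underlying source provides.
        return {n: p for n, p in self.features.items()
                if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax}

class DerivedWrapper(BasicGISWrapper):
    """Adds a within-distance capability missing at the source by
    post-filtering a bounding-box pre-selection."""

    def within_distance(self, x, y, d):
        # Compensate for the missing capability using the existing one:
        # pre-select with a bbox, then filter by exact distance.
        candidates = self.in_bbox(x - d, y - d, x + d, y + d)
        return {n: p for n, p in candidates.items()
                if (p[0] - x) ** 2 + (p[1] - y) ** 2 <= d * d}

gis = DerivedWrapper({"A": (0, 0), "B": (3, 4), "C": (10, 10)})
near = gis.within_distance(0, 0, 5)  # A and B qualify; C is too far
```

A derived wrapper of this shape also lets the mediator present one homogeneous `within_distance` operation across sources whose native semantics differ.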
bioinformatics and bioengineering | 2001
Barbara A. Eckman; Zoé Lacroix; Louiqa Raschid
Today, scientific data is inevitably digitized, stored in a variety of heterogeneous formats, and accessible over the Internet. Scientists need to access an integrated view of multiple remote or local heterogeneous data sources. They then integrate the results of complex queries and apply further analysis and visualization to support the task of scientific discovery. Building a digital library for scientific discovery requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data that is locally materialized in warehouses or is generated by software. We consider several tasks to provide optimized and seamless integration of biomolecular data. Challenges to be addressed include capturing and representing source capabilities; developing a methodology to acquire and represent metadata about source contents and access costs; providing decision support to select sources and capabilities using cost-based and semantic knowledge; and generating low-cost query evaluation plans.
data integration in the life sciences | 2004
Zoé Lacroix; Louiqa Raschid; Maria-Esther Vidal
Life science data sources represent a complex link-driven federation of publicly available, Web-accessible sources. A fundamental need for scientists today is the ability to completely explore all relationships between scientific classes, e.g., genes and citations, that may be retrieved from various data sources. A challenge to such exploration is that each path between data sources potentially has different domain-specific semantics and yields a different benefit to the scientist. Thus, it is important to efficiently explore paths so as to generate those with the highest benefit. In this paper, we explore the search space of paths that satisfy queries expressed as regular expressions. We propose an algorithm, ESearch, that runs in time polynomial in the size of the graph when the graph is acyclic. We present expressions to determine the benefit of a path based on metadata (statistics). We develop a heuristic search, OnlyBestXX%, and compare it against ESearch.
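The core idea of matching inter-source paths against a regular expression can be sketched as follows. This is not the ESearch algorithm itself (which is polynomial on acyclic graphs; naive enumeration is not), just a minimal illustration of the query model; the graph, labels, and pattern are invented for the example.

```python
import re

def find_paths(graph, start, targets, pattern):
    """Enumerate paths in an acyclic labeled graph and keep those whose
    edge-label sequence matches the regular expression `pattern`.
    graph: {node: [(label, next_node), ...]}"""
    regex = re.compile(pattern)
    results = []

    def dfs(node, labels):
        # Record the path if it reaches a target and its labels match.
        if node in targets and labels and regex.fullmatch(" ".join(labels)):
            results.append(" ".join(labels))
        for label, nxt in graph.get(node, []):
            dfs(nxt, labels + [label])

    dfs(start, [])
    return results

# Hypothetical federation: genes link to citations directly or via sequences.
graph = {
    "gene":     [("annotates", "sequence"), ("cites", "citation")],
    "sequence": [("cites", "citation")],
}
paths = find_paths(graph, "gene", {"citation"}, r"(annotates )?cites")
# Two qualifying paths: "cites" and "annotates cites"
```

Each returned path would then be scored with the benefit expressions the paper derives from source statistics, so that only the highest-benefit paths are evaluated.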
IEEE Internet Computing | 2002
Ullas Nambiar; Zoé Lacroix; Stéphane Bressan; Mong Li Lee; Yingguang Li
The Extensible Markup Language has become the standard for information interchange on the Web. We study the data- and document-centric uses of XML management systems (XMLMS). We want to provide XML data users with a guideline for choosing the data management system that best meets their needs. Because the systems we test are first-generation approaches, we suggest a hypothetical design for a useful XML database that could use all the expressive power of XML and XML query languages.
very large data bases | 2003
Stéphane Bressan; Mong Li Lee; Ying Guang Li; Zoé Lacroix; Ullas Nambiar
As XML becomes the standard for electronic data interchange, benchmarks are needed to provide a comparative performance analysis of XML Management Systems (XMLMS). Typically, a benchmark should adhere to four criteria: relevance, portability, scalability, and simplicity [1]. The data structure of a benchmark for XML must be complex enough to capture the characteristics of XML data representation. Data sets should be available in various sizes. Benchmark queries should be defined using only the primitives of the language.
data integration in the life sciences | 2004
Zoé Lacroix; Hyma Murthy; Felix Naumann; Louiqa Raschid
An abundance of biological data sources contain data on classes of scientific entities, such as genes and sequences. Logical relationships between scientific objects are implemented as URLs and foreign IDs. Query processing typically involves traversing links and paths (concatenation of links) through these sources. We model the data objects in these sources and the links between objects as an object graph. Analogous to database cost models, we use samples and statistics from the object graph to develop a framework to estimate the result size for a query on the object graph.
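The cost-model analogy above can be made concrete with a minimal sketch: given per-link statistics sampled from the object graph (average out-degree per link type), the expected result size of a path query is the start cardinality multiplied by the fan-out of each traversed link. The link names and numbers below are illustrative, not taken from the paper.

```python
def estimate_result_size(start_count, path, avg_out_degree):
    """Estimate the number of objects reached by following `path`
    (a list of link types) from `start_count` starting objects,
    using average out-degrees sampled from the object graph."""
    size = float(start_count)
    for link in path:
        size *= avg_out_degree[link]  # expected fan-out of this link type
    return size

# Hypothetical statistics sampled from an object graph:
stats = {"gene->sequence": 2.5, "sequence->citation": 4.0}
estimate = estimate_result_size(
    100, ["gene->sequence", "sequence->citation"], stats)
# 100 genes * 2.5 sequences/gene * 4.0 citations/sequence = 1000.0
```

Real estimators must also account for overlap (distinct objects reached via multiple links), which is where the sampled statistics beyond simple averages come in.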
data and knowledge engineering | 2005
Stéphane Bressan; Barbara Catania; Zoé Lacroix; Ying Guang Li; Anna Maddalena
Some XML query processors operate on an internal representation of XML documents and can leverage neither the XML storage structure nor the possible access methods dedicated to this storage structure. Such query processors are often used in organizations that usually process transient XML documents received from other organizations. In this paper, we propose a different approach to accelerating query execution on XML source documents in such environments. The approach is based on the notion of query equivalence of XML documents with respect to a query. Under this equivalence, we propose two different document transformation strategies which prune parts of the documents irrelevant to the query, just before executing the query itself. The proposed transformations are implemented and evaluated using a two-level index structure: a structural directory capturing document paths and an inverted index of tag offsets.
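The pruning idea can be illustrated with a toy transformation: before running a query that only touches certain tags, drop every subtree containing none of them, yielding a smaller document that is query-equivalent for that query. The paper's transformations work through a structural directory and a tag-offset index; this sketch simply walks an in-memory tree, and the document and tag set are invented.

```python
import xml.etree.ElementTree as ET

def subtree_has(elem, keep_tags):
    """True if `elem` or any descendant carries a tag from keep_tags."""
    return elem.tag in keep_tags or any(subtree_has(c, keep_tags) for c in elem)

def prune(elem, keep_tags):
    """Remove children whose subtrees are irrelevant to the query tags."""
    for child in list(elem):
        if not subtree_has(child, keep_tags):
            elem.remove(child)  # nothing the query needs lives here
        else:
            prune(child, keep_tags)
    return elem

doc = ET.fromstring(
    "<lib><book><title>XML</title><price>10</price></book>"
    "<cd><artist>X</artist></cd></lib>")
pruned = prune(doc, {"title"})
# Only the <book>/<title> branch survives; <cd> and <price> are removed.
```

A query over `title` elements returns the same answer on `pruned` as on `doc`, which is exactly the query-equivalence property the transformations rely on.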
statistical and scientific database management | 2006
Pierre Tufféry; Zoé Lacroix; Hervé Ménager
We present a semantic map of resources for structural bioinformatics applied to proteins, i.e., various methods to predict and analyze protein structures in silico. Our map depicts resources on two levels: a logical level, which provides a high-level description of the scientific concepts using a domain ontology, and a physical level, which describes the actual resources implementing these connections. Scientists can use our system to express a query that captures their scientific aim, and are guided to identify the resources that best meet their needs. The system is intended to provide scientists with a tool to register and share knowledge about the services available in this field. Our approach addresses the problem of semantic interoperability of scientific resources publicly available on the Web.