Juliana Freire | Researchain


Publication

Featured research published by Juliana Freire.

International Conference on Data Engineering | 2002

From XML schema to relations: a cost-based approach to XML storage

Philip Bohannon; Juliana Freire; Prasan Roy; Jérôme Siméon

As Web applications manipulate an increasing amount of XML, there is a growing interest in storing XML data in relational databases. Due to the mismatch between the complexity of XML's tree structure and the simplicity of flat relational tables, there are many ways to store the same document in an RDBMS, and a number of heuristic techniques have been proposed. These techniques typically define fixed mappings and do not take application characteristics into account. However, a fixed mapping is unlikely to work well for all possible applications. In contrast, LegoDB is a cost-based XML storage mapping engine that explores a space of possible XML-to-relational mappings and selects the best mapping for a given application. LegoDB leverages current XML and relational technologies: (1) it models the target application with an XML Schema, XML data statistics, and an XQuery workload; (2) the space of configurations is generated through XML-Schema rewritings; and (3) the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer. We describe the LegoDB storage engine and provide experimental results that demonstrate the effectiveness of this approach.

International Conference on Management of Data | 2006

VisTrails: visualization meets data management

Steven P. Callahan; Juliana Freire; Emanuele Santos; Carlos Eduardo Scheidegger; Cláudio T. Silva; Huy T. Vo

Scientists are now faced with an incredible volume of data to analyze. To successfully analyze and validate various hypotheses, it is necessary to pose several queries, correlate disparate data, and create insightful visualizations of both the simulated processes and observed phenomena. Often, insight comes from comparing the results of multiple visualizations. Unfortunately, today this process is far from interactive and contains many error-prone and time-consuming tasks. As a result, the generation and maintenance of visualizations is a major bottleneck in the scientific process, hindering both the ability to mine scientific data and the actual use of the data. The VisTrails system represents our initial attempt to improve the scientific discovery process and reduce the time to insight. In VisTrails, we address the problem of visualization from a data management perspective: VisTrails manages the data and metadata of a visualization product. In this demonstration, we show the power and flexibility of our system by presenting actual scenarios in which scientific visualization is used and showing how our system improves usability, enables reproducibility, and greatly reduces the time required to create scientific visualizations.

International Conference on Management of Data | 2008

Provenance and scientific workflows: challenges and opportunities

Susan B. Davidson; Juliana Freire

Provenance in the context of workflows, both for the data they derive and for their specification, is an essential component to allow for result reproducibility, sharing, and knowledge re-use in the scientific community. Several workshops have been held on the topic, and it has been the focus of many research projects and prototype systems. This tutorial provides an overview of research issues in provenance for scientific workflows, with a focus on recent literature and technology in this area. It is aimed at a general database research audience and at people who work with scientific data and workflows. We will (1) provide a general overview of scientific workflows; (2) describe research on provenance for scientific workflows and show in detail how provenance is supported in existing systems; (3) discuss emerging applications that are enabled by provenance; and (4) outline open problems and new directions for database-related research.

Computing in Science and Engineering | 2008

Provenance for Computational Tasks: A Survey

Juliana Freire; David Koop; Emanuele Santos; Cláudio T. Silva

The problem of systematically capturing and managing provenance for computational tasks has recently received significant attention because of its relevance to a wide range of domains and applications. The authors give an overview of important concepts related to provenance management, so that potential users can make informed decisions when selecting or designing a provenance solution.

IEEE Visualization | 2005

VisTrails: enabling interactive multiple-view visualizations

Louis Bavoil; Steven P. Callahan; Patricia Crossno; Juliana Freire; Carlos Eduardo Scheidegger; Cláudio T. Silva; Huy T. Vo

VisTrails is a new system that enables interactive multiple-view visualizations by simplifying the creation and maintenance of visualization pipelines, and by optimizing their execution. It provides a general infrastructure that can be combined with existing visualization systems and libraries. A key component of VisTrails is the visualization trail (vistrail), a formal specification of a pipeline. Unlike existing dataflow-based systems, in VisTrails there is a clear separation between the specification of a pipeline and its execution instances. This separation enables powerful scripting capabilities and provides a scalable mechanism for generating a large number of visualizations. VisTrails also leverages the vistrail specification to identify and avoid redundant operations. This optimization is especially useful while exploring multiple visualizations. When variations of the same pipeline need to be executed, substantial speedups can be obtained by caching the results of overlapping subsequences of the pipelines. In this paper, we describe the design and implementation of VisTrails, and show its effectiveness in different application scenarios.
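The caching idea in this abstract can be illustrated with a minimal Python sketch. It is not the VisTrails implementation: the `CachingExecutor` class, the representation of a pipeline as a list of named steps, and the choice to key the cache on the input plus the sequence of step names are all assumptions made for illustration. The sketch also assumes that a step name uniquely identifies an operation and its parameters.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a pipeline is a list of (name, function) steps.
Step = Tuple[str, Callable[[object], object]]


class CachingExecutor:
    """Runs linear pipelines and reuses results of shared leading subsequences."""

    def __init__(self) -> None:
        # Cache key: the input's repr followed by the step names applied so far.
        self.cache: Dict[Tuple[str, ...], object] = {}
        # Names of the steps that were actually executed (not served from cache).
        self.executed: List[str] = []

    def run(self, pipeline: List[Step], source: object) -> object:
        value = source
        prefix: Tuple[str, ...] = (repr(source),)
        for name, fn in pipeline:
            prefix = prefix + (name,)
            if prefix in self.cache:
                # An earlier pipeline already computed this subsequence.
                value = self.cache[prefix]
            else:
                value = fn(value)
                self.cache[prefix] = value
                self.executed.append(name)
        return value


if __name__ == "__main__":
    executor = CachingExecutor()
    load = ("load", lambda n: list(range(n)))
    scale = ("scale", lambda xs: [2 * x for x in xs])
    # Two variations of the same pipeline share the "load" and "scale" steps.
    print(executor.run([load, scale, ("sum", sum)], 4))
    print(executor.run([load, scale, ("max", max)], 4))
    print(executor.executed)
```

In the second run only the final step is executed, because the two shared leading steps are served from the cache. Keying on the specification rather than on an execution instance is what makes this reuse possible.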

International Provenance and Annotation Workshop | 2006

Managing rapidly-evolving scientific workflows

Juliana Freire; Cláudio T. Silva; Steven P. Callahan; Emanuele Santos; Carlos Eduardo Scheidegger; Huy T. Vo

We give an overview of VisTrails, a system that provides an infrastructure for systematically capturing detailed provenance and streamlining the data exploration process. A key feature that sets VisTrails apart from previous visualization and scientific workflow systems is a novel action-based mechanism that uniformly captures provenance for data products and workflows used to generate these products. This mechanism not only ensures reproducibility of results, but it also simplifies data exploration by allowing scientists to easily navigate through the space of workflows and parameter settings for an exploration task.
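One way to picture an action-based provenance mechanism is a version tree in which each node stores only the change made to its parent, and a workflow is rebuilt by replaying the changes from the root. The Python sketch below follows that reading of the abstract. The `Action` and `VersionTree` names, the three action kinds, and the dictionary representation of a workflow are hypothetical and are not taken from VisTrails itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Action:
    """A single change to a workflow (hypothetical action vocabulary)."""
    kind: str  # "add_module", "delete_module", or "set_param"
    module: str
    param: Optional[str] = None
    value: Optional[object] = None


class VersionTree:
    """Stores workflow versions as actions relative to a parent version."""

    def __init__(self) -> None:
        # Version 0 is the empty workflow and has no parent.
        self.parent: Dict[int, Optional[int]] = {0: None}
        self.action: Dict[int, Action] = {}

    def commit(self, parent: int, action: Action) -> int:
        """Record an action applied to `parent` and return the new version id."""
        if parent not in self.parent:
            raise KeyError(f"unknown parent version {parent}")
        version = len(self.parent)
        self.parent[version] = parent
        self.action[version] = action
        return version

    def materialize(self, version: int) -> Dict[str, Dict[str, object]]:
        """Rebuild a workflow by replaying the actions from the root."""
        path: List[Action] = []
        current = version
        while current != 0:
            path.append(self.action[current])
            current = self.parent[current]
        workflow: Dict[str, Dict[str, object]] = {}
        for act in reversed(path):
            if act.kind == "add_module":
                workflow[act.module] = {}
            elif act.kind == "delete_module":
                workflow.pop(act.module, None)
            elif act.kind == "set_param":
                workflow[act.module][act.param] = act.value
            else:
                raise ValueError(f"unknown action kind: {act.kind}")
        return workflow


if __name__ == "__main__":
    tree = VersionTree()
    v1 = tree.commit(0, Action("add_module", "Isosurface"))
    # Two branches explore different parameter settings from the same parent.
    v2 = tree.commit(v1, Action("set_param", "Isosurface", "level", 0.5))
    v3 = tree.commit(v1, Action("set_param", "Isosurface", "level", 0.8))
    print(tree.materialize(v2))
    print(tree.materialize(v3))
```

Because every version remains reachable in the tree, earlier explorations are never overwritten, and moving between parameter settings amounts to selecting a different node.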

IEEE Transactions on Visualization and Computer Graphics | 2013

Visual Exploration of Big Spatio-Temporal Urban Data: A Study of New York City Taxi Trips

Nivan Ferreira; Jorge Poco; Huy T. Vo; Juliana Freire; Cláudio T. Silva

As increasing volumes of urban data are captured and become available, new opportunities arise for data-driven analysis that can lead to improvements in the lives of citizens through evidence-based decision making and policies. In this paper, we focus on a particularly important urban data set: taxi trips. Taxis are valuable sensors and information associated with taxi trips can provide unprecedented insight into many different aspects of city life, from economic activity and human behavior to mobility patterns. But analyzing these data presents many challenges. The data are complex, containing geographical and temporal components in addition to multiple variables associated with each trip. Consequently, it is hard to specify exploratory queries and to perform comparative analyses (e.g., compare different regions over time). This problem is compounded by the size of the data: there are on average 500,000 taxi trips each day in NYC. We propose a new model that allows users to visually query taxi trips. Besides standard analytics queries, the model supports origin-destination queries that enable the study of mobility across the city. We show that this model is able to express a wide range of spatio-temporal queries, and it is also flexible in that not only can queries be composed but also different aggregations and visual representations can be applied, allowing users to explore and compare results. We have built a scalable system that implements this model which supports interactive response times; makes use of an adaptive level-of-detail rendering strategy to generate clutter-free visualization for large results; and shows hidden details to the users in a summary through the use of overlay heat maps. We present a series of case studies motivated by traffic engineers and economists that show how our model and system enable domain experts to perform tasks that were previously unattainable for them.

Journal of Statistical Mechanics: Theory and Experiment | 2011

The ALPS project release 2.0: open source software for strongly correlated systems

Bela Bauer; Lincoln D. Carr; Hans Gerd Evertz; Adrian E. Feiguin; Juliana Freire; Sebastian Fuchs; Lukas Gamper; Jan Gukelberger; Emanuel Gull; S Guertler; A Hehn; R Igarashi; Sergei V. Isakov; David Koop; Pn Ma; P Mates; Haruhiko Matsuo; Olivier Parcollet; G Pawłowski; Jd Picon; Lode Pollet; Emanuele Santos; V. W. Scarola; Ulrich Schollwöck; Cláudio T. Silva; Brigitte Surer; Synge Todo; Simon Trebst; Matthias Troyer; Michael L. Wall

We present release 2.0 of the ALPS (Algorithms and Libraries for Physics Simulations) project, an open source software project to develop libraries and application programs for the simulation of strongly correlated quantum lattice models such as quantum magnets, lattice bosons, and strongly correlated fermion systems. The code development is centered on common XML and HDF5 data formats, libraries to simplify and speed up code development, common evaluation and plotting tools, and simulation programs. The programs enable non-experts to start carrying out serial or parallel numerical simulations by providing basic implementations of the important algorithms for quantum lattice models: classical and quantum Monte Carlo (QMC) using non-local updates, extended ensemble simulations, exact and full diagonalization (ED), the density matrix renormalization group (DMRG) both in a static version and a dynamic time-evolving block decimation (TEBD) code, and quantum Monte Carlo solvers for dynamical mean field theory (DMFT). The ALPS libraries provide a powerful framework for programmers to develop their own applications, which, for instance, greatly simplify the steps of porting a serial code onto a parallel, distributed memory machine. Major changes in release 2.0 include the use of HDF5 for binary data, evaluation tools in Python, support for the Windows operating system, the use of CMake as build system and binary installation packages for Mac OS X and Windows, and integration with the VisTrails workflow provenance tool. The software is available from our web server at http://alps.comp-phys.org/.

International World Wide Web Conferences | 2001

WebViews: accessing personalized web content and services

Juliana Freire; Bharat Kumar; Daniel F. Lieuwen

The ability to take information, entertainment and e-commerce on the go has great promise. However, the existing Web infrastructure and content were designed for desktop computers and are not well-suited for other types of accesses, e.g., devices that have less processing power and memory, small screens, and limited input facilities, or through wireless data networks with low bandwidth and high latency. Thus, there is a growing need for techniques that provide alternative means to access Web content and services, be it the ability to browse the Web through a wireless PDA or smart phone, or hands-free access through voice interfaces. In this paper, we discuss issues involved in making existing Web content and services available for diverse environments, and describe WebViews, a system that allows casual Web users to easily create customized views of Web sites that are well-suited for different types of terminals. In particular, we describe our approach to provide voice access to these Web views and experiences in building the system.

International Conference on Management of Data | 2002

StatiX: making XML count

Juliana Freire; Jayant R. Haritsa; Maya Ramanath; Prasan Roy; Jérôme Siméon

The availability of summary data for XML documents has many applications, from providing users with quick feedback about their queries, to cost-based storage design and query optimization. StatiX is a novel XML Schema-aware statistics framework that exploits the structure derived by regular expressions (which define elements in an XML Schema) to pinpoint places in the schema that are likely sources of structural skew. As we discuss below, this information can be used to build concise, yet accurate, statistical summaries for XML data. StatiX leverages standard XML technology for gathering statistics, notably XML Schema validators, and it uses histograms to summarize both the structure and values in an XML document. In this paper we describe the StatiX system. We develop algorithms that decompose schemas to obtain statistics at different granularities and discuss how statistics can be gathered as documents are validated. We also present an experimental evaluation which demonstrates the accuracy and scalability of our approach and show an application of these statistics to cost-based XML storage design.
