Publication


Featured research published by Stefan Pröll.


International Conference on Big Data | 2013

Scalable data citation in dynamic, large databases: Model and reference implementation

Stefan Pröll; Andreas Rauber

Uniquely and precisely identifying and citing arbitrary subsets of data is essential in many settings, e.g. to facilitate experiment validation and data re-use in meta-studies. Current approaches that rely on pointers to entire data collections or on explicit copies of data do not scale. We propose a novel approach relying on persistent, timestamped, adapted queries to versioned and timestamped data sources. Result set hashes are used to validate correctness on later re-execution. The proposed method works for static as well as dynamically growing or changing data. Alternative implementation styles for relational databases are presented and evaluated with regard to performance and impact on existing applications, while aiming at minimal to no additional effort for data users. The approach is validated in an infrastructure monitoring domain relying on sensor data networks.
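The idea in this abstract (a stored, timestamped query over versioned data, checked by a result set hash) can be illustrated with a minimal sketch. This is not the authors' reference implementation. The `readings` table, the `valid_from`/`valid_to` columns, the integer timestamps, and the helper names `result_hash` and `run_at` are all assumptions made for illustration.

```python
import hashlib
import sqlite3


def result_hash(rows):
    """SHA-256 over the rows in the order they were returned."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def run_at(conn, query, ts):
    """Execute a stored query against the data as it was valid at time ts."""
    return conn.execute(query, {"ts": ts}).fetchall()


conn = sqlite3.connect(":memory:")
# Hypothetical versioned table: rows are never overwritten; an old version
# is closed by setting valid_to, and valid_to = NULL marks the current one.
conn.execute(
    "CREATE TABLE readings ("
    "sensor_id INTEGER, value REAL, valid_from INTEGER, valid_to INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [(1, 20.5, 1, None), (2, 18.0, 1, None)],
)

# The query is adapted to select only versions valid at :ts, with a fixed
# sort order so that the result set hash is stable across executions.
QUERY = (
    "SELECT sensor_id, value FROM readings "
    "WHERE valid_from <= :ts AND (valid_to IS NULL OR valid_to > :ts) "
    "ORDER BY sensor_id"
)

# Cite the subset at time 5: store the query, its timestamp, and the hash.
citation = {
    "query": QUERY,
    "timestamp": 5,
    "hash": result_hash(run_at(conn, QUERY, 5)),
}

# At time 10 the data changes: sensor 2 is updated and sensor 3 is added.
conn.execute(
    "UPDATE readings SET valid_to = 10 WHERE sensor_id = 2 AND valid_to IS NULL"
)
conn.execute("INSERT INTO readings VALUES (2, 19.5, 10, NULL)")
conn.execute("INSERT INTO readings VALUES (3, 21.0, 10, NULL)")

# Re-executing the cited query with its original timestamp returns the
# original subset, and the stored hash confirms it.
rerun = run_at(conn, citation["query"], citation["timestamp"])
verified = result_hash(rerun) == citation["hash"]

# The same query at a later time reflects the changed data.
current = run_at(conn, QUERY, 12)
```

In this sketch only the query text, the timestamp, and the hash are stored per citation, so no copy of the subset itself is needed.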


International Conference on Data Technologies and Applications | 2014

A Scalable Framework for Dynamic Data Citation of Arbitrary Structured Data

Stefan Pröll; Andreas Rauber

Sharing research data is becoming increasingly important as it enables peers to validate and reproduce data-driven experiments. Without original raw data at hand, serious peer review is impossible. Exchanging data also allows scientists to reuse data in different contexts and gather new knowledge from available sources. But with increasing volume and iteratively enhanced data sets, researchers need to reference exact versions of data sets. Until now, access to research data has often been based on single archives of data files, where versioning and subsetting support is limited. In this paper we introduce a mechanism that allows researchers to create versioned subsets of research data which can be cited and shared in a lightweight and secure manner. We demonstrate a prototype that supports researchers in creating subsets based on filtering and sorting source data. These subsets can be cited for later reference and reuse. The system produces evidence, based on cryptographic hashing, that allows users to verify the correctness and completeness of a subset. We describe a replication scenario for enabling scalable data citation in dynamic contexts.


D-lib Magazine | 2017

Enabling Reproducibility for Small and Large Scale Research Data Sets

Stefan Pröll; Andreas Rauber

A large portion of scientific results is based on analysing and processing research data. In order for an eScience experiment to be reproducible, we need to be able to identify precisely the data set which was used in a study. Considering evolving data sources, this can be a challenge, as studies often use subsets which have been extracted from a potentially large parent data set. Exporting and storing subsets in multiple versions does not scale with large amounts of data sets. To tackle this challenge, the RDA Working Group on Data Citation has developed a framework and provides a set of recommendations, which allow identifying precise subsets of evolving data sources based on versioned data and timestamped queries. In this work, we describe how this method can be applied in small scale research data scenarios and how it can be implemented in large scale data facilities having access to sophisticated data infrastructure. We describe how the RDA approach improves the reproducibility of eScience experiments and we provide an overview of existing pilots and use cases in small and large scale settings.


International Conference on Theory and Practice of Digital Libraries | 2013

From Preserving Data to Preserving Research: Curation of Process and Context

Rudolf Mayer; Stefan Pröll; Andreas Rauber; Raúl Palma; Daniel Garijo

In the domain of eScience, investigations are increasingly collaborative. Most scientific and engineering domains benefit from building on top of the outputs of other research, by sharing information to reason over and data to incorporate in the modelling task at hand.


iPRES | 2013

Framework for Verification of Preserved and Redeployed Processes

Tomasz Miksa; Stefan Pröll; Rudolf Mayer; Stephan Strodl; Ricardo Vieira; José Barateiro; Andreas Rauber


TCDL Bulletin | 2016

Identification of Reproducible Subsets for Data Citation, Sharing and Re-Use

Andreas Rauber; Ari Asmi; Dieter van Uytvanck; Stefan Pröll


International Conference on Data Technologies and Applications | 2018

Citable by Design - A Model for Making Data in Dynamic Environments Citable

Stefan Pröll; Andreas Rauber


Ercim News | 2015

Asking the Right Questions - Query-Based Data Citation to Precisely Identify Subsets of Data

Stefan Pröll; Andreas Rauber


iPRES | 2012

The Applicability of Workflow Management Systems for the Preservation of Business Processes

Rudolf Mayer; Stefan Pröll; Andreas Rauber


DAMDID/RCDL | 2015

Repeatability and Re-usability in Scientific Processes: Process Context, Data Identification and Verification

Andreas Rauber; Tomasz Miksa; Rudolf Mayer; Stefan Pröll

Collaboration


Dive into Stefan Pröll's collaborations.

Top Co-Authors

Andreas Rauber

Vienna University of Technology

Rudolf Mayer

Vienna University of Technology

Tomasz Miksa

Vienna University of Technology

Daniel Garijo

Technical University of Madrid

Raúl Palma

Technical University of Madrid

Stephan Strodl

Vienna University of Technology
