Stefan Pröll
University of Vienna
Publication
Featured research published by Stefan Pröll.
international conference on big data | 2013
Stefan Pröll; Andreas Rauber
Uniquely and precisely identifying and citing arbitrary subsets of data is essential in many settings, e.g. to facilitate experiment validation and data re-use in meta-studies. Current approaches relying on pointers to entire data collections or on explicit copies of data do not scale. We propose a novel approach relying on persistent, timestamped, adapted queries to versioned and timestamped data sources. Result set hashes are used to validate correctness on later re-execution. The proposed method works for static as well as dynamically growing or changing data. Alternative implementation styles for relational databases are presented and evaluated with regard to performance and impact on existing applications, while aiming at minimal to no additional effort for data users. The approach is validated in an infrastructure monitoring domain relying on sensor data networks.
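The idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout (`valid_from`/`valid_to` columns), the integer timestamps, the function names and the choice of SHA-256 are all assumptions made for illustration, using only Python's standard `sqlite3` and `hashlib` modules.

```python
import hashlib
import sqlite3

# Hypothetical schema: each row version carries a validity interval.
# valid_to IS NULL marks the current version of a record.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " sensor_id TEXT, value REAL,"
    " valid_from INTEGER, valid_to INTEGER)"
)

def insert_reading(conn, sensor_id, value, ts):
    """Add a new record that becomes valid at timestamp ts."""
    conn.execute(
        "INSERT INTO readings VALUES (?, ?, ?, NULL)", (sensor_id, value, ts)
    )

def update_reading(conn, sensor_id, value, ts):
    """Close the current version at ts and add the new one (no overwrite)."""
    conn.execute(
        "UPDATE readings SET valid_to = ?"
        " WHERE sensor_id = ? AND valid_to IS NULL",
        (ts, sensor_id),
    )
    insert_reading(conn, sensor_id, value, ts)

def execute_as_of(conn, min_value, ts):
    """Run the stored query against the data as it existed at timestamp ts."""
    return conn.execute(
        "SELECT sensor_id, value FROM readings"
        " WHERE value >= ? AND valid_from <= ?"
        " AND (valid_to IS NULL OR valid_to > ?)"
        " ORDER BY sensor_id",  # fixed order keeps the hash stable
        (min_value, ts, ts),
    ).fetchall()

def result_hash(rows):
    """Hash the ordered result set so a later re-execution can be checked."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()

# Data at the time of the original query (timestamp 3).
insert_reading(conn, "s1", 10.0, ts=1)
insert_reading(conn, "s2", 20.0, ts=2)
rows_then = execute_as_of(conn, min_value=5.0, ts=3)
stored_hash = result_hash(rows_then)  # kept alongside the query and timestamp

# The data source keeps changing afterwards.
update_reading(conn, "s1", 99.0, ts=4)
insert_reading(conn, "s3", 30.0, ts=5)

# Re-executing the timestamped query returns the originally cited subset.
rows_again = execute_as_of(conn, min_value=5.0, ts=3)
rows_now = execute_as_of(conn, min_value=5.0, ts=6)
print(result_hash(rows_again) == stored_hash)
print(result_hash(rows_now) == stored_hash)
```

In this sketch the citation consists of the query parameters, the execution timestamp and the stored hash, so no copy of the subset has to be kept. The first comparison succeeds because the old row versions are retained, while the second fails because the current state of the data differs from the cited one.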
international conference on data technologies and applications | 2014
Stefan Pröll; Andreas Rauber
Sharing research data is becoming increasingly important as it enables peers to validate and reproduce data-driven experiments. Without original raw data at hand, serious peer review is impossible. Exchanging data also allows scientists to reuse data in different contexts and gather new knowledge from available sources. But with increasing volume and iteratively enhanced data sets, researchers need to reference exact versions of data sets. Until now, access to research data has often been based on single archives of data files, where versioning and subsetting support is limited. In this paper we introduce a mechanism that allows researchers to create versioned subsets of research data which can be cited and shared in a lightweight and secure manner. We demonstrate a prototype that supports researchers in creating subsets based on filtering and sorting source data. These subsets can be cited for later reference and reuse. The system produces evidence that allows users to verify the correctness and completeness of a subset based on cryptographic hashing. We describe a replication scenario for enabling scalable data citation in dynamic contexts.
D-lib Magazine | 2017
Stefan Pröll; Andreas Rauber
A large portion of scientific results is based on analysing and processing research data. In order for an eScience experiment to be reproducible, we need to be able to identify precisely the data set which was used in a study. With evolving data sources this can be a challenge, as studies often use subsets which have been extracted from a potentially large parent data set. Exporting and storing subsets in multiple versions does not scale with large numbers of data sets. To tackle this challenge, the RDA Working Group on Data Citation has developed a framework and provides a set of recommendations, which allow identifying precise subsets of evolving data sources based on versioned data and timestamped queries. In this work, we describe how this method can be applied in small-scale research data scenarios and how it can be implemented in large-scale data facilities having access to sophisticated data infrastructure. We describe how the RDA approach improves the reproducibility of eScience experiments and we provide an overview of existing pilots and use cases in small- and large-scale settings.
international conference theory and practice digital libraries | 2013
Rudolf Mayer; Stefan Pröll; Andreas Rauber; Raúl Palma; Daniel Garijo
In the domain of eScience, investigations are increasingly collaborative. Most scientific and engineering domains benefit from building on top of the outputs of other research, by sharing information to reason over and data to incorporate in the modelling task at hand.
iPRES | 2013
Tomasz Miksa; Stefan Pröll; Rudolf Mayer; Stephan Strodl; Ricardo Vieira; José Barateiro; Andreas Rauber
TCDL Bulletin | 2016
Andreas Rauber; Ari Asmi; Dieter van Uytvanck; Stefan Pröll
international conference on data technologies and applications | 2018
Stefan Pröll; Andreas Rauber
Ercim News | 2015
Stefan Pröll; Andreas Rauber
iPRES | 2012
Rudolf Mayer; Stefan Pröll; Andreas Rauber
DAMDID/RCDL | 2015
Andreas Rauber; Tomasz Miksa; Rudolf Mayer; Stefan Pröll