Alessandro Spinuso
Royal Netherlands Meteorological Institute
Publications
Featured research published by Alessandro Spinuso.
workflows in support of large scale science | 2014
Sandra Gesing; Malcolm P. Atkinson; Rosa Filgueira; Ian Taylor; Andrew Clifford Jones; Vlado Stankovski; Chee Sun Liew; Alessandro Spinuso; Gabor Terstyanszky; Péter Kacsuk
In the last 20 years quite a few mature workflow engines and workflow editors have been developed to support communities in managing workflows. While providers of workflow engines increasingly ease the creation of workflows tailored to their specific systems, the management tools still often require a deep understanding of workflow concepts and languages. This paper describes an approach that targets various workflow systems and builds a single user interface for editing and monitoring workflows, taking into account aspects such as optimization and data provenance. The design employs agile Web frameworks and novel technologies to build a workflow dashboard that is offered in a web browser and connects seamlessly to available workflow systems and to external resources such as Cloud infrastructures. The user interface eliminates the need to become acquainted with diverse layouts, which substantially improves usability across the various aspects of managing workflows.
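The abstract describes the architecture only in prose; the following is a minimal sketch, assuming a per-engine adapter design, of how one user interface can front several workflow systems. All class and method names are hypothetical, not taken from the paper.

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WorkflowStatus:
    """Engine-neutral status record that the dashboard renders."""
    workflow_id: str
    state: str       # e.g. "QUEUED", "RUNNING", "FINISHED", "FAILED"
    progress: float  # fraction of completed tasks, 0.0 to 1.0


class WorkflowEngineAdapter(ABC):
    """Uniform facade over one concrete workflow system.

    The dashboard talks only to this interface, so supporting a new
    engine means writing one adapter rather than a new user interface,
    which is what removes the need to learn diverse layouts.
    """

    @abstractmethod
    def submit(self, workflow_document: str) -> str:
        """Submit a workflow description; return the engine's job id."""

    @abstractmethod
    def status(self, workflow_id: str) -> WorkflowStatus:
        """Poll the engine and normalise its answer to WorkflowStatus."""

    @abstractmethod
    def cancel(self, workflow_id: str) -> None:
        """Abort a running workflow on the underlying engine."""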
international conference on e-science | 2015
Malcolm P. Atkinson; Michele Carpenè; Emanuele Casarotti; Steffen Claus; Rosa Filgueira; Anton Frank; Michelle Galea; Tom Garth; André Gemünd; Heiner Igel; Iraklis Klampanos; Amrey Krause; Lion Krischer; Siew Hoon Leong; Federica Magnoni; Jonas Matser; Alberto Michelini; Andreas Rietbrock; Horst Schwichtenberg; Alessandro Spinuso; Jean-Pierre Vilotte
The VERCE project has pioneered an e-Infrastructure to support researchers using established simulation codes on high-performance computers in conjunction with multiple sources of observational data. This is accessed and organised via the VERCE science gateway, which makes it convenient for seismologists to use these resources from any location via the Internet. Their data handling is made flexible and scalable by two Python libraries, ObsPy and dispel4py, and by data services delivered by ORFEUS and EUDAT. Provenance-driven tools enable rapid exploration of results and of the relationships between data, which accelerates understanding and method improvement. These powerful facilities are integrated and draw on many other e-Infrastructures. This paper presents the motivation for building such systems, reviews how solid-Earth scientists can make significant research progress using them, and explains the architecture and mechanisms that make their construction and operation achievable. We conclude with a summary of the achievements to date and identify the crucial steps needed to extend the capabilities for seismologists, for solid-Earth scientists and for similar disciplines.
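The abstract names ObsPy and the ORFEUS data services; below is a minimal sketch of the kind of data access and preprocessing the gateway makes convenient. The station, channel and time window are illustrative choices, not taken from the paper.

from obspy import UTCDateTime
from obspy.clients.fdsn import Client

# Fetch one hour of broadband data from the ORFEUS FDSN web services
client = Client("ORFEUS")
start = UTCDateTime("2014-01-01T00:00:00")
stream = client.get_waveforms(network="NL", station="HGN", location="*",
                              channel="BHZ", starttime=start,
                              endtime=start + 3600)

# Typical preprocessing before comparing observations with synthetics
stream.detrend("linear")
stream.taper(max_percentage=0.05)
stream.filter("bandpass", freqmin=0.01, freqmax=0.1)
print(stream)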
edbt icdt workshops | 2013
Alessandro Spinuso; James Cheney; Malcolm P. Atkinson
Harvesting provenance for streaming workflows presents challenges related to the high rate of updates and the wide distribution of the execution, which can be spread across several institutional infrastructures. Moreover, the typically large volume of data produced by each transformation step cannot always be stored and preserved efficiently. This can be an obstacle to evaluating results, for instance in real time, which suggests the importance of customisable metadata-extraction procedures. In this paper we present our approach to these provenance challenges within a use-case-driven scenario in the field of seismology, which requires the execution of processing pipelines over a large data stream. In particular, we discuss the current implementation and the upcoming challenges for an in-workflow, programmatic approach to provenance tracing, building on composite functions, selective recording and domain-specific metadata production.
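As a minimal sketch, assuming a decorator-based design, of what selective recording with domain-specific metadata production can look like; the hooks and names below are hypothetical, not the published API.

import functools
import hashlib
import json
import time
import uuid


def provenance(extract_metadata=None, record=None):
    """Wrap one transformation step of a streaming workflow.

    Instead of preserving the (possibly huge) intermediate data, only
    a digest plus user-defined, domain-specific metadata is recorded.
    """
    record = record or (lambda doc: print(json.dumps(doc)))

    def decorate(step):
        @functools.wraps(step)
        def wrapper(data, **kwargs):
            start = time.time()
            result = step(data, **kwargs)
            doc = {
                "id": str(uuid.uuid4()),
                "step": step.__name__,
                "started": start,
                "duration": time.time() - start,
                # selective recording: a digest instead of the data itself
                "output_sha1": hashlib.sha1(repr(result).encode()).hexdigest(),
            }
            if extract_metadata:  # the domain-specific part
                doc["metadata"] = extract_metadata(result)
            record(doc)
            return result
        return wrapper
    return decorate


# Usage: the metadata extractor is supplied per step by the scientist.
@provenance(extract_metadata=lambda samples: {"n_samples": len(samples)})
def demean(samples):
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]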
international conference on e-science | 2015
Rosa Filgueira; Amrey Krause; Malcolm P. Atkinson; Iraklis Klampanos; Alessandro Spinuso; Susana Sanchez-Exposito
We present dispel4py, a versatile data-intensive kit delivered as a standard Python library. It empowers scientists to experiment and test ideas using their familiar rapid-prototyping environment. It delivers mappings to diverse computing infrastructures, including cloud technologies, HPC architectures and specialised data-intensive machines, so that users can move seamlessly into production with large-scale data loads. The mappings are fully automated, so the encoded data analyses and data handling remain completely unchanged. The underpinning model is lightweight composition of fine-grained operations on data, coupled together by data streams that use the lowest-cost technology available. These fine-grained workflows are interpreted locally during development and mapped to multiple nodes and systems such as MPI and Storm for production. We explain why such an approach is becoming essential if data-driven research is to innovate rapidly and exploit the growing wealth of data while adapting to current technical trends. We show how provenance management is provided to improve understanding and reproducibility, and how a registry supports consistency and sharing. Three application domains are reported, and measurements on multiple infrastructures show the optimisations achieved. Finally we present the next steps towards scalability and performance.
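A minimal example in the style of the dispel4py documentation, composing fine-grained processing elements (PEs) into a streaming workflow; module and class names follow the open-source release and may differ between versions.

from dispel4py.base import ProducerPE, IterativePE, ConsumerPE
from dispel4py.workflow_graph import WorkflowGraph


class Numbers(ProducerPE):
    def _process(self, inputs):
        # emit a small stream of items onto the output
        for i in range(10):
            self.write(ProducerPE.OUTPUT_NAME, i)


class Square(IterativePE):
    def _process(self, data):
        return data * data  # returned value is streamed onward


class Printer(ConsumerPE):
    def _process(self, data):
        self.log(data)


graph = WorkflowGraph()
numbers, square, printer = Numbers(), Square(), Printer()
graph.connect(numbers, ProducerPE.OUTPUT_NAME, square, IterativePE.INPUT_NAME)
graph.connect(square, IterativePE.OUTPUT_NAME, printer, ConsumerPE.INPUT_NAME)

# The same graph then runs unchanged under different mappings
# (e.g. the sequential, MPI or Storm targets of the dispel4py tool),
# which is the "fully automated mappings" claim of the abstract.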
international conference on e-science | 2015
Daniele Bailo; Keith G. Jeffery; Alessandro Spinuso; Giuseppe Fiameni
EPOS is an e-Infrastructure for solid Earth science in Europe. It integrates many heterogeneous Research Infrastructures (RIs) using a novel approach based on the harmonization of existing service and component interfaces. EPOS is designed to provide an architectural framework for new Research Infrastructures in the domain, and to interface with existing RIs at increasing levels of sophistication, working with them in co-development from their present state to a future integrated state. The key is the metadata catalogue, based on CERIF, which provides the virtualization required for EPOS to present a homogeneous view over this heterogeneity. This paper presents the architectural concepts, together with a plan for integration and collaboration with EPOS nodes in order to interoperate.
Science Gateways for Distributed Computing Infrastructures | 2014
Tamas Kiss; Péter Kacsuk; Róbert Lovas; Ákos Balaskó; Alessandro Spinuso; Malcolm P. Atkinson; Daniele D’Agostino; Emanuele Danovaro; Michael Schiffers
Besides core project partners, the SCI-BUS project also supported several external user communities in developing and setting up customized science gateways. The focus was on large communities, typically represented by other European research projects; however, smaller local efforts with the potential of generalizing the solution to wider communities were also supported. This chapter gives an overview of support activities related to user communities external to the SCI-BUS project. A generic overview of such activities is provided, followed by detailed descriptions of three gateways developed in collaboration with European projects: the agINFRA Science Gateway for workflows in agricultural research, the VERCE Science Gateway for seismology, and the DRIHM Science Gateway for weather research and forecasting.
Procedia Computer Science | 2017
Daniele Bailo; Damian Ulbricht; Martin Nayembil; Luca Trani; Alessandro Spinuso; Keith G. Jeffery
EPOS is a Research Infrastructure plan that is undertaking the challenge of integrating data from different solid-Earth disciplines and of providing a common knowledge base for the solid-Earth community in Europe, by implementing and managing a logically centralised catalogue based on the CERIF model. The EPOS catalogue will contain information about all the participating actors, such as Research Infrastructures, Organisations and their assets, in relationship with people, their roles and their affiliation within the specific scientific domain. The catalogue will guarantee the discoverability of domain-specific data, data products, software and services (DDSS) and enable the EPOS Integrated Core Services system to perform, on behalf of an end user, advanced operations on data such as processing and visualization. It will also foster the homogenisation of vocabularies, as well as support heterogeneous metadata. Clearly, the effort of accommodating the diversity across all the players needs to take into account existing initiatives concerning metadata standards and institutional recommendations, trying to satisfy the EPOS requirements by incorporating and profiling more generic concepts and semantics. The paper describes the approach of the EPOS metadata working group, providing the rationale behind the integration, extension and mapping strategy that converges the EPOS metadata baseline model towards the CERIF entities, relationships and vocabularies. Special attention is given to the outcomes of the mapping process between two elements of the EPOS baseline - Research Infrastructure and Equipment - and CERIF, with detailed insights into the two data models, the issues encountered and the proposed solutions.
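A toy illustration of the mapping idea: CERIF models actors and assets as generic, classified entities linked by classified relationships, so one schema can host both an EPOS Research Infrastructure and its Equipment. The classes and vocabulary terms below are illustrative only, not CERIF syntax.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Classification:
    scheme: str  # controlled vocabulary the term belongs to
    term: str    # e.g. "Research Infrastructure", "Equipment"


@dataclass
class CerifEntity:
    identifier: str
    name: str
    classifications: List[Classification] = field(default_factory=list)
    links: List["Link"] = field(default_factory=list)


@dataclass
class Link:
    role: Classification  # the semantics of the relationship
    target: CerifEntity


# EPOS baseline "Research Infrastructure" and "Equipment" records map
# onto the same generic entity, distinguished only by classification:
seismometer = CerifEntity("eq-001", "Broadband seismometer",
                          [Classification("epos-types", "Equipment")])
orfeus = CerifEntity("ri-001", "ORFEUS",
                     [Classification("epos-types", "Research Infrastructure")],
                     [Link(Classification("epos-roles", "operates"),
                           seismometer)])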
international supercomputing conference | 2013
Michele Carpenè; Iraklis Klampanos; Siew Hoon Leong; Emanuele Casarotti; Peter Danecek; Graziella Ferini; André Gemünd; Amrey Krause; Lion Krischer; Federica Magnoni; Marek Simon; Alessandro Spinuso; Luca Trani; Malcolm P. Atkinson; Giovanni Erbacci; Anton Frank; Heiner Igel; Andreas Rietbrock; Horst Schwichtenberg; Jean-Pierre Vilotte
Advanced application environments for seismic analysis help geoscientists to execute complex simulations that predict the behaviour of a geophysical system and potential surface observations. At the same time, data collected from seismic stations must be processed by comparing recorded signals with predictions. The EU-funded project VERCE (http://verce.eu/) aims to enable specific seismological use cases and, on the basis of requirements elicited from the seismology community, to provide a service-oriented infrastructure to deal with such challenges. In this paper we present VERCE's architecture, in particular as it relates to forward and inverse modelling of Earth models, and how the largely file-based HPC model can be combined with data-streaming operations to enhance the scalability of experiments. We posit that the integration of services and HPC resources in an open, collaborative environment is an essential medium for the advancement of sciences of critical importance, such as seismology.
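A minimal sketch, under an assumed file layout and naming, of the bridge the paper argues for: the largely file-based HPC model feeding streaming post-processing, so analysis overlaps with the running simulation. The directory, file pattern and polling strategy are illustrative only.

import glob
import os
import time


def stream_simulation_output(directory, pattern="*.sac",
                             poll_seconds=5.0, max_polls=120):
    """Yield output files while the HPC simulation is still writing them.

    Each newly appearing file becomes one item on a data stream, so
    downstream analysis starts before the whole run has finished.
    """
    seen = set()
    for _ in range(max_polls):
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            if path not in seen:
                seen.add(path)
                yield path
        time.sleep(poll_seconds)


# Downstream, each path can be handed to a streaming pipeline, e.g.
# one that reads the trace with ObsPy and compares it to observations:
# for path in stream_simulation_output("/scratch/run42/output"):
#     compare_with_observations(path)   # hypothetical consumer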