Gianpaolo Coro

Istituto di Scienza e Tecnologie dell'Informazione

Publication

Featured research published by Gianpaolo Coro.

Concurrency and Computation: Practice and Experience | 2016

Species distribution modeling in the cloud

Leonardo Candela; Donatella Castelli; Gianpaolo Coro; Pasquale Pagano; Fabio Sinibaldi

Species distribution modeling is a process that computationally predicts the distribution of species in geographic areas on the basis of environmental parameters, including climate data. Such a quantitative approach has considerable potential in many areas, including setting conservation priorities, testing biogeographic hypotheses, and assessing the impact of accelerated land use. To further promote the diffusion of such an approach, it is fundamental to develop a flexible, comprehensive, and robust environment capable of enabling practitioners and communities of practice to produce species distribution models more efficiently. A promising way to build such an environment is offered by modern infrastructures promoting the sharing of resources, including hardware, software, data, and services. This paper describes an approach to species distribution modeling based on a Hybrid Data Infrastructure that can offer a rich array of data and data management services by leveraging other infrastructures (including Cloud). It discusses the whole set of services needed to support the phases of such a complex process, including access to occurrence records and environmental parameters and the processing of such information to predict the probability of a species’ occurrence in given areas.

Concurrency and Computation: Practice and Experience | 2015

Parallelizing the execution of native data mining algorithms for computational biology

Gianpaolo Coro; Leonardo Candela; Pasquale Pagano; Angela Italiano; Loredana Liccardo

Data mining is being increasingly used in biology. Biologists are adopting prototyping languages, like R and Matlab, to facilitate the application of data mining algorithms to their data. As a result, their scripts are becoming increasingly complex and also require frequent updates. Application to large datasets becomes impractical and the time‐to‐paper increases. Furthermore, even if there are various systems that can be used to efficiently process large datasets, for example, using Cloud and High Performance Computing, they usually require procedures to be translated into specific languages or to be adapted to a certain computing platform. Such modifications can speed up the processing, but translation is not automatic, especially in complex cases, and can require a large amount of programming effort and accurate validation. In this paper, we propose an approach to parallelize data mining procedures in the form of compiled software or R scripts developed by biology communities of practice. Our approach requires minimal alteration of the original code. In many cases, there is no need for code modification. Furthermore, it allows for fast updating when a new version is ready. We clarify the constraints and the benefits of our method and report a practical use case to demonstrate such benefits compared with a standard execution. Our approach relies on a distributed network of web services and ultimately exposes the algorithms as‐a‐Service, to be invoked by remote thin clients.

Ecological Informatics | 2015

Retrieving taxa names from large biodiversity data collections using a flexible matching workflow

Edward Vanden Berghe; Gianpaolo Coro; Nicolas Bailly; Fabio Fiorellato; Caselyn Aldemita; Anton Ellenbroek; Pasquale Pagano

In the domain of biological classification there are several taxon name matching services that can search for a species' scientific name in a large collection of taxonomic names. Many of these services are available online, and many others run on computers of individual scientists. While these systems may work very well, most suffer from the fact that the list of names used as a reference, and the criteria to decide on a match, are hard-coded in the engine that performs the name matching. In this paper we present BiOnym, a taxon name matching system that separates reference namelists, search criteria and matching engine. The user is offered a choice of several taxonomic reference lists, including the option to upload their own list onto the system. Furthermore, BiOnym is a flexible workflow, which embeds and combines techniques using lexical matching algorithms as well as expert knowledge. It is also an open platform allowing developers to contribute new techniques. In this paper we demonstrate the benefits brought by this approach in terms of the efficiency and effectiveness of the information retrieval process with respect to other solutions.

GIScience & Remote Sensing | 2014

Comparing heterogeneous distribution maps for marine species

Gianpaolo Coro; Pasquale Pagano; Anton Ellenbroek

Automated comparison of heterogeneous geographical distribution maps detects statistical or point-wise differences between these maps. The maps’ contents are heterogeneous; they can differ in format, resolution and scale. In this paper, the comparison is applied to species distributions in geographic areas. We present an automatic procedure to compare distribution maps for marine species. The comparison calculates the similarities at two different granularities: a detailed one that relies on point-to-point comparisons and a more general one using the Kappa statistic. Furthermore, our method compares global scale maps by applying a solution based on cloud computing. We demonstrate the effectiveness of our approach by a practical use case and the efficiency by comparing it with a sequential computation. The method allows, for instance, marine biologists and fisheries managers to compare maps. Its efficiency allows it to be employed in interactive systems that need to produce results in a short time.
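The two granularities mentioned in this abstract can be illustrated with a minimal sketch. It assumes that both maps have already been resampled onto the same grid and reduced to discrete labels per cell (for example 1 for presence and 0 for absence); that preprocessing, the function names, and the sample data are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def point_agreement(a: List[int], b: List[int]) -> float:
    """Fraction of grid cells where the two maps carry the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("maps must be non-empty and have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: List[int], b: List[int]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = point_agreement(a, b)
    labels = set(a) | set(b)
    # Chance agreement: probability that both maps assign the same label
    # if labels were drawn independently with each map's own frequencies.
    p_expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_expected == 1.0:
        # Both maps use a single identical label everywhere.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical presence/absence values for four grid cells.
map_a = [1, 1, 0, 0]
map_b = [1, 0, 0, 0]
print(point_agreement(map_a, map_b))
print(cohen_kappa(map_a, map_b))
```

Point-to-point agreement reports how many cells match, while kappa discounts the matches that would be expected by chance, which matters when one label dominates the map.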

Environmental and Ecological Statistics | 2016

Automatic classification of climate change effects on marine species distributions in 2050 using the AquaMaps model

Gianpaolo Coro; Chiara Magliozzi; Anton Ellenbroek; Kristin Kaschner; Pasquale Pagano

Habitat modifications driven by human impact and climate change may influence species distribution, particularly in aquatic environments. Niche-based models are commonly used to evaluate the availability and suitability of habitat and assess the consequences of future climate scenarios on a species' range and the shifting edges of its distribution. Together with knowledge on biology and ecology, niche models also allow evaluating the potential of species to react to expected changes. The availability of projections of future climate scenarios allows comparing current and future niche distributions, assessing a species’ habitat suitability modification and shift, and consequently estimating the species’ potential reaction. In this study, differences between the distribution maps of 406 marine species, which were produced by the AquaMaps niche models on current and future (year 2050) scenarios, were estimated and evaluated. Discrepancy measurements were used to identify a discrete number of categories, which represent different responses to climate change. Clustering analysis was then used to automatically detect these categories, demonstrating their reliability compared to human supervised classification. Finally, the distribution of characteristics like extinction risk (based on IUCN categories), taxonomic groups, population trends and habitat suitability change over the clustering categories was evaluated. In this assessment, direct human impact was neglected, in order to focus only on the consequences of environmental changes. Furthermore, in the comparison between two climate snapshots, the intermediate phases were assumed to be implicitly included in the model of the 2050 climate scenario.

IWSG | 2016

Virtual research environments as-a-service by gCube

Massimiliano Assante; Leonardo Candela; Donatella Castelli; Gianpaolo Coro; Lucio Lelii; Pasquale Pagano

Science is in continuous evolution and so are the methodologies and approaches scientists tend to apply by calling for appropriate supporting environments. This is in part due to the limitations of the existing practices and in part due to the new possibilities offered by technology advances. gCube is a software system promoting elastic and seamless access to research assets (data, services, computing) across the boundaries of institutions, disciplines and providers to favour collaborative-oriented research tasks. Its primary goal is to enable Hybrid Data Infrastructures facilitating the dynamic definition and operation of Virtual Research Environments. To this end, it offers a comprehensive set of data management commodities on various types of data and a rich array of “mediators” to interface well-established Infrastructures and Information Systems from various domains. Its effectiveness has been proved by operating the D4Science.org infrastructure and serving concrete, multidisciplinary, challenging, and large scale scenarios. This paper gives an overview of the gCube system.

Concurrency and Computation: Practice and Experience | 2017

Cloud computing in a distributed e‐infrastructure using the web processing service standard

Gianpaolo Coro; Giancarlo Panichi; Paolo Scarponi; Pasquale Pagano

New Science paradigms have recently evolved to promote open publication of scientific findings as well as multi‐disciplinary collaborative approaches to scientific experimentation. These approaches can face modern scientific challenges but must deal with large quantities of data produced by industrial and scientific experiments. These data, so‐called Big Data, require new computer science systems to help scientists cooperate, extract information, and possibly produce new knowledge out of the data. E‐infrastructures are distributed computer systems that foster collaboration between users and can embed distributed and parallel processing systems to manage big data. However, in order to meet modern Science requirements, e‐Infrastructures in turn impose several requirements on computational systems, e.g., being economically sustainable, managing community‐provided processes, using standard representations for processes and data, managing big data size and heterogeneous representations, supporting reproducible Science, collaborative experimentation, and cooperative online environments, and managing security and privacy for data and services. In this paper, we present a cloud computing system (gCube DataMiner) that meets these requirements and operates in an e‐Infrastructure, while sharing characteristics with state‐of‐the‐art cloud computing systems. To this aim, DataMiner uses the web processing service standard of the open geospatial consortium and introduces features like collaborative experimental spaces and automatic installation of processes and services on top of a flexible and sustainable cloud computing architecture. We compare DataMiner with another mature cloud computing system and highlight the benefits our system brings, the new paradigm requirements it satisfies, and the applications that can be developed based on this system.
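For readers unfamiliar with the web processing service (WPS) standard named in this abstract, the sketch below builds an Execute request using the key-value-pair encoding of WPS 1.0.0. The abstract does not say which WPS version or encoding DataMiner uses, so that choice is an assumption; the endpoint, process identifier, and input names are hypothetical placeholders.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def build_execute_url(endpoint: str, process_id: str, inputs: Dict[str, str]) -> str:
    """Build a WPS 1.0.0 Execute request URL in key-value-pair form."""
    # In the KVP encoding, inputs are joined as "name=value" pairs
    # separated by semicolons inside one DataInputs parameter.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode(
        {
            "service": "WPS",
            "version": "1.0.0",
            "request": "Execute",
            "identifier": process_id,
            "datainputs": data_inputs,
        }
    )
    return f"{endpoint}?{query}"


# Hypothetical endpoint and process; not a real DataMiner address.
url = build_execute_url(
    "https://example.org/wps",
    "demo.process",
    {"a": "1", "b": "2"},
)
print(url)
# Decode the query string again to inspect what a server would receive.
print(parse_qs(urlsplit(url).query)["datainputs"])
```

Because the request is a plain HTTP URL, any thin client able to issue a GET request can invoke a process published this way.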

International Journal of Digital Earth | 2016

Building a European geothermal information network using a distributed e-Infrastructure

Eugenio Trumpy; Gianpaolo Coro; Adele Manzella; Pasquale Pagano; Donatella Castelli; P. Calcagno; Annamaria Nador; Thorvaldur Bragasson; Sylvain Grellet; Gunter Siddiqi

Geothermal data are published using different IT services, formats and content representations, and can refer to both regional and global scale information. Geothermal stakeholders search for information with different aims. E-Infrastructures are collaborative platforms that address this diversity of aims and data representations. In this paper, we present a prototype for a European Geothermal Information Platform that uses INSPIRE recommendations and an e-Infrastructure (D4Science) to collect, aggregate and share data sets from different European data contributors, thus enabling stakeholders to retrieve and process a large amount of data. Our system merges segmented and national realities into one common framework. We demonstrate our approach by describing a platform that collects data from Italian, French, Hungarian, Swiss and Icelandic geothermal data providers.

International Conference on Adaptive and Natural Computing Algorithms | 2013

Automatic Procedures to Assist in Manual Review of Marine Species Distribution Maps

Gianpaolo Coro; Pasquale Pagano; Anton Ellenbroek

Ecological Niche Modeling (ENM) is a branch of biology that uses algorithms to predict the distribution of species in a geographic area on the basis of a numerical representation of their preferred habitat and environment. Algorithmic maps can be produced for suitable or native habitats and require a review by human experts. During the review operation, biologists use their knowledge about a species to modify the maps. They usually take algorithmic maps as a starting point in the review. In this paper we provide a methodology for biologists to use the automatic maps as references during and after the review process as well. Our approach is based on a comparison between the reviewed map and two systems: an expert system and a Feed Forward Neural Network. Furthermore, we suggest a procedure for evaluating the quality of the environmental features used as a training set, in order to assess the models' reliability.

Italian Research Conference on Digital Library Management Systems | 2012

Supporting Tabular Data Characterization in a Large Scale Data Infrastructure by Lexical Matching Techniques

Leonardo Candela; Gianpaolo Coro; Pasquale Pagano

Digital Libraries continue to evolve towards research environments supporting access and management of multiform Information Objects spread across multiple data sources and organizational domains. This evolution has introduced the need to deal with Information Objects having traits different from those characterizing Digital Libraries at their early stages and to revise the services supporting their management. Tabular data represent a class of Information Objects that must be managed efficiently because of their core role in many eScience scenarios. This paper discusses the tabular data characterization problem, i.e., the problem of identifying the reference dataset of any column of a tabular dataset. In particular, the paper presents an approach based on lexical matching techniques to support users during the data curation phase by providing them with a ranked list of reference datasets suitable for a dataset column.
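The idea of ranking reference datasets for a column by lexical matching can be sketched as follows. This is an illustration only: the abstract does not specify the matching techniques used, so the sketch substitutes the similarity ratio from Python's standard `difflib` module, and the scoring rule, function names, and sample data are all assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive lexical similarity between two strings, in [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def column_score(column: List[str], reference: List[str]) -> float:
    """Average, over the column values, of the best match found in the reference."""
    if not column or not reference:
        return 0.0
    best = [max(similarity(value, ref) for ref in reference) for value in column]
    return sum(best) / len(column)


def rank_references(
    column: List[str], references: Dict[str, List[str]]
) -> List[Tuple[str, float]]:
    """Return (reference name, score) pairs, best candidate first."""
    scored = [(name, column_score(column, values)) for name, values in references.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical column and reference datasets.
column = ["Gadus morhua", "Thunnus alalunga"]
references = {
    "countries": ["Italy", "France", "Norway"],
    "species": ["Gadus morhua", "Thunnus albacares", "Thunnus alalunga"],
}
for name, score in rank_references(column, references):
    print(name, round(score, 3))
```

Returning the full ranked list rather than only the top candidate keeps the user in the loop during curation, which is the role the abstract assigns to the technique.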