Gianluca Roscigno
University of Salerno
Publications
Featured research published by Gianluca Roscigno.
Advanced Information Networking and Applications | 2014
Giuseppe Cattaneo; Gianluca Roscigno; Umberto Ferraro Petrillo
In this paper, we explore the possibility of solving a well-known digital image forensics problem, the Source Camera Identification (SCI) problem, using a distributed approach. The SCI problem requires recognizing the camera used to acquire a given digital image, distinguishing even among cameras of the same brand and model. The solution we present is based on the algorithm by Lukas and Fridrich, widely regarded as the reference solution for this problem, and is formulated according to the MapReduce paradigm as implemented by the Hadoop framework. The first implementation we coded was straightforward to obtain, as we leveraged the ability of the Hadoop framework to turn a stand-alone Java application into a distributed one with very few interventions on its original source code. However, our first experimental results with this code were not encouraging. We therefore conducted a careful profiling activity that allowed us to pinpoint some serious performance issues arising with this vanilla porting of the algorithm. We then developed several optimizations that improve its performance by taking better advantage of the Hadoop framework. The resulting implementations have been subjected to a thorough experimental analysis, conducted using a cluster of 33 commodity PCs and a dataset of 5,160 images. The experimental results show that our optimized implementations scale well with the number of computing nodes, while exhibiting performance that is, at most, two times slower than the maximum speedup theoretically achievable.
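The general structure of such a MapReduce formulation can be sketched as follows. This is an illustrative outline only, not the published implementation: the PRNU-specific steps (reference-pattern loading, noise-residual extraction, correlation) are left as abstract hooks.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Structural sketch only: the PRNU-specific steps are abstract hooks because
// their details are not given here.
public abstract class SciMapper extends Mapper<Text, BytesWritable, Text, Text> {

    private Map<String, float[]> referencePatterns = new HashMap<>();

    // Hooks to be provided by a concrete implementation.
    protected abstract Map<String, float[]> loadReferencePatterns(Context context) throws IOException;
    protected abstract float[] extractNoiseResidual(byte[] imageBytes);
    protected abstract double correlate(float[] residual, float[] referencePattern);

    @Override
    protected void setup(Context context) throws IOException {
        // Load the per-camera PRNU reference patterns once per map task,
        // e.g. from files shipped to every node.
        referencePatterns = loadReferencePatterns(context);
    }

    @Override
    protected void map(Text imageId, BytesWritable imageBytes, Context context)
            throws IOException, InterruptedException {
        // Each map() call classifies one image: extract its noise residual and
        // keep the candidate camera whose reference pattern correlates best.
        float[] residual = extractNoiseResidual(imageBytes.copyBytes());
        String bestCamera = null;
        double bestCorrelation = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, float[]> candidate : referencePatterns.entrySet()) {
            double corr = correlate(residual, candidate.getValue());
            if (corr > bestCorrelation) {
                bestCorrelation = corr;
                bestCamera = candidate.getKey();
            }
        }
        context.write(imageId, new Text(bestCamera + "\t" + bestCorrelation));
    }
}
```

In an outline of this kind, each map task classifies its subset of images independently, which is what allows the workload to be spread across the cluster nodes.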
Bioinformatics | 2017
Umberto Ferraro Petrillo; Gianluca Roscigno; Giuseppe Cattaneo; Raffaele Giancarlo
Summary: MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats, such as FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and because of the need to efficiently manage sequence datasets that may contain billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers versatility and efficiency. That is, it can handle collections of reads, with or without quality scores, as well as long genomic sequences, while the existing routines concentrate mainly on NGS sequence data. Moreover, in the domain where a comparison is possible, the routines proposed here are faster than the available ones. In conclusion, FASTdoop is a much needed addition to Hadoop-BAM. Availability and Implementation: The software and the datasets are available at http://www.di.unisa.it/FASTdoop/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
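To give an idea of what such input management buys the application programmer, the mapper below assumes that a FASTdoop-like input format has already turned the FASTA/FASTQ input into (header, sequence) records; the key/value types are an assumption made for illustration and may differ from the actual FASTdoop API, which should be checked in the library's documentation.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Application-side sketch: input parsing is delegated to the input format, so
// the mapper only ever sees whole reads, never a record cut at a split boundary.
public class ReadLengthMapper extends Mapper<Text, Text, Text, LongWritable> {

    private final LongWritable length = new LongWritable();

    @Override
    protected void map(Text header, Text sequence, Context context)
            throws IOException, InterruptedException {
        // Emit the length of each read (byte length, which equals the number
        // of bases for plain ASCII sequences).
        length.set(sequence.getLength());
        context.write(header, length);
    }
}
```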
Network-Based Information Systems | 2014
Giuseppe Cattaneo; Gianluca Roscigno
This paper aims to contribute to the experimental evaluation of tampered image detection algorithms (i.e., image integrity algorithms) by describing a possible way to improve these experiments with respect to the traditional approaches followed in this area. In particular, the paper focuses on the problem of choosing a proper test dataset so as to keep low the bias on the experimental performance of this kind of algorithm. The paper first describes a JPEG image integrity algorithm, the Lin et al. algorithm, that has been used as a benchmark during our experiments. Then, the experimental performance of this algorithm is presented and discussed. This performance has been measured by running the algorithm on the CASIA TIDE public dataset, which represents the de facto standard for the experimental evaluation of image integrity algorithms. The considered algorithm apparently performs very well on this dataset. However, a closer analysis reveals the existence of some statistical artifacts in the dataset that improve the performance of the algorithm. In order to confirm this observation, we assembled an alternative dataset. This new dataset has been conceived so as not to exhibit the statistical artifacts existing in the images of the CASIA TIDE dataset, while producing a uniform distribution of some physical image features, such as the quality factor. We then repeated, on this new dataset, the same experiments conducted on the CASIA TIDE dataset. As expected, we observed a performance degradation of the Lin et al. algorithm, thus confirming our hypothesis that the CASIA TIDE dataset is, in some way, flawed.
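As an illustration of the dataset-balancing idea (a minimal sketch, not taken from the paper), the helper below draws the same number of images from every JPEG quality-factor bin, given a precomputed map from each image to its estimated quality factor.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper: estimating the quality factor of each image (e.g. from
// its quantization tables) is assumed to have been done beforehand.
public final class BalancedDatasetBuilder {

    public static List<File> buildBalancedSample(Map<File, Integer> qualityFactors,
                                                 int imagesPerBin, long seed) {
        // Group the candidate images by quality factor.
        Map<Integer, List<File>> byQf = new HashMap<>();
        for (Map.Entry<File, Integer> e : qualityFactors.entrySet()) {
            byQf.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }

        // Draw the same number of images from each bin, so that no quality
        // factor dominates the dataset and biases the measured detection rates.
        List<File> balanced = new ArrayList<>();
        Random rnd = new Random(seed);
        for (List<File> bin : byQf.values()) {
            Collections.shuffle(bin, rnd);
            balanced.addAll(bin.subList(0, Math.min(imagesPerBin, bin.size())));
        }
        return balanced;
    }
}
```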
The Journal of Supercomputing | 2017
Giuseppe Cattaneo; Umberto Ferraro Petrillo; Raffaele Giancarlo; Gianluca Roscigno
Alignment-free methods are one of the mainstays of biological sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, a fundamental and routine task in computational biology and bioinformatics. They have gained popularity since, even on standard desktop machines, they are faster than methods based on alignments. However, with the advent of Next-Generation Sequencing technologies, datasets whose size, i.e., the number of sequences and their total length, poses a challenge to the execution of alignment-free methods on such standard machines have become quite common. Here, we propose the first paradigm for the computation of k-mer-based alignment-free methods for Apache Hadoop that extends the problem sizes that can be processed with respect to a standard sequential machine, while also granting good time performance. Technically, as opposed to a standard Hadoop implementation, its effectiveness is achieved thanks to the incremental management of a persistent hash table during the map phase, a task not contemplated by the basic Hadoop functions and one that can also be useful in other contexts.
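The idea of a persistent hash table in the map phase, often referred to as in-mapper combining, can be sketched as follows for k-mer counting; the input key/value types and the value of k are illustrative assumptions, not the paper's actual code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class KmerCountMapper extends Mapper<Text, Text, Text, LongWritable> {

    private static final int K = 21;                   // example k-mer length
    private final Map<String, Long> counts = new HashMap<>();

    @Override
    protected void map(Text header, Text sequence, Context context) {
        // Accumulate counts locally instead of emitting one pair per k-mer:
        // the hash table persists across all map() calls of this task.
        String seq = sequence.toString();
        for (int i = 0; i + K <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + K), 1L, Long::sum);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Flush the aggregated counts once, drastically reducing the volume of
        // intermediate data shuffled to the reducers.
        Text key = new Text();
        LongWritable value = new LongWritable();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            key.set(e.getKey());
            value.set(e.getValue());
            context.write(key, value);
        }
    }
}
```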
International Conference on Information and Communication Technology | 2014
Giuseppe Cattaneo; Gianluca Roscigno; Umberto Ferraro Petrillo
This paper aims to experimentally evaluate the performance of one popular algorithm for the detection of tampered JPEG images: the algorithm by Lin et al. [1]. We developed a reference implementation of this algorithm and performed a deep experimental analysis, measuring its performance when applied to the images of the CASIA TIDE public dataset, the de facto standard for the experimental analysis of this family of algorithms. Our first results were very positive, thus confirming the good performance of the algorithm. However, a closer inspection revealed the existence of an unexpected anomaly in a substantial portion of the images of the CASIA TIDE dataset that may have influenced our results, as well as the results of previous studies conducted using this dataset. By taking advantage of this anomaly, we were able to develop a variant of the original algorithm that exhibits better performance on the same dataset.
IEEE Transactions on Big Data | 2017
Aniello Castiglione; Giuseppe Cattaneo; Giancarlo De Maio; Alfredo De Santis; Gianluca Roscigno
In the last decade, Digital Forensics has experienced several issues when dealing with network evidence. Collecting network evidence is difficult due to its volatility. In fact, such information may change over time, and it may be stored on a server outside the jurisdiction or geographically far from the crime scene. On the other hand, the explosion of Cloud Computing as an implementation of the Software as a Service (SaaS) paradigm is pushing users toward remote data repositories such as Dropbox, Amazon Cloud Drive, Apple iCloud, Google Drive and Microsoft OneDrive. In this paper, a novel methodology for the collection of network evidence is proposed. In particular, it focuses on the collection of information from online services, such as web pages, chats, documents, photos and videos. The methodology is suitable for both expert and non-expert analysts, as it “drives” the user through the whole acquisition process. During the acquisition, the information received from the remote source is automatically collected. It includes not only network packets, but also any information produced by the client upon its interpretation (such as video and audio output). A trusted third party, acting as a digital notary, is introduced in order to certify both the acquired evidence (i.e., the information obtained from the remote service) and the acquisition process (i.e., all the activities performed by the analyst to retrieve it). A proof-of-concept prototype, called LINEA, has been implemented to perform an experimental evaluation of the methodology.
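A minimal sketch of the certification step (not the LINEA implementation): the acquired evidence file is reduced to a cryptographic digest, which a digital notary could then time-stamp and sign together with the acquisition log; the notary interaction itself is only indicated as a placeholder.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

// Illustrative only: binds the content of an acquired evidence file
// (captured packets plus any recorded audio/video output) to a SHA-256 digest.
public final class EvidenceFingerprint {

    public static String sha256Of(Path evidenceFile) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest(Files.readAllBytes(evidenceFile));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // In a LINEA-like workflow, this digest would be handed to the digital
        // notary together with a log of the acquisition steps (placeholder).
        System.out.println("Evidence digest: " + sha256Of(Path.of(args[0])));
    }
}
```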
Advanced Concepts for Intelligent Vision Systems | 2016
Giuseppe Cattaneo; Gianluca Roscigno; Andrea Bruno
In this paper we discuss the video integrity problem and, specifically, we analyze whether the method proposed by Fridrich et al. [16] can be exploited for forensic purposes. In particular, Fridrich et al. proposed a solution to identify the source camera given an input image. The method relies on the Pixel Non-Uniformity (PNU) noise produced by the sensor and present in any digital image.
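At the core of PNU-based identification is a correlation test between the noise residual extracted from an image (or video frame) and a camera's reference pattern; a minimal sketch of that statistic is given below, with denoising and reference-pattern estimation omitted.

```java
// Plain normalized cross-correlation between a noise residual and a camera
// reference pattern, both flattened to equally sized arrays.
public final class PnuCorrelation {

    public static double correlation(double[] residual, double[] reference) {
        double meanR = mean(residual), meanP = mean(reference);
        double num = 0.0, denR = 0.0, denP = 0.0;
        for (int i = 0; i < residual.length; i++) {
            double a = residual[i] - meanR;
            double b = reference[i] - meanP;
            num += a * b;
            denR += a * a;
            denP += b * b;
        }
        return num / Math.sqrt(denR * denP);
    }

    private static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        return sum / v.length;
    }
}
```

A residual that correlates strongly with a camera's reference pattern supports the hypothesis that the image or frame was acquired by that camera.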
Advanced Concepts for Intelligent Vision Systems | 2015
Giuseppe Cattaneo; Umberto Ferraro Petrillo; Gianluca Roscigno; Carmine De Fusco
In this paper we propose a non-blind passive technique for image forgery detection. Our technique is a variant of a method presented in [8] and is based on the analysis of the Sensor Pattern Noise (SPN). Its main features are the ability to detect small forged regions and to run automatically. Our technique works by extracting the SPN from the image under scrutiny and then correlating it with the reference SPN of a target camera. The two noises are partitioned into non-overlapping blocks before evaluating their correlation. Then, a set of operators is applied to the resulting correlation map to highlight forged regions and remove noise spikes. The result is processed using a multi-level segmentation algorithm to determine which blocks should be considered forged. We analyzed the performance of our technique by using a dataset of 4,000 images.
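The block-wise correlation map mentioned above can be outlined as follows. This is an illustrative sketch, not the code used in the paper: it assumes the noise residual and the reference SPN are already available as matrices of the same size, and it leaves out the subsequent filtering and multi-level segmentation.

```java
// Builds a correlation map by partitioning the residual and the reference SPN
// into non-overlapping blocks; blocks with low correlation are forgery
// candidates. Partial blocks at the borders are ignored in this sketch.
public final class CorrelationMap {

    public static double[][] blockCorrelationMap(double[][] residual,
                                                 double[][] referenceSpn,
                                                 int blockSize) {
        int rows = residual.length / blockSize;
        int cols = residual[0].length / blockSize;
        double[][] map = new double[rows][cols];
        for (int br = 0; br < rows; br++) {
            for (int bc = 0; bc < cols; bc++) {
                map[br][bc] = blockCorrelation(residual, referenceSpn,
                        br * blockSize, bc * blockSize, blockSize);
            }
        }
        return map;
    }

    // Normalized correlation restricted to one block of both matrices.
    private static double blockCorrelation(double[][] a, double[][] b,
                                           int r0, int c0, int size) {
        double meanA = 0, meanB = 0;
        for (int r = r0; r < r0 + size; r++) {
            for (int c = c0; c < c0 + size; c++) {
                meanA += a[r][c];
                meanB += b[r][c];
            }
        }
        meanA /= (double) (size * size);
        meanB /= (double) (size * size);
        double num = 0, denA = 0, denB = 0;
        for (int r = r0; r < r0 + size; r++) {
            for (int c = c0; c < c0 + size; c++) {
                double x = a[r][c] - meanA, y = b[r][c] - meanB;
                num += x * y;
                denA += x * x;
                denB += y * y;
            }
        }
        return num / Math.sqrt(denA * denB);
    }
}
```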
IEEE International Conference on Cloud Computing Technology and Science | 2017
Giuseppe Cattaneo; Umberto Ferraro Petrillo; Michele Nappi; Fabio Narducci; Gianluca Roscigno
Apache Hadoop offers the possibility of coding full-fledged distributed applications with very little programming effort. However, the resulting implementations may suffer from performance bottlenecks that nullify the potential of a distributed system. An engineering methodology based on the implementation of smart optimizations, driven by a careful profiling activity, may lead to much better experimental performance, as shown in this paper.
International Conference on Parallel Processing | 2015
Giuseppe Cattaneo; Umberto Ferraro Petrillo; Raffaele Giancarlo; Gianluca Roscigno
Sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, is a fundamental and routine task in Computational Biology and Bioinformatics. Classically, alignment methods are the de facto standard for such an assessment, and considerable research effort for the development of efficient algorithms, on both classic and parallel architectures, has been carried out in the past 50 years. Due to the growing amount of sequence data being produced, a new class of methods has emerged: alignment-free methods. Research in this area has become very intense in the past few years, stimulated by the advent of Next-Generation Sequencing technologies, since those new methods are very appealing in terms of the computational resources needed and of biological relevance. Despite such an effort, and in contrast with sequence alignment methods, no systematic investigation of how to take advantage of distributed architectures to speed up alignment-free methods has taken place. We provide a contribution of that kind by evaluating the possibility of using the Hadoop distributed framework to speed up the running times of these methods, compared to their original sequential formulations.