Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Paolo Napoletano is active.

Publications


Featured research published by Paolo Napoletano.


IEEE Transactions on Circuits and Systems for Video Technology | 2008

Bayesian Integration of Face and Low-Level Cues for Foveated Video Coding

Giuseppe Boccignone; Angelo Marcelli; Paolo Napoletano; G. Di Fiore; G. Iacovoni; S. Morsa

We present a Bayesian model that automatically generates fixations/foveations and that can be exploited for compression purposes. The twofold aim of this work is to investigate how the exploitation of high-level perceptual cues provided by human faces occurring in the video can enhance the compression process without reducing the perceived quality of the video, and to validate this assumption with an extensive and principled experimental protocol. To this end, the model integrates top-down and bottom-up cues to choose the fixation point on a video frame: at the highest level, a fixation is driven by prior information and by relevant objects, namely human faces, within the scene; at the same time, local saliency together with novel and abrupt visual events contributes by triggering lower-level control. The performance of the resulting video compression system has been evaluated with respect to both the perceived quality of foveated video clips and the compression gain, in an extensive evaluation campaign involving 200 subjects.
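
As a toy illustration of this kind of cue integration, the sketch below mixes a bottom-up saliency map with a top-down face prior and selects the most probable fixation point. The simple linear mixture, the weight w_top, and the synthetic Gaussian face prior are assumptions for illustration only; the paper's actual Bayesian model is richer than this.

```python
import numpy as np

def fixation_posterior(saliency, face_prior, w_top=0.6):
    """Mix a bottom-up saliency map with a top-down face prior.
    Both maps are H x W arrays normalized to sum to 1, read as probability
    distributions over fixation locations. The linear mixture and the
    weight w_top are illustrative simplifications."""
    posterior = w_top * face_prior + (1.0 - w_top) * saliency
    return posterior / posterior.sum()

def pick_fixation(posterior):
    """Return the (row, col) of the most probable fixation point."""
    return np.unravel_index(np.argmax(posterior), posterior.shape)

# Toy example: flat saliency plus a face prior peaked at (30, 40).
h, w = 60, 80
saliency = np.full((h, w), 1.0 / (h * w))
yy, xx = np.mgrid[0:h, 0:w]
face_prior = np.exp(-((yy - 30) ** 2 + (xx - 40) ** 2) / (2 * 8.0 ** 2))
face_prior /= face_prior.sum()

print(pick_fixation(fixation_posterior(saliency, face_prior)))  # (30, 40)
```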


Vision Research | 2009

Visuomotor Characterization of Eye Movements in a Drawing Task

Ruben Coen-Cagli; Paolo Coraggio; Paolo Napoletano; Odelia Schwartz; Mario Ferraro; Giuseppe Boccignone

Understanding visuomotor coordination requires the study of tasks that engage mechanisms for the integration of visual and motor information; in this paper we choose a paradigmatic yet little-studied example of such a task, namely realistic drawing. On the one hand, our data indicate that the motor task has little influence on which regions of the image are overall most likely to be fixated: salient features are fixated most often. Conversely, the effect of motor constraints is revealed in the temporal aspect of the scanpaths: (1) subjects direct their gaze to an object mostly when they are acting upon (drawing) it; and (2) in support of graphically continuous hand movements, scanpaths resemble edge-following patterns along image contours. For a better understanding of these properties, a computational model is proposed in the form of a novel kind of Dynamic Bayesian Network, and simulation results are compared with human eye-hand data.
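
The edge-following behaviour reported in point (2) can be caricatured in a few lines: the greedy walk below over a synthetic edge map is only a hedged illustration of that observation, not the Dynamic Bayesian Network proposed in the paper.

```python
import numpy as np

def edge_following_scanpath(edge_map, start, n_steps=5, radius=3):
    """Greedy caricature of an edge-following scanpath: from the current
    fixation, jump to the strongest unvisited edge location within a small
    neighbourhood. Illustration only, not the paper's DBN model."""
    h, w = edge_map.shape
    visited = np.zeros_like(edge_map, dtype=bool)
    (r, c), path = start, [start]
    visited[r, c] = True
    for _ in range(n_steps):
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        window = np.where(visited[r0:r1, c0:c1], -np.inf, edge_map[r0:r1, c0:c1])
        dr, dc = np.unravel_index(np.argmax(window), window.shape)
        r, c = r0 + dr, c0 + dc
        visited[r, c] = True
        path.append((r, c))
    return path

# Demo on a synthetic edge map: a bright diagonal contour.
edges = np.zeros((40, 40))
np.fill_diagonal(edges, 1.0)
print(edge_following_scanpath(edges, start=(0, 0)))  # follows the diagonal
```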


Journal of The Optical Society of America A-optics Image Science and Vision | 2016

Evaluating color texture descriptors under large variations of controlled lighting conditions

Claudio Cusano; Paolo Napoletano; Raimondo Schettini

The recognition of color texture under varying lighting conditions remains an open issue. Several features have been proposed for this purpose, ranging from traditional statistical descriptors to features extracted with neural networks. Still, it is not completely clear under what circumstances a feature performs better than others. In this paper, we report an extensive comparison of old and new texture features, with and without a color normalization step, with a particular focus on how these features are affected by small and large variations in the lighting conditions. The evaluation is performed on a new texture database, which includes 68 samples of raw food acquired under 46 conditions that present single and combined variations of light color, direction, and intensity. The database allows us to systematically investigate the robustness of texture descriptors across large variations of imaging conditions.
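
The abstract does not name the color normalization used; a gray-world correction is one classical choice such a comparison would plausibly include, sketched here as an assumption:

```python
import numpy as np

def gray_world(img):
    """Gray-world color normalization: rescale each channel so its mean
    matches the global mean, discounting a global illuminant color.
    img: float array of shape (H, W, 3) with values in [0, 1]."""
    channel_means = img.reshape(-1, 3).mean(axis=0)
    return np.clip(img * (channel_means.mean() / (channel_means + 1e-8)), 0.0, 1.0)

# Example: a reddish illuminant cast is (approximately) discounted.
rng = np.random.default_rng(0)
scene = rng.random((32, 32, 3)) * 0.7          # keep values clear of the clip
reddish = scene * np.array([1.3, 1.0, 0.7])
print(reddish.mean(axis=(0, 1)))               # biased channel means
print(gray_world(reddish).mean(axis=(0, 1)))   # roughly equal channel means
```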


Computer Vision and Image Understanding | 2015

An interactive tool for manual, semi-automatic and automatic video annotation

Simone Bianco; Gianluigi Ciocca; Paolo Napoletano; Raimondo Schettini

The annotation of image and video data of large datasets is a fundamental task in multimedia information retrieval and computer vision applications. The aim of annotation tools is to relieve the user from the burden of the manual annotation as much as possible. To achieve this ideal goal, many different functionalities are required in order to make the annotation process as automatic as possible. Motivated by the limitations of existing tools, we have developed iVAT: an interactive Video Annotation Tool. It supports manual, semi-automatic, and automatic annotations through the interaction of the user with various detection algorithms. To the best of our knowledge, it is the first tool that integrates several computer vision algorithms working in an interactive and incremental learning framework. This makes the tool flexible and suitable to be used in different application domains. A quantitative and qualitative evaluation of the proposed tool on a challenging case study domain is presented and discussed. Results demonstrate that the use of the semi-automatic, as well as the automatic, modality drastically reduces the human effort while preserving the quality of the annotations.
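
The interactive, incremental-learning loop described above might look roughly like the following sketch, in which a detector proposes labels, the user corrects them, and the model is updated online. The sklearn-based detector and the callback interface are illustrative assumptions, not the actual iVAT implementation.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

classes = np.array([0, 1])            # e.g. background / object of interest
model = SGDClassifier(random_state=0)

def annotate_incrementally(frames, user_correct):
    """frames: iterable of (n_regions, n_features) feature arrays, one per frame.
    user_correct: callback that receives proposed labels and returns the
    corrected ones (the manual / semi-automatic step)."""
    first = True
    for X in frames:
        # Propose labels with the current model (none yet on the first frame).
        proposed = np.zeros(len(X), dtype=int) if first else model.predict(X)
        y = user_correct(proposed)                      # user fixes the proposals
        # Incremental update on the corrected frame.
        model.partial_fit(X, y, classes=classes if first else None)
        first = False

# Toy run with random features and a random "user".
rng = np.random.default_rng(0)
frames = [rng.normal(size=(8, 5)) for _ in range(3)]
annotate_incrementally(frames, lambda p: (rng.random(len(p)) > 0.5).astype(int))
```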


Computers in Human Behavior | 2014

Text classification using a few labeled examples

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

Supervised text classifiers need to learn from many labeled examples to achieve high accuracy. However, in a real context, sufficient labeled examples are not always available, because human labeling is enormously time-consuming. For this reason, there has been recent interest in methods that can achieve high accuracy when the size of the training set is small. In this paper we introduce a new single-label text classification method that performs better than baseline methods when the number of labeled examples is small. Unlike most existing methods, which usually use a vector of features composed of weighted words, the proposed approach uses a structured vector of features composed of weighted pairs of words. The proposed vector of features is automatically learned, given a set of documents, using a global method for term extraction based on Latent Dirichlet Allocation implemented as a probabilistic topic model. Experiments performed using a small percentage of the original training set (about 1%) confirmed our hypotheses.
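
A minimal sketch of the weighted-word-pairs idea, assuming (as an approximation of the paper's method, not its exact procedure) that pairs are formed from the top words of each LDA topic and weighted by the product of their topic probabilities:

```python
from itertools import combinations
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def weighted_word_pairs(docs, n_topics=2, topn=4):
    """Approximate sketch: form pairs from the top words of each LDA topic
    and weight each pair by the product of the words' topic probabilities."""
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(X)
    vocab = vec.get_feature_names_out()
    pairs = {}
    for topic in lda.components_:
        p = topic / topic.sum()                   # word distribution of the topic
        top = np.argsort(p)[::-1][:topn]
        for i, j in combinations(top, 2):
            key = tuple(sorted((vocab[i], vocab[j])))
            pairs[key] = max(pairs.get(key, 0.0), float(p[i] * p[j]))
    return sorted(pairs.items(), key=lambda kv: -kv[1])

docs = ["the cat sat on the mat", "dogs and cats are pets",
        "stock markets fell sharply", "investors sold shares"] * 3
print(weighted_word_pairs(docs)[:3])              # top weighted pairs
```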


Information Processing and Management | 2015

Weighted Word Pairs for query expansion

Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

This paper proposes a novel query expansion method to improve the accuracy of text retrieval systems. Our method makes use of minimal relevance feedback to expand the initial query with a structured representation composed of weighted pairs of words. Such a structure is obtained from the relevance feedback through a method for selecting pairs of words based on the Probabilistic Topic Model. We compared our method with other baseline query expansion schemes and methods. Evaluations performed on TREC-8 demonstrated the effectiveness of the proposed method with respect to the baseline.
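
Given such a pair-extraction step, query expansion reduces to appending the terms of the top-ranked pairs to the query. The sketch below is a hedged outline under that assumption; extract_pairs is a hypothetical callback behaving like the weighted_word_pairs sketch shown earlier.

```python
def expand_query(query, feedback_docs, extract_pairs, k=5):
    """Expand a query with terms from the top-k weighted word pairs mined
    from relevance-feedback documents. extract_pairs is a hypothetical
    callback returning [((word1, word2), weight), ...] sorted by weight,
    e.g. the weighted_word_pairs sketch shown earlier."""
    seen = set(query.split())
    expansion = []
    for (w1, w2), _ in extract_pairs(feedback_docs)[:k]:
        for w in (w1, w2):
            if w not in seen:
                seen.add(w)
                expansion.append(w)
    return query + " " + " ".join(expansion)
```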


Journal of The Optical Society of America A-optics Image Science and Vision | 2014

Combining local binary patterns and local color contrast for texture classification under varying illumination

Claudio Cusano; Paolo Napoletano; Raimondo Schettini

This paper presents a texture descriptor for color texture classification specially designed to be robust against changes in the illumination conditions. The descriptor combines a histogram of local binary patterns (LBPs) with a novel feature measuring the distribution of local color contrast. The proposed descriptor is invariant with respect to rotations and translations of the image plane and with respect to several transformations in the color space. We evaluated the proposed descriptor on the Outex test suite, measuring the classification accuracy in the case in which training and test images have been acquired under different illuminants. The results show that our descriptor outperforms the original LBP approach and its color variants, even when these are computed after color normalization. Moreover, it also outperforms several other state-of-the-art color texture descriptors.
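
A rough sketch of such a combined descriptor: a rotation-invariant LBP histogram concatenated with a histogram of a local color contrast proxy. The contrast measure used here (chromaticity gradient magnitude) is an assumption for illustration, not the paper's exact feature.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern

def lbp_plus_color_contrast(rgb, P=8, R=1, n_contrast_bins=16):
    """Rotation-invariant LBP histogram concatenated with a histogram of a
    local color contrast proxy (chromaticity gradient magnitude -- an
    assumption for illustration, not the paper's exact contrast measure)."""
    lbp = local_binary_pattern(rgb2gray(rgb), P, R, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)

    chroma = rgb / (rgb.sum(axis=2, keepdims=True) + 1e-8)
    contrast = np.zeros(rgb.shape[:2])
    for ch in range(2):                  # two independent chromaticity channels
        gy, gx = np.gradient(chroma[..., ch])
        contrast += gx ** 2 + gy ** 2
    contrast = np.sqrt(contrast)
    c_hist, _ = np.histogram(contrast, bins=n_contrast_bins,
                             range=(0, contrast.max() + 1e-8), density=True)
    return np.concatenate([lbp_hist, c_hist])

rng = np.random.default_rng(0)
print(lbp_plus_color_contrast(rng.random((64, 64, 3))).shape)  # (26,)
```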


International Journal of Remote Sensing | 2018

Visual descriptors for content-based retrieval of remote sensing images

Paolo Napoletano

In this article, we present an extensive evaluation of visual descriptors for the content-based retrieval of remote-sensing (RS) images. The evaluation includes global hand-crafted, local hand-crafted, and convolutional neural network (CNN) features, coupled with four different content-based image retrieval schemes. We conducted all the experiments on two publicly available datasets: the 21-class University of California (UC) Merced Land Use/Land Cover (LandUse) dataset and the 19-class High-resolution Satellite Scene dataset (SceneSat). The content of RS images can be quite heterogeneous, ranging from images containing fine-grained textures to coarse-grained ones or images containing objects. It is therefore not obvious, in this domain, which descriptor should be employed to describe images with such variability. Results demonstrate that CNN-based features perform better than both global and local hand-crafted features, whatever retrieval scheme is adopted. Features extracted from a residual CNN suitably fine-tuned on the RS domain show much better performance than those from a residual CNN pre-trained on multimedia scene and object images. Features extracted from the Network of Vector of Locally Aggregated Descriptors (NetVLAD), a CNN that considers both CNN and local features, work better than other CNN solutions on images that contain fine-grained textures and objects.
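
A minimal sketch of CNN-feature-based retrieval of this kind, assuming an ImageNet-pretrained ResNet-18 from torchvision as the extractor (an illustrative choice; the article evaluates several networks, including fine-tuned residual CNNs and NetVLAD):

```python
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T

# ImageNet-pretrained ResNet-18 as a fixed feature extractor.
resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()      # drop the classifier, keep 512-d features
resnet.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(pil_images):
    """L2-normalized descriptors for a list of PIL images."""
    batch = torch.stack([preprocess(im) for im in pil_images])
    return F.normalize(resnet(batch), dim=1)

def retrieve(query_feat, db_feats, k=5):
    """Rank database images by cosine similarity to the query
    (a dot product, since all features are unit-normalized)."""
    return torch.topk(db_feats @ query_feat, k).indices
```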


IEEE Journal of Biomedical and Health Informatics | 2017

Food Recognition: A New Dataset, Experiments, and Results

Gianluigi Ciocca; Paolo Napoletano; Raimondo Schettini

We propose a new dataset for the evaluation of food recognition algorithms that can be used in dietary monitoring applications. Each image depicts a real canteen tray with dishes and foods arranged in different ways. Each tray contains multiple instances of food classes. The dataset contains 1027 canteen trays for a total of 3616 food instances belonging to 73 food classes. The food on the tray images has been manually segmented using carefully drawn polygonal boundaries. We have benchmarked the dataset by designing an automatic tray analysis pipeline that takes a tray image as input, finds the regions of interest, and predicts for each region the corresponding food class. We have experimented with three different classification strategies, also using several visual descriptors. We achieve about 79% food and tray recognition accuracy using convolutional-neural-network-based features. The dataset, as well as the benchmark framework, is available to the research community.
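
The tray-analysis pipeline can be outlined as follows; find_regions and classify_region are hypothetical placeholders for the segmentation and classification stages the abstract describes:

```python
def analyse_tray(image, find_regions, classify_region):
    """Skeleton of the tray-analysis pipeline: locate candidate food
    regions, then predict a food class for each one.
    find_regions and classify_region are hypothetical placeholders:
    e.g. a segmentation method returning bounding boxes, and a
    CNN-feature-based classifier returning (label, score)."""
    results = []
    for box in find_regions(image):       # regions of interest on the tray
        crop = image.crop(box)            # PIL-style crop (assumption)
        label, score = classify_region(crop)
        results.append((box, label, score))
    return results
```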


Ambient Intelligence | 2017

Falls as anomalies? An experimental evaluation using smartphone accelerometer data

Daniela Micucci; Marco Mobilio; Paolo Napoletano; Francesco Tisato

Life expectancy keeps growing and, among elderly people, accidental falls occur frequently. A system able to promptly detect falls would help reduce the injuries that a fall can cause. Such a system should meet the needs of the people for whom it is designed, so that it is actually used. In particular, the system should be minimally invasive and inexpensive. Since most smartphones embed accelerometers and a powerful processing unit, they are good candidates both as data acquisition devices and as platforms to host fall detection systems. For this reason, in recent years several fall detection methods have been tested on smartphone accelerometer data. Most of them have been tuned with simulated falls because, to date, datasets of real-world falls are not available. This article evaluates the effectiveness of methods that detect falls as anomalies. To this end, we compared traditional approaches with anomaly detectors. In particular, we evaluated the kNN and SVM methods in both one-class and two-class configurations. The comparison involved three different collections of accelerometer data and four different data representations. Empirical results demonstrate that, in most cases, fall examples are not required to design an effective fall detector.
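
A minimal sketch of the one-class configuration, training an anomaly detector on activities of daily living only, so that no fall recordings are needed at training time. Raw magnitude windows and the synthetic data are simplifying assumptions; the article compares several data representations.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def windows(signal, size=200, step=100):
    """Slice a 1-D accelerometer-magnitude stream into overlapping windows."""
    return np.array([signal[i:i + size]
                     for i in range(0, len(signal) - size + 1, step)])

# Train on activities of daily living (ADL) only: synthetic data here,
# standing in for real accelerometer magnitudes around 1 g.
rng = np.random.default_rng(0)
adl = windows(1.0 + 0.05 * rng.standard_normal(5000))
detector = OneClassSVM(nu=0.05, gamma="scale").fit(adl)

# A window containing a large spike (fall-like) is flagged as an anomaly.
spike = 1.0 + 0.05 * rng.standard_normal(200)
spike[100] += 3.0
print(detector.predict([spike]))   # -> [-1] (anomalous) in this toy setup
```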

Collaboration


Dive into Paolo Napoletano's collaborations.

Top Co-Authors

Raimondo Schettini

University of Milano-Bicocca

Simone Bianco

University of Milano-Bicocca

Gianluigi Ciocca

University of Milano-Bicocca
