Samuele Martelli | Researchain

Publication

Featured research published by Samuele Martelli.

International Conference on Computer Vision | 2013

Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification

Marco San Biagio; Marco Crocco; Marco Cristani; Samuele Martelli; Vittorio Murino

Capturing the essential characteristics of visual objects by considering how their features are inter-related is a recent philosophy of object classification. In this paper, we embed this principle in a novel image descriptor, dubbed Heterogeneous Auto-Similarities of Characteristics (HASC). HASC is applied to heterogeneous dense feature maps, encoding linear relations by covariances and nonlinear associations through information-theoretic measures such as mutual information and entropy. In this way, highly complex structural information can be expressed in a compact, scale-invariant and robust manner. The effectiveness of HASC is tested on many diverse detection and classification scenarios, considering objects, textures and pedestrians, on widely known benchmarks (Caltech-101, Brodatz, Daimler Multi-Cue). In all cases, the results obtained with standard classifiers demonstrate the superiority of HASC over the most widely adopted local feature descriptors, such as SIFT, HOG, LBP and feature covariances. In addition, HASC sets the state of the art on the Brodatz texture dataset and the Daimler Multi-Cue pedestrian dataset, without exploiting ad hoc sophisticated classifiers.
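As an illustration of the information-theoretic measures named in this abstract, the following minimal Python sketch computes entropy and mutual information between quantized feature channels. It is not the authors' implementation: the function names (`quantize`, `entropy`, `mutual_information`, `relation_matrix`), the uniform binning, and the choice to collect the values in a square matrix are all assumptions made for this example.

```python
import math
from collections import Counter


def quantize(channel, bins, lo, hi):
    """Map real values in [lo, hi] to integer bin indices 0..bins-1 (uniform binning, an assumption)."""
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in channel]


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(a, b):
    """Mutual information I(A;B) = H(A) + H(B) - H(A,B), in bits."""
    if len(a) != len(b):
        raise ValueError("channels must have the same number of samples")
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def relation_matrix(channels):
    """Pairwise mutual information between channels (hypothetical helper).

    The diagonal equals each channel's entropy, because I(A;A) = H(A).
    """
    return [[mutual_information(ci, cj) for cj in channels] for ci in channels]
```

Here each channel is a flat list of per-pixel values from one feature map over the same region. Covariance captures only linear dependence between two channels, whereas mutual information is also sensitive to nonlinear dependence, which is the distinction the abstract draws.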

International Conference on Distributed Smart Cameras | 2011

FPGA-based pedestrian detection using array of covariance features

Samuele Martelli; Diego Tosato; Marco Cristani; Vittorio Murino

In this paper we propose a pedestrian detection algorithm and its implementation on a Xilinx Virtex-4 FPGA. The algorithm is a sliding-window classifier that exploits a recently designed descriptor, the covariance of features, to characterize pedestrians in a robust way. We show how this descriptor, originally designed to maximize accuracy without regard to timing, can be computed quickly in an elegant, parallel way on the FPGA board. A grid of overlapped covariances extracts information from the sliding window and feeds a linear Support Vector Machine that performs the detection. Experiments are performed on the INRIA pedestrian benchmark; the performance of the FPGA-based detector is discussed in terms of required computational effort and accuracy, showing state-of-the-art detection performance with excellent timing and economical memory usage.
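The covariance-of-features descriptor and the grid of overlapped patches described above can be sketched in plain Python. This is a software illustration only, not the FPGA design: the per-pixel feature set (position, intensity, absolute gradients), the function names, and the patch and stride parameters are assumptions chosen for the example.

```python
def feature_vectors(image, top, left, height, width):
    """Per-pixel features [x, y, I, |dI/dx|, |dI/dy|] for one patch (assumed feature set)."""
    rows, cols = len(image), len(image[0])
    feats = []
    for y in range(top, top + height):
        for x in range(left, left + width):
            # Finite differences; neighbour indices are clamped at the image border.
            x0, x1 = max(x - 1, 0), min(x + 1, cols - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, rows - 1)
            ix = abs(image[y][x1] - image[y][x0]) / 2.0
            iy = abs(image[y1][x] - image[y0][x]) / 2.0
            feats.append([float(x), float(y), float(image[y][x]), ix, iy])
    return feats


def covariance(feats):
    """Sample covariance matrix (d x d) of a list of d-dimensional feature vectors."""
    n, d = len(feats), len(feats[0])
    mean = [sum(f[k] for f in feats) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in feats:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]


def grid_descriptors(image, patch, stride):
    """One covariance matrix per square patch; patches overlap when stride < patch."""
    rows, cols = len(image), len(image[0])
    return [covariance(feature_vectors(image, t, l, patch, patch))
            for t in range(0, rows - patch + 1, stride)
            for l in range(0, cols - patch + 1, stride)]
```

Calling `grid_descriptors(window, patch, stride)` on one detection window yields the list of patch covariances that a classifier would consume. How those matrices are vectorized before reaching the linear SVM is not specified in the abstract and is left out here.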

Database and Expert Systems Applications | 2010

An FPGA-based Classification Architecture on Riemannian Manifolds

Samuele Martelli; Diego Tosato; Michela Farenzena; Marco Cristani; Vittorio Murino

In Computer Vision and Pattern Recognition, object detection is a fundamental task, but only a few systems are designed to be realized on an embedded architecture. To this end, we propose an effective, low-latency, affordable classification architecture, especially suited for embedded platforms. In particular, we have designed a novel, highly parallelizable classification framework for an FPGA-based implementation, which is suitable for generic detection problems. The underlying model consists of a weighted sum of boosted binary classifiers, learned on a set of overlapped image patches. Each patch is described by estimating the covariance matrix of a set of features, thus forming a very compact and expressive descriptor. Covariance matrices live on a Riemannian manifold, whose topology is particularly simple, so that they can be approximated in the Euclidean vector space in a cheap and conservative way. The hardware design has been developed in a parallel fashion and with specific architectural solutions, allowing a fast response without degrading the functional performance. We finally specialize this architecture to the challenging pedestrian detection problem, setting state-of-the-art results on the standard INRIA pedestrian benchmark dataset.

Computer Vision and Pattern Recognition | 2010

FPGA-based robust ellipse estimation for circular road sign detection

Samuele Martelli; Roberto Marzotto; Andrea Colombari; Vittorio Murino

Estimating parametric curves from images using robust fitting algorithms is a well-known and important computer vision task. We present a complete FPGA design and implementation of a fast and robust model fitting algorithm for real-time ellipse detection on video streams. The proposed solution relies on the RANSAC algorithm, modified for FPGA deployment, in combination with an image-preprocessing pipeline that performs the intensive pixel-level analysis, reducing each frame to a simple binary image of edges. The design has been developed in a parallel fashion and with specific architectural solutions so as to allow a fast response without degrading the functional performance. Experimental results on synthetic and real data show that our implementation, synthesized onto a Xilinx Spartan-3A DSP 3400A device, successfully runs in real time with low resource occupation, while maintaining functionality comparable with the floating-point software version.
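To make the RANSAC idea concrete, here is a minimal floating-point Python sketch of robust model fitting on edge points. It deliberately substitutes a circle (three-point minimal sample) for the ellipse model used in the paper, and it is unrelated to the FPGA-specific modifications the abstract mentions. The function names, the iteration count, and the inlier tolerance are assumptions for this example.

```python
import math
import random


def circle_from_points(p, q, r):
    """Circle (cx, cy, radius) through three points, or None if they are collinear."""
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, math.hypot(ax - ux, ay - uy)


def ransac_circle(points, iterations=200, tol=0.05, seed=0):
    """Return (best_model, inlier_count) over random three-point hypotheses."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best, best_count = None, 0
    for _ in range(iterations):
        model = circle_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, draw again
        ux, uy, radius = model
        # Inliers are points whose distance to the circle is within tol.
        count = sum(1 for x, y in points
                    if abs(math.hypot(x - ux, y - uy) - radius) <= tol)
        if count > best_count:
            best, best_count = model, count
    return best, best_count
```

The input `points` would be the coordinates of edge pixels from the binary edge image. Because each hypothesis is scored independently of the others, the scoring loop is the part that lends itself to parallel evaluation.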

International Conference on Image Processing | 2011

Fast FPGA-based architecture for pedestrian detection based on covariance matrices

Samuele Martelli; Diego Tosato; Marco Cristani; Vittorio Murino

Pedestrian detection is a crucial task in several video surveillance and automotive scenarios, but only a few detection systems are designed to be realized on an embedded architecture, which would increase processing speed, one of the key requirements in real applications. In this paper, we propose a novel SoC (System on Chip) architecture for fast pedestrian detection in video. Our implementation is based on a linear SVM (Support Vector Machine) classification framework, learned on a set of overlapped image patches. Each patch is described by a covariance matrix of a set of image features. Exploiting the inner parallelism of FPGA (Field Programmable Gate Array) boards, we dramatically accelerate the covariance matrix computation, which plays a crucial role in the framework. In the experiments, we show the effectiveness and the efficiency of our pedestrian detection system, reaching a detection speed of 132 fps at VGA resolution.

Computer Vision and Pattern Recognition | 2015

Learning with dataset bias in latent subcategory models

Dimitris Stamos; Samuele Martelli; Moin Nabi; Andrew M. McDonald; Vittorio Murino; Massimiliano Pontil

Latent subcategory models (LSMs) offer significant improvements over training linear support vector machines (SVMs). Training LSMs is a challenging task due to the potentially large number of local optima in the objective function and the increased model complexity which requires large training set sizes. Often, larger datasets are available as a collection of heterogeneous datasets. However, previous work has highlighted the possible danger of simply training a model from the combined datasets, due to the presence of bias. In this paper, we present a model which jointly learns an LSM for each dataset as well as a compound LSM. The method provides a means to borrow statistical strength from the datasets while reducing their inherent bias. In experiments we demonstrate that the compound LSM, when tested on PASCAL, LabelMe, Caltech101 and SUN09 in a leave-one-dataset-out fashion, achieves an average improvement of over 6.5% over a previous SVM-based undoing bias approach and an average improvement of over 8.5% over a standard LSM trained on the concatenation of the datasets.

International Journal of Pattern Recognition and Artificial Intelligence | 2014

Encoding Structural Similarity by Cross-Covariance Tensors for Image Classification

Marco San Biagio; Samuele Martelli; Marco Crocco; Marco Cristani; Vittorio Murino

In computer vision, an object can be modeled in two main ways: by explicitly measuring its characteristics in terms of feature vectors, and by capturing the relations which link an object with some exemplars, that is, in terms of similarities. In this paper, we propose a new similarity-based descriptor, dubbed structural similarity cross-covariance tensor (SS-CCT), where self-similarities come into play: here the entity to be measured and the exemplar are regions of the same object, and their similarities are encoded in terms of cross-covariance matrices. These matrices are computed from a set of low-level feature vectors extracted from pairs of regions that cover the entire image. SS-CCT shares some similarities with the widely used covariance matrix descriptor, but extends its power by focusing on structural similarities across multiple parts of an image, instead of capturing local similarities in a single region. The effectiveness of SS-CCT is tested on many diverse classification scenarios, considering objects and scenes on widely known benchmarks (Caltech-101, Caltech-256, PASCAL VOC 2007 and SenseCam). In all cases, the results obtained demonstrate the superiority of our new descriptor against diverse competitors. Furthermore, we also report an analysis of the reduced computational burden achieved by using an efficient implementation that takes advantage of the integral image representation.
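The integral image representation mentioned at the end of this abstract can be illustrated with a short Python sketch. It shows only the general idea, covariance between two feature maps over one rectangular region obtained from a handful of table lookups, and not the paper's cross-region computation. The function names and the half-open region convention are assumptions for this example.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(img), len(img[0])
    sat = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0.0
        for x in range(cols):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def region_sum(sat, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def make_tables(f, g):
    """Integral images of f, g and their pixel-wise product (built once per image)."""
    fg = [[a * b for a, b in zip(rf, rg)] for rf, rg in zip(f, g)]
    return integral_image(f), integral_image(g), integral_image(fg)


def region_covariance(tables, top, left, bottom, right):
    """Sample covariance of f and g over a region with at least two pixels."""
    sat_f, sat_g, sat_fg = tables
    n = (bottom - top) * (right - left)
    s_f = region_sum(sat_f, top, left, bottom, right)
    s_g = region_sum(sat_g, top, left, bottom, right)
    s_fg = region_sum(sat_fg, top, left, bottom, right)
    return (s_fg - s_f * s_g / n) / (n - 1)
```

Once the tables are built, the cost of each region query is constant regardless of region size, which is where the reduced computational burden comes from when many overlapping regions are evaluated.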

Iberoamerican Congress on Pattern Recognition | 2013

Encoding Classes of Unaligned Objects Using Structural Similarity Cross-Covariance Tensors

Marco San Biagio; Samuele Martelli; Marco Crocco; Marco Cristani; Vittorio Murino

Encoding an object's essence in terms of self-similarities between its parts is becoming a popular strategy in Computer Vision. In this paper, a new similarity-based descriptor, dubbed Structural Similarity Cross-Covariance Tensor, is proposed, aimed at encoding relations among different regions of an image in terms of cross-covariance matrices. The latter are calculated between low-level feature vectors extracted from pairs of regions. The new descriptor retains the advantages of the widely used covariance matrix descriptors [1], extending their expressiveness from local similarities inside a region to structural similarities across multiple regions. The new descriptor, applied on top of HOG, is tested on object and scene classification tasks with three datasets. The proposed method always outclasses baseline HOG and yields significant improvement over a recently proposed self-similarity descriptor in the two most challenging datasets.

International Conference on Computer Vision | 2015

Seeing the Sound: A New Multimodal Imaging Device for Computer Vision

Andrea Zunino; Marco Crocco; Samuele Martelli; Andrea Trucco; Alessio Del Bue; Vittorio Murino

Audio imaging can play a fundamental role in computer vision, in particular in automated surveillance, boosting the accuracy of current systems based on standard optical cameras. We present here a new hybrid device for acoustic-optic imaging, whose characteristics are tailored to automated surveillance. In particular, the device allows real-time, high-frame-rate generation of an acoustic map, overlaid on a standard optical image using a geometric calibration of audio and video streams. We demonstrate the potential of the device for target tracking on three challenging setups, showing the advantages of using acoustic images against baseline algorithms on image tracking. In particular, the proposed approach is able to outperform, often dramatically, visual tracking with state-of-the-art algorithms, dealing efficiently with occlusions, abrupt variations in visual appearance and camouflage. These results pave the way to a widespread use of acoustic imaging in application scenarios such as surveillance and security.

IEEE Journal of Oceanic Engineering | 2015

Low-Cost Acoustic Cameras for Underwater Wideband Passive Imaging

Andrea Trucco; Samuele Martelli; Marco Crocco

The imaging of underwater acoustic sources using passive 3-D sonar systems has many potential applications. However, to achieve both wide bandwidth and low cost, an array that is superdirective at low frequencies and aperiodic, to avoid aliasing at high frequencies, is required. To design a sparse array layout and the related filter-and-sum beamformer, a recently proposed method for airborne acoustic cameras is applied. First, the method is generalized to include position errors. Next, the method is used to demonstrate that 49 low-cost, poorly matched hydrophones are sufficient to create a square array for an underwater acoustic camera working from 500 Hz to 8.75 kHz with a side length of 1 m. Finally, the images of a simulated acoustic scenario obtained using different passive systems are compared, which reveals the advantages of the proposed design strategy.
