Publication


Featured research published by Marco La Cascia.


Multimedia Tools and Applications | 1997

Automatic Video Database Indexing and Retrieval

Edoardo Ardizzone; Marco La Cascia

The increasing development of advanced multimedia applications requires new technologies for organizing databases of still digital images or digital video sequences and retrieving them by content. To this end, image and image-sequence contents must be described and adequately coded. In this paper we describe a system allowing content-based annotation and querying in video databases. No user action is required during the database population step. The system automatically splits a video into a sequence of shots, extracts a few representative frames (called r-frames) from each shot, and computes r-frame descriptors based on color, texture and motion. Queries based on one or more features are possible. Promising results obtained during extensive testing of the system are reported and discussed.
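The abstract does not say how shots are detected or how r-frames are chosen, so the following is only a minimal sketch under stated assumptions: frames are flat sequences of 8-bit gray levels, a cut is declared when the L1 distance between consecutive intensity histograms exceeds a threshold, and the middle frame of each shot serves as its r-frame. All names, the threshold and the toy data are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # hypothetical: a frame as a flat sequence of 8-bit gray levels


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized intensity histogram of one (non-empty) frame."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = float(len(frame))
    return [c / total for c in counts]


def split_into_shots(frames: Sequence[Frame], threshold: float = 0.5) -> List[List[int]]:
    """Group frame indices into shots.

    Assumed rule (not from the paper): a cut is declared when the L1 distance
    between the histograms of consecutive frames exceeds `threshold`.
    """
    if not frames:
        return []
    shots = [[0]]
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            shots.append([])  # start a new shot at this frame
        shots[-1].append(index)
        previous = current
    return shots


def representative_frames(shots: List[List[int]]) -> List[int]:
    """Pick the middle frame of each shot as its r-frame (an assumed choice)."""
    return [shot[len(shot) // 2] for shot in shots]


# Toy video: three dark frames followed by two bright frames.
dark = [10] * 16
bright = [240] * 16
video = [dark, dark, dark, bright, bright]
shots = split_into_shots(video)
r_frames = representative_frames(shots)
```

On the toy video the sketch finds one cut between the dark and bright frames. A real system would compute color, texture and motion descriptors on the selected r-frames; those are omitted here.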


Pattern Recognition | 2016

3D skeleton-based human action classification

Liliana Lo Presti; Marco La Cascia

In recent years, there has been a proliferation of works on human action classification from depth sequences. These works generally present methods and/or feature representations for the classification of actions from sequences of 3D locations of human body joints and/or other sources of data, such as depth maps and RGB videos. This survey highlights motivations and challenges of this very recent research area by presenting technologies and approaches for 3D skeleton-based action classification. The work focuses on aspects such as data pre-processing, publicly available benchmarks and commonly used accuracy measurements. Furthermore, this survey introduces a categorization of the most recent works in 3D skeleton-based action classification according to the adopted feature representation. This paper aims at being a starting point for practitioners who wish to approach the study of 3D action classification and gather insights on the main challenges to solve in this emerging field.

Highlights: State-of-the-art 3D skeleton-based action classification methods are reviewed. Methods are categorized based on the adopted feature representation. Motivations and challenges for skeleton-based action recognition are highlighted. Data pre-processing, public benchmarks and validation protocols are discussed. Comparison of renowned methods, open problems and future work are presented.


Asian Conference on Computer Vision | 2014

Gesture Modeling by Hanklet-Based Hidden Markov Model

Liliana Lo Presti; Marco La Cascia; Stan Sclaroff; Octavia I. Camps

In this paper we propose a novel approach for gesture modeling. We aim at decomposing a gesture into sub-trajectories that are the output of a sequence of atomic linear time invariant (LTI) systems, and we use a Hidden Markov Model (HMM) to model the transitions from one LTI system to another. For this purpose, we represent the human body motion in a temporal window as a set of body joint trajectories that we assume are the output of an LTI system. We describe the set of trajectories in a temporal window by the corresponding Hankel matrix (Hanklet), which embeds the observability matrix of the LTI system that produced it. We train a set of HMMs (one for each gesture class) with a discriminative approach. To account for the sharing of body motion templates, we allow the HMMs to share the same state space. We demonstrate by means of experiments on two publicly available datasets that, even when considering only the trajectories of the 3D joints, our method achieves state-of-the-art accuracy while competing well with methods that employ more complex models and feature representations.
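As an illustration of the Hankel matrix representation of a temporal window, here is a minimal sketch that stacks time-shifted copies of a trajectory so that block (i, j) holds the observation at time i + j. The function name, the number of block rows and the toy trajectory are assumptions made for this example; the paper's own window length and any normalization are not specified in the abstract.

```python
from typing import List, Sequence


def block_hankel(window: Sequence[Sequence[float]], block_rows: int) -> List[List[float]]:
    """Build the block Hankel matrix of a window of d-dimensional observations.

    Block (i, j) holds observation window[i + j], so the result has
    block_rows * d rows and len(window) - block_rows + 1 columns.
    """
    length = len(window)
    if not 1 <= block_rows <= length:
        raise ValueError("block_rows must be between 1 and the window length")
    dim = len(window[0])
    columns = length - block_rows + 1
    matrix = []
    for i in range(block_rows):
        for k in range(dim):
            # Row for coordinate k of the i-th block row.
            matrix.append([float(window[i + j][k]) for j in range(columns)])
    return matrix


# Hypothetical toy window: one 2D joint observed over four frames.
trajectory = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
hankelet = block_hankel(trajectory, block_rows=2)
```

For several joints, the coordinates of all joints at a given frame can simply be concatenated into one observation vector before calling the function.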


IEEE Transactions on Multimedia | 2012

Path Modeling and Retrieval in Distributed Video Surveillance Databases

Liliana Lo Presti; Stan Sclaroff; Marco La Cascia

We propose a framework for querying a distributed database of video surveillance data in order to retrieve a set of likely paths of a person moving in the area under surveillance. In our framework, each camera of the surveillance system locally processes the data and stores video sequences in a storage unit and the metadata for each detected person in the distributed database. A pedestrian's path is formulated as a dynamic Bayesian network (DBN) to model the dependencies between subsequent observations of the person as he makes his way through the camera network. We propose a tool by which the analyst can pose queries about where a certain person appeared while moving in the site during a specified temporal window. The DBN is used in an algorithm that finds potentially relevant metadata records from the distributed databases and then assembles these into probable paths that the person took in the camera network. Finally, the system presents the analyst with the retrieved set of likely paths in ranked order. The computational complexity of our method is quadratic in the number of camera nodes and linear in the number of moving persons. Experiments were carried out on simulated data to test the system with large distributed databases and in a real setting in which six databases store the data from six video cameras. The simulations confirm that our method provides good results with varying numbers of cameras and persons moving in the network. In a real setting, the method reconstructs paths across the camera network with approximately 75% accuracy at rank 1.


Image and Vision Computing | 2015

Hankelet-based dynamical systems modeling for 3D action recognition

Liliana Lo Presti; Marco La Cascia; Stan Sclaroff; Octavia I. Camps

This paper proposes to model an action as the output of a sequence of atomic Linear Time Invariant (LTI) systems. The sequence of LTI systems generating the action is modeled as a Markov chain, where a Hidden Markov Model (HMM) is used to model the transition from one atomic LTI system to another. In turn, the LTI systems are represented in terms of their Hankel matrices. For classification purposes, the parameters of a set of HMMs (one for each action class) are learned via a discriminative approach. This work proposes a novel method to learn the atomic LTI systems from training data, and analyzes in detail the action representation in terms of a sequence of Hankel matrices. Extensive evaluation of the proposed approach on two publicly available datasets demonstrates that the proposed method attains state-of-the-art accuracy in action classification from the 3D locations of body joints (skeleton).

Highlights: We model an action as a sequence of outputs of linear time invariant (LTI) systems. We represent the outputs of LTI systems by means of Hankelets. We adopt an HMM to model the transitions from one LTI system to another. We present inference and supervised learning formulations for our model. We also present a deep analysis of the parameter settings for our action representation.
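To make the "one HMM per action class" idea concrete, the sketch below scores an observation sequence under each class model with the standard scaled forward algorithm and picks the best-scoring class. It is a simplification and not the paper's method: the observations here are discrete symbols (a hypothetical stand-in for quantized Hankelets), the parameters are hand-set rather than learned discriminatively, and the class names are invented.

```python
import math
from typing import Dict, List, Sequence, Tuple

# (initial probabilities, transition matrix, emission matrix)
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def log_likelihood(model: HMM, observations: Sequence[int]) -> float:
    """Scaled forward algorithm: log P(observations | model) for a non-empty sequence."""
    initial, transition, emission = model
    states = range(len(initial))
    alpha = [initial[s] * emission[s][observations[0]] for s in states]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * transition[p][s] for p in states) * emission[s][observations[t]]
                for s in states
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


def classify(models: Dict[str, HMM], observations: Sequence[int]) -> str:
    """Return the class whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(models[name], observations))


# Hypothetical two-state models; each state plays the role of one atomic LTI system.
emission = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "wave": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], emission),
    "hold": ([1.0, 0.0], [[0.95, 0.05], [0.0, 1.0]], emission),
}
label_switching = classify(models, [0, 0, 1, 1])
label_steady = classify(models, [0, 0, 0, 0])
```

In this toy setting the two classes share the same states and emissions and differ only in how readily they move from the first state to the second, so a sequence that switches symbols is assigned to "wave" and a steady one to "hold".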


Electronic Imaging | 2008

A novel approach to personal photo album representation and management

Edoardo Ardizzone; Marco La Cascia; Filippo Vella

In this paper we present a novel approach to personal photo album management allowing the end user to efficiently access the collection without any need for tedious manual annotation or indexing of the photos. The proposed work exploits methods and technology from the field of computer vision and pattern recognition for face detection, face representation and image annotation to automatically create descriptions of images useful for content-based searching and retrieval. Even if most of the techniques used are not reliable enough to address the general problem of content-based image retrieval, we show that, in a limited domain such as personal photo albums, it is possible to obtain results that improve the browsing capabilities of current photo album management systems. In particular, starting from the observation that most personal photos depict a small number of people in a relatively small number of different contexts (indoor, outdoor, beach, mountain, city, etc.), we propose the use of automatic techniques to index images based on who is present in the scene and on the context where the picture was taken. Experiments on a personal photo collection of about a thousand images showed that relatively simple content-based techniques lead to surprisingly good results in terms of ease of user access to the data.


Computer Vision and Pattern Recognition | 2015

Using Hankel matrices for dynamics-based facial emotion recognition and pain detection

Liliana Lo Presti; Marco La Cascia

This paper proposes a new approach to model the temporal dynamics of a sequence of facial expressions. To this end, a sequence of Face Image Descriptors (FID) is regarded as the output of a Linear Time Invariant (LTI) system. The temporal dynamics of such a sequence of descriptors are represented by means of a Hankel matrix. The paper presents different strategies to compute dynamics-based representations of a sequence of FID, and reports classification accuracy values of the proposed representations within different standard classification frameworks. The representations have been validated in two very challenging application domains: emotion recognition and pain detection. Experiments on two publicly available benchmarks and comparison with state-of-the-art approaches demonstrate that the dynamics-based FID representation attains competitive performance when off-the-shelf classification tools are adopted.


Complex, Intelligent and Software Intensive Systems | 2010

Mobile Interface for Content-Based Image Management

Marco La Cascia; Marco Morana; Salvatore Sorce

People make more and more use of digital image acquisition devices to capture snapshots of their everyday life. The growing number of personal pictures raises the problem of their classification. Some of the authors proposed an automatic technique for personal photo album management dealing with multiple aspects (i.e., people, time and background) in a homogeneous way. In this paper we discuss a solution that allows mobile users to remotely access this technique by means of their mobile phones, from almost anywhere, in a pervasive fashion. This allows users to classify pictures they store on their devices. The whole solution is presented, with particular regard to the user interface implemented on the mobile phone, along with some experimental results.


Proceedings of the 1st ACM workshop on Vision networks for behavior analysis | 2008

Enabling technologies on hybrid camera networks for behavioral analysis of unattended indoor environments and their surroundings

Giovanni Gualdi; Andrea Prati; Rita Cucchiara; Edoardo Ardizzone; Marco La Cascia; Liliana Lo Presti; Marco Morana

This paper presents a layered network architecture and the enabling technologies for accomplishing vision-based behavioral analysis of unattended environments. Specifically, the vision network covers both the attended environment and its surroundings by means of hybrid cameras. The layer overlooking the surroundings is placed outdoors and tracks people, monitoring entrance/exit points. It recovers the geometry of the site under surveillance and communicates people's positions to a higher-level layer. The layer monitoring the unattended environment undertakes similar goals, with the addition of maintaining a global mosaic of the observed scene for further understanding. Moreover, it merges information coming from sensors other than vision to deepen the understanding or increase the reliability of the system. The behavioral analysis is delegated to a third layer that merges the information received from the other two layers and infers knowledge about what happened, what is happening and what is likely to happen in the environment. The paper also describes a case study that was implemented in the Engineering Campus of the University of Modena and Reggio Emilia, where our surveillance system has been deployed in a computer laboratory which was often inaccessible due to lack of attendance.


International Conference on Image Analysis and Processing | 2015

Ensemble of Hankel Matrices for Face Emotion Recognition

Liliana Lo Presti; Marco La Cascia

In this paper, a face emotion is considered as the result of the composition of multiple concurrent signals, each corresponding to the movements of a specific facial muscle. These concurrent signals are represented by means of a set of multi-scale appearance features that might be correlated with one or more concurrent signals. The extraction of these appearance features from a sequence of face images yields a set of time series. This paper proposes to use the dynamics regulating each appearance feature time series to discriminate among different face emotions. To this end, an ensemble of Hankel matrices corresponding to the extracted time series is used for emotion classification within a framework that combines nearest neighbor and a majority vote scheme. Experimental results on a publicly available dataset show that the adopted representation is promising and yields state-of-the-art accuracy in emotion classification.
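The combination of nearest neighbor and majority voting over an ensemble of Hankel matrices could look roughly like the sketch below. It rests on assumptions not stated in the abstract: each appearance feature yields a scalar time series, Hankel matrices are scaled to unit Frobenius norm, and a plain Frobenius distance stands in for whatever dissimilarity the paper actually uses. Function names, labels and data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Series = Sequence[float]


def normalized_hankel(series: Series, rows: int) -> List[List[float]]:
    """Hankel matrix of a scalar time series, scaled to unit Frobenius norm."""
    columns = len(series) - rows + 1
    matrix = [[float(series[i + j]) for j in range(columns)] for i in range(rows)]
    norm = math.sqrt(sum(v * v for row in matrix for v in row)) or 1.0
    return [[v / norm for v in row] for row in matrix]


def distance(a: List[List[float]], b: List[List[float]]) -> float:
    """Frobenius distance between two equally sized matrices (assumed stand-in)."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def classify(train: List[Tuple[List[Series], str]], query: List[Series], rows: int = 2) -> str:
    """Each feature's series votes for the label of its nearest training series."""
    votes = []
    for f, series in enumerate(query):
        hq = normalized_hankel(series, rows)
        nearest = min(train, key=lambda item: distance(normalized_hankel(item[0][f], rows), hq))
        votes.append(nearest[1])
    # Majority vote across the per-feature nearest-neighbor decisions.
    return Counter(votes).most_common(1)[0][0]


# Toy data: three feature time series per sample, all of the same length.
rising = [0.0, 1.0, 2.0, 3.0]
falling = [3.0, 2.0, 1.0, 0.0]
train = [([rising, rising, rising], "smile"), ([falling, falling, falling], "frown")]
query = [[0.0, 1.0, 2.0, 4.0], [0.0, 1.0, 2.0, 4.0], [4.0, 2.0, 1.0, 0.0]]
predicted = classify(train, query)
```

In the toy query two of the three features evolve like the "smile" sample and one like the "frown" sample, so the vote goes to "smile" even though one feature disagrees.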

Collaboration


Top Co-Authors

Filippo Vella

National Research Council
