
Publication


Featured research published by Silvia Zuffi.


International Conference on Computer Vision | 2013

Towards Understanding Action Recognition

Hueihan Jhuang; Juergen Gall; Silvia Zuffi; Cordelia Schmid; Michael J. Black

Although action recognition in videos is widely studied, current methods often fail on real-world datasets. Many recent approaches improve accuracy and robustness to cope with challenging video sequences, but it is often unclear what affects the results most. This paper attempts to provide insights based on a systematic performance evaluation using thoroughly annotated data of human actions. We annotate human joints for the HMDB dataset (J-HMDB). This annotation can be used to derive ground truth optical flow and segmentation. We evaluate current methods using this dataset and systematically replace the output of various algorithms with ground truth. This enables us to discover what is important: for example, should we work on improving flow algorithms, estimating human bounding boxes, or enabling pose estimation? In summary, we find that high-level pose features greatly outperform low/mid-level features; in particular, pose over time is critical, but current pose estimation algorithms are not yet reliable enough to provide this information. We also find that the accuracy of a top-performing action recognition framework can be greatly increased by refining the underlying low/mid-level features, which suggests it is important to improve optical flow and human detection algorithms. Our analysis and the J-HMDB dataset should facilitate a deeper understanding of action recognition algorithms.
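The substitution experiment can be pictured as a simple ablation: swap an estimated intermediate quantity (flow, person bounding box, pose) for its annotated counterpart and measure how much the downstream classifier improves. A minimal, purely illustrative sketch with synthetic features and a nearest-centroid classifier (none of this is the paper's actual pipeline; the noise levels are invented stand-ins for estimation quality):

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, per_class, dim = 5, 60, 16

def accuracy_with_feature_noise(noise):
    """Nearest-centroid action classification on synthetic per-video features.
    `noise` stands in for how unreliable an intermediate estimate (flow, person
    bounding box, pose) is; the substitution experiment compares a high noise
    level (estimated quantities) with a low one (ground-truth annotations)."""
    means = rng.normal(size=(n_classes, dim))
    X = means[:, None, :] + noise * rng.normal(size=(n_classes, per_class, dim))
    train, test = X[:, :per_class // 2], X[:, per_class // 2:]
    centroids = train.mean(axis=1)                        # (n_classes, dim)
    dists = ((test[:, :, None, :] - centroids[None, None]) ** 2).sum(-1)
    pred = dists.argmin(-1)                               # predicted class per clip
    truth = np.arange(n_classes)[:, None]
    return float((pred == truth).mean())

print("features from estimated pose   :", accuracy_with_feature_noise(1.5))
print("features from ground-truth pose:", accuracy_with_feature_noise(0.5))
```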


Computer Vision and Pattern Recognition | 2012

From Pictorial Structures to Deformable Structures

Silvia Zuffi; Oren Freifeld; Michael J. Black

Pictorial Structures (PS) define a probabilistic model of 2D articulated objects in images. Typical PS models assume an object can be represented by a set of rigid parts connected with pairwise constraints that define the prior probability of part configurations. These models are widely used to represent non-rigid articulated objects such as humans and animals despite the fact that such objects have parts that deform non-rigidly. Here we define a new Deformable Structures (DS) model that is a natural extension of previous PS models and that captures the non-rigid shape deformation of the parts. Each part in a DS model is represented by a low-dimensional shape deformation space, and pairwise potentials between parts capture how the shape varies with pose and the shape of neighboring parts. A key advantage of such a model is that it more accurately models object boundaries. This enables image likelihood models that are more discriminative than previous PS likelihoods. This likelihood is learned from training imagery annotated with a DS “puppet.” We focus on a human DS model learned from 2D projections of a realistic 3D human body model and use it to infer human poses in images using a form of non-parametric belief propagation.
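The per-part shape space at the heart of DS can be thought of as a low-dimensional linear (PCA-style) model over aligned part contours. A rough sketch of that building block with synthetic elliptical contours (dimensions and data are invented; this is not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)
n_examples, n_points, n_basis = 200, 30, 4

# Synthetic training contours for one body part: ellipses whose radii vary
# across examples, flattened to (x1, y1, ..., xP, yP) vectors.
theta = np.linspace(0, 2 * np.pi, n_points, endpoint=False)
radii = 1.0 + 0.3 * rng.normal(size=(n_examples, 2))      # per-example (a, b)
contours = np.stack([radii[:, :1] * np.cos(theta),
                     radii[:, 1:] * np.sin(theta)], axis=-1)
X = contours.reshape(n_examples, -1)                       # (N, 2P)

# PCA: mean contour plus a few deformation directions.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
basis = Vt[:n_basis]                                       # (n_basis, 2P)

# A part shape is now just a low-dimensional coefficient vector z.
z = np.array([1.5, -0.5, 0.0, 0.0]) * S[:n_basis] / np.sqrt(n_examples)
part_contour = (mean + z @ basis).reshape(n_points, 2)
print(part_contour[:3])
```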


Computer Vision and Pattern Recognition | 2010

Contour People: A parameterized model of 2D articulated human shape

Oren Freifeld; Alexander Weiss; Silvia Zuffi; Michael J. Black

We define a new “contour person” model of the human body that has the expressive power of a detailed 3D model and the computational benefits of a simple 2D part-based model. The contour person (CP) model is learned from a 3D SCAPE model of the human body that captures natural shape and pose variations; the projected contours of this model, along with their segmentation into parts, form the training set. The CP model factors deformations of the body into three components: shape variation, viewpoint change, and part rotation. The latter also incorporates a learned non-rigid deformation model. The result is a 2D articulated model that is compact to represent, simple to compute with, and more expressive than previous models. We demonstrate the value of such a model in 2D pose estimation and segmentation. Given an initial pose from a standard pictorial-structures method, we refine the pose and shape using an objective function that segments the scene into foreground and background regions. The result is a parametric, human-specific, image segmentation.
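The factorization into shape variation, viewpoint change, and part rotation can be pictured as composing a few 2D transforms on a template part contour. A toy sketch under that reading (the matrices below are placeholders, not learned CP parameters):

```python
import numpy as np

# Template contour of one body part (a simple rectangle outline), points as rows.
template = np.array([[0, 0], [1, 0], [1, 2], [0, 2]], dtype=float)

def apply_deformation(points, shape_scale, view_shear, part_angle):
    """Compose the three CP-style factors: intrinsic shape change,
    viewpoint (here a simple shear as a crude stand-in for camera effects),
    and articulation as an in-plane part rotation."""
    shape = np.diag(shape_scale)                       # wider / taller body part
    view = np.array([[1.0, view_shear], [0.0, 1.0]])   # viewpoint proxy
    c, s = np.cos(part_angle), np.sin(part_angle)
    rot = np.array([[c, -s], [s, c]])                  # articulation
    return points @ (shape @ view @ rot).T

deformed = apply_deformation(template,
                             shape_scale=(1.2, 0.9),   # body shape variation
                             view_shear=0.15,
                             part_angle=np.pi / 8)
print(deformed)
```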


International Conference on Computer Vision | 2013

Estimating Human Pose with Flowing Puppets

Silvia Zuffi; Javier Romero; Cordelia Schmid; Michael J. Black

We address the problem of upper-body human pose estimation in uncontrolled monocular video sequences, without manual initialization. Most current methods focus on isolated video frames and often fail to correctly localize arms and hands. Inferring pose over a video sequence is advantageous because poses of people in adjacent frames exhibit properties of smooth variation due to the nature of human and camera motion. To exploit this, previous methods have used prior knowledge about distinctive actions or generic temporal priors combined with static image likelihoods to track people in motion. Here we take a different approach based on a simple observation: Information about how a person moves from frame to frame is present in the optical flow field. We develop an approach for tracking articulated motions that links articulated shape models of people in adjacent frames through the dense optical flow. Key to this approach is a 2D shape model of the body that we use to compute how the body moves over time. The resulting flowing puppets provide a way of integrating image evidence across frames to improve pose inference. We apply our method to a challenging dataset of TV video sequences and show state-of-the-art performance.
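The mechanism behind a flowing puppet is easy to sketch: look up the dense optical flow at the puppet's contour points in frame t and displace those points to predict the body's layout in frame t+1. A minimal illustration with a synthetic flow field and bilinear sampling (the (H, W, 2) flow layout is an assumption; this is not the authors' implementation):

```python
import numpy as np

def bilinear_flow(flow, pts):
    """Sample a dense flow field of shape (H, W, 2) at sub-pixel (x, y) points."""
    h, w = flow.shape[:2]
    x = np.clip(pts[:, 0], 0, w - 1.001)
    y = np.clip(pts[:, 1], 0, h - 1.001)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    fx, fy = x - x0, y - y0
    f00, f10 = flow[y0, x0], flow[y0, x0 + 1]
    f01, f11 = flow[y0 + 1, x0], flow[y0 + 1, x0 + 1]
    top = f00 * (1 - fx)[:, None] + f10 * fx[:, None]
    bot = f01 * (1 - fx)[:, None] + f11 * fx[:, None]
    return top * (1 - fy)[:, None] + bot * fy[:, None]

# Synthetic dense flow: the whole scene drifts 2 px right and 1 px down.
flow = np.zeros((120, 160, 2))
flow[..., 0], flow[..., 1] = 2.0, 1.0

# Puppet contour points in frame t, propagated to frame t+1 by the flow.
puppet_t = np.array([[40.5, 30.2], [55.0, 30.0], [55.0, 80.0], [40.0, 80.0]])
puppet_t1 = puppet_t + bilinear_flow(flow, puppet_t)
print(puppet_t1)
```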


Computer Vision and Pattern Recognition | 2015

The stitched puppet: A graphical model of 3D human shape and pose

Silvia Zuffi; Michael J. Black

We propose a new 3D model of the human body that is both realistic and part-based. The body is represented by a graphical model in which nodes of the graph correspond to body parts that can independently translate and rotate in 3D and deform to represent different body shapes and to capture pose-dependent shape variations. Pairwise potentials define a “stitching cost” for pulling the limbs apart, giving rise to the stitched puppet (SP) model. Unlike existing realistic 3D body models, the distributed representation facilitates inference by allowing the model to more effectively explore the space of poses, much like existing 2D pictorial structures models. We infer pose and body shape using a form of particle-based max-product belief propagation. This gives SP the realism of recent 3D body models with the computational advantages of part-based models. We apply SP to two challenging problems involving estimating human shape and pose from 3D data. The first is the FAUST mesh alignment challenge, where ours is the first method to successfully align all 3D meshes with no pose prior. The second involves estimating pose and shape from crude visual hull representations of complex body movements.
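The “stitching cost” can be illustrated as a pairwise penalty on the distance between corresponding interface points of two neighboring parts, each with its own 3D rotation and translation. A toy version with invented interface points (not the actual SP energy):

```python
import numpy as np

def stitching_cost(iface_a, iface_b, R_a, t_a, R_b, t_b):
    """Squared distance between corresponding interface points of two parts
    after each part is independently rotated and translated in 3D.
    Zero cost means the parts meet seamlessly; pulling them apart is penalized."""
    pa = iface_a @ R_a.T + t_a
    pb = iface_b @ R_b.T + t_b
    return float(np.sum((pa - pb) ** 2))

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1.0]])

# Matching interface rings (here just 4 points) on two adjacent parts.
upper_arm_iface = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
forearm_iface = upper_arm_iface.copy()

aligned = stitching_cost(upper_arm_iface, forearm_iface,
                         np.eye(3), np.zeros(3), np.eye(3), np.zeros(3))
pulled_apart = stitching_cost(upper_arm_iface, forearm_iface,
                              np.eye(3), np.zeros(3), rot_z(0.3), np.array([0.5, 0, 0]))
print(aligned, pulled_apart)
```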


International Conference on Computer Graphics and Interactive Techniques | 2016

Body talk: crowdshaping realistic 3D avatars with words

Stephan Streuber; M. Alejandra Quiros-Ramirez; Matthew Hill; Carina A. Hahn; Silvia Zuffi; Alice J. O'Toole; Michael J. Black

Realistic, metrically accurate, 3D human avatars are useful for games, shopping, virtual reality, and health applications. Such avatars are not in wide use because solutions for creating them from high-end scanners, low-cost range cameras, and tailoring measurements all have limitations. Here we propose a simple solution and show that it is surprisingly accurate. We use crowdsourcing to generate attribute ratings of 3D body shapes corresponding to standard linguistic descriptions of 3D shape. We then learn a linear function relating these ratings to 3D human shape parameters. Given an image of a new body, we again turn to the crowd for ratings of the body shape. The collection of linguistic ratings of a photograph provides remarkably strong constraints on the metric 3D shape. We call the process crowdshaping and show that our Body Talk system produces shapes that are perceptually indistinguishable from bodies created from high-resolution scans and that the metric accuracy is sufficient for many tasks. This makes body scanning practical without a scanner, opening up new applications including database search, visualization, and extracting avatars from books.
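The core regression step can be sketched in a few lines: given per-body averages of crowd attribute ratings and the corresponding body-shape coefficients, fit a linear map by least squares and use it to predict the shape of a new body from ratings alone. All data and dimensions below are synthetic assumptions, not the Body Talk training set:

```python
import numpy as np

rng = np.random.default_rng(2)
n_bodies, n_words, n_shape = 250, 30, 8     # hypothetical sizes

# Synthetic training data: averaged crowd ratings of linguistic attributes
# ("tall", "curvy", ...) and the matching body-model shape coefficients.
true_map = rng.normal(size=(n_words + 1, n_shape))
ratings = rng.uniform(1, 7, size=(n_bodies, n_words))          # 1..7 Likert means
R = np.hstack([ratings, np.ones((n_bodies, 1))])               # add bias column
betas = R @ true_map + 0.1 * rng.normal(size=(n_bodies, n_shape))

# Fit the linear function relating ratings to shape parameters.
W, *_ = np.linalg.lstsq(R, betas, rcond=None)

# Predict 3D shape coefficients for a new body from its crowd ratings alone.
new_ratings = rng.uniform(1, 7, size=(1, n_words))
predicted_shape = np.hstack([new_ratings, [[1.0]]]) @ W
print(predicted_shape.round(2))
```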


Computer Vision and Pattern Recognition | 2017

3D Menagerie: Modeling the 3D Shape and Pose of Animals

Silvia Zuffi; Angjoo Kanazawa; David W. Jacobs; Michael J. Black

There has been significant work on learning realistic, articulated, 3D models of the human body. In contrast, there are few such models of animals, despite many applications. The main challenge is that animals are much less cooperative than humans. The best human body models are learned from thousands of 3D scans of people in specific poses, which is infeasible with live animals. Consequently, we learn our model from a small set of 3D scans of toy figurines in arbitrary poses. We employ a novel part-based shape model to compute an initial registration to the scans. We then normalize their pose, learn a statistical shape model, and refine the registrations and the model together. In this way, we accurately align animal scans from different quadruped families with very different shapes and poses. With the registration to a common template we learn a shape space representing animals including lions, cats, dogs, horses, cows and hippos. Animal shapes can be sampled from the model, posed, animated, and fit to data. We demonstrate generalization by fitting it to images of real animals including species not seen in training.
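Once the toy scans are registered to a common template and pose-normalized, the statistical shape model is essentially a low-dimensional linear space over registered vertices, and fitting amounts to projecting a new registration onto that space. A compact sketch with synthetic vertex data (all sizes invented; this is not the model's actual parameterization):

```python
import numpy as np

rng = np.random.default_rng(3)
n_scans, n_verts, n_modes = 40, 500, 10      # hypothetical sizes

# Registered, pose-normalized toy-animal meshes sharing one vertex ordering,
# flattened to one row per scan (synthetic stand-ins for real registrations).
template = rng.normal(size=(n_verts, 3))
meshes = (template + 0.05 * rng.normal(size=(n_scans, n_verts, 3))).reshape(n_scans, -1)

# Learn the shape space: mean animal plus principal deformation modes.
mean_shape = meshes.mean(axis=0)
_, _, Vt = np.linalg.svd(meshes - mean_shape, full_matrices=False)
modes = Vt[:n_modes]                                   # (n_modes, 3 * n_verts)

# "Fit to data": express a new registered scan in the learned space by
# projecting onto the modes, then reconstruct it from those coefficients.
new_scan = (template + 0.05 * rng.normal(size=(n_verts, 3))).reshape(-1)
coeffs = modes @ (new_scan - mean_shape)               # orthonormal rows => projection
reconstruction = (mean_shape + coeffs @ modes).reshape(n_verts, 3)
print(np.abs(reconstruction.reshape(-1) - new_scan).max())
```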


Journal of Electronic Imaging | 2010

Comparing image preference in controlled and uncontrolled viewing conditions

Silvia Zuffi; Carla Brambilla; Reiner Eschbach; Alessandro Rizzi

We examine the relationship between controlled and uncontrolled visual preference tests. We compare the results for the preference of printed images in various viewing environments. The data are examined with regard to different numbers of observer subsets, and we derive an experimental guideline for when controlled and uncontrolled preference experiments can be considered equivalent, based on the certainty of the expected result.
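The guideline rests on how agreement between the two settings behaves as observer subsets grow: with enough observers, the mean preference rankings stabilize and the controlled and uncontrolled experiments converge. A small simulation of that idea with synthetic preference scores (all numbers, including the noise levels, are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
n_images = 12
latent = rng.normal(size=n_images)     # "true" image quality shared by both settings

def mean_ranking(n_observers, noise):
    """Average preference scores from one observer subset, return the ranking."""
    scores = latent + noise * rng.normal(size=(n_observers, n_images))
    return np.argsort(-scores.mean(axis=0))

def rank_agreement(r1, r2):
    """Spearman rank correlation between two rankings of the same images."""
    pos1, pos2 = np.empty_like(r1), np.empty_like(r2)
    pos1[r1], pos2[r2] = np.arange(len(r1)), np.arange(len(r2))
    return np.corrcoef(pos1, pos2)[0, 1]

for n in (3, 10, 30, 100):
    controlled = mean_ranking(n, noise=1.0)     # lab viewing: lower noise assumed
    uncontrolled = mean_ranking(n, noise=2.0)   # home/office viewing: noisier
    print(n, "observers -> agreement", round(rank_agreement(controlled, uncontrolled), 2))
```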


Computer Vision and Pattern Recognition | 2018

Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape From Images

Silvia Zuffi; Angjoo Kanazawa; Michael J. Black


Archive | 2017

Crowdshaping realistic 3D avatars with words

Stephan Streuber; Maria Alejandra Quirós-Ramírez; Michael J. Black; Silvia Zuffi; Alice J. O'Toole; Matthew Hill; Carina A. Hahn

Collaboration


Dive into Silvia Zuffi's collaborations.

Top Co-Authors

Alice J. O'Toole (University of Texas at Dallas)
Carina A. Hahn (University of Texas at Dallas)
Matthew Hill (University of Texas at Dallas)
Hueihan Jhuang (Massachusetts Institute of Technology)
Carla Brambilla (National Research Council)