Publication


Featured researches published by Andreas E. Savakis.


IEEE Transactions on Circuits and Systems for Video Technology | 2011

Online Distance Metric Learning for Object Tracking

Grigorios Tsagkatakis; Andreas E. Savakis

Tracking an object without any prior information regarding its appearance is a challenging problem. Modern tracking algorithms treat tracking as a binary classification problem between the object class and the background class. The binary classifier can be learned offline, if a specific object model is available, or online, if there is no prior information about the object's appearance. In this paper, we propose the use of online distance metric learning in combination with nearest neighbor classification for object tracking. We assume that the previous appearances of the object and the background are clustered so that a nearest neighbor classifier can be used to distinguish between the new appearance of the object and the appearance of the background. In order to support the classification, we employ a distance metric learning (DML) algorithm that learns to separate the object from the background. We utilize the first few frames to build an initial model of the object and the background and subsequently update the model at every frame during the course of tracking, so that changes in the appearance of the object and the background are incorporated into the model. Furthermore, instead of using only the previous frame as the object's model, we utilize a collection of previous appearances encoded in a template library to estimate the similarity under variations in appearance. In addition to the utilization of the online DML algorithm for learning the object/background model, we propose a novel feature representation of image patches. This representation is based on the extraction of scale invariant features over a regular grid coupled with dimensionality reduction using random projections. This type of representation is both robust, capitalizing on the reproducibility of the scale invariant features, and fast, performing the tracking in a reduced dimensional space. The proposed tracking algorithm was tested under challenging conditions and achieved state-of-the-art performance.
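As an illustration of the random-projection step described in the abstract, here is a minimal NumPy sketch (the dense grid of scale invariant features and the online metric learning are assumed to be computed elsewhere; a Gaussian projection matrix is one common choice, not necessarily the paper's exact construction):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(features, target_dim):
    """Project high-dimensional patch descriptors to a lower-dimensional
    space using a random Gaussian matrix (Johnson-Lindenstrauss style)."""
    d = features.shape[1]
    # Entries drawn i.i.d. from N(0, 1/target_dim) approximately preserve
    # pairwise distances, so nearest-neighbor classification still works.
    R = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(d, target_dim))
    return features @ R

# 16 grid descriptors of dimension 1024, reduced to 64 dimensions.
patches = rng.random((16, 1024))
reduced = random_projection(patches, 64)
print(reduced.shape)  # (16, 64)
```

Nearest-neighbor matching against the template library then operates in the 64-dimensional space, which is what makes the per-frame update affordable.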


International Conference on Computer Vision | 2011

Manifold based Sparse Representation for robust expression recognition without neutral subtraction

Raymond W. Ptucha; Grigorios Tsagkatakis; Andreas E. Savakis

This paper exploits the discriminative power of manifold learning in conjunction with the parsimonious power of sparse signal representation to perform robust facial expression recognition. By utilizing an ℓ1 reconstruction error and a statistical mixture model, both accuracy and tolerance to occlusion improve without the need to perform neutral frame subtraction. Initially, facial features are mapped onto a low dimensional manifold using supervised Locality Preserving Projections. Then an ℓ1 optimization is employed to relate surface projections to training exemplars, where reconstruction models on facial regions determine the expression class. Experiments were conducted in accordance with the recently published Extended Cohn-Kanade and GEMEP-FERA datasets. Results demonstrate that posed datasets overemphasize the mouth region, while spontaneous datasets rely more on the upper cheek and eye regions. Despite these differences, the proposed method overcomes previous limitations to using sparse methods for facial expression and produces state-of-the-art results on both types of datasets.
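The class-wise ℓ1 reconstruction idea can be sketched as follows. This is a generic sparse-representation classifier with an ISTA solver substituted for whichever ℓ1 solver the paper uses; all names and parameters are illustrative:

```python
import numpy as np

def ista_lasso(D, y, lam=0.05, iters=200):
    """Solve min_x 0.5*||Dx - y||^2 + lam*||x||_1 by iterative
    soft-thresholding (ISTA), a simple stand-in for any l1 solver."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        g = D.T @ (D @ x - y)              # gradient step
        x = x - g / L
        x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)  # shrinkage
    return x

def src_classify(D, labels, y, lam=0.05):
    """Sparse-representation classification: keep only the coefficients
    belonging to each class and pick the class with smallest residual."""
    x = ista_lasso(D, y, lam)
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        xc = np.where(mask, x, 0.0)
        residuals[c] = np.linalg.norm(y - D @ xc)
    return min(residuals, key=residuals.get)
```

In the paper's setting the columns of `D` would be manifold projections of training exemplars and the residual would be computed per facial region rather than on the whole vector.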


Image and Vision Computing | 2013

Manifold based sparse representation for facial understanding in natural images

Raymond W. Ptucha; Andreas E. Savakis

Sparse representations, motivated by strong evidence of sparsity in the primate visual cortex, are gaining popularity in the computer vision and pattern recognition fields, yet sparse methods have not gained widespread acceptance in the facial understanding communities. A main criticism brought forward by recent publications is that sparse reconstruction models work well with controlled datasets, but exhibit coefficient contamination in natural datasets. To better handle facial understanding problems, specifically the broad category of facial classification problems, an improved sparse paradigm is introduced in this paper. Our paradigm combines manifold learning for dimensionality reduction, based on a newly introduced variant of semi-supervised Locality Preserving Projections, with an ℓ1 reconstruction error and a region-based statistical inference model. We demonstrate state-of-the-art classification accuracy for the facial understanding problems of expression, gender, race, glasses, and facial hair classification. Our method minimizes coefficient contamination and offers a unique advantage over other facial classification methods when dealing with occlusions. Experimental results are presented on multi-class as well as binary facial classification problems using the Labeled Faces in the Wild, Cohn-Kanade, Extended Cohn-Kanade, and GEMEP-FERA datasets, demonstrating how and under what conditions sparse representations can further the field of facial understanding.


IEEE Transactions on Image Processing | 2014

LGE-KSVD: Robust Sparse Representation Classification

Raymond W. Ptucha; Andreas E. Savakis

The parsimonious nature of sparse representations has been successfully exploited for the development of highly accurate classifiers for various scientific applications. Despite the successes of sparse representation techniques, a large number of dictionary atoms as well as the high dimensionality of the data can make these classifiers computationally demanding. Furthermore, sparse classifiers are subject to the adverse effects of a phenomenon known as coefficient contamination, where, for example, variations in pose may affect identity and expression recognition. We analyze the interaction between dimensionality reduction and sparse representations, and propose a technique, called Linear extension of Graph Embedding K-means-based Singular Value Decomposition (LGE-KSVD), to address both issues of computational intensity and coefficient contamination. In particular, LGE-KSVD utilizes variants of the LGE to optimize the K-SVD, an iterative technique for small yet overcomplete dictionary learning. The dimensionality reduction matrix, sparse representation dictionary, sparse coefficients, and sparsity-based classifier are jointly learned through the LGE-KSVD. The atom optimization process is redefined to allow variable support using graph embedding techniques and produce a more flexible and elegant dictionary learning algorithm. Results are presented on a wide variety of facial and activity recognition problems that demonstrate the robustness of the proposed method.
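The standard K-SVD atom update that LGE-KSVD builds on can be sketched as follows (shown here in its usual fixed-support form, not the authors' variable-support variant):

```python
import numpy as np

def ksvd_atom_update(D, X, Y, k):
    """Standard K-SVD update of atom k: restrict to the signals whose
    sparse code uses atom k, form the residual without that atom, and
    replace atom and coefficients with the best rank-1 (SVD) fit."""
    omega = np.nonzero(X[k])[0]            # signals that use atom k
    if omega.size == 0:
        return D, X                        # atom unused; nothing to update
    # Residual of the selected signals with atom k's contribution removed.
    E = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                      # new unit-norm atom
    X[k, omega] = s[0] * Vt[0]             # matching coefficients
    return D, X
```

Because the rank-1 SVD fit is at least as good as the old atom/coefficient pair, each update can only decrease (or preserve) the reconstruction error, which is what makes the K-SVD iteration stable.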


2011 Western New York Image Processing Workshop | 2011

Interactive display using depth and RGB sensors for face and gesture control

Colin P. Bellmore; Raymond W. Ptucha; Andreas E. Savakis

This paper introduces an interactive display system guided by a human observer's gesture, facial pose, and facial expression. The Kinect depth sensor is used to detect and track an observer's skeletal joints while the RGB camera is used for detailed facial analysis. The display consists of active regions that the observer can manipulate with body gestures and secluded regions that are activated through head pose and facial expression. The observer receives real-time feedback allowing for intuitive navigation of the interface. A storefront interactive display was created and feedback was collected from over one hundred subjects. Promising results demonstrate the potential of the proposed approach for human-computer interaction applications.


International Conference on Multimedia and Expo | 2009

Facial pose estimation using a symmetrical feature model

Raymond W. Ptucha; Andreas E. Savakis

This paper presents a robust approach to performing facial pose estimation by examining the behavior of key facial features over a wide range of poses. Such methods are useful in intelligent vision systems for entertainment, human-computer interaction, and security. In our approach, faces of varying pose are automatically detected, eyes and mouth are located, and an active shape model is superimposed. A facial pose estimator is developed using predictor models based on the position, size, and symmetry of facial features. By modeling these predictors over pose positions with varying yaw and pitch, excellent results are obtained without the need for complex, computationally intensive methods.


Computer Vision and Pattern Recognition | 2013

Grassmannian Sparse Representations and Motion Depth Surfaces for 3D Action Recognition

Sherif Azary; Andreas E. Savakis

Manifold learning has been effectively used in computer vision applications for dimensionality reduction that improves classification performance and reduces computational load. Grassmann manifolds are well suited for computer vision problems because they promote smooth surfaces where points are represented as subspaces. In this paper we propose Grassmannian Sparse Representations (GSR), a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least squares loss L1-norm minimization for optimal classification. We further introduce a new descriptor that we term Motion Depth Surface (MDS) and compare its classification performance against the traditional Motion History Image (MHI) descriptor. We demonstrate the effectiveness of GSR on computationally intensive 3D action sequences from the Microsoft Research 3D-Action and 3D-Gesture datasets.
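A common way to measure distance on a Grassmann manifold, which GSR-style methods rely on, is via principal angles between subspaces. The sketch below uses the geodesic (arc-length) distance, one of several possible choices:

```python
import numpy as np

def orthonormalize(M):
    """Return an orthonormal basis for the column span of M."""
    Q, _ = np.linalg.qr(M)
    return Q

def grassmann_distance(A, B):
    """Geodesic distance between the subspaces spanned by the orthonormal
    columns of A and B: the l2 norm of the principal angles, which are
    the arccosines of the singular values of A^T B."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))   # principal angles
    return np.linalg.norm(theta)
```

In an action-recognition setting, each video clip would be summarized as a subspace (e.g., via the SVD of its descriptor matrix) and classified by its Grassmann distances to training subspaces.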


Electronic Imaging | 2005

Discrete Wavelet Transform Core for Image Processing Applications

Andreas E. Savakis; Richard Carbone

This paper presents a flexible hardware architecture for performing the Discrete Wavelet Transform (DWT) on a digital image. The proposed architecture uses a variation of the lifting scheme technique and provides advantages that include small memory requirements, fixed-point arithmetic implementation, and a small number of arithmetic computations. The DWT core may be used for image processing operations, such as denoising and image compression. For example, the JPEG2000 still image compression standard uses the Cohen-Daubechies-Feauveau (CDF) 5/3 and CDF 9/7 DWT for lossless and lossy image compression, respectively. Simple wavelet image denoising techniques resulted in improved images up to 27 dB PSNR. The DWT core is modeled using MATLAB and VHDL. The VHDL model is synthesized to a Xilinx FPGA to demonstrate hardware functionality. The CDF 5/3 and CDF 9/7 versions of the DWT are both modeled and used as comparisons. The execution time for performing both DWTs is nearly identical at approximately 14 clock cycles per image pixel for one level of DWT decomposition. The hardware area generated for the CDF 5/3 is around 15,000 gates using only 5% of the Xilinx FPGA hardware area, at 2.185 MHz max clock speed and 24 mW power consumption.
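The lifting scheme for the CDF 5/3 wavelet can be sketched in a few lines. This software version uses floating point and periodic boundary extension for brevity, whereas the hardware core described above uses fixed-point arithmetic (and, as in JPEG2000, likely symmetric extension):

```python
import numpy as np

def cdf53_forward(x):
    """One level of the CDF 5/3 DWT on a 1-D signal of even length via
    lifting: predict the odd samples from their even neighbors, then
    update the even samples from the resulting details."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2].copy(), x[1::2].copy()
    d = odd - 0.5 * (even + np.roll(even, -1))   # predict -> detail band
    a = even + 0.25 * (d + np.roll(d, 1))        # update  -> approximation
    return a, d

def cdf53_inverse(a, d):
    """Invert the lifting steps in reverse order and interleave."""
    even = a - 0.25 * (d + np.roll(d, 1))
    odd = d + 0.5 * (even + np.roll(even, -1))
    x = np.empty(even.size + odd.size)
    x[0::2], x[1::2] = even, odd
    return x
```

Because each lifting step is individually invertible, perfect reconstruction holds by construction, which is why the 5/3 filter pair supports lossless compression.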


International Conference on Image Processing | 2000

Evaluation of lossless compression methods for gray scale document images

Andreas E. Savakis

A comparative study of lossless compression algorithms is presented. The following algorithms are considered: UNIX compress, gzip, LZW, CCITT Group 3 and Group 4, JBIG, old lossless JPEG, JPEG-LS based on LOCO, CALIC, FELICS, S+P Transform, and PNG. In cases where the algorithm under consideration may only be applied to binary data, the bit planes of the gray scale image are separated, with and without Gray encoding, and the compression is applied to individual bit planes. Testing is done using a set of document images obtained by gray scale scanning of prints of the eight standard CCITT images and a set of nine gray scale pictorial images. The results show that the highest compression is obtained using the CALIC and JPEG-LS algorithms.
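The bit-plane separation with optional Gray encoding described above can be sketched as:

```python
import numpy as np

def to_bitplanes(img, gray=True):
    """Split an 8-bit grayscale image into 8 binary bit planes, optionally
    Gray-encoding pixel values first so that adjacent gray levels differ
    in a single bit (which makes the planes easier to compress)."""
    img = np.asarray(img, dtype=np.uint8)
    if gray:
        img = img ^ (img >> 1)                 # binary-reflected Gray code
    planes = [(img >> b) & 1 for b in range(8)]
    return np.stack(planes)                    # shape (8, H, W), LSB first
```

Each plane is then handed to a binary coder (e.g., JBIG or Group 4) independently, and the per-plane bitstreams together are a lossless encoding of the image.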


Pattern Recognition | 2001

Self-supervised texture segmentation using complementary types of features

Jiebo Luo; Andreas E. Savakis

A two-stage texture segmentation approach is proposed where an initial segmentation map is obtained through unsupervised clustering of multiresolution simultaneous autoregressive (MRSAR) features and is followed by self-supervised classification of wavelet features. The regions of “high confidence” and “low confidence” are identified based on the MRSAR segmentation result using multilevel morphological erosion. The second-stage classifier is trained by the “high-confidence” samples and is used to reclassify only the “low-confidence” pixels. The proposed approach leverages the advantages of both MRSAR and wavelet features. Experimental results show that the misclassification error can be significantly reduced by using complementary types of texture features.
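The erosion-based confidence split can be sketched as follows (a plain-NumPy 3x3 erosion stands in for whatever structuring elements the paper uses; the MRSAR clustering and wavelet-feature stages are omitted):

```python
import numpy as np

def binary_erosion(mask, iterations=1):
    """3x3 binary erosion via shifts: a pixel survives only if its entire
    8-neighborhood (plus itself) lies inside the region."""
    m = mask.astype(bool)
    for _ in range(iterations):
        padded = np.pad(m, 1, constant_values=False)
        shifts = [padded[1 + dy:padded.shape[0] - 1 + dy,
                         1 + dx:padded.shape[1] - 1 + dx]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
        m = np.logical_and.reduce(shifts)
    return m

def confidence_split(label_map, iterations=2):
    """Erode each class region: the eroded interiors are 'high confidence'
    pixels used to train the second-stage classifier; everything else is
    'low confidence' and gets reclassified."""
    high = np.zeros(label_map.shape, dtype=bool)
    for c in np.unique(label_map):
        high |= binary_erosion(label_map == c, iterations)
    return high, ~high
```

The intuition is that misclassifications cluster near region boundaries, so eroding each region keeps only pixels that are well inside it.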

Collaboration


Dive into Andreas E. Savakis's collaborations.

Top Co-Authors

Raymond W. Ptucha, Rochester Institute of Technology
Grigorios Tsagkatakis, Rochester Institute of Technology
Sherif Azary, Rochester Institute of Technology
Jiebo Luo, University of Rochester
Breton Minnehan, Rochester Institute of Technology
James Schimmel, Rochester Institute of Technology
Justin Hnatow, Rochester Institute of Technology