
Publication


Featured research publications by Iasonas Kokkinos.


Computer Vision and Pattern Recognition | 2010

Scale-invariant heat kernel signatures for non-rigid shape recognition

Michael M. Bronstein; Iasonas Kokkinos

One of the biggest challenges in non-rigid shape retrieval and comparison is the design of a shape descriptor that would maintain invariance under a wide class of transformations the shape can undergo. Recently, heat kernel signature was introduced as an intrinsic local shape descriptor based on diffusion scale-space analysis. In this paper, we develop a scale-invariant version of the heat kernel descriptor. Our construction is based on a logarithmically sampled scale-space in which shape scaling corresponds, up to a multiplicative constant, to a translation. This translation is undone using the magnitude of the Fourier transform. The proposed scale-invariant local descriptors can be used in the bag-of-features framework for shape retrieval in the presence of transformations such as isometric deformations, missing data, topological noise, and global and local scaling. We get significant performance improvement over state-of-the-art algorithms on recently established non-rigid shape retrieval benchmarks.
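The core trick, scaling becoming a translation under logarithmic sampling and the Fourier transform modulus (FTM) cancelling that translation, can be illustrated with a small NumPy sketch (a hypothetical toy signal, not the paper's actual heat kernel descriptor):

```python
import numpy as np

# A descriptor sampled over a logarithmic scale axis tau = log(t):
# scaling t -> a*t becomes the shift tau -> tau + log(a).
tau = np.linspace(-4.0, 4.0, 512, endpoint=False)
f = np.exp(-tau ** 2)                 # toy log-sampled signature

log_a = 0.75                          # corresponds to scaling by a = e^0.75
step = tau[1] - tau[0]
f_scaled = np.roll(f, int(round(log_a / step)))  # circular shift stands in for the translation

# The Fourier transform modulus is invariant to the (circular) translation.
ftm = np.abs(np.fft.fft(f))
ftm_scaled = np.abs(np.fft.fft(f_scaled))
print(np.max(np.abs(ftm - ftm_scaled)))  # ~0: the descriptor is unchanged by scaling
```

A circular shift only multiplies each Fourier coefficient by a unit-modulus phase factor, so taking the magnitude removes the scale dependence exactly.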


Computer Vision and Pattern Recognition | 2012

Discovering discriminative action parts from mid-level video representations

Michalis Raptis; Iasonas Kokkinos; Stefano Soatto

We describe a mid-level approach for action recognition. From an input video, we extract salient spatio-temporal structures by forming clusters of trajectories that serve as candidates for the parts of an action. The assembly of these clusters into an action class is governed by a graphical model that incorporates appearance and motion constraints for the individual parts and pairwise constraints for the spatio-temporal dependencies among them. During training, we estimate the model parameters discriminatively. During classification, we efficiently match the model to a video using discrete optimization. We validate the model's classification ability on standard benchmark datasets and illustrate its potential to support a fine-grained analysis that not only gives a label to a video, but also identifies and localizes its constituent parts.


International Conference on Computer Vision | 2015

Discriminative Learning of Deep Convolutional Feature Point Descriptors

Edgar Simo-Serra; Eduard Trulls; Luis Ferraz; Iasonas Kokkinos; Pascal Fua; Francesc Moreno-Noguer

Deep learning has revolutionized image-level tasks such as classification, but patch-level tasks, such as correspondence, still rely on hand-crafted features, e.g. SIFT. In this paper we use Convolutional Neural Networks (CNNs) to learn discriminant patch representations and in particular train a Siamese network with pairs of (non-)corresponding patches. We deal with the large number of potential pairs with the combination of a stochastic sampling of the training set and an aggressive mining strategy biased towards patches that are hard to classify. By using the L2 distance during both training and testing we develop 128-D descriptors whose Euclidean distances reflect patch similarity, and which can be used as a drop-in replacement for any task involving SIFT. We demonstrate consistent performance gains over the state of the art, and generalize well against scaling and rotation, perspective transformation, non-rigid deformation, and illumination changes. Our descriptors are efficient to compute and amenable to modern GPUs, and are publicly available.
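The training objective described above can be sketched in a few lines of NumPy (hypothetical random descriptors and a plain contrastive hinge loss; the actual network architecture, margin, and mining schedule in the paper differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_dist(a, b):
    # Euclidean distance between batches of 128-D descriptors, shape (N, 128)
    return np.sqrt(np.sum((a - b) ** 2, axis=1))

def siamese_hinge_loss(d1, d2, y, margin=1.0):
    # y = 1 for corresponding patch pairs, 0 for non-corresponding pairs
    d = l2_dist(d1, d2)
    pos = y * d                                  # pull matching pairs together
    neg = (1 - y) * np.maximum(0.0, margin - d)  # push non-matches beyond the margin
    return pos + neg

# Hypothetical descriptors for a batch of 8 patch pairs
d1 = rng.normal(size=(8, 128))
d2 = rng.normal(size=(8, 128))
y = rng.integers(0, 2, size=8)

losses = siamese_hinge_loss(d1, d2, y)

# Mining in the spirit of the paper: keep only the hardest (highest-loss) half
hard = np.argsort(losses)[-4:]
print(losses[hard])
```

Backpropagating only through the hard subset is what biases training towards patches that are difficult to classify.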


International Journal of Computer Vision | 2016

Deep Filter Banks for Texture Recognition, Description, and Segmentation

Mircea Cimpoi; Subhransu Maji; Iasonas Kokkinos; Andrea Vedaldi

Visual textures have played a key role in image understanding because they convey important semantics of images, and because texture representations that pool local image descriptors in an orderless manner have had a tremendous impact in diverse applications. In this paper we make several contributions to texture understanding. First, instead of focusing on texture instance and material category recognition, we propose a human-interpretable vocabulary of texture attributes to describe common texture patterns, complemented by a new describable texture dataset for benchmarking. Second, we look at the problem of recognizing materials and texture attributes in realistic imaging conditions, including when textures appear in clutter, developing corresponding benchmarks on top of the recently proposed OpenSurfaces dataset. Third, we revisit classic texture representations, including bag-of-visual-words and the Fisher vectors, in the context of deep learning and show that these have excellent efficiency and generalization properties if the convolutional layers of a deep model are used as filter banks. We obtain in this manner state-of-the-art performance in numerous datasets well beyond textures, an efficient method to apply deep features to image regions, as well as benefit in transferring features from one domain to another.
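The "convolutional layers as filter banks plus orderless pooling" idea can be caricatured in NumPy (the responses below are random stand-ins for convolutional feature maps, not real CNN outputs):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for conv-layer outputs: 64-D local responses at 14x14 positions
responses = rng.normal(size=(14 * 14, 64))

def orderless_pool(r):
    # Orderless pooling: the texture descriptor ignores where each response came from.
    # (Mean pooling here; Fisher vectors are a richer orderless aggregation.)
    return r.mean(axis=0)

desc = orderless_pool(responses)
shuffled = orderless_pool(rng.permutation(responses, axis=0))
print(np.allclose(desc, shuffled))  # True: spatial layout is discarded
```

Discarding spatial layout is exactly what makes such descriptors well suited to texture, where appearance statistics matter more than geometry.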


Computer Vision and Pattern Recognition | 2008

Scale invariance without scale selection

Iasonas Kokkinos; Alan L. Yuille

In this work we construct scale invariant descriptors (SIDs) without requiring the estimation of image scale; we thereby avoid scale selection which is often unreliable. Our starting point is a combination of log-polar sampling and spatially-varying smoothing that converts image scalings and rotations into translations. Scale invariance can then be guaranteed by estimating the Fourier transform modulus (FTM) of the formed signal as the FTM is translation invariant. We build our descriptors using phase, orientation and amplitude features that compactly capture the local image structure. Our results show that the constructed SIDs outperform state-of-the-art descriptors on standard datasets. A main advantage of SIDs is that they are applicable to a broader range of image structures, such as edges, for which scale selection is unreliable. We demonstrate this by combining SIDs with contour segments and show that the performance of a boundary-based model is systematically improved on an object detection task.


IEEE Transactions on Speech and Audio Processing | 2005

Nonlinear speech analysis using models for chaotic systems

Iasonas Kokkinos; Petros Maragos

In this paper, we use concepts and methods from chaotic systems to model and analyze nonlinear dynamics in speech signals. The modeling is done not on the scalar speech signal, but on its reconstructed multidimensional attractor by embedding the scalar signal into a phase space. We have analyzed and compared a variety of nonlinear models for approximating the dynamics of complex systems using a small record of their observed output. These models include approximations based on global or local polynomials as well as approximations inspired from machine learning such as radial basis function networks, fuzzy-logic systems and support vector machines. Our focus has been on facilitating the application of the methods of chaotic signal analysis even when only a short time series is available, like phonemes in speech utterances. This introduced an increased degree of difficulty that was dealt with by resorting to sophisticated function approximation models that are appropriate for short data sets. Using these models enabled us to compute for short time series of speech sounds useful features like Lyapunov exponents that are used to assist in the characterization of chaotic systems. Several experimental insights are reported on the possible applications of such nonlinear models and features.
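The phase-space reconstruction step, embedding the scalar signal by the method of delays, can be sketched as follows (the embedding dimension and delay below are arbitrary illustrative choices, not values from the paper):

```python
import numpy as np

def delay_embed(x, dim, delay):
    # Reconstruct a multidimensional attractor from a scalar time series:
    # X[n] = (x[n], x[n+delay], ..., x[n+(dim-1)*delay])
    n = len(x) - (dim - 1) * delay
    return np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)

# Short toy signal; a phoneme-length record is only a few hundred samples,
# which is why the paper needs function approximators suited to short data sets.
t = np.arange(400)
x = np.sin(0.07 * t) + 0.5 * np.sin(0.19 * t)

attractor = delay_embed(x, dim=3, delay=7)
print(attractor.shape)  # (386, 3): 386 points on a 3-D reconstructed attractor
```

Features such as Lyapunov exponents are then estimated from the geometry of trajectories in this reconstructed space rather than from the raw scalar signal.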


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2009

Texture Analysis and Segmentation Using Modulation Features, Generative Models, and Weighted Curve Evolution

Iasonas Kokkinos; Georgios Evangelopoulos; Petros Maragos

In this work we approach the analysis and segmentation of natural textured images by combining ideas from image analysis and probabilistic modeling. We rely on AM-FM texture models and specifically on the Dominant Component Analysis (DCA) paradigm for feature extraction. This method provides a low-dimensional, dense and smooth descriptor, capturing essential aspects of texture, namely scale, orientation, and contrast. Our contributions are at three levels of the texture analysis and segmentation problems: First, at the feature extraction stage we propose a regularized demodulation algorithm that provides more robust texture features and explore the merits of modifying the channel selection criterion of DCA. Second, we propose a probabilistic interpretation of DCA and Gabor filtering in general, in terms of Local Generative Models. Extending this point of view to edge detection facilitates the estimation of posterior probabilities for the edge and texture classes. Third, we propose the weighted curve evolution scheme that enhances the Region Competition/Geodesic Active Regions methods by allowing for the locally adaptive fusion of heterogeneous cues. Our segmentation results are evaluated on the Berkeley Segmentation Benchmark, and compare favorably to current state-of-the-art methods.
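As a minimal illustration of the AM-FM viewpoint (not the paper's regularized demodulation algorithm), the classical Teager-Kaiser energy operator couples the amplitude and frequency of a locally sinusoidal texture component:

```python
import numpy as np

def teager(x):
    # Discrete Teager-Kaiser energy operator: Psi[x](n) = x[n]^2 - x[n-1]*x[n+1]
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure sinusoid A*cos(W*n + phi), the operator is exactly A^2 * sin(W)^2:
# it tracks the product of amplitude and (approximately) frequency of the component.
A, W = 2.0, 0.3
n = np.arange(200)
x = A * np.cos(W * n + 0.5)

psi = teager(x)
print(psi.mean())  # every entry equals A**2 * sin(W)**2, approx 0.3493
```

Demodulation schemes such as DESA build on this operator to recover the instantaneous amplitude and frequency separately; the paper's contribution is a regularized variant that is more robust on real textures.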


Computer Vision and Pattern Recognition | 2017

UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory

Iasonas Kokkinos

In this work we train in an end-to-end manner a convolutional neural network (CNN) that jointly handles low-, mid-, and high-level vision tasks in a unified architecture. Such a network can act like a Swiss army knife for vision tasks; we call it an UberNet to indicate its overarching nature. The main contribution of this work consists in handling challenges that emerge when scaling up to many tasks. We introduce techniques that facilitate (i) training a deep architecture while relying on diverse training sets and (ii) training many (potentially unlimited) tasks with a limited memory budget. This allows us to train in an end-to-end manner a unified CNN architecture that jointly handles (a) boundary detection, (b) normal estimation, (c) saliency estimation, (d) semantic segmentation, (e) human part segmentation, (f) semantic boundary detection, and (g) region proposal generation and object detection. We obtain competitive performance while jointly addressing all tasks in 0.7 seconds on a GPU. Our system will be made publicly available.


Computer Vision and Pattern Recognition | 2014

Understanding Objects in Detail with Fine-Grained Attributes

Andrea Vedaldi; Siddharth Mahendran; Stavros Tsogkas; Subhransu Maji; Ross B. Girshick; Juho Kannala; Esa Rahtu; Iasonas Kokkinos; Matthew B. Blaschko; David Weiss; Ben Taskar; Karen Simonyan; Naomi Saphra; Sammy Mohamed

We study the problem of understanding objects in detail, intended as recognizing a wide array of fine-grained object attributes. To this end, we introduce a dataset of 7,413 airplanes annotated in detail with parts and their attributes, leveraging images donated by airplane spotters and crowd-sourcing both the design and collection of the detailed annotations. We provide a number of insights that should help researchers interested in designing fine-grained datasets for other basic level categories. We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object. We note that the prediction of certain attributes can benefit substantially from accurate part detection. We also show that, differently from previous results in object detection, employing a large number of part templates can improve detection accuracy at the expense of detection speed. We finally propose a coarse-to-fine approach to speed up detection through a hierarchical cascade algorithm.


European Conference on Computer Vision | 2012

Learning-Based Symmetry Detection in Natural Images

Stavros Tsogkas; Iasonas Kokkinos

In this work we propose a learning-based approach to symmetry detection in natural images. We focus on ribbon-like structures, i.e. contours marking local and approximate reflection symmetry, and make three contributions to improve their detection. First, we create and make publicly available a ground-truth dataset for this task by building on the Berkeley Segmentation Dataset. Second, we extract features representing multiple complementary cues, such as grayscale structure, color, texture, and spectral clustering information. Third, we use supervised learning to learn how to combine these cues, and employ multiple instance learning (MIL) to accommodate the unknown scale and orientation of the symmetric structures. We systematically evaluate the performance contribution of each individual component in our pipeline, and demonstrate that overall we consistently improve upon results obtained using existing alternatives.

Collaboration


Dive into Iasonas Kokkinos's collaborations.

Top Co-Authors

Petros Maragos (National Technical University of Athens)
Alan L. Yuille (Johns Hopkins University)
Stavros Tsogkas (École des ponts ParisTech)
Subhransu Maji (University of Massachusetts Amherst)
Eduard Trulls (Spanish National Research Council)