Piotr Koniusz | Researchain

Publication

Featured research published by Piotr Koniusz.

Computer Vision and Image Understanding | 2013

Comparison of mid-level feature coding approaches and pooling strategies in visual concept detection

Piotr Koniusz; Fei Yan; Krystian Mikolajczyk

Bag-of-Words lies at the heart of modern object category recognition systems. After descriptors are extracted from images, they are expressed as vectors representing visual word content, referred to as mid-level features. In this paper, we review a number of techniques for generating mid-level features, including two variants of Soft Assignment, Locality-constrained Linear Coding, and Sparse Coding. We also isolate the underlying properties that affect their performance. Moreover, we investigate various pooling methods that aggregate mid-level features into vectors representing images. Average pooling, Max-pooling, and a family of likelihood-inspired pooling strategies are scrutinised. We demonstrate how both coding schemes and pooling methods interact with each other. We generalise the investigated pooling methods to account for the descriptor interdependence and introduce an intuitive concept of improved pooling. We also propose a coding-related improvement to increase its speed. Lastly, state-of-the-art performance in classification is demonstrated on Caltech101, Flower17, and ImageCLEF11 datasets.
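As a rough illustration of the pooling step described in this abstract, the sketch below aggregates a handful of mid-level feature vectors into one image signature using Average pooling and Max-pooling. The function names and the toy values are assumptions made for this example only; the likelihood-inspired pooling family from the paper is not reproduced here.

```python
def average_pooling(features):
    """Average each visual-word response over all descriptors of one image."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def max_pooling(features):
    """Keep the strongest response per visual word across all descriptors."""
    return [max(column) for column in zip(*features)]


# Toy data (hypothetical): three descriptors coded against four visual words.
mid_level = [
    [0.50, 0.50, 0.0, 0.0],
    [0.00, 1.00, 0.0, 0.0],
    [0.25, 0.25, 0.5, 0.0],
]

avg_signature = average_pooling(mid_level)
max_signature = max_pooling(mid_level)
```

Average pooling keeps a count-like summary of how often each visual word responds, whereas Max-pooling only records whether a word responded strongly at least once.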

International Conference on Image Processing | 2011

Spatial Coordinate Coding to reduce histogram representations, Dominant Angle and Colour Pyramid Match

Piotr Koniusz; Krystian Mikolajczyk

Spatial Pyramid Match lies at the heart of modern object category recognition systems. Once image descriptors are expressed as histograms of visual words, they are further deployed across a spatial pyramid with coarse-to-fine spatial location grids. However, such a representation results in extreme histogram vectors of 200K or more elements, increasing computational and memory requirements. This paper investigates alternative ways of introducing spatial information during formation of histograms. Specifically, we propose to apply spatial location information at a descriptor level and refer to it as Spatial Coordinate Coding. Alternatively, x, y, radius, or angle is used to perform semi-coding. This is achieved by adding one of the spatial components at the descriptor level whilst applying Pyramid Match to another. Lastly, we demonstrate that Pyramid Match can be applied robustly to other measurements: Dominant Angle and Colour. We demonstrate state-of-the-art results on two datasets by means of Soft Assignment and Sparse Coding.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017

Higher-Order Occurrence Pooling for Bags-of-Words: Visual Concept Detection

Piotr Koniusz; Fei Yan; Philippe Henri Gosselin; Krystian Mikolajczyk

In object recognition, the Bag-of-Words model assumes: i) extraction of local descriptors from images, ii) embedding the descriptors by a coder to a given visual vocabulary space which results in mid-level features, iii) extracting statistics from mid-level features with a pooling operator that aggregates occurrences of visual words in images into signatures, which we refer to as First-order Occurrence Pooling. This paper investigates higher-order pooling that aggregates over co-occurrences of visual words. We derive Bag-of-Words with Higher-order Occurrence Pooling based on linearisation of Minor Polynomial Kernel, and extend this model to work with various pooling operators. This approach is then effectively used for fusion of various descriptor types. Moreover, we introduce Higher-order Occurrence Pooling performed directly on local image descriptors as well as a novel pooling operator that reduces the correlation in the image signatures. Finally, First-, Second-, and Third-order Occurrence Pooling are evaluated given various coders and pooling operators on several widely used benchmarks. The proposed methods are compared to other approaches such as Fisher Vector Encoding and demonstrate improved results.
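To make the difference between first-order and second-order pooling concrete, the sketch below averages outer products of mid-level feature vectors, so that each entry of the result measures how strongly two visual words co-occur. It is only a minimal illustration with average pooling: the function names and toy values are assumptions, and the paper's derivation from kernel linearisation and its other pooling operators are not reproduced.

```python
def second_order_pooling(features):
    """Average the outer products phi * phi^T over all mid-level features."""
    n = len(features)
    k = len(features[0])
    pooled = [[0.0] * k for _ in range(k)]
    for phi in features:
        for i in range(k):
            for j in range(k):
                pooled[i][j] += phi[i] * phi[j] / n
    return pooled


def upper_triangle(matrix):
    """Flatten the non-redundant entries of a symmetric matrix into a vector."""
    k = len(matrix)
    return [matrix[i][j] for i in range(k) for j in range(i, k)]


# Toy data (hypothetical): three descriptors coded against two visual words.
mid_level = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

co_occurrences = second_order_pooling(mid_level)
signature = upper_triangle(co_occurrences)
```

Because the pooled matrix is symmetric, only its upper triangle needs to be kept, which gives k(k+1)/2 entries for a vocabulary of k visual words.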

European Conference on Computer Vision | 2016

Tensor Representations via Kernel Linearization for Action Recognition from 3D Skeletons

Piotr Koniusz; Anoop Cherian; Fatih Porikli

In this paper, we explore tensor representations that can compactly capture higher-order relationships between skeleton joints for 3D action recognition. We first define RBF kernels on 3D joint sequences, which are then linearized to form kernel descriptors. The higher-order outer-products of these kernel descriptors form our tensor representations. We present two different kernels for action recognition, namely (i) a sequence compatibility kernel that captures the spatio-temporal compatibility of joints in one sequence against those in the other, and (ii) a dynamics compatibility kernel that explicitly models the action dynamics of a sequence. Tensors formed from these kernels are then used to train an SVM. We present experiments on several benchmark datasets and demonstrate state-of-the-art results, substantiating the effectiveness of our representations.

Computer Vision and Pattern Recognition | 2017

Domain Adaptation by Mixture of Alignments of Second- or Higher-Order Scatter Tensors

Piotr Koniusz; Yusuf Tas; Fatih Porikli

In this paper, we propose an approach to domain adaptation, dubbed Second- or Higher-order Transfer of Knowledge (So-HoT), based on the mixture of alignments of second- or higher-order scatter statistics between the source and target domains. The human ability to learn from few labeled samples is a recurring motivation in the literature for domain adaptation. Towards this end, we investigate the supervised target scenario for which few labeled target training samples per category exist. Specifically, we utilize two CNN streams: the source and target networks fused at the classifier level. Features from the fully connected layers fc7 of each network are used to compute second- or even higher-order scatter tensors, one per network stream per class. As the source and target distributions are somewhat different despite being related, we align the scatters of the two network streams of the same class (within-class scatters) to a desired degree with our bespoke loss while maintaining good separation of the between-class scatters. We train the entire network in an end-to-end fashion. We provide evaluations on the standard Office benchmark (visual domains) and RGB-D combined with Caltech256 (depth-to-RGB transfer). We attain state-of-the-art results.
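The idea of comparing second-order scatters of the same class across two streams can be sketched as follows. This is not the paper's bespoke loss: it is a hypothetical stand-in that computes a mean-centred scatter matrix per feature set and takes the squared Frobenius distance between them, with all names and values chosen for illustration.

```python
def scatter_matrix(features):
    """Second-order scatter (mean-centred covariance) of a set of feature vectors."""
    n = len(features)
    d = len(features[0])
    mean = [sum(column) / n for column in zip(*features)]
    centred = [[x - m for x, m in zip(row, mean)] for row in features]
    return [[sum(r[i] * r[j] for r in centred) / n for j in range(d)]
            for i in range(d)]


def alignment_distance(source, target):
    """Squared Frobenius distance between source and target scatter matrices."""
    s = scatter_matrix(source)
    t = scatter_matrix(target)
    return sum((a - b) ** 2 for row_s, row_t in zip(s, t)
               for a, b in zip(row_s, row_t))


# Toy features (hypothetical) for one class in each stream.
source_class = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
shifted_target = [[x + 1.0 for x in row] for row in source_class]
scaled_target = [[2.0 * x for x in row] for row in source_class]

shift_gap = alignment_distance(source_class, shifted_target)
scale_gap = alignment_distance(source_class, scaled_target)
```

In this toy setting a pure translation of the target features leaves the scatter unchanged, so the distance is close to zero, while rescaling the features changes the scatter and produces a positive distance.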

British Machine Vision Conference | 2009

Segmentation Based Interest Points and Evaluation of Unsupervised Image Segmentation Methods

Piotr Koniusz; Krystian Mikolajczyk

This paper investigates segmentation-based interest points for matching and recognition. We propose two simple methods for extracting features from the segmentation maps, which focus on the boundaries and centres of gravity of the segments. In addition, this can be considered a novel approach for evaluating unsupervised image segmentation algorithms. Former evaluations aim at estimating segmentation quality by how well resulting segments adhere to the contours separating ground-truth foregrounds from backgrounds and therefore explicitly focus on particular objects of interest. In contrast, we propose to measure the robustness of segmentations by the repeatability of features extracted from segments on images related by various geometric and photometric transformations. Further, our evaluation provides a new insight into the suitability of the segmentation methods for generating local features for image retrieval or recognition. Several segmentation methods are evaluated and compared to state-of-the-art interest point detectors using the repeatability criteria as well as standard matching and recognition benchmarks.

International Conference on Image Processing | 2011

Soft assignment of visual words as Linear Coordinate Coding and optimisation of its reconstruction error

Piotr Koniusz; Krystian Mikolajczyk

Visual Word Uncertainty, also referred to as Soft Assignment, is a well-established technique for representing images as histograms by flexible assignment of image descriptors to a visual vocabulary. Recently, the attention of the object category recognition community has been drawn to Linear Coordinate Coding methods. In this work, we focus on Soft Assignment as it yields good results amidst competitive methods. We show that one can take two views on Soft Assignment: an approach derived from the Gaussian Mixture Model or a special case of Linear Coordinate Coding. The latter view helps us propose how to optimise the smoothing factor of Soft Assignment in a way that minimises descriptor reconstruction error and maximises classification performance. In turn, this renders tedious cross-validation towards establishing this parameter unnecessary and makes Soft Assignment a handy technique. We demonstrate state-of-the-art performance of such optimised assignment on two image datasets and several types of descriptors.
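The link between the smoothing factor and the reconstruction error can be sketched in a few lines. The code below computes Gaussian-weighted memberships of a descriptor to each visual word, reconstructs the descriptor as a weighted sum of the words, and picks the smoothing factor with the lowest reconstruction error from a small grid. The grid search, the function names, and the toy vocabulary are assumptions for illustration; the paper's actual optimisation procedure may differ.

```python
import math


def soft_assign(descriptor, vocabulary, sigma):
    """Gaussian-weighted memberships of a descriptor to each visual word."""
    sq_dists = [sum((d - c) ** 2 for d, c in zip(descriptor, word))
                for word in vocabulary]
    nearest = min(sq_dists)  # subtracted for numerical stability
    weights = [math.exp(-(s - nearest) / (2.0 * sigma ** 2)) for s in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def reconstruction_error(descriptor, vocabulary, codes):
    """Squared error between a descriptor and its code-weighted sum of words."""
    recon = [sum(a * word[i] for a, word in zip(codes, vocabulary))
             for i in range(len(descriptor))]
    return sum((d - r) ** 2 for d, r in zip(descriptor, recon))


# Toy vocabulary and descriptor (hypothetical values).
vocabulary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
descriptor = [0.2, 0.1]
candidate_sigmas = [0.1, 0.2, 0.4, 0.8, 1.6]

errors = {
    s: reconstruction_error(descriptor, vocabulary,
                            soft_assign(descriptor, vocabulary, s))
    for s in candidate_sigmas
}
best_sigma = min(errors, key=errors.get)
codes = soft_assign(descriptor, vocabulary, best_sigma)
```

A very small smoothing factor collapses to hard assignment to the nearest word, and a very large one spreads the weights almost uniformly; the reconstruction error gives a criterion for choosing a value in between without cross-validating on classification accuracy.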

International Conference on Pattern Recognition | 2010

On a Quest for Image Descriptors Based on Unsupervised Segmentation Maps

Piotr Koniusz; Krystian Mikolajczyk

This paper investigates segmentation-based image descriptors for object category recognition. In contrast to commonly used interest points, the proposed descriptors are extracted from pairs of adjacent regions given by a segmentation method. In this way, we exploit semi-local structural information from the image. We propose to use the segments as spatial bins for descriptors of various image statistics based on gradient, colour and region shape. The proposed descriptors are validated on standard recognition benchmarks. Results show they outperform state-of-the-art reference descriptors with 5.6x less data and achieve comparable results to them with 8.6x less data. The proposed descriptors are complementary to SIFT and achieve state-of-the-art results when combined with it within a kernel-based classifier.

Computer Vision and Pattern Recognition | 2016

Sparse Coding for Third-Order Super-Symmetric Tensor Descriptors with Application to Texture Recognition

Piotr Koniusz; Anoop Cherian

Super-symmetric tensors, a higher-order extension of scatter matrices, are becoming increasingly popular in machine learning and computer vision for modeling data statistics, co-occurrences, or even as visual descriptors. They were shown recently to outperform second-order approaches; however, the size of these tensors is exponential in the data dimensionality, which is a significant concern. In this paper, we study third-order super-symmetric tensor descriptors in the context of dictionary learning and sparse coding. For this purpose, we propose a novel non-linear third-order texture descriptor. Our goal is to approximate these tensors as sparse conic combinations of atoms from a learned dictionary. Apart from the significant benefits to tensor compression that this framework offers, our experiments demonstrate that the sparse coefficients produced by this scheme lead to better aggregation of high-dimensional data and showcase superior performance on two common computer vision tasks compared to the state of the art.

Workshop on Applications of Computer Vision | 2017

Higher-Order Pooling of CNN Features via Kernel Linearization for Action Recognition

Anoop Cherian; Piotr Koniusz; Stephen Gould

Most successful deep learning algorithms for action recognition extend models designed for image-based tasks such as object recognition to video. Such extensions are typically trained for actions on single video frames or very short clips, and then their predictions from sliding windows over the video sequence are pooled for recognizing the action at the sequence level. Usually this pooling step uses the first-order statistics of frame-level action predictions. In this paper, we explore the advantages of using higher-order correlations; specifically, we introduce Higher-order Kernel (HOK) descriptors generated from the late fusion of CNN classifier scores from all the frames in a sequence. To generate these descriptors, we use the idea of kernel linearization. Specifically, a similarity kernel matrix, which captures the temporal evolution of deep classifier scores, is first linearized into kernel feature maps. The HOK descriptors are then generated from the higher-order co-occurrences of these feature maps, and are then used as input to a video-level classifier. We provide experiments on two fine-grained action recognition datasets, and show that our scheme leads to state-of-the-art results.