Tomasz Trzcinski | Researchain



Publication

Featured research published by Tomasz Trzcinski.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

BRIEF: Computing a Local Binary Descriptor Very Fast

Michael Calonder; Vincent Lepetit; Mustafa Özuysal; Tomasz Trzcinski; Christoph Strecha; Pascal Fua

Binary descriptors are becoming increasingly popular as a means to compare feature points very fast while requiring comparatively small amounts of memory. The typical approach to creating them is to first compute floating-point ones, using an algorithm such as SIFT, and then to binarize them. In this paper, we show that we can directly compute a binary descriptor, which we call BRIEF, on the basis of simple intensity difference tests. As a result, BRIEF is very fast both to build and to match. We compare it against SURF and SIFT on standard benchmarks and show that it yields comparable recognition accuracy, while running in an almost vanishing fraction of the time required by either.
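The idea of building a descriptor from simple intensity difference tests can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the uniform random sampling of test locations, and the omission of any patch smoothing are all assumptions made for brevity.

```python
import random


def make_test_pairs(patch_size, n_bits, seed=0):
    """Draw n_bits random pairs of pixel locations inside a square patch.

    Uniform sampling is an assumption of this sketch; other sampling
    patterns are possible.
    """
    rng = random.Random(seed)
    return [
        ((rng.randrange(patch_size), rng.randrange(patch_size)),
         (rng.randrange(patch_size), rng.randrange(patch_size)))
        for _ in range(n_bits)
    ]


def brief_descriptor(patch, pairs):
    """Pack one intensity-difference test per bit into a Python int."""
    bits = 0
    for i, ((y1, x1), (y2, x2)) in enumerate(pairs):
        if patch[y1][x1] < patch[y2][x2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")
```

Because each bit depends only on which of two pixels is brighter, adding a constant to every pixel leaves the descriptor unchanged, and matching reduces to an XOR followed by a bit count.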

Computer Vision and Pattern Recognition | 2013

Boosting Binary Keypoint Descriptors

Tomasz Trzcinski; C. Mario Christoudias; Pascal Fua; Vincent Lepetit

Binary keypoint descriptors provide an efficient alternative to their floating-point competitors as they enable faster processing while requiring less memory. In this paper, we propose a novel framework to learn an extremely compact binary descriptor we call BinBoost that is very robust to illumination and viewpoint changes. Each bit of our descriptor is computed with a boosted binary hash function, and we show how to efficiently optimize the different hash functions so that they complement each other, which is key to compactness and robustness. The hash functions rely on weak learners that are applied directly to the image patches, which frees us from any intermediate representation and lets us automatically learn the image gradient pooling configuration of the final descriptor. Our resulting descriptor significantly outperforms the state-of-the-art binary descriptors and performs similarly to the best floating-point descriptors at a fraction of the matching time and memory footprint.

European Conference on Computer Vision | 2012

Efficient discriminative projections for compact binary descriptors

Tomasz Trzcinski; Vincent Lepetit

Binary descriptors of image patches are increasingly popular given that they require less storage and enable faster processing. This, however, comes at the price of lower recognition performance. To boost this performance, we project the image patches to a more discriminative subspace and threshold their coordinates to build our binary descriptor. However, applying complex projections to the patches is slow, which negates some of the advantages of binary descriptors. Hence, our key idea is to learn the discriminative projections so that they can be decomposed into a small number of simple filters for which the responses can be computed fast. We show that with as few as 32 bits per descriptor we outperform the state-of-the-art binary descriptors in terms of both accuracy and efficiency.
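The project-then-threshold step described in this abstract can be illustrated with a short Python sketch. The projection vectors and thresholds below are hypothetical placeholders supplied by the caller; learning them, and decomposing them into simple filters, is the contribution of the paper and is not reproduced here.

```python
def binarize_projection(patch_vector, projections, thresholds):
    """Return one bit per projection: 1 if the response exceeds its threshold.

    patch_vector: flattened patch intensities.
    projections:  list of weight vectors (hypothetical, not learned here).
    thresholds:   one threshold per projection (also hypothetical).
    """
    bits = []
    for weights, threshold in zip(projections, thresholds):
        # Linear response of the patch along this projection direction.
        response = sum(w * x for w, x in zip(weights, patch_vector))
        bits.append(1 if response > threshold else 0)
    return bits
```

With 32 projections this yields a 32-bit descriptor; the cost is dominated by the dot products, which is why cheap-to-evaluate projections matter.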

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Learning Image Descriptors with Boosting

Tomasz Trzcinski; C. Mario Christoudias; Vincent Lepetit

We propose a novel and general framework to learn compact but highly discriminative floating-point and binary local feature descriptors. By leveraging the boosting-trick we first show how to efficiently train a compact floating-point descriptor that is very robust to illumination and viewpoint changes. We then present the main contribution of this paper—a binary extension of the framework that demonstrates the real advantage of our approach and allows us to compress the descriptor even further. Each bit of the resulting binary descriptor, which we call BinBoost, is computed with a boosted binary hash function, and we show how to efficiently optimize the hash functions so that they are complementary, which is key to compactness and robustness. As we do not put any constraints on the weak learner configuration underlying each hash function, our general framework allows us to optimize the sampling patterns of recently proposed hand-crafted descriptors and significantly improve their performance. Moreover, our boosting scheme can easily adapt to new applications and generalize to other types of image data, such as faces, while providing state-of-the-art results at a fraction of the matching time and memory footprint.

IEEE Transactions on Image Processing | 2014

Receptive Fields Selection for Binary Feature Description

Bin Fan; Qingqun Kong; Tomasz Trzcinski; Zhiheng Wang; Chunhong Pan; Pascal Fua

Feature description for local image patches is widely used in computer vision. While local descriptors have conventionally been designed from expert experience and knowledge, learning-based methods for designing them are becoming increasingly popular because of their good performance and data-driven nature. This paper proposes a novel data-driven method for designing a binary feature descriptor, which we call the receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding the responses of a set of receptive fields, which are selected greedily from a large number of candidates according to their distinctiveness and correlations. Using two different kinds of receptive fields for selection (namely rectangular pooling areas and Gaussian pooling areas), we obtain two binary descriptors, RFDR and RFDG, respectively. Image matching experiments on the well-known patch data set and the Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors and is comparable with the best float-valued descriptors at a fraction of the processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.

Pattern Recognition Letters | 2012

Thick boundaries in binary space and their influence on nearest-neighbor search

Tomasz Trzcinski; Vincent Lepetit; Pascal Fua

Binary descriptors allow faster similarity computation than real-valued ones while requiring much less storage. As a result, many algorithms have recently been proposed to binarize floating-point descriptors so that they can be searched for quickly. Unfortunately, even if the similarity between vectors can be computed fast, exhaustive linear search remains impractical for truly large databases and approximate nearest neighbor (ANN) search is still required. It is therefore surprising that relatively little attention has been paid to the efficiency of ANN algorithms on binary vectors, and this is the focus of this paper. We first show that binary-space Voronoi diagrams have thick boundaries, meaning that there are many points that lie at the same distance from two random points. This violates the implicit assumption made by most ANN algorithms that points can be neatly assigned to clusters centered around a set of cluster centers. As a result, state-of-the-art algorithms that can operate on binary vectors exhibit much lower performance than those that work with floating-point ones. The above analysis is the first contribution of the paper. The second is two effective ways to overcome this limitation, by appropriately randomizing either a tree-based algorithm or a hashing-based one. In both cases, we show that we obtain precision/recall curves that are similar to those that can be obtained using floating-point numbers, but at a much reduced computational cost.
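The "thick boundary" effect is easy to check numerically on a small binary space. The sketch below is an illustration of the claim, not the paper's analysis: it counts, by brute force, the fraction of n-bit vectors that are exactly equidistant (in Hamming distance) from two given points, and compares it with a simple closed form. The function names are chosen for this example.

```python
from math import comb


def hamming(a, b):
    """Hamming distance between two bit vectors stored as ints."""
    return bin(a ^ b).count("1")


def tie_fraction(a, b, n_bits):
    """Brute-force fraction of n-bit vectors equidistant from a and b."""
    total = 1 << n_bits
    ties = sum(1 for x in range(total) if hamming(x, a) == hamming(x, b))
    return ties / total


def tie_fraction_closed_form(a, b):
    """Same quantity without enumeration.

    Only the d bits where a and b differ matter: x is equidistant exactly
    when it agrees with a on half of them, which is impossible for odd d.
    """
    d = hamming(a, b)
    return comb(d, d // 2) / 2 ** d if d % 2 == 0 else 0.0
```

For two points that differ in 4 bits, 6 of every 16 vectors (37.5%) lie exactly on the boundary between them, whereas in a real-valued space the set of equidistant points has measure zero.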

Computer Vision and Pattern Recognition | 2017

Deep Alignment Network: A Convolutional Neural Network for Robust Face Alignment

Marek Kowalski; Jacek Naruniec; Tomasz Trzcinski

In this paper, we propose Deep Alignment Network (DAN), a robust face alignment method based on a deep neural network architecture. DAN consists of multiple stages, where each stage improves the locations of the facial landmarks estimated by the previous stage. Our method uses entire face images at all stages, contrary to the recently proposed face alignment methods that rely on local patches. This is possible thanks to the use of landmark heatmaps which provide visual information about landmark locations estimated at the previous stages of the algorithm. The use of entire face images rather than patches allows DAN to handle face images with large variation in head pose and difficult initializations. An extensive evaluation on two publicly available datasets shows that DAN reduces the state-of-the-art failure rate by up to 70%. Our method has also been submitted for evaluation as part of the Menpo challenge.

IEEE Transactions on Multimedia | 2017

Predicting Popularity of Online Videos Using Support Vector Regression

Tomasz Trzcinski; Przemyslaw Stefan Rokita

In this work, we propose a regression method to predict the popularity of an online video measured by its number of views. Our method uses Support Vector Regression with Gaussian radial basis functions. We show that predicting popularity patterns with this approach provides more precise and more stable prediction results, mainly thanks to the nonlinear character of the proposed method as well as its robustness. We prove the superiority of our method against the state of the art using datasets containing almost 24,000 videos from YouTube and Facebook. We also show that using visual features, such as the outputs of deep neural networks or scene dynamics metrics, can be useful for popularity prediction before content publication. Furthermore, we show that popularity prediction accuracy can be improved by combining early distribution patterns with social and visual features and that social features represent a much stronger signal in terms of video popularity prediction than the visual ones.

International Symposium on Methodologies for Intelligent Systems | 2017

Shallow Reading with Deep Learning: Predicting Popularity of Online Content Using only Its Title

Wojciech Stokowiec; Tomasz Trzcinski; Krzysztof Wołk; Krzysztof Marasek; Przemyslaw Stefan Rokita

With the ever-decreasing attention span of contemporary Internet users, the title of online content (such as a news article or video) can be a major factor in determining its popularity. To take advantage of this phenomenon, we propose a new method based on a bidirectional Long Short-Term Memory (LSTM) neural network designed to predict the popularity of online content using only its title. We evaluate the proposed architecture on two distinct datasets of news articles and news videos distributed in social media that contain over 40,000 samples in total. On those datasets, our approach improves the performance over traditional shallow approaches by a margin of 15%. Additionally, we show that using pre-trained word vectors in the embedding layer improves the results of LSTM models, especially when the training set is small. To our knowledge, this is the first attempt at popularity prediction using only textual information from the title.

International Conference on Information Systems | 2017

Speaker Diarization Using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings

Paweł Cyrta; Tomasz Trzcinski; Wojciech Stokowiec

In this paper, we propose a new method of speaker diarization that employs a deep learning architecture to learn speaker embeddings. In contrast to traditional approaches that build their speaker embeddings from hand-crafted spectral features, we propose to train a recurrent convolutional neural network applied directly to magnitude spectrograms. To compare our approach with the state of the art, we collect and publicly release an additional dataset of over 6 hours of fully annotated broadcast material. The results of our evaluation on the new dataset and three other benchmark datasets show that our proposed method significantly outperforms the competitors and reduces the diarization error rate by over 30% with respect to the baseline.
