Publications


Featured research published by Sergey Zagoruyko.


British Machine Vision Conference | 2016

Wide Residual Networks

Sergey Zagoruyko; Nikos Komodakis

Deep residual networks were shown to scale up to thousands of layers and still improve in performance. However, each fraction of a percent of improved accuracy costs nearly doubling the number of layers, so training very deep residual networks suffers from diminishing feature reuse, which makes these networks very slow to train. To tackle these problems, we conduct a detailed experimental study of the architecture of ResNet blocks, based on which we propose a novel architecture that decreases the depth and increases the width of residual networks. We call the resulting network structures wide residual networks (WRNs) and show that they are far superior to their commonly used thin and very deep counterparts. For example, we demonstrate that even a simple 16-layer-deep wide residual network outperforms in accuracy and efficiency all previous deep residual networks, including thousand-layer-deep networks, achieving new state-of-the-art results on CIFAR, SVHN, and COCO, and significant improvements on ImageNet. Our code and models are available at this https URL
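
The depth/width trade-off in the abstract can be made concrete with a small sketch. In the paper's counting, a WRN-d-k has depth d = 6n + 4 (n residual blocks per group, two 3×3 convolutions each) and channel widths that are a factor k wider than a thin ResNet. The helper below is a hypothetical illustration of that parameterization, not the authors' code:

```python
def wrn_config(depth, width):
    """Blocks per group and channel widths of a WRN-depth-width.

    WRNs stack three groups of residual blocks over channel widths
    16k, 32k, 64k (after an initial 16-channel conv); the paper counts
    total depth as 6n + 4 for n blocks per group.
    Hypothetical helper for illustration only.
    """
    assert (depth - 4) % 6 == 0, "depth must be of the form 6n + 4"
    n = (depth - 4) // 6
    widths = [16, 16 * width, 32 * width, 64 * width]
    return n, widths

# e.g. the WRN-16-8 mentioned in the abstract:
n, widths = wrn_config(16, 8)
# n == 2 blocks per group; channel widths [16, 128, 256, 512]
```

This makes explicit why widening is cheap in depth: going from k=1 to k=8 leaves the layer count unchanged while multiplying the per-layer width.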


Computer Vision and Pattern Recognition | 2015

Learning to compare image patches via convolutional neural networks

Sergey Zagoruyko; Nikos Komodakis

In this paper we show how to learn directly from image data (i.e., without resorting to manually-designed features) a general similarity function for comparing image patches, which is a task of fundamental importance for many computer vision problems. To encode such a function, we opt for a CNN-based model that is trained to account for a wide variety of changes in image appearance. To that end, we explore and study multiple neural network architectures, which are specifically adapted to this task. We show that such an approach can significantly outperform the state-of-the-art on several problems and benchmark datasets.


Computer Vision and Pattern Recognition | 2015

A MRF shape prior for facade parsing with occlusions

Mateusz Kozinski; Raghudeep Gadde; Sergey Zagoruyko; Guillaume Obozinski; Renaud Marlet

We present a new shape prior formalism for the segmentation of rectified facade images. It combines the simplicity of split grammars with unprecedented expressive power: the capability of encoding simultaneous alignment in two dimensions, facade occlusions and irregular boundaries between facade elements. We formulate the task of finding the most likely image segmentation conforming to a prior of the proposed form as a MAP-MRF problem over a 4-connected pixel grid, and propose an efficient optimization algorithm for solving it. Our method simultaneously segments the visible and occluding objects, and recovers the structure of the occluded facade. We demonstrate state-of-the-art results on a number of facade segmentation datasets.
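
To give a feel for the MAP-MRF formulation over a 4-connected pixel grid, the toy function below scores a labeling by unary costs plus a Potts penalty on unequal neighbors. It is a generic stand-in for illustration: the paper's shape prior encodes grammar-derived alignment constraints, not a plain Potts model, and all names here are hypothetical:

```python
def mrf_energy(labels, unary, h, w, smooth=1.0):
    """Energy of a labeling on an h-by-w 4-connected grid.

    labels: flat list of length h*w, row-major.
    unary: per-pixel cost table, unary[pixel][label].
    Adds a Potts penalty `smooth` for each pair of 4-neighbors
    with different labels. Toy illustration of a MAP-MRF objective.
    """
    energy = sum(unary[i][labels[i]] for i in range(h * w))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w and labels[i] != labels[i + 1]:  # right neighbor
                energy += smooth
            if r + 1 < h and labels[i] != labels[i + w]:  # bottom neighbor
                energy += smooth
    return energy
```

MAP inference then amounts to finding the labeling that minimizes this energy, which is what the paper's optimization algorithm does for its much richer prior.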


Computer Vision and Image Understanding | 2017

Deep compare: A study on using convolutional neural networks to compare image patches

Sergey Zagoruyko; Nikos Komodakis

Comparing patches across images is probably one of the most fundamental tasks in computer vision and image analysis, and it has given rise to the development of many hand-designed feature descriptors over the past years, including SIFT, which had a huge impact on the computer vision community. Yet, such manually designed descriptors may be unable to take into account in an optimal manner all the different factors that can affect the final appearance of image patches. On the other hand, nowadays one can easily gain access to (or even generate using available software) large datasets that contain patch correspondences between images. This begs the following question: can we make proper use of such datasets to automatically learn a similarity function for image patches? Our goal in this work is to affirmatively address the above question. We show how to learn directly from image data (i.e., without resorting to manually-designed features) a general similarity function for comparing image patches. To encode such a function, we opt for a CNN-based model that is trained to account for a wide variety of changes in image appearance. To that end, we explore and study multiple neural network architectures, including novel NCC-networks, which are specifically adapted to this task. We show that such an approach can significantly outperform the state-of-the-art on several problems and benchmark datasets.
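
The NCC-networks mentioned above build on normalized cross-correlation, the classic patch-similarity measure. The snippet below is a plain-Python illustration of that underlying measure on flat patches, assuming nothing about the paper's actual implementation:

```python
def ncc(p, q):
    """Normalized cross-correlation of two equal-size patches (flat lists).

    Returns a value in [-1, 1]; 1 means the patches differ only by an
    affine brightness/contrast change. Illustrates the classic measure
    that NCC-networks build on, not the paper's code.
    """
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    dp = [x - mp for x in p]          # mean-centered patches
    dq = [x - mq for x in q]
    num = sum(a * b for a, b in zip(dp, dq))
    den = (sum(a * a for a in dp) * sum(b * b for b in dq)) ** 0.5
    return num / den if den else 0.0

# identical patches up to a brightness/contrast change score 1.0:
# ncc([1, 2, 3, 4], [3, 5, 7, 9]) -> 1.0
```

The appeal of learning the similarity function instead, as the paper does, is that a CNN can capture appearance changes (viewpoint, blur, illumination fields) that this fixed photometric invariance cannot.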


Computer Vision and Pattern Recognition | 2016

Depth Camera Based on Color-Coded Aperture

Vladimir Paramonov; Ivan Panchenko; Victor Bucha; Andrey Drogolyub; Sergey Zagoruyko

In this paper we present a single-lens, single-frame passive depth sensor based on a conventional imaging system with minor hardware modifications. It is based on a color-coded aperture approach and has high light efficiency, which allows capturing images even with the small cameras of handheld devices. The sensor measures depth in millimeters across the whole frame, in contrast to prior-art approaches. The contributions of this paper are: (1) novel light-efficient coded aperture designs and a corresponding algorithm modification, (2) a depth sensor calibration procedure and a disparity-to-depth conversion method, (3) a number of color-coded-aperture depth sensor implementations, including DSLR-based, smartphone-based, and compact-camera-based prototypes, and (4) applications including real-time 3D scene reconstruction and depth-based image effects.
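
For intuition on why calibration lets the sensor report depth in millimeters: disparity-to-depth conversion classically follows the pinhole relation Z = f·b / d. The sketch below shows only that standard relation with hypothetical parameter names; the coded-aperture system has its own calibration and effective baseline between color channels, which the paper describes:

```python
def disparity_to_depth_mm(disparity_px, focal_px, baseline_mm):
    """Standard pinhole disparity-to-depth relation Z = f * b / d.

    disparity_px: measured disparity in pixels (must be positive)
    focal_px:     focal length expressed in pixels
    baseline_mm:  effective baseline in millimeters
    Returns depth in millimeters. Generic illustration, not the
    paper's calibrated conversion.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px

# e.g. a 10 px disparity with f = 1000 px and b = 50 mm gives 5 m:
# disparity_to_depth_mm(10, 1000, 50) -> 5000.0
```

The inverse relationship also explains why depth precision degrades for distant objects: a fixed disparity error maps to a larger depth error as disparity shrinks.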


arXiv: Computer Vision and Pattern Recognition | 2018

Compressing the Input for CNNs with the First-Order Scattering Transform

Edouard Oyallon; Eugene Belilovsky; Sergey Zagoruyko; Michal Valko

We study the first-order scattering transform as a candidate for reducing the signal processed by a convolutional neural network (CNN). We show theoretical and empirical evidence that, in the case of natural images and sufficiently small translation invariance, this transform preserves most of the signal information needed for classification while substantially reducing the spatial resolution and total signal size. We demonstrate that cascading a CNN with this representation performs on par with ImageNet classification models commonly used in downstream tasks, such as ResNet-50. We subsequently apply our trained hybrid ImageNet model as the base model of a detection system, which typically has larger image inputs. On the Pascal VOC and COCO detection tasks we demonstrate improvements in inference speed and training memory consumption compared to models trained directly on the input image.
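
A first-order scattering coefficient is, roughly, band-pass filtering followed by a complex modulus and local averaging, which is what trades spatial resolution for a smaller, more stable signal. The 1-D toy below illustrates that pipeline only; the paper operates on 2-D images with wavelet filter banks, and all names here are hypothetical:

```python
def first_order_scatter_1d(x, filt, pool):
    """Tiny 1-D analogue of a first-order scattering coefficient.

    Applies a (possibly complex) band-pass filter, takes the modulus
    to discard phase, then average-pools over non-overlapping windows,
    reducing the signal size. Illustration only.
    """
    # valid cross-correlation with the filter
    y = [sum(x[i + j] * filt[j] for j in range(len(filt)))
         for i in range(len(x) - len(filt) + 1)]
    m = [abs(v) for v in y]  # modulus: nonnegative, phase-invariant
    # non-overlapping average pooling shrinks the output by `pool`x
    return [sum(m[i:i + pool]) / pool
            for i in range(0, len(m) - pool + 1, pool)]

# a 10-sample signal reduces to 2 pooled coefficients with pool=4:
coeffs = first_order_scatter_1d([float(i) for i in range(10)], [1, 1j, -1], 4)
```

Feeding a CNN such pooled modulus coefficients instead of raw pixels is the compression idea the paper studies at scale.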


British Machine Vision Conference | 2016

A MultiPath Network for Object Detection

Sergey Zagoruyko; Adam Lerer; Tsung-Yi Lin; Pedro H. O. Pinheiro; Sam Gross; Soumith Chintala; Piotr Dollár


International Conference on Learning Representations | 2017

Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer

Sergey Zagoruyko; Nikos Komodakis


arXiv: Computer Vision and Pattern Recognition | 2017

DiracNets: Training Very Deep Neural Networks Without Skip-Connections

Sergey Zagoruyko; Nikos Komodakis


International Conference on Computer Vision | 2017

Scaling the Scattering Transform: Deep Hybrid Networks

Edouard Oyallon; Eugene Belilovsky; Sergey Zagoruyko

Collaboration


Dive into Sergey Zagoruyko's collaborations.

Top Co-Authors


Nikos Komodakis

École des ponts ParisTech

Edouard Oyallon

École Normale Supérieure
