Publications

Featured research published by Kostadin Dabov.


Electronic Imaging | 2006

Image denoising with block-matching and 3D filtering

Kostadin Dabov; Alessandro Foi; Vladimir Katkovnik; Karen O. Egiazarian

We present a novel approach to still image denoising based on effective filtering in the 3D transform domain, combining sliding-window transform processing with block matching. We process blocks within the image in a sliding manner and utilize the block-matching concept by searching for blocks that are similar to the currently processed one. The matched blocks are stacked together to form a 3D array and, due to the similarity between them, the data in the array exhibit a high level of correlation. We exploit this correlation by applying a 3D decorrelating unitary transform and effectively attenuate the noise by shrinkage of the transform coefficients. The subsequent inverse 3D transform yields estimates of all matched blocks. After repeating this procedure for all image blocks in a sliding manner, the final estimate is computed as a weighted average of all overlapping block estimates. A fast and efficient algorithm implementing the proposed approach is developed. The experimental results show that the proposed method delivers state-of-the-art denoising performance, both in terms of objective criteria and visual quality.
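The grouping-and-shrinkage procedure described above can be illustrated with a short sketch. The following Python fragment is a minimal sketch, not the paper's tuned implementation: block size, search radius, match count, and threshold are illustrative assumptions. It denoises the group matched to a single reference block; the full algorithm repeats this for every reference position and aggregates the overlapping block estimates by weighted averaging.

```python
# Minimal sketch of the grouping + 3D-transform shrinkage idea behind BM3D.
# Parameter values (block size, search radius, threshold) are illustrative
# assumptions, not the tuned settings from the paper.
import numpy as np
from scipy.fft import dctn, idctn

def denoise_reference_block(noisy, ref_y, ref_x, block=8, search=16,
                            n_matches=16, sigma=25.0, thr_mult=2.7):
    """Denoise the blocks matched to one reference block; returns the
    per-block estimates and their top-left coordinates."""
    H, W = noisy.shape
    ref = noisy[ref_y:ref_y + block, ref_x:ref_x + block]

    # Block matching: collect candidate blocks in a local search window
    # and rank them by SSD distance to the reference block.
    candidates = []
    for y in range(max(0, ref_y - search), min(H - block, ref_y + search) + 1):
        for x in range(max(0, ref_x - search), min(W - block, ref_x + search) + 1):
            blk = noisy[y:y + block, x:x + block]
            candidates.append((np.sum((blk - ref) ** 2), y, x))
    candidates.sort(key=lambda t: t[0])
    matched = candidates[:n_matches]

    # Stack the matched blocks into a 3D group; the similarity along the
    # third axis is what the decorrelating transform exploits.
    group = np.stack([noisy[y:y + block, x:x + block] for _, y, x in matched])

    # 3D decorrelating transform, hard thresholding, inverse transform.
    coeffs = dctn(group, norm='ortho')
    coeffs[np.abs(coeffs) < thr_mult * sigma] = 0.0
    estimates = idctn(coeffs, norm='ortho')
    return estimates, [(y, x) for _, y, x in matched]
```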


Electronic Imaging | 2008

Image restoration by sparse 3D transform-domain collaborative filtering

Kostadin Dabov; Alessandro Foi; Vladimir Katkovnik; Karen O. Egiazarian

We propose an image restoration technique exploiting regularized inversion and the recent block-matching and 3D filtering (BM3D) denoising filter. BM3D employs a non-local modeling of images by collecting similar image patches in 3D arrays. The so-called collaborative filtering applied to such a 3D array is realized by transform-domain shrinkage. In this work, we propose an extension of the BM3D filter for colored noise, which we use in a two-step deblurring algorithm to improve the regularization after inversion in the discrete Fourier domain. The first step of the algorithm is a regularized inversion using BM3D with collaborative hard-thresholding, and the second step is a regularized Wiener inversion using BM3D with collaborative Wiener filtering. The experimental results show that the proposed technique is competitive with, and in most cases outperforms, the current best image restoration methods in terms of improvement in signal-to-noise ratio.
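The regularized inversion that opens the two-step algorithm can be sketched in a few lines. Below is a minimal, hedged Python example of Tikhonov-style regularized deconvolution in the discrete Fourier domain; the regularization weight eps is an illustrative assumption, and the subsequent BM3D-style collaborative filtering of the colored residual noise is omitted.

```python
# Hedged sketch of a regularized Fourier-domain inversion, the kind of step
# that precedes the collaborative filtering in the two-step algorithm.
# The regularization weight `eps` is an illustrative assumption.
import numpy as np

def regularized_inverse(blurred, psf, eps=1e-2):
    """Tikhonov-regularized deconvolution in the discrete Fourier domain.
    The PSF is assumed registered at the array origin (top-left), as is
    usual for FFT-based convolution."""
    H = np.fft.fft2(psf, s=blurred.shape)   # blur operator in the DFT domain
    Y = np.fft.fft2(blurred)
    # Regularized inverse filter: conj(H) / (|H|^2 + eps) bounds the noise
    # amplification at frequencies where H is close to zero.
    X_hat = np.conj(H) * Y / (np.abs(H) ** 2 + eps)
    return np.real(np.fft.ifft2(X_hat))
```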


Electronic Imaging | 2006

Shape-adaptive DCT for denoising and image reconstruction

Alessandro Foi; Kostadin Dabov; Vladimir Katkovnik; Karen O. Egiazarian

The shape-adaptive DCT (SA-DCT) can be computed on a support of arbitrary shape, yet it retains a computational complexity comparable to that of the usual separable block DCT. Despite its near-optimal decorrelation and energy compaction properties, application of the SA-DCT has been rather limited, targeted nearly exclusively at video compression. The authors have recently proposed employing the SA-DCT for still image denoising. We use the SA-DCT in conjunction with the directional LPA-ICI technique, which defines the shape of the transform's support in a pointwise adaptive manner. The thresholded or modified SA-DCT coefficients are used to reconstruct a local estimate of the signal within the adaptive-shape support. Since supports corresponding to different points are in general overlapping, the local estimates are averaged together using adaptive weights that depend on the region's statistics. In this paper we further develop this novel approach and extend it to more general restoration problems, with particular emphasis on image deconvolution. Simulation experiments show state-of-the-art quality of the final estimate, both in terms of objective criteria and visual appearance. Thanks to the adaptive support, reconstructed edges are clean, and no unpleasant ringing artifacts are introduced by the fitted transform.
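To convey the estimate-and-aggregate scheme without the shape-adaptive machinery, here is a simplified Python sketch that replaces the LPA-ICI adaptive-shape support with a fixed 8x8 window: local DCT hard-thresholding followed by weighted averaging of the overlapping local estimates, with weights that favor sparser local spectra. All parameter values are illustrative assumptions.

```python
# Simplified square-support sketch of the pointwise estimate/aggregation
# scheme. The adaptive-shape support from LPA-ICI is replaced by a fixed
# 8x8 window; block size, threshold, and step are illustrative assumptions.
import numpy as np
from scipy.fft import dctn, idctn

def sliding_dct_denoise(noisy, block=8, sigma=25.0, thr_mult=2.7, step=4):
    H, W = noisy.shape
    acc = np.zeros_like(noisy, dtype=float)   # weighted sum of estimates
    wgt = np.zeros_like(noisy, dtype=float)   # sum of weights
    for y in range(0, H - block + 1, step):
        for x in range(0, W - block + 1, step):
            coeffs = dctn(noisy[y:y + block, x:x + block], norm='ortho')
            mask = np.abs(coeffs) >= thr_mult * sigma
            est = idctn(coeffs * mask, norm='ortho')
            # Sparser local spectra (fewer retained coefficients) get
            # larger weights, mimicking the statistics-dependent adaptive
            # weights used for aggregation.
            w = 1.0 / (1 + mask.sum())
            acc[y:y + block, x:x + block] += w * est
            wgt[y:y + block, x:x + block] += w
    return acc / np.maximum(wgt, 1e-12)
```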


Multimedia Tools and Applications | 2014

Multimodal extraction of events and of information about the recording activity in user generated videos

Francesco Cricri; Kostadin Dabov; Igor Danilo Diego Curcio; Sujeet Shyamsundar Mate; Moncef Gabbouj

In this work we propose methods that exploit context sensor data modalities for the task of detecting interesting events and extracting high-level contextual information about the recording activity in user-generated videos. Indeed, most camera-enabled electronic devices contain various auxiliary sensors such as accelerometers, compasses, GPS receivers, etc. Data captured by these sensors during media acquisition have already been used to limit camera degradations such as shake and also to provide some basic tagging information such as the location. However, exploiting the sensor-recordings modality for subsequent higher-level information extraction, such as detecting interesting events, has been the subject of rather limited research, further constrained to specialized acquisition setups. In this work, we show how these sensor modalities allow inferring information (camera movements, content degradations) about each individual video recording. In addition, we consider a multi-camera scenario, where multiple user-generated recordings of a common scene (e.g., music concerts) are available. For this kind of scenario we jointly analyze the multiple video recordings and their associated sensor modalities in order to extract higher-level semantics of the recorded media: based on the orientation of the cameras we identify the region of interest of the recorded scene, and by exploiting correlation in the motion of different cameras we detect generic interesting events and estimate their relative position. Furthermore, by also analyzing the audio content captured by multiple users we detect more specific interesting events. We show that the proposed multimodal analysis methods perform well on various recordings obtained at real live music performances.
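The cross-camera motion correlation idea can be illustrated with a small sketch. The following Python fragment is a minimal sketch under assumed inputs (time-aligned per-camera motion-magnitude signals), with window length and quorum chosen for illustration; it flags time instants where a quorum of cameras moves simultaneously, which the analysis treats as candidate interesting events.

```python
# Minimal sketch of detecting shared "interesting" moments by correlating
# camera motion across devices: if several users move or re-point their
# cameras at the same time, something is likely happening. The window
# length and quorum fraction are illustrative assumptions.
import numpy as np

def joint_motion_events(motion_signals, win=25, quorum=0.5):
    """motion_signals: list of equal-length, time-aligned 1D arrays, e.g.
    per-camera magnitudes of compass-heading change per sample."""
    signals = np.stack(motion_signals)                 # (cameras, samples)
    # Smooth each camera's motion magnitude with a moving average.
    kernel = np.ones(win) / win
    smoothed = np.array([np.convolve(s, kernel, mode='same') for s in signals])
    # A camera is "active" when its motion exceeds its own mean level.
    active = smoothed > smoothed.mean(axis=1, keepdims=True)
    # Flag samples where a quorum of cameras moves simultaneously.
    return active.mean(axis=0) >= quorum
```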


Conference on Multimedia Modeling | 2012

Sensor-based analysis of user generated video for multi-camera video remixing

Francesco Cricri; Igor Danilo Diego Curcio; Sujeet Shyamsundar Mate; Kostadin Dabov; Moncef Gabbouj

In this work we propose to exploit context sensor data for analyzing user-generated videos. First, we perform a low-level indexing of the recorded media with the instantaneous compass orientations of the recording device. Subsequently, we exploit this low-level indexing to obtain a higher-level indexing for discovering camera panning movements, classifying them, and identifying the Region of Interest (ROI) of the recorded event. Thus, we extract information about the content not by performing content analysis but by leveraging sensor data analysis. Furthermore, we develop an automatic remixing system that exploits the obtained high-level indexing to produce a video remix. We show that the proposed sensor-based analysis can correctly detect and classify camera panning and identify the ROI; in addition, we provide examples of their application to automatic video remixing.
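As an illustration of the compass-based indexing, here is a hedged Python sketch of pan detection: the instantaneous headings are unwrapped, differentiated to angular speed, and thresholded. The sampling rate, minimum speed, and smoothing window are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of classifying panning from instantaneous compass
# orientations. Sampling rate, speed threshold, and smoothing window
# are illustrative assumptions.
import numpy as np

def detect_pans(headings_deg, fs=10.0, min_speed=15.0, win_s=1.0):
    """headings_deg: compass samples in degrees at fs Hz.
    Returns a boolean array marking samples inside a panning movement."""
    # Unwrap so the 359 -> 0 degree crossing does not look like a jump.
    unwrapped = np.unwrap(np.radians(headings_deg))
    speed_deg = np.degrees(np.abs(np.gradient(unwrapped)) * fs)  # deg/s
    # Smooth over ~1 s so brief hand shake is not classified as a pan.
    win = max(1, int(win_s * fs))
    smooth = np.convolve(speed_deg, np.ones(win) / win, mode='same')
    return smooth >= min_speed
```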


International Symposium on Multimedia | 2011

Multimodal Event Detection in User Generated Videos

Francesco Cricri; Kostadin Dabov; Igor Danilo Diego Curcio; Sujeet Shyamsundar Mate; Moncef Gabbouj

Nowadays, most camera-enabled electronic devices contain various auxiliary sensors such as accelerometers, gyroscopes, compasses, GPS receivers, etc. These sensors are often used during media acquisition to limit camera degradations such as shake and also to provide some basic tagging information such as the location used in geo-tagging. Surprisingly, exploiting the sensor-recordings modality for high-level event detection has been the subject of rather limited research, further constrained to highly specialized acquisition setups. In this work, we show how these sensor modalities, alone or in combination with content-based analysis, allow inferring information about the video content. In addition, we consider a multi-camera scenario, where multiple user-generated recordings of a common scene (e.g., music concerts, public events) are available. In order to understand some higher-level semantics of the recorded media, we jointly analyze the individual video recordings and sensor measurements of the multiple users. The detected semantics include generic interesting events and some more specific events. The detection exploits correlations in the camera motion and in the audio content of multiple users. We show that the proposed multimodal analysis methods perform well on various recordings obtained at real live music performances.
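The audio side of the joint analysis can be sketched similarly. The following Python fragment is a minimal sketch with assumed time-aligned mono tracks and illustrative frame, hop, and threshold values; it marks frames where the short-time energy spikes in several users' recordings at once, treating co-occurring spikes as candidate events.

```python
# Minimal sketch of the audio part of the multi-camera analysis:
# short-time energy peaks that co-occur across several users' recordings
# are treated as candidate events. Frame/hop sizes, the z-score threshold,
# and the quorum are illustrative assumptions.
import numpy as np

def audio_event_candidates(tracks, frame=1024, hop=512, z_thr=2.0, quorum=2):
    """tracks: list of time-aligned, equal-length mono waveforms."""
    def frame_energy(x):
        n = (len(x) - frame) // hop + 1
        idx = np.arange(frame)[None, :] + hop * np.arange(n)[:, None]
        return (x[idx] ** 2).sum(axis=1)

    energies = np.stack([frame_energy(t) for t in tracks])  # (users, frames)
    # Per-user z-score so loud and quiet recordings are comparable.
    z = (energies - energies.mean(axis=1, keepdims=True)) \
        / (energies.std(axis=1, keepdims=True) + 1e-12)
    # A frame is a candidate event if enough users spike at the same time.
    return (z > z_thr).sum(axis=0) >= quorum
```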


Advances in Multimedia | 2012

Multimodal semantics extraction from user-generated videos

Francesco Cricri; Kostadin Dabov; Mikko Roininen; Sujeet Shyamsundar Mate; Igor Danilo Diego Curcio; Moncef Gabbouj

User-generated video content has grown tremendously fast, to the point of outpacing professional content creation. In this work we develop methods that analyze contextual information of multiple user-generated videos in order to obtain semantic information about public happenings (e.g., sport and live music events) being recorded in these videos. One of the key contributions of this work is a joint utilization of different data modalities, including data captured by auxiliary sensors during the video recording performed by each user. In particular, we analyze GPS data, magnetometer data, accelerometer data, and video- and audio-content data. We use these data modalities to infer information about the event being recorded, in terms of layout (e.g., stadium), genre, indoor versus outdoor scene, and the main area of interest of the event. Furthermore, we propose a method that automatically identifies the optimal set of cameras to be used in a multi-camera video production. Finally, we detect the camera users who fall within the field of view of other cameras recording at the same public happening. We show that the proposed multimodal analysis methods perform well on various recordings obtained at real sport events and live music performances.
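The field-of-view check for detecting camera users who appear in other users' recordings can be illustrated geometrically. Below is a hedged Python sketch: given assumed positions in a local metric frame and compass headings, it tests whether one camera lies within another's horizontal field of view. The 60-degree FOV is an illustrative assumption.

```python
# Hedged sketch of the "who films whom" check: camera B is in camera A's
# view when the bearing from A to B falls inside A's horizontal field of
# view. Positions are assumed projected to a local metric frame, and the
# 60-degree FOV is an illustrative assumption.
import numpy as np

def in_field_of_view(pos_a, heading_a_deg, pos_b, fov_deg=60.0):
    """pos_a, pos_b: (x, y) in a local metric frame; heading in degrees
    clockwise from north (the usual compass convention)."""
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    bearing = np.degrees(np.arctan2(dx, dy))       # clockwise from north
    # Smallest signed angular difference between bearing and heading.
    diff = (bearing - heading_a_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov_deg / 2.0
```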


IEEE Transactions on Image Processing | 2007

Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering

Kostadin Dabov; Alessandro Foi; Vladimir Katkovnik; Karen O. Egiazarian


European Signal Processing Conference | 2007

Video denoising by sparse 3D transform-domain collaborative filtering

Kostadin Dabov; Alessandro Foi; Karen O. Egiazarian


SPARS'09 - Signal Processing with Adaptive Sparse Structured Representations | 2009

BM3D Image Denoising with Shape-Adaptive Principal Component Analysis

Kostadin Dabov; Alessandro Foi; Vladimir Katkovnik; Karen O. Egiazarian

Collaboration

An overview of Kostadin Dabov's collaborations.

Top Co-Authors

Karen O. Egiazarian, Tampere University of Technology
Alessandro Foi, Tampere University of Technology
Vladimir Katkovnik, Tampere University of Technology
Francesco Cricri, Tampere University of Technology
Moncef Gabbouj, Tampere University of Technology
Dmytro Rusanovskyy, Tampere University of Technology
Mikko Roininen, Tampere University of Technology