Aykut Erdem
Hacettepe University
Publication
Featured research published by Aykut Erdem.
Journal of Vision | 2013
Erkut Erdem; Aykut Erdem
To detect visually salient elements of complex natural scenes, computational bottom-up saliency models commonly examine several feature channels such as color and orientation in parallel. They compute a separate feature map for each channel and then linearly combine these maps to produce a master saliency map. However, only a few studies have investigated how different feature dimensions contribute to the overall visual saliency. We address this integration issue and propose to use covariance matrices of simple image features (known as region covariance descriptors in the computer vision community; Tuzel, Porikli, & Meer, 2006) as meta-features for saliency estimation. As low-dimensional representations of image patches, region covariances capture local image structures better than standard linear filters, but more importantly, they naturally provide nonlinear integration of different features by modeling their correlations. We also show that first-order statistics of features can be easily incorporated into the proposed approach to improve performance. Our experimental evaluation on several benchmark data sets demonstrates that the proposed approach outperforms state-of-the-art models on various tasks, including prediction of human eye fixations, salient object detection, and image retargeting.
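As a rough illustration of the region covariance descriptor mentioned in the abstract above, the following minimal sketch computes the sample covariance matrix of per-pixel feature vectors in a patch. The function name `region_covariance`, the toy 2x2 patch, and the choice of features (intensity and two gradient magnitudes) are assumptions made for illustration; the paper's actual feature set and saliency computation are not given here.

```python
def region_covariance(features):
    """Sample covariance matrix of d-dimensional feature vectors.

    `features` holds one feature vector per pixel of the patch.
    Returns a d x d matrix as a list of lists.
    """
    n = len(features)
    d = len(features[0])
    # Mean of each feature channel over the patch.
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    # Accumulate outer products of the mean-centered feature vectors.
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    # Unbiased estimate: divide by n - 1.
    return [[c / (n - 1) for c in row] for row in cov]


# Hypothetical 2x2 patch; each pixel carries (intensity, |dI/dx|, |dI/dy|).
patch_features = [
    (0.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1.0, 1.0, 1.0),
]
C = region_covariance(patch_features)
```

The off-diagonal entries of `C` encode how the feature channels vary together, which is the sense in which the descriptor integrates features through their correlations rather than by a linear sum of separate maps.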
international conference on computer graphics and interactive techniques | 2013
Levent Karacan; Erkut Erdem; Aykut Erdem
Recent years have witnessed the emergence of new image smoothing techniques which have provided new insights and raised new questions about the nature of this well-studied problem. Specifically, these models separate a given image into its structure and texture layers by utilizing non-gradient based definitions for edges or special measures that distinguish edges from oscillations. In this study, we propose an alternative yet simple image smoothing approach which depends on covariance matrices of simple image features, aka the region covariances. The use of second order statistics as a patch descriptor allows us to implicitly capture local structure and texture information and makes our approach particularly effective for structure extraction from texture. Our experimental results have shown that the proposed approach leads to better image decompositions as compared to the state-of-the-art methods and preserves prominent edges and shading well. Moreover, we also demonstrate the applicability of our approach on some image editing and manipulation tasks such as image abstraction, texture and detail enhancement, image composition, inverse halftoning and seam carving.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2008
Cagri Aslan; Aykut Erdem; Erkut Erdem; Sibel Tari
We present a new skeletal representation along with a matching framework to address the deformable shape recognition problem. The disconnectedness of the skeleton arises as a result of the excessive regularization that we use to describe a shape at an attainably coarse scale. Our motivation is to rely on stable properties of the shape instead of inaccurately measured secondary details. The new representation does not suffer from the common instability problems of traditional connected skeletons, and the matching process gives quite successful results on a diverse database of 2D shapes. An important difference between our approach and the conventional use of skeletons is that we replace the local coordinate frame with a global Euclidean frame supported by additional mechanisms to handle articulations and local boundary deformations. As a result, we can produce descriptions that are sensitive to any combination of changes in scale, position, orientation and articulation, as well as invariant ones.
Journal of Artificial Intelligence Research | 2016
Raffaella Bernardi; Ruket Cakici; Desmond Elliott; Aykut Erdem; Erkut Erdem; Nazli Ikizler-Cinbis; Frank Keller; Adrian Muscat; Barbara Plank
Automatic description generation from natural images is a challenging problem that has recently received a large amount of interest from the computer vision and natural language processing communities. In this survey, we classify the existing approaches based on how they conceptualize this problem, viz., models that cast description as either a generation problem or a retrieval problem over a visual or multimodal representational space. We provide a detailed review of existing models, highlighting their advantages and disadvantages. Moreover, we give an overview of the benchmark image datasets and the evaluation measures that have been developed to assess the quality of machine-generated image descriptions. Finally, we extrapolate future directions in the area of automatic image description generation.
Pattern Recognition | 2009
Emre Baseski; Aykut Erdem; Sibel Tari
Skeletal trees are commonly used in order to express geometric properties of the shape. Accordingly, tree-edit distance is used to compute a dissimilarity between two given shapes. We present a new tree-edit based shape matching method which uses a recent coarse skeleton representation. The coarse skeleton representation allows us to represent both shapes and shape categories in the form of depth-1 trees. Consequently, we can easily integrate the influence of the categories into shape dissimilarity measurements. The new dissimilarity measure gives a better within group versus between group separation, and it mimics the asymmetric nature of human similarity judgements.
Computer Graphics Forum | 2015
Okan Tarhan Tursun; Ahmet Oğuz Akyüz; Aykut Erdem; Erkut Erdem
Obtaining a high-quality high dynamic range (HDR) image in the presence of camera and object movement has been a long‐standing challenge. Many methods, known as HDR deghosting algorithms, have been developed over the past ten years to undertake this challenge. Each of these algorithms approaches the deghosting problem from a different perspective, providing solutions with different degrees of complexity that range from rudimentary heuristics to advanced computer vision techniques. The proposed solutions generally differ in two ways: (1) how to detect ghost regions and (2) what to do to eliminate ghosts. Some algorithms choose to completely discard moving objects, giving rise to HDR images which only contain the static regions. Other algorithms try to find the best image to use for each dynamic region. Yet others try to register moving objects from different images in the spirit of maximizing dynamic range in dynamic regions. Furthermore, each algorithm may introduce different types of artifacts as it aims to eliminate ghosts. These artifacts may come in the form of noise, broken objects, under‐ and over‐exposed regions, and residual ghosting. Given the high volume of studies conducted in this field in recent years, a comprehensive survey of the state of the art is required. Thus, the first goal of this paper is to provide this survey. Secondly, the large number of algorithms brings about the need to classify them. Thus, the second goal of this paper is to propose a taxonomy of deghosting algorithms which can be used to group existing and future algorithms into meaningful classes. Thirdly, the existence of a large number of algorithms brings about the need to evaluate their effectiveness, as each new algorithm claims to outperform its predecessors. Therefore, the last goal of this paper is to share the results of a subjective experiment which aims to evaluate various state‐of‐the‐art deghosting algorithms.
international conference on acoustics, speech, and signal processing | 2002
Aykut Erdem; Erkut Erdem; Yasemin Yardimci; Volkan Atalay; A. Enis Çetin
We describe a computer vision based mouse, which can control and command the cursor of a computer or a computerized system using a camera. In order to move the cursor on the computer screen the user simply moves the mouse shaped passive device placed on a surface within the viewing area of the camera. The video generated by the camera is analyzed using computer vision techniques and the computer moves the cursor according to mouse movements. The computer vision based mouse has regions corresponding to buttons for clicking. To click a button the user simply covers one of these regions with his/her finger.
Journal of Visual Communication and Image Representation | 2016
Osman Akin; Erkut Erdem; Aykut Erdem; Krystian Mikolajczyk
We propose a deformable part-based tracking framework based on correlation filters. We present a collaborative algorithm for tracking-by-detection with coupled local and global correlation filters. We introduce a simple yet natural model for handling scale changes. Our proposed tracker handles occlusion, scaling and fast motion issues better than the existing models. Our proposed tracker gives state-of-the-art results on two benchmark datasets while keeping the processing in real time. Correlation filters have recently attracted attention in visual tracking due to their efficiency and high performance. However, their application to long-term tracking is somewhat limited since these trackers are not equipped with mechanisms to cope with challenging cases like partial occlusion, deformation or scale changes. In this paper, we propose a deformable part-based correlation filter tracking approach which depends on coupled interactions between a global filter and several part filters. Specifically, local filters provide an initial estimate, which is then used by the global filter as a reference to determine the final result. Then, the global filter provides a feedback to the part filters regarding their updates and the related deformation parameters. In this way, our proposed collaborative model handles not only partial occlusion but also scale changes. Experiments on two large public benchmark datasets demonstrate that our approach gives significantly better results compared with the state-of-the-art trackers.
Neural Computation | 2012
Aykut Erdem; Marcello Pelillo
Graph transduction is a popular class of semisupervised learning techniques that aims to estimate a classification function defined over a graph of labeled and unlabeled data points. The general idea is to propagate the provided label information to unlabeled nodes in a consistent way. In contrast to the traditional view, in which the process of label propagation is defined as a graph Laplacian regularization, this article proposes a radically different perspective, based on game-theoretic notions. Within the proposed framework, the transduction problem is formulated in terms of a noncooperative multiplayer game whereby equilibria correspond to consistent labelings of the data. An attractive feature of this formulation is that it is inherently a multiclass approach and imposes no constraint whatsoever on the structure of the pairwise similarity matrix, being able to naturally deal with asymmetric and negative similarities alike. Experiments on a number of real-world problems demonstrate that the proposed approach performs well compared with state-of-the-art algorithms, and it can deal effectively with various types of similarity relations.
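The game-theoretic view of label propagation described above can be sketched with a small example. The abstract does not state the payoff function or the dynamics used, so the following makes several assumptions: a player earns payoff `weights[i][j]` when it picks the same label as player `j`, labeled players keep a fixed pure strategy, and unlabeled players update their mixed strategies with a discrete-time replicator-style rule. This simplified update also assumes nonnegative weights, so it does not cover the negative similarities the paper says it can handle. The function name `transduce` and the toy graph are hypothetical.

```python
def transduce(weights, labels, num_classes, iterations=50):
    """Label unlabeled nodes with a replicator-style update (illustrative sketch).

    weights[i][j] is the similarity from node i to node j (may be asymmetric,
    assumed nonnegative here). labels[i] is a class index or None if unlabeled.
    """
    n = len(weights)
    # Labeled players use a one-hot strategy; unlabeled ones start uniform.
    x = []
    for lab in labels:
        if lab is None:
            x.append([1.0 / num_classes] * num_classes)
        else:
            x.append([1.0 if c == lab else 0.0 for c in range(num_classes)])

    for _ in range(iterations):
        new_x = []
        for i in range(n):
            if labels[i] is not None:
                new_x.append(x[i])  # labeled players never change strategy
                continue
            # Expected payoff of each label given the other players' strategies.
            payoff = [
                sum(weights[i][j] * x[j][c] for j in range(n) if j != i)
                for c in range(num_classes)
            ]
            average = sum(x[i][c] * payoff[c] for c in range(num_classes))
            if average <= 0.0:
                new_x.append(x[i])  # no information reaches this node
            else:
                # Labels with above-average payoff gain probability mass.
                new_x.append(
                    [x[i][c] * payoff[c] / average for c in range(num_classes)]
                )
        x = new_x

    return [max(range(num_classes), key=lambda c: p[c]) for p in x]


# Hypothetical 4-node graph with slightly asymmetric similarities.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.3, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]
predicted = transduce(W, [0, None, None, 1], num_classes=2)
```

In this toy graph, node 1 is most similar to the node labeled 0 and node 2 to the node labeled 1, so the iteration settles on labels that agree with each unlabeled node's strongest neighbor.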
international conference on computer vision | 2015
Levent Karacan; Aykut Erdem; Erkut Erdem
Previous sampling-based image matting methods typically rely on certain heuristics in collecting representative samples from known regions, and thus their performance deteriorates if the underlying assumptions are not satisfied. To alleviate this, in this paper we take an entirely new approach and formulate sampling as a sparse subset selection problem where we propose to pick a small set of candidate samples that best explains the unknown pixels. Moreover, we describe a new distance measure for comparing two samples which is based on KL-divergence between the distributions of features extracted in the vicinity of the samples. Using a standard benchmark dataset for image matting, we demonstrate that our approach provides more accurate results compared with the state-of-the-art methods.
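To make the idea of a KL-divergence-based sample distance concrete, here is a minimal sketch that compares two feature histograms. The abstract does not say how the feature distributions are modeled, so the use of discrete histograms, the symmetrization by summing both directions, and the names `normalize`, `kl_divergence` and `symmetric_kl` are all assumptions for illustration rather than the paper's formulation.

```python
import math


def normalize(histogram):
    """Turn nonnegative bin counts into a probability distribution."""
    total = float(sum(histogram))
    return [h / total for h in histogram]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions; eps guards against empty bins in q."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def symmetric_kl(p, q):
    """Symmetrized divergence, usable as a dissimilarity between two samples."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical color histograms gathered around two candidate samples.
near_a = normalize([8, 1, 1])
near_b = normalize([1, 1, 8])
near_c = normalize([7, 2, 1])

d_ab = symmetric_kl(near_a, near_b)
d_ac = symmetric_kl(near_a, near_c)
```

With these toy histograms, `d_ac` comes out smaller than `d_ab`, reflecting that the neighborhoods of samples a and c have similar feature distributions while a and b do not.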