Alper Koz
Middle East Technical University
Publication
Featured research published by Alper Koz.
IEEE Transactions on Circuits and Systems for Video Technology | 2007
Aljoscha Smolic; Karsten Mueller; Nikolce Stefanoski; Joern Ostermann; Atanas Gotchev; Gozde Bozdagi Akar; Georgios Triantafyllidis; Alper Koz
Research efforts on 3DTV technology have been strengthened worldwide recently, covering the whole media processing chain from capture to display. Different 3DTV systems rely on different 3D scene representations that integrate various types of data. Efficient coding of these data is crucial for the success of 3DTV. Compression of pixel-type data including stereo video, multiview video, and associated depth or disparity maps extends available principles of classical video coding. Powerful algorithms and open international standards for multiview video coding and coding of video plus depth data are available and under development, which will provide the basis for introduction of various 3DTV systems and services in the near future. Compression of 3D mesh models has also reached a high level of maturity. For static geometry, a variety of powerful algorithms are available to efficiently compress vertices and connectivity. Compression of dynamic 3D geometry is currently a more active field of research. Temporal prediction is an important mechanism to remove redundancy from animated 3D mesh sequences. Error resilience is important for transmission of data over error-prone channels, and multiple description coding (MDC) is a suitable way to protect data. MDC of still images and 2D video has already been widely studied, whereas multiview video and 3D meshes have been addressed only recently. Intellectual property protection of 3D data by watermarking is a pioneering research area as well. The 3D watermarking methods in the literature are classified into three groups, considering the dimensions of the main components of scene representations and the resulting components after applying the algorithm. In general, 3DTV coding technology is maturing. Systems and services may enter the market in the near future. However, the research area is relatively young compared to coding of other types of media. Therefore, there is still a lot of room for improvement and new development of algorithms.
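The role of temporal prediction in removing redundancy from animated mesh sequences can be illustrated with a minimal sketch. It assumes the simplest possible predictor (each vertex is predicted by its position in the previous frame) and a fixed connectivity across frames. The function names are hypothetical, and the sketch is not one of the coding schemes surveyed in the paper.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Frame = List[Vertex]


def encode_residuals(frames: List[Frame]) -> List[Frame]:
    """Keep the first frame as-is; store every later frame as the
    per-vertex difference from the previous frame."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([tuple(c - p for c, p in zip(cv, pv))
                        for cv, pv in zip(cur, prev)])
    return encoded


def decode_residuals(encoded: List[Frame]) -> List[Frame]:
    """Invert encode_residuals by accumulating the differences."""
    frames = [list(encoded[0])]
    for residual in encoded[1:]:
        prev = frames[-1]
        frames.append([tuple(p + r for p, r in zip(pv, rv))
                       for pv, rv in zip(prev, residual)])
    return frames
```

For slowly moving geometry the residuals are small compared with the absolute coordinates, so a later quantization and entropy-coding stage (not shown) would have less to encode.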
IEEE Transactions on Circuits and Systems for Video Technology | 2008
Alper Koz; A. Aydin Alatan
The imperceptibility requirement in video watermarking is more challenging than its image counterpart because of the additional temporal dimension of video. The embedding system should not only yield spatially invisible watermarks for each frame of the video, but should also take the temporal dimension into account in order to avoid any flicker distortion between frames. While some of the methods in the literature approach this problem by allowing only arbitrarily small modifications within frames in different transform domains, others simply use implicit spatial properties of the human visual system (HVS), such as luminance masking, spatial masking, and contrast masking. In addition, some approaches explicitly exploit the spatial thresholds of the HVS to determine the location and strength of the watermark. However, none of these approaches has focused on guaranteeing temporal invisibility and achieving maximum watermark strength along the temporal direction. In this paper, the temporal dimension is exploited for video watermarking by utilizing the temporal sensitivity of the HVS. The proposed method uses the temporal contrast thresholds of the HVS to determine the maximum watermark strength that still gives imperceptible distortion after watermark insertion. Compared with some recognized methods in the literature, the proposed method avoids the typical visual degradations in the watermarked video, while still giving much better robustness against common video distortions, such as additive Gaussian noise, video coding, frame rate conversions, and temporal shifts, in terms of bit error rate.
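A minimal sketch of the two quantities mentioned above, threshold-limited embedding strength and bit error rate, is given below. It assumes a generic additive watermark with a +1/-1 pattern and takes the per-pixel visibility thresholds as an already computed input. How those thresholds are derived from the temporal contrast sensitivity of the HVS is not reproduced here, and the function names are hypothetical.

```python
from typing import List, Sequence


def embed_at_threshold(frame: Sequence[float],
                       pattern: Sequence[int],
                       thresholds: Sequence[float]) -> List[float]:
    """Add a +1/-1 pattern to a frame, scaled pixel by pixel by the
    largest change assumed to stay invisible (the threshold)."""
    return [p + s * t for p, s, t in zip(frame, pattern, thresholds)]


def bit_error_rate(embedded: Sequence[int], extracted: Sequence[int]) -> float:
    """Fraction of watermark bits that differ after extraction."""
    if len(embedded) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    if not embedded:
        raise ValueError("bit sequences must not be empty")
    errors = sum(1 for a, b in zip(embedded, extracted) if a != b)
    return errors / len(embedded)
```

For instance, one flipped bit out of four gives a bit error rate of 0.25.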
IEEE Transactions on Image Processing | 2010
Alper Koz; Cevahir Cigla; A. Aydin Alatan
With the advances in image-based rendering (IBR) in recent years, generation of a realistic arbitrary view of a scene from a number of original views has become cheaper and faster. One of the main applications of this progress is free-view TV (FTV), where TV viewers freely select the viewing position and angle via IBR on the transmitted multiview video. Since a TV viewer might record a personal video for this arbitrarily selected view and misuse the content, copyright and copy protection problems also exist and should be solved for FTV. In this paper, we focus on this newly emerged problem by proposing a watermarking method for free-view video. The watermark is embedded into every frame of multiple views by exploiting the spatial masking properties of the human visual system. Assuming that the position and rotation of the virtual camera are known, the proposed method extracts the watermark successfully from an arbitrarily generated virtual image. In order to extend the method to the case of an unknown virtual camera position and rotation, the transformations of the watermark pattern due to image-based rendering operations are analyzed. Based upon this analysis, camera position and homography estimation methods are proposed for the virtual camera. The encouraging simulation results promise not only a novel method, but also a new direction for watermarking research.
International Conference on Image Processing | 2006
Alper Koz; Cevahir Cigla; A. Aydin Alatan
The recent advances in image-based rendering (IBR) have pioneered a new technology, free-view television, in which TV viewers freely select the viewing position and angle by applying IBR to the transmitted multi-view video. Since a TV viewer might also record a personal video for this arbitrarily selected view and misuse the content, copyright and copy protection problems also exist and should be solved for free-view TV. In this paper, we focus on this problem by proposing a watermarking method for free-view video. The watermark is embedded into every frame of multiple views by exploiting the spatial masking properties of the human visual system (HVS). Assuming that the position and rotation of the virtual view are known, the proposed method extracts the watermark successfully from an arbitrarily generated image. In order to extend the method to the case of an unknown virtual camera position and rotation, the modifications of the watermark pattern due to image-based rendering operations are also analyzed. Based on this analysis, a camera position and homography estimation method is proposed that takes the operations in image-based rendering into account. The results show that watermark detection succeeds in the cases where the virtual camera is arbitrarily located on the camera plane.
Signal Processing: Image Communication | 2014
Alper Koz; Frederic Dufaux
Backward compatibility for high dynamic range image and video compression is one of the essential requirements in the transition phase from low dynamic range (LDR) displays to high dynamic range (HDR) displays. In a recent work [1], the problems of tone mapping and HDR video coding were first fused into a single mathematical framework, and an optimized solution for tone mapping was obtained in terms of the mean square error (MSE) of the logarithm of luminance values. In this paper, we improve on this pioneering study by addressing three of its shortcomings. First, the method in [1] works on the logarithms of luminance values, which are not uniform with respect to human visual system (HVS) sensitivity. We propose using perceptually uniform luminance values instead for the optimization of the tone mapping curve. Second, the method in [1] does not take the quality of the resulting tone-mapped images into account in its formulation, contrary to the main goal of tone mapping research. We include the LDR image quality as a constraint in the optimization problem and develop a generic methodology to balance the trade-off between HDR and LDR image qualities for coding. Third, when adapted to video, the method in [1] simply applies a low-pass filter to the tone curves generated for the video frames to avoid flickering. We instead include an HVS-based flickering constraint in the optimization and derive a methodology to balance the trade-off between rate-distortion performance and flickering distortion. The superiority of the proposed methodologies is verified with experiments on HDR images and video sequences.
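The baseline idea that the abstract contrasts itself with, low-pass filtering of per-frame tone curves along time, can be sketched as follows. The filter type is an assumption: the abstract does not say which low-pass filter [1] uses, so a first-order recursive filter is used here for illustration. The representation of a tone curve as a list of output values at fixed nodes and the function name are hypothetical as well.

```python
from typing import List


def smooth_tone_curves(curves: List[List[float]],
                       alpha: float = 0.5) -> List[List[float]]:
    """Apply a first-order recursive low-pass filter along time.

    curves[t][k] is the LDR output of frame t's tone curve at node k.
    alpha lies in (0, 1]; a smaller alpha means stronger smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not curves:
        return []
    smoothed = [list(curves[0])]
    for curve in curves[1:]:
        prev = smoothed[-1]
        smoothed.append([alpha * c + (1.0 - alpha) * p
                         for c, p in zip(curve, prev)])
    return smoothed
```

Such a filter damps abrupt frame-to-frame changes of the curve regardless of whether they would be visible, which is the kind of limitation an explicit HVS-based flickering constraint is meant to address.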
Visual Communications and Image Processing | 2012
Alper Koz; Frederic Dufaux
Backward compatibility for high dynamic range image and video compression is one of the essential requirements in the transition phase from low dynamic range (LDR) displays to high dynamic range (HDR) displays. In a recent work [1], an optimized solution for tone mapping and inverse tone mapping of HDR images is achieved in terms of the mean square error (MSE) of the logarithm of the luminance values of HDR image pixels for backward-compatible compression. A disadvantage of this approach is that the luminance values used for the minimization are not uniform with respect to human perception, which causes unnatural over-illumination in the produced LDR images. In this paper, we propose using perceptually uniform luminance values instead for the optimization of the tone mapping curve. The results indicate that the proposed approach gives better performance (0.5-1 dB gains) in terms of Perceptually Uniform Peak Signal to Noise Ratio (PU-PSNR) and produces more realistic LDR images.
Proceedings of SPIE | 2012
Alper Koz; Frederic Dufaux
High dynamic range (HDR) video compression has until now been approached either by using the high profile of the existing state-of-the-art H.264/AVC (Advanced Video Coding) codec or by separately encoding the low dynamic range (LDR) video and the residue resulting from the estimation of the HDR video from the LDR video. Although the latter approach has the distinctive advantage of providing backward compatibility with 8-bit LDR displays, the superiority of one approach over the other in terms of the rate-distortion trade-off has not yet been verified. In this paper, we first give a detailed overview of the methods in these two approaches. Then, we experimentally compare the two approaches with respect to different objective and perceptual metrics, such as HDR mean square error (HDR MSE), perceptually uniform peak signal to noise ratio (PU PSNR), and HDR visible difference predictor (HDR VDP). We first conclude that the methods optimized for backward compatibility with 8-bit LDR displays are superior to the method designed for the high profile encoder, for both 8-bit and 12-bit mappings and in terms of all metrics. Second, using higher bit depths with a high profile encoder gives better rate-distortion performance than employing an 8-bit mapping with an 8-bit encoder for the same method, in particular when the dynamic range of the video sequence is high. Third, rather than encoding the residue signal in backward-compatible methods, changing the quantization step size of the LDR layer encoder would be sufficient to achieve a required quality. In other words, the quality of tone mapping is more important than residue encoding for the performance of HDR image and video coding.
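As a rough illustration of why the bit depth of the mapping matters more when the dynamic range is high, the sketch below quantizes luminance uniformly in the log10 domain to a given number of bits and measures the MSE of the reconstructed log luminance. This is a simplification made for illustration only: it involves no video encoder and no rate measurement, the uniform log mapping is an assumption rather than any of the mappings compared in the paper, and all function names are hypothetical.

```python
import math
from typing import Sequence


def quantize_log_luminance(lum: float, lo: float, hi: float, bits: int) -> int:
    """Map a luminance value to an integer code, uniformly in log10 space."""
    levels = (1 << bits) - 1
    span = math.log10(hi) - math.log10(lo)
    x = (math.log10(lum) - math.log10(lo)) / span
    return round(min(max(x, 0.0), 1.0) * levels)


def dequantize_log_luminance(code: int, lo: float, hi: float, bits: int) -> float:
    """Map an integer code back to a luminance value."""
    levels = (1 << bits) - 1
    span = math.log10(hi) - math.log10(lo)
    return 10.0 ** (math.log10(lo) + (code / levels) * span)


def log_mse(lums: Sequence[float], lo: float, hi: float, bits: int) -> float:
    """MSE between original and reconstructed log10 luminance values."""
    total = 0.0
    for lum in lums:
        code = quantize_log_luminance(lum, lo, hi, bits)
        rec = dequantize_log_luminance(code, lo, hi, bits)
        total += (math.log10(lum) - math.log10(rec)) ** 2
    return total / len(lums)
```

With a luminance range of six decades, for example `lo=0.01` and `hi=10000.0`, the 8-bit step in the log domain is 16 times coarser than the 12-bit step, so `log_mse(..., bits=12)` is expected to be well below `log_mse(..., bits=8)` for typical samples.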
International Conference on Image Processing | 2005
Alper Koz; A. Aydin Alatan
A novel oblivious video watermarking technique based on the temporal sensitivity of the human visual system (HVS) is proposed. The method exploits the temporal contrast thresholds of the HVS to determine the maximum watermark strength that still gives imperceptible distortion after watermark insertion. Compared to other methods in the literature, which either do not use any HVS properties explicitly [F. Deguillaume et al., 1999] or exploit only spatial properties of the HVS [F. Hartung et al., 1998], the proposed method avoids the flickering problem in the watermarked video and gives better robustness, in terms of bit error rate, against video distortions such as additive Gaussian noise, ITU H.263+ coding at medium bit rates, and frame averaging.
International Conference on Image Processing | 2013
Alper Koz; Frederic Dufaux
Backward compatibility with low dynamic range (LDR) displays is an important requirement for high dynamic range (HDR) image and video coding in order to enable a successful transition to HDR technology. In a recent work [1], an optimized solution for tone mapping and inverse tone mapping of HDR images is achieved in terms of the mean square error (MSE) of the logarithm of the luminance values of HDR image pixels for backward-compatible compression. Although this pioneering optimization approach provides a well-founded mathematical framework for tone mapping, one of its important shortcomings is that it does not take the quality of the resulting LDR images into account in its formulation. In this paper, we include the LDR image quality as a constraint in the optimization problem and develop a methodology to balance the trade-off between HDR image quality and LDR image quality during HDR image and video coding. The developed methodology is verified on HDR images by showing the increase (decrease) in the quality of the generated LDR images at the cost (benefit) of the rate-distortion performance of HDR image coding.
International Conference on Image Processing | 2002
Alper Koz; A. Aydin Alatan
The spatial resolution of the human visual system (HVS) decreases rapidly away from the point of fixation (foveation point). By exploiting this fact, we propose a watermarking approach that embeds the watermark energy into the image periphery according to foveation-based HVS contrast thresholds. Compared to other HVS-based watermarking methods, the simulation results demonstrate an improvement in the robustness of the proposed approach against image degradations, such as JPEG compression, cropping and additive Gaussian noise, in terms of subjective measures, based on foveation. In addition, the method proposed for still images is adapted for video and the robustness of the adapted method is tested against ITU H.263+ coding.