Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Gokce Dane is active.

Publication


Featured research published by Gokce Dane.


International Conference on Computer Communications and Networks | 2007

Temporal Quality Evaluation for Enhancing Compressed Video

Kai-Chieh Yang; Gokce Dane; Khaled Helmi El-Maleh

This paper proposes a metric to quantify the effect of video frame loss according to its impact on perceived temporal quality. The metric utilizes information obtained from the pixel domain and specifically aims at measuring the temporal video quality degradation caused by both regular and irregular frame loss. As one application, the proposed temporal quality metric is used to evaluate the benefit of adaptive thresholding in frame-skipping algorithms at the encoder. The metric shows high prediction accuracy when compared against subjective quality evaluation. Furthermore, the experimental results show that the proposed metric precisely differentiates between different frame-skipping approaches and can be used effectively to evaluate them. With the help of the proposed quality metric, encoders can be designed to drop frames with minimal perceptual video quality degradation.
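
The paper's actual metric also draws on pixel-domain motion information that the abstract does not detail. As a rough, hypothetical sketch of the core idea only, the snippet below scores temporal quality from the pattern of dropped frame indices, penalizing irregular (bursty) drops more than regular ones; the function name and the weighting scheme are assumptions, not the paper's formulation.

```python
import numpy as np

def temporal_quality(dropped, num_frames, irregularity_weight=2.0):
    """Toy temporal-quality score in [0, 1] for a clip with dropped frames.

    Assumption: a regular drop pattern (e.g., every other frame) is less
    annoying than bursty, irregular drops at the same drop rate. The paper's
    metric additionally uses pixel-domain information, omitted here.
    """
    dropped = sorted(set(dropped))
    drop_ratio = len(dropped) / num_frames
    if len(dropped) < 2:
        return 1.0 - drop_ratio
    gaps = np.diff(dropped)
    # Coefficient of variation of inter-drop gaps: 0 for perfectly regular drops.
    irregularity = np.std(gaps) / (np.mean(gaps) + 1e-9)
    score = 1.0 - drop_ratio * (1.0 + irregularity_weight * irregularity)
    return float(np.clip(score, 0.0, 1.0))

# Regular skipping scores higher than bursty skipping at the same drop rate.
print(temporal_quality(dropped=[2, 4, 6, 8], num_frames=30))   # regular
print(temporal_quality(dropped=[2, 3, 4, 20], num_frames=30))  # bursty
```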


International Conference on Image Processing | 2008

Low-complexity temporal error concealment by motion vector processing for mobile video applications

Gokce Dane; Yan Ye; Yen-Chi Lee

In this paper, we propose a novel motion vector processing (MVP) approach for temporal error concealment (TEC) in mobile video applications. Most existing TEC techniques estimate the lost motion vectors by minimizing a given distortion, such as boundary variation or neighbor matching in the pixel domain, which requires high power consumption compared to approaches that do not use pixel information. In wireless video applications, power consumption is a crucial factor, and the adoption of video post-processing algorithms by mobile devices depends greatly on the complexity of the algorithms. The proposed algorithm uses only the received motion vectors, without pixel data, in the concealment process, and achieves concealed video quality comparable to that of pixel-domain approaches at much lower computational complexity. The performance gain is achieved by new motion vector processing techniques that include (i) frame-to-frame motion detection, (ii) local motion classification and (iii) motion trajectory tracking. Furthermore, the proposed MVP-TEC approach provides better video quality than other TEC methods that do not perform motion analysis.
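
As a minimal sketch of the general idea (not the paper's exact MVP-TEC algorithm), one can estimate each lost block's motion vector from its received neighbors, with a crude local motion classification that falls back to zero motion in static regions. All names and the threshold below are assumptions for illustration.

```python
import numpy as np

def conceal_motion_vectors(mv_field, lost_mask, static_thresh=0.5):
    """Fill in lost motion vectors using only neighboring received MVs.

    mv_field:  (H, W, 2) array of per-block motion vectors.
    lost_mask: (H, W) boolean array, True where the block's MV was lost.
    """
    H, W, _ = mv_field.shape
    out = mv_field.copy()
    for i in range(H):
        for j in range(W):
            if not lost_mask[i, j]:
                continue
            # Gather MVs of the received 8-neighbors only.
            neighbors = [out[y, x]
                         for y in range(max(0, i - 1), min(H, i + 2))
                         for x in range(max(0, j - 1), min(W, j + 2))
                         if (y, x) != (i, j) and not lost_mask[y, x]]
            if not neighbors:
                out[i, j] = 0.0  # no information: assume zero motion
                continue
            neighbors = np.array(neighbors)
            # Crude local motion classification: static region -> zero MV,
            # moving region -> component-wise median of neighbor MVs.
            if np.mean(np.linalg.norm(neighbors, axis=1)) < static_thresh:
                out[i, j] = 0.0
            else:
                out[i, j] = np.median(neighbors, axis=0)
    return out
```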


Computer Vision and Pattern Recognition | 2017

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

Subarna Tripathi; Gokce Dane; Byeongkeun Kang; Vasudev Bhaskaran; Truong Q. Nguyen

Deep Convolutional Neural Networks (CNN) are the state-of-the-art performers for the object detection task. It is well known that object detection requires more computation and memory than image classification. In this work, we propose LCDet, a fully-convolutional neural network for generic object detection that aims to work in embedded systems. We design and develop an end-to-end TensorFlow (TF)-based model. Detection works in a single forward pass through the network. Additionally, we employ 8-bit quantization on the learned weights. As a use case, we choose face detection and train the proposed model on images containing a varying number of faces of different sizes. We evaluate the face detection performance on the publicly available FDDB and Widerface datasets. Our experimental results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based face detection methods while reducing the model size by 3× and memory bandwidth by 3-4× compared with one of the best real-time CNN-based object detectors, YOLO [23]. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as that of the floating-point model, and achieves a 20× performance gain compared to the floating-point model. Thus the proposed model is amenable to embedded implementations and is generic enough to be extended to any number of object categories.
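
The abstract's additional 4× memory reduction comes from storing weights in 8 bits instead of 32-bit floats. A minimal sketch of one common scheme, symmetric linear quantization with a single per-tensor scale, is below; the paper does not specify its exact scheme, so treat this as an illustration, not LCDet's implementation.

```python
import numpy as np

def quantize_weights_8bit(w):
    """Symmetric linear 8-bit quantization of a weight tensor.

    Stores int8 values plus one float scale per tensor; int8 vs. float32
    storage alone accounts for the ~4x memory reduction cited above.
    """
    scale = np.max(np.abs(w)) / 127.0 if np.any(w) else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(3, 3, 64, 64).astype(np.float32)  # e.g., a conv kernel
q, s = quantize_weights_8bit(w)
print(w.nbytes / q.nbytes)                   # 4.0: four-fold smaller storage
print(np.max(np.abs(w - dequantize(q, s))))  # small per-weight error
```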


International Conference on Computer Communications and Networks | 2007

A Multi-Mode Video Object Segmentation Scheme for Wireless Video Applications

Gokce Dane; Khaled Helmi El-Maleh; Haohong Wang

This paper presents a new multi-mode video object segmentation scheme to automatically segment head-and-shoulder objects from a video sequence. The proposed system supports two modes of segmentation: intra-mode and inter-mode. The intra-mode segmentation integrates skin detection, facial feature localization and verification, object shape approximation, split-and-merge region growing, and object region selection, and guarantees good performance of the proposed segmentation scheme. The inter-mode object segmentation uses background modeling and subtraction to take advantage of the temporal correlation of video frames, increasing the robustness of the segmentation while speeding up the processing. The main contributions of the proposed method include: (i) a robust and efficient background modeling in inter-mode object segmentation, and (ii) a facial feature verification and multi-face separation algorithm in intra-mode segmentation. The performance of the algorithm is illustrated by simulation results on various head-and-shoulder video sequences.
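
The inter-mode relies on background modeling and subtraction. A generic sketch of that building block (a running-average model, not necessarily the paper's specific one; the function names, learning rate, and threshold are assumptions) looks like this:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Running-average background model; bg is float32, same shape as frame."""
    return (1.0 - alpha) * bg + alpha * frame.astype(np.float32)

def foreground_mask(bg, frame, thresh=25.0):
    """Per-pixel foreground mask by background subtraction."""
    diff = np.abs(frame.astype(np.float32) - bg)
    if diff.ndim == 3:          # collapse color channels to one difference
        diff = diff.max(axis=2)
    return diff > thresh

# Typical use: initialize bg from the first frame, then per frame:
#   mask = foreground_mask(bg, frame); bg = update_background(bg, frame)
```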


Applications of Digital Image Processing XL | 2017

Low-complexity object detection with deep convolutional neural network for embedded systems

Byeongkeun Kang; Subarna Tripathi; Gokce Dane; Truong Nguyen

We investigate low-complexity convolutional neural networks (CNNs) for object detection for embedded vision applications. It is well known that deploying CNN-based object detection on an embedded system is more challenging than problems like image classification because of its computation and memory requirements. To meet these requirements, we design and develop an end-to-end TensorFlow (TF)-based fully-convolutional deep neural network for the generic object detection task, inspired by one of the fastest frameworks, YOLO. The proposed network predicts the localization of every object by regressing the coordinates of the corresponding bounding box, as in YOLO. Hence, the network is able to detect objects without any limitation on their size. However, unlike YOLO, all the layers in the proposed network are fully convolutional, so it can take input images of any size. We pick face detection as a use case and evaluate the proposed model on the FDDB and Widerface datasets. As another use case of generic object detection, we evaluate its performance on the PASCAL VOC dataset. The experimental results demonstrate that the proposed network can predict object instances of different sizes and poses in a single frame. Moreover, the results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based object detection methods while reducing the model size by 3× and memory bandwidth by 3-4× compared with one of the best real-time CNN-based object detectors, YOLO. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as that of the floating-point model. Moreover, the fixed-point model achieves 20× faster inference than the floating-point model. Thus, the proposed method is promising for embedded implementations.
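
To see why a fully-convolutional design accepts inputs of any size, consider a YOLO-style 1×1 convolutional prediction head: it emits one prediction vector per grid cell, so the output grid simply follows the input resolution, whereas a fully-connected layer would fix the input size. The sketch below (untrained weights, assumed shapes, not the paper's network) illustrates only that property.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(features, kernel):
    """1x1 convolution as a per-cell linear map; spatial size is unconstrained."""
    return features @ kernel  # (H, W, Cin) @ (Cin, Cout) -> (H, W, Cout)

def detection_head(features, num_classes=1):
    """YOLO-style dense predictions per grid cell:
    4 box offsets + 1 objectness score + class scores."""
    cin = features.shape[-1]
    cout = 4 + 1 + num_classes
    kernel = rng.normal(size=(cin, cout)).astype(np.float32)  # untrained
    return conv1x1(features, kernel)

# The same head handles different input sizes; output grid tracks the input.
print(detection_head(rng.normal(size=(13, 13, 256)).astype(np.float32)).shape)
print(detection_head(rng.normal(size=(20, 30, 256)).astype(np.float32)).shape)
```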


Proceedings of SPIE | 2013

Multiview synthesis for autostereoscopic displays

Gokce Dane; Vasudev Bhaskaran

Autostereoscopic (AS) displays spatially multiplex multiple views, providing a more immersive experience by enabling users to view the content from different angles without the need for 3D glasses. Multiple views could be captured from multiple cameras at different orientations; however, this can be expensive, time-consuming, and inapplicable to some applications. The goal of multiview synthesis in this paper is to generate multiple views from a stereo image pair and a disparity map by using various video processing techniques, including depth/disparity map processing, initial view interpolation, inpainting, and post-processing. We specifically emphasize the need for disparity processing when no depth information associated with the 2D data is available, and we propose a segmentation-based disparity processing algorithm to improve the disparity map. Furthermore, we extend a texture-based 2D inpainting algorithm to 3D and further improve the hole-filling performance of view synthesis. The benefit of each step of the proposed algorithm is demonstrated by comparison to state-of-the-art algorithms in terms of visual quality and the PSNR metric. Our system is evaluated in an end-to-end multiview synthesis framework where only a stereo image pair is provided as input, and eight views are generated and displayed on an 8-view Alioscopy AS display.
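
A rough sketch of the initial view-interpolation step (the paper adds disparity processing, 3D inpainting, and post-processing on top of this) is forward warping: shift each pixel of one input view by a fraction of its disparity to place a virtual camera between the two inputs. The function below is a generic illustration under assumed conventions (left-to-right disparity, nearest pixel wins via a disparity z-buffer), not the paper's implementation.

```python
import numpy as np

def synthesize_view(left, disparity, alpha):
    """Forward-warp the left image by a scaled disparity to render a virtual
    view at fractional baseline position alpha in [0, 1].

    left:      (H, W, 3) image; disparity: (H, W) left-to-right disparity.
    Disoccluded pixels are left as -1 for a later inpainting step.
    """
    H, W, _ = left.shape
    out = np.full_like(left, -1, dtype=np.int32)
    depth = np.full((H, W), -np.inf)  # z-buffer: larger disparity = nearer
    ys, xs = np.indices((H, W))
    xt = np.round(xs - alpha * disparity).astype(int)
    valid = (xt >= 0) & (xt < W)
    for y, x_src, x_dst, d in zip(ys[valid], xs[valid], xt[valid],
                                  disparity[valid]):
        if d > depth[y, x_dst]:
            depth[y, x_dst] = d
            out[y, x_dst] = left[y, x_src]
    return out  # pixels still -1 are holes to be filled by inpainting
```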


Proceedings of SPIE | 2012

Frame rate up-conversion assisted with camera auto exposure information

Liang Liang; Bob R. Hung; Gokce Dane

Frame rate up-conversion (FRC) is the process of converting between different frame rates for targeted display formats. Besides scanning-format applications for large displays, FRC can be used to increase the frame rate of video at the receiver end for video telephony, video streaming, or playback applications on mobile platforms where bandwidth savings are crucial. Many algorithms have been proposed for decoder/receiver-side FRC; however, most approach the problem from a video encoding/decoding point of view. We systematically studied strategies for utilizing camera 3A information (auto exposure, auto white balance, and auto focus) to assist the FRC process; in this paper we focus on using camera exposure information to assist decoder-side FRC. In the proposed strategy, the exposure information, together with other camera 3A-related information, is packetized as metadata that is attached to the corresponding frame and transmitted with the main video bitstream to the decoder side for FRC assistance. The metadata contains information such as zooming, auto focus, AE (auto exposure) and AWB (auto white balance) statistics, scene change detection, and global motion detected from motion sensors. The proposed metadata consists of camera-specific information, which differs from simply sending motion vectors or mode information to aid the FRC process. Compared to traditional FRC approaches used on mobile platforms, the proposed approach is a low-complexity, low-power solution, which is crucial in resource-constrained environments such as mobile platforms.
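
To make the metadata-assisted idea concrete, here is a toy 2× FRC step that consults per-frame camera hints before interpolating: across a flagged scene change or exposure transition, blending would produce ghosting or brightness pumping, so the frame is repeated instead. The metadata field names and the simple averaging are assumptions for illustration; real FRC would motion-compensate.

```python
import numpy as np

def interpolate_frame(prev, curr, meta):
    """Produce one frame to insert between prev and curr (2x up-conversion).

    meta is assumed to carry camera hints such as
    {'scene_change': bool, 'exposure_changing': bool} from the bitstream.
    """
    if meta.get('scene_change') or meta.get('exposure_changing'):
        return curr.copy()  # frame repetition is safer than blending here
    # Otherwise a plain temporal average stands in for motion compensation.
    blended = (prev.astype(np.float32) + curr.astype(np.float32)) / 2.0
    return blended.astype(prev.dtype)
```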


Eurasip Journal on Image and Video Processing | 2012

Quality of multimedia experience

Gokce Dane; Lina J. Karam; Khaled Helmi El-Maleh; Vittorio Baroncini; Touradj Ebrahimi

doi:10.1186/1687-5281-2012-5


Asilomar Conference on Signals, Systems and Computers | 2006

Efficient Motion Accuracy Search for Global Motion Vector Coding

Gokce Dane; Cheolhong An; Truong Q. Nguyen

One of the drawbacks of using global motion estimation in motion-compensated video coding is the overhead associated with the number and accuracy of motion parameters. Although prediction gain increases with motion accuracy, higher accuracy is not always affordable, since it increases the motion bit rate as well. Therefore, an efficient global motion vector coding algorithm is necessary for improving the prediction gain. In this paper, we present a new solution for global motion vector coding that finds the best motion accuracy for each frame. Given a fixed motion bit rate for global motion vectors, the proposed algorithm allocates the bits among differential motion vectors optimally by a best-accuracy search method. Second, we propose a novel mean-square-error modeling approach that reduces the complexity of the optimal search algorithm. The experimental results show that the proposed algorithm gives higher peak signal-to-noise ratio (PSNR) than the global motion vector coding algorithm in the MPEG-4 standard.
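
A toy version of the best-accuracy search reads as follows: for each candidate motion-vector accuracy, quantize the differential global motion vector, estimate its coding cost, and keep the accuracy that minimizes quantization error within the bit budget. The cost model below (a signed exp-Golomb-like length) and all names are stand-ins; the paper replaces the exhaustive search with an analytical MSE model.

```python
import numpy as np

def best_accuracy(dmv, bit_budget):
    """Choose the MV accuracy for one frame's differential global MV.

    dmv: array of differential global-MV components (in pixels).
    Returns (accuracy step, bits used, quantization MSE), or None if no
    candidate accuracy fits the budget.
    """
    best = None
    for step in (1.0, 0.5, 0.25, 0.125):          # full- down to 1/8-pel
        q = np.round(dmv / step)
        # Crude stand-in code-length model for signed integers.
        bits = int(np.sum(2 * np.floor(np.log2(np.abs(q) + 1)) + 1))
        mse = float(np.mean((dmv - q * step) ** 2))
        if bits <= bit_budget and (best is None or mse < best[2]):
            best = (step, bits, mse)
    return best

print(best_accuracy(np.array([3.3, -1.7, 0.4, 2.9]), bit_budget=40))
```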


Patent | 2008

Reference selection for video interpolation or extrapolation

Gokce Dane; Khaled Helmi El-Maleh; Min Dai; Chia-Yuan Teng

Collaboration


Dive into Gokce Dane's collaborations.
