Publication


Featured research published by Heeseok Oh.


IEEE Transactions on Image Processing | 2015

3D Visual Discomfort Predictor: Analysis of Disparity and Neural Activity Statistics

Jincheol Park; Heeseok Oh; Sanghoon Lee; Alan C. Bovik

Being able to predict the degree of visual discomfort that is felt when viewing stereoscopic 3D (S3D) images is an important goal toward ameliorating causative factors, such as excessive horizontal disparity, misalignments or mismatches between the left and right views of stereo pairs, or conflicts between different depth cues. Ideally, such a model should account for such factors as capture and viewing geometries, the distribution of disparities, and the responses of visual neurons. When viewing modern 3D displays, visual discomfort is caused primarily by changes in binocular vergence while accommodation is held fixed at the viewing distance to a flat 3D screen. This results in unnatural mismatches between ocular fixations and ocular focus that do not occur in normal direct 3D viewing. This accommodation-vergence conflict can cause adverse effects, such as headaches, fatigue, eye strain, and reduced visual ability. Binocular vision is ultimately realized by means of neural mechanisms that subserve the sensorimotor control of eye movements. Realizing that these neuronal responses are directly implicated in both the control and experience of 3D perception, we have developed a model-based neuronal and statistical framework called the 3D visual discomfort predictor (3D-VDP) that automatically predicts the level of visual discomfort that is experienced when viewing S3D images. 3D-VDP extracts two types of features: 1) coarse features derived from the statistics of binocular disparities and 2) fine features derived by estimating the neural activity associated with the processing of horizontal disparities. In particular, we deploy a model of horizontal disparity processing in the extrastriate middle temporal region of the occipital lobe. We compare the performance of 3D-VDP with other recent discomfort prediction algorithms with respect to correlation against recorded subjective visual discomfort scores, and show that 3D-VDP is statistically superior to the other methods.
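To make the two-level feature design concrete, the sketch below shows the kind of "coarse" disparity-statistics features the abstract describes, assuming a precomputed horizontal-disparity map; the specific statistics, the 60-pixel comfort threshold, and the function name are illustrative assumptions, not the paper's actual feature set.

```python
# Hypothetical sketch of "coarse" disparity-statistics features; the actual
# 3D-VDP feature set is defined in the paper and differs from this.
import numpy as np

def coarse_disparity_features(disparity_map: np.ndarray) -> np.ndarray:
    """Summary statistics of a horizontal-disparity map (values in pixels)."""
    d = disparity_map.ravel()
    return np.array([
        d.mean(),                 # overall depth offset from the screen plane
        d.std(),                  # spread of disparities (depth range)
        np.abs(d).max(),          # peak disparity magnitude
        np.percentile(d, 95),     # extent of large uncrossed disparities
        (np.abs(d) > 60).mean(),  # fraction beyond an assumed comfort limit (60 px)
    ])
```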


IEEE Transactions on Image Processing | 2016

Stereoscopic 3D Visual Discomfort Prediction: A Dynamic Accommodation and Vergence Interaction Model

Heeseok Oh; Sanghoon Lee; Alan C. Bovik

The human visual system perceives 3D depth following sensing via its binocular optical system, a series of massively parallel processing units, and a feedback system that controls the mechanical dynamics of eye movements and the crystalline lens. The processes of accommodation (focusing of the crystalline lens) and binocular vergence are controlled simultaneously and symbiotically via cross-coupled communication between the two critical depth computation modalities. The output responses of these two subsystems, which are induced by oculomotor control, are used in the computation of a clear and stable cyclopean 3D image from the input stimuli. These subsystems operate in smooth synchronicity when one is viewing the natural world; however, conflicting responses can occur when viewing stereoscopic 3D (S3D) content on fixed displays, causing physiological discomfort. If such occurrences could be predicted, then they might also be avoided (by modifying the acquisition process) or ameliorated (by changing the relative scene depth). Toward this end, we have developed a dynamic accommodation and vergence interaction (DAVI) model that successfully predicts visual discomfort on S3D images. The DAVI model is based on the phasic and reflex responses of the fast fusional vergence mechanism. Quantitative models of accommodation and vergence mismatches are used to predict visual discomfort. Other 3D perceptual elements are included in the proposed method, including sharpness limits imposed by the depth of focus and fusion limits implied by Panum's fusional area. The DAVI predictor is created by training a support vector machine on features derived from the proposed model and on recorded subjective assessment results. Experimental results show that the predictor produces accurate estimates of experienced visual discomfort.
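The final prediction stage, regressing model-derived features onto recorded subjective scores with a support vector machine, can be sketched as follows; only the training setup (an SVM on model features plus subjective assessments) comes from the abstract, and the data and hyperparameters below are placeholders.

```python
# Minimal sketch of the DAVI prediction stage: a support-vector regressor
# mapping mismatch-model features to subjective discomfort scores.
# All data below are random placeholders for illustration.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X = np.random.rand(120, 6)        # placeholder: 6 model-derived features per S3D image
y = 1 + 4 * np.random.rand(120)   # placeholder: subjective discomfort scores in [1, 5]

predictor = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
predictor.fit(X, y)
predicted = predictor.predict(X[:5])   # discomfort predictions for new images
```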


IEEE Transactions on Image Processing | 2013

Visually Weighted Compressive Sensing: Measurement and Reconstruction

Hyungkeuk Lee; Heeseok Oh; Sanghoon Lee; Alan C. Bovik

Compressive sensing (CS) makes it possible to create compact representations of data in a natural way with respect to a desired data rate. Through wavelet decomposition, smooth and piecewise smooth signals can be represented as sparse and compressible coefficients, which can then be effectively compressed via CS. Since a wavelet transform divides image information into layered blockwise wavelet coefficients over spatial and frequency domains, visual improvement can be attained by an appropriate, perceptually weighted CS scheme. We introduce such a method in this paper and compare it with conventional CS. The resulting visual CS model is shown to deliver improved visual reconstructions.
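As a generic illustration of how per-coefficient perceptual weights enter the reconstruction, the sketch below solves a weighted ℓ1 problem, minimize ½‖Ax − y‖² + λ Σᵢ wᵢ|xᵢ|, with iterative soft-thresholding (ISTA); the paper's measurement and weighting design is more elaborate than this.

```python
# Generic weighted-l1 reconstruction via ISTA: a smaller weight w[i] means the
# i-th (perceptually important) coefficient is thresholded less aggressively.
import numpy as np

def weighted_ista(A, y, w, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||A @ x - y||^2 + lam * sum(w * |x|)."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L        # gradient step on the data term
        t = lam * w / L                      # per-coefficient soft-threshold
        x = np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
    return x
```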


Magnetic Resonance Imaging | 2014

Visually weighted reconstruction of compressive sensing MRI

Heeseok Oh; Sanghoon Lee

Compressive sensing (CS) enables the reconstruction of a magnetic resonance (MR) image from undersampled data in k-space with relatively little quality degradation compared to the original image. In addition, CS allows the scan time to be significantly reduced. Along with a reduction in the computational overhead, we investigate an effective way to improve visual quality through the use of a weighted optimization algorithm for reconstruction after variable-density random undersampling in the phase-encoding direction over k-space. In contrast to conventional magnetic resonance imaging (MRI) reconstruction methods, we investigate a visual weight, in particular over the region of interest (ROI), for quality improvement. In addition, we employ a wavelet transform to analyze the reconstructed image in the space domain and fully utilize data sparsity over the spatial and frequency domains. The visual weight is constructed by reflecting the perceptual characteristics of the human visual system (HVS), and then applied to ℓ1-norm minimization, which gives priority to each coefficient during the reconstruction process. Using objective quality assessment metrics, it was found that an image reconstructed using the visual weight has higher local and global quality than those processed by conventional methods.
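The undersampling pattern the abstract refers to, variable-density random sampling of phase-encoding lines that keeps the k-space center densely sampled, can be sketched as below; the density profile and the size of the fully sampled core are illustrative assumptions, not the paper's.

```python
# Sketch of variable-density random undersampling along the phase-encoding
# (row) axis of k-space; low frequencies near the center are kept densely.
import numpy as np

def variable_density_mask(n_pe, n_fe, accel=4, decay=4.0, seed=None):
    rng = np.random.default_rng(seed)
    r = np.abs(np.arange(n_pe) - n_pe // 2) / (n_pe / 2)  # distance from k-space center
    prob = (1.0 - r) ** decay                             # denser sampling near center
    prob *= (n_pe / accel) / prob.sum()                   # aim for ~n_pe/accel lines
    lines = rng.random(n_pe) < np.clip(prob, 0.0, 1.0)
    lines[n_pe // 2 - 8 : n_pe // 2 + 8] = True           # fully sample the low-freq core
    return np.repeat(lines[:, None], n_fe, axis=1)        # same lines across readout axis

mask = variable_density_mask(256, 256, accel=4)
# undersampled measurements: kspace = mask * np.fft.fft2(image)
```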


IEEE Transactions on Image Processing | 2017

Blind Deep S3D Image Quality Evaluation via Local to Global Feature Aggregation

Heeseok Oh; Sewoong Ahn; Jongyoo Kim; Sanghoon Lee

Previously, no-reference (NR) stereoscopic 3D (S3D) image quality assessment (IQA) algorithms have been limited to the extraction of reliable hand-crafted features based on an understanding of the insufficiently revealed human visual system or natural scene statistics. Furthermore, compared with full-reference (FR) S3D IQA metrics, it is difficult to achieve competitive quality score predictions using the extracted features, which are not optimized with respect to human opinion. To cope with this limitation of the conventional approach, we introduce a novel deep learning scheme for NR S3D IQA in terms of local to global feature aggregation. A deep convolutional neural network (CNN) model is trained in a supervised manner through two-step regression. First, to overcome the lack of training data, local patch-based CNNs are modeled, and the FR S3D IQA metric is used to approximate a reference ground-truth for training the CNNs. The automatically extracted local abstractions are aggregated into global features by inserting an aggregation layer in the deep structure. The locally trained model parameters are then updated iteratively using supervised global labeling, i.e., subjective mean opinion score (MOS). In particular, the proposed deep NR S3D image quality evaluator does not estimate the depth from a pair of S3D images. The S3D image quality scores predicted by the proposed method represent a significant improvement over those of previous NR S3D IQA algorithms. Indeed, the accuracy of the proposed method is competitive with FR S3D IQA metrics, having ~91% correlation in terms of MOS.
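The local-to-global structure can be pictured as a patch-level CNN whose features are pooled by an aggregation layer before a global regression onto MOS. Below is a much-simplified PyTorch sketch of that architecture only; the paper's actual network, the FR-metric proxy labels, and the two-step training procedure are not reproduced here.

```python
# Simplified local-to-global NR IQA model: per-patch CNN features are pooled
# (the "aggregation layer"), then regressed to a single quality score.
import torch
import torch.nn as nn

class PatchAggregateIQA(nn.Module):
    def __init__(self):
        super().__init__()
        self.local = nn.Sequential(                 # per-patch feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.regress = nn.Linear(64, 1)             # global quality regression

    def forward(self, patches):                     # patches: (B, N, 3, H, W)
        b, n, c, h, w = patches.shape
        f = self.local(patches.reshape(b * n, c, h, w)).reshape(b, n, -1)
        g = f.mean(dim=1)                           # aggregation over patches
        return self.regress(g).squeeze(-1)          # predicted MOS per image

scores = PatchAggregateIQA()(torch.randn(2, 16, 3, 32, 32))  # 2 images, 16 patches each
```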


IEEE Transactions on Image Processing | 2017

Enhancement of Visual Comfort and Sense of Presence on Stereoscopic 3D Images

Heeseok Oh; Jongyoo Kim; Jinwoo Kim; Taewan Kim; Sanghoon Lee; Alan C. Bovik

Conventional stereoscopic 3D (S3D) displays do not provide accommodation depth cues of the 3D image or video contents being viewed. The sense of content depths is thus limited to cues supplied by motion parallax (for 3D video), stereoscopic vergence cues created by presenting left and right views to the respective eyes, and other contextual and perspective depth cues. The absence of accommodation cues can induce two kinds of accommodation-vergence mismatch (AVM) at the fixation and peripheral points, which can result in severe visual discomfort. With the aim of alleviating discomfort arising from AVM, we propose a new visual comfort enhancement approach for processing S3D visual signals to deliver a more comfortable 3D viewing experience at the display. This is accomplished via an optimization process whereby a predictive indicator of visual discomfort is minimized, while still aiming to maintain the viewer’s sense of 3D presence by performing a suitable parallax shift, and by directed blurring of the signal. Our processing framework is defined on 3D visual coordinates that reflect the nonuniform resolution of retinal sensors, and it uses a measure of 3D saliency strength. An appropriate level of blur that corresponds to the degree of parallax shift is found, making it possible to produce synthetic accommodation cues implemented using a perceptively relevant filter. By this method, AVM, the primary contributor to the discomfort felt when viewing S3D images, is reduced. We show via a series of subjective experiments that the proposed approach improves visual comfort while preserving the sense of 3D presence.
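The two signal-processing operations the optimization adjusts, a parallax (horizontal) shift and disparity-dependent blur acting as a synthetic accommodation cue, can be illustrated as below; the discomfort-minimizing optimization itself and the foveated 3D coordinates are omitted, and the blur scaling is an assumption.

```python
# Illustrative parallax shift and depth-dependent blur for one view of an
# S3D pair; the optimal shift/blur levels come from the paper's optimization,
# which is not reproduced here.
import numpy as np
from scipy.ndimage import gaussian_filter, shift as nd_shift

def parallax_shift(view, dx):
    """Shift an (H, W, 3) view horizontally by dx pixels."""
    return nd_shift(view, shift=(0.0, dx, 0.0), order=1, mode="nearest")

def disparity_blur(view, disparity, sigma_per_px=0.05):
    """Blur more as residual disparity grows (global, assumed scaling)."""
    sigma = sigma_per_px * float(np.abs(disparity).mean())
    return gaussian_filter(view, sigma=(sigma, sigma, 0.0))  # no cross-channel blur
```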


Pacific Rim Conference on Multimedia | 2015

Implementation of Human Action Recognition System Using Multiple Kinect Sensors

Beom Kwon; Do Young Kim; Junghwan Kim; Inwoong Lee; Jongyoo Kim; Heeseok Oh; Hak-Sub Kim; Sanghoon Lee

Human action recognition is an important research topic with many potential applications, such as video surveillance, human-computer interaction, and virtual-reality combat training. However, much research on human action recognition has been carried out with single-camera systems, which suffer from low performance due to their vulnerability to partial occlusion. In this paper, we propose a human action recognition system that uses multiple Kinect sensors to overcome this limitation of conventional single-camera systems. To test the feasibility of the proposed system, we use snapshot and temporal features extracted from three-dimensional (3D) skeleton data sequences, and apply a support vector machine (SVM) for classification of human actions. The experimental results demonstrate the feasibility of the proposed system.
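The classification stage, snapshot plus temporal features from 3D skeleton sequences fed to an SVM, can be sketched as follows; the exact feature definitions below are illustrative stand-ins for the paper's.

```python
# Rough sketch: average normalized pose ("snapshot") and average joint motion
# ("temporal") features from a skeleton sequence, classified with an SVM.
import numpy as np
from sklearn.svm import SVC

def skeleton_features(seq):
    """seq: (n_frames, n_joints, 3) array of 3D joint positions."""
    centered = seq - seq[:, :1, :]                    # root joint as origin
    snapshot = centered.mean(axis=0)                  # average normalized pose
    temporal = np.diff(seq, axis=0).mean(axis=0)      # average joint velocity
    return np.concatenate([snapshot.ravel(), temporal.ravel()])

# X = np.stack([skeleton_features(s) for s in sequences]); y = action_labels
# clf = SVC(kernel="rbf").fit(X, y)
```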


IEEE Transactions on Image Processing | 2016

Visual Presence: Viewing Geometry Visual Information of UHD S3D Entertainment

Heeseok Oh; Sanghoon Lee

To maximize the presence experienced by humans, visual content has evolved to achieve a higher visual presence in a series of high definition (HD), ultra HD (UHD), 8K UHD, and 8K stereoscopic 3D (S3D). Several studies have introduced visual presence delivered from content when viewing UHD S3D from a content analysis perspective. Nevertheless, no clear definition has been presented for visual presence, and only a subjective evaluation has been relied upon. The main reason for this is that there is a limitation to defining visual presence via the use of content information itself. In this paper, we define the visual presence for each viewing environment, and investigate a novel methodology to measure the experienced visual presence when viewing both 2D and 3D via the definition of a new metric termed volume of visual information by quantifying the influence of the viewing geometry between the display and viewer. To achieve this goal, the viewing geometry and display parameters for both flat and atypical displays are analyzed in terms of human perception by introducing a novel concept of pixel-wise geometry. In addition, perceptual weighting through analysis of content information is performed in accordance with monocular and binocular vision characteristics. In the experimental results, it is shown that the constructed model based on the viewing geometry, content, and perceptual characteristics has a high correlation of about 84% with subjective evaluations.
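The "pixel-wise geometry" idea, that each display pixel subtends a different visual angle depending on its position relative to the viewer, can be illustrated with a toy computation for a flat panel; summing the angles is a crude stand-in for the paper's volume-of-visual-information metric, and the display parameters are example values.

```python
# Toy pixel-wise geometry for a flat display: the visual angle a pixel
# subtends shrinks toward the edges of the panel.
import numpy as np

def pixel_visual_angles(width_px, pixel_pitch_m, distance_m):
    """Horizontal visual angle (radians) subtended by each pixel column."""
    edges = (np.arange(width_px + 1) - width_px / 2) * pixel_pitch_m
    return np.diff(np.arctan2(edges, distance_m))     # angle per pixel column

angles = pixel_visual_angles(3840, 0.00025, 1.5)      # ~0.96 m-wide UHD panel at 1.5 m
total_h_fov_deg = np.degrees(angles.sum())            # horizontal field of view
```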

Collaboration


Dive into Heeseok Oh's collaboration.

Top Co-Authors

Alan C. Bovik

University of Texas at Austin
