
Publication


Featured research published by Xiongkuo Min.


International Symposium on Circuits and Systems | 2014

Visual attention data for image quality assessment databases

Xiongkuo Min; Guangtao Zhai; Zhongpai Gao; Ke Gu

Images usually contain areas that particularly attract people's attention, and visual attention is an important feature of the human visual system (HVS). Visual attention has been shown to be effective in improving the performance of existing image quality assessment (IQA) metrics. However, with the rapid advancement of IQA research, the growing number of open IQA databases calls for an associated, comprehensive, and accurate visual attention dataset. Despite the large number of existing computational attention/saliency models, the most accurate measure of human attention is still human-based. In this research, we first conduct extensive eye-tracking experiments on all the pristine images from seven widely used IQA databases (LIVE, TID2008, CSIQ, Toyama, LIVE Multiply Distortion, IVC, and A57). We then propose a gaze-duration-adaptive weighting approach to generate saliency maps from the eye-tracking data. When applied to the IQA databases, experimental results suggest that the accuracy of benchmark quality metrics, e.g., PSNR and SSIM, can be systematically improved, outperforming existing saliency datasets. Both the eye-tracking data and the saliency maps from this research will be made publicly available at gvsp.sjtu.edu.cn.
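
The paper's exact gaze-duration weighting is not spelled out above, but the general recipe for turning eye-tracking data into a saliency map is well established: accumulate fixations weighted by their duration, then blur with a Gaussian whose width approximates the fovea. A minimal sketch under that assumption (the function name, fixation format, and sigma are illustrative, not the authors'):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_saliency_map(fixations, shape, sigma=30.0):
    """Build a duration-weighted saliency map from eye-tracking fixations.

    fixations: iterable of (x, y, duration_ms) tuples.
    shape:     (height, width) of the stimulus image.
    sigma:     Gaussian spread in pixels (roughly one degree of visual angle).
    """
    h, w = shape
    sal = np.zeros((h, w), dtype=np.float64)
    for x, y, dur in fixations:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < w and 0 <= yi < h:
            sal[yi, xi] += dur          # longer fixations weigh more
    sal = gaussian_filter(sal, sigma)   # spread impulses into foveal blobs
    return sal / sal.max() if sal.max() > 0 else sal

# Example: three fixations of varying duration on a 512x512 image.
sal = fixation_saliency_map([(100, 120, 300), (300, 260, 800), (400, 90, 150)], (512, 512))
```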


IEEE Transactions on Industrial Electronics | 2017

A Fast Reliable Image Quality Predictor by Fusing Micro- and Macro-Structures

Ke Gu; Leida Li; Hong Lu; Xiongkuo Min; Weisi Lin

A fast, reliable computational quality predictor is highly desirable in practical image/video applications, such as quality monitoring of real-time coding and transcoding. In this paper, we propose a new perceptual image quality assessment (IQA) metric based on the human visual system (HVS). The proposed IQA model performs efficiently using convolution operations at multiple scales, gradient magnitude and color information similarity, and perceptual-based pooling. Extensive experiments are conducted on four popular large-scale image databases and two multiply-distorted image databases, and the results validate the superiority of our approach over modern IQA measures in both efficiency and efficacy. Our metric is built on the theoretical foundations of the HVS, with recently designed IQA methods as special cases.
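
The abstract names its building blocks: multi-scale convolution, gradient magnitude, color similarity, and pooling. As a hedged illustration of one such block, here is a standard gradient-magnitude similarity computation of the kind such metrics use; it is not the authors' exact formulation, and the stabilizing constant c is an assumed placeholder:

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_magnitude_similarity(ref, dist, c=170.0):
    """Mean gradient-magnitude similarity between two grayscale images.

    A GMS-style building block: per-pixel values near 1 mean similar local
    structure. The constant c stabilizes the ratio where gradients are weak.
    """
    def grad_mag(img):
        gx = sobel(img.astype(np.float64), axis=1)
        gy = sobel(img.astype(np.float64), axis=0)
        return np.hypot(gx, gy)

    g_r, g_d = grad_mag(ref), grad_mag(dist)
    gms = (2.0 * g_r * g_d + c) / (g_r ** 2 + g_d ** 2 + c)
    return gms.mean()   # simple average pooling of the similarity map
```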


International Conference on Multimedia and Expo | 2016

Blind quality assessment of compressed images via pseudo structural similarity

Xiongkuo Min; Guangtao Zhai; Ke Gu; Yuming Fang; Xiaokang Yang; Xiaolin Wu; Jiantao Zhou; Xianming Liu

Block-based compression introduces severe pseudo structures. We find that the pseudo structures of images compressed at different levels show some degree of similarity, so we propose to evaluate the quality of compressed images via the similarity between the pseudo structures of two images. To obtain a “reference” image, we introduce the most distorted image (MDI), which is derived from the distorted image and suffers the highest degree of compression. The proposed pseudo structural similarity (PSS) model calculates the similarity between the pseudo structures of the distorted image and the MDI: under severe compression, the pseudo structures of the distorted image become similar to the MDI's. Comparative tests show that the proposed PSS model is comparable to state-of-the-art competitors; moreover, it not only assesses natural scene images well but also performs best on the actively researched screen content image (SCI) database. It is worth mentioning that PSS can boost the performance of mainstream general-purpose no-reference (NR) quality measures.
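
A hedged sketch of the MDI idea: re-encode the distorted image at the heaviest JPEG setting to obtain a “reference”, extract a crude pseudo-structure (blockiness) map from both, and compare. The blockiness proxy and constants below are illustrative assumptions, not the paper's actual PSS features:

```python
import cv2
import numpy as np

def most_distorted_image(img_bgr):
    """Re-encode the distorted image at the heaviest JPEG setting (the 'MDI')."""
    ok, buf = cv2.imencode(".jpg", img_bgr, [cv2.IMWRITE_JPEG_QUALITY, 1])
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)

def blockiness_map(gray):
    """Crude pseudo-structure proxy: gradient energy along 8x8 block borders."""
    g = np.abs(np.diff(gray.astype(np.float64), axis=1))
    mask = np.zeros_like(g)
    mask[:, 7::8] = 1.0                 # columns at JPEG block boundaries
    return g * mask

def pss_score(dist_bgr, c=1.0):
    gray_d = cv2.cvtColor(dist_bgr, cv2.COLOR_BGR2GRAY)
    gray_m = cv2.cvtColor(most_distorted_image(dist_bgr), cv2.COLOR_BGR2GRAY)
    b_d, b_m = blockiness_map(gray_d), blockiness_map(gray_m)
    sim = (2 * b_d * b_m + c) / (b_d ** 2 + b_m ** 2 + c)
    return sim.mean()    # higher similarity to the MDI => heavier compression
```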


International Symposium on Circuits and Systems | 2014

Information security display system based on temporal psychovisual modulation

Zhongpai Gao; Guangtao Zhai; Xiongkuo Min

This paper introduces an information security display system using temporal psychovisual modulation (TPVM). TPVM was proposed as a new information display technology exploiting the interplay of signal processing, optoelectronics, and psychophysics. Since the human visual system cannot detect rapid temporal changes above the flicker fusion frequency (about 60 Hz), yet modern display technologies offer much higher refresh rates, a single display can simultaneously serve different content to multiple observers. A TPVM display broadcasts a set of images called atom frames at high speed; these atom frames are then weighted by liquid crystal (LC) shutter-based viewing devices synchronized with the display before entering the human visual system and fusing into the desired visual stimuli. Through different viewing devices, people thus see different information. In this work, we develop a TPVM-based information security display prototype. There are two kinds of viewers: authorized viewers with viewing devices, who can see the secret information, and unauthorized viewers (bystanders) without viewing devices, who see only mask/disguise images. The prototype is built on a 120 Hz LCD screen with synchronized LC shutter glasses originally developed for stereoscopic display. The system is written in C++ using SDKs including Nvidia 3D Vision, DirectX, CEGUI, and MuPDF. We also added human-computer interaction support using Kinect. The information security display system developed in this work serves as a proof of concept of the TPVM paradigm, as well as a testbed for future research on TPVM technology.
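
The core TPVM relation is that each viewer class perceives a weighted sum of the atom frames, with the weights set by its shutter's per-frame transmittance. Below is a minimal sketch of solving for atom frames under that linear model; the weight matrix and the least-squares formulation are illustrative assumptions, and a practical system must handle the pixel-range constraints more carefully than simple clipping:

```python
import numpy as np

def tpvm_atom_frames(targets, weights):
    """Solve for atom frames so each viewer class perceives its target image.

    targets: (k, n) array, one flattened target image per viewer class.
    weights: (k, m) array, shutter transmittance of each of m atom frames
             for each viewer class (e.g., open/closed glasses per frame).
    Returns an (m, n) array of atom frames, clipped to the valid pixel range.
    """
    # Least-squares: find X with weights @ X ~= targets, then clip to [0, 1].
    frames, *_ = np.linalg.lstsq(weights, targets, rcond=None)
    return np.clip(frames, 0.0, 1.0)

# Toy example: 4 atom frames, 2 viewer classes (with/without glasses).
rng = np.random.default_rng(0)
secret  = rng.random(64 * 64)          # seen by authorized viewers
mask_im = rng.random(64 * 64)          # seen by bystanders
W = np.array([[1.0, 0.0, 1.0, 0.0],    # glasses pass frames 0 and 2
              [.25, .25, .25, .25]])   # naked eye averages all frames
X = tpvm_atom_frames(np.stack([secret, mask_im]), W)
```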


IEEE Transactions on Image Processing | 2017

Unified Blind Quality Assessment of Compressed Natural, Graphic, and Screen Content Images

Xiongkuo Min; Kede Ma; Ke Gu; Guangtao Zhai; Zhou Wang; Weisi Lin

Digital images in the real world are created by a variety of means and have diverse properties. A photographic natural scene image (NSI) may exhibit substantially different characteristics from a computer graphic image (CGI) or a screen content image (SCI). This poses major challenges to objective image quality assessment, for which existing approaches lack effective mechanisms to capture such content-type variations and thus generalize poorly from one type to another. To tackle this problem, we first construct a cross-content-type (CCT) database containing 1,320 distorted NSIs, CGIs, and SCIs, compressed using the high efficiency video coding (HEVC) intra coding method and the screen content compression (SCC) extension of HEVC. We then carry out a subjective experiment on the database in a well-controlled laboratory environment. Moreover, we propose a unified content-type adaptive (UCA) blind image quality assessment model that is applicable across content types. A key step in UCA is to incorporate the variations of human perceptual characteristics in viewing different content types through a multi-scale weighting framework, which leads to superior performance on the constructed CCT database. UCA is training-free, implying strong generalizability. To verify this, we test UCA on other databases containing JPEG, MPEG-2, H.264, and HEVC compressed images/videos, and observe that it consistently achieves competitive performance.
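
The multi-scale weighting framework can be illustrated with a generic pyramid pooling of per-scale quality scores; a minimal sketch follows, in which the weights are placeholders rather than UCA's content-type-adaptive values:

```python
import numpy as np
import cv2

def multiscale_weighted_quality(gray, score_fn, weights):
    """Pool per-scale quality scores over a dyadic image pyramid.

    score_fn: callable mapping a grayscale float image to a scalar score.
    weights:  one weight per scale; in UCA's spirit these would depend on
              the content type, but the values passed here are placeholders.
    """
    scores, cur = [], gray
    for _ in weights:
        scores.append(score_fn(cur))
        cur = cv2.pyrDown(cur)          # halve resolution for the next scale
    w = np.asarray(weights, dtype=np.float64)
    return float(np.dot(w / w.sum(), scores))

# Example with a trivial stand-in score (global contrast).
img = np.random.rand(256, 256).astype(np.float32)
q = multiscale_weighted_quality(img, lambda im: float(im.std()), [0.5, 0.3, 0.2])
```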


International Conference on Multimedia and Expo | 2014

Information security display system based on Spatial Psychovisual Modulation

Chunjia Hu; Guangtao Zhai; Zhongpai Gao; Xiongkuo Min

Privacy protection is of increasing importance in this era of information explosion. This paper introduces an information security display system based on spatial psychovisual modulation (SPVM). With the rapid advance of modern manufacturing techniques, display devices now support very high pixel densities (e.g., Apple's Retina display). Meanwhile, the human visual system (HVS) cannot distinguish image signals with spatial frequency above a threshold, as predicted by the contrast sensitivity function (CSF). It is therefore possible to devise a type of information security display that exploits the mismatch between the resolutions of modern display devices and the HVS. Given the desired visual stimuli for both bystanders and authorized users, we propose a method to design the display signals accordingly. We select polarization as the means to differentiate bystanders from authorized users. Operationally, applying complementary polarization to light emitted from different spatial sections of the display could itself be a challenge. Fortunately, the development of stereoscopic display technologies has made available many polarization-based, spatially multiplexed display devices. The hardware of the information security display system is based on a polarization-based stereoscopic screen made by LG. The software is written in C++ using SDKs including DirectX. Kinect is also included in the system to enhance human-computer interaction. Extended experimental results are given to justify the effectiveness and robustness of the system. The developed system serves both as a proof of concept of the SPVM method and as a testbed for future research on SPVM-based display technology.
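
On a row-polarized stereoscopic panel, spatial multiplexing amounts to interleaving two signals across alternating rows. A minimal sketch of just that multiplexing step is given below; designing the two signals so the naked-eye fusion looks like the disguise image is the harder problem the paper addresses, and the function here is an assumed illustration:

```python
import numpy as np

def row_interleave(secret, disguise):
    """Spatially multiplex two equal-size images by alternating rows.

    On a row-polarized panel, polarized glasses pass only the rows carrying
    the secret signal, while the naked eye fuses both sets of rows together.
    """
    assert secret.shape == disguise.shape
    out = disguise.copy()
    out[::2] = secret[::2]              # even rows: one polarization state
    return out
```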


Visual Communications and Image Processing | 2013

Brightness preserving video contrast enhancement using S-shaped transfer function

Ke Gu; Guangtao Zhai; Min Liu; Xiongkuo Min; Xiaokang Yang; Wenjun Zhang

This paper presents an efficient video contrast enhancement algorithm inspired by a perceptual model. We propose an S-shaped transfer function for image pixel values that effectively improves perceived contrast while preserving the brightness of the scene. The S-shaped transfer function has only one control parameter, which can be chosen adaptively for different video content, such as sports, cartoon, news, and landscape programs. The input image brightness is further preserved in order to maintain the human visual system's (HVS) perception of special scenes, such as dark scenes and seaside scenes. Experiments and a comparative study on the VQEG Phase I test database demonstrate that the proposed S-shaped Transfer function based Brightness Preserving (STBP) contrast enhancement algorithm outperforms various histogram-equalization-based methods such as HE, DSIHE, RSIHE, and WTHE, with much lower computational complexity.
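
The abstract does not give the transfer function's form, so the following is one plausible one-parameter S-curve on normalized intensities, not necessarily the paper's: it is antisymmetric about mid-gray, which is what keeps mean brightness roughly fixed while mid-tone contrast increases.

```python
import numpy as np

def s_curve(x, alpha=0.6):
    """One-parameter S-shaped transfer for normalized pixel values x in [0, 1].

    An assumed sine-based form: alpha = 0 is the identity; 0 < alpha < 1
    steepens mid-tone contrast while staying monotonic, and the endpoints
    and mid-gray are fixed points, roughly preserving mean brightness.
    """
    return x - alpha * np.sin(2.0 * np.pi * x) / (2.0 * np.pi)

# Example: apply to a normalized frame (values in [0, 1]).
frame = np.linspace(0.0, 1.0, 5)
print(s_curve(frame))   # 0, mid-gray, and 1 map to themselves
```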


International Conference on Multimedia and Expo | 2014

Influence of compression artifacts on visual attention

Xiongkuo Min; Guangtao Zhai; Zhongpai Gao; Chunjia Hu

Visual attention is an important function of the human visual system (HVS). Over the long history of visual attention research, various computational models have been proposed with encouraging results. However, most of that work was conducted on images with ideal visual quality. In practice, the outputs of most visual communication systems contain different levels of artifacts, e.g., noise, blurring, and blockiness. It is therefore interesting to investigate the impact of artifacts on visual attention. In this paper, we look into how the widely encountered JPEG compression artifacts affect visual attention. We designed eye-tracking experiments on images with different levels of compression and different viewing times, and quantitatively compared the recorded eye-movement data. We found that compression level does affect visual attention, yet the influence is negligible at low compression levels. At high compression levels, the visual artifacts alter visual attention in a systematic way. We also analyzed how the influence depends on viewing duration and observed that viewing times that are too short or too long reduce the impact of compression artifacts on visual attention.
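
To quantify such shifts in attention, fixation maps recorded at different compression levels can be compared with standard saliency-similarity metrics. A minimal sketch using Pearson's linear correlation coefficient (CC), a common choice in saliency evaluation; the paper's exact comparison protocol is not stated above:

```python
import numpy as np

def saliency_cc(map_a, map_b):
    """Pearson correlation between two fixation/saliency maps.

    A standard way to quantify how much visual attention shifts between,
    e.g., a pristine image and its heavily compressed version: 1 means
    identical attention distributions, 0 means no linear relationship.
    """
    a = (map_a - map_a.mean()) / (map_a.std() + 1e-12)
    b = (map_b - map_b.mean()) / (map_b.std() + 1e-12)
    return float((a * b).mean())
```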


Signal Processing: Image Communication | 2018

The prediction of head and eye movement for 360 degree images

Yucheng Zhu; Guangtao Zhai; Xiongkuo Min

Estimating the salient areas of visual stimuli that are likely to attract viewers' attention is challenging because of the high complexity of cognitive behaviors in the brain. Many researchers have worked in this field with considerable success. Application areas ranging from computer vision and computer graphics to multimedia processing can benefit from saliency detection, since the detected saliency depicts the visual importance of different areas of the stimuli. For 360 degree visual stimuli, images and videos must record the whole scene in the 3D world, so the resolutions of panoramic images and videos are usually very high. However, when watching 360 degree stimuli, observers can only see the part of the scene in the viewport, which is presented to their eyes through a head-mounted display (HMD). Sending the whole video, or rendering the whole scene, may therefore waste resources. If we can predict the current field of view, effort can be focused on streaming and rendering the scene within it; furthermore, if we can predict the salient areas in the scene, the visually important areas can be processed more finely. The prediction of salient regions in traditional images and videos has been extensively studied, but conventional saliency prediction methods are not fully adequate for 360 degree content, which has some unique characteristics, and related work in this area is limited. In this paper, we study the problem of predicting the head movement, head-eye motion, and scanpaths of viewers watching 360 degree images in commodity HMDs. Three types of data are analyzed: first, head-movement data, which can be regarded as the movement of the viewport; second, head-eye motion data, which combines the motion of the head with the movement of the eyes within the viewport; and third, the scanpath data of observers over the entire panorama, which record position as well as time information. Our model is designed to predict saliency maps for the first two and scanpaths for the last. Experimental results demonstrate the effectiveness of our model.
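
A basic building block for viewport-based prediction is mapping a head orientation onto equirectangular image coordinates. A minimal sketch under the usual equirectangular convention (the function name and argument conventions are assumptions, not the paper's):

```python
import numpy as np

def head_to_equirect(yaw_deg, pitch_deg, width, height):
    """Map a head orientation to pixel coordinates on an equirectangular image.

    yaw in [-180, 180] degrees (left/right), pitch in [-90, 90] (up/down).
    Returns the (col, row) of the viewport center.
    """
    col = (yaw_deg + 180.0) / 360.0 * width
    row = (90.0 - pitch_deg) / 180.0 * height
    return int(col) % width, int(np.clip(row, 0, height - 1))

# Example: looking 30 degrees right and 10 degrees up in a 4096x2048 panorama.
print(head_to_equirect(30.0, 10.0, 4096, 2048))
```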


Signal Processing | 2018

Saliency-induced reduced-reference quality index for natural scene and screen content images

Xiongkuo Min; Ke Gu; Guangtao Zhai; Menghan Hu; Xiaokang Yang

Massive content composed of both natural scenes and screen content has been generated with the increasing use of wireless computing and cloud computing, which calls for general image quality assessment (IQA) measures that work for both natural scene images (NSIs) and screen content images (SCIs). In this paper, we develop a saliency-induced reduced-reference (SIRR) IQA measure for both NSIs and SCIs. Image quality and visual saliency are two widely studied and closely related research topics. Traditionally, visual saliency is used as a weighting map in the final pooling stage of IQA. Instead, we use detected visual saliency as a quality feature, since different types and levels of degradation can strongly influence saliency detection. Image quality is then described by the similarity between the two images' saliency maps. In SIRR, saliency is detected through a binary image descriptor called the “image signature”, which significantly reduces the reference data. We perform extensive experiments on five large-scale NSI quality assessment databases (LIVE, TID2008, CSIQ, LIVEMD, and CID2013), as well as two recently constructed SCI QA databases, SIQAD and QACS. Experimental results show that SIRR is comparable to state-of-the-art full-reference and reduced-reference IQA measures on NSIs, and outperforms most competitors on SCIs. Most importantly, SIRR is a cross-content-type measure that works efficiently for both NSIs and SCIs. The MATLAB source code of SIRR will be made publicly available with this paper.
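
The “image signature” referenced here is, to the best of my knowledge, the descriptor of Hou et al.: the sign of the image's DCT coefficients, whose smoothed, squared reconstruction serves as a saliency map. One sign bit per coefficient is what makes the reference data so compact. A minimal sketch (the smoothing parameter is illustrative):

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import gaussian_filter

def image_signature_saliency(gray, sigma=8.0):
    """Saliency via the 'image signature' (sign of the DCT coefficients).

    The binary signature is the compact feature a reduced-reference scheme
    like SIRR can transmit; the smoothed, squared reconstruction gives the
    saliency map to be compared between reference and distorted images.
    """
    x = gray.astype(np.float64)
    signature = np.sign(dctn(x, norm="ortho"))    # one sign bit per coefficient
    recon = idctn(signature, norm="ortho")        # back to the image domain
    return gaussian_filter(recon * recon, sigma)  # squared, then smoothed
```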

Collaboration


Dive into Xiongkuo Min's collaboration.

Top Co-Authors

Guangtao Zhai (Shanghai Jiao Tong University)
Xiaokang Yang (Shanghai Jiao Tong University)
Zhongpai Gao (Shanghai Jiao Tong University)
Ke Gu (Nanyang Technological University)
Chunjia Hu (Shanghai Jiao Tong University)
Yucheng Zhu (Shanghai Jiao Tong University)
Zhaohui Che (Shanghai Jiao Tong University)
Huiyu Duan (Shanghai Jiao Tong University)
Jing Liu (Shanghai Jiao Tong University)
Menghan Hu (Shanghai Jiao Tong University)