Manish Narwaria
Nanyang Technological University
Publication
Featured research published by Manish Narwaria.
IEEE Transactions on Image Processing | 2012
Anmin Liu; Weisi Lin; Manish Narwaria
In this paper, we propose a new image quality assessment (IQA) scheme, with emphasis on gradient similarity. Gradients convey important visual information and are crucial to scene understanding. Using such information, structural and contrast changes can be effectively captured. Therefore, we use the gradient similarity to measure the change in contrast and structure in images. Apart from the structural/contrast changes, image quality is also affected by luminance changes, which must also be accounted for in complete and more robust IQA. Hence, the proposed scheme considers both luminance and contrast-structural changes to effectively assess image quality. Furthermore, the proposed scheme is designed to follow the masking effect and visibility threshold more closely, i.e., the case when both the masked and masking signals are small is more effectively tackled by the proposed scheme. Finally, the effects of the changes in luminance and contrast-structure are integrated via an adaptive method to obtain the overall image quality score. Extensive experiments conducted with six publicly available subject-rated databases (comprising diverse images and distortion types) have confirmed the effectiveness, robustness, and efficiency of the proposed scheme in comparison with the relevant state-of-the-art schemes.
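To make the idea above concrete, here is a minimal sketch of a gradient-similarity style comparison in Python (numpy/scipy assumed). The similarity formulas, the constants c1/c2, and the fixed weight alpha are illustrative placeholders, not the paper's adaptive integration.

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_magnitude(img):
    # Sobel gradients along rows and columns, combined into a magnitude map.
    img = img.astype(np.float64)
    return np.hypot(sobel(img, axis=0), sobel(img, axis=1))

def gradient_similarity_score(ref, dist, c1=0.01, c2=0.01, alpha=0.9):
    """Combine a contrast-structural term (gradient similarity) with a
    luminance term. 'alpha' is a fixed illustrative weight, whereas the
    paper integrates the two changes adaptively."""
    ref, dist = ref.astype(np.float64), dist.astype(np.float64)
    g_ref, g_dist = gradient_magnitude(ref), gradient_magnitude(dist)
    grad_sim = (2 * g_ref * g_dist + c1) / (g_ref**2 + g_dist**2 + c1)
    lum_sim = (2 * ref * dist + c2) / (ref**2 + dist**2 + c2)
    return float(np.mean(alpha * grad_sim + (1 - alpha) * lum_sim))
```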
IEEE Transactions on Neural Networks | 2010
Manish Narwaria; Weisi Lin
Objective image quality estimation is useful in many visual processing systems, and is difficult to perform in line with human perception. The challenge lies in formulating effective features and fusing them into a single number to predict the quality score. In this brief, we propose a new approach to address the problem, with the use of singular vectors out of singular value decomposition (SVD) as features for quantifying major structural information in images and then support vector regression (SVR) for automatic prediction of image quality. The feature selection with singular vectors is novel and general for gauging structural changes in images as a good representative of visual quality variations. The use of SVR exploits the advantages of machine learning, with the ability to learn complex data patterns, for an effective and generalized mapping of features into a desired score, in contrast with the oft-utilized feature pooling process in the existing image quality estimators; this is to overcome the difficulty of model parameter determination for such a system to emulate the related, complex human visual system (HVS) characteristics. Experiments conducted with three independent databases confirm the effectiveness of the proposed system in predicting image quality with better alignment with the HVS's perception than the relevant existing work. The tests with untrained distortions and databases further demonstrate the robustness of the system and the importance of the feature selection.
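As a rough illustration of the two ingredients described above (SVD-derived structural features and SVR-based mapping), the following Python sketch uses numpy and scikit-learn; the block size, the summary statistics, and the SVR hyperparameters are assumptions, not the paper's exact design.

```python
import numpy as np
from sklearn.svm import SVR

def svd_change_features(ref, dist, block=32):
    """Per-block agreement between the leading singular vectors of the
    reference and distorted blocks, summarised into a fixed-length vector."""
    sims = []
    h, w = ref.shape
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            Ur, _, Vr = np.linalg.svd(ref[i:i+block, j:j+block].astype(float))
            Ud, _, Vd = np.linalg.svd(dist[i:i+block, j:j+block].astype(float))
            sims.append(abs(np.dot(Ur[:, 0], Ud[:, 0])) * abs(np.dot(Vr[0], Vd[0])))
    sims = np.asarray(sims)
    return np.array([sims.mean(), sims.std(), sims.min()])

# X: stacked feature vectors for training images, y: subjective scores (MOS).
# model = SVR(kernel="rbf", C=10.0).fit(X, y); quality = model.predict(X_test)
```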
IEEE Transactions on Systems, Man, and Cybernetics | 2012
Manish Narwaria; Weisi Lin
We study the use of machine learning for visual quality evaluation with comprehensive singular value decomposition (SVD)-based visual features. In this paper, the two-stage process and the relevant work in the existing visual quality metrics are first introduced, followed by an in-depth analysis of SVD for visual quality assessment. Singular values and vectors form the selected features for visual quality assessment. Machine learning is then used for the feature pooling process and demonstrated to be effective. This is to address the limitations of the existing pooling techniques, like simple summation, averaging, Minkowski summation, etc., which tend to be ad hoc. We advocate machine learning for feature pooling because it is more systematic and data driven. The experiments show that the proposed method outperforms eight existing relevant schemes. Extensive analysis and cross-validation are performed with ten publicly available databases (eight for images with a total of 4042 test images and two for video with a total of 228 videos). We use all publicly accessible software and databases in this study, and make our own software public, to facilitate comparison in future research.
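The contrast between ad hoc pooling and learned pooling can be sketched as below (Python with numpy/scikit-learn assumed); the Minkowski exponent, regressor choice, and cross-validation setup are illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

def minkowski_pool(features, p=2.0):
    # Hand-tuned pooling: a single exponent chosen a priori.
    return np.mean(np.abs(features) ** p, axis=-1) ** (1.0 / p)

def learned_pooling_cv(X, y, folds=5):
    # Data-driven pooling: a regressor learns the feature-to-score mapping
    # from subjective ground truth, evaluated with cross-validation.
    model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
    return cross_val_score(model, X, y, cv=folds, scoring="r2")
```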
IEEE Transactions on Image Processing | 2014
Yuming Fang; Junle Wang; Manish Narwaria; Patrick Le Callet; Weisi Lin
Saliency detection techniques have been widely used in various 2D multimedia processing applications. Currently, the emerging applications of stereoscopic display require new saliency detection models for stereoscopic images. Different from saliency detection for 2D images, depth features have to be taken into account in saliency detection for stereoscopic images. In this paper, we propose a new stereoscopic saliency detection framework based on the feature contrast of color, intensity, texture, and depth. Four types of features, including color, luminance, texture, and depth, are extracted from DCT coefficients to represent the energy of image patches. A Gaussian model of the spatial distance between image patches is adopted for the consideration of local and global contrast calculation. A new fusion method is designed to combine the feature maps to compute the final saliency map for stereoscopic images. Experimental results on a recent eye tracking database show the superior performance of the proposed method over other existing ones in saliency estimation for 3D images.
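A much-simplified, single-channel sketch of the patch-contrast idea is shown below (Python, numpy/scipy assumed): DCT-based patch features and contrast weighted by a Gaussian of spatial distance. The patch size, the two toy features, and sigma are assumptions; the paper extracts four feature types (including depth) and fuses separate feature maps.

```python
import numpy as np
from scipy.fft import dctn

def patch_features(img, patch=16):
    """DC energy and AC energy of each patch's DCT, plus patch centres."""
    h, w = img.shape
    feats, centers = [], []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            c = dctn(img[i:i+patch, j:j+patch].astype(float), norm="ortho")
            ac = np.sqrt(max(np.sum(c**2) - c[0, 0]**2, 0.0))
            feats.append([c[0, 0], ac])
            centers.append([i + patch / 2.0, j + patch / 2.0])
    return np.asarray(feats), np.asarray(centers)

def patch_saliency(feats, centers, sigma=64.0):
    """Feature contrast of each patch against all others, with nearby
    patches weighted more via a Gaussian of the spatial distance."""
    d_feat = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    d_spat = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    w = np.exp(-d_spat**2 / (2 * sigma**2))
    return (w * d_feat).sum(axis=1) / w.sum(axis=1)
```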
Journal of Electronic Imaging | 2015
Manish Narwaria; Rafal Mantiuk; Matthieu Perreira Da Silva; Patrick Le Callet
With the emergence of high-dynamic range (HDR) imaging, the existing visual signal processing systems will need to deal with both HDR and standard dynamic range (SDR) signals. In such systems, computing the objective quality is an important aspect in various optimization processes (e.g., video encoding). To that end, we present a newly calibrated objective method that can tackle both HDR and SDR signals. As it is based on the previously proposed HDR-VDP-2 method, we refer to the newly calibrated metric as HDR-VDP-2.2. Our main contribution is toward improving the frequency-based pooling in HDR-VDP-2 to enhance its objective quality prediction accuracy. We achieve this by formulating and solving a constrained optimization problem and thereby finding the optimal pooling weights. We also carried out extensive cross-validation as well as verified the performance of the new method on independent databases. These indicate a clear improvement in prediction accuracy as compared with the default pooling weights. The source code for HDR-VDP-2.2 is publicly available online for free download and use.
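The weight-calibration step can be pictured with a small constrained-optimization sketch in Python (numpy/scipy assumed). The linear pooling model, the squared-error objective, and the simplex constraint are illustrative stand-ins for the paper's actual formulation.

```python
import numpy as np
from scipy.optimize import minimize

def fit_pooling_weights(band_errors, mos):
    """band_errors: (n_images, n_bands) per-band distortion measures.
    mos: (n_images,) subjective scores. Returns non-negative weights
    summing to one that best predict the scores under a linear pooling."""
    n_bands = band_errors.shape[1]

    def loss(w):
        return np.mean((band_errors @ w - mos) ** 2)

    constraints = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},)
    bounds = [(0.0, None)] * n_bands
    w0 = np.full(n_bands, 1.0 / n_bands)
    return minimize(loss, w0, bounds=bounds, constraints=constraints).x
```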
IEEE Transactions on Image Processing | 2012
Manish Narwaria; Weisi Lin; Ian Vince McLoughlin; Sabu Emmanuel; Liang-Tien Chia
We present a new image quality assessment algorithm based on the phase and magnitude of the 2-D discrete Fourier transform. The basic idea is to compare the phase and magnitude of the reference and distorted images to compute the quality score. However, it is well known that the human visual system's sensitivity to different frequency components is not the same. We accommodate this fact via a simple yet effective strategy of non-uniform binning of the frequency components. This process also leads to a reduced-space representation of the image, thereby enabling the reduced-reference (RR) prospects of the proposed scheme. We employ linear regression to integrate the effects of the changes in phase and magnitude. In this way, the required weights are determined via proper training and are hence more convincing and effective. Lastly, using the fact that phase usually conveys more information than magnitude, we use only the phase for RR quality assessment. This provides the crucial advantage of a further reduction in the required amount of reference image information. The proposed method is, therefore, further scalable for RR scenarios. We report extensive experimental results using a total of nine publicly available databases: seven image databases (with a total of 3832 distorted images with diverse distortions) and two video databases (with a total of 228 distorted videos). These show that the proposed method is overall better than several of the existing full-reference algorithms and two RR algorithms. Additionally, there is a graceful degradation in prediction performance as the amount of reference image information is reduced, thereby confirming its scalability prospects. To enable comparisons and future study, a Matlab implementation of the proposed algorithm is available at http://www.ntu.edu.sg/home/wslin/reduced_phase.rar.
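A reduced-reference flavoured sketch of the phase comparison is given below in Python (numpy assumed): the DFT phase is summarised into a small number of radial bins and the binned summaries are compared. Uniform radial edges and a plain mean of wrapped phase are simplifications of the paper's non-uniform binning and trained regression.

```python
import numpy as np

def radial_phase_bins(img, n_bins=16):
    """Mean DFT phase per radial frequency band: a compact spectral summary."""
    F = np.fft.fftshift(np.fft.fft2(img.astype(float)))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2.0, xx - w / 2.0)
    edges = np.linspace(0.0, r.max() + 1e-9, n_bins + 1)
    phase = np.angle(F)
    return np.array([phase[(r >= lo) & (r < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def rr_phase_distance(ref, dist, n_bins=16):
    # Only the binned phase of the reference needs to be transmitted.
    return float(np.abs(radial_phase_bins(ref, n_bins)
                        - radial_phase_bins(dist, n_bins)).mean())
```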
IEEE Transactions on Multimedia | 2012
Manish Narwaria; Weisi Lin; Anmin Liu
Objective video quality assessment (VQA) is the use of computational models to evaluate the video quality in line with the perception of the human visual system (HVS). It is challenging due to the underlying complexity and the relatively limited understanding of the HVS and its intricate mechanisms. Three important issues arise in objective VQA in comparison with image quality assessment: 1) the temporal factors, apart from the spatial ones, also need to be considered; 2) the contribution of each factor (spatial and temporal) and their interaction to the overall video quality need to be determined; and 3) the computational complexity of the resultant method needs to be kept manageable. In this paper, we seek to tackle the first issue by utilizing the worst-case pooling strategy and the variations of spatial quality along the temporal axis, with proper analysis and justification. The second issue is addressed by the use of machine learning; we believe this to be more convincing since the relationship between the factors and the overall quality is derived via training with substantial ground truth (i.e., subjective scores). Experiments conducted using publicly available video databases show the effectiveness of the proposed full-reference (FR) algorithm in comparison to the relevant existing VQA schemes. Focus has also been placed on demonstrating the robustness of the proposed method to new and untrained data. To that end, cross-database tests have been carried out to provide a proper perspective of the performance of the proposed scheme as compared to other VQA methods. The third issue, regarding the computational cost, also plays a key role in determining the feasibility of a VQA scheme for practical deployment, given the large amount of data that needs to be processed/analyzed in real time. A limitation of many existing VQA algorithms is their high computational complexity. In contrast, the proposed scheme is more efficient due to its low complexity, without jeopardizing the prediction accuracy.
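The temporal side of the description above can be sketched as two simple features per video (Python, numpy assumed): a worst-case summary of the frame-level spatial quality and its variation along the temporal axis. The fraction of worst frames and the variation measure are assumptions; the paper maps such factors to the final score with machine learning.

```python
import numpy as np

def temporal_features(frame_scores, worst_fraction=0.1):
    """frame_scores: per-frame spatial quality values for one video."""
    scores = np.asarray(frame_scores, dtype=float)
    k = max(1, int(len(scores) * worst_fraction))
    worst_case = np.sort(scores)[:k].mean()      # emphasise the worst frames
    variation = np.std(np.diff(scores))          # frame-to-frame fluctuation
    return np.array([worst_case, variation])

# A trained regressor (e.g. sklearn's SVR) would then map such features,
# together with the spatial ones, to the overall video quality score.
```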
Signal Processing: Image Communication | 2015
Manish Narwaria; Matthieu Perreira Da Silva; Patrick Le Callet
High dynamic range (HDR) signals fundamentally differ from the traditional low dynamic range (LDR) ones in that pixels are related (proportional) to the physical luminance in the scene (i.e., scene-referred). For that reason, the existing LDR video quality measurement methods may not be directly used for assessing quality in HDR videos. To address that, we present an objective HDR video quality measure (HDR-VQM) based on signal pre-processing, transformation, and subsequent frequency-based decomposition. Video quality is then computed based on a spatio-temporal analysis that relates to human eye fixation behavior during video viewing. Consequently, the proposed method does not involve expensive computations related to explicit motion analysis in the HDR video signal, and is therefore computationally tractable. We also verified its prediction performance on a comprehensive, in-house subjective HDR video database with 90 sequences, and it was found to be better than some of the existing methods in terms of correlation with subjective scores (for both across-sequence and per-sequence cases). A software implementation of the proposed scheme is also made publicly available for free download and use.
Highlights:
- The paper presents one of the first objective methods for high dynamic range video quality estimation.
- It is based on the analysis of short-term video segments, taking into account human viewing behavior.
- The method described in the paper would be useful in scenarios where HDR video quality needs to be determined in an HDR video chain study.
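A toy version of the short-term, fixation-aware pooling described above might look as follows in Python (numpy assumed); the segment length and worst-segment fraction are assumptions, and this is not the released HDR-VQM implementation.

```python
import numpy as np

def pool_short_term(error_maps, seg_len=10, worst_fraction=0.2):
    """error_maps: (n_frames, H, W) per-frame perceptual error maps.
    Pool each short-term segment over space and time, then emphasise
    the worst segments to reflect where viewers notice degradations."""
    error_maps = np.asarray(error_maps, dtype=float)
    seg_scores = []
    for s in range(0, len(error_maps), seg_len):
        seg = error_maps[s:s + seg_len]
        seg_scores.append(seg.mean(axis=(1, 2)).mean())   # space, then time
    seg_scores = np.sort(np.asarray(seg_scores))[::-1]    # largest error first
    k = max(1, int(len(seg_scores) * worst_fraction))
    return float(seg_scores[:k].mean())
```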
Pattern Recognition | 2012
Manish Narwaria; Weisi Lin; A. Enis Cetin
Measurement of image quality is of fundamental importance to numerous image and video processing applications. Objective image quality assessment (IQA) is a two-stage process comprising the following: (a) extraction of important information while discarding the redundant, and (b) pooling of the detected features using appropriate weights. These two stages are not easy to tackle due to the complex nature of the human visual system (HVS). In this paper, we first investigate image features based on the two-dimensional (2D) mel-cepstrum for the purpose of IQA. It is shown that these features are effective since they can represent the structural information, which is crucial for IQA. Moreover, they are also beneficial in a reduced-reference scenario where only partial reference image information is used for quality assessment. We address the second issue by exploiting machine learning. In our opinion, the well-established methodology of machine learning/pattern recognition has not been adequately used for IQA so far; we believe that it will be an effective tool for feature pooling since the required weights/parameters can be determined in a more convincing way via training with the ground truth obtained according to subjective scores. This helps to overcome the limitations of the existing pooling methods, which tend to be overly simplistic and lack theoretical justification. Therefore, we propose a new metric by formulating IQA as a pattern recognition problem. Extensive experiments conducted using six publicly available image databases (with a total of 3211 images with diverse distortions) and one video database (with 78 video sequences) demonstrate the effectiveness and efficiency of the proposed metric, in comparison with seven relevant existing metrics.
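The feature stage can be sketched with a cepstrum-style transform in Python (numpy/scipy assumed); the mel-style frequency warping of the paper's 2D mel-cepstrum is omitted here, and the truncation size is arbitrary.

```python
import numpy as np
from scipy.fft import dctn

def cepstrum_2d_features(img, keep=8):
    """Low-order coefficients of a 2-D cepstrum-like transform:
    DCT of the log-magnitude spectrum, truncated to a keep x keep block."""
    spectrum = np.abs(np.fft.fft2(img.astype(float)))
    ceps = dctn(np.log(spectrum + 1e-8), norm="ortho")
    return ceps[:keep, :keep].ravel()

# The difference between reference and distorted feature vectors would then
# be mapped to a quality score by a trained regressor, as argued above.
```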
IEEE Transactions on Audio, Speech, and Language Processing | 2012
Manish Narwaria; Weisi Lin; Ian Vince McLoughlin; Sabu Emmanuel; Liang-Tien Chia
Objective speech quality assessment is a challenging task which aims to emulate human judgment in the complex and time-consuming task of subjective assessment. It is difficult to perform in line with human perception due to the complex and nonlinear nature of the human auditory system. The challenge lies in representing speech signals using appropriate features and subsequently mapping these features into a quality score. This paper proposes a nonintrusive metric for the quality assessment of noise-suppressed speech. The originality of the proposed approach lies primarily in the use of Mel filter bank energies (FBEs) as features and the use of support vector regression (SVR) for feature mapping. We utilize the sensitivity of FBEs to noise in order to obtain an effective representation of speech towards quality assessment. In addition, the use of SVR exploits the advantages of kernels, which allow the regression algorithm to learn complex data patterns via nonlinear transformation for an effective and generalized mapping of features into the quality score. Extensive experiments conducted using two third-party databases with different noise-suppressed speech signals show the effectiveness of the proposed approach.
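A minimal non-intrusive pipeline along the lines described above could be assembled as follows in Python (librosa and scikit-learn assumed); the number of mel bands, the per-band temporal statistics, and the SVR hyperparameters are assumptions rather than the paper's settings.

```python
import numpy as np
import librosa
from sklearn.svm import SVR

def fbe_features(wav_path, n_mels=26):
    """Log mel filter-bank energies, summarised over time so that every
    utterance yields a fixed-length feature vector."""
    y, sr = librosa.load(wav_path, sr=None)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = np.log(mel + 1e-10)
    return np.concatenate([log_mel.mean(axis=1), log_mel.std(axis=1)])

# X = np.vstack([fbe_features(p) for p in wav_paths]); y = subjective_scores
# model = SVR(kernel="rbf", C=10.0).fit(X, y); predicted = model.predict(X_new)
```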