On Addressing the Impact of ISO Speed upon PRNU and Forgery Detection
Yijun Quan, Chang-Tsun Li
Senior Member, IEEE
Abstract: Photo Response Non-Uniformity (PRNU) has been used as a powerful device fingerprint for image forgery detection because image forgeries can be revealed by finding the absence of the PRNU in the manipulated areas. The correlation between an image's noise residual and the device's reference PRNU is often compared with a decision threshold to check for the existence of the PRNU. A PRNU correlation predictor is usually used to determine this decision threshold, assuming the correlation is content-dependent. However, we found that the correlation is not only content-dependent but also depends on the camera sensitivity setting. Camera sensitivity, commonly known as ISO speed, is an important attribute in digital photography. In this work, we show the PRNU correlation's dependency on ISO speed. Due to this dependency, we postulate that a correlation predictor is ISO speed-specific, i.e. reliable correlation predictions can only be made when a correlation predictor is trained with images of similar ISO speeds to the image in question. We report the experiments we conducted to validate the postulate. In the real world, information about the ISO speed may not be available in the metadata to facilitate the implementation of our postulate in the correlation prediction process. We hence propose a method called Content-based Inference of ISO Speeds (CINFISOS) to infer the ISO speed from the image content.
I. INTRODUCTION
When a digital image is used in a forensic investigation or presented as evidence to the court, it is important to authenticate the image to ensure its content is free from manipulation. Thus, image forgery detection draws substantial attention from researchers. Among the different techniques developed for image forgery detection, Photo Response Non-Uniformity (PRNU) based methods have shown their unique strength. PRNU is a sensor pattern noise intrinsically embedded in images. It arises as a result of the manufacturing imperfections of silicon wafers in image sensors. As such, pixels on a sensor have a non-uniform response to the incident light and introduce a unique pattern noise to the image, which can be treated as the fingerprint of a device. Many different algorithms have been proposed for PRNU-based source camera identification [1]–[11] and image forgery detection [12]–[17]. In most of these works, PRNU is utilized by computing the image-wise or block-wise correlations between the source device's reference PRNU and the test image's PRNU. The corresponding image-wise (source camera identification) or pixel-wise (forgery detection) decision can be made by comparing the correlations with a decision threshold.
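The thresholding step described above can be illustrated with a minimal sketch. It assumes the noise residual and the reference PRNU are already available as equally sized arrays; the function names, the synthetic signals and the threshold value of 0.1 are hypothetical choices for illustration and are not taken from the works cited.

```python
import numpy as np

def normalized_correlation(a, b):
    """Normalized correlation between two equally sized arrays."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def prnu_present(residual, reference, threshold):
    """Decide that the PRNU is present when the correlation exceeds the threshold."""
    return normalized_correlation(residual, reference) > threshold

# Synthetic stand-ins (assumptions): a weak copy of the fingerprint buried in
# unit-variance noise, and an unrelated noise block from "another device".
rng = np.random.default_rng(0)
reference = rng.standard_normal((64, 64))
same_device = 0.2 * reference + rng.standard_normal((64, 64))
other_device = rng.standard_normal((64, 64))

THRESHOLD = 0.1  # hypothetical decision threshold
print(prnu_present(same_device, reference, THRESHOLD))
print(prnu_present(other_device, reference, THRESHOLD))
```

A fixed threshold is used here only to keep the sketch short; the rest of the paper is concerned with why the threshold should instead follow a predicted correlation.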
Y. Quan is with the Department of Computer Science, University of Warwick, Coventry, CV4 7AL, United Kingdom. C.-T. Li is with the School of Information Technology, Deakin University, Geelong VIC 3220, Australia.
The PRNU is often estimated in the form of the noise residual of an image. The noise residual can be extracted from an image by simply subtracting the de-noised image from the original image. By nature, PRNU is a weak noise. The existence of camera artifacts and other PRNU-irrelevant noises (e.g. shot noise, thermal noise, etc.) in an image's noise residual can reduce the correlation between the noise residual and the device's reference PRNU. It becomes a non-trivial problem to separate the inter-class correlations (images from different source devices) from the intra-class correlations (images from the same source device). It becomes particularly problematic when the PRNU quality in the noise residual is poor, such that the distributions of these two types of correlations overlap substantially.

Despite the large number of works on better extracting, estimating and enhancing the PRNU [3]–[6], [9], [13], [18]–[22], the overlap between inter- and intra-class correlations cannot be completely avoided. Thus, many researchers have been working on refining the choice of the decision thresholds to better separate the two classes, especially for image forgery detection [13], [15], [17]. The decision thresholds are often set with reference to the expected intra-class correlations predicted by a correlation predictor.

The correlation between an image's noise residual and the device's reference PRNU reflects the strength of the PRNU in the image. As the strength of the PRNU is multiplicative in the pixel intensity, and some highly textured image content or post-processing may damage the PRNU's quality, correlation prediction should be performed in an adaptive manner. A content-dependent correlation predictor is proposed by Chen et al. in [13], which formulates the correlation predictor as a regression model of four image features, namely the intensity, texture, signal-flattening and a texture-intensity combinative term.
This correlation predictor has been adopted by many PRNU-based forgery detection algorithms (e.g. [13]–[17]). Due to the complex nature of the PRNU correlation, the different attempts to re-engineer the correlation predictor over the past decade have not met with much success. Thus, the digital forensic community still relies greatly on the correlation predictor from [13] for PRNU-based forgery detection.

However, over the last decade, we have also witnessed great advancement in the digital camera industry, especially in sensor design. Such advancement also brings new challenges to PRNU-based digital forensics. As a result, we have observed a few issues with the correlation predictor proposed in [13]. An important feature ignored by the correlation predictor is the camera sensitivity setting, which is commonly known as ISO speed. The ISO speed, together with the shutter speed and the aperture size, are the three parameters that control an image's exposure in digital photography. The shutter speed and aperture size control the number of photons arriving at the image sensor during the exposure process, while the ISO speed determines the camera's signal gain. In real life, photographers may face many physical restrictions on the aperture size and shutter speed. Such restrictions require more freedom of choice in ISO speeds to achieve the desired exposure. Thus, many camera manufacturers have been working on improving sensor performance and providing more and higher ISO speeds on digital cameras. Despite these improvements in sensor technology, it is a known fact that high ISO speeds may introduce more noise to an image. As a result, the quality of the PRNU left in the noise residual is reduced when a high ISO speed is used.
A recent work presented in [23] empirically shows that different ISO speeds may affect the performance of PRNU-based source camera identification. With camera manufacturers increasingly supporting broader ranges of ISO speed settings on digital cameras and mobile devices, a proper analysis of the ISO speed's influence on PRNU-based image forensics, especially on the correlations, needs to be carried out.

As this work focuses on the correlation between an image's noise residual and its reference PRNU, for simplicity, we will call it the correlation. The contributions of this work can be summarized as follows:
• We first show analytically and empirically in Section II that the correlation between an image's noise residual and its reference PRNU is not only content-dependent, as previously known, but also dependent on the camera sensitivity setting (i.e. the ISO speed).
• We then validate our postulate in Section III that, due to this ISO speed dependency, reliable predictions of the correlation between an image's noise residual and its reference PRNU can only be made when a correlation predictor is trained on images of similar ISO speeds to the image in question.
• Based on the postulate, we propose an ISO-specific correlation prediction process. Recognizing that in the real world, information about the ISO speed may not be available to facilitate the implementation of our postulate in the correlation prediction process, we propose a method called Content-based Inference of ISO Speeds (CINFISOS) in Section IV to infer the ISO speed from the image content.

In order to carry out this in-depth investigation into how the ISO speed can affect PRNU-based image forensics, we use the purposefully built Warwick Image Forensics Dataset [24]. The images in this dataset are taken with diverse exposure parameter settings. The dataset involves 14 cameras and images of various scenes. In particular, for 20 different scenes for each camera, multiple images of the same scene are shot with varying ISO speeds and exposure times. Thus, these images allow us to conduct studies on the ISO speed's influence on the correlation.

II. ISO SPEED DEPENDENT CORRELATION
In this section, we demonstrate that an image's ISO speed can affect its correlation. As a general noise model can be complicated, to show the existence of such an ISO speed-correlation relationship in a concise manner, we use a special case to prove this relationship analytically and then show it empirically for more general cases. The special case considered is a single color channel of a flat-field RAW image, for which we would expect the same value at every pixel if the image were noise-free.

To conduct PRNU-based pixel-wise forgery detection, the correlation between the noise residual of a block centered at each pixel and the corresponding block of the reference PRNU is calculated. Let $z$ be a noise residual within a block $N_i$ centered at pixel $i$ and $\omega$ be the reference fingerprint within the corresponding block. Assume both $z$ and $\omega$ are standardized, which means they follow the normal distribution $\mathcal{N}(0, 1)$. We can model both signals as the sum of a PRNU component and a PRNU-irrelevant part. At pixel $j \in N_i$:

$$\begin{cases} \omega_j = x_j + \alpha_j \\ z_j = y_j + \beta_j \end{cases} \tag{1}$$

where $x$ and $y$ are the PRNU components of $\omega$ and $z$, while $\alpha$ and $\beta$ are the PRNU-irrelevant noises. For a flat-field image, we can approximate its PRNU component, $x$ in this case, as a normal distribution $\mathcal{N}(0, \sigma_x^2)$, and $\alpha$ conforms to $\mathcal{N}(0, 1 - \sigma_x^2)$. For intra-class pairs, $x$ and $y$ represent the same PRNU. As they may differ in strength, without loss of generality, we can express $y$ as $\mathcal{N}(0, \sigma_y^2)$ with $\sigma_y = \sqrt{\lambda}\,\sigma_x$ and $y = \sqrt{\lambda}\,x$. $\alpha$ and $\beta$ are mutually independent. So when we compute the correlation $\rho_i$ of the block $N_i$, it becomes:

$$\rho_i \sim \mathcal{N}(\mu_i, \Sigma_i) \tag{2}$$

with

$$\begin{cases} \mu_i = \sigma_x \sigma_y = \sqrt{\lambda}\,\sigma_x^2 \\ \Sigma_i = (1 + \lambda \sigma_x^4) / |N_i| \end{cases} \tag{3}$$

From the above expression, we can see that the expected correlation value, $\mu_i$, is proportional to the standard deviation $\sigma_y$ of the PRNU component, $y$, in the image's noise residual, $z$.
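As a quick numerical check of Equations (2) and (3), the following sketch draws synthetic intra-class blocks under the model of Equation (1) and compares the sample mean and variance of the block correlations with the model values. It assumes that the correlation of a block is the mean of the products of the two standardized signals, and that $\beta$ has variance $1 - \lambda\sigma_x^2$ so that $z$ has unit variance; the parameter values are arbitrary illustrative choices.

```python
import numpy as np

def simulate_block_correlations(sigma_x, lam, block_pixels, n_blocks, seed=0):
    """Draw n_blocks intra-class blocks and return their correlations rho_i."""
    rng = np.random.default_rng(seed)
    shape = (n_blocks, block_pixels)
    x = rng.normal(0.0, sigma_x, shape)                        # PRNU part of omega
    y = np.sqrt(lam) * x                                       # same PRNU, scaled, in z
    alpha = rng.normal(0.0, np.sqrt(1.0 - sigma_x**2), shape)  # keeps Var(omega) = 1
    beta = rng.normal(0.0, np.sqrt(1.0 - lam * sigma_x**2), shape)  # keeps Var(z) = 1
    omega = x + alpha
    z = y + beta
    return (omega * z).mean(axis=1)

# Arbitrary illustrative parameters (assumptions, not values from the paper).
sigma_x, lam, block_pixels = 0.3, 0.5, 1024
rho = simulate_block_correlations(sigma_x, lam, block_pixels, n_blocks=1000)

mu_model = np.sqrt(lam) * sigma_x**2                 # Equation (3), mean
var_model = (1.0 + lam * sigma_x**4) / block_pixels  # Equation (3), variance

print("mean:    ", rho.mean(), "model:", mu_model)
print("variance:", rho.var(), "model:", var_model)
```

Because the only term of $\mu_i$ that depends on the image is $\sigma_y$, anything that lowers the share of PRNU in the normalized residual lowers the expected correlation proportionally.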
Based on the Poissonian-Gaussian noise model [25]–[27], we can see that the ISO speed affects this standard deviation $\sigma_y$ and eventually exerts influence on the PRNU correlations.

The relationship between the camera gain, $g$, which is directly determined by the camera's ISO speed, and the noisy raw pixel intensity, $I$, is analyzed in [26]. The raw pixel intensity is proportional to the number of electrons counted on the sensor. Photo-electron conversion is the main source of the electrons collected from the sensor. [26] considers the Poissonian statistics of the incident photon counting process as follows. At pixel $i$, the number of counted electrons is the sum of the electrons generated from photo-electron conversion, $N_{p_i}$, and the dark electrons, $N_{t_i}$, from the thermal noise. It is assumed that the variance of the thermal noise is uniform across the sensor and that all other electronic noises can be modeled as a zero-mean Gaussian noise with variance $s^2$. So the raw pixel intensity, $I_i$, at pixel $i$ can be written as:

$$I_i \sim g \cdot \left[ p + \mathcal{P}(\eta_i N_{p_i} + N_{t_i} - p) + \mathcal{N}(0, s^2) \right] \tag{4}$$

where $\mathcal{P}(\cdot)$ represents the Poisson distribution and $\eta_i$ is the photo-electron conversion rate at pixel $i$. $p$ is a base pedestal parameter introduced in the camera design to provide an offset-from-zero of the pixel's output intensity. For each pixel, as a large number of electrons are counted, the normal approximation of the Poisson distribution can be exploited. Therefore, $I_i$ can be modeled as:

$$I_i \sim \mathcal{N}(\varphi_i,\; g\varphi_i + t) \tag{5}$$

with

$$\begin{cases} t = g^2 s^2 - g^2 p \\ \varphi_i = g \cdot (\eta_i N_{p_i} + N_{t_i}) \end{cases} \tag{6}$$

$\varphi_i$ can be viewed as the expected pixel intensity. Notice that this model from [26] does not yet consider the PRNU. To include the PRNU in this model, we write the photo-electron conversion rate $\eta_i$ as the following expression, which accounts for the non-uniform response of each pixel to the photons:

$$\eta_i = \bar{\eta}\,(1 + k_i) \tag{7}$$

where $\bar{\eta}$ is the average photo-electron conversion rate and $k_i$ is the PRNU factor at pixel $i$.
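The step from Equation (4) to Equations (5) and (6) can be checked numerically. The sketch below samples Equation (4) directly and compares the sample mean and variance with $\varphi$ and $g\varphi + t$. The parameter values are hypothetical and chosen only so that the Poisson rate stays positive; `electrons` stands for the expected count $\eta_i N_{p_i} + N_{t_i}$ at one pixel.

```python
import numpy as np

def simulate_raw_intensity(g, p, s, electrons, n, seed=0):
    """Sample Equation (4): g * [p + Poisson(electrons - p) + Normal(0, s^2)]."""
    rng = np.random.default_rng(seed)
    poisson_part = rng.poisson(electrons - p, n)
    gaussian_part = rng.normal(0.0, s, n)
    return g * (p + poisson_part + gaussian_part)

def model_mean_variance(g, p, s, electrons):
    """Mean and variance predicted by Equations (5) and (6)."""
    phi = g * electrons
    t = g**2 * s**2 - g**2 * p
    return phi, g * phi + t

# Hypothetical parameters for illustration only.
g, p, s, electrons = 0.5, 20.0, 3.0, 1000.0
samples = simulate_raw_intensity(g, p, s, electrons, n=200_000)
phi, variance = model_mean_variance(g, p, s, electrons)

print("mean:    ", samples.mean(), "model:", phi)
print("variance:", samples.var(), "model:", variance)
```

Only the first two moments are compared here; the Gaussian shape itself relies on the large-count approximation mentioned in the text.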
$k$ follows the normal distribution $\mathcal{N}(0, \sigma_k^2)$. As we are considering the case of a flat-field image, we can fix the number of photons, $N_{p_i}$, collected at every pixel. By expanding Equation (5), we have:

$$I_i \sim \mathcal{N}\left((1 + k_i)\varphi - g k_i N_{t_i},\;\; g(1 + k_i)\varphi + t - g^2 k_i N_{t_i}\right) \tag{8}$$

where $\varphi$ denotes the expected intensity of a pixel with the average conversion rate $\bar{\eta}$. In most cases, both the PRNU and the thermal noise are weak, so we can ignore the terms involving $k_i N_{t_i}$. A block $N_i$ often consists of thousands of pixels (e.g. 4096 pixels for a $64 \times 64$ block). Such a large number of pixels allows us to approximate the overall distribution of the pixel values in this block by another normal distribution. By substituting $t$ in Equation (8) with the expression for $t$ in Equation (6), we approximate the distribution of the pixel values in block $N_i$ as:

$$I_{N_i} \mathrel{\dot\sim} \mathcal{N}\left(\varphi,\; \varphi^2\sigma_k^2 + g\varphi + g^2 s^2 - g^2 p\right) \tag{9}$$

We expect the de-noised version of this block to have pixels of uniform intensity, $\varphi$. Thus, we can approximate the variance of the noise residual of this block as:

$$\sigma_{res}^2 \approx \varphi^2\sigma_k^2 + g\varphi + g^2 s^2 - g^2 p \tag{10}$$

The PRNU component in the noise residual has a variance of $\varphi^2\sigma_k^2$. By normalizing the noise residual, the standard deviation of the PRNU component in the normalized noise residual becomes:

$$\sigma_y = \sqrt{\frac{\varphi^2\sigma_k^2}{\varphi^2\sigma_k^2 + g\varphi + g^2 s^2 - g^2 p}} \tag{11}$$

Clearly, $\sigma_y$ is dependent on the camera gain $g$. By substituting this expression back into Equation (3), we can conclude that the correlation $\rho_i$ is affected by the camera gain $g$ and thus by the ISO speed.

Notice that when we introduce the PRNU into the raw pixel intensity model from [26] by considering a different photo-electron conversion rate, $\eta_i$, at each pixel, the noise residual variance model described in Equation (10) becomes a quadratic function of the expected pixel intensity $\varphi$, which can be expressed as:

$$\sigma_{res}^2 = A\varphi^2 + B\varphi + C \tag{12}$$

with

$$\begin{cases} A = \sigma_k^2 \\ B = g \\ C = g^2 s^2 - g^2 p \end{cases} \tag{13}$$

It differs from the linear model in [26].
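To make the gain dependency of Equation (11) concrete, the following sketch evaluates $\sigma_y$ for a fixed expected intensity $\varphi$ and several gains. All numerical values are hypothetical and are expressed in arbitrary units; holding $\varphi$ fixed while $g$ varies mirrors taking shots of the same exposure level at different ISO speeds.

```python
import numpy as np

def prnu_std_in_residual(phi, g, sigma_k, s, p):
    """Equation (11): standard deviation of the PRNU part of the normalized residual."""
    prnu_var = (phi * sigma_k) ** 2
    residual_var = prnu_var + g * phi + g**2 * s**2 - g**2 * p  # Equation (10)
    return np.sqrt(prnu_var / residual_var)

# Hypothetical values: each gain doubles the previous one (one ISO stop apart).
gains = np.array([1.0, 2.0, 4.0, 8.0])
sigma_y = prnu_std_in_residual(phi=1000.0, g=gains, sigma_k=0.01, s=2.0, p=1.0)

for gain, value in zip(gains, sigma_y):
    print(f"g = {gain:4.1f}  sigma_y = {value:.4f}")
```

With these values the PRNU share of the residual shrinks at every step, which is the behavior the derivation predicts when the gain-dependent terms dominate the denominator.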
We will empirically validate Equation (10) to show the physical importance of the PRNU term, $\varphi^2\sigma_k^2$, in the equation despite the approximations made.

We use four cameras for the test, namely a Nikon D7200, a Canon 6D MKII, a Canon 80D, and a Canon M6. Each of the four cameras can generate 14-bit RAW images, which means their pixel values lie in the range $[0, 2^{14} - 1]$. To better show the physical meaning of the coefficients in Equation (10), we standardize the pixel values to the range $[0, 1]$. To validate Equation (10), we plot the variance of the noise in the flat-field images against different pixel values in Fig. 1. We use the cameras to take images of a screen of flat color. Each camera's ISO speed is set to 100. The exposure time is varied to change the pixel intensity for different shots. As the cameras use a Bayer filter as their color filter array (CFA), we subsample the RAW images with a stride of 2 in both the vertical and horizontal directions to make sure the pixels we test are from the same color channel. Despite this set-up, the images are not completely flat due to other camera artifacts, e.g. vignetting. Thus, we use the method from [26] to estimate the expected pixel value and variance for multiple image blocks from each noisy RAW image. Fig. 1 shows the fitting of Equation (12) to the experimental data, computed using ordinary least squares (OLS) [28]. Good agreement between the model and the data can be observed.

In addition to showing the good agreement between the derived model and the real data, we would like to show the physical meaning of the first-order coefficient, $B = g$, in the model as well. For this test, we use RAW images from the same Canon 6D MKII as in the previous test. We repeat the previous experiment four times but set the camera's ISO speed to ISO 200, 400, 800, and 1600, respectively. Again, we fit Equation (12) to the data.
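The OLS fitting of Equation (12) can be sketched as follows. The example uses synthetic (intensity, variance) pairs generated from hypothetical coefficients rather than the measured camera data, and solves the least-squares problem with `numpy.linalg.lstsq`; the function name and all numerical values are illustrative assumptions.

```python
import numpy as np

def fit_noise_variance_model(phi, variance):
    """Fit variance = A*phi^2 + B*phi + C by ordinary least squares."""
    phi = np.asarray(phi, dtype=np.float64)
    variance = np.asarray(variance, dtype=np.float64)
    design = np.column_stack([phi**2, phi, np.ones_like(phi)])
    coeffs, *_ = np.linalg.lstsq(design, variance, rcond=None)
    return tuple(float(c) for c in coeffs)

# Synthetic data from hypothetical coefficients (not the values in Fig. 1).
rng = np.random.default_rng(0)
A_TRUE, B_TRUE, C_TRUE = 5e-5, 1.5e-4, -2e-6
phi = np.linspace(0.05, 0.9, 40)                 # intensities scaled to [0, 1]
variance = A_TRUE * phi**2 + B_TRUE * phi + C_TRUE
variance = variance + rng.normal(0.0, 1e-7, phi.size)  # small measurement noise

a_hat, b_hat, c_hat = fit_noise_variance_model(phi, variance)
print(a_hat, b_hat, c_hat)
```

A fit with the second-order coefficient held fixed can be obtained in the same way by subtracting the known quadratic term from the variances and fitting only the linear and constant columns.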
For the same camera, we can assume that the PRNU factor on the sensor remains the same regardless of the ISO speed, and so does its variance, $\sigma_k^2$. Thus, it is reasonable to fix the second-order coefficient $A = \sigma_k^2$ in Equation (12) to the value estimated from Fig. 1(b) for these fittings. The corresponding fittings generated using OLS are shown in Fig. 2. Once again, good agreement between the fitted curve and the data can be observed. In addition, we show a log-log plot of the estimated first-order coefficients $B$ from Fig. 1(b) and Fig. 2 against the ISO speeds of their corresponding images in Fig. 3. A straight line fitted to the plot has a slope close to 1. As a camera's ISO speed is proportional to its camera gain, $g$, Fig. 3 validates our noise model from Equation (10) with $B = g$. Therefore, it confirms that the correlation model is dependent on ISO speed.

Fig. 1: Plots of the noise variance $\sigma_{res}^2$ (y-axis) against pixel intensity $\varphi$ (x-axis), with a quadratic fitting (red curve) as described by Equations (10) and (12), for RAW flat-field ISO 100 images from four cameras: (a) Nikon D7200, (b) Canon 6D MKII, (c) Canon 80D and (d) Canon M6.

The above conclusions are drawn for the special case of RAW flat-field images. When we take post-processing (e.g. color interpolation and JPEG compression) and the influence of the image content into consideration, the noise model could become rather complicated.
This is because the PRNU is multiplicative in the image content and because image content may propagate into the noise residual due to imperfect denoising. Moreover, higher ISO images are more likely to suffer from strong JPEG compression and imperfect denoising (see supplementary material). Thus, though Equation (10) cannot be translated directly to the general case, all the factors suggest that a higher ISO speed introduces more PRNU-irrelevant noise. As a result, this reduces the proportion of the noise residual that corresponds to the PRNU and eventually reduces the correlation. We use Fig. 4 to show empirically that the correlation is dependent on the image's ISO speed when post-processing operations such as de-mosaicing, gamma correction and JPEG compression are applied to a non-flat RAW image.

Fig. 2: Plots of the noise variance $\sigma_{res}^2$ (y-axis) against pixel intensity $\varphi$ (x-axis) for images with different ISO speeds from a Canon 6D MKII: (a) ISO 200, (b) ISO 400, (c) ISO 800 and (d) ISO 1600. We fit Equation (10) to the plots with the second-order coefficient, $A = \sigma_k^2$, fixed to the value estimated from Fig. 1(b).

Fig. 3: Log-log plot of the estimated first-order coefficient $B$ against the ISO speeds of the images used to estimate $B$, with a fitted straight line.

The images shown in Fig. 4 are from a Canon 6D MKII camera in the Warwick Image Forensics Dataset. All the images shown here are saved in the JPEG format with the camera's default settings. Images of two scenes are taken under different ISO speeds using different exposure times to ensure that every image reaches the same exposure level. Thus, there is nearly no difference in pixel intensity between the images of the same scene.
As the PRNU is a multiplicative signal, having images of the same content and the same pixel intensity allows us to make a fair comparison of the ISO speed's impact on the correlation. The correlation heat maps in Fig. 4 are computed by correlating the noise residuals from the images' green channel with the device's reference green channel PRNU. The reference PRNU is extracted from 50 flat-field images. The correlation at each pixel is computed over a block centered at that pixel. We use yellow to show high correlation regions and blue to show the opposite. As the ISO speed increases, the correlation map clearly shows more regions with low correlation. It can be concluded that, although these images have complex content and have undergone post-processing, their correlation with the reference PRNU is still dependent on the image's ISO speed.

Fig. 4: Images of two different scenes from a Canon 6D MKII from the Warwick Image Forensics Dataset, taken at ISO 100, ISO 400, ISO 1600 and ISO 6400, each shown with its block-wise correlation map. The exposure time for each image is set accordingly so that the images of the same scene have a similar exposure level. The color bars used for the correlation maps are on the right-hand side, next to the ISO 6400 correlation maps.

III. ISO SPEED'S IMPACT UPON CORRELATION PREDICTION
A correlation predictor is an important component of many PRNU-based tampering localization methods. Many of these methods compare the block-wise correlations with a decision threshold set according to the predicted correlation. As a result, the choice of the decision threshold and the performance of these methods can be greatly affected by the accuracy of the correlation prediction. Treating the correlation as content-dependent and without considering the ISO speed, [13] models the correlation as a function of four image features, namely the intensity, texture, signal flattening and a texture-intensity combinative term. However, due to the correlation's dependency on the ISO speed, we postulate that a correlation predictor can only produce accurate predictions for images with the same ISO speed as the training images. We call such a correlation predictor a matching ISO correlation predictor.

Footnote to Table I: ISO 3200 for Panasonic Lumix TZ90 1 and TZ90 2.
To show the ISO speed's influence on the correlation predictor and validate our postulate, we first compare the performance of correlation predictors trained with (a) images with mixed ISO speeds and (b) images with the same ISO speed as the test images. We run the test on 13 cameras from the Warwick Image Forensics Dataset (an Olympus EM10 MKII camera from the dataset does not show the existence of PRNU, so it is not included in this test). 50 flat-field images from each camera are used to extract the cameras' reference fingerprints. For each camera, we select images from three ISO speeds to form three test sets, namely ISO 100, 800, and 6400, apart from the two Panasonic Lumix TZ90 cameras, which do not have ISO 6400. For these two cameras, we test on ISO 3200 images instead. Accordingly, we train three matching ISO correlation predictors, each with 20 images of the corresponding ISO speed, following the method from [13]. The correlations are computed block-wise. To make the comparison, for each camera, we train another correlation predictor with 20 images randomly selected from the 60 images used for the training of the camera's three matching ISO correlation predictors. We call this correlation predictor a mixed ISO correlation predictor. Block-wise correlation predictions are made for the test sets. For each set, we compute the coefficient of determination ($r^2$) and the root mean square error (RMSE) for the matching ISO and mixed ISO correlation predictors, as shown in Table I. We highlight the better performance for each test set, in terms of larger $r^2$ and smaller RMSE, in bold font.
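The comparison protocol above can be sketched on synthetic data. The example below is only an illustration under strong assumptions: the four features are random numbers rather than the features of [13], the predictor is a plain linear least-squares regressor rather than the regressor of [13], and the ISO-dependent scale factors are hypothetical. It shows how $r^2$ and RMSE are computed and why a predictor trained on mixed ISO speeds is biased when the correlation level changes with ISO speed.

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_WEIGHTS = np.array([0.05, -0.03, 0.02, 0.01])  # hypothetical feature weights

def synth_blocks(iso_scale, n):
    """Synthetic block features and correlations; iso_scale shrinks with ISO speed."""
    features = rng.uniform(0.0, 1.0, (n, 4))
    rho = iso_scale * (0.1 + features @ TRUE_WEIGHTS) + rng.normal(0.0, 0.002, n)
    return features, rho

def fit_predictor(features, rho):
    """Linear least-squares regressor with an intercept term."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, rho, rcond=None)
    return weights

def predict(weights, features):
    return np.column_stack([features, np.ones(len(features))]) @ weights

def r2_and_rmse(actual, predicted):
    residual = actual - predicted
    rmse = float(np.sqrt(np.mean(residual**2)))
    r2 = float(1.0 - np.sum(residual**2) / np.sum((actual - actual.mean())**2))
    return r2, rmse

scales = {100: 1.0, 800: 0.6, 6400: 0.3}            # hypothetical ISO scale factors
train = {iso: synth_blocks(scale, 300) for iso, scale in scales.items()}
test_features, test_rho = synth_blocks(scales[6400], 300)

matching = fit_predictor(*train[6400])
mixed = fit_predictor(
    np.vstack([train[iso][0][:100] for iso in scales]),
    np.concatenate([train[iso][1][:100] for iso in scales]),
)

r2_match, rmse_match = r2_and_rmse(test_rho, predict(matching, test_features))
r2_mixed, rmse_mixed = r2_and_rmse(test_rho, predict(mixed, test_features))
print("matching ISO:", r2_match, rmse_match)
print("mixed ISO:   ", r2_mixed, rmse_mixed)
```

Note that $r^2$ computed this way can be negative when the predictions are worse than predicting the test-set mean, which is what a strongly biased predictor produces.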
TABLE I: $r^2$ and RMSE of the correlation predictions made by the matching ISO and mixed ISO correlation predictors for 13 cameras in the Warwick Image Forensics Dataset. Columns: camera; matching ISO correlation predictor ($r^2$ and RMSE at ISO 100, ISO 800 and ISO 6400); mixed ISO correlation predictor ($r^2$ and RMSE at ISO 100, ISO 800 and ISO 6400). [Table body not recoverable.]

TABLE II: $r^2$ and RMSE of the correlation predictions made by the matching and non-matching ISO correlation predictors for 9 cameras from the Dresden Image Dataset. Columns: camera; matching ISO correlation predictor ($r^2$, RMSE); non-matching ISO correlation predictor ($r^2$, RMSE). [Table body not recoverable.]
The matching ISO correlation predictors show superior performance over the mixed ISO correlation predictors for all test sets except for the two Fujifilm XA-10 and the two Panasonic Lumix TZ90 cameras at high ISO speeds. These two camera models are more prone to strong noise at high ISO speeds. As a result, the correlations with their reference PRNU become close to zero regardless of the image features. Due to the relatively large variance of the correlations introduced by the PRNU-irrelevant signal in the noise residuals, neither of the correlation predictors manages to produce a large $r^2$ for the correlation predictions. However, when using the matching ISO correlation predictor for these cameras, a small RMSE can still be observed. This is particularly important, as it means the correlation predictors do not generate predictions that deviate too much from the actual correlation. False positives can therefore be significantly reduced when we apply these correlation predictors for forgery detection.

In addition to the test on the Warwick Image Forensics Dataset, the experiments are extended to 9 cameras from the Dresden Image Dataset [29]. In the Dresden Image Dataset, about 150 images of natural scenes are produced by each camera. However, as the dataset was created without considering the ISO speed as an influential factor, the images' ISO speeds span many different values. For most ISO speeds, the number of images available is not enough for us to train a matching ISO correlation predictor using the method mentioned above and to test it with matching ISO images. So we test the matching ISO correlation predictor on the most popular ISO speed from each camera only, each with 20 test images. For each camera, we train a matching ISO correlation predictor with 20 images of the same ISO speed as the test images, and another 20 images are selected randomly from all the images available for the training of the mixed ISO correlation predictor. The $r^2$ and RMSE of the predictions are shown in Table II.
Again, the superior performance of the matching ISO correlation predictors can be observed in every case. The tests on images from both the Warwick Image Forensics Dataset and the Dresden Image Dataset show that the performance of a correlation predictor may degrade when the impact of ISO speed is completely ignored and the predictor is trained on images of mixed ISO speeds.

Knowing that we cannot ignore the ISO speed in the correlation predictor training process, we would also like to investigate how mismatched ISO speeds of training and testing images affect correlation prediction and subsequent forgery detection. Specifically, we would like to investigate to what extent a correlation predictor trained with images of a particular ISO speed can reliably predict correlations for images taken at other ISO speeds without significantly influencing the forgery detection results. We use Fig. 5 to demonstrate the potential outcomes of forgery detection when the training images' ISO speed is significantly different from the test image's ISO speed.

Fig. 5 shows the forgery detection results for tampered images with ISO speeds 100, 800 and 6400 from a Canon M6. Images of the same scene taken at different ISO speeds are manipulated using Adobe Photoshop. For each image, the tampered region is replaced using Photoshop's content-aware fill function, which leaves the tampered region at a similar noise level to its surrounding regions. We apply the Bayesian-MRF forgery detection algorithm from [15] to the images. For all the images, we set the same parameters for the forgery detection algorithm, i.e. the same interaction parameter $\beta$ and probability prior $p$. The detection results show that the forgery detection algorithm works best in terms of false detections when it is equipped with the matching ISO correlation predictor. We also notice that when we use the ISO 100 correlation predictor for the detection of the ISO 6400 forgery, although the tampered region is correctly identified, there are a lot of false positives in the result. And when the ISO 6400 correlation predictor is used for the detection of the forgery in the ISO 100 image, no part of the authentic region is labeled as tampered, but parts of the tampered region remain undetected.

Fig. 5: Forgery detection results on realistic forgeries from a Canon M6 with images of ISO speed 100, 800 and 6400. Columns: original image, forgery image, forgery map, and the detection results using the ISO 100, ISO 800 and ISO 6400 correlation predictors. Rows: ISO 100, ISO 800 and ISO 6400 forgery images. The images are taken with different exposure times so that they have a similar exposure level. The Bayesian-MRF forgery detection algorithm is applied with the same interaction parameter $\beta$ and probability prior $p$ for all images. True detections are colored green and false detections red. Missed tampered pixels are shown in white.

To explain these observations, we have to consider the two potential outcomes of using images of different ISO speeds for the training of correlation predictors: the predicted correlation being either overestimated or underestimated.

Overestimation of the correlations (when correlation predictions are larger than the actual values) often occurs when we use a correlation predictor trained with images of lower ISO speeds than the test image's ISO speed. As the actual intra-class correlations are smaller than the predicted correlations, the corresponding pixels are more likely to be labeled as tampered, which results in an increased number of false detections, as we have seen in Fig. 5. This is particularly harmful to real-life forensics.
For most forgery detectionalgorithms, the authenticity of a pixel is checked by comparingits actual correlation with a threshold set with reference tothe predicted correlations and expected inter-class correlation,which is expected to be zero. Though the actual algorithmscan be different with more complexity by considering thedistribution of the correlations from both inter- and intra-classas well as neighboring pixels’ correlations, the comparisonof whether the actual correlation sits closer to the predictedcorrelation or inter-class correlation when the correlation isoverestimated can be a good indicator of how likely falsedetections can be introduced by a correlation predictor. Thuswe would like to compare the two values: d = ρ − ¯ ρ inter ,which is the relative position from the inter-class correlation, ¯ ρ inter , to the actual computed correlation ρ and d = ¯ ρ intra − ρ ,which is the relative position of the actual correlation, ρ , to thepredicted intra-class correlation, ¯ ρ intra . Instead of comparing the L distances, we compare these two values to focusmore on the situation when the correlation is overestimated,which causes the actual correlation to be a value between theexpected inter-class correlation and predicted correlation. Weestimate ¯ ρ inter as zero and use the predicted correlation toestimate ¯ ρ predict , and it gives d − d ≈ ρ − ¯ ρ predict . When d − d is negative, it indicates that the correlation has a largechance of being misidentified as an inter-class correlation.Again, use the camera Canon M6 as an example, we showthe percentages of the image blocks with d − d smallerthan in Fig.6 when we use an ISO 100 and 800 correlationpredictors to predict for test images with ISO speed numberof stops above the training images. The plot shows that whenthe test images’ ISO speeds are within the one-stop rangeof the training images’ ISO speed, there is only a relativelysmall portion of blocks (i.e. 
less than ) with $d_1 - d_2$ smaller than 0 for both the ISO 100 and ISO 800 correlation predictors. As the test images' ISO speed deviates further from the training images' ISO speed, the percentage in Fig. 6 rises, indicating that more false detections could be introduced into the forgery detection results. As we approximate $d_1 - d_2$ by $2\rho - \bar{\rho}_{predict}$, the problem becomes universal whenever $2\rho < \bar{\rho}_{predict}$.

Based on the correlation model derived from Equation (11) and on observations from experiments, we found that for image blocks of the same scene from images taken at different ISO speeds, it is generally true that the block-wise correlation in an image taken at ISO speed $G_1$ is about twice as large as the correlation of the corresponding block from an image taken at ISO speed $G_2 = 2G_1$. Thus, we claim that $G_2 = 2G_1$ is a safe choice for the largest ISO speed that a correlation predictor trained with images of ISO speed $G_1$ can reliably predict for. Similar behavior can be observed on other cameras as well, and we show the receiver operating characteristic (ROC) curves for forgery detection in Fig. 7 for further validation.
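The at-risk indicator discussed above can be computed directly from block-wise correlations. The following is a minimal sketch, assuming the actual and predicted correlations of authentic blocks are already available as arrays; the function name `at_risk_fraction` and the toy values are illustrative and do not come from the paper.

```python
import numpy as np

def at_risk_fraction(rho, rho_pred, rho_inter=0.0):
    """Fraction of blocks whose actual correlation sits closer to the
    inter-class correlation than to the predicted intra-class one,
    i.e. blocks with d1 - d2 < 0."""
    rho = np.asarray(rho, dtype=float)
    rho_pred = np.asarray(rho_pred, dtype=float)
    d1 = rho - rho_inter   # signed offset from the inter-class correlation
    d2 = rho_pred - rho    # signed offset to the predicted correlation
    return float(np.mean(d1 - d2 < 0))

# Toy example with made-up correlations for four authentic blocks.
rho = [0.10, 0.04, 0.06, 0.02]
rho_pred = [0.12, 0.10, 0.10, 0.10]
print(at_risk_fraction(rho, rho_pred))  # 0.5
```

With the inter-class correlation taken as zero, the condition reduces to checking whether twice the actual correlation falls below the predicted one, which is why a halving of the correlation per ISO stop marks the edge of the safe range.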
Fig. 6: Percentage of image blocks with $d_1 - d_2$ smaller than 0 against the number of ISO stops by which the test image's ISO speed exceeds the ISO speed of the images used to train the correlation predictor, for a Canon M6 (curves: ISO 100 and ISO 800 correlation predictors). The percentage indicates the portion of authentic image blocks at risk of being misidentified as tampered blocks by forgery detection algorithms.

Each ROC curve in Fig. 7 is plotted by running the Bayesian-Markov random field (MRF) based forgery detection algorithm from [15] on 80 synthetic forgery images at each of the 7 presented ISO speeds. Three correlation predictors, trained with 20 natural images taken at ISO speed 100, 800 and 6400, respectively, are used to predict the correlations for the forged images. We vary the interaction parameter β in the range of [1 , and the probability prior p between [0 , to set different combinations of the parameters for the algorithm. This allows us to generate the enveloping curves of the ROCs, which show the best performance.

The 80 synthetic forged images are generated from 20 full-sized authentic images. From each full-sized image, we select 4 regions of × pixels. We replace the center of each × pixel region with a tampered patch of × pixels. The patch used to replace the center is cropped from the same original image but from a different position, to ensure that it does not carry the same PRNU. We make full use of the Warwick Image Forensics Dataset, which provides images of the same content at different ISO speeds. This allows us to generate the synthetic forged images such that, for each synthetic forged image at one ISO speed, images of the same content exist at the other ISO speeds as well.
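The construction of a synthetic forgery described above can be sketched as follows. This is only an illustration under stated assumptions: the region and patch sizes are not recoverable from the text, so they are left as parameters, and the function name `make_synthetic_forgery` and its arguments are hypothetical.

```python
import numpy as np

def make_synthetic_forgery(image, region_rc, region_size, donor_rc, patch_size):
    """Crop a square region and overwrite its centre with a patch taken
    from a different position of the same image.

    Returns the forged region and a boolean ground-truth mask
    (True marks tampered pixels).
    """
    r0, c0 = region_rc
    dr, dc = donor_rc
    region = image[r0:r0 + region_size, c0:c0 + region_size].copy()
    donor = image[dr:dr + patch_size, dc:dc + patch_size]
    if donor.shape[:2] != (patch_size, patch_size):
        raise ValueError("donor patch falls outside the image")
    off = (region_size - patch_size) // 2
    # A donor from the same sensor location would carry the same PRNU,
    # so the replaced centre would not look tampered.
    if (dr, dc) == (r0 + off, c0 + off):
        raise ValueError("donor patch must come from a different position")
    region[off:off + patch_size, off:off + patch_size] = donor
    mask = np.zeros((region_size, region_size), dtype=bool)
    mask[off:off + patch_size, off:off + patch_size] = True
    return region, mask
```

The returned mask is what a pixel-level evaluation would compare a detection map against.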
Because the forgeries are constructed in this way, Fig. 7 not only allows us to compare the performance of different correlation predictors on forged images at one ISO speed, but also lets us systematically compare the performance of one correlation predictor across different ISO speeds.

We run the test on different cameras from the Warwick Image Forensics Dataset. To save space, we only show the ROC curves of the two most representative cameras, a Canon M6 and a Sigma SdQuattro, in Fig. 7. The Canon M6 represents cameras that generate relatively clean images (with a large peak signal-to-noise ratio (PSNR)) at most ISO speeds, while the Sigma SdQuattro represents cameras whose image quality is highly dependent on the selected ISO speed. The false positive rate (FPR) and true positive rate (TPR) are computed at the pixel level. As real-life tampering localization applications usually require a small FPR, we focus on the range of [0 , . of FPR in the plots.

From Fig. 7, we first notice that for ISO 100, 800 and 6400 forgery images, the matching-ISO correlation predictor works best for both cameras in almost every case. The only exception is the Sigma SdQuattro ISO 6400 forgery images. In this case, although the ISO 6400 correlation predictor makes accurate predictions, as we have seen in Table I, none of the three correlation predictors produces accurate detections. This is because, for high ISO images from this camera, the intra-class correlations are generally very close to zero and hard to separate from inter-class correlations. For such images, PRNU-based methods may not be the best tool for forgery localization. However, the ISO specific correlation predictor can still be helpful in such a scenario, as it will accurately predict that the correlations are close to zero. Thus, users can be warned that PRNU-based methods may not be suitable under such a scenario. 
Overall, the results show the benefit of using a matching-ISO correlation predictor for forgery detection.

For both cameras, we observe that the detection results of the ISO 100 correlation predictors (i.e. predictors trained with images taken at ISO speed 100) are better when the forged image's ISO speed is smaller than 400. While the Canon M6's relatively high PSNR at higher ISO speeds allows the ISO 100 correlation predictor to perform reasonably well for forged images with ISO speeds up to 1600, this is not the case for the Sigma SdQuattro. From ISO 400 and above, the ISO 100 correlation predictor for the Sigma SdQuattro starts to struggle. A similar effect can be observed for the ISO 800 correlation predictors when they are used on images with ISO speeds much higher than 800. This conforms to our argument that a predictor trained with images taken at ISO speed $G_1$ can perform reliably on images taken at an ISO speed $G_2$ that is lower than or equal to $2G_1$. Depending on the camera, some correlation predictors may still perform well when the test image's ISO speed is above this range, but the argument provides a safe range for choosing a correlation predictor's training ISO speed without risking too many false detections.

Fig. 7 also shows the situation in which the correlation predictors underestimate the test image's correlations. Underestimation often occurs when we use a correlation predictor trained with images of a much higher ISO speed than the test image's ISO speed. In the plots, we notice that the ISO 6400 correlation predictors, especially for the Canon M6, have difficulty in correctly localizing the forgery in images with low ISO speeds. This is because a correlation predictor that underestimates the correlations reduces the forgery detection algorithm's capability of correctly identifying tampered pixels. 
Thus, to avoid underestimation while still providing a practical range from which a training ISO speed can be conveniently selected, we empirically set the lower bound of the ISO speed that a correlation predictor can be used for to half of the ISO speed of its training images. From the plots, we see that within this range, the corresponding detection results either outperform those of the other correlation predictors or are on par with the best performance.

Fig. 7: Receiver Operating Characteristic (ROC) curves of tampering localization using the Bayesian-MRF forgery detection method [15] on synthetic forgeries taken at different ISO speeds from a Canon M6 and a Sigma SdQuattro. The legend shows the ISO speeds corresponding to the correlation predictors used to generate the ROC curves (ISO 100, ISO 800 and ISO 6400). Panels for each camera: ISO 100, 200, 400, 800, 1600, 3200 and 6400 forgery images.

Altogether, we conclude that for a test image taken at ISO speed $G_2$, using correlation predictors trained with images of ISO speed $G_1$, which is in the one-stop range of $G_2$ ($G_1 \in [G_2/2, 2G_2]$), can produce forgery detection results without excessive false detections being introduced by the correlation predictor.

IV. ISO SPECIFIC CORRELATION PREDICTION PROCESS
Having observed the ISO speed's impact on correlation prediction, we conclude that reliable correlation predictions should be made in an ISO specific way. Thus, we propose an ISO specific correlation prediction process. To predict correlations for an image of ISO speed $G_2$, we use a correlation predictor trained with images of the same ISO speed $G_2$ or, failing that, of a similar ISO speed $G_1$. An ISO speed $G_1$ is considered similar to $G_2$ if $G_1$ is in the one-stop range of $G_2$. The images used to train the correlation predictor should cover diverse image features, including both bright and dark scenes, highly textured and flat patterns, etc. Covering such a diverse set of image features usually requires a large number of images, so a good correlation predictor should be trained with no fewer than 20 full-sized images. With a relatively large collection of images of good feature diversity taken at an ISO speed similar to that of the test image, the weight for each defined feature can be learned following the process presented in [13].

Fig. 8: A demonstration of the idea behind the proposed ISO speed inferring method. We expect patches from different images to show similar noise characteristics if they have similar content and the same ISO speed. The example shows a patch from an ISO 3200 query image. It shows similar noise characteristics to a patch of similar content from an ISO 3200 training image.

In order to complete the correlation prediction process, we need to know the ISO speed $G_2$ to find images of the same or similar ISO speeds to form the training set. However, as the image in question may have undergone unknown manipulations of either its content or its metadata, the ISO speed information presented in the metadata can be unreliable or even unavailable.
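The similarity rule defined at the start of this section can be written as a small check. The sketch below is illustrative only: the function names are hypothetical, and breaking ties by the distance in stops (log base 2) is an assumption rather than something the paper specifies.

```python
import math

def in_one_stop_range(g_train, g_test):
    """True if the training ISO lies within one stop of the test ISO,
    i.e. in [g_test / 2, 2 * g_test]."""
    return g_test / 2 <= g_train <= 2 * g_test

def pick_training_iso(g_test, available):
    """Pick the closest available training ISO (measured in stops) that
    is within one stop of the test ISO, or None if there is none."""
    candidates = [g for g in available if in_one_stop_range(g, g_test)]
    if not candidates:
        return None
    return min(candidates, key=lambda g: abs(math.log2(g / g_test)))

# Three training sets at ISO 100, 800 and 6400 cover all seven test speeds.
for g in (100, 200, 400, 800, 1600, 3200, 6400):
    print(g, pick_training_iso(g, [100, 800, 6400]))
```

An exact match has a distance of zero stops and is therefore always preferred when it is available.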
In practice, we therefore often face the situation in which an image has an unknown ISO speed and we would like to select images with the closest ISO speed to train a correlation predictor.

It is well known that, for the same camera, the higher the ISO speed, the more noise is introduced into the image content. Thus, it is intuitive to infer an image's ISO speed from the noise characteristics of its content. Based on the Poissonian-Gaussian noise model [25], methods are proposed in [26], [27], [30] to infer the camera gain, g, from a RAW image, which can then be directly related to the camera's ISO speed. Although these methods show promising performance on RAW images, the noise model generally cannot be applied directly to non-RAW image formats, so their performance is suboptimal and they cannot practically be used to infer a JPEG image's ISO speed. For similar reasons, although many noise level estimation algorithms [31]–[34] may work well on RAW images to give clues about an image's ISO speed, JPEG images still pose challenges. As JPEG is one of the most common image formats, being able to identify a JPEG image's ISO speed is a prerequisite for ISO specific correlation prediction.

Though finding an accurate noise model for a JPEG image can be very complex, we can simplify the problem by making the following assumption: image patches from the same camera with similar content and JPEG quality factor should show similar noise characteristics if they are of the same ISO speed, and vice versa, as shown in Fig. 8. 
Thus, we propose a method called Content-based Inference of ISO Speed (CINFISOS) to determine an image's ISO speed by patch-wise noise comparison with patches of similar content from images taken with the same camera at different ISO speeds.

Consider the case in which we have a query image, $Q$, and $t$ candidate training sets, $S = \{S_1, ..., S_t\}$, each consisting of multiple images, with a different ISO speed for each set. We would like to find the set with the ISO speed closest to that of the query image $Q$. The query image is first partitioned into a set of non-overlapping patches, $\mathcal{P} = \{p_i\}$, each of size $d \times d$ pixels. As we would like the patches to best represent the image's noise characteristics, patches with too many dark or saturated pixels should be removed. We consider the patches in the RGB color space. For each pixel $q$ in the $j$th channel of the patch, $p_i^j$, the pixel is considered dark or saturated if its pixel value $I(q)$ is not in the range $[\lambda_l, \lambda_u]$:

$$U(q) = \begin{cases} 1, & \text{if } I(q) < \lambda_l \text{ or } I(q) > \lambda_u \\ 0, & \text{otherwise} \end{cases} \tag{14}$$

The $i$th patch is excluded from $\hat{\mathcal{P}}$ if $\forall j \left( \sum_{q \in p_i^j} U(q) > \lambda_\tau d^2 \right)$, i.e. when the ratio of dark or saturated pixels in every channel of the patch is over a limit $\lambda_\tau$. In addition to removing the dark and saturated pixels, the image's noise characteristics can be better revealed by including only the less textured patches. Thus, we only keep the $m$ least textured patches in $\mathcal{P}_Q$, the set of patches that we believe can best represent the query image's noise characteristics. To evaluate how textured a patch is, we use the texture feature definition from [13] but extend it to patches of three color channels by a simple summation:

$$f_T(p_i) = \sum_{j=1}^{3} \left( \frac{1}{d^2} \sum_{q \in p_i^j} \frac{1}{1 + \mathrm{var}(F(q))} \right) \tag{15}$$

where $F()$ is the high-pass filter and $\mathrm{var}()$ measures the variance of × neighbourhood. The feature $f_T$ is defined in the range [0 , with lower values for more textured patches. We select the $m$ least textured patches from $\hat{\mathcal{P}}$ to form the set of qualified query image patches $\mathcal{P}_Q$:

$$\mathcal{P}_Q = \{ p_i \mid (p_i \in \hat{\mathcal{P}}) \wedge (f_T(p_i) > f_T^{m+1}) \} \tag{16}$$

where $f_T^{m+1}$ is the texture feature of the $(m+1)$th least textured patch from $\hat{\mathcal{P}}$. As $\mathcal{P}_Q$ only contains patches with relatively smooth texture, we can approximate their image content by applying a low-pass filter. We find patches with similar content using a block-matching method similar to [35]. The distance between two patches in each color channel is measured as the Euclidean distance between the discrete cosine transforms (DCT) of the two with hard thresholding applied, and the overall distance between two patches is the sum of the distances in the three color channels:

$$\Delta(p_i, p_k) = \sum_{j=1}^{3} \left\| \Gamma(\mathrm{DCT}(p_i^j), \lambda_{DCT}) - \Gamma(\mathrm{DCT}(p_k^j), \lambda_{DCT}) \right\|_2 \tag{17}$$

where $\Gamma(x, \lambda_{DCT})$ is the hard thresholding operation:

$$\Gamma(x, \lambda_{DCT}) = \begin{cases} x, & \text{if } x > \lambda_{DCT} \\ 0, & \text{otherwise} \end{cases} \tag{18}$$

For each patch $p_i$ in $\mathcal{P}_Q$, the $n$ patches with the smallest distance to $p_i$ are selected from each candidate training set $S_k$. Though the exhaustive search for the patches with the shortest distance is computationally expensive, this step can be easily parallelized. We call this set of selected patches $\mathcal{P}_{ik}$. We define the distance from each patch $p_i$ in $\mathcal{P}_Q$ to each candidate training set $S_k$, which measures the sum of the absolute differences in noise characteristics over all three color channels, as:

$$D(p_i, S_k) = \sum_{j=1}^{3} \left| \mathrm{var}(p_i^j - \tilde{p}_i^j) - \frac{1}{n} \sum_{p_l \in \mathcal{P}_{ik}} \mathrm{var}(p_l^j - \tilde{p}_l^j) \right| \tag{19}$$

where $\tilde{p}_l^j$ is the low-pass filtered version of the $j$th channel of the patch $p_l$:

$$\tilde{p}_l^j = \mathrm{IDCT}(\Gamma(\mathrm{DCT}(p_l^j), \lambda_{DCT})) \tag{20}$$

Each patch $p_i$ in $\mathcal{P}_Q$ votes for the candidate training set $S_k$ with the smallest $D(p_i, S_k)$. The candidate training set with the closest ISO speed to the query image is determined by a simple majority vote over all the patches in $\mathcal{P}_Q$. The ISO speed that receives the most votes is deemed the ISO speed of the query image, and the correlation predictor can be trained with the corresponding images.

V. EXPERIMENTS
A. Inferring ISO speed with CINFISOS
To test the performance of the proposed CINFISOS, we conduct experiments on our Warwick Image Forensics Dataset. In the previous section, we concluded that a correlation predictor trained at ISO speed $G_1$ can make reliable correlation predictions for images taken with ISO speeds in the range $[G_1/2, 2G_1]$. Therefore, to select a correlation predictor trained with images of an ISO speed suitable for the image in question, the inferred ISO speed only needs to be within the one-stop range of the real value. As a result, we only need a few candidate training sets, $S_k$, to cover a broad range of ISO speeds and give reliable correlation predictions. In our experiments, for each camera in the Warwick Image Forensics Dataset, we have three candidate training sets with images of ISO speed 100, 800 and 6400, respectively (with the exception of the two Panasonic Lumix TZ90 cameras, for which we select the ISO 3200 candidate training set instead of the ISO 6400 training set). These three ISO speeds are selected as they cover a broad range of commonly used ISO speeds. Besides, we deliberately avoid overlap between the one-stop ranges that the three candidate ISO speeds can predict for, to simplify the performance evaluation.

To apply CINFISOS, we set the following parameters. The size of each query image patch is × pixels. $m = 50$ is the number of patches in the qualified query set $\mathcal{P}_Q$. $\lambda_{DCT}$ is set to . in a similar manner to how it is set in [35]. For each query patch, we find similar patches from each candidate set. For each camera in the Warwick Image Forensics Dataset apart from the two Panasonic Lumix TZ90 cameras, we have 20 query images at each of the ISO speeds 100, 200, 400, 800, 1600, 3200 and 6400, in the JPEG format. Each candidate training set consists of 20 images.
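The voting step of CINFISOS, built on the noise distance of Equation (19) and the low-pass filter of Equation (20), can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the block matching has already been done (the matched patches are passed in), it uses a hand-built orthonormal DCT in NumPy, it applies the thresholding rule as printed in Equation (18), and all function names and the parameter values in the usage note are hypothetical.

```python
import numpy as np

def dct_matrix(d):
    """Orthonormal DCT-II basis as a d x d matrix."""
    k = np.arange(d)[:, None]
    n = np.arange(d)[None, :]
    c = np.sqrt(2.0 / d) * np.cos(np.pi * (2 * n + 1) * k / (2 * d))
    c[0, :] = np.sqrt(1.0 / d)
    return c

def low_pass(channel, lam, C):
    """Eq. (20): inverse DCT of the hard-thresholded DCT coefficients."""
    coeffs = C @ channel @ C.T
    kept = np.where(coeffs > lam, coeffs, 0.0)  # Eq. (18) as printed
    return C.T @ kept @ C

def noise_var(patch, lam, C):
    """Per-channel variance of the residual of a d x d x 3 patch."""
    return np.array([np.var(patch[..., j] - low_pass(patch[..., j], lam, C))
                     for j in range(3)])

def noise_distance(query_patch, matched, lam, C):
    """Eq. (19): summed absolute difference between the query patch's
    noise variance and the mean noise variance of its matched patches."""
    q = noise_var(query_patch, lam, C)
    m = np.mean([noise_var(p, lam, C) for p in matched], axis=0)
    return float(np.sum(np.abs(q - m)))

def infer_iso(query_patches, matches_per_set, iso_speeds, lam):
    """matches_per_set[k][i] holds the patches matched to query patch i
    in candidate set k. Each query patch votes for the set with the
    smallest distance; the set with the most votes wins."""
    C = dct_matrix(query_patches[0].shape[0])
    votes = np.zeros(len(iso_speeds), dtype=int)
    for i, p in enumerate(query_patches):
        dists = [noise_distance(p, matches_per_set[k][i], lam, C)
                 for k in range(len(iso_speeds))]
        votes[int(np.argmin(dists))] += 1
    return iso_speeds[int(np.argmax(votes))], votes
```

A call such as `infer_iso(query_patches, matches_per_set, [100, 800, 6400], lam=0.3)` returns the winning ISO speed together with the vote counts; the threshold value here is a placeholder, since the value used in the experiments is not recoverable from the text.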
For the two Panasonic Lumix TZ90 cameras, in addition to the fact that ISO 6400 images are unavailable, we also exclude the ISO 1600 query images, as both ISO 800 and ISO 3200 can be considered correct inferences for them.

We run the experiment on a desktop equipped with an Intel Core i7-9700K CPU. With the aforementioned setup, it takes around 130 seconds for CINFISOS to run on a full-resolution query image (e.g. × pixels for an image from a Canon 6D MKII), including the exhaustive search for similar patches among 60 full-resolution training images. The patch-level accuracy, which measures the percentage of patches voting correctly for the inferred ISO speed, is reported in Table III. We notice that the accuracy varies greatly between cameras and ISO speeds, but it is above 0.5 in every case. This means that, overall, every single patch is more likely than not to vote correctly. Given this patch-level accuracy, an image-level accuracy of about 99.5% is observed, with only 9 out of 1880 test images wrongly inferred.

B. Forgery detection with ISO specific correlation prediction
The high accuracy of CINFISOS in identifying the ISO speed of an image within its one-stop range allows us to conduct the proposed ISO specific correlation prediction process even when we do not know the test image's ISO speed. Thus, we would like to test the performance of the proposed ISO specific correlation prediction process in terms of forgery detection.

We apply the Bayesian-MRF forgery detection algorithm [15] to the synthetic forgery images from two cameras, a Canon M6 and a Sigma SdQuattro. The images are the same as the ones used in Section III. There are 560 synthetic images from each camera, equally distributed over 7 different ISO speeds (namely ISO speed 100, 200, 400, 800, 1600, 3200 and 6400). We carry out the proposed ISO specific correlation prediction process in two ways: (a) using the proposed CINFISOS to determine which correlation predictor is suitable for the test image, and (b) with an oracle correlation predictor. With the aforementioned one-stop range setting, we only need three correlation predictors, namely an ISO 100, an ISO 800 and an ISO 6400 correlation predictor, to cover the whole range of ISO speeds we need to predict for with CINFISOS. We apply CINFISOS to each synthetic image to determine which of the three correlation predictors should be used to produce the predictions for that image. The oracle correlation predictor uses a matching-ISO correlation predictor for each image according to its ISO speed information. To realize the oracle correlation predictor, we trained 7 different correlation predictors for the 7 different ISO speeds presented in this test, each with 20 natural images.

We compare the forgery detection results of our proposed ISO specific correlation prediction process against the results obtained with a mixed ISO correlation predictor and an ISO 100 correlation predictor. 
Mixed ISO correlation predictors represent the situation in which we select training images randomly without considering the images' ISO speeds. Thus, the mixed ISO correlation predictors' performance can be viewed as the baseline for the forgery detection results when we disregard the impact of ISO speed on correlation prediction completely. For each camera, the mixed ISO correlation predictor is trained with 20 training images randomly selected from the 60 images of three different ISO speeds. The ISO 100 correlation predictor is the same as the one used in our proposed ISO specific correlation prediction process. We vary the interaction parameter β and the probability prior p for the Bayesian-MRF forgery detection method to generate the enveloping ROC curves. Each data point on the curve is generated by summing the detection results of the 560 synthetic images from each camera. The ROC curves for the detection results are shown in Fig. 9. We focus on the low false positive rate range of [0 , . .

TABLE III: Patch level accuracy of the proposed ISO speed inferring method on images from Warwick Image Forensics Dataset

Camera                  ISO 100  ISO 200  ISO 400  ISO 800  ISO 1600  ISO 3200  ISO 6400
Canon 6D                0.954    0.843    0.619    0.740    0.637     0.755     0.806
Canon 6D MKII           0.999    0.952    0.593    0.795    0.764     0.723     0.744
Canon 80D               0.990    0.893    0.789    0.882    0.851     0.879     0.997
Canon M6                1.000    0.937    0.682    0.869    0.836     0.911     0.983
Fujifilm XA 10 1        0.725    0.574    0.543    0.666    0.612     0.704     0.668
Fujifilm XA 10 2        0.699    0.602    0.587    0.673    0.578     0.625     0.654
Nikon D7200             0.998    0.891    0.734    0.859    0.800     0.860     0.918
Olympus EM10 MKII       0.989    0.928    0.631    0.694    0.712     0.697     0.731
Panasonic Lumix TZ90 1  0.961    0.802    0.554    0.581    N.A.      0.720     N.A.
Panasonic Lumix TZ90 2  0.908    0.769    0.580    0.576    N.A.      0.708     N.A.
Sigma SdQuattro         0.881    0.825    0.512    0.716    0.565     0.601     0.642
Sony Alpha68            0.948    0.883    0.714    0.850    0.748     0.863     0.993
Sony RX100 1            0.913    0.856    0.741    0.869    0.677     0.549     0.648
Sony RX100 2            0.998    0.915    0.791    0.837    0.625     0.610     0.763

Fig. 9: The ROC curves depicting the performance of the detector with various correlation predictors, tested on 560 synthetic forgery images of 7 different ISO speeds for two cameras: (a) a Canon M6 and (b) a Sigma SdQuattro. Forgery detection is carried out with the Bayesian-MRF forgery detection algorithm from [15], with correlation predictions generated from (i) a mixed ISO correlation predictor, (ii) an ISO 100 correlation predictor, (iii) the proposed ISO specific correlation prediction process with CINFISOS and (iv) the proposed ISO specific correlation prediction process with an oracle correlation predictor. (Legend: Mixed, ISO 100, CINFISOS, Oracle.)

Unsurprisingly, the oracle correlation predictor gives the best detection results of all the predictors for both cameras. However, the detection results based on the proposed CINFISOS are comparable to those of the oracle correlation predictor. This shows the effectiveness of the proposed CINFISOS and validates that the one-stop range for ISO speed prediction is a feasible choice that does not significantly sacrifice forgery detection performance. In comparison, the mixed ISO and ISO 100 correlation predictors perform worse. 
Although we noticed in Fig. 7 that the ISO 100 correlation predictor can predict well for images with ISO speeds up to 1600, its poor performance on images of higher ISO speeds is evident. Thus, it is not a good choice to use a correlation predictor trained at a low ISO speed for all images. To conclude, the proposed ISO specific correlation prediction process shows superior performance in terms of forgery detection.

VI. CONCLUSION
In this work, we carried out both analytical and empirical studies of the impact of different camera sensitivity (ISO speed) settings on PRNU-based digital forensics. First, we showed how the correlation between an image's noise residual and the device's reference PRNU depends on the image's ISO speed. With this dependency in mind, we showed empirically how mismatched ISO speeds may influence the correlation prediction process. We therefore proposed an ISO specific correlation prediction process to be used in PRNU-based forgery detection. To address the problem that information about the ISO speed of an image may not be available, a method called Content-based Inference of ISO Speed (CINFISOS) is proposed to infer the image's ISO speed from its content. Clear improvements in correlation predictions and forgery detection results are observed when our proposed ISO specific correlation prediction process is applied with CINFISOS. By pointing out the influence of the camera sensitivity setting on PRNU-based forensic methods, the solutions provided in this work can make forensic analysis more reliable and trustworthy.

ACKNOWLEDGMENT
This work is supported by the EU Horizon 2020 Marie Sklodowska-Curie Actions through the project entitled Computer Vision Enabled Multimedia Forensics and People Identification (Project No. 690907, Acronym: IDENTITY).

REFERENCES

[1] J. Lukáš, J. Fridrich, and M. Goljan, “Digital Camera Identification from Sensor Pattern Noise,” IEEE Transactions on Information Forensics and Security, vol. 1, no. 2, pp. 205–214, 2006.
[2] Y. Sutcu, S. Bayram, H. T. Sencar, and N. Memon, “Improvements on Sensor Noise Based Source Camera Identification,” in , Jul 2007, pp. 24–27.
[3] C.-T. Li, “Source Camera Identification Using Enhanced Sensor Pattern Noise,” IEEE Transactions on Information Forensics and Security, vol. 5, no. 2, pp. 280–287, 2010.
[4] W. van Houten and Z. Geradts, “Using Anisotropic Diffusion for Efficient Extraction of Sensor Noise in Camera Identification,” Journal of Forensic Sciences, vol. 57, no. 2, pp. 521–527, 2012.
[5] A. Cooper, “Improved Photo Response Non-Uniformity (PRNU) based Source Camera Identification,” Forensic Science International, vol. 226, no. 1-3, pp. 132–141, 2013.
[6] F. Gisolf, A. Malgoezar, T. Baar, and Z. Geradts, “Improving Source Camera Identification Using a Simplified Total Variation based Noise Removal Algorithm,” Digital Investigation, vol. 10, no. 3, pp. 207–214, 2013.
[7] X. Kang, J. Chen, K. Lin, and A. Peng, “A Context-Adaptive SPN Predictor for Trustworthy Source Camera Identification,” EURASIP Journal on Image and Video Processing, vol. 2014, no. 1, p. 19, 2014.
[8] M. Al-Ani, F. Khelifi, A. Lawgaly, and A. Bouridane, “A Novel Image Filtering Approach for Sensor Fingerprint Estimation in Source Camera Identification,” in . IEEE, 2015, pp. 1–5.
[9] H. Zeng and X. Kang, “Fast Source Camera Identification Using Content Adaptive Guided Image Filter,” Journal of Forensic Sciences, vol. 61, no. 2, pp. 520–526, 2016.
[10] A. Lawgaly and F. Khelifi, “Sensor Pattern Noise Estimation Based On Improved Locally Adaptive DCT Filtering And Weighted Averaging For Source Camera Identification and Verification,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 2, pp. 392–404, Feb 2017.
[11] R. Li, C.-T. Li, and Y. Guan, “Inference of a Compact Representation of Sensor Fingerprint for Source Camera Identification,” Pattern Recognition, vol. 74, no. 2, pp. 556–567, Jul 2017.
[12] J. Lukáš, J. Fridrich, and M. Goljan, “Detecting Digital Image Forgeries Using Sensor Pattern Noise,” in Electronic Imaging 2006. International Society for Optics and Photonics, 2006, pp. 60720Y–60720Y.
[13] M. Chen, J. Fridrich, M. Goljan, and J. Lukáš, “Determining Image Origin and Integrity using Sensor Noise,” IEEE Transactions on Information Forensics and Security, vol. 3, no. 1, pp. 74–90, 2008.
[14] G. Chierchia, S. Parrilli, G. Poggi, L. Verdoliva, and C. Sansone, “PRNU-based Detection of Small-size Image Forgeries,” in , Jul 2011, pp. 1–6.
[15] G. Chierchia, G. Poggi, C. Sansone, and L. Verdoliva, “A Bayesian-MRF Approach for PRNU-Based Image Forgery Detection,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 4, pp. 554–567, Apr 2014.
[16] G. Chierchia, D. Cozzolino, G. Poggi, C. Sansone, and L. Verdoliva, “Guided Filtering for PRNU-Based Localization of Small-Size Image Forgeries,” in , May 2014, pp. 6231–6235.
[17] P. Korus and J. Huang, “Multi-Scale Analysis Strategies in PRNU-Based Tampering Localization,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 4, pp. 809–824, Apr 2017.
[18] A. Lawgaly, F. Khelifi, and A. Bouridane, “Weighted Averaging-Based Sensor Pattern Noise Estimation for Source Camera Identification,” in , Oct 2014, pp. 5357–5361.
[19] X. Kang, Y. Li, Z. Qu, and J. Huang, “Enhancing Source Camera Identification Performance With a Camera Reference Phase Sensor Pattern Noise,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 393–402, Apr 2012.
[20] X. Lin and C.-T. Li, “Preprocessing Reference Sensor Pattern Noise via Spectrum Equalization,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 1, pp. 126–140, Jan 2016.
[21] C.-T. Li and Y. Li, “Color-Decoupled Photo Response Non-Uniformity for Digital Image Forensics,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 2, pp. 260–271, Feb 2012.
[22] Z. Qu, X. Kang, J. Huang, and Y. Li, “Forensic sensor pattern noise extraction from large image data set,” in , May 2013, pp. 3023–3027.
[23] L. Lin, W. Chen, Y. Wang, S. Reinder, Y. Guan, J. Newman, and M. Wu, “The Impact of Exposure Settings in Digital Image Forensics,” in , Oct 2018, pp. 540–544.
[24] Y. Quan, C.-T. Li, Y. Zhou, and L. Li, “Warwick image forensics dataset for device fingerprinting in multimedia forensics,” in Proceedings of IEEE International conference on Multimedia and Expo, Jul 2020.
[25] G. E. Healey and R. Kondepudy, “Radiometric CCD Camera Calibration and Noise Estimation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 3, pp. 267–276, Mar 1994.
[26] A. Foi, M. Trimeche, V. Katkovnik, and K. Egiazarian, “Practical Poissonian-Gaussian Noise Modeling and Fitting for Single-Image Raw-Data,” IEEE Transactions on Image Processing, vol. 17, no. 10, pp. 1737–1754, Oct 2008.
[27] A. Foi, “Clipped Noisy Images: Heteroskedastic Modeling and Practical Denoising,” Signal Processing, vol. 89, no. 12, pp. 2609–2629, 2009.
[28] K. F. Riley, M. P. Hobson, and S. J. Bence, Mathematical Methods for Physics and Engineering: A Comprehensive Guide. Cambridge University Press, 2006.
[29] T. Gloe and R. Böhme, “The ‘Dresden Image Database’ for Benchmarking Digital Image Forensics,” in Proceedings of the 25th Symposium On Applied Computing (ACM SAC 2010), vol. 2, 2010, pp. 1585–1591.
[30] T. H. Thai, R. Cogranne, and F. Retraint, “Camera Model Identification Based on the Heteroscedastic Noise Model,” IEEE Transactions on Image Processing, vol. 23, no. 1, pp. 250–263, 2014.
[31] C. Liu, W. T. Freeman, R. Szeliski, and S. B. Kang, “Noise estimation from a single image,” in , vol. 1. IEEE, 2006, pp. 901–908.
[32] X. Liu, M. Tanaka, and M. Okutomi, “Single-Image Noise Level Estimation for Blind Denoising,” IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5226–5237, Dec 2013.
[33] D. Zoran and Y. Weiss, “Scale invariance and noise in natural images,” in , Sep. 2009, pp. 2209–2216.
[34] S. Nam, Y. Hwang, Y. Matsushita, and S. J. Kim, “A Holistic Approach to Cross-Channel Image Noise Modeling and Its Application to Image Denoising,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 1683–1691.
[35] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image Denoising by Sparse 3-D Transform-domain Collaborative Filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.

Yijun Quan received the BA degree in Natural Science from Trinity College, University of Cambridge, UK, in 2015, and the MSc degree in Computer Science from University of Warwick, UK, in 2016. He is currently a Ph.D. Candidate at University of Warwick. He was a visiting scholar to South China University of Technology (SCUT) under Marie Sklodowska-Curie fellowship in 2018. His research interests include multimedia forensics and security, machine learning, image processing and computational photography.
Chang-Tsun Li received the BSc degree in electrical engineering from National Defence University (NDU), Taiwan, in 1987, the MSc degree in computer science from U.S. Naval Postgraduate School, USA, in 1992, and the PhD degree in computer science from the University of Warwick, UK, in 1998. He was an associate professor of the Department of Electrical Engineering at NDU from 1998 to 2002 and a visiting professor of the Department of Computer Science at U.S. Naval Postgraduate School in the second half of 2001. He was a professor of the Department of Computer Science at the University of Warwick (UK) until January 2017 and a professor of Charles Sturt University (Australia) from January 2017 to February 2019. He is currently a professor of the School of Information Technology at Deakin University, Australia. His research interests include multimedia forensics and security, biometrics, data mining, machine learning, data analytics, computer vision, image processing, pattern recognition, bioinformatics, and content-based image retrieval. The outcomes of his multimedia forensics research have been translated into award-winning commercial products protected by a series of international patents and have been used by a number of police forces and courts of law around the world. He is currently the EURASIP Journal of Image and Video Processing (JIVP) and Associate Editor of IET Biometrics. He has been involved in the organisation of many international conferences and workshops and has also served as a member of the international program committees for several international conferences. He also actively contributes keynote speeches and talks at various international events.

Supplementary Material for ‘On Addressing the Impact of ISO Speed upon PRNU and Forgery Detection’
Yijun Quan, Chang-Tsun Li, Senior Member, IEEE

A CASE STUDY ON JPEG COMPRESSION’S IMPACT ON IMAGES OF DIFFERENT ISO SPEEDS

As different ISO speeds introduce different levels of noise to images, they also affect the reference PRNU extraction process. A typical method of extracting a device’s reference PRNU is to average the noise residuals from flat-field images (images of flat content, e.g. pure color boards). Using flat-field images largely avoids the distortion due to image content (e.g. texture and edges). For a flat-field RAW image, we can approximate its noise model according to Equation (1), which means its noise residual consists of both a PRNU part and a PRNU-irrelevant part. By averaging the noise residuals of multiple flat-field images from the same device with PRNU of similar quality, the PRNU-irrelevant part is attenuated and thus a better approximation of the PRNU can be obtained.

In real-life forensics, the images available for reference extraction may not be RAW images but images in some compressed format, e.g. JPEG; similar behavior is expected for them. Also, due to the influence of ISO speed, it is reasonable to expect that, with the same number of images, the reference PRNU extracted from lower ISO speed images would be of better quality than the one extracted from images with higher ISO speeds. To verify this, we test the PRNU extracted from varying numbers (from 1 to 50) of flat-field images with different ISO speeds (100, 800 and 6400) from three cameras, namely a Canon 6D MKII, a Nikon D7200 and a Sigma SdQuattro. The images used in this test are JPEG images of a flat color panel, straight out of the three cameras. To ensure a fair comparison between different ISO speeds, we set the JPEG compression quality to the best available setting on each camera for every ISO speed. The quality of the extracted PRNUs is examined by computing the correlation between them and another reference PRNU of the same camera, which in our case is computed from 100 images with ISO speed of 100.
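
The averaging and quality check described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the pipeline used in the experiments: the 3x3 mean filter `box_blur` is a crude stand-in for a real denoiser, the images are synthetic flat fields produced by the hypothetical helper `simulate_flat_fields`, a larger `noise_std` merely mimics a higher ISO speed, and JPEG compression is not modeled at all. All function names are hypothetical.

```python
import numpy as np


def box_blur(img, radius=1):
    """Crude stand-in denoiser (assumption): mean filter over a square window."""
    padded = np.pad(img, radius, mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / size**2


def noise_residual(img):
    """Noise residual = image minus its denoised version."""
    return img - box_blur(img)


def reference_prnu(images):
    """Average the noise residuals of several flat-field images."""
    return np.mean([noise_residual(im) for im in images], axis=0)


def correlation(a, b):
    """Normalized correlation between two zero-meaned 2-D patterns."""
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def simulate_flat_fields(prnu, n, noise_std, rng, level=128.0):
    """Synthetic flat-field images (assumption): level * (1 + PRNU) + random noise."""
    return [level * (1.0 + prnu) + rng.normal(0.0, noise_std, prnu.shape)
            for _ in range(n)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prnu = rng.normal(0.0, 0.01, (64, 64))  # hypothetical ground-truth PRNU
    # "Sample PRNU": 100 low-noise images, mirroring the 100 ISO 100 images.
    sample = reference_prnu(simulate_flat_fields(prnu, 100, 2.0, rng))
    for noise_std in (2.0, 16.0):           # low vs. high simulated noise level
        for n in (1, 5, 50):
            ref = reference_prnu(simulate_flat_fields(prnu, n, noise_std, rng))
            print(noise_std, n, correlation(ref, sample))
```

Because compression is absent from this simulation, it can only illustrate that the correlation grows with the number of averaged images and shrinks with the noise level; it cannot reproduce any compression-related effect on real JPEG images.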
We call the PRNUs generated from the one hundred ISO 100 images the sample PRNUs. Although, in theory, the three sample PRNUs may still differ slightly from the ground truth, the correlations between them and the PRNUs extracted from the test images are still representative enough to show the difference in quality between PRNUs generated from images of different ISO speeds, as we can see from Fig. 1. From the figures, we can confirm that lower ISO speeds generate PRNU of better quality. For each ISO speed, the correlation increases as the number of images used to extract the reference PRNU increases.

Furthermore, for different ISO speeds from the same camera, the correlation curves shown in Fig. 1 tend to converge to different values. This means that no matter how many images are used to extract the reference PRNU, the ones from images of higher ISO speeds can be of worse quality than the ones from a sufficient number of images of lower ISO speeds. Such a phenomenon suggests an incompatibility between the PRNUs extracted from higher ISO speed images and the sample PRNU.

We found that this is mainly because the PRNU signal remaining in higher ISO images is more vulnerable to low-pass filtering such as JPEG compression, even though the images are saved under the same JPEG compression quality factor. As the higher ISO flat-field images are noisier, they contain more high-frequency signal. Thus, when a low-pass filter is applied to reduce the amount of high-frequency signal remaining in the images, the PRNU signal, which is also a high-frequency signal, is more likely to be impaired.

In Fig. 2, we use the auto-correlations of the flat-field images’ noise residuals to demonstrate such an effect. For random noise, as the value of each pixel is independent, the auto-correlation should have a single peak at (0, 0) and be zero elsewhere.
However, due to post-processing, especially JPEG compression, the auto-correlation spreads over multiple pixels, and the extent of this spreading can indicate how severely the post-processing distorts the extracted noise residual. From Fig. 2, for each of the three cameras, we clearly see the trend that as the ISO speed increases, the spreading reaches further. Furthermore, the symmetric spreading shapes observed in the plots for the ISO 6400 images are more likely to come from JPEG compression, which compresses signals of a certain frequency in the images. Color interpolation (also known as demosaicking) at each pixel involves the colors of the pixels within a neighborhood, which means the color at each pixel does “spread” across a certain neighborhood. Interestingly, unlike the Bayer filter used on the sensors in the Canon 6D MKII and Nikon D7200, the Foveon X3 sensor in the Sigma SdQuattro has a stacked color filtering array, which does not require color interpolation. Yet the spreading of the auto-correlation can still be observed with the Sigma SdQuattro. This evidence further supports the view that the wider spreading of the auto-correlation is more likely to be caused by JPEG compression than by color interpolation.

Fig. 1. The plots show how the number of JPEG images used for reference PRNU extraction affects the quality of the extracted reference PRNU for three cameras: (a) Canon 6D MKII, (b) Nikon D7200 and (c) Sigma SdQuattro. We use the correlation between the extracted reference PRNU and another reference PRNU extracted from 100 flat-field images of ISO speed 100 to indicate the quality of the extracted reference PRNU.
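
The auto-correlation argument in the text can be illustrated with a minimal sketch. It is not the procedure used to produce Fig. 2: the functions `autocorrelation`, `spread_extent` and `circular_box_blur` are hypothetical names, the auto-correlation is computed circularly via the FFT, and a 3x3 circular mean filter is used as a rough low-pass stand-in for post-processing (real JPEG compression is not a simple blur). The threshold of 0.05 is likewise just an illustrative choice.

```python
import numpy as np


def autocorrelation(residual, max_shift=10):
    """Normalized circular auto-correlation for shifts in [-max_shift, max_shift]."""
    r = residual - residual.mean()
    spectrum = np.fft.fft2(r)
    acf = np.fft.ifft2(spectrum * np.conj(spectrum)).real
    acf /= acf[0, 0]                      # zero-shift value becomes 1
    acf = np.fft.fftshift(acf)            # move shift (0, 0) to the array centre
    cy, cx = acf.shape[0] // 2, acf.shape[1] // 2
    return acf[cy - max_shift:cy + max_shift + 1,
               cx - max_shift:cx + max_shift + 1]


def spread_extent(acf, threshold=0.05):
    """Largest shift (Chebyshev distance from the centre) whose |value| > threshold."""
    c = acf.shape[0] // 2
    ys, xs = np.nonzero(np.abs(acf) > threshold)
    dist = np.maximum(np.abs(ys - c), np.abs(xs - c))
    return int(dist.max())                # the centre itself always exceeds it


def circular_box_blur(img):
    """3x3 circular mean filter: a rough low-pass stand-in (assumption)."""
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(img, (dy, dx), axis=(0, 1))
    return out / 9.0


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    white = rng.normal(0.0, 1.0, (256, 256))   # independent per-pixel noise
    print("white noise extent:", spread_extent(autocorrelation(white)))
    print("low-passed extent:",
          spread_extent(autocorrelation(circular_box_blur(white))))
```

For independent noise the auto-correlation stays concentrated at the zero shift, whereas any low-pass operation makes neighboring pixels correlated and widens the region with non-negligible values, which is the kind of spreading the text associates with post-processing.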
[Figure 2: auto-correlation plots arranged by camera (Canon 6D MKII, Nikon D7200, Sigma SdQuattro) and ISO speed (ISO 100, ISO 800, ISO 6400); both axes of each plot range from -10 to 10.]
Fig. 2. The auto-correlation of noise residuals from images of different ISO speeds from 3 cameras. Rather than a single peak at (0, 0), the auto-correlations have values spread over multiple pixels. As the figure focuses on how far the spreading of the auto-correlation reaches, the color bar focuses on the range of [0, 0.05]. Values bigger than the upper limit 0.05