Spectral video construction from RGB video: Application to Image Guided Neurosurgery
Md. Abul Hasnat · Jussi Parkkinen · Markku Hauta-Kasari
Abstract
Spectral imaging has received enormous interest in the field of medical imaging modalities. It provides a powerful tool for the non-invasive analysis of different organs and tissues. Therefore, a significant amount of research has been conducted to explore the possibility of using spectral imaging in biomedical applications. Observing spectral image information in real time during surgery and monitoring the temporal changes in the organs and tissues is a demanding task. Available spectral imaging devices are not sufficient to accomplish this task with an acceptable spatial and spectral resolution. A solution to this problem is to estimate the spectral video from RGB video and perform visualization with the most prominent spectral bands. In this research, we propose a framework to generate neurosurgery spectral video from RGB video. A spectral estimation technique is applied to each RGB video frame. The RGB video is captured using a digital camera connected to an operational microscope dedicated to neurosurgery. A database of neurosurgery spectral images is used to collect training data and evaluate the estimation accuracy. A searching technique is used to identify the best training set. Five different spectrum estimation techniques are tested to identify the best method. Although this framework is established for neurosurgery spectral video generation, the methodology outlined here would also be applicable to other similar research.
Keywords
Spectral Imaging · Spectral Estimation · Image Guided Surgery.
Md. Abul Hasnat, E-mail: [email protected]
LIRIS, École Centrale de Lyon, Ecully, France.
Jussi Parkkinen, E-mail: jussi.parkkinen@uef.fi
Markku Hauta-Kasari, E-mail: markku.hauta-kasari@uef.fi
School of Computing, University of Eastern Finland, Joensuu, Finland.
The objective of spectral imaging is to provide an accurate representation of the color information of an object being imaged. Spectral imaging is an imaging technique where each pixel contains a full wavelength spectrum of the corresponding point in the image scene. A spectral image allows reproduction of a color image under various illuminations, analysis of images of individual spectral bands, and the study of computational color vision and color constancy [29]. An individual spectral band image provides better discrimination of objects of interest and supports segmentation, recognition and classification of objects. Spectral imaging technology has been used in various fields of application [12] (technology: automatic inspection, remote sensing; industry: paper, ink, painting, printing, wood, textile, agriculture, food, chemical and medicine; service: medical). Ongoing research exploring its use in several other areas reveals the importance of spectral imaging.
The task of spectral imaging can be accomplished in various ways. One approach is to use different narrowband filters with a grayscale image capturing device. A filter wheel consisting of narrowband filters (interference filter systems [22]), an acousto-optic tunable filter (AOTF) [4, 24], or a liquid crystal tunable filter (LCTF) [24] is often employed in this kind of imaging system. In these systems an image is therefore captured by scanning in the spectral domain. Another approach is based on spatial scanning, where the spectral information is measured line by line [14]. The main differences between the available spectral imaging systems are in accuracy, speed and scanning method. The spectral and spatial resolutions of line scanning cameras (PGP based devices [14]) are usually very high and one line spectral measurement can be done rapidly. However, they are quite slow at capturing the whole image in comparison to the other imaging systems. Among the filter based devices, LCTF [24] and
AOTF [4, 24] systems are faster than interference filter systems.
An important drawback of the available spectral imaging devices is the speed of image acquisition. Despite the invention of many recent spectral imaging devices for faster acquisition, a near real time spectral image capturing device with reasonable spatial resolution is yet to become available. In order to overcome this limitation, several alternative approaches have been proposed [35, 34, 8, 26]. One approach is to produce a custom designed spectral imager for an identified spectral range of interest [35]. A similar approach [34] is to add spectral bands to the three RGB bands. A different approach is to recover spectral reflectance in a scene with a conventional RGB camera using multiplexed illumination [26]. Besides these techniques, there are several computational approaches that obtain spectral reflectance from a conventional 3-band camera through spectrum estimation [29, 11, 31, 13, 5, 21, 6, 16, 15, 23, 33, 30, 28]. These approaches estimate spectral reflectance using various mathematical methods with the aid of prior knowledge. The aim of these estimation approaches is to reduce the cost and complexity of the image acquisition system while preserving its colorimetric and spectral accuracy. The prior knowledge includes information about the spectra of the standard color patches or target object, the corresponding response of a traditional imaging device, sensor sensitivity, illumination spectra and noise.
A reasonable number of estimation techniques have been proposed in the literature and their usefulness has been evaluated for a variety of purposes [29, 11, 31, 13, 5, 21, 6, 16, 15, 23, 33, 30, 28]. These methods are categorized in [29, 6] according to their behavior and the availability of the required data.
One of the most widely used techniques is the Wiener estimation [29, 11, 31, 21], which minimizes the mean square error (MSE) between the recovered and measured spectral reflectance. In the Wiener method, the estimation matrix can be computed from different sets of spectral information. One set consists of the reflectance spectra of training samples, the illumination spectrum and the device sensitivity [29, 11]. The alternative set consists only of the reflectance spectra of the training set and the corresponding RGB values [31, 21]. A modification of the Wiener estimation is the pseudoinverse transformation method [29], which is also called the least square method in [21]. This technique uses regression analysis [25] between the known spectral reflectances and the corresponding device response values. Another category comprises the linear methods, where the spectral reflectance is represented as a linear combination of orthonormal basis vectors [29, 6]. Similar to the modification of the Wiener estimation method, a regression analysis is employed to modify the linear method. In this modification, a relation is established between the weight column vectors for the basis vectors and the corresponding sensor responses (digital counts of the trichromatic camera). This modified method is called the Imai-Berns method [29, 5, 21, 16, 15]. One of the major foci of the Imai-Berns method is to determine the optimal number of channels and basis vectors for sufficient estimation. According to their suggestion, the number of channels should be the same as or larger than the number of basis vectors. They tested various combinations of the number of eigenvectors and channels and found that six channels is the best compromise between accuracy and the cost of adding more sets of trichromatic channels. Shi and Healey further investigated the optimal number of basis vectors [28]. In their color scanner calibration experiment, they showed that multiple solutions exist for the same device response when the number of basis vectors is larger than the number of channels. Most devices use three channels. However, in most cases three basis vectors are not sufficient to represent a large set of reflectance functions accurately. To overcome this, Shi and Healey proposed a new method [29, 5, 21, 28] that recovers the spectral reflectances with more basis vectors than sensors while still increasing the accuracy of the recovery.
Spectral imaging has received increasing interest in the field of medical imaging modalities [9, 10, 19, 1, 18, 2, 20, 7, 27, 32]. With the invention of new spectral imaging based technologies customized for medical imaging applications, spectral imaging has been used for imaging diabetic retinopathy [9], imaging neurosurgical target tissues [2], determining an adequate tumor-free margin [18], evaluating the degeneration of articular cartilage [19], distinguishing normal and neoplastic tissues in the brain [10], detecting cancerous tissue [20], detecting the sentinel lymph node in gynecologic oncology [7] and detecting oral neoplasia [27]. Moreover, various uses of spectral imaging technology in the medical field are listed in the literature [1], such as evaluation of tissue oxygen, diagnosis of hemorrhagic shock and detection of chronic mesenteric ischemia.
Spectral imaging technology has proven its capability to produce reliable information for surgeons with sufficient sensitivity to discriminate different types of tissues. Previous studies have already shown that the delineation of neoplastic tissue from normal tissue can be enhanced by optical spectroscopy [32] and spectral imaging devices [10]. In addition, spectral analysis can be expected to provide useful clinical information for other purposes such as monitoring of blood flow or unwanted changes in normal tissues during surgery.
Real time image capture and visualization for intra-operative image-guided surgery [18, 20, 7] provides significant benefit to surgeons. Surgeons have an immense interest in observing spectral images in real time during surgery and monitoring important information such as temporal changes in the organs and tissues as well as visual differences between various target tissues. Therefore, a spectral video system needs to be incorporated during surgery. Keereweer et al. [18] briefly discussed the issues related to real time optical image guided surgery. They mentioned several devices that accomplish this task, where the images are taken within the range of NIR wavelengths. A multispectral real-time fluorescence imaging system [7] is used for detecting the sentinel lymph node, where the multispectral signals from different cameras are combined to yield true quantitative fluorochrome bio-distribution. Most recently, Leitner et al. [20] presented a multispectral video endoscopy system which provides 8 spectral bands at 40 frames per second. They need to deal with image registration issues in order to obtain an accurate spectral image of a moving organ or tissue. The limitations of the abovementioned real time systems are high expense, limited portability, a limited number of spectral bands, additional processing of individual spectral band images and a lower frame rate.
Currently no spectral video capturing device exists which has sufficiently high spatial resolution and covers a wide range of the spectrum. A possible solution is to use, rather than a spectral video system, an RGB video camera in cooperation with a spectrum estimation technique. Spectral video can then be generated from RGB video in near real time. This research work is motivated by this solution and proposes a framework to generate spectral video from neurosurgery RGB video. The RGB video is captured using a digital camera connected to an operational microscope dedicated to neurosurgery [2]. The proposed framework successfully overcomes the limitations of the existing solutions in terms of expense, portability, range and number of spectral bands and frame rate. Moreover, we believe that the framework is extendable and customizable depending on the demands of applications.
Our key contribution in this research is to propose a complete framework for generating multispectral video from RGB video. Rather than proposing a device oriented solution, our proposed solution relies on a computational estimation and machine learning based approach. In order to ensure the best estimate of the spectra from RGB, five different estimation methods are implemented and their performance is evaluated. The best training dataset is determined to compute the desired estimation matrix, which can be used for any further estimation of the neurosurgery video frames.
We discuss the methodology of spectral estimation and the experimental procedure in Section 2, then present and discuss the results in Section 3, and finally Section 4 draws conclusions.
In this section, we first present several spectral estimation techniques in Sec. 2.1 and then present the complete framework and experimental procedure for spectral video generation in Sec. 2.2.
2.1 Spectral estimation methods
The basic idea of spectral estimation is to compute the spectral response of a pixel from its RGB responses recorded by a digital camera. The estimated spectral response is a higher dimensional representation of the color signal. With the assumption that all surfaces are Lambertian and there is no fluorescence, the device response $\rho$ of a camera system with its associated reflectance $r(\lambda)$ under an illuminant $L(\lambda)$ can be modeled by [6]:
$$\rho_i = \int_{\lambda} S_i(\lambda)\, L(\lambda)\, r(\lambda)\, d\lambda + \delta \tag{1}$$
where $\rho_i$ is the response of the $i$th camera channel, $S_i(\lambda)$ is the spectral sensitivity of that channel, $\delta$ represents measurement noise and $\lambda$ represents the wavelength. Most commonly the camera has 3 channels (R, G and B), which means $i$ = 1, 2 or 3. Eq. (1) can be represented in terms of vectors and matrices as [29]:
$$\rho = S L r + \delta \tag{2}$$
where $S$ (spectral sensitivities of the sensors) is an $M \times N$ matrix, $L$ (spectral power distribution of the illuminant) is an $N \times N$ diagonal matrix with the samples along the diagonal, $r$ is an $N \times 1$ vector, $\delta$ is an $M \times 1$ vector, $N$ represents the number of samples over the visible wavelengths, and $M$ represents the number of sensors. Now, considering the camera system model in Eq. (1) and (2), our goal is to directly estimate the spectral reflectance $\hat{r}$ from the device response $\rho$. Next, we briefly present the details of the estimation methods that we experiment with in this research.
Let us consider that image pixels are measured with reflectance $r$. After that, the same pixels are measured with a multichannel digital camera with device response $\rho$. We assume that the spectrum of the illumination $L$, the sensitivity of the multichannel digital device $S$ and the associated noise of the camera $\delta$ are already known to us. The Wiener estimation method [29, 11, 31, 21] was developed on the basis of the minimum mean square error (MMSE) criterion. If we recall Eq. (2) and consider $Q = SL$ ($Q$ is a matrix of dimension $M \times N$), then we can rewrite it as [11]:
$$\rho = Q r + \delta \tag{3}$$
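The camera model of Eqs. (1) to (3) can be simulated numerically, which is convenient for testing an estimator before real data are available. The following is a minimal sketch, not the authors' implementation: `rho` denotes the device response and `r` the reflectance, while the Gaussian sensitivity curves, the flat illuminant and the helper names `gaussian` and `device_response` are illustrative assumptions.

```python
import numpy as np

# Sampling grid used for the spectral images in this paper: 420-720 nm, 10 nm steps.
wavelengths = np.arange(420, 721, 10)          # N = 31 samples
N, M = wavelengths.size, 3                     # M = 3 channels (R, G, B)

def gaussian(center, width):
    """Toy bell-shaped curve; stands in for a measured sensor sensitivity."""
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

# S: M x N sensor sensitivities (assumed Gaussian R, G, B curves).
S = np.stack([gaussian(610, 40), gaussian(540, 40), gaussian(460, 40)])
# L: N x N diagonal illuminant matrix (assumed flat, equal-energy light).
L = np.diag(np.ones(N))
Q = S @ L                                      # M x N matrix Q = S L of Eq. (3)

def device_response(r, noise_std=0.0, rng=None):
    """Eq. (3): rho = Q r + delta for one N-vector of reflectance."""
    rng = np.random.default_rng(0) if rng is None else rng
    delta = noise_std * rng.standard_normal(M)  # additive measurement noise
    return Q @ r + delta

r = np.linspace(0.1, 0.9, N)                   # synthetic reflectance rising toward red
rho = device_response(r)
print(rho.shape)                               # (3,)
```

Replacing `S` and `L` with measured sensitivities and a measured illuminant spectrum would leave the rest of the sketch unchanged.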
From Eq. (3), the Wiener estimation can be obtained by applying a matrix $W$ to the device response $\rho$ as:
$$\hat{r} = W \rho \tag{4}$$
where the operator $W$ is determined such that it satisfies the MMSE criterion, i.e. it minimizes the ensemble average of the squared error between the original ($r$) and estimated ($\hat{r}$) spectral reflectances [11]:
$$\langle \| r - \hat{r} \|^2 \rangle = \langle \| r - W\rho \|^2 \rangle \rightarrow \min \tag{5}$$
where $\langle \cdot \rangle$ denotes an ensemble average. The explicit form [29, 11] of the Wiener estimation is given by:
$$W = R_{ss} Q^t \left( Q R_{ss} Q^t + R_{\delta\delta} \right)^{-1} \tag{6}$$
where $R_{ss}$ denotes the autocorrelation matrix of the spectral reflectances of the learning samples and $R_{\delta\delta}$ denotes the autocorrelation matrix of the measurement noise. These autocorrelation matrices are computed as:
$$R_{ss} = \langle r r^t \rangle, \quad R_{\delta\delta} = \langle \delta \delta^t \rangle \tag{7}$$
Eq. (6) computes the Wiener estimation matrix [29, 11] from the spectral reflectance of the training spectra ($r$), the spectrum of the illumination ($L$) and the device sensitivity information ($S$). An alternative way [31] is to compute the estimation matrix from the spectral reflectance of the training spectra ($r$) and the device response ($\rho$) as:
$$W = R_{r\rho} R_{\rho\rho}^{-1} \tag{8}$$
where $R_{r\rho}$ is the cross-correlation matrix between vectors $r$ and $\rho$, and $R_{\rho\rho}$ is the autocorrelation matrix of vector $\rho$. These two correlation matrices are defined as
$$R_{r\rho} = \langle r \rho^t \rangle, \quad R_{\rho\rho} = \langle \rho \rho^t \rangle \tag{9}$$
Therefore, the spectral reflectance ($\hat{r}$) can be estimated by using the estimation matrix ($W$ computed from Eq. (6) or (8)) in Eq. (4).
The idea of the pseudoinverse estimation [29] (also called least square [21]) method is to establish a direct mapping from device response to reflectance that minimizes the least square error for a characterization or training set of known reflectance spectra with associated device responses. Let $P$ be an $M \times k$ matrix that contains the sensor responses $[\rho_1, \rho_2, \ldots, \rho_k]$ and let $R$ be an $N \times k$ matrix that contains the corresponding spectral reflectances $[r_1, r_2, \ldots, r_k]$, where $k$ is the number of learning samples. The goal of the pseudoinverse method [29, 21] is to compute a transformation matrix $W$ with dimension $N \times M$ that minimizes $\| R - WP \|$. The notation $\| \cdot \|$ represents the Frobenius norm. The matrix $W$ is given by:
$$W = R P^{+} \tag{10}$$
where $P^{+}$ represents the pseudoinverse of the matrix $P$. Once the transformation matrix $W$ has been computed, it can be used to estimate the reflectance from device responses as $\hat{r} = W\rho$, which is identical to Eq. (4).
The concept of the linear estimation method [29, 6] came from the notion of representing the spectral reflectances as a linear combination of basis vectors. These basis vectors are derived by applying principal component analysis [17] (PCA) to the spectral reflectances. Let us consider a set of reflectances $r$ with $d$ samples. If we perform PCA over these reflectances then we can represent the reflectance set as:
$$\hat{r} = w_1 v_1 + w_2 v_2 + \ldots + w_d v_d = V w \tag{11}$$
In Eq. (11), $d$ represents the number of basis vectors, which is equivalent to the number of samples in the reflectance $r$, $v_i$ represents the $i$th basis vector and $w_i$ represents its associated weight, $V$ represents an $N \times d$ basis matrix containing the first $d$ basis vectors as column vectors and $w$ represents a $d \times 1$ column vector of weights corresponding to the basis vectors. Let us arrange the basis vectors in decreasing order of their eigenvalues. If we recall Eq. (3) without the noise term and substitute the reflectance from Eq. (11), then we get [29]:
$$\rho = Q V w = \Lambda w \tag{12}$$
where $\Lambda = QV$ is the system matrix with dimension $M \times d$, computed by multiplying $Q$ (dimension $M \times N$) and $V$ (dimension $N \times d$). If we assume that the number of basis vectors equals the number of sensors, i.e. $M = d$, then $\Lambda$ is a square matrix whose entries are known. Now, using Eq. (12) we can compute the approximate weight column vector $\hat{w}$ by rearranging it as:
$$\hat{w} = \Lambda^{-1} \rho \tag{13}$$
Note that if our assumption does not hold, i.e. the number of basis vectors is larger than the number of channels, then $\Lambda$ is not a square matrix. In that situation we have to use $\Lambda^{+}$ instead of $\Lambda^{-1}$. Here, $\Lambda^{+}$ represents the pseudoinverse [3] of $\Lambda$ and $\Lambda^{-1}$ represents its direct inverse.
After computing the weight column vector from the device responses, the estimated reflectance can be computed (by substituting Eq. (13) into Eq. (11)) as:
$$\hat{r} = V \hat{w} = V \Lambda^{-1} \rho \tag{14}$$
Here, $\rho$ is the multichannel device response, $\hat{r}$ is the corresponding estimate of the spectral reflectance, $V$ is the matrix of basis vectors and $\Lambda$ is the system matrix computed from the $S$, $L$ and $V$ matrices.
The Imai-Berns method [29, 5, 21, 16, 15] follows a concept similar to the linear method to represent the spectral reflectances. However, it modifies the linear method by employing a regression analysis between the weight column vectors and the corresponding sensor responses [29]. Here, we assume that the basis vectors are arranged in decreasing order of their eigenvalues [16, 15].
Let us consider that we have a set of learning spectral reflectances $r$. We can represent the estimated reflectances in terms of basis vectors and weights with Eq. (11). Let $B$ be a $d \times k$ matrix that contains the column vectors of the weights representing the $k$ known spectral reflectances and $P$ be an $M \times k$ matrix that contains the corresponding sensor response vectors of those $k$ reflectances. This method establishes a relationship between the weights of the reflectances $B$ and the corresponding device responses $P$ using regression analysis by minimizing $\| B - DP \|$. Here, $D$ is a $d \times M$ matrix that minimizes the Frobenius norm:
$$D = B P^{+} \tag{15}$$
The matrix $D$ can be used to estimate the weights from device responses $P$ as:
$$\hat{w} = D P \tag{16}$$
Finally, combining Eq. (16) and (11), we can estimate the spectral reflectance from the device response as
$$\hat{r} = V D P \tag{17}$$
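The three regression-based estimators above (Wiener from training pairs, Eqs. (8) and (9); pseudoinverse, Eq. (10); Imai-Berns, Eqs. (15) to (17)) each reduce to a single matrix applied to the device response. The sketch below illustrates this on synthetic data; the training reflectances `R`, the system matrix `Q`, the uncentered PCA via SVD and all variable names are assumptions made for illustration, not the data or code used in this study. When `P` has full row rank, $P^{+} = P^t (P P^t)^{-1}$, so Eq. (8) and Eq. (10) produce the same matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, k, d = 31, 3, 500, 3     # bands, channels, training samples, basis vectors

# Synthetic stand-ins (assumptions): smooth training reflectances R (N x k)
# and a random non-negative system matrix Q (M x N); P holds noise-free responses.
x = np.linspace(0.0, 1.0, N)[:, None]
R = np.clip(rng.uniform(0.2, 0.8, (1, k))
            + rng.uniform(-0.3, 0.3, (1, k)) * x
            + 0.05 * np.sin(6 * x + rng.uniform(0, 6, (1, k))), 0, 1)
Q = rng.uniform(0.0, 1.0, (M, N))
P = Q @ R                                       # M x k device responses

# Wiener estimation from training pairs, Eqs. (8)-(9).
R_r_rho = R @ P.T / k                           # cross-correlation <r rho^t>
R_rho_rho = P @ P.T / k                         # autocorrelation  <rho rho^t>
W_wiener = R_r_rho @ np.linalg.inv(R_rho_rho)

# Pseudoinverse estimation, Eq. (10).
W_pinv = R @ np.linalg.pinv(P)

# Imai-Berns, Eqs. (15)-(17): basis V (N x d), weights B (d x k), D = B P^+.
U, _, _ = np.linalg.svd(R, full_matrices=False)  # uncentered PCA via SVD (assumption)
V = U[:, :d]                                     # ordered by decreasing singular value
B = V.T @ R
D = B @ np.linalg.pinv(P)
W_imai = V @ D                                   # r_hat = V D rho

def rmse(a, b):
    """Root mean square error over all bands and samples."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

for name, W in [("wiener", W_wiener), ("pinv", W_pinv), ("imai-berns", W_imai)]:
    print(name, rmse(R, W @ P))
```

On training data the pseudoinverse error is never larger than the Imai-Berns error, because Eq. (10) is the least squares optimum over all $N \times M$ matrices while $VD$ is restricted to the span of the basis.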
The basic idea of the Shi-Healey method [29, 5, 21, 28] is to generate a unique solution while allowing the number of basis vectors to be larger than the number of channels. Shi and Healey modified the linear method by allowing more than three degrees of freedom for accurate representation of spectral reflectances [28].
Recall Eq. (11), and consider dividing the matrix of basis vectors into two matrices, $V_1$ with dimension $N \times (d - M)$ as $[v_1, v_2, \ldots, v_{d-M}]$ and $V_2$ with dimension $N \times M$ as $[v_{d-2}, v_{d-1}, v_d]$ (for $M = 3$). Applying this division to Eq. (12) and (14), the sensor response $\rho$ and the estimated reflectance $\hat{r}$ can be represented as:
$$\rho = Q V_1 w_1 + Q V_2 w_2 \tag{18}$$
$$\hat{r} = V_1 w_1 + V_2 w_2 \tag{19}$$
where $w_1$ and $w_2$ represent the corresponding weight column vectors for the $V_1$ and $V_2$ matrices. From Eq. (18), $w_2$ can be solved in terms of $w_1$ as:
$$w_2 = (Q V_2)^{-1} (\rho - Q V_1 w_1) \tag{20}$$
Substituting $w_2$ into Eq. (19), the spectral reflectance $\hat{r}$ can be represented as:
$$\hat{r} = V_1 w_1 + V_2 (Q V_2)^{-1} (\rho - Q V_1 w_1) \tag{21}$$
In general, there will be a set of spectral reflectances that are consistent with the linear method and that satisfy $Q\hat{r} = \rho$. Let $r_i$ be the spectral reflectance vector that satisfies both conditions above. Now, the weight column vector $\hat{w}_1$ is derived from Eq. (21) as:
$$\hat{w}_1 = \left[ V_1 - V_2 (Q V_2)^{-1} Q V_1 \right]^{+} \left[ r_i - V_2 (Q V_2)^{-1} \rho \right] \tag{22}$$
Therefore the estimated spectral reflectance can be obtained by substituting Eq. (22) into Eq. (21) under the condition that minimizes $\| \hat{r}_i - r_i \|$.
An important characteristic of this method is that the estimation time is proportional to the number of training reflectance spectra. For a large number of training spectra this method is very slow, because for every single pixel it has to check all the training spectra to find the best one as the estimated spectrum. In addition, for every sensor response compared with a particular training spectrum, it is necessary to search for the number of basis vectors which produces the best estimate.
Different methods require different sets of prior information. The common information required by all the methods is a set of spectral reflectances, which is considered as the training set. The prior knowledge required to estimate spectral reflectance by the different methods discussed above is summarized in Table 1 [29].
Among the spectrum estimation methods discussed here, the Wiener [31], pseudoinverse [29] and Imai-Berns [16] methods use linear regression analysis. In the literature [29, 31, 13], it is suggested that the use of nonlinear regression [12] with a higher order (multivariate) polynomial may improve the performance of estimation. Therefore, this research considers both linear and nonlinear regression analysis for spectral estimation experiments using the Wiener [31], pseudoinverse [29] and Imai-Berns [16] methods.
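One common way to realize the nonlinear regression mentioned above is to expand each RGB response into polynomial terms and then apply the same least squares rule as in Eq. (10) to the expanded responses. The sketch below assumes a second-order term set (1, R, G, B, R², G², B², RG, RB, GB); the text does not specify which terms were used, so this choice, the helper name `expand_second_order` and the synthetic data are illustrative assumptions only.

```python
import numpy as np

def expand_second_order(P):
    """Map 3 x k responses to an assumed second-order term set (10 x k):
    1, R, G, B, R^2, G^2, B^2, RG, RB, GB."""
    Rc, Gc, Bc = P                                # one row per channel
    ones = np.ones_like(Rc)
    return np.stack([ones, Rc, Gc, Bc,
                     Rc ** 2, Gc ** 2, Bc ** 2,
                     Rc * Gc, Rc * Bc, Gc * Bc])

rng = np.random.default_rng(1)
N, k = 31, 400
P = rng.uniform(0.0, 1.0, (3, k))                 # synthetic RGB values in [0, 1]
# Synthetic reflectances with a mild nonlinearity in the responses (assumption).
A = rng.uniform(0.0, 0.3, (N, 3))
R = A @ P + 0.2 * (A @ (P ** 2))

W_lin = R @ np.linalg.pinv(P)                     # linear regression, Eq. (10)
P_poly = expand_second_order(P)
W_poly = R @ np.linalg.pinv(P_poly)               # same rule on expanded responses

err_lin = np.sqrt(np.mean((R - W_lin @ P) ** 2))
err_poly = np.sqrt(np.mean((R - W_poly @ P_poly) ** 2))
print(err_lin, err_poly)
```

At estimation time a pixel is expanded with the same function before the matrix is applied, so the per-pixel cost grows from 3 to 10 multiplications per band under this assumed term set.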
Table 1: Requirements of prior knowledge for each method [29].
Method Name     Sensitivities   Illumination   Reflectance   RGB Values
Wiener          Yes / No        Yes / No       Yes           No / Yes
Pseudoinverse   No              No             Yes           Yes
Linear          Yes             Yes            Yes           No
Imai-Berns      No              No             Yes           Yes
Shi-Healey      Yes             Yes            Yes           No
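Table 1 lists the linear method as needing sensor sensitivities, illumination and training reflectances but no RGB training values. A minimal sketch of Eqs. (11) to (14) under exactly those inputs is given below; the Gaussian sensitivities `S`, the illuminant `L`, the rank-three training set `R` and the name `Lam` for the system matrix $\Lambda = QV$ are assumptions chosen so that the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, k = 31, 3, 300
d = M                                   # square system matrix, as assumed for Eq. (13)

# Assumed stand-ins: S (M x N sensitivities), L (N x N diagonal illuminant),
# R (N x k training reflectances built from three smooth components).
wl = np.linspace(0.0, 1.0, N)           # normalized wavelength axis
S = np.stack([np.exp(-0.5 * ((wl - c) / 0.15) ** 2) for c in (0.65, 0.4, 0.15)])
L = np.diag(np.linspace(0.8, 1.2, N))
components = np.stack([np.ones(N), wl, wl ** 2], axis=1)   # N x 3
R = components @ rng.uniform(0.1, 0.3, (3, k))

Q = S @ L                               # Q = S L, Eq. (3)
U, _, _ = np.linalg.svd(R, full_matrices=False)
V = U[:, :d]                            # N x d basis, decreasing singular value
Lam = Q @ V                             # system matrix Lambda = Q V, Eq. (12)

def estimate_linear(rho):
    """Eqs. (13)-(14); the pseudoinverse equals the inverse when Lambda is square."""
    w_hat = np.linalg.pinv(Lam) @ rho
    return V @ w_hat

r_true = R[:, 0]
r_hat = estimate_linear(Q @ r_true)     # noise-free response of a training spectrum
print(np.max(np.abs(r_hat - r_true)))
```

Because the synthetic reflectances lie exactly in a three-dimensional subspace, three basis vectors recover them up to rounding; real tissue spectra would generally leave a residual.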
Fig. 1: Block diagram of spectral video generator. (a) Training data collection (b) Spectral video generator.
2.2 Spectral video generation
The aim of this research is to establish a framework which generates spectral video from neurosurgery RGB video. The complete system is called spectral video generation and is decomposed into two subtasks: (a) training spectra selection and (b) spectral video construction. Training spectra are considered as prerequisite knowledge for spectral estimation. The block diagram of the spectral video generator is illustrated in Figure 1. Initially, training spectra are collected randomly from spectral images. Then a representative training set is determined (Figure 1(a)) by a searching method. Figure 1(b) illustrates the process of constructing spectral video with the aid of prior information. These two subtasks are briefly discussed below.
Training spectra are collected from a database of spectral neurosurgery images. The database consists of 34 spectral images which were captured during surgery of 6 patients. Each spectral image was captured within the wavelength range of 420 nm to 720 nm at 10 nm intervals. A spectral imaging device consisting of a camera (monochrome camera with Sony ICX414 sensor), a tunable filter (VariSpecVIS VIS-10 Liquid Crystal Tunable Filter) and a focusing lens (70 mm focal length) was used to capture the spectral images. Since the brain tissues move with heart beat and breathing, each wavelength layer is slightly shifted during spectral image capture. Therefore, images must be registered accurately to achieve correct spectral responses. Among the 34 spectral images, 17 images were finally selected according to registration accuracy. Moreover, after registration a few of the images contain noise and unexpected colors, particularly at the border. These images are manually corrected by cropping out the noisy regions. The spectral images are illustrated in Appendix A. RGB images corresponding to the spectral images were not captured through the digital video camera. However, estimation requires an RGB color version of each spectral image. Therefore, the color images were generated from the spectral images using the standard illuminant D65 and the CIE color matching functions. We consider these color images as training RGB images.
In a practical spectral estimation setting, the training spectra are collected from pre-specified target points of the objects in the image or from standard color patches (from a calibration chart). In this research, the target points are not specified. Therefore, the training spectra are collected by choosing a certain percentage (1%, 5%, 10%, 20% and 50%) of random pixels from each image. As a consequence, 5 sets of training spectra are available from each spectral image. The corresponding RGB values of the selected training spectra are collected from the RGB versions of the spectral images. The assumption behind this random selection is that spectra from all possible objects in the images will be selected. In this random selection process, a total of 85 (17 × 5) different sets of training spectra are collected. However, the final goal is to select or construct only one set of training spectra, which will be considered as the representative set. The representative training set is identified by observing its performance in spectrum estimation. In this research, we consider the root mean square error (RMSE) as the evaluation measure for the training sets. RMSE is computed between the original spectral image and its estimated version. The representative set of spectra may not be the same for all estimation methods. Therefore, the searching procedure must be applied to each estimation method separately. The steps required to search for the representative set of spectra are as follows:
– Step 1: For each RGB image, estimate the spectral image (Sec. 2.1) using each set of training spectra. Compute the RMSE (Eq. (23)) and select the set with the lowest RMSE.
– Step 2: Using each of the training sets obtained from Step 1, estimate all spectral images. Compute the average RMSE value. Then sort the sets in increasing order of average RMSE.
– Step 3: Choose the first five sets and generate all possible combinations. With these combinations, follow the procedure in Step 2. Then select the best training set according to the lowest average RMSE value.
It is observed from the training spectra that a few of the spectra have quite unusual shapes and contain several peaks. These spectra signify the presence of highlights in the spectral image. In a neurosurgery spectral image, highlights are present due to liquid materials (blood, water) that reflect the illuminant (specular reflection) to a certain extent. It is also interesting to note that most of the spectra collected from different objects have high intensity in the red wavelength range. The reason for this is the absence of blue and green colored objects in the collected spectral images. Another important property of the spectra is that they are not smooth.
The spectral video is generated from RGB video. At the beginning of the process the frames are sequentially extracted as RGB images from the video. A video frame is an RGB image which is first corrected if necessary. For example, for several estimation methods the RGB values need to be normalized to the range between 0 and 1. This correction is a part of preprocessing. After preprocessing, the spectral image is estimated (Sec. 2.1) from the RGB image with the aid of the necessary prior information and a particular estimation method. Then the estimated spectral image is added to the target spectral video. This process continues until the last video frame is reached. The temporal aspect of video processing is maintained either by including wait functionality or by skipping a few frames based on inter-frame similarity. The results of the spectral estimation techniques are presented in the results and discussion section.
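The frame-by-frame construction described above can be sketched as a generator that normalizes each RGB frame, applies a precomputed estimation matrix to every pixel and yields one spectral image per frame. This is only an illustration under stated assumptions: frames are taken to be 8-bit arrays of shape H x W x 3, `W` is any N x 3 estimation matrix from Sec. 2.1, and the skipping rule (reuse the previous estimate when the mean absolute frame difference is below `skip_threshold`) is a hypothetical stand-in for the inter-frame similarity test, which the text does not specify.

```python
import numpy as np

def estimate_spectral_frame(frame_rgb, W):
    """Apply an N x 3 estimation matrix to an H x W x 3 uint8 frame."""
    h, w, _ = frame_rgb.shape
    rgb = frame_rgb.reshape(-1, 3).astype(np.float64) / 255.0   # normalize to [0, 1]
    spectra = rgb @ W.T                                          # (H*W) x N, Eq. (4) per pixel
    return spectra.reshape(h, w, W.shape[0])

def spectral_video(frames, W, skip_threshold=0.0):
    """Yield one spectral image per RGB frame.

    Frames whose mean absolute difference from the last processed frame is
    below skip_threshold reuse the previous estimate (assumed skipping rule).
    """
    last_rgb, last_spec = None, None
    for frame in frames:
        if last_rgb is not None:
            diff = np.mean(np.abs(frame.astype(np.float64) - last_rgb))
            if diff < skip_threshold:
                yield last_spec
                continue
        last_rgb = frame.astype(np.float64)
        last_spec = estimate_spectral_frame(frame, W)
        yield last_spec

# Usage with synthetic data; in practice W would come from training, e.g. Eq. (10).
rng = np.random.default_rng(3)
W = rng.uniform(0.0, 1.0, (31, 3))
frames = [rng.integers(0, 256, (4, 5, 3), dtype=np.uint8) for _ in range(3)]
video = list(spectral_video(frames, W))
print(len(video), video[0].shape)     # 3 (4, 5, 31)
```

Since the estimate is a single matrix product per frame, the cost per frame is linear in the number of pixels, which is what makes a regression-based estimator attractive for near real time use.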
This section evaluates the experimental performance of thespectrum estimation methods. These techniques are appliedto the neurosurgery spectral image database (Appendix 1).Each spectral image has 31 spectral bands (spectral range420 nm to 720 nm with 10 nm sampling interval). 3.1 Spectrum estimationFive spectral estimation methods (described in Sec. 2.1) areexperimented to estimate 17 neurosurgery spectral imagesfrom RGB images. Accuracy of estimation is evaluated bycomputing two spectral metric such as root mean squareerror (RMSE), goodness of fit coefficient (GFC) [21] anda colorimetric metric CIELAB ( D E a b ). The metrics arecomputed using the actual and estimated spectral images.With RMSE (Eq. 23) and D E ab (Eq. 25), perfect match isobtained when the value of these metrics are 0. Unlike these,in GFC (Eq. (24)) where the range of value is between 0and 1, perfect match is obtained when the value is 1. Inthis research, we consider the RMSE value as the principalmetric for evaluation. Additionally, we observe the values ofGFC and D E ab for further evaluation if necessary. RMSE = s n n (cid:229) j = ( r ( l j ) − ˆ r ( l j )) (23) GFC = (cid:12)(cid:12)(cid:12) (cid:229) nj = r ( l j ) ˆ r ( l j ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) (cid:229) nj = [ r ( l j )] (cid:12)(cid:12) / (cid:12)(cid:12) (cid:229) nj = [ ˆ r ( l j )] (cid:12)(cid:12) / (24) D E ab = p D L ∗ + D a + D b (25)In Eq. (23) and (24), r represents the actual spectra andˆ r represents the estimated spectra. In Eq. 25, the value of D L ∗ , D a and D b are calculated by first converting the spectrato XYZ values and then converting XYZ to Lab. The XYZvalue of standard illuminant D65 is used as a white referencein XYZ to Lab conversion.In the evaluation process, for each method RMSE iscomputed for estimating 17 images using 5 set of trainingspectra. Therefore there are total 85 (5 ×
17) RMSE valuesobtained. Then the average of these 85 RMSE values iscalculated. This average RMSE is considered for evaluatingthe estimation methods (Sec. 2.1). The method that producesthe lowest average RMSE is considered as the best one.Table 2 illustrates the performance of the methods in termsof average RMSE, GFC and D E ab . We consider 3 basisvectors in linear and Imai-Berns methods.From the results listed in Table 2, it is clear thatthe
Wiener [31] and the
Pseudoinverse [29] methods produce similar estimation accuracy, and that both differ significantly from the Linear [6] and the Shi-Healey [28] methods. The accuracy produced by the Imai-Berns [15] method is close to that of the Wiener and the Pseudoinverse methods. Across the metrics, a decrease in RMSE is accompanied by a decrease in color difference ($\Delta E_{ab}$) and an increase in GFC. A low RMSE therefore indicates a low color difference and hence a high estimation accuracy. Because the metrics agree well, we consider only RMSE in the further evaluation.
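As an illustration of Eq. (23) and Eq. (24), the following Python sketch computes RMSE and GFC for a pair of spectra and averages a metric over several runs, as is done for the 85 RMSE values per method. The function names and the toy spectra are assumptions made for this example; they are not taken from the MATLAB programs used in this research.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between two spectra (Eq. 23)."""
    if len(actual) != len(estimated):
        raise ValueError("spectra must have the same number of bands")
    n = len(actual)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(actual, estimated)) / n)


def gfc(actual, estimated):
    """Goodness-of-fit coefficient (Eq. 24); 1.0 means identical spectral shape."""
    num = abs(sum(r * e for r, e in zip(actual, estimated)))
    den = math.sqrt(sum(r * r for r in actual)) * math.sqrt(sum(e * e for e in estimated))
    if den == 0.0:
        raise ValueError("GFC is undefined for an all-zero spectrum")
    return num / den


def average_metric(metric, pairs):
    """Average a metric over (actual, estimated) pairs, e.g. over 85 runs."""
    values = [metric(a, e) for a, e in pairs]
    return sum(values) / len(values)


# Toy 31-band spectra (420-720 nm in 10 nm steps); the values are made up.
actual = [0.20 + 0.01 * j for j in range(31)]
estimated = [v + 0.02 for v in actual]
print(rmse(actual, estimated), gfc(actual, estimated))
```

Note that GFC is insensitive to a uniform scaling of the estimated spectrum, whereas RMSE is not, which is one reason to report both.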
Table 2: Accuracy (average RMSE, average GFC and average $\Delta E_{ab}$) of the methods.
                          Wiener    Pseudoinverse    Linear    Imai-Berns    Shi-Healey
Average RMSE
Average GFC
Average $\Delta E_{ab}$

Fig. 2: Plot of a recovered spectral reflectance. Comparison between the actual spectrum and the spectra estimated by all methods.

Table 3: Accuracy and required time of the methods for estimating the spectrum of a particular pixel.
                          Wiener    Pseudoinverse    Linear    Imai-Berns    Shi-Healey
RMSE
Time (sec)
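The $\Delta E_{ab}$ values of Eq. (25) depend on the XYZ to Lab conversion with a D65 white reference. A minimal Python sketch of that step is given below. It assumes the D65 white point of the CIE 1931 2° observer with Y normalized to 100, which the text does not specify, and it starts from XYZ values because the spectrum to XYZ step needs color matching function tables that are not reproduced here. All function names are hypothetical.

```python
import math

# Assumed white reference: D65, CIE 1931 2-degree observer, Y = 100.
D65_WHITE = (95.047, 100.0, 108.883)


def _f(t):
    """CIELAB transfer function: cube root with a linear segment near zero."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def xyz_to_lab(xyz, white=D65_WHITE):
    """Convert one XYZ triplet to (L*, a, b) relative to the given white."""
    fx, fy, fz = (_f(c / w) for c, w in zip(xyz, white))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e_ab(xyz_actual, xyz_estimated, white=D65_WHITE):
    """Euclidean distance in Lab between two colors (Eq. 25)."""
    lab_actual = xyz_to_lab(xyz_actual, white)
    lab_estimated = xyz_to_lab(xyz_estimated, white)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab_actual, lab_estimated)))


print(xyz_to_lab(D65_WHITE))
```

In a per-image evaluation, `delta_e_ab` would be applied to the XYZ values of each actual and estimated pixel spectrum and then averaged.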
In order to compare the estimation performance visually, we consider a pixel from a vein on the brain surface. Table 3 lists the accuracy (RMSE) and speed of the methods for estimating the spectrum of the selected pixel. From the RMSE values in Table 3, it is clear that the
Wiener [31],
Pseudoinverse [29] and
Imai-Berns [15] methods outperform the other two methods. Figure 2 shows the spectra recovered by the different methods in comparison to the actual spectrum. The plots support the results in Table 3. In addition to estimation accuracy, the estimation time is also taken into account. The time required for this estimation task is listed in the second row of Table 3. Estimation using the Pseudoinverse method is faster than with any other method. Therefore, in this research we use the Pseudoinverse method for further experiments.

In order to further increase the accuracy of estimation using the
Pseudoinverse [29] method, nonlinear regression analysis [25] was tested. Table 4 presents the accuracy (in terms of average RMSE) obtained with higher-order polynomials and different numbers of terms. For each combination, the average RMSE over the 85 training sets is computed as the measure of accuracy. Accuracy increases (average RMSE decreases) for certain combinations as the number of terms increases. However, computation time also grows as terms are added. There is therefore a tradeoff between the number of terms and computation time, which should be weighed according to the demands of the application. In this research, we choose the combination $[R, G, B, R^2, G^2, B^2]$ for neurosurgery video estimation since it offers a reasonable balance between computation time and accuracy.

Table 4: Accuracy (in terms of average RMSE) of the Pseudoinverse method with higher-order polynomials and different numbers of terms.
Combination of polynomials and terms    Avg. RMSE
R, G, B, RG, GB, BR
R, G, B, R, G, B
R, G, B, R, G, B
R, G, B, RG, GB, BR
R, G, B, RG, GB, BR, RGB
R, G, B, RG, BG, BR, RG, GB, BR
R, G, B, RG, BG, RB, R, G, B, RG, GB, BR
R, G, B, RG, BG, RB, R, G, B, RG, GB, BR

To evaluate the accuracy of the different estimation methods, the strategy is to collect five sets of training spectra from a spectral image and then use these training sets to test the different methods. The overall accuracy of a method is computed as the average of all these accuracies (5 sets per image × 17 spectral images = 85 sets of training spectra). That means each spectral image is estimated using training samples collected from that image itself. However, in a practical application it is not feasible to determine a training dataset first and then estimate a particular spectral image. Therefore, a common dataset is necessary to estimate a spectral image from any RGB image. We applied an algorithm (presented in Section 2.2) to identify the best training dataset (a representative set of spectra) and observed the performance. Table 5 presents the accuracy (in terms of average RMSE) of the five best training datasets, which are used to estimate all test spectral images. The obtained accuracies show no significant difference between these five datasets. As the representative set of spectra, we select 5% random spectra collected from image number 2.

Table 5: Average RMSE of common training datasets used to estimate all test images. Numbers indicate the percentage of data collected from the training images, and numbers within [] indicate the images that were combined.
Dataset    Avg. RMSE

In order to observe the estimation result in the image, an RGB image (a single video frame) is extracted from a video. Then the corresponding spectral image is computed by the
Pseudoinverse [29] method using the training set. After that, an RGB image (the transformed RGB image) is computed from the estimated spectral image. The color reproduction error between the original and transformed RGB images is computed using the S-CIELAB [36] measure. The performance of the Pseudoinverse method is evaluated using 3 terms/variables and 6 terms/variables. The corresponding S-CIELAB values are presented in Table 6. The S-CIELAB values show that estimation with 6 terms outperforms estimation with 3 terms.

Table 6: Comparison (using S-CIELAB values) between original and transformed RGB images. Entry X means not experimented.
Number of terms    Wiener    Pseudoinverse    Linear    Imai-Berns
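The 3-term and 6-term variants of the Pseudoinverse estimation can be sketched as follows. This is a minimal pure-Python illustration, not the MATLAB implementation used in this research. It assumes that the 6-term expansion is $[R, G, B, R^2, G^2, B^2]$, and it fits the estimation matrix by solving the normal equations, which gives the same result as the Moore-Penrose pseudoinverse when the training features have full column rank. All function names are hypothetical.

```python
def expand_rgb(rgb, terms=6):
    """Return the regression terms for one RGB triplet."""
    r, g, b = rgb
    if terms == 3:
        return [r, g, b]
    if terms == 6:
        # Assumed second-order expansion with six terms.
        return [r, g, b, r * r, g * g, b * b]
    raise ValueError("only 3 or 6 terms are supported in this sketch")


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("training features are rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [vi - f * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]


def fit_estimation_matrix(rgb_samples, spectra, terms=6):
    """Least-squares matrix W (bands x terms) such that spectrum ~ W @ features."""
    feats = [expand_rgb(rgb, terms) for rgb in rgb_samples]  # samples x terms
    ft = transpose(feats)                                    # terms x samples
    wt = solve(matmul(ft, feats), matmul(ft, spectra))       # terms x bands
    return transpose(wt)


def estimate_spectrum(w, rgb, terms=6):
    """Apply the estimation matrix to one RGB triplet."""
    f = expand_rgb(rgb, terms)
    return [sum(wi * fi for wi, fi in zip(row, f)) for row in w]
```

For a video frame, `estimate_spectrum` would be applied to every pixel with the same matrix `w`, so the matrix needs to be fitted only once per training set.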
We further investigated the error rate produced by the
Pseudoinverse method. In order to identify the spectra that produce a large error, a threshold value is set as:

Threshold = ∗ avg RMSE (26)

Based on this threshold value, we identified the pixels in the image that cause a high estimation error. The right image (b) in Figure 3 indicates the regions of pixels where the estimation error is above the threshold value. The estimation method gives a high error when estimating the highlight spectra present in the image. In the neurosurgery spectral image, these highlights appear due to the presence of liquids (blood, water) that reflect the illuminant to a certain extent. An example of an estimated highlight spectrum (with RMSE 0.43) using the Pseudoinverse [29] method is shown in Figure 3(c). An experiment on the neurosurgery spectral image database with the threshold value (Eq. 26) shows that, on average, 3% of the pixels in an image contain highlights and that they increase the average RMSE by 0.003. However, this analysis does not reveal the actual effect on estimation where highlights are significantly present.

It is observed that if the average RMSE is computed without the highlights, it decreases from 0.26 to 0.19. For example, a particular spectral image that contains 4% highlighted pixels shows an increase in average RMSE of 0.024. The error (RMSE) therefore grows with the amount of highlighted pixels. We further investigated the estimation of the highlight spectra in particular. In an analysis of RMSE values for different types of spectra, it is found that when the average RMSE of a spectral image is 0.026, the average RMSE is 0.019 for non-highlighted pixels and 0.11 for highlighted pixels. The average RMSE for estimating only the highlights is therefore significantly larger than for non-highlighted spectra. More analysis and experiments with different methods are necessary to reduce the estimation error for highlight spectra. Highlight estimation is outside the scope of this research.

Fig. 3: (a) Original image. (b) Black regions indicate the pixels in the image that cause high estimation error. (c) Plot of an estimated spectrum of a highlighted pixel (RMSE value is 0.43).
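The thresholding analysis of Eq. (26) can be sketched in Python as follows. The multiplier applied to the average RMSE is left as a free parameter `factor`; the value 3.0 in the example is an arbitrary placeholder, and the per-pixel errors are made up for illustration. The function names are hypothetical.

```python
def highlight_mask(rmse_per_pixel, factor):
    """Flag pixels whose RMSE exceeds factor * average RMSE (form of Eq. 26)."""
    if not rmse_per_pixel:
        raise ValueError("no pixels given")
    avg = sum(rmse_per_pixel) / len(rmse_per_pixel)
    threshold = factor * avg
    return [v > threshold for v in rmse_per_pixel], threshold


def split_average(rmse_per_pixel, mask):
    """Return (average RMSE of unflagged pixels, average RMSE of flagged pixels)."""
    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    flagged = [v for v, m in zip(rmse_per_pixel, mask) if m]
    rest = [v for v, m in zip(rmse_per_pixel, mask) if not m]
    return mean(rest), mean(flagged)


# Made-up example: 96 ordinary pixels and 4 highlight pixels.
errors = [0.02] * 96 + [0.11] * 4
mask, threshold = highlight_mask(errors, factor=3.0)  # factor is a placeholder
normal_avg, highlight_avg = split_average(errors, mask)
print(sum(mask), normal_avg, highlight_avg)
```

Comparing the two averages returned by `split_average` with the overall average shows how much a small fraction of highlight pixels contributes to the total error.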
This research presents a framework for generating and displaying near real time neurosurgery spectral video. Each frame in the spectral video is generated using a spectrum estimation method. Five different estimation methods are examined. The estimation matrix in the Wiener method is computed using the reflectances and the corresponding RGB values. In the Linear and Shi-Healey methods, standard illuminant D65 and the CIE color matching functions are used as illumination and device sensitivity information. The Pseudoinverse and Wiener estimation methods are the most accurate. A second-order polynomial with six terms is found to be the best choice in terms of accuracy and computational expense. The accuracy of the Imai-Berns method is close to that of the Pseudoinverse and Wiener methods. The Linear estimation method does not provide acceptable accuracy. The Shi-Healey method is very slow since the number of training spectra considered in this research is large. Therefore, in practical video estimation, the Shi-Healey method is not a good choice with a large training set.

To collect training spectra, the most common practice is to take a spectral image of a calibration color chart along with the target objects. However, in a real surgical environment, it is not possible to place a calibration color chart and capture a spectral image of it during surgery. Therefore, in this research the training spectra are not collected from standard target patches or recommended color charts. Instead, an alternative approach is proposed in which a searching technique is applied to find the best training set among a collection of randomly selected sets of spectra. The temporal aspect of video processing is also considered: a compromise between frame rate and speed of estimation is suggested when the processor speed is not sufficient to generate near real time video.

The experimental programs in this research are written in MATLAB and the demo application is developed in C++ with the OpenCV library. The methodology outlined here would be equally applicable to other research areas.
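One simple way to realize a compromise between frame rate and estimation speed is to estimate only every k-th frame. The Python sketch below is an assumption about how such a compromise could be computed, not a description of the demo application; the function names and the timing values are hypothetical.

```python
import math


def frame_stride(capture_fps, seconds_per_frame):
    """Smallest k such that estimating every k-th frame keeps pace with capture."""
    if capture_fps <= 0 or seconds_per_frame <= 0:
        raise ValueError("frame rate and processing time must be positive")
    # Number of frames that arrive while one frame is being estimated.
    load = capture_fps * seconds_per_frame
    # The small offset guards against rounding just above a whole number.
    return max(1, math.ceil(load - 1e-9))


def effective_fps(capture_fps, seconds_per_frame):
    """Frame rate of the spectral video after skipping frames."""
    return capture_fps / frame_stride(capture_fps, seconds_per_frame)


# Example: 25 frames per second input and a hypothetical 0.1 s per estimated frame.
print(frame_stride(25, 0.1), effective_fps(25, 0.1))
```

With a faster processor or a cheaper estimation method (for example 3 terms instead of 6), the stride drops toward 1 and every captured frame can be estimated.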
A Experimental Data
Figure 4(a) shows several color images converted from the selected spectral images (using standard illuminant D65 and the CIE color matching functions). Four RGB videos (captured during surgery of different patients) are available for this research. These videos were captured using a frame grabber. Each video was captured at a spatial resolution of 720 × 576 and 25 frames per second. Figure 4(b) shows an example of a frame taken from each RGB video.
References
1. Akbari H, Kosugi Y (2009) Hyperspectral imaging: A new modality in surgery. INTECH Open Access Publisher
2. Antikainen J, von Und Zu Fraunberg M, Orava J, Jaaskelainen JE, Hauta-Kasari M (2011) Spectral imaging of neurosurgical target tissues through operation microscope. Optical review 18(6):458–461
3. Ben-Israel A, Greville TN (2003) Generalized inverses: theory and applications, vol 15. Springer Science & Business Media
4. Cheng LJ, Chao TH, Dowdy MW, LaBaw CC, Mahoney JC, Reyes GF, Bergman K (1993) Multispectral imaging systems using acousto-optic tunable filter. In: OE/LASE'93: Optics, Electro-Optics, & Laser Applications in Science & Engineering, International Society for Optics and Photonics, pp 224–231