From Machine Learning to Transfer Learning in Laser-Induced Breakdown Spectroscopy: the Case of Rock Analysis for Mars Exploration
Chen Sun, Weijie Xu, Yongqi Tan, Yuqing Zhang, Zengqi Yue, Sahar Shabbir, Mengting Wu, Long Zou, Fengye Chen, Jin Yu
School of Physics and Astronomy, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
* Corresponding author: Jin Yu
E-mail: [email protected]
Abstract
With the ChemCam instrument, laser-induced breakdown spectroscopy (LIBS) has successfully contributed to Mars exploration by determining the elemental compositions of the soil, crust and rocks. Two newly launched missions, the Chinese Tianwen 1 and the American Perseverance, will further increase the number of LIBS instruments on Mars after the planned landings in spring 2021. Such an unprecedented situation requires a reinforced research effort on the methods of LIBS spectral data treatment. Although matrix effects are a general issue in LIBS, they become accentuated in the case of rock analysis for Mars exploration, because of the large variation of rock composition, leading to the chemical matrix effect, and the difference in morphology between the laboratory standard samples (pressed pellets, glasses or ceramics) used to establish calibration models and the natural rocks encountered on Mars, leading to the physical matrix effect. The chemical matrix effect has been tackled in the ChemCam project with large sets of laboratory standard samples offering a good representation of the various compositions of Mars rocks. The present work deals with the physical matrix effect, which still awaits a satisfactory solution. The approach consists in introducing transfer learning in LIBS data treatment. For the specific case of total alkali-silica (TAS) classification of natural rocks, the results show a significant improvement of the prediction capacity of pellet sample-based models when trained together with suitable information from rocks in a procedure of transfer learning. The correct classification rate of rocks increases from 33.3% with a machine learning model to 83.3% with a transfer learning model.

Introduction
It is generally considered that the matrix effects, both the chemical matrix effect and the physical matrix effect, represent a critical issue in analyses with laser-induced breakdown spectroscopy (LIBS), for either qualitative classification or quantitative determination. Suitable solutions become of primary importance for applications as important as LIBS analysis of rocks in Mars exploration, where the targeted scientific goals (searching for present and past water activity and for traces of life, as well as studying the habitability of Mars) rely, at least partially, on the reliability and the accuracy of the analytical data that one can extract from LIBS spectra recorded by LIBS instruments on board Mars rovers. Although the diversity of the chemical composition of Mars rocks has been studied by previous missions, the absence of real samples from Mars, except meteorites, requires a large number of laboratory rock standard samples to be prepared with natural Earth rocks or by mixing pure chemical compounds, in order to cover the chemical variety of Mars rocks. This was the purpose of the sets of laboratory standard rock samples prepared and used by the ChemCam team for training and validation of Mars LIBS spectral data processing models. The number of samples involved was first 69 and was later increased to 408 in order to offer a larger coverage of the chemical and mineral compositions of Mars rocks. It is useful and important to point out that all the laboratory rock standards were prepared in the form of pressed powder disks, glasses and ceramics to minimize heterogeneity on the scale of LIBS observation, typically several hundred µm. Such sample preparation leads to obvious differences in surface morphology and physical properties between laboratory standards and real rocks analyzed by LIBS on Mars, differences from which the physical matrix effect can result.
With this concern, the effect of surface roughness on the hydrogen emission line has been investigated. Our recently published work observed and analyzed the performance of a machine learning-based model, trained with a set of pressed rock powder pellets, for total alkali-silica (TAS) classification of rocks in their natural state. A significant degradation of the model prediction performance, compared to the prediction for pellet samples, was observed. Such degradation prevents models trained with laboratory standards from making reliable predictions with LIBS spectra acquired in situ on real rock samples, a situation which can become critical in an application such as the in situ LIBS analysis of Mars rocks mentioned above, since we are not yet able to bring samples back from Mars.

In order to seek a solution to this issue, transfer learning is introduced in this work for LIBS spectral data treatment. Transfer learning is considered in machine learning when the knowledge gained while solving one problem needs to be applied to a different but related problem. Its necessity comes from the fact that a major assumption in many machine learning and data mining algorithms is that the training data and the data targeted by the model share the same feature space and have the same distribution. This is unfortunately not the case for the application scenario that we consider. Moreover, transfer learning has recently emerged as a new learning framework to address the problem of insufficient training data in an application (target domain) with the help of the knowledge learnt from a related application where sufficient training data are easy to obtain (source domain). Such a strategy fits the requirement of LIBS analysis of Mars rocks, where sufficient laboratory standards can be prepared as the source domain, while real Mars rock samples are not yet available as the target domain.
According to the specific content of the “knowledge” to be transferred, we can distinguish feature-representation-transfer, where relevant features from both the source and the target domains are further selected for their low sensitivity to the difference between the two domains, to form a common set of transfer features contributing to the training of a transfer learning model. Instance-transfer is another form of transfer learning, where data from both the source and the target domains participate in the training of the transfer learning model, with a conditional test of the relevance of each data item from the source domain for its effect in improving the performance of the model during a cross-validation with data from the target domain. A weight is then applied to each source domain data item according to its efficiency in improving the performance of the model for prediction with target domain data.

More specifically, in our experiment, on the basis of the LIBS spectra from a set of laboratory standard samples in the form of pressed pellets, a machine learning-based multivariate model was trained and used to predict the concentrations of major oxides,
SiO₂, Na₂O and K₂O, with the spectra from the corresponding natural rock samples, concentrations necessary for the TAS classification of the rocks. The influence of the physical matrix effect was observed. A modified model training procedure was then applied, with the addition to the training sample set of a small number of natural rocks with certified concentrations of the major oxides. A transfer learning-based model was thus trained with the implementation of feature-representation-transfer and instance-transfer, and used to predict the concentrations of the major oxides for the rock samples. In the following, we first briefly present the samples used, the experimental setup and the protocol. We then describe the training methods of the machine learning- and transfer learning-based models, before presenting and comparing the performances of both models in order to draw the conclusion of the work.

Figure 1: Presentation of the rock samples used in a TAS diagram according to their major oxide concentrations determined using XRF. The short notations of the 15 fields (surrounded by circles) are according to Reference 14. Eight rocks were selected as training samples (represented by red crosses in the figure): S3, S7, S8, S11, S13, S14, S18 and S19. The remaining 12 samples were used as validation samples (represented by blue dots in the figure).

Samples and Experimental setup
Sample
In this work, 20 natural terrestrial rocks were used as samples for LIBS analysis. The rocks were first washed using alcohol and distilled water before any further preparation. All the rocks were analyzed in 2 different forms. Raw rocks: LIBS measurements took place on the natural surface of each rock. Pellets: a part of each rock was crushed and ground into powder by a laboratory mill and then sieved with a 300-mesh screen (grain size < 50 µm). A binder (microcrystalline cellulose powder) with similar particle size was mixed with the rock powder at a weight ratio of 20%. One gram of the obtained powder was pressed under a pressure of 850 MPa for 30 minutes to form a pellet of 15 mm diameter and 2 mm thickness. The composition of each rock, especially the concentrations of the major oxides SiO₂, Na₂O and K₂O, was determined by XRF with the pellets, which allowed presenting the rocks in a TAS diagram as shown in Figure 1.

Experimental setup

Figure 2: Schematic presentation of the experimental setup used, together with plasma images respectively induced from a pellet and a rock, and typical LIBS spectra showing the difference in emission intensity of Si, Na and K between a pellet and the corresponding rock sample.

A detailed description of the experimental setup can be found elsewhere (Reference 12). Briefly, as shown in Figure 2, a Q-switched Nd:YAG laser operated at a wavelength of 1064 nm, with a pulse duration of 7 ns and a repetition rate of 10 Hz, was used to ablate the samples with a pulse energy of 8 mJ. A lens of 50 mm focal length focused the laser pulses about 0.86 mm below the surface of a sample. The diameter of the focused laser spot on the sample surface was estimated to be 150 µm, corresponding to a laser fluence on the sample surface of about 45 J/cm², or an irradiance of about 6.5 GW/cm². Emission from the generated plasma was collected by a combination of two quartz lenses with the same focal length of 75 mm into an optical fiber of 50 µm core diameter.
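As a quick consistency check of the fluence and irradiance quoted above, the following sketch recomputes both from the pulse energy, spot diameter and pulse duration. It assumes a uniform circular spot and a rectangular temporal pulse profile, which are simplifications not stated in the text; the function name is hypothetical.

```python
import math

def fluence_and_irradiance(pulse_energy_j, spot_diameter_cm, pulse_duration_s):
    """Return (fluence in J/cm^2, irradiance in W/cm^2).

    Assumes a uniform circular spot and a rectangular pulse in time.
    """
    area_cm2 = math.pi * (spot_diameter_cm / 2.0) ** 2
    fluence = pulse_energy_j / area_cm2
    irradiance = fluence / pulse_duration_s
    return fluence, irradiance

# 8 mJ pulse, 150 um spot diameter (0.015 cm), 7 ns pulse duration
fluence, irradiance = fluence_and_irradiance(8e-3, 150e-4, 7e-9)
print(f"fluence = {fluence:.1f} J/cm^2")
print(f"irradiance = {irradiance / 1e9:.2f} GW/cm^2")
```

Under these assumptions the values land close to the rounded figures of about 45 J/cm² and 6.5 GW/cm² given in the text.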
The output of the fiber was connected to the entrance of an echelle spectrometer equipped with an ICCD camera (Mechelle 5000 and iStar, Andor Technology), which provided a wide spectral range from 230 nm to 900 nm with a spectral resolving power of 5000. The ICCD camera was triggered by the laser pulses with a delay and a gate width of 500 ns and 2000 ns respectively. A lateral CCD camera (not shown in the figure) allowed capturing time-integrated plasma images, as shown in the inset of Figure 2. The samples were mounted on a 3D translation stage allowing replicate spectra to be recorded on a sample surface with a matrix of ablation craters, while keeping a constant distance between the focusing lens and the sample surface.

Spectrum acquisition.
For each sample, 50 replicate spectra were taken on 50 different ablation craters, and each crater received 25 successive laser shots. The emission spectra induced by the first 5 laser shots were removed in order to avoid surface contamination, and those induced by the subsequent 20 laser shots were accumulated to produce a replicate spectrum. Such a procedure was also intended to reduce the difference in surface roughness between the 2 different forms of the samples. In total, 2000 spectra were recorded for the 20 rocks with the 2 different forms for each of the rocks. Typical spectra are presented in the inset of Figure 2 for a pellet and the corresponding rock sample, showing different line intensities of Si, Na and K. Such a difference corresponds well to that observed for the plasma images induced with the different types of samples.
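The acquisition protocol above can be illustrated with a minimal sketch. It assumes that single-shot spectra are available as an array of shape (craters, shots, channels), which the text does not specify (the accumulation may equally be done on the detector); the names and the 8-channel demo data are hypothetical.

```python
import numpy as np

N_CRATERS = 50    # replicate spectra per sample, one per ablation crater
N_SHOTS = 25      # successive laser shots per crater
N_CLEANING = 5    # first shots discarded to avoid surface contamination

def replicate_spectra(single_shot):
    """Accumulate the shots kept for each crater.

    single_shot: array of shape (N_CRATERS, N_SHOTS, n_channels).
    Returns an array of shape (N_CRATERS, n_channels).
    """
    single_shot = np.asarray(single_shot, dtype=float)
    kept = single_shot[:, N_CLEANING:, :]   # shots 6 to 25
    return kept.sum(axis=1)

# Synthetic demo with 8 channels only, to keep the example small
rng = np.random.default_rng(0)
demo = rng.random((N_CRATERS, N_SHOTS, 8))
replicates = replicate_spectra(demo)

# 50 replicates x 20 rocks x 2 forms (raw rock and pellet)
total_spectra = N_CRATERS * 20 * 2
print(replicates.shape, total_spectra)
```

The last line reproduces the count of 2000 spectra stated in the text.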
Data Treatment method.
The general data treatment flowchart used in this work is shown in Figure 3. Several steps can be distinguished: pretreatment, feature selection, machine learning (ML) and transfer learning (TL) model training, and model validation. The LIBS spectra from the pellet samples were used as training data, while those from the rocks were separated into a training data set containing 8 samples and a validation data set containing 12 samples.
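As a rough illustration of three of the pretreatment operations of this work (averaging of replicate spectra, normalization with the total intensity, and standardization), the sketch below may help. Several points are assumptions rather than the authors' implementation: the replacement of spectra in the group of 30 is taken to be cumulative, the area under a spectrum is approximated by the sum over channels, and the standardization is applied per spectral channel with parameters learnt on a training set. Baseline correction is omitted, and all function names are hypothetical.

```python
import numpy as np

def rolling_averages(replicates, n_first=30, seed=0):
    """Average a first random group, then swap in the remaining spectra one by one.

    replicates: (n_rep, n_channels). Returns (n_rep - n_first + 1, n_channels).
    """
    replicates = np.asarray(replicates, dtype=float)
    order = np.random.default_rng(seed).permutation(len(replicates))
    group = list(order[:n_first])
    averages = [replicates[group].mean(axis=0)]
    for pos, new_idx in enumerate(order[n_first:]):
        group[pos % n_first] = new_idx          # replace one member of the group
        averages.append(replicates[group].mean(axis=0))
    return np.vstack(averages)

def normalize_total_intensity(spectra):
    """Divide each spectrum by its summed intensity (proxy for the area)."""
    spectra = np.asarray(spectra, dtype=float)
    return spectra / spectra.sum(axis=1, keepdims=True)

def fit_standardizer(train):
    """Per-channel mean and SD of a training set."""
    mean, sd = train.mean(axis=0), train.std(axis=0)
    return mean, np.where(sd == 0, 1.0, sd)     # guard against constant channels

def standardize(spectra, mean, sd):
    """Apply previously fitted parameters, e.g. to a validation set."""
    return (spectra - mean) / sd

# Synthetic demo: 50 replicate spectra with 16 channels
raw = np.random.default_rng(1).random((50, 16)) + 0.1
avg = normalize_total_intensity(rolling_averages(raw))
mean, sd = fit_standardizer(avg)
z = standardize(avg, mean, sd)
print(avg.shape)
```

With 50 replicates and a first group of 30, the procedure yields 21 average spectra per sample, as stated in the text.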
Data treatment
The pretreatment consisted of the following operations. i) Averaging, in order to reduce experimental fluctuations and the effect of sample inhomogeneity: for each sample, the 50 raw spectra were averaged in a procedure where a first average spectrum was calculated with a first group of 30 randomly selected spectra. The remaining 20 spectra then replaced, one by one, a spectrum in the first group; each time, the new group of 30 spectra was averaged, which generated 20 other average spectra. In total, 21 average spectra were generated for each sample. ii) Baseline correction: an average spectrum was decomposed into a set of cubic splines of undecimated wavelet scales, the local minima were found, then a spline function was interpolated through the different minima to construct the spectral baseline, which was removed. iii) Normalization: baseline-corrected average spectra were normalized with their respective total intensity (the area under the spectrum). iv) Standardization: standard normal variate (SNV) transformation was respectively applied to the normalized baseline-corrected average spectra of the pellets (20 × 21 spectra) and of the training set of the rock samples (8 × 21 spectra), resulting in a mean equal to 0 and a standard deviation (SD) equal to 1. The parameters determined in the standardization of the training set of the rock samples (the mean and the SD) were applied to the validation set of the rock samples (12 × 21 spectra).

Figure 3: General flowchart used in this work, allowing a comparative study between the performances of a machine learning (ML) model and those of a transfer learning (TL) model.

Spectral feature selection.
The SelectKBest (SKB) algorithm was respectively applied to the pretreated spectra of the pellets and of the training set of the natural rocks, and successively for the 3 concerned oxides. Within a sample set, for each spectral channel, the covariance was calculated between the channel intensity and the concentration of the concerned compound in the corresponding sample, over all the spectra of the sample set. A score was then calculated as a function of the covariance according to the definition given in Reference 13. A ranking index, ρ_{i,j}, carrying 2 indices and taking a value from 1 to 22161, was thus associated with each spectral channel, which ranks the channels from the lowest score to the highest one. Such a procedure was applied to the 2 sample sets (i = 1: pellets, i = 2: calibration set of rocks) and the 3 concerned oxides (j = 1: SiO₂, j = 2: Na₂O, j = 3: K₂O). A feature selection procedure first identified the 100 highest ranked spectral channels for each of the 3 oxides in each of the 2 sample sets. Pearson's correlation coefficient related to the above-mentioned covariance was calculated for the 6 groups of 100 selected features. The results showed that all the selected features had a Pearson's coefficient larger than 0.75.

As we can see in Figure 3, the 3 groups of 100 features selected for the 3 oxides in the pellet sample set were directly used to train the respective calibration models for the 3 oxides, based on a back-propagation neural network (BPNN). The training algorithm, which involves stochastic gradient descent (SGD) and mini-batch stochastic gradient descent (MSGD) optimization iterations, as well as cross-validation with randomly generated statistically equivalent data configurations, has been presented in detail in Reference 13.
We will not go into more detail about this training algorithm in this paper.

For the transfer learning model training, and according to the principle of feature-representation-transfer discussed above, an ensemble of common selected features was identified between the pellet and the training rock sample sets, by calculating a total ranking index ρ_j = ρ_{1,j} + ρ_{2,j}. A feature selection procedure then retained the 100 highest ranked features according to the value of ρ_j, from the highest one to the lowest one, respectively for the 3 oxides. These groups of features were fed into the transfer learning model training algorithm. As an example, the results of feature selection for Na₂O are shown in Figure 4; similar behaviors can also be observed in the feature selections for the other 2 oxides.

Figure 4: Results of feature selection for Na₂O: (a) for pellets and (b) for calibration rocks, SKB scores of all the spectral channels, with the 100 selected features in red dots; (c) total ranking index of all the spectral channels, with the 100 common selected features in red dots; (d) a typical normalized average spectrum from a pellet sample, with the 100 common selected features in red dots and 2 insets showing enlarged parts of the spectrum around 589 nm and 820 nm.

In Figure 4 (a), we can see that for the pellets, the spectral channels with high SKB scores are clearly concentrated around several Na emission lines: the Na I 330.24 nm and 330.30 nm lines, the sodium D lines (Na I 588.99 nm and 589.59 nm; 2 groups of ghost lines around 572.1 nm and 606.9 nm are recorded due to these strong lines), and the Na I 818.33 nm and 819.48 nm lines.
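The selection of common features by the total ranking index ρ_j = ρ_{1,j} + ρ_{2,j}, described before Figure 4, can be sketched as follows. The score used here, the squared Pearson correlation between channel intensity and concentration, is only a stand-in assumption: the score actually used is defined in Reference 13 and is not reproduced in this text. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def channel_scores(spectra, conc):
    """Stand-in score: squared Pearson correlation of each channel with the concentration."""
    x = spectra - spectra.mean(axis=0)
    y = conc - conc.mean()
    cov = x.T @ y / len(y)
    denom = spectra.std(axis=0) * conc.std()
    denom = np.where(denom == 0, np.inf, denom)   # constant channels get score 0
    return (cov / denom) ** 2

def ranks(scores):
    """Rank 1 for the lowest score, rank n for the highest score."""
    order = np.argsort(scores, kind="stable")
    r = np.empty(len(scores), dtype=int)
    r[order] = np.arange(1, len(scores) + 1)
    return r

def common_features(pellet_X, pellet_y, rock_X, rock_y, n_keep=100):
    """Channels with the highest total ranking index over the 2 sample sets."""
    total = ranks(channel_scores(pellet_X, pellet_y)) + ranks(channel_scores(rock_X, rock_y))
    return np.sort(np.argsort(total)[::-1][:n_keep])

# Synthetic demo: 20 channels, of which channels 3 and 7 follow the concentration
rng = np.random.default_rng(0)

def make_set(n):
    y = rng.uniform(1, 10, n)
    X = rng.normal(size=(n, 20))
    X[:, 3] = 2.0 * y
    X[:, 7] = -3.0 * y
    return X, y

pellet_X, pellet_y = make_set(40)
rock_X, rock_y = make_set(30)
selected = common_features(pellet_X, pellet_y, rock_X, rock_y, n_keep=2)
print(selected)
```

Summing ranks rather than raw scores gives both sample sets the same weight, so a channel that is informative only for the rocks can still enter the common set.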
For the calibration rocks in Figure 4 (b), the selected features are distributed among other channels, with a significant decrease of the scores for all the important features. This means that the physical matrix effect perturbs the inherent correlation between the emission line intensities of an element and its concentration in the material, and therefore reduces their importance in the concentration determination. At the same time, other spectral channels, such as those around 275 nm and between 410 nm and 460 nm, get relatively higher scores. This means that they become important in the determination of elemental concentration when using a model based on the calibration set of the rock samples. These features, representative of the rock samples, are thus included in the common selected features for transfer learning model training. Figure 4 (c) shows the total ranking index for Na₂O and, in red dots, the 100 common selected features. These features are indicated in red dots in a typical spectrum in Figure 4 (d). We can see that, besides the features related to the Na emission lines, some features important for the rock samples are included. A more detailed peak identification using the NIST database shows the contributions from the Fe II 268.475 nm and 275.57 nm lines and the Si II 385.366 nm and 385.602 nm lines, and the probable contributions from the K I 404.414 nm and 404.721 nm lines, the Ca I 409.85 nm line, and the Si II 412.807 nm and 413.089 nm lines. A selected feature around 461 nm has no straightforward interpretation.

In the insets of Figure 4 (d), 2 parts of the spectrum are enlarged. The inset around 589 nm shows the sodium D lines together with the selected features in red dots. We can see that the selected features are located in the wings of a line profile, while the central part of the line is not retained by the feature selection procedure. This is due to self-absorption of the strong resonant Na D lines, which affects the central part of the spectral lines much more.
This observation shows the capability of the feature selection procedure to reduce the influence of self-absorption. The second inset in Figure 4 (d) shows an enlarged part of the spectrum around 820 nm, where we can see, in red dots, the selected features related to the Na I 819.5 nm line. Due to the interference with the N I 820.0 nm line, only the short wavelength part of the spectral profile around 820 nm is included in the selected features, showing the efficiency of the selection procedure in avoiding the influence of spectral interference.

Transfer learning-based calibration model training.
A training algorithm for the transfer learning model was developed in this work on the basis of that used for machine learning model training, presented in detail in our previous publication and used in various application scenarios. The flowchart of the transfer learning model training is shown in Figure 5. We can distinguish 3 main steps: data formatting, model training by optimization through iteration loops, and model validation. Training was performed separately for the 3 concerned oxides, resulting in 3 specific models.

The optimization and the assessment of the models were performed in this work using a number of indicators specified below: the determination coefficient r² of a linear regression, indicating the correlation of the calibration data with respect to the regression model; the limit of detection LOD of a model; the average relative error of calibration REC (%), assessing the accuracy of a calibration model to be tested; the average relative error of test RET (%), assessing a tested model to be validated; the average relative error of prediction REP (%), assessing the trueness of the model-predicted concentrations; and the average relative standard deviation RSD (%), assessing the precision of the model-predicted concentrations. The mathematical definitions of these parameters can be found elsewhere, in particular in References 13 and 27.
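To give a concrete idea of the two prediction indicators, the sketch below computes an average relative error and an average relative standard deviation. The formulas are commonly used forms adopted here as assumptions; the exact definitions used in this work are those of References 13 and 27, which are not reproduced in the text. Function names and demo values are hypothetical.

```python
import numpy as np

def average_relative_error(pred, ref):
    """Assumed form: mean over samples of |mean prediction - reference| / reference, in %.

    pred: (n_samples, n_spectra) predicted concentrations.
    ref:  (n_samples,) reference concentrations.
    """
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    return 100.0 * np.mean(np.abs(pred.mean(axis=1) - ref) / ref)

def average_rsd(pred):
    """Assumed form: mean over samples of SD / mean of the replicate predictions, in %."""
    pred = np.asarray(pred, float)
    return 100.0 * np.mean(pred.std(axis=1, ddof=1) / pred.mean(axis=1))

# Two samples with two replicate predictions each (arbitrary demo values)
pred = [[9.0, 11.0], [22.0, 18.0]]
ref = [10.0, 25.0]
rep = average_relative_error(pred, ref)
rsd = average_rsd(pred)
print(f"REP = {rep:.1f} %, RSD = {rsd:.1f} %")
```

In this form, the relative error reflects trueness (bias with respect to the reference) and the relative standard deviation reflects precision (scatter among replicates), matching the roles given to REP and RSD in the text.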
Data formatting.
According to the principles of feature-representation-transfer and instance-transfer in transfer learning discussed above, spectra from the pellet samples (the source domain) and those from the calibration set of the rock samples (the target domain), each with the 100 common selected features, participated in the training process. All the 20 pellet samples with their average replicate spectra were initially involved in the training data set. These spectra were organized in a given data configuration where the replicate spectra of each sample were arranged in an arbitrary order. The efficiency of each pellet was tested within an iteration loop where the RETs with and without the spectra from the pellet were compared in order to decide the exclusion or the definitive inclusion of the pellet in the training sample set of the transfer learning model. This is why the ensemble of replicate spectra associated with each of the 20 pellets was indexed with k, which went from 1 to 20 (Figure 5 (a)), in order to test the efficiency of all the pellet samples during the training process. Eight chosen rock samples contributed to the training data set of the transfer learning model (S3, S7, S8, S11, S13, S14, S18 and S19 in Figure 1). In particular, they were used in a cross-validation process during the optimization of the neural network. This is why the spectra of this data set were first organized in different data configurations, where each configuration j corresponded to a certain arrangement of the replicate spectra of each rock (Figure 5 (a)). The data configurations were all statistically equivalent, since the order of a replicate spectrum of a sample was a dummy index. The number of different data configurations was limited to 3 in this work, because more configurations did not bring further improvement of the model, as tested in the work. For a given configuration j, the spectra were further organized into 5 groups of replicates containing 4, 4, 4, 4 and 5 spectra respectively. A new index i was introduced to designate a group of replicate spectra of all the rock calibration samples, as shown in Figure 5 (a). During the model training process, the index i went from 1 to 5 within an iteration loop of cross-validation, indicating each time the validation group of replicates.

Figure 5: Flowchart of the transfer learning model training with the implementation of feature-representation-transfer and instance-transfer.

Model training by optimization.
A 3-layer back-propagation neural network (BPNN), similar to that used in Reference 13, was employed in this work for the transfer learning model. The network was composed of an input layer of 100 neurons corresponding to the 100 common selected features of each input spectrum, a hidden layer of 5 neurons and an output layer with a single neuron corresponding to the targeted compound concentration. The function of the network was therefore to map an input spectrum (a vector of 100 dimensions) to a scalar, which can be considered as the modulus of a vector in a hyperspace. The precision of the mapping was improved during the training process through different iteration loops, under the supervision of the targeted concentrations, using the model performance indicators specified above.

As shown in Figure 5 (b), 3 hierarchized iteration loops (i, j, k, among which i and j are doubled loops, for ±k) surrounding the BPNN optimization loop performed the supervised optimization of the model:

- A doubled inner loop for i = 1 to 5: for the two cases of a given sample k in the pellet sample set being excluded (−k) and included (+k) in the training data set, and for a given data configuration of the rock spectra, the network was optimized within a cross-validation process where the model was trained using the ensemble of 4 groups of replicate spectra, for example i = 2, 3, 4, 5. REC(ij − k) and REC(ij + k) were calculated for the respectively optimized Models for test (ij − k) and (ij + k). These models were then tested using the ensemble of replicate spectra of the remaining i-th group, i = 1 for example, generating RET(j − k) and RET(j + k), together with the optimized Models for test (j − k) and (j + k).

- A doubled intermediary loop for j = 1 to 3: in this loop, the loop i discussed above was executed with 3 independent rock data configurations for the 2 cases of a given sample k in the pellet set being excluded from or included in the training data set.
The model was further optimized. The corresponding calculation of the values of RET resulted in RET(j − k) and RET(j + k).

- An outer loop for k = 1 to 20: in this loop, the loops i and j discussed above were executed for each of the 20 pellet samples successively assigned as the k-th pellet. For a given pellet, RET(−k) and RET(+k) were compared. If an improvement was observed with the sample k, it was kept in the training sample set; otherwise it was removed. This loop generated a Model for test (k) for each considered pellet sample, with the corresponding RET(k). The optimization process finally generated a Model for validation with a minimized RET.

Model validation.
The resulting transfer learning model was validated with the pretreated spectra from the validation set of the rock samples (12 spectra), using the features identified according to the common selected features between the pellet sample set and the training set of the rock samples. The parameters assessing the performance of the model for prediction, REP and RSD, were calculated. These parameters indicate the performance of the model when used for predictions with LIBS spectra from unknown rock samples.
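The instance-transfer logic of the training procedure (keep a pellet only if its inclusion lowers the error of test evaluated on rock spectra) can be summarized with the simplified sketch below. It is not the authors' algorithm: a ridge-regularized linear regressor stands in for the BPNN, a single rock data configuration replaces the 3 configurations, and the pellets are examined sequentially starting from the full set, which is an assumed reading of the outer loop. All names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear(X, y, ridge=1e-3):
    """Stand-in regressor (ridge least squares); the paper uses a BPNN instead."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ y)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

def cv_ret(pellet_X, pellet_y, rock_X, rock_y, rock_groups):
    """Relative error of test (%) averaged over cross-validation groups of rock spectra."""
    errors = []
    for g in np.unique(rock_groups):
        test = rock_groups == g
        X = np.vstack([pellet_X, rock_X[~test]])
        y = np.concatenate([pellet_y, rock_y[~test]])
        pred = fit_linear(X, y)(rock_X[test])
        errors.append(np.mean(np.abs(pred - rock_y[test]) / rock_y[test]))
    return 100.0 * float(np.mean(errors))

def select_pellets(pellet_X, pellet_y, pellet_ids, rock_X, rock_y, rock_groups):
    """Outer loop over pellets k: compare RET(+k) and RET(-k), drop k if it does not help."""
    pellet_ids = np.asarray(pellet_ids)
    kept = list(np.unique(pellet_ids))
    for k in list(kept):
        with_k = np.isin(pellet_ids, kept)
        without_k = with_k & (pellet_ids != k)
        ret_plus = cv_ret(pellet_X[with_k], pellet_y[with_k], rock_X, rock_y, rock_groups)
        ret_minus = cv_ret(pellet_X[without_k], pellet_y[without_k], rock_X, rock_y, rock_groups)
        if ret_plus >= ret_minus:
            kept.remove(k)
    return kept

# Synthetic demo: rocks follow y = 2x + 1; pellet 2 carries misleading targets
rock_X = np.linspace(1, 5, 24).reshape(-1, 1)
rock_y = 2 * rock_X[:, 0] + 1
rock_groups = np.arange(24) % 4
pellet_X = np.tile(np.linspace(1, 5, 6), 3).reshape(-1, 1)
pellet_ids = np.repeat([0, 1, 2], 6)
pellet_y = 2 * pellet_X[:, 0] + 1
pellet_y[pellet_ids == 2] = 100.0
kept = select_pellets(pellet_X, pellet_y, pellet_ids, rock_X, rock_y, rock_groups)
print(kept)
```

In this toy case the pellet with misleading targets is excluded because its inclusion degrades the prediction for the rock spectra, which is the behavior the selection loop is meant to provide.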
RESULTS AND DISCUSSION
Analytical Performances with the machine learning model.
We first present the results obtained with the models trained with the 20 pellet samples and validated with the 12 validation rock samples for the 3 concerned oxides, SiO₂, Na₂O and K₂O. The training method described in Reference 13 was implemented in this work to train a neural network. The training procedure was similar to the inner (loop i) and the intermediary (loop j) iteration loops used in the transfer learning model training (Figure 5 (b)), with a similar neural network structure. As shown in Figure 3, the input variables were the 100 selected features in a spectrum of a pellet sample for the training and the 100 identified features in a spectrum of a validation rock sample for the validation. For the cross-validation optimization in the training process, similarly to the transfer learning model training, 3 × [...] SiO₂, Na₂O and K₂O. The obtained results are shown in Figure 6. The extracted parameters for the assessment of the model performances are presented in Table 1.

In Figure 6 and Table 1, we can see that the machine learning calibration models trained with pellet samples present good performances in terms of the usual assessment parameters, including r², LOD, REC and RET. As we have pointed out in Reference 12, this indicates an efficient correction of the chemical matrix effect with machine learning. At the same time, we can notice a large degradation in performance, in terms of REP and RSD, when the models were validated with the rock validation samples. Figure 6 shows that the use of the pellet models for prediction with the spectra from the rock validation samples can lead to a systematic bias, with a shift of the linear regression of the validation data with respect to the model, as shown in Figure 6 (b) for Na₂O, as well as to variance, with a change of the slope of the linear regression of the validation data with respect to the model, as shown in Figure 6 (a) and (c).

These results show the influence of the physical matrix effect when the models trained with pellet samples were used for prediction for rock samples. As a consequence, the TAS classification of the validation rocks with the pellet machine learning models resulted in an unsatisfactory performance, as shown in Figure 7. In this figure, the reference position in the TAS diagram of each rock, determined by the compound concentrations measured using XRF (as shown in Figure 1), is indicated with a colored solid circular point. With the same color, the position predicted by the pellet machine learning models for the same rock is represented by a cross with error bars. More precisely, the cross represents the mean position calculated with the 21 pretreated validation spectra. The error bars represent the standard deviations (± SD) of the concentrations; in particular, the vertical error bar was obtained by summing the SDs of the 2 concerned compounds. A dash-dot line further links the reference and the predicted positions of the same rock sample in order to explicitly indicate their correspondence. Such a presentation thus allows calculating the rate of correct classification, ρ, for the validation rock samples in a TAS diagram according to their compound concentrations determined using the machine learning calibration models, as compared to their reference positions determined by the XRF concentrations.
If the pellet model-predicted position of a sample stays in the same TAS field as its XRF reference position, it is correctly classified. In Figure 7, we can see only 4 correct classifications (S1, S2, S5 and S10), corresponding to a correct classification rate of 33.3%.

Table 1: Parameters assessing the calibration and prediction performances of the machine learning calibration models for SiO₂, Na₂O and K₂O.

Compound               SiO₂     Na₂O     K₂O      Average
Calibration  r²
             Slope
             LOD (%)   5.14     0.622    1.01     2.26
             REC (%)   5.61     9.84     3.75     6.40
             RET (%)   7.42     10.5     5.20     7.72
Validation   REP (%)   13.76    176.2    82.28    90.75
             RSD (%)   21.94    9.440    271.8    101.1
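The construction of a predicted TAS position with its error bars, and the rate of correct classification quoted before Table 1, can be sketched as follows. The assignment of a position to a TAS field requires the field boundaries of Reference 14, which are not reproduced here, so the field labels in the demo are placeholders. The use of the sample standard deviation (ddof=1) is also an assumption, as are the function names.

```python
import numpy as np

def tas_position(sio2, na2o, k2o):
    """Mean TAS coordinates and error bars from per-spectrum predicted concentrations.

    x is the SiO2 concentration, y the total alkali (Na2O + K2O) concentration.
    The vertical error bar sums the SDs of the 2 alkali oxides, as in the text.
    """
    sio2, na2o, k2o = (np.asarray(a, float) for a in (sio2, na2o, k2o))
    x = sio2.mean()
    y = na2o.mean() + k2o.mean()
    x_err = sio2.std(ddof=1)
    y_err = na2o.std(ddof=1) + k2o.std(ddof=1)
    return x, y, x_err, y_err

def correct_rate(predicted_fields, reference_fields):
    """Percentage of samples whose predicted TAS field matches the reference field."""
    hits = sum(p == r for p, r in zip(predicted_fields, reference_fields))
    return 100.0 * hits / len(reference_fields)

# Placeholder labels: 4 of the 12 validation rocks fall in the correct field
reference = ["A"] * 12
predicted = ["A"] * 4 + ["B"] * 8
rate = correct_rate(predicted, reference)
print(f"{rate:.1f} %")
```

With 4 correct classifications out of 12 validation rocks, this gives the 33.3% reported for the machine learning models.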
Analytical Performances with the transfer learning model.
In parallel to the results with the machine learning models, and in order to review the improvements with the transfer learning models, the calibration models resulting from transfer learning are shown in Figure 8. The extracted parameters for the assessment of the model performances are presented in Table 2. In Table 2, we can see that although the transfer learning models present slightly lower calibration performance in terms of r², the slope, LOD, REC and RET, the performance for prediction for rock samples is significantly improved, especially for REP. This means that the participation of the 8 rock samples in the training data set, together with the retained pellet samples, efficiently takes into account the physical matrix effect and reinforces the robustness of the models for prediction with rock samples. Correspondingly, in Figure 8, we can see significant reductions of the bias and variance of the predicted concentrations for the validation rock samples with respect to the calibration models trained with a part of the pellet samples and the training rock samples. In particular, for SiO₂, 18 pellet samples among the 20 were retained in the training data set by the optimization loop during the model training process. The numbers of retained pellet samples were 13 and 14 for Na₂O and K₂O respectively.

Figure 6: Machine learning-based calibration models trained with the spectra from the pellet samples (black lines), together with the calibration data (black open circles) for the 3 compounds SiO₂ (a), Na₂O (b) and K₂O (c). Validation data from the rock validation samples are presented in red crosses, their linear regressions in red lines. The error bars of the presented data correspond to the standard deviations (± SD) of the predicted concentrations over the 21 pretreated spectra for a given sample.

Figure 7: TAS classification of the validation rocks using machine learning models. The positions determined by XRF are presented in colored solid circles, the corresponding model-predicted positions are presented in the same color in crosses. A dashed line links the XRF reference position and the model-predicted position of the same rock sample. The error bars on the predicted position are calculated among the different pretreated spectra of a given validation sample.

The calibration models shown in Figure 8 were used to represent the validation rock samples in a TAS diagram. The obtained result is shown in Figure 9, using the same symbols as in Figure 7. We can see a much improved result, confirming the good performance of the transfer learning models shown in Figure 8 and Table 2. A detailed count shows 10 correctly classified validation rock samples. Only two samples were classified into a wrong field (S4 and S6), although they are very close to the borders separating the correct and wrong fields.
The rate of correct classification can thus be calculated to be 83.3% (10 of 12 samples).

Figure 8: Transfer learning-based calibration models trained with the pellet samples and the training set of the rock samples (black lines) together with the calibration data (black open circles) for the 3 compounds SiO2 (a), Na2O (b) and K2O (c). Validation data from the rock validation samples are presented as red crosses, their linear regressions as red lines. The error bars of the presented data correspond to the standard deviations (±SD) of the predicted concentrations over the 21 pretreated spectra for a given sample.

Table 2: Parameters assessing the calibration and prediction performances of the transfer learning calibration models for SiO2, Na2O and K2O.

Compound       SiO2     Na2O     K2O      Average
Calibration
r
Slope
LOD (%)        3.70     1.03     2.26     2.33
REC (%)        5.61     15.7     6.40     9.25
RET (%)        4.90     16.2     7.72     9.59
Validation
REP (%)        3.11     41.99    19.79    21.63
RSD (%)        24.07    21.29    101.1    48.82

Figure 9: TAS classification of the validation rocks using transfer learning models. The symbols used are the same as those in Figure 7.
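
The retention of only part of the pellet samples by an optimization loop, mentioned above, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the greedy backward elimination, the leave-one-rock-out error criterion, the plain least-squares regressor standing in for the actual calibration model, and the synthetic data. Only the sample counts (20 pellets, 8 training rocks) are taken from the text; the procedure actually used by the authors may differ.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in regressor)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def rock_cv_error(X_pel, y_pel, X_rock, y_rock):
    """Leave-one-rock-out RMSE: train on pellets plus the other rocks."""
    errs = []
    for i in range(len(X_rock)):
        keep = np.arange(len(X_rock)) != i
        X_tr = np.vstack([X_pel, X_rock[keep]])
        y_tr = np.concatenate([y_pel, y_rock[keep]])
        coef = fit_linear(X_tr, y_tr)
        errs.append(predict(coef, X_rock[i:i + 1])[0] - y_rock[i])
    return float(np.sqrt(np.mean(np.square(errs))))

def select_pellets(X_pel, y_pel, X_rock, y_rock):
    """Greedily drop pellet samples whose removal lowers the rock CV error."""
    retained = list(range(len(X_pel)))
    best = rock_cv_error(X_pel[retained], y_pel[retained], X_rock, y_rock)
    improved = True
    while improved and len(retained) > 1:
        improved = False
        for j in list(retained):
            trial = [k for k in retained if k != j]
            err = rock_cv_error(X_pel[trial], y_pel[trial], X_rock, y_rock)
            if err < best:
                best, retained, improved = err, trial, True
                break
    return retained, best

# Synthetic data: 20 pellets and 8 training rocks described by 3 features.
rng = np.random.default_rng(1)
w = np.array([1.0, -0.5, 2.0])
X_rock = rng.normal(size=(8, 3))
y_rock = X_rock @ w + rng.normal(scale=0.05, size=8)
X_pel = rng.normal(size=(20, 3))
y_pel = X_pel @ w + rng.normal(scale=0.05, size=20)
y_pel[:5] += 1.5  # a few pellets carry a hypothetical matrix-induced offset

baseline = rock_cv_error(X_pel, y_pel, X_rock, y_rock)
retained, best = select_pellets(X_pel, y_pel, X_rock, y_rock)
print(f"{len(retained)} of {len(X_pel)} pellets retained; "
      f"CV error {baseline:.3f} -> {best:.3f}")
```

A greedy search is used here only because it is short to write; it accepts a removal as soon as the error on the training rocks decreases, so it does not guarantee the best possible subset.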
CONCLUSIONS
In this work, within the specific application of rock classification using the TAS diagram, we have introduced transfer learning in LIBS spectral data treatment to improve the performance of models trained with laboratory standard samples in the form of pellets when they are used for prediction with LIBS spectra from natural rock samples. Such a scenario corresponds well to many important applications, such as Mars exploration with LIBS, where LIBS spectra acquired in situ using LIBS instruments onboard Mars rovers are treated with prediction models established using laboratory-prepared standard samples. Obvious differences in physical state between the laboratory standards and the natural rocks analyzed on Mars lead to an unavoidable physical matrix effect that needs to be corrected for accurate and precise determination of compound concentrations, which is the basis of the TAS classification of Martian rocks, among other geochemical analyses that are very important for the scientific objectives of the Mars exploration missions.
In particular, feature-representation-transfer and instance-transfer, two important aspects of transfer learning, were implemented in the LIBS spectral data treatment. For feature-representation-transfer, a set of common features was determined jointly from the ensembles of features respectively selected for the pellet samples and the training rock samples. These common selected features were then used as the input variables for the training and validation of the transfer learning models. Instance-transfer consisted in retaining, among the pellet samples, those that effectively improve the performance of the trained model in a cross-validation procedure with the training rock samples. The performances of the transfer learning models were compared with those of the machine learning models. Significant improvements have been observed for predictions with the LIBS spectra from the rock validation samples for the 3 compounds involved in the TAS classification, SiO2, Na2O and K2O. The rate of correct TAS classification has been improved from 33.3% with the machine learning models to 83.3% with the transfer learning models.
Our work therefore demonstrates the efficiency of transfer learning in the treatment of rock LIBS spectra using machine learning-based models, once a suitable set of rock samples is included in the model training process. Beyond Mars exploration with LIBS, similar scenarios also exist in LIBS industrial applications. Our findings in this work can thus be of more general interest for the development of the LIBS technique for various applications.
Author Contributions
CS studied and developed the data treatment method, wrote the corresponding computer programs, and wrote the draft of the paper. WX prepared the samples and acquired the LIBS spectra. YT participated in the development of the feature selection algorithm. YZ, ZY, SS, MW, LZ and FC participated in the experimental setup development and LIBS spectrum acquisition. JY supervised the research program and wrote the paper. The manuscript was written through the contributions of all the authors. All authors have given approval to the final version of the manuscript.
Notes
The authors declare no competing financial interest.
Acknowledgement
This work was supported by the Startup Fund for Youngman Research at SJTU and the National Natural Science Foundation of China [Grants 11574209, 11805126, 61975190].