An ad-hoc modified Likelihood Function Applied to Optimization of Data Analysis in Atomic Spectroscopy
AAn “ad-hoc” modified Likelihood Function Applied to Optimization of Data Analysis in Atomic Spectroscopy. *Leonardo Bennun * Applied Physics Laboratory, Physics Department, University of Concepcion; Casilla 160-C, Concepcion, Chile. E-mail: [email protected] (L. Bennun)
Running title: “A d — hoc ” Likelihood Function for optimization of spectroscopic results. Keywords: “ Ad — hoc ” Likelihood Function; Optimization of Statistical Results; Improving accuracy and uncertainty; Metrological traceability; Least Squares Methods; Spectral Analysis; Evaluation of Repeatability. Abstract:
In this paper we propose an “ ad — hoc ” construction of the Likelihood Function, in order to develop a data analysis procedure, to be applied in atomic and nuclear spectral analysis. The classical Likelihood Function was modified taking into account the underlying statistics of the phenomena studied, by the inspection of the residues of the fitting, which should behave with specific statistical properties. This new formulation was analytically developed, but the sought parameter should be evaluated numerically, since it cannot be obtained as a function of each one of the independent variables. For this simple numerical evaluation, along with the acquired data, we also should process many sets of external data, with specific properties — This new data should be uncorrelated with the acquired signal. The developed statistical method was evaluated using computer simulated spectra. The numerical estimations of the calculated parameter applying this method, indicate an improvement over accuracy and precision, being one order of magnitude better than those produced by least squares approaches. We still have to evaluate the improvement produced by this method over Detection and Quantitation Limits, in TXRF spectral analysis. INTRODUCTION
In atomic and nuclear spectroscopies, almost all of the methods for spectral analysis are based on the least squares algorithms. These data processing and interpretation techniques are usually applied indistinctly to linear or non—linear physical systems. The great success of the application of this method is based essentially on a deep agreement between the underlying physical phenomena studied and the applied mathematical theory. In most of the atomic and nuclear spectroscopic techniques, in order to obtain the sample/system characterization, we found a common sequence of events, which can be detailed as follows: a ) a generation of energetic particles required as an excitation source, b ) the de-excitation process of the sample, c ) the detection (usually trough a solid state detector) and d ) an acquisition process (typically the electronic chains are composed by a preamplifier, amplifier, and a multichannel analyzer). The intrinsic fluctuations of the excitation and de-excitation of the sample, being discrete events, are ruled by the Poisson´s Distribution like all of the atomic or nuclear interactions. Moreover, in the classical texts of Probability, the radioactive decay and nuclear decay eactions are used as iconic examples of the Poisson´s Statistics. This probability also rules the characteristic backgrounds proper of the matrix of the sample. At this stage, we should mention that the Poisson´s Distribution, when it is applied to a large number of events (usually, n >30) is very well described by a Normal Distribution. So, the joint probability that describes all kind or sequence of atomic or nuclear events in the sample, is a Normal Distribution. The Monte Carlo methods apply this property in order to infer the average behavior of the particles in the physical system from the average behavior of simulated particles.[1] At each one of the rest of the steps that follow the electrical signal leaving the detector until it is processed in the multichannel analyzer, it is affected by characteristics fluctuations, like temperature and gain voltage variations, electrical environmental noise, small systematics errors, etc. Again, by applying the Central Limit theorem, we can assure that the Distribution that rule the acquired atomic or nuclear signal, at each channel in the multichannel analyzer, would be a Normal Distribution. The Central Limit theorem is implicitly applied in the data processing of a large number of complex systems, which are studied with the least squares approach and many of their multiple variations.[2] These methodologies are applied with many strategies in computational mathematics, in order to optimize the results in scientific and engineering applications. Some of the areas of applications are: image and video processing, medical treatments, etc.[3] In the specific case of spectral analysis, the application of the Maximum Likelihood formulation is almost perfectly suited, since it is constructed considering a Normal Distribution at each channel. But, a question remains: How good is the quality of the results obtained from a Maximum Likelihood estimation for a given parameter? What the Likelihood Function is computing is how likely the measured data is to have come from the distribution assuming a particular value for the hidden parameter; the more likely this is, the closer one would think that this particular choice for hidden parameter is to the true value. So, in atomic spectroscopies, the results obtained from least squares algorithms should be the best, and no method of improvement of the results could be proposed. Moreover, the properties of the obtained results were largely studied. The evaluated parameters from the least squares formulation are unbiased (in the limit of infinite measurements) and have minimum variance among all unbiased linear estimators. This means that the estimates “get us as close to the true unknown parameter values as we can get”. For these reasons, an improvement on the quality of the results over those obtained by the least squares method seems to be unrealistic. Moreover, these results are considered as the limit of highest quality, to which tend the results, for instance, obtained from the Neural Network approach [4,5] when they are applied to data analysis in atomic or nuclear spectroscopies, and related systems. However, in another work [6] we devised a new smoothing method which was applied to simulated spectroscopic data, producing results with better accuracy than those obtained from the least squares approaches. In this paper, we propose a modification in the construction of the Likelihood function, which leads to a remarkable improvement on the quality of the results provided by the least squares approaches. This modification was made taking into account the underlying statistics of the phenomena studied, by the inspection of the residues of the fitting, which should behave with specific statistical properties. This new formulation was analytically developed, but the calculated parameter should be evaluated numerically, since it cannot be obtained as a function of each one of the independent variables. For the required numerical evaluation, along with the acquired spectrum, we should process many sets of external data with specific properties. This arbitrary term is a random sequence of data which is uncorrelated with the acquired signal. It should be ruled by a Gaussian distribution, having mean value zero and standard deviation ∆=1. his statistical method was evaluated using computer simulated spectra. The numerical estimations of the calculated parameter applying this method, indicate an improvement over accuracy and precision, one order of magnitude better than those produced by the least squares approaches. We still have to evaluate the improvement produced by this method over Detection and Quantitation Limits, in TXRF spectral analysis. THEORETICAL
A Maximum Likelihood estimate for some hidden parameter γ (or parameters, plural) of some probability distribution is a number 𝛾̂ computed from an Independent and Identically Distributed sample (IID) M , ..., M n from the given distribution that maximizes something called the “Likelihood Function”. Let´s suppose that this distribution is governed by a probability density function (pdf) G ( X ; γ , ..., γ k ) , where the γ i ’s are all hidden parameters. The Likelihood Function associated to this sample is: 𝐿(𝑀 , … . , 𝑀 𝑛 ) = ∏ 𝐺(𝑀 𝑖 , 𝛾 , … , 𝛾 𝑘 ) 𝑛𝑖=1 (1) Note that in all cases the estimated values are represented by English letters while parameter values are represented by Greek letters. If the distribution is
N(µ,σ ) , the Likelihood Function is: 𝐿(𝑀 , … . , 𝑀 𝑛 ; 𝜇̂, 𝜎̂ 𝑖2 ) = 1(2𝜋) 𝑛/2 (𝑒 − (𝑀 −𝜇̂ ) 𝜎̂ × … × 𝑒 − (𝑀 𝑛 −𝜇̂ 𝑛 ) 𝜎̂ 𝑛 ) (2) where the symbol “ ^ ” over the variables µ i and σ i indicates that they are estimators of their real values. In atomic spectroscopies (like TXRF, µSR-XRF, PIXE, etc.) the 𝜇̂ 𝑖 values could be linearly related with an specific known function, that is, 𝜇 𝑖 = 𝛼𝐹 𝑖 . The function F can be understood as a particular perfectly defined signal, directly linked to the presence of a given element in the sample.[7,8] The Fi sequence take specific values at each channel i ; and the α value is linearly related mainly with the abundance of this element, and others fundamental parameters.[9] In this case, the Maximum Likelihood yields [10]: 𝑙𝑛𝐿 = − ∑ (𝑚 𝑖 − 𝛼𝐹 𝑖 ) 𝑖2𝑛𝑖=1 + ∑ 𝑙𝑛 ( 1√2𝜋𝜎 𝑖 ) 𝑛𝑖=1 (3) At Eq. (3) the m i values represent the discrete data acquired with the multichannel analyzer; and the σ i values are their standard deviation, at each channel i . In this case, the total number of channels analyzed is n . The most probable 𝛼̂ value with its confidence interval can be calculated from Eq. (3) as [10]: 𝛼̂ = ∑ 𝑚 𝑖 𝐹 𝑖 𝜎 𝑖2 𝑛𝑖=1 ∑ 𝐹 𝑖2 𝜎 𝑖2 𝑛𝑖=1 ± 1√∑ 𝐹 𝑖2 𝜎 𝑖2 𝑛𝑖=1 (4) he same result is obtained applying least squares minimization to this heteroscedastic system. As was discussed in the Introduction, in atomic and nuclear spectroscopies, it is advisable to mention that the results at Eq. (4) are obtained from an adequate representation of the studied physical system. Each one of the underlying processes are indeed well described by the probabilities applied (Normal Distributions) in the developed model. In this sense, the values obtained at Eq. (4) should be the best, and no method of improvement of the results could be proposed. If we ask: Could the quality (accuracy and uncertainty) of a measurement be improved beyond that the Maximum Likelihood Criterion establishes in atomic spectroscopy? Being these spectroscopies well described by the proposed statistical model, the answer should be negative. But, let´s inspect the residues of the adjustment, which also should behave with specific statistical properties. Once the α parameter is evaluated, the difference 𝜀 𝑖 = 𝑚 𝑖 −𝛼𝐹 𝑖 , should on the average: i ) have equal number/quantity of positive and negative values, ii ) since it is ruled by a Poisson statistics, its uncertainty at each channel i should be: ∆𝜀 𝑖 = √𝛼𝐹 𝑖 . Taking into account these elements we can propose an improved version of the likelihood function Ŧ [11,12,13,14] which includes the statistical properties of the studied system, as: 𝑙𝑛𝐿 = − ∑ (𝑚 𝑖 − 𝛼𝐹 𝑖 − 𝑏𝑔 𝑖 √𝛼𝐹 𝑖 ) 𝜎 𝑖2𝑛𝑖=1 + ∑ 𝑙𝑛 ( 1√2𝜋𝜎 𝑖 ) 𝑛𝑖=1 (5) At Eq. (5) the term 𝑏𝑔 𝑖 is an arbitrary random sequence of n datum, uncorrelated with the acquired signal. This data is ruled by a Gaussian distribution, having mean value zero and standard deviation ∆=1. This external data is processed simultaneously with the acquired signal. Now we derivate the Eq. (5) with respect to α , in order to obtain the most probable α value, which maximize L . ∑ 𝑚 𝑖 𝐹 𝑖 𝜎 𝑖2𝑛𝑖=1 − 𝛼 ∑ 𝐹 𝑖2 𝜎 𝑖2𝑛𝑖=1 − √𝛼2 ∑ 𝐹 𝑖3/2 𝜎 𝑖2𝑛𝑖=1 𝑏𝑔 𝑖 − 12√𝛼 ∑ 𝑚 𝑖 √𝐹 𝑖 𝑏𝑔 𝑖 𝜎 𝑖2𝑛𝑖=1 + 12 ∑ 𝐹 𝑖 𝑏𝑔 𝑖2 𝜎 𝑖2𝑛𝑖=1 = 0 (6) The α parameter cannot be analytically evaluated at Eq. (6), since it cannot be obtained as a function of each one of the independent variables. For the required/obligatory numerical evaluation, i ) an initial α value must be calculated, which can be obtained from Eq. (4). Later, it should be refined numerically; and ii ) an arbitrary sequence of data 𝑏𝑔 𝑖 should be included, with the properties required at Eq. (5). In order to obtain the numerical evaluation of the parameter α we apply to Eq. (6) a property of the spectroscopic measurements, where the standard deviation at each channel is 𝜎 𝑖 (𝑚 𝑖 ) = √𝛼𝐹 𝑖 . Then, we can write: ∑ 𝑚 𝑖𝑛𝑖=1 − 𝛼 ∑ 𝐹 𝑖𝑛𝑖=1 − √𝛼2 ∑ √𝐹 𝑖𝑛𝑖=1 𝑏𝑔 𝑖 − 12√𝛼 ∑ 𝑚 𝑖 𝑏𝑔 𝑖 √𝐹 𝑖𝑛𝑖=1 + 12 ∑ 𝑏𝑔 𝑖2𝑛𝑖=1 = 0 (7) The standard error propagation formula is required in order to evaluate the uncertainty of the parameter α . This formula is expressed as: [ Ŧ ] There are many denominations about the “ad hoc” modifications over the likelihood function. It is not clear in which category this proposition should be included. 𝐻(𝑥 𝑖 ) = ∑ (𝜕𝐻(𝑥 𝑖 )𝜕𝑥 𝑖 ) (∆𝑥 𝑖 ) (8) Being ∆𝐻(𝑥 𝑖 ) the uncertainty of the function 𝐻(𝑥 𝑖 ) , and ∆𝑥 𝑖 the uncertainty of every one of the k independent input variables. The Eq. (8) is implicitly applied over Eq. (6). Taking into account that ∆𝐹 𝑖 =0 𝑎𝑛𝑑 ∆𝑏𝑔 𝑖 = 1 , the uncertainty of the parameter α , is: ∆ 𝛼 = (∑ ( 𝐹 𝑖 𝜎 𝑖2 − 𝑖 𝑏𝑔 𝑖 𝜎 𝑖2 ) 𝑛𝑖=1 ∆𝑚 𝑖 ) + (∑ ( √𝛼2 𝐹 𝑖3/2 𝜎 𝑖2 +
12𝛼 𝑚 𝑖 √𝐹 𝑖 𝜎 𝑖2 − 𝐹 𝑖 𝑏𝑔 𝑖 𝜎 𝑖2 ) ∆𝑏𝑔 𝑖𝑛𝑖=1 ) (∑ ( 𝐹 𝑖2 𝜎 𝑖2 + 𝑖3/2 𝑏𝑔 𝑖 𝜎 𝑖2 + 𝑏𝑔 𝑖 𝑚 𝑖 √𝐹 𝑖 𝜎 𝑖2 ) 𝑛𝑖=1 ) (9) The α parameter previously obtained at Eq. (7) is required for the evaluation of Eq. (9). We applied this method over computer simulated spectra. The proposed method produces results with a notorious enhancement on the quality of the accuracy and precision of the results, compared with those provided by the least squares algorithms. APPLICATION EXAMPLES
This method was applied to simulated data in order to evaluate its characteristics and the improvement obtained over the quality of the results. We applied this method over an arbitrary sinusoidal function
𝐹(𝑥) = a sin (𝑥) + 𝑏 . In this particular case, the function is defined as:
𝐹(𝑥) = 27 sin(𝑥/32.3) + 17 (10)
At each channel i , in a simulated spectroscopic acquisition the “measured” mi signal is obtained from the Fi function affected by fluctuations ruled by the Poisson´s statistics, as: 𝑚 𝑖 = 𝛼𝐹 𝑖 + 𝑏𝑔 𝑖 √𝛼𝐹 𝑖 (11) At Eq.(11) a particular Gaussian random sequence ( bgi ) is required, having mean value zero and standard deviation ∆=1. The Fi function and one of its possible “measured” mi signals are shown in Fig. (1).
20 40 60 80 100102030405060 C oun t s Channel (arbitrary units, usually energy) Fi mi
Figure 1. A sinusoidal function ( 𝐹𝑖 , black points) and one of its simulated spectroscopic results ( mi , red points) obtained by an atomic spectroscopic technique. This simulated measurement is affected by fluctuations ruled by the Poisson´s statistics. The random sequence ( bgi ) used in order to obtain the mi data is shown in the Fig. (2), where also a second sequence ( bgi ) with the same properties is included. Both sets of data could be used in order to evaluate the Eq.(7). ba ck g r ound f l u c t ua t i on s Channel (arbitrary units, usually energy) (bgi)1 (bgi)2
Figure 2. Two sequences of arbitrary bg data. The first set (( bg ) , red points) was used to construct the “measured” sequence mi , shown in the Fig.(1). This sequence can be applied in Eq.(7). The second set (( bg ) , blue points) or any other Gaussian random sequence (with mean value zero and standard deviation ∆=1 ) also can be used in the Eq.(7) in order to evaluate the α parameter. n the Fig. (1) the mi data is obtained from the F function when it is affected by a particular realization of the statistical fluctuations ( bgi ) , so in this case the α parameter is strictly 1. In order to check the proper development of the Eq.(7), we calculated the α parameter from the mi data shown in the Fig.(1). In this particular case the ( bgi ) set of data is used twice, first in order to construct the mi data from the F function, and second as the bgi set of data required in Eq.(7). An initial guess of the alpha value can be obtained applying the Eq. (4) to this m i data. In this case we get: α = 0.985 ± 0.02 . Later we refine this value determining the most appropriate number which makes zero the second member of the Eq.(7). The numerical evaluation of the parameter α is shown in Fig. (3). S e c ond M e m be r E q . ( ) parameter Numerical evaluation of the mostappropiate parameter Figure 3. Numerical evaluation of the most appropriate α value, which makes zero the second member of Eq. (7). In this ideal case, a) the exact α value is strictly 1 and b) the bg set of data is used twice, first to construct the mi data, and second as term required in Eq.(7). A zoom of the region of interest is included, where is observed that the alpha value is recovered with high accuracy. As can be seen from Fig.(3), the alpha value is recovered with high accuracy from the developed formulation. The very small difference obtained (~0.01%) can be attributed to numerical errors (round-off error). This is a characteristic result, with independency of the intensity of the α parameter, if the bg set of data is simultaneously used in order to produce the mi data and as a term in the Eq.(7). But in a real measurement we don´t know which is the random noise that affects the pure signal. In this case, we should propose another set of bg data in order to evaluate Eq.(7), as is shown in the Fig.(2) (blue points). For the case being analyzed the alpha parameter obtained with this new bg data is α =0.995. valuation of the Accuracy and Precision For the evaluation of the accuracy and precision obtained with this method, in the Table (1) we show ten calculations of the α parameter from the mi data shown in Fig.(1). In each case, we require a different bg set of data in order to evaluate the Eq.(7). From the obtained α values we can estimate the quality of the results obtained with this method. Real Value a) Eq.(4) b) Eq.(7) α ∆ α α ∆ α α
1 1 0 0.985 0.02 0.9952 2 0.9871 3 0.9992 4 1.0157 5 1.0058 6 1.0215 7 1.0105 8 0.9895 9 1.0017 10 1.0096
Table 1. Ten evaluations of the α parameter from the same mi data. Case a) Evaluation from a previously developed least squares method, Eq.(4). b) Results obtained from the proposed method, Eq.(7); in each case a different realization of random noise is required as the bg term. The results at Table 1 are also shown at Fig.(3), where is observed that usually this method produces better results than the least squares approach.
Real Value Least Squares Evaluation C a l c u l a t ed pa r a m e t e r Evaluation Number Calculations made with this method
Figure 3. Evaluations of the α parameter from the same mi data. a) Real value of the α parameter (red line); b) Evaluation from a least squares approach (blue line), and c) Ten results obtained from the proposed method (black lines), for each case a different realization of random noise is required as the bg term, in order to evaluate the Eq.(7). From the same mi data we can obtain a large number of evaluations. For each evaluation, a new sequence of bg data is required. In this case the averaged ten results in Fig.(3) produce the value, 𝛼 = 1.0035 ± 0.011 . We observe that the quality of the accuracy and precision in the results obtained with this method, are improved one order of magnitude, compared with those produced by the least squares algorithms. Since more evaluations can be simply obtained, the quality of the results can be improved easily (in this ideal system). CONCLUSIONS
In data spectral analysis, the application of the Maximum Likelihood formulation is almost perfectly suited, since it is constructed based on a deep agreement between the underlying physical phenomena studied and the applied mathematical theory. In this sense, the values obtained from the classical least squares approach should be the best, and no method of improvement of the results could be proposed. Moreover, the properties of the obtained results were largely studied: i ) The results from the least squares formulation are unbiased (in the limit of infinite measurements) and ii ) they have minimum variance among all unbiased linear estimators. This means that the estimates “get us as close to the true unknown parameter values as we can get”. For these reasons, an improvement on the quality of the results over those obtained by least squares approaches seems to be unrealistic. owever, in another work we developed a new smoothing method [6] which was applied to simulated spectroscopic data, producing results with better accuracy than those obtained from the least squares approach. In this paper, we propose a modification in the construction of the Likelihood function, which leads to a remarkable improvement on the quality of the results. This modification was made taking into account the underlying statistics of the phenomena studied, by the inspection of the residues of the fitting, which should behave with specific statistical properties. This new formulation was analytically developed, but the calculated parameter should be evaluated numerically, since it cannot be obtained as a function of each one of the independent variables. For the required numerical evaluation, we require: i ) the sought signal F , to be evaluated from the acquired spectrum, should be accurately known, ii ) along with the acquired spectrum, we should process many sets of external data with specific properties. This arbitrary term is a random sequence of data which is uncorrelated with the acquired signal. It should be ruled by a Gaussian distribution, having mean value zero and standard deviation ∆=1. As was shown, the results obtained from Maximum Likelihood can be improved, but a greater number of data should be handled. In this method, if we made n evaluations, the total number of data processed is the n times greater than the number of the acquired signal – this external data is artificially inserted in the numeric procedure. But, if we only process the acquired data, the quality of the results from least squares approach cannot be enhanced. The developed statistical method was evaluated using computer simulated spectra. The numerical estimations of the calculated parameter applying this method, indicate an improvement over accuracy and precision, one order of magnitude better than those produced by the common least squares approaches (see Table 1). Since more evaluations can be simply obtained, the quality of the results can be improved easily (in this ideal system). In spectral analysis, we still have to evaluate the improvement produced by this method in conjunction with a new smoothing procedure [6], over Detection and Quantitation Limits, when they are applied over real experimental results. These kind of evaluations are formally required in all spectroscopic techniques, in order to establish their metrological traceability, defining carefully their operational procedures. For instance, in TXRF spectroscopy there is a list of reports which evaluate its characteristics.[15,16,17,18] We consider that this method will improve the quality of the results in many spectroscopic techniques and related systems. REFERENCES [1]
Pelowitz, D. B. MCNP-6-
General Monte Carlo N-Particle Transport Code System Version 6, (2013) MCNP6 user's manual, version 1. Los Alamos National Laboratory, Los Alamos. [2]
I. Markovskya, and S. Van Huffelb. Overview of total least-squares methods Signal Processing 87, p. 2283–2302, (2007). [3] Dong Zeng et al. Penalized weighted least-squares approach for multienergy computed tomography image reconstruction via structure tensor total variation regularization. Computerized Medical Imaging and Graphics, V ol. 53, Pages 19–29, (2016). [4] L. Bennun, A. Delgado. Reply to `Feasibility of neural network approach in spectral mixture analysis of reflectance spectra'. International Journal of Remote Sensing 30 (18):4905-4907.(2009). [5] H. Navarro and L. Bennun. Descriptive examples of the limitations of Artificial Neural Networks applied to the analysis of independent stochastic data. International Journal of Computer Aided Engineering and Technology 5(5):40-42. (2014). [6] L. Bennun. A Pragmatic Smoothing Method for Improving the Quality of the Results in Atomic Spectroscopy. Accepted to be published in Applied Spectroscopy. 2017. [7] L. Bennun, J. Gomez. Determination of mercury by total-reflection X-ray fluorescence using amalgamation with gold. Spectrochimica Acta Part B 52, p. 1195-1200, (1997). [8] http://brukersupport.com/BrukerDownloads/S2PICOFOX. [9] L. Bennun, V. Gillette and E. D. Greaves. Data processing technique for mercury determination by total-reflection X-ray fluorescence, using amalgamation with gold. Spectrochimica Acta Part B Atomic Spectroscopy 54(9).:1291-1301. (1999). [10] L. Bennun, E. D. Greaves and J. Blostein. New procedure for intensity and detection limit determination in spectral trace analysis: application for trace mercury by TXRF. X-Ray Spectrom. 2002; : 289–295. [11] Desmond, AF. Optimal estimating functions, quasi-likelihood and statistical modeling. Journal of Statistical Planning and inference. v: 60n 1,p. 77-104 (1997). [12] Pei-Sheng Lin, Hui-Yun Lee, Murray Clayton. A comparison of efficiencies between quasi-likelihood and pseudo-likelihood estimates in non-separable spatial–temporal binary data. Journal of Statistical Planning and Inference 139 3310—3318. (2009). [13] Bruce G. Lindsay. Composite Likelihood Methods. Contemporary Mathematics, v. 80, p.221-239. (1988). [14] C. Varin, N. Reid and D. Firth. An overview of composite Likelihood Methods. Statistica Sinica, 21, 5-42. (2011). [15] Floor, G.H.; Queralt, I; Hidalgo, M.: 289–295. [11] Desmond, AF. Optimal estimating functions, quasi-likelihood and statistical modeling. Journal of Statistical Planning and inference. v: 60n 1,p. 77-104 (1997). [12] Pei-Sheng Lin, Hui-Yun Lee, Murray Clayton. A comparison of efficiencies between quasi-likelihood and pseudo-likelihood estimates in non-separable spatial–temporal binary data. Journal of Statistical Planning and Inference 139 3310—3318. (2009). [13] Bruce G. Lindsay. Composite Likelihood Methods. Contemporary Mathematics, v. 80, p.221-239. (1988). [14] C. Varin, N. Reid and D. Firth. An overview of composite Likelihood Methods. Statistica Sinica, 21, 5-42. (2011). [15] Floor, G.H.; Queralt, I; Hidalgo, M.