John H. Kalivas
Idaho State University
Publication
Featured research published by John H. Kalivas.
Pure and Applied Chemistry | 2006
Alejandro C. Olivieri; Nicolaas (Klaas) M. Faber; Joan Ferré; Ricard Boqué; John H. Kalivas; Howard Mark
This paper gives an introduction to multivariate calibration from a chemometrics perspective and reviews the various proposals to generalize the well-established univariate methodology to the multivariate domain. Univariate calibration leads to relatively simple models with a sound statistical underpinning. The associated uncertainty estimation and figures of merit are thoroughly covered in several official documents. However, univariate model predictions for unknown samples are only reliable if the signal is sufficiently selective for the analyte of interest. By contrast, multivariate calibration methods may produce valid predictions also from highly unselective data. A case in point is quantification from near-infrared (NIR) spectra. With the ever-increasing sophistication of analytical instruments inevitably comes a suite of multivariate calibration methods, each with its own underlying assumptions and statistical properties. As a result, uncertainty estimation and figures of merit for multivariate calibration methods have become a subject of active research, especially in the field of chemometrics.
Chemometrics and Intelligent Laboratory Systems | 1997
John H. Kalivas
Described in this paper are two data sets of near infrared (NIR) spectra that are now available for general use. One data set consists of NIR spectra of 100 wheat samples with known protein and moisture content. The second set contains NIR spectra of 60 gasoline samples with known octane numbers. Results from a recent wavelength selection study using the two data sets are summarized. Other results based on the new regression approach of cyclic subspace regression (CSR) are briefly described. Included in CSR are principal component regression, partial least squares and least squares. An explicit description of calibration and validation samples used in both investigations is provided.
Analytica Chimica Acta | 1995
Uwe Hörchner; John H. Kalivas
A recent publication compared the abilities of simulated annealing (SA), genetic algorithm, and stepwise elimination to determine optimal combinations of wavelengths for quantitative analysis of a chemical system. Using their implementation of SA, the authors were unable to identify the global optimum solution in the experiments they performed with three optimization criteria. However, using a different approach with SA, this paper shows that SA can indeed locate the best solution for the addressed problem and optimization criteria. The importance of wavelength searching in a close neighborhood of the previously evaluated wavelength subset for successful implementation of SA is demonstrated. A procedure is described that identifies optimal wavelength combinations exactly. This paper also addresses the assumed need to determine optimal wavelength combinations and the suitability of the used optimization criteria to establish calibrations with increased prediction accuracy and precision. Some general problems inherent to the standard SA algorithm are discussed.
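As a rough illustration of the close-neighborhood idea described above, the following Python sketch anneals over fixed-size wavelength subsets by moving one selected wavelength a few channels at a time. The cost function (least-squares fit error on synthetic data), the cooling schedule, and all names such as `anneal`, `neighbor`, and `max_shift` are assumptions made for illustration; they are not the authors' implementation or their optimization criteria.

```python
import math
import random

import numpy as np


def subset_cost(X, y, subset):
    """Root-mean-square least-squares fit error using only the chosen wavelengths."""
    Xs = X[:, list(subset)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return float(np.sqrt(np.mean((Xs @ coef - y) ** 2)))


def neighbor(subset, n_wavelengths, max_shift, rng):
    """Move one selected wavelength by at most max_shift channels."""
    new = list(subset)
    pos = rng.randrange(len(new))
    for _ in range(100):  # retry until the move yields a valid, distinct subset
        cand = new[pos] + rng.randint(-max_shift, max_shift)
        if 0 <= cand < n_wavelengths and cand not in new:
            new[pos] = cand
            break
    return tuple(sorted(new))


def anneal(X, y, k, steps=2000, t0=1.0, cooling=0.995, max_shift=2, seed=0):
    """Simulated annealing over k-wavelength subsets (illustrative settings)."""
    rng = random.Random(seed)
    current = tuple(sorted(rng.sample(range(X.shape[1]), k)))
    cur_cost = subset_cost(X, y, current)
    best, best_cost = current, cur_cost
    t = t0
    for _ in range(steps):
        cand = neighbor(current, X.shape[1], max_shift, rng)
        cost = subset_cost(X, y, cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cost < cur_cost or rng.random() < math.exp((cur_cost - cost) / t):
            current, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand, cost
        t *= cooling  # geometric cooling
    return best, best_cost


# Synthetic "spectra": 40 samples, 30 channels; the property depends on two channels.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(40, 30))
y = X[:, 5] + X[:, 17]
best, best_cost = anneal(X, y, k=2)
```

Setting `max_shift` to a small value keeps each trial subset close to the previously evaluated one; whether this search reaches the global optimum depends on the data and the schedule, so the sketch makes no claim about that.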
Applied Spectroscopy | 1999
Chad E. Anderson; John H. Kalivas
In analytical chemistry, it is necessary to form instrument-dependent calibration models. Problems such as instrument drift, repair, or use of a new instrument create a need for recalibration. Since recalibration can require considerable costs and cause time delays, methods for calibration transfer have been developed. This paper shows that many of these approaches are based on the statistical procedure known as Procrustes analysis (PA). Transfer by PA methods is shown to involve translation (mean-centering), rotation, and stretching of instrument responses. This study investigates the ability of different forms of PA to transfer near-infrared spectra measured on two different instruments. Spectroscopic interpretations of translation, rotation, and stretching are provided. It is found for the data sets investigated that unconstrained forms of PA generally produce better results. It is also shown that translation is the key step for transformation of spectra and may often be all that is required.
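The three steps named above (translation, rotation, and stretching) can be sketched with a standard orthogonal Procrustes fit. This is a minimal Python illustration under stated assumptions: the same samples are measured on both instruments, a single isotropic stretch factor is used, and the function name `procrustes_transfer` and the synthetic data are hypothetical rather than taken from the paper.

```python
import numpy as np


def procrustes_transfer(A, B):
    """Fit a map from secondary-instrument spectra A to primary-instrument spectra B.

    A and B are (samples x wavelengths) arrays of the same samples.
    Returns a function that applies translation, rotation, and stretching.
    """
    mean_a, mean_b = A.mean(axis=0), B.mean(axis=0)
    Ac, Bc = A - mean_a, B - mean_b            # translation (mean-centering)
    U, s, Vt = np.linalg.svd(Ac.T @ Bc)
    R = U @ Vt                                 # rotation (orthogonal Procrustes solution)
    scale = s.sum() / np.sum(Ac ** 2)          # stretching (one isotropic factor)

    def transform(spectra):
        return scale * (spectra - mean_a) @ R + mean_b

    return transform


# Synthetic check: B is a shifted, rotated, and stretched copy of A.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 5))
R0, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # a random orthogonal matrix
B = 1.7 * (A - A.mean(axis=0)) @ R0 + 3.0
transform = procrustes_transfer(A, B)
```

Dropping the rotation and stretching terms (using only `spectra - mean_a + mean_b`) gives the translation-only transfer that the abstract suggests may often be sufficient.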
Analytical Chemistry | 1996
Nickey J. Messick; John H. Kalivas; Patrick Lang
Analytical figures of merit are often used as criteria to decide whether or not a given instrumental method is suitable for attacking an analytical problem. To date, figures of merit primarily exist for analytical instruments producing data indexed by one variable, i.e., first-order instruments and first-order data. Almost none exist for instruments that generate data indexed by two variables, i.e., second-order instruments and data, and none exist for instruments supplying data indexed by three or more variables, i.e., nth-order instruments and data. This paper develops practical mathematical tools that can be used to create several figures of merit for nth-order instrumentation, namely, selectivity, net analyte signal, and sensitivity. In particular, the paper fully develops a local selectivity measure for second-order instrumentation and tests its performance using simulated second-order data and real second-order data obtained by gas chromatography with Fourier transform infrared detection and liquid chromatography with photodiode array detection. Also included in the paper is a brief discussion on practical uses of nth-order figures of merit.
Journal of Chemometrics | 1999
John H. Kalivas
This paper provides an expository discussion of the interrelationships between least squares (LS), principal component regression (PCR), partial least squares (PLS), ridge regression (RR), generalized ridge regression (GRR), continuum regression (CR) and cyclic subspace regression (CSR) for the linear model y = Xb + e. Developed in this paper is continuum CSR (CCSR). From this study it is ascertained that GRR encompasses LS, PCR, PLS, RR, CR, CSR and CCSR. It is shown that a regression vector, regardless of its source, can be written as a linear combination of the v_i eigenvectors obtained from a singular value decomposition (SVD) of X, i.e. X = UΣV^T. Similarly, it is shown that calibration fitted values ŷ obtained from any linear regression method can be written as a linear combination of the u_i eigenvectors obtained from an SVD of X. Formulae are provided to compute ϕ and γ, respective vectors of weights for v_i and u_i eigenvectors. It is recommended that the ϕ eigenvector weights be inspected to ascertain exactly what information is being used to form the regression vector for the particular modeling approach used. Analogously, the γ eigenvector weights should be inspected to determine what information is being used to form calibration fitted values. Besides assisting in prediction rank determination, both eigenvector weight plots also allow for easy comparison of models built by different methods, e.g. the PCR model versus the PLS model. It is shown that it is not the number of factors used to build a PCR or PLS model that is important, but the number of eigenvectors used, which ones, and how they are weighted to form respective regression vectors and fitted values of calibration samples. In essence, how eigenvectors are weighted dictates which GRR model is formed. From the CR, CSR and RR eigenvector weight plots of ϕ it is concluded that the optimal model will most often have a combination of PCR and PLS attributes.
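The eigenvector-weight idea can be sketched numerically. The Python example below assumes the weights are obtained by projection, ϕ = V^T b and γ = U^T ŷ, which follows from the orthonormality of the SVD bases; the paper's own formulae may be written differently. The function names and the synthetic data are hypothetical, and PCR and ridge regression are used only as two convenient sources of a regression vector.

```python
import numpy as np


def eigenvector_weights(X, b):
    """Return (phi, gamma) such that b = V @ phi and X @ b = U @ gamma."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    phi = Vt @ b              # weights on the v_i
    gamma = U.T @ (X @ b)     # weights on the u_i
    return phi, gamma


def pcr_vector(X, y, k):
    """PCR regression vector using the first k SVD factors."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])


def ridge_vector(X, y, lam):
    """Ridge regression vector written in the SVD basis."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ (s / (s ** 2 + lam) * (U.T @ y))


# Synthetic calibration data (no mean-centering, to keep the sketch short).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=20)

phi_pcr, gamma_pcr = eigenvector_weights(X, pcr_vector(X, y, 3))
phi_rr, gamma_rr = eigenvector_weights(X, ridge_vector(X, y, 1.0))
```

In this sketch the PCR weights are exactly zero beyond the third eigenvector, while the ridge weights are nonzero on every eigenvector but shrunk by s_i^2 / (s_i^2 + λ) relative to least squares. Plotting `phi_pcr` against `phi_rr` is one way to make the kind of model comparison the abstract recommends.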
Analytica Chimica Acta | 1997
Yu-Long Xie; John H. Kalivas
Most situations using principal component regression (PCR) as a multivariate calibration tool use the conventional top-down selection procedure to determine the number of principal components (PCs) to generate a global model, i.e., the regression model is established by including PCs in sequence according to variances related to the PCs. This model is then used to predict future samples adequately spanned by the corresponding calibration set. Recently, some alternative procedures have been proposed for PC selection with respect to multivariate calibration. These include optimization (selection) by generalized simulated annealing and correlation principal component regression (CPCR) where PCs are ordered according to correlations with the dependent variable (concentration). The PCs are then selected one by one to form the global model based on a prediction criterion. In this paper, a forward selection procedure PCR (FSPCR) is evaluated and compared to CPCR and top-down selection. Four spectroscopic data sets are analyzed for the comparison study. In essence, results reveal that PCs selected based on a top-down approach generate the most stable global model. That is, top-down selection generally performs best for prediction of numerous future sample sets compared to CPCR and FSPCR. Reasons for such differences in performances of these procedures have been analyzed.
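The difference between the top-down and correlation orderings can be sketched as follows. This Python example is only an illustration under assumptions: the correlation ordering is taken to be the absolute Pearson correlation between PC scores and the dependent variable, the synthetic data are constructed so that the property is carried entirely by the smallest-variance PC, and the names `pc_orderings` and `pcr_from_pcs` are hypothetical. It does not reproduce the paper's FSPCR procedure or its prediction criterion.

```python
import numpy as np


def pc_orderings(X, y):
    """Return PC indices ordered top-down (by variance) and by |correlation| with y."""
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                                   # PC scores, largest variance first
    corr = np.array([abs(np.corrcoef(scores[:, i], yc)[0, 1])
                     for i in range(len(s))])
    top_down = np.arange(len(s))
    by_corr = np.argsort(-corr)
    return top_down, by_corr


def pcr_from_pcs(X, y, pcs):
    """Build a PCR model from an arbitrary set of PC indices."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    pcs = list(pcs)
    b = Vt[pcs].T @ ((U[:, pcs].T @ (y - y_mean)) / s[pcs])
    return b, y_mean - x_mean @ b                    # regression vector and intercept


# Synthetic data: four orthogonal, mean-zero directions with decreasing variance.
rng = np.random.default_rng(2)
Z = rng.normal(size=(30, 4))
Q, _ = np.linalg.qr(Z - Z.mean(axis=0))
X = Q * np.array([10.0, 5.0, 2.0, 1.0])
y = X[:, 3].copy()                                   # property lives in the smallest PC

top_down, by_corr = pc_orderings(X, y)
b, b0 = pcr_from_pcs(X, y, by_corr[:1])
```

On this contrived data the correlation ordering places the fourth PC first, whereas top-down selection would need all four PCs to capture the property. Such a fit on calibration data says nothing about stability for future samples, which is the aspect on which the abstract reports top-down selection performing best.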
Applied Spectroscopy | 1997
Jason M. Brenchley; Uwe Hörchner; John H. Kalivas
For quantitative analysis of samples based on near-infrared (NIR) spectra, it is common practice to use full spectra in conjunction with partial least-squares (PLS) or principal component regression. Alternatively, least-squares (LS) can be used provided that proper wavelengths have been selected. Recently, optimization algorithms such as simulated annealing and the genetic algorithm have been applied to the selection of individual wavelengths. These algorithms are touted as global optimizers capable of locating the best set of parameters for a given large-scale optimization problem. Optimization methods such as simulated annealing and the genetic algorithm can become time intensive. Excessive computer time may be due not to computations but to the need to determine proper operational parameters to ensure acceptable optimization results. In order to reduce the time to select wavelengths, a different approach consists of selecting wavelengths directly on the basis of spectral criteria. This paper shows that results are not acceptable when one is separately using the criteria of large wavelength correlations to the prediction property, wavelengths associated with large values in loading vectors from PLS or derived from the singular value decomposition (SVD) of the spectra, and wavelengths associated with large PLS regression coefficients. However, it is demonstrated that acceptable results can be produced by using wavelength regions simultaneously associated with large correlations and loading values provided that the level of noise for identified wavelengths is also acceptable. Thus, this paper shows that, rather than using time-consuming optimization algorithms that generally select individual wavelengths, one can achieve improved results based on wavelength windows directly selected. In other words, the described approach is founded on the exclusion of spectral regions rather than the search for distinct wavelengths. 
As part of the NIR spectral characterization, it is shown that certain loading vectors from the SVD of spectra are equivalent to correlograms for prediction properties. The same is shown to be true for PLS loading vectors. This type of analysis is useful for determining dominant properties of spectra, i.e., primary properties responsible for spectral variations.
Analytica Chimica Acta | 2001
John H. Kalivas
Estimates of regression coefficients for a multivariate linear model have been the subject of considerable discussion in the literature. A purpose of this paper is to discuss biased estimators using common basis sets. Estimators of focus are least squares, principal component regression, partial least squares, ridge regression, generalized ridge regression, continuum regression, and cyclic subspace regression. Variations of these methods are also proposed. It is shown that it is not the common basis set used to span the calibration space or the number of vectors from the common basis set used to form respective calibration models that are important, i.e. a parsimony emphasis. Instead, it is suggested that the size and direction of the calibration subspace used to form the models are essential, i.e. a harmony consideration. The approach of the paper is based on representing estimated regression vectors as weighted sums of basis vectors.
Applied Spectroscopy | 1999
Valerie Allen; John H. Kalivas; Rene G. Rodriguez
Raman spectroscopy is evaluated as a spectroscopic method for identification of common household plastics for recycling purposes. The methods of K-nearest neighbor (KNN), cyclic subspace regression (CSR), and library searching are compared for computerized plastic classification. Plastics studied consist of polyethylene terephthalate, high-density polyethylene, polyvinyl chloride, low-density polyethylene, polypropylene, and polystyrene. With principal component analysis (PCA), visual distinction between the different plastics becomes possible. Correct class membership to all six plastic types is provided by KNN. To date, all development and uses of CSR have been based on building models for each prediction property analogous to the form of partial least-squares known as PLS1. Cyclic subspace regression is modified in this paper to also allow modeling of multiple properties, as does PLS2. The new form of CSR was able to correctly classify all six plastic types when seven-factor models were used. This paper reports that key observations made in comparing PCR to PLS1 are verified for the interrelationships of PCR and PLS2 models. Most notable is that even though PLS2 uses spectral responses and plastic identifications to form factors, PLS2 eigenvector weights are not much different from PCR eigenvector weights where PCR only uses spectral responses to form eigenvector weights. Library searching showed less significant results than KNN and CSR. Regardless of the identification approach, polyethylene samples could be identified as either being high density or low density with the use of Raman spectroscopy.