Leo H. Chiang
Dow Chemical Company
Publications
Featured research published by Leo H. Chiang.
Measurement Science and Technology | 2001
Leo H. Chiang; Evan L. Russell; Richard D. Braatz
The appearance of this book is quite timely, as it provides a much-needed state-of-the-art exposition on fault detection and diagnosis, a topic of much interest to industrialists. The material is well organized into logical, clearly identified parts, and the list of references is comprehensive and will interest readers who wish to explore a particular subject in depth. The presentation is clear and concise, and the contents are appropriate for postgraduate engineering students, researchers, and industrialists alike. The end-of-chapter homework problems are a welcome feature, giving learners the opportunity to reinforce what they learn by applying theory to problems, many of which are taken from realistic situations. However, the book would be more useful, especially to practitioners of fault detection and diagnosis, if a short chapter on background statistical techniques were provided.
Joe Au
Chemometrics and Intelligent Laboratory Systems | 2000
Leo H. Chiang; Evan L. Russell; Richard D. Braatz
Principal component analysis (PCA) is the most commonly used dimensionality reduction technique for detecting and diagnosing faults in chemical processes. Although PCA has certain optimality properties for fault detection and has been widely applied to fault diagnosis, it is not best suited for fault diagnosis. Discriminant partial least squares (DPLS) has been shown to improve fault diagnosis over PCA for small-scale classification problems, and Fisher discriminant analysis (FDA) has advantages from a theoretical point of view. In this paper, we develop an information criterion that automatically determines the order of the dimensionality reduction for FDA and DPLS, and we show, both theoretically and by applying these techniques to simulated data from the Tennessee Eastman chemical plant simulator, that FDA and DPLS are more proficient than PCA for diagnosing faults.
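The contrast drawn above between PCA and FDA can be made concrete with a small sketch. The Python snippet below is a minimal illustration, not the authors' implementation: the reduction order a, the regularization term, and the nearest-class-mean diagnosis rule are assumptions, and the paper's information criterion for choosing the order is not reproduced.

```python
# Minimal Fisher discriminant analysis (FDA) sketch for fault diagnosis.
# Illustrative only; the reduction order 'a' is assumed rather than chosen
# by the information criterion developed in the paper.
import numpy as np
from scipy.linalg import eigh

def fda_directions(X, y, a):
    """Return the leading 'a' FDA directions for data X (n x m) with class labels y."""
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    m = X.shape[1]
    Sw = np.zeros((m, m))   # within-class scatter
    Sb = np.zeros((m, m))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        Sw += (Xc - mu_c).T @ (Xc - mu_c)
        d = (mu_c - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (d @ d.T)
    # Generalized eigenproblem Sb w = lambda Sw w (small ridge keeps Sw invertible).
    vals, vecs = eigh(Sb, Sw + 1e-6 * np.eye(m))
    return vecs[:, ::-1][:, :a]            # eigenvectors with largest eigenvalues

def diagnose(x_new, X_train, y_train, W):
    """Assign a new sample to the fault class with the nearest mean in FDA space."""
    z = x_new @ W
    classes = np.unique(y_train)
    means = {c: (X_train[y_train == c] @ W).mean(axis=0) for c in classes}
    return min(classes, key=lambda c: np.linalg.norm(z - means[c]))
```

Unlike PCA, the scatter matrices above use the class labels, which is the between-class information the abstract refers to.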
Computers & Chemical Engineering | 2004
Leo H. Chiang; Mark Kotanchek; Arthur K. Kordon
The proficiencies of Fisher discriminant analysis (FDA), support vector machines (SVM), and proximal support vector machines (PSVM) for fault diagnosis (i.e., classification of multiple fault classes) are investigated. The Tennessee Eastman process (TEP) simulator was used to generate overlapping datasets to evaluate classification performance. When all variables were used, the datasets were masked with irrelevant information, which resulted in poor classification. With key variables selected by genetic algorithms and contribution charts, SVM and PSVM outperformed FDA, demonstrating the advantage of nonlinear techniques when the data overlap. The overall misclassification rate for the testing data dropped from 38% to 18% using FDA, and from 44–45% to 6% using SVM and PSVM. PSVM increased the effectiveness of the proposed approach by substantially reducing computation time and memory requirements while giving comparable classification results. For auto-correlated data, incorporating time lags into SVM and PSVM improved classification: the added dimensions reduced the degree to which the data overlap, and the overall misclassification rate for the testing set using SVM and PSVM decreased further, to 3%.
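The time-lag idea mentioned at the end of the abstract can be sketched as follows, assuming scikit-learn is available. The lag count, kernel, and regularization parameter are assumptions of this sketch, and the genetic-algorithm variable selection step is omitted.

```python
# Sketch: augmenting auto-correlated process data with time lags before SVM
# classification. Lag count and kernel settings are assumptions, not the paper's.
import numpy as np
from sklearn.svm import SVC

def add_lags(X, lags=2):
    """Stack each sample with its 'lags' previous samples (drops the first rows)."""
    blocks = [X[lags - k : X.shape[0] - k] for k in range(lags + 1)]
    return np.hstack(blocks)

def train_fault_classifier(X, y, lags=2):
    X_lagged = add_lags(X, lags)
    y_lagged = y[lags:]                      # align labels with the lagged rows
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    return clf.fit(X_lagged, y_lagged)
```

The lagged columns spread temporally adjacent samples into extra dimensions, which is the mechanism the abstract credits for reducing class overlap.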
Chemometrics and Intelligent Laboratory Systems | 2000
Evan L. Russell; Leo H. Chiang; Richard D. Braatz
Principal component analysis (PCA) is a well-known dimensionality reduction technique that has been used to detect faults during the operation of industrial processes. Dynamic principal component analysis (DPCA) and canonical variate analysis (CVA) are dimensionality reduction techniques that take serial correlations into account, but their effectiveness in detecting faults in industrial processes has not been extensively tested. In this paper, score/state and residual space statistics for PCA, DPCA, and CVA are applied to the Tennessee Eastman process simulator, which was designed to simulate a wide variety of faults occurring in a chemical plant based on a facility at Eastman Chemical. This appears to be the first application of residual space CVA statistics for detecting faults in a large-scale process. Statistics quantifying variations in the residual space were usually more sensitive, but less robust, to the faults than statistics quantifying variations in the score or state space. Statistics exhibiting a small missed detection rate tended to exhibit small detection delays, and vice versa. A residual-based CVA statistic proposed in this paper gave the best overall sensitivity and promptness, but the initially proposed threshold for the statistic lacked robustness; this motivated increasing the threshold to achieve a specified missed detection rate.
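For readers unfamiliar with the score-space versus residual-space distinction, the sketch below shows the standard PCA version of the two monitoring statistics (Hotelling T² in the score space, Q/SPE in the residual space). It is an illustration only: the empirical-percentile thresholds are an assumption of this sketch, and the paper's CVA-based statistics and theoretical control limits are not reproduced.

```python
# Sketch of score-space (T^2) and residual-space (Q / SPE) monitoring with PCA.
# Thresholds here are simple empirical percentiles of normal operating data.
import numpy as np

def fit_pca_monitor(X_normal, n_components, alpha=0.99):
    mu, sigma = X_normal.mean(axis=0), X_normal.std(axis=0)
    Z = (X_normal - mu) / sigma
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                       # loading matrix
    lam = (S[:n_components] ** 2) / (Z.shape[0] - 1)
    t2 = np.sum((Z @ P) ** 2 / lam, axis=1)       # Hotelling T^2 (score space)
    resid = Z - Z @ P @ P.T
    q = np.sum(resid ** 2, axis=1)                # Q statistic (residual space)
    limits = (np.quantile(t2, alpha), np.quantile(q, alpha))
    return dict(mu=mu, sigma=sigma, P=P, lam=lam, limits=limits)

def detect(model, x):
    """Flag a fault when either statistic exceeds its threshold."""
    z = (x - model["mu"]) / model["sigma"]
    t2 = np.sum((z @ model["P"]) ** 2 / model["lam"])
    q = np.sum((z - z @ model["P"] @ model["P"].T) ** 2)
    return t2 > model["limits"][0] or q > model["limits"][1]
```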
Archive | 2000
Evan L. Russell; Leo H. Chiang; Richard D. Braatz
Contents:
I. Introduction: 1. Introduction
II. Background: 2. Multivariate Statistics; 3. Pattern Classification
III. Methods: 4. Principal Component Analysis; 5. Fisher Discriminant Analysis; 6. Partial Least Squares; 7. Canonical Variate Analysis
IV. Application: 8. Tennessee Eastman Process; 9. Application Description; 10. Results and Discussion
V. Other Approaches: 11. Overview of Analytical and Knowledge-based Approaches
References
Journal of Process Control | 2003
Leo H. Chiang; Randy J. Pell; Mary Beth Seasholtz
To implement on-line process monitoring techniques such as principal component analysis (PCA) or partial least squares (PLS), it is necessary to extract data associated with normal operating conditions from the plant historical database to calibrate the models. One way to do this is to use robust outlier detection algorithms such as resampling by half-means (RHM), smallest half volume (SHV), or ellipsoidal multivariate trimming (MVT) in the off-line model building phase. While RHM and SHV are conceptually clear and statistically sound, their computational requirements are heavy. Closest distance to center (CDC) is proposed in this paper as an alternative for outlier detection. The use of the Mahalanobis distance in the initial step of MVT for detecting outliers is known to be ineffective; to improve MVT, CDC is incorporated into it. Performance was evaluated relative to the goal of finding the best half of a data set, using data sets derived from the Tennessee Eastman process (TEP) simulator. Comparable results were obtained for RHM, SHV, and CDC, and better performance was obtained when CDC was incorporated into MVT than when CDC or MVT was used alone. All robust outlier detection algorithms outperformed the standard PCA algorithm. The effects of auto scaling, robust scaling, and a new scaling approach called modified scaling were also investigated. In the presence of multiple outliers, auto scaling was found to degrade the performance of all the robust techniques, while reasonable results were obtained with robust scaling and modified scaling.
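The "closest distance to center" idea can be illustrated with a few lines of Python. This is only a sketch of the general notion of keeping the half of the historical data closest to a central point; the use of the column-wise median as the center and a median-absolute-deviation scaling are assumptions of this sketch, not the paper's definition of CDC.

```python
# Sketch of a CDC-style step: keep the half of the historical data closest to
# a robust center as the clean subset for model calibration. Center and
# scaling choices here are illustrative assumptions.
import numpy as np

def cdc_clean_subset(X):
    center = np.median(X, axis=0)                  # robust center estimate
    scale = np.median(np.abs(X - center), axis=0) + 1e-12
    d = np.linalg.norm((X - center) / scale, axis=1)
    keep = np.argsort(d)[: X.shape[0] // 2]        # indices of the closest half
    return X[keep], keep
```

In a CDC/MVT combination, a covariance estimate computed from this clean subset would then replace the full-data covariance in the Mahalanobis-distance trimming step.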
Chemometrics and Intelligent Laboratory Systems | 2003
Leo H. Chiang; Richard D. Braatz
Data-driven techniques based on multivariate statistics, such as principal component analysis (PCA) and partial least squares (PLS), have been applied widely to chemical processes, and their effectiveness for fault detection is well recognized. There is, however, an inherent limitation on the ability of purely data-driven techniques to identify and diagnose faults, especially when the abnormal situations are associated with unknown or multiple faults. The modified distance (DI) and modified causal dependency (CD) are proposed to incorporate a causal map into the data-driven approach and improve the proficiency of fault identification and diagnosis. The DI is based on the Kullback–Leibler information distance (KLID), the mean of the measured variables, and the range of the measured variables; it measures the similarity of a measured variable between the current operating conditions and the historical operating conditions, and when the DI exceeds a predefined threshold the variable is identified as abnormal. The CD, derived from the multivariate T² statistic, measures the causal dependency between two variables; when the CD exceeds a predefined threshold, the causal dependency between the two variables is considered broken. The proposed method requires a causal map and historical data associated with normal operating conditions. A causal map containing the causal relationships between all of the measured variables can be derived from plant engineer knowledge and the sample covariance matrix of the normal data. The DI/CD algorithm outperformed purely data-driven techniques such as PCA for detecting and identifying known, unknown, and multiple faults using data sets from the Tennessee Eastman process (TEP).
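As a rough illustration of the kind of distance measure the DI builds on, the sketch below fits univariate Gaussians to a historical window and a current window of each measured variable and flags the variable when the Kullback–Leibler divergence exceeds a user-chosen threshold. The Gaussian fit and the fixed threshold are assumptions of this sketch; the paper's exact DI definition also involves the variable means and ranges and is not reproduced here.

```python
# Illustration of a KL-divergence-based variable screening step, in the spirit
# of the DI described above. Not the paper's exact formula.
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """KL divergence between N(mu0, var0) and N(mu1, var1)."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def abnormal_variables(X_normal, X_current, threshold=1.0):
    flags = []
    for j in range(X_normal.shape[1]):
        mu0, var0 = X_normal[:, j].mean(), X_normal[:, j].var() + 1e-12
        mu1, var1 = X_current[:, j].mean(), X_current[:, j].var() + 1e-12
        if kl_gauss(mu0, var0, mu1, var1) > threshold:
            flags.append(j)                        # variable j flagged abnormal
    return flags
```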
Journal of Process Control | 2004
Leo H. Chiang; Randy J. Pell
Many trouble-shooting problems in the process industries are related to identifying the key variables responsible for a classification. Contribution charts based on principal component analysis (PCA) can be applied for this purpose. Genetic algorithms (GAs) have recently been proposed for many applications, including variable selection for multivariate calibration, molecular modeling, regression analysis, model identification, curve fitting, and classification. In this paper, GAs are combined with Fisher discriminant analysis (FDA) for key variable identification: the GA serves as an optimization tool that selects the variables maximizing the FDA classification success rate for two given data sets, so GA/FDA is proposed as a solution to the variable selection problem in discriminant analysis. The Tennessee Eastman process (TEP) simulator was used to generate data sets for evaluating the correctness of the key variables selected by GA/FDA and by the T² and Q statistic contribution charts. GA/FDA correctly identifies the key variables for the TEP case studies that were tested. For one case study in which the correlation changes between the two data sets, the contribution charts incorrectly suggest that the operating conditions are similar, whereas GA/FDA not only determines that the operating conditions are different but also identifies the key variables responsible for the change. For another case study in which many key variables are responsible for the changes between the two data sets, the contribution charts identify only a fraction of the key variables, while GA/FDA correctly identifies all of them. GA/FDA is a promising technique for key variable identification, as evidenced by successful applications at The Dow Chemical Company.
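The GA-wrapped-around-a-discriminant-classifier idea can be sketched as below. This is not the paper's implementation: the population size, mutation rate, cross-validated fitness, and the use of scikit-learn's LinearDiscriminantAnalysis as a stand-in for FDA are all assumptions of this sketch.

```python
# Sketch of GA-based variable selection driven by discriminant classification
# accuracy, in the spirit of GA/FDA. All GA settings are illustrative assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y):
    """Cross-validated classification success rate on the selected variables."""
    if mask.sum() == 0:
        return 0.0
    X_sel = X[:, mask.astype(bool)]
    return cross_val_score(LinearDiscriminantAnalysis(), X_sel, y, cv=3).mean()

def ga_select(X, y, pop=30, gens=40, p_mut=0.05, seed=0):
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    population = rng.integers(0, 2, size=(pop, m))   # binary variable masks
    for _ in range(gens):
        fit = np.array([fitness(ind, X, y) for ind in population])
        parents = population[np.argsort(fit)[::-1][: pop // 2]]  # keep fitter half
        children = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, m)
            child = np.concatenate([a[:cut], b[cut:]])              # crossover
            child ^= (rng.random(m) < p_mut).astype(child.dtype)    # mutation
            children.append(child)
        population = np.vstack([parents, children])
    best = max(population, key=lambda ind: fitness(ind, X, y))
    return np.flatnonzero(best)                      # indices of key variables
```

The returned indices play the role of the "key variables" discussed in the abstract; in the paper these are compared against the variables flagged by the contribution charts.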
Archive | 2001
Evan L. Russell; Leo H. Chiang; Richard D. Braatz
In the pattern classification approach to fault diagnosis outlined in Chapter 3, it was described how the dimensionality reduction performed in the feature extraction step can be a key factor in reducing the misclassification rate when a pattern classification system is applied to new data (data independent of the training set). Dimensionality reduction is especially important when the dimensionality of the observation space is large while the number of observations in each class is relatively small. A PCA approach to dimensionality reduction was discussed in the previous chapter. Although PCA has certain optimality properties in terms of fault detection, it is not as well suited for fault diagnosis because it does not take into account the information between the classes when determining the lower-dimensional representation. Fisher discriminant analysis (FDA), a dimensionality reduction technique that has been extensively studied in the pattern classification literature, takes this between-class information into account and has advantages over PCA for fault diagnosis.
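The between-class information FDA exploits can be stated compactly. The formula below is the textbook Fisher criterion in standard notation, not a quotation from the chapter: the first FDA direction maximizes the ratio of between-class to within-class scatter, and subsequent directions follow from the associated generalized eigenvalue problem.

```latex
% Textbook Fisher criterion (standard notation; S_b is the between-class
% scatter matrix and S_w the within-class scatter matrix).
\[
  \mathbf{w}_1 = \arg\max_{\mathbf{w} \neq \mathbf{0}}
  \frac{\mathbf{w}^{\top} S_b \mathbf{w}}{\mathbf{w}^{\top} S_w \mathbf{w}},
  \qquad
  S_b \mathbf{w} = \lambda \, S_w \mathbf{w}.
\]
```

PCA, by contrast, maximizes captured variance without reference to S_b, which is why it can mix fault classes together in the reduced space.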
Annual Review of Chemical and Biomolecular Engineering | 2017
Leo H. Chiang; Bo Lu; Ivan Castillo
Big data analytics is the journey to turn data into insights for more informed business and operational decisions. As the chemical engineering community is collecting more data (volume) from different sources (variety), this journey becomes more challenging in terms of using the right data and the right tools (analytics) to make the right decisions in real time (velocity). This article highlights recent big data advancements in five industries, including chemicals, energy, semiconductors, pharmaceuticals, and food, and then discusses technical, platform, and culture challenges. To reach the next milestone in multiplying successes to the enterprise level, government, academia, and industry need to collaboratively focus on workforce development and innovation.