Can Machine Learning Identify Governing Laws For Dynamics in Complex Engineered Systems?: A Study in Chemical Engineering
A Preprint
Renganathan Subramanian ∗
Department of Chemical Engineering
Indian Institute of Technology, Madras
Madras, India
[email protected]

Shweta Singh †
Department of Agricultural & Biological Engineering
Division of Environmental & Ecological Engineering
Purdue University
West Lafayette, IN, USA
[email protected]
July 19, 2019

Abstract
Machine learning has recently been used to identify the governing equations for dynamics in physical systems. The promising results from applications to systems such as fluid dynamics and chemical kinetics inspire further investigation of these methods on complex engineered systems. While at micro-scales the governing laws such as heat transfer, diffusion and pressure variation are well known and have been studied for decades, the laws for complex systems are not well established. The dynamics of these systems play a crucial role in design and operations. Hence, it would be advantageous to learn more about the mechanisms that may be driving the complex dynamics of systems where an overall governing law is unknown. In this work, our research question addressed the applicability and usefulness of a novel machine learning approach in identifying the governing dynamical equations for engineered systems. We focused on the distillation column, which is a ubiquitous unit operation in chemical engineering and demonstrates complex dynamics, i.e. its dynamics are a combination of heuristics and fundamental physical laws. We tested the method of Sparse Identification of Non-Linear Dynamics (SINDy) because of its ability to produce white-box models with terms that can be used for physical interpretation of dynamics. The dynamics of the system were externally forced using perturbations to the input stream, and time series data was generated from simulation of the distillation column using ASPEN Dynamics software. One promising result was the reduction of the number of equations for dynamic simulation from 1000s in ASPEN to only 13, one for each state variable. Prediction accuracy was high on test data from the system within the perturbation range; however, outside the perturbation range the equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with ratio of concentration and pressure terms). The equations were complex and, although some terms were interpretable, we did not achieve a conclusive answer on physical governing laws. We conclude that more research is needed on combining engineering systems with machine learning approaches to improve understanding of unknown dynamics.

Keywords Machine Learning · Chemical Engineering · ASPEN Dynamics · Distillation Column · SINDy · Dynamic Equations

∗ Renganathan is a Senior UG Student at IIT-M. This research was done as Visiting Student at Purdue University.
† Shweta Singh is an Assistant Professor at Purdue University, Corresponding Author

arXiv [eess.SY]
Engineering has relied for decades on identifying system dynamics from first-principles methods in order to understand the underlying mechanisms driving the dynamics. These first-principles equations form the fundamentals that are used to design and operate systems for desired outputs such as heat transfer, operation of process plants, fluid flow operations, etc. The equations are also augmented with observation-driven empirical relationships, which are not fundamental laws but form basic governing rules. Combined, the mechanistic and empirical rules form the core of engineering design and operations.

However, in several cases the first-principles equations may not be available for the system, or might be too complicated for quick computation and analysis. In these scenarios, it becomes necessary to develop an understanding of the system using data-driven methods. With the advancement in tools to generate, store, transport and analyze high quality and high quantity data, it has become inevitable to rely on data-driven methods to extract governing equations and patterns for a system. However, the use of data-driven methods to understand dynamical systems has been limited. In chemical engineering systems a single unit can exhibit complex dynamics. This motivates the need to explore the potential of data-based methods to extract dynamic governing equations in such systems.

In this work, we try to identify the system dynamics of a distillation column. Distillation columns are among the most well studied and established units in Chemical Engineering. We first build a dynamic process flow simulation of this distillation column to generate time series data. We then apply data-driven system identification to this system and try to answer the question of whether data-based machine learning methods can replace first principles.

The paper is organized as follows. In Section 2.1 we explain the method of Sparse Regression of Dynamical Systems. This method has been shown (Brunton et al. [2016], Hoffmann et al. [2019]) to be able to extract sparse governing equations for dynamic systems and to balance model complexity with model accuracy. In Section 2.2, we explain the process flow simulation development on Aspen Plus® for the distillation column. The rest of Section ?? deals with creating a dynamic simulation, data generation, model training, selection and testing. In Section 3 we show the results of the research and interpret them with the goal of establishing the extent to which machine learning can substitute for first principles. We finally discuss the key takeaways and the prospects for future research in Section 4.

Sparse regression is a machine learning methodology which works under the assumption that the governing equations of most dynamical systems contain very few terms. These equations can be considered sparse in the function space, and the system is expected to evolve on a low dimensional manifold. Sparse Identification tries to discover these equations from noisy time series data.

We consider systems whose governing equations are non-linear ODEs of the form

$$\frac{dx(t)}{dt} = f(x(t), u(t))$$

where $x(t) \in \mathbb{R}^n$ denotes the state of the system at time $t$, $u(t) \in \mathbb{R}$ is the input function value at time $t$, and $f(x(t), u(t))$ is a linear combination of non-linear functions of $x(t)$ and $u(t)$. Mathematically,

$$\dot{x}(t) = \sum_{i=1}^{k} \xi_i \, \theta_i(x(t), u(t)), \qquad x(t) = [x_1(t) \;\; x_2(t) \;\; \ldots \;\; x_n(t)]$$

where $x_1, \ldots, x_n$ are the states of the system, the $\theta$s are non-linear functions called the candidate terms of $f(x, u)$, and the $\xi$s are the coefficients of the terms. We expect most of the $\xi$s to be zero, making $f(x, u)$ sparse in the number of terms. The goal of the algorithm is to identify the very few terms which make up $f(x, u)$ from a very large set of candidates. Instead of a combinatorial search for these terms by brute force, the algorithm includes a penalty for model complexity. This forces the selected function to be sparse.
By forcing sparse functions the algorithm also ensures that the model obtained does not overfit the data.

The data required for the algorithm is a time series of states arranged in a matrix $X(t) \in \mathbb{R}^{m \times n}$ of the form

$$X(t) = \begin{bmatrix} x_1(t) & x_2(t) & \cdots & x_n(t) \\ x_1(t-1) & x_2(t-1) & \cdots & x_n(t-1) \\ \vdots & \vdots & & \vdots \\ x_1(t-m+1) & x_2(t-m+1) & \cdots & x_n(t-m+1) \end{bmatrix}$$

The derivative of $X(t)$, $\dot{X}(t) \in \mathbb{R}^{m \times n}$, is a matrix of the form

$$\dot{X}(t) = \begin{bmatrix} \dot{x}_1(t) & \dot{x}_2(t) & \cdots & \dot{x}_n(t) \\ \dot{x}_1(t-1) & \dot{x}_2(t-1) & \cdots & \dot{x}_n(t-1) \\ \vdots & \vdots & & \vdots \\ \dot{x}_1(t-m+1) & \dot{x}_2(t-m+1) & \cdots & \dot{x}_n(t-m+1) \end{bmatrix}$$

obtained by numerically differentiating $X(t)$. The input samples are stacked in the same way,

$$\mathbf{u}(t) = [u(t), \; u(t-1), \; \ldots, \; u(t-m+1)]^T$$

The governing equation becomes

$$\dot{X}(t) = \Theta(X(t), \mathbf{u}(t)) \, \Xi$$

with $\Theta(X(t)) \in \mathbb{R}^{m \times k}$ given by $[\theta_1(X(t)) \;\; \theta_2(X(t)) \;\; \ldots]$ and $\Xi \in \mathbb{R}^{k \times n}$ given by $[\xi_1 \;\; \xi_2 \;\; \ldots \;\; \xi_n]$.

We try to identify the sparse matrix $\Xi$ by solving a least squares optimization problem. This involves a separate optimization for every column of $\dot{X}(t)$, so we have to solve $n$ optimization problems, one for each of the $n$ states of the system.

The algorithm forces sparsity by adding a regularization term to the objective function. The ideal regularization to force sparsity would be minimizing the $L_0$ norm of the coefficients (the number of non-zero terms in the vector), but this is an NP-hard problem (K. Natarajan [1995]). However, it has been shown (Donoho and Elad [2003]) that minimizing the $L_1$ norm is a convex optimization problem that also produces sparse solutions. This is referred to as the least absolute shrinkage and selection operator (LASSO). The LASSO optimization problem is

$$\xi_i^* = \underset{\xi_i}{\operatorname{argmin}} \; \| \dot{x}_i - \Theta(X(t)) \, \xi_i \|_2^2 + \alpha \| \xi_i \|_1, \qquad i = 1, \ldots, n$$

where $\alpha$ is the regularization parameter, which has to be tuned in order to achieve a trade-off between accuracy and sparsity. This optimization problem can be solved by standard convex optimization algorithms. We have used the coordinate descent algorithm, which is available as a prebuilt function in the scikit-learn Python library.

The capability of the algorithm to capture the dynamics of the system depends mainly on the candidate functions provided. Some prior knowledge of the process might help identify these candidate functions; this is where domain knowledge becomes important. The method also depends on the quality of the data. Because the derivatives are obtained by numerical differentiation, which amplifies noise, the derivatives and/or the variables need to be filtered.
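To make the LASSO step concrete, the following is a minimal sketch of coordinate descent with soft thresholding for a single state variable. It is not the authors' code: the study used the prebuilt scikit-learn routine, whereas this version is written in plain Python so that every step is visible. The objective is scaled as $\frac{1}{2m}\|\cdot\|_2^2 + \alpha\|\cdot\|_1$ (an assumption that follows the scikit-learn convention), and the toy system, the four candidate terms and the value of `alpha` are hypothetical.

```python
import random

def soft_threshold(rho, lam):
    """Closed-form solution of the one-dimensional LASSO subproblem."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(theta, y, alpha, n_sweeps=200):
    """Minimise (1/(2m)) * ||y - theta @ xi||_2^2 + alpha * ||xi||_1.

    theta: m rows, each holding the k candidate-function values of one sample.
    y:     m derivative samples of a single state variable.
    """
    m, k = len(theta), len(theta[0])
    xi = [0.0] * k
    # Mean squared value of each candidate column; fixed across sweeps.
    col_sq = [sum(row[j] ** 2 for row in theta) / m for j in range(k)]
    residual = list(y)  # equals y - theta @ xi while xi is all zeros
    for _ in range(n_sweeps):
        for j in range(k):
            if col_sq[j] == 0.0:
                continue
            # Correlate column j with the residual that ignores term j.
            rho = sum(row[j] * (r + row[j] * xi[j])
                      for row, r in zip(theta, residual)) / m
            new_xi = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_xi - xi[j]
            if delta != 0.0:
                residual = [r - row[j] * delta
                            for row, r in zip(theta, residual)]
                xi[j] = new_xi
    return xi

# Toy data (hypothetical): the "true" law is dx/dt = -2*x + 0.5*u,
# hidden among four candidate terms [x, u, x*u, x^2].
rng = random.Random(0)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
theta = [[x, u, x * u, x * x] for x, u in samples]
x_dot = [-2.0 * x + 0.5 * u for x, u in samples]  # noise-free derivatives
xi = lasso_coordinate_descent(theta, x_dot, alpha=0.01)
print([round(c, 3) for c in xi])
```

On this noise-free toy problem the two spurious candidates are driven exactly to zero, while the two true coefficients are recovered with the slight shrinkage towards zero that the $L_1$ penalty always introduces. In the setting described above, the same routine would be run once per state variable, with `theta` holding the full candidate library.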
In order to study the application of the machine learning approach to identifying the governing equations for dynamics in engineered systems, we selected the unit operation of the distillation column. Distillation columns are among the most ubiquitous unit operations in the process industries, ranging from petrochemicals and biomass processing to next-generation biorefineries. While the column looks simple from the operations perspective (after multiple decades of theory and design development), the dynamics of this system are complex. They depend on multiple physical laws such as heat transfer, diffusion principles and mass flow dynamics, on heuristics that relate the pressure to chemical properties, on temperature and pressure relationships, etc. Standard software used to model the operation of such a unit can include up to 1000s of equations. While control principles using linearized models are already being used to develop control systems for these units, an overall simple law that governs the dynamics of these systems is not known. Our goal in using a machine learning based approach was to test the applicability of a simplified data-driven approach to identifying the governing laws, as data can be generated more easily. A first-principles approach to identifying the complex dynamics of these systems would certainly be a much more difficult task. Next we describe the system selected, the generation of time series data, and the system variables selected to describe the state of the system for applying the SINDy method.
The system considered was an extractive distillation column used to recover methylcyclohexane (MCH) from a mixture of MCH and toluene. Since MCH (Boiling Point = 101 °C) and toluene (Boiling Point = 110.6 °C) have very close boiling points, they cannot be separated by a conventional distillation column. Therefore, we use phenol (Boiling Point = 181.7 °C), which has a higher affinity towards toluene, to alter the relative volatility and promote separation. An equimolar mixture of MCH and toluene forming the feed stream and a pure phenol stream are fed to the distillation column. MCH is extracted as the overhead product while toluene and phenol leave as the bottoms products. The process flow diagram for the column is given in Fig. 1. The column was modelled as a RadFrac unit. The specifications of the distillation column used are listed in Table 1 and the feed conditions are given in Table 2.

Figure 1: Process Flow Diagram - Distillation Column

The column is able to recover . of the MCH originally present in the feed stream. The process flow diagram is then exported to Aspen Dynamics for running dynamic simulations that allow extracting the rules governing the dynamics of this system. The first-principles mechanistic model has 2403 variables and 1848 equations as identified by Aspen Dynamics; however, the structure of all these equations is not known.

Table 1: Distillation Column Specifications

| | Specification | Value |
|---|---|---|
| | No. of stages | 22 |
| | Reflux Ratio | 8 |
| | Distillate Rate (lbmol/hr) | 200 |
| | FEED Stage | 14 |
| | PHENOL Stage | 7 |
| | Stage 1 (Condenser) Pressure (psia) | 16 |
| | Stage 22 (Reboiler) Pressure (psia) | 20.2 |
| Tray Geometry | Diameter (ft) | 5 |
| | Spacing (ft) | 2 |
| | Weir Height (ft) | 0.164 |
| | Lw/D | 0.7267 |
| | % Active Area | 90 |
| | Overall Efficiency | 1 |
| | % Hole Area | 10 |
| | Hole Diameter (ft) | 0.0833 |
| | % Downcomer Escape Area | 10 |
| | Foaming Factor | 1 |
| Reflux Drum | Length (ft) | 6 |
| | Diameter (ft) | 3 |
| | Head Type | Horizontal |
| Sump | Height (ft) | 5 |
| | Diameter (ft) | 3 |
| | Head Type | Elliptical |
| Condenser | Type | Total |
| | Heat Transfer | LMTD |
| | Medium Temperature (F) | 68 |
| | Temperature Approach (F) | 18 |
| | Heat Capacity (Btu/lb-R) | 1.00315 |
| Reboiler | Type | Kettle |
| | Heat Transfer | Constant Duty |
| | Heat Duty (Btu/hr) | 31615232.64 |
Table 2: Feed Specifications

| | PHENOL | FEED |
|---|---|---|
| Molar Flow (lbmol/hr) | 1200 | 400 |
| Mole Fraction, Phenol | 1 | 0 |
| Mole Fraction, Toluene | 0 | 0.5 |
| Mole Fraction, MCH | 0 | 0.5 |
| Temperature (F) | 220 | 220 |
| Pressure (psia) | 20 | 20 |
In order to capture the dynamics, the system was perturbed by adding perturbations to the phenol feed stream while the feed flow rate was kept constant. This allows the approach to identify equations that govern the dynamics developed in the system due to changes in the extracting agent's flow rate. An initial sensitivity analysis was carried out in Aspen Plus Steady State (results shown in Fig. 2) to identify the valid values of the phenol flow rate for which the column can operate without errors.

Figure 2: Sensitivity Analysis on Exit Streams for Phenol Flow Rate Variations

The perturbations were restricted to a fraction of this zone and the rest of the valid region was used for testing.
Perturbations :
The phenol feed perturbation was implemented by executing a Task in Aspen Dynamics. The perturbation was a random mix of step changes, linear ramps and sigmoidal ramps with a time period of 1 hour each and amplitudes between 1000 lbmol/hr and 3000 lbmol/hr, generated randomly with a uniform probability distribution function. The simulation was run for 100 hours with a calculation step size of 0.01 hours. One such feed flow rate time series plot is given in Fig. 3. This generates 50001 equally spaced (in time) data points. The phenol feed time series becomes u(t), which is equivalent to a forcing function that drives the dynamics of the system.

Figure 3: Perturbations - Phenol Flow Rate (lbmol/hr) vs Time (hr)
Operating Conditions :
In order to define the system, the following variables were fixed as operating condition parameters: reflux ratio, toluene feed rate, MCH feed rate, distillation column sizing, tray geometry, reboiler geometry and sizing, condenser geometry and sizing, reboiler duty and condenser heat transfer coefficients. These conditions play a crucial role in the operation of the selected distillation column; fixing these parameters therefore allows us to identify the governing equations for the mechanisms that drive the dynamics of the flow streams. Further, in order to test the robustness of the extracted equations, the structure of the obtained equations was compared across different operating conditions obtained by altering these parameters. The different operating conditions tested are listed in Table 3. This testing method is explained further in Section 2.4.
| Parameter | System 1 | System 2 | System 3 | System 4 |
|---|---|---|---|---|
| Reflux Ratio | | | | |
| Toluene Feed | 200 | 200 | 200 | 400 |
| MCH Feed | 200 | 200 | 400 | 200 |

Table 3: Different Operating Conditions Tested
States of the System
Studying the dynamics of a system requires following the state of the system by mapping the state to observable variables. In this case, for the machine learning algorithm to capture the complete dynamics of the system, we included the set of variables which change with the perturbations and are not fixed as operating conditions. The result will be a system of ODEs that can describe the evolution of the whole system as state space dynamics for these variables. For the system under consideration, we initially chose the following variables:

| Variable Description | Symbol | Used in ODEs |
|---|---|---|
| OVERHEAD Stream Temperature | Top T | $Top_T$ |
| OVERHEAD Stream - Phenol Flow Rate | Top Ph | $Top_{Ph}$ |
| OVERHEAD Stream - Toluene Flow Rate | Top Tol | $Top_{Tol}$ |
| BOTTOMS Stream - MCH Flow Rate | Bot MCH | $Bot_{MCH}$ |
| BOTTOMS Stream - Phenol Flow Rate | Bot Ph | $Bot_{Ph}$ |
| BOTTOMS Stream - Toluene Flow Rate | Bot Tol | $Bot_{Tol}$ |
| BOTTOMS Stream Temperature | Bot T | $Bot_T$ |
| Condenser Duty | Q Cond | $Q_{cond}$ |
| Reboiler Vapour Flow Rate | Vap Reb | $Vap_{Reb}$ |
| Stage 1 (Condenser) Pressure | P1 | $P_1$ |
| Stage 22 (Reboiler) Pressure | P22 | $P_{22}$ |

Table 4: State Space Variables for Distillation Column Dynamics

These variables are significant in terms of column requirements, as the equations developed can later be used for obtaining a specific extent of separation or quality of product, for ensuring that the pressure in the column stays within safety limits, or for estimating energy requirements. ODEs in terms of these variables make these use cases possible.

However, due to the presence of trace quantities of chemicals in some streams, the equations for those chemicals produced inaccurate results. This can be attributed to the low Signal to Noise Ratio (SNR) for these variables, with the noise arising from numerical integration. To improve the model, the total flow rate of the overhead stream was also included as a state variable. The total flow rate is a redundant variable, as it can be estimated as the sum of the individual flow rates. But since the total flow has a higher SNR, it is expected to produce better results.
Candidate Functions
The variables were first mean shifted and auto scaled before generating the candidate functions. We used 360 candidate functions of the form

$$f_i = x_1^{a_1^{(i)}} x_2^{a_2^{(i)}} \cdots x_{14}^{a_{14}^{(i)}}, \qquad i = 1, \ldots, k$$

where the exponents $a_j^{(i)} \in \mathbb{Z}$ are integers, each restricted to a bounded range that includes negative values, with an upper bound on their sum over $j$. We also used 70 candidate functions of the form $\sin(x_i)$, $\cos(x_i)$, $\ln(|x_i|)$, $e^{x_i}$ and $\sqrt{|x_i|}$ for all $i = 1, \ldots, 14$. These functions were chosen without using any strong understanding of the system, to check whether the algorithm can work with very little to no domain knowledge.

The algorithm was implemented in Python 3.6.5 using the libraries pandas, numpy, sklearn, scipy, matplotlib and itertools. We used numerical differentiation with the total variation regularization method developed in Chartrand [2011] to obtain the derivatives of the variables. The data was split in the ratio 3:1:1 for training, cross validation and testing.
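The construction of such a candidate library can be sketched with `itertools`. The exponent bounds used below (`lo=-1`, `hi=2`, `max_total=2`, applied to the sum of absolute exponents) are illustrative placeholders rather than the bounds used in the study, so the number of monomials produced does not reproduce the 360 reported above. The five elementary functions per variable do follow the text and give 70 terms for 14 variables. Function and term names are hypothetical.

```python
import itertools
import math

def monomial_exponents(n_vars, lo=-1, hi=2, max_total=2):
    """Yield exponent tuples with lo <= a_j <= hi and 0 < sum(|a_j|) <= max_total.

    Only the variables that are actually active are enumerated, which avoids
    looping over all (hi - lo + 1) ** n_vars exponent combinations.
    """
    nonzero = [e for e in range(lo, hi + 1) if e != 0]
    for n_active in range(1, max_total + 1):
        for idx in itertools.combinations(range(n_vars), n_active):
            for exps in itertools.product(nonzero, repeat=n_active):
                if sum(abs(e) for e in exps) <= max_total:
                    a = [0] * n_vars
                    for j, e in zip(idx, exps):
                        a[j] = e
                    yield tuple(a)

ELEMENTARY = {
    "sin": math.sin,
    "cos": math.cos,
    "ln_abs": lambda v: math.log(abs(v)),
    "exp": math.exp,
    "sqrt_abs": lambda v: math.sqrt(abs(v)),
}

def build_library(n_vars, lo=-1, hi=2, max_total=2):
    """Return (name, function) pairs; each function maps one data row to a value."""
    library = []
    for a in monomial_exponents(n_vars, lo, hi, max_total):
        name = "*".join(f"x{j + 1}^{e}" for j, e in enumerate(a) if e != 0)
        # Default argument a=a freezes the exponent tuple for this term.
        library.append(
            (name, lambda row, a=a: math.prod(v ** e for v, e in zip(row, a)))
        )
    for label, fn in ELEMENTARY.items():
        for j in range(n_vars):
            library.append((f"{label}(x{j + 1})", lambda row, j=j, fn=fn: fn(row[j])))
    return library

# Evaluate every candidate on one sample row of two (already scaled) variables.
# Rows containing exact zeros would break ln|x| and negative powers.
library = build_library(n_vars=2)
row = (2.0, 0.5)
theta_row = [fn(row) for _, fn in library]
print(len(library), dict(zip([name for name, _ in library], theta_row))["x1^1*x2^-1"])
```

Stacking `theta_row` for every time sample gives the matrix $\Theta$ used in the sparse regression. Ratio terms such as `x1^1*x2^-1` arise from the negative exponents, which is how concentration ratios can enter the identified equations.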
Two methods were used to select models. These two methods differ in the intended final use of the model and navigate a trade-off between accuracy of prediction and interpretability of the obtained equations. The candidate models differed in the value of the L1 norm regularization parameter. Models with a low regularization parameter had a higher number of terms and higher training set accuracy than those with high regularization.
The first method selected the model with the highest cross validation accuracy. For some variables, the cross validation accuracy had a clear peak, as shown in Fig. 4a. The accuracy is expected to initially increase with reducing regularization but later decrease due to overfitting. However, some of the selected models had too many terms, making it difficult to interpret their physical meaning. Also, some variables, as in Fig. 4b, did not exhibit this clear peak, and the accuracy kept increasing with smaller regularization. The implications of this observation are discussed in Section 3.

Figure 4: Cross Validation Model Selection. (a) Clear cross-validation peak. (b) Without a clear cross-validation peak.
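The cross-validation selection rule can be sketched as follows. The helper names and the example scores are hypothetical and are not results from the study. The "clear peak" test used here, namely that the best regularization value lies strictly inside the scanned range, is an assumption about how the two cases in Fig. 4 could be told apart automatically.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination, used here as the accuracy measure."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def select_by_cross_validation(cv_scores):
    """Pick the regularization value with the highest cross-validation R^2.

    cv_scores maps each regularization value to its cross-validation R^2.
    Returns (best_value, has_clear_peak); the peak is called clear only when
    the best value is neither the smallest nor the largest one scanned.
    """
    alphas = sorted(cv_scores)
    best = max(alphas, key=lambda a: cv_scores[a])
    has_clear_peak = best not in (alphas[0], alphas[-1])
    return best, has_clear_peak

# Made-up scores that mimic the two situations of Fig. 4.
peaked = {1e-4: 0.90, 1e-3: 0.95, 1e-2: 0.93, 1e-1: 0.80}
monotone = {1e-4: 0.97, 1e-3: 0.95, 1e-2: 0.90, 1e-1: 0.80}
print(select_by_cross_validation(peaked))
print(select_by_cross_validation(monotone))
```

When no clear peak is found, the best score sits at the edge of the scanned range, which is the situation in which a complexity-aware tie-break becomes necessary.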
To avoid selecting too many terms and to break ties in cases without a clear cross validation peak, a selection score based on model complexity was defined. Based on the score given by Eq. 1, the model with the highest score was selected.

$$\alpha k - \beta \ln\left(R^2_{CV}\right) \tag{1}$$

where $\alpha$ and $\beta$ are weights selected based on inspection of the trade-off graphs, $k$ denotes the number of terms in the obtained equation and $R^2_{CV}$ is the cross validation $R^2$ accuracy.

Different testing methods were employed to quantify the goodness of the developed equations for different purposes. We looked at the accuracy of predicting $\dot{X}(t)$ given $X(t)$ and $u(t)$, and at the accuracy of predicting $X(t)$ from $u(t)$ and the initial condition by integrating the ODEs obtained. The results of these tests along with their interpretations are available in Section 3 and Appendix 4. The methods employed were:

Test Data: Tests the accuracy of the developed model on data selected randomly and excluded from training. This gives an idea about overfitting and about the predictive ability of the model under conditions similar to those under which the training data was obtained. Low success under this test could indicate overfitting.

Outside Perturbation Region: A new data set is created by changing the feed perturbation region, and the model is tested on this new data. This checks whether the model was able to capture the complete dynamics of the system. Low accuracy under this test would indicate incompleteness of the model in terms of missing critical state variables or insufficient candidate functions.

Long Time Accuracy: The dynamical system is run for a longer time (250 hours) than the training time (100 hours) to generate test data. This helps identify long time dynamic effects or time based evolution of the system which could have been missed by the algorithm.

Similar System Structural Comparison
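The second kind of accuracy check described above, predicting the state trajectory from the input and the initial condition, amounts to integrating the identified ODE and comparing the rollout with reference data. The sketch below does this for a single state with forward Euler integration. The model `dx/dt = -2*x + 0.5*u`, the constant input and the step size are hypothetical, and the integrator is an assumption, since the text does not say which integration scheme was used. The real system has 13 coupled states.

```python
import math

def rollout(rhs_terms, x0, u_series, dt):
    """Forward-Euler integration of dx/dt = sum(coef * term(x, u)) for one state.

    rhs_terms is a list of (coefficient, function) pairs, i.e. the non-zero
    entries of one sparse model; u_series holds the sampled input.
    """
    xs = [x0]
    for u in u_series[:-1]:
        x = xs[-1]
        dxdt = sum(coef * term(x, u) for coef, term in rhs_terms)
        xs.append(x + dt * dxdt)
    return xs

def r2_score(y_true, y_pred):
    """Coefficient of determination between reference and predicted series."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical identified model: dx/dt = -2*x + 0.5*u, driven by u = 1.
model = [(-2.0, lambda x, u: x), (0.5, lambda x, u: u)]
dt, n_samples = 0.001, 2001
u_series = [1.0] * n_samples
predicted = rollout(model, x0=0.0, u_series=u_series, dt=dt)

# Analytical solution for constant u = 1 and x(0) = 0 serves as the reference.
reference = [0.25 * (1.0 - math.exp(-2.0 * i * dt)) for i in range(n_samples)]
print(r2_score(reference, predicted))
```

Rollout accuracy is a stricter test than derivative accuracy, because errors in the predicted derivative accumulate along the trajectory. This is why it is particularly informative for the long time and outside perturbation region tests.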
The model was trained on the four systems mentioned in Table 3. The developed equations were used to predict $\dot{X}(t)$ from $X(t)$ and the input for the test data. These results, along with the sparsity of the models given by the number of terms N, are given in Tables 5 and 6. The training and testing were done for 2 values of $\alpha$, corresponding to low regularization and high regularization.

Table 5: Training and Test $R^2$ values for the 4 systems (Basic and MCH400). Columns: Train, Test and N under low regularization and under high regularization, for each system. Rows: Top F, Top T, Top MCH, Top Ph, Top Tol, Bot T, Bot MCH, Bot Ph, Bot Tol, Cond Q, Vap Reb, P1, P22.

Table 6: Training and Test $R^2$ values for the 4 systems (T400 and RR6). Columns and rows as in Table 5.

The models predict $\dot{X}(t)$ with reasonable accuracy from $X(t)$ and $u(t)$. Reducing the regularization increases the accuracy on the test data. This trend is seen across variables and down to very small regularization parameter values. This indicates that we are unable to capture enough information from the data using the provided candidate functions and number of terms. This could indicate either insufficient candidate functions and state variables or the absence of a low dimensional function space representation for the system. Ways to analyze and possibly overcome this are discussed in Section 4.

System 1 was also tested on two other simulations (one run for a longer time and the other outside the training perturbation region), as explained in Section 2.4. The results for these two tests are in Table 7. Sample result plots of $\dot{X}(t)$ vs $t$ for Prediction vs True Values for these tests are in Fig. 5 and 6.

Table 7: Long Time and Testing Outside the Training Perturbation Region. Columns: Low $\alpha$ and High $\alpha$ for the Long Time test and for the Outside Training test. Rows: Top F, Top T, Top MCH, Top Ph, Top Tol, Bot T, Bot MCH, Bot Ph, Bot Tol, Cond Q, Vap Reb, P1, P22.

Figure 5: Long Time Data Set. (a) Condenser Pressure Derivative vs Time - Good Predictions. (b) Reboiler Duty Derivative vs Time - Good Predictions.

However, the performance is subpar in the region outside the training perturbation for most of the states. Also, with higher regularization the model marginally improves, as opposed to all the previous observations, where the model kept getting better on the test set with decreasing regularization. This indicates that the available variables and candidate functions are overfitting not the training data but the state of the system in the training region. This can be resolved by including new state variables which will make the model obtained invariant to the training perturbation region.
Figure 6: Outside Training Perturbation Data Set. (a) Outside Training Perturbation - Good Predictions. (b) Outside Training Perturbation - Poor Predictions.

In Fig. 6b we can see a variable for which the algorithm has poor results. The model predicts peaks in regions of steady operation. This could be because a limiting variable which dictates the dynamics in this region has been missed, so that the system does not exhibit significant dynamics here while the model predicts dynamics. In Fig. 6a, however, the prediction closely follows the true values. By using these two equations for the same component in different outlet streams, we can try to understand which states might be missing. By incorporating the missing states we can iteratively improve the model.
The ODEs obtained for the 4 systems were compared with each other for similarity in the terms selected. The number of such similar terms for two levels of regularization, along with the total number of terms, is provided in Appendix B.1. Appendix B.1 also has a list of the terms with the most repetitions across the 4 systems tested. These results can be interpreted as the dynamic equivalent of sensitivity analysis in steady systems. If the complete dynamics had been captured, most of the terms in the ODEs would have been repeated across the systems. However, this was not observed. Only a small share of the total terms was common across the 3 systems where the feed compositions were altered. This could mean that we have not completely described our system with the current set of states. We need to look for variables which are crucial in deciding the dynamics by performing sensitivity analyses on the operating conditions too.

However, on reducing regularization we notice that the fraction of terms retained across the systems either increases or remains the same in most cases. This indicates that as more candidate functions are selected, they are able to explain the model better, even if only by a small increment. This result correlates with the prediction accuracy discussed in Section 3.1, which kept improving with smaller regularization. We also find the same terms repeating across all 4 systems more commonly. The system with a different Reflux Ratio (the only column specification varied) had no common terms with the other systems under high regularization but had an increasing number of common terms under low regularization.
This could further indicate that the system might not be truly sparse in function space, highlighting the possible limitations of using SINDy to identify the complex dynamics of an unknown system without some knowledge of the functional space that may govern the dynamics of these systems.

A similar analysis was carried out between the training set and the test set with phenol feed outside the training perturbation region. The results of this analysis are listed in Table 8.

Table 8: Structural Similarity of ODEs

| Variable | Low α: Common | Low α: Total | High α: Common | High α: Total |
|---|---|---|---|---|
| Top F | | | | |
| Top T | | | | |
| Top MCH | | | | |
| Top Ph | | | | |
| Top Tol | 13 | 28 | 5 | 18 |
| Bot T | 18 | 39 | 5 | 16 |
| Bot MCH | 21 | 39 | 3 | 18 |
| Bot Ph | 19 | 35 | 6 | 13 |
| Bot Tol | 21 | 37 | 4 | 12 |
| Cond Q | 13 | 36 | 3 | 14 |
| Vap Reb | 17 | 35 | 5 | 17 |
| P1 P22 | 17 | 34 | 3 | 11 |

We see that lowering regularization in most of the cases decreases the fraction of common terms. However, in the other systems tested inside the perturbation region, this was not the case. This, along with the interpretations of Table 7, further confirms that some crucial state variables or functional forms are being missed. Even though the prediction accuracy is high, if the true mechanisms were captured for these complex systems, the same terms should appear in the governing equations. If the aim of the work is to obtain simpler equations that can capture the non-linear dynamics, the algorithm performs well. But in order to understand the true physical mechanisms, the SINDy algorithm perhaps needs to be provided with functional forms determined by domain experts. In the reaction kinetics identification of Hoffmann et al. [2019], for example, the authors provided functional forms determined by the "law of mass action", which is a known physical law that drives rate kinetics and mechanisms. To improve on the distillation column differential equation identification, such knowledge about the relationship between top and bottom feed, temperature and pressure needs to be used to construct appropriate functions. This is challenging for the distillation column system because several heuristic based equations are used in the design of the separation system. Our future work will address this need to convert the complex design equations that govern the non-equilibrium system in a distillation column into appropriate functional forms to be used in extracting the governing differential equations.
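Counting common terms between two identified equations reduces to a set comparison of the candidate terms with non-zero coefficients. The sketch below assumes that each sparse model is stored as a mapping from term name to coefficient and that "Total" refers to the number of selected terms in a model. Both are assumptions about the bookkeeping, and the term names and coefficients are hypothetical.

```python
def selected_terms(model, tol=0.0):
    """Names of the candidate terms whose coefficient magnitude exceeds tol."""
    return {name for name, coef in model.items() if abs(coef) > tol}

def structural_overlap(model_a, model_b, tol=0.0):
    """Compare which candidate terms two sparse models have selected."""
    terms_a = selected_terms(model_a, tol)
    terms_b = selected_terms(model_b, tol)
    common = terms_a & terms_b
    return {
        "common": len(common),
        "total_a": len(terms_a),
        "total_b": len(terms_b),
        "shared_terms": sorted(common),
    }

# Hypothetical equations for one state variable, identified on two data sets.
inside_region = {"Bot_MCH/Bot_Tol": 0.8, "Bot_Tol/P22": -0.3, "sin(Top_Tol)": 0.05, "u": 0.0}
outside_region = {"Bot_MCH/Bot_Tol": 0.6, "Bot_Tol/P22": 0.0, "Top_T^2": 0.1, "u": 0.2}
print(structural_overlap(inside_region, outside_region))
```

A small tolerance `tol` can be used instead of an exact zero test when the solver leaves numerically tiny coefficients behind. The comparison deliberately ignores coefficient values, since the question being asked is whether the same functional terms are selected.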
The structure of the ODEs obtained for the basic system under high regularization is available in Appendix A. While these ODEs are very complex to interpret, representing the dynamics of this system with one equation per state variable is still a win compared with the more than 1000 complex equations that otherwise describe the dynamics of the system. However, most of the terms had no direct physical interpretation. Some of the terms, such as $\sin(Top_{Tol})$, the sine of the toluene concentration in the top flow, are not physically interpretable. Some of the commonly recurring terms that we found physically relevant were of the form $Conc^2$, which means that second order terms in concentration were found relevant for controlling the dynamics. The only feasible interpretation is that diffusion of two components, or cross diffusion, is driving the dynamics. This can be because of Fick's law of diffusion acting on both of the components involved. For example, one term in the Appendix is $Bot_{MCH}/Bot_{Tol}$, which is the ratio of the concentrations of MCH and toluene in the bottom flow. The appearance of this term in the equation driving the MCH in the top stream denotes some relationship with the difference in diffusivity of MCH and toluene in the extracting component, phenol. The form of the equation is surprising, because the functional form did not give Fick's law of diffusion, which actually needs concentration variation rather than just concentration. Still, the appearance of this term provides some hope that these data-driven approaches can learn about the governing mechanisms of dynamics in complex systems. Another term that we relate to a physical law is the ratio of concentration and pressure, such as the term $Bot_{Tol}/P_{22}$. This term involves the concentration of toluene in the bottom feed and the pressure of the last plate. We related this term to Henry's law, which relates the concentration of a solute in the liquid phase to the partial pressure of the solute in the gas phase. This term probably represents the relationship that the dynamics of the toluene concentration in the bottom stream are related to the pressure on plates where the component may exist in the vapor phase. The gas-liquid mass transfer processes in these systems are interconnected and complex, hence it is difficult to pin-point one single mechanism driving the dynamics. However, it is encouraging to see some functional forms that may be related to physical laws being picked up in these equations. In order to be able to identify the laws, more complex rules for the application of machine learning to complex engineered systems must be developed.

In this work our goal was to apply a machine learning approach to data generated from a mechanistic model of a distillation column, to test the hypothesis that governing physical laws for the dynamics of the system can be identified. We began with the approach of Sparse Identification of Non-Linear Dynamics (SINDy) because of its ability to give white box models and allow interpretation of the terms that drive the dynamics. We tested both the accuracy and the changes in the structure of the equations obtained under different design considerations of the system. We picked a distillation column because of its ubiquitous role in the chemical engineering world, from petrochemical industries to biomass refining. The results for prediction on the test data generated from mechanistic models were very encouraging, with most variables showing more than 80 % accuracy. Outside the perturbation range, the equations did not perform very well, which may be because of a change in dynamic regime. If the training data set only captured a particular dynamic regime, the model cannot capture the dynamics in a different regime. However, this is still an unresolved question from the mechanistic perspective: if the DEs truly capture physical mechanisms, they should provide insights into an impending regime change as well. From the perspective of physical interpretation of the equations obtained, it was encouraging to see terms such as $Concentration^2$ and the ratio of concentration to pressure. The former can be related to Fick's law of diffusion for two components in the column, whereas the latter can be related to Henry's law controlling the solubility of the components in the mixture, controlled by the pressure at different plates in the column. One interesting finding from extracting these DEs is the simplified relationship that was obtained between the component flow rate in the top stream and the component flow rates in the bottom stream, along with the pressure of the last plate. In actual distillation column design, a mass balance equation is solved for each plate, which finally relates the component concentration in the top stream to that in the bottom stream. This one simplified equation captures the whole dynamics. We think this is the strength of the machine learning approach. Based on the accuracy of prediction within certain time steps, a moving time window to train the model would be more appropriate. We expect that this can be used in better control systems design, because the method can capture the non-linear dynamics much better and the need for linearization, as prevalent in traditional control design, may be relaxed. In the end, novel machine learning advancement is opening up new avenues for looking at complex engineered systems where the traditional first-principles method of extracting governing equations may fail. However, there is still a long way to go. Greater cross communication between engineering and data science will be required to achieve breakthroughs in the limitations of engineering dynamical studies using the machine learning approach. Both fields must inform each other to overcome the limitations in algorithms as well.

References
Steven L. Brunton, Joshua L. Proctor, and J. Nathan Kutz. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15):3932–3937, 2016. ISSN 0027-8424. doi: 10.1073/pnas.1517384113.
Moritz Hoffmann, Christoph Fröhner, and Frank Noé. Reactive SINDy: Discovering governing reactions from concentration data. The Journal of Chemical Physics, 150(2):025101, 2019. doi: 10.1063/1.5066099. URL https://doi.org/10.1063/1.5066099.
B. K. Natarajan. Sparse approximate solutions to linear systems. SIAM J. Comput., 24:227–234, 04 1995. doi: 10.1137/S0097539792240406.
David L. Donoho and Michael Elad. Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proceedings of the National Academy of Sciences, 100(5):2197–2202, 2003. ISSN 0027-8424. doi: 10.1073/pnas.0437847100.
Rick Chartrand. Numerical differentiation of noisy, nonsmooth data. ISRN Applied Mathematics, 2011. doi: 10.5402/2011/164564. URL http://dx.doi.org/10.5402/2011/164564.
Appendices
A ODEs Obtained
The ODEs obtained for the basic system under high regularization are reported here.
d(Top_F)/dt = 0. V_Reb P^- − . Bot_Tol Phenol − . Bot_Tol P^- − . Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Bot_T Bot_Tol^- + 0. Top_Tol^- Phenol^- + 0. Top_Tol Bot_MCH + 0. Top_Ph^- Phenol^- + 0. Top_Ph^- P − . Top_F^- Bot_Tol − . sin(Top_Tol) − . sin(Bot_MCH) − . cos(Top_Tol)
d(Top_T)/dt = 0. V_Reb P^- − . Bot_Tol Phenol − . Bot_Tol P^- − . Bot_MCH^- Bot_Tol − . Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Bot_T P^- + 0. Bot_T Bot_Tol^- + 0. Top_Tol^- Phenol^- + 0. Top_Tol^- P + 0. Top_Tol Bot_MCH + 0. Top_Ph^- P − . Top_Ph Bot_MCH^- − . sin(Top_Tol) − . sin(Bot_MCH)
d(Top_MCH)/dt = 0. V_Reb P^- − . Bot_Tol Phenol − . Bot_Tol P^- − . Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Bot_T Bot_Tol^- + 0. Top_Tol^- Phenol^- + 0. Top_Tol Bot_MCH + 0. Top_Ph^- Phenol^- + 0. Top_Ph^- P − . Top_F^- Bot_Tol − . sin(Top_Tol) − . sin(Bot_MCH) − . cos(Top_Tol)
d(Top_Ph)/dt = − . Bot_MCH V_Reb + 0. Bot_MCH^- Bot_Tol − . Top_Tol Bot_MCH − . Top_Tol + 0. Top_T V_Reb^- + 0. sin(Top_Tol)
d(Top_Tol)/dt = 0. Phenol^- − . P^- Phenol^- + 0. Bot_Tol Phenol + 0. Bot_Tol^- + 0. Bot_MCH Phenol − . Bot_MCH^- Phenol − . Bot_MCH Phenol^- − . Bot_MCH^- − . Top_Tol Bot_MCH + 0. Top_Tol Bot_MCH^- + 0. Top_T V_Reb^- + 0. exp(Phenol) + 0. sin(Top_Tol) − . sin(P) + 0. sin(P) + 0. cos(Top_Tol) + 0. cos(Bot_Tol) + 0. cos(Phenol)
d(Bot_T)/dt = − . Bot_Tol Phenol + 1. Bot_Tol^- P − . Bot_Tol P^- − . Bot_MCH^- Phenol^- − . Bot_MCH^- V_Reb^- − . Bot_MCH^- Bot_Tol − . Bot_MCH + 0. Top_Tol Phenol − . Top_Tol^- Bot_MCH^- + 0. Top_Ph^- Bot_Tol^- + 0. Top_Ph^- Top_Tol + 0. sin(Top_Tol) − . sin(Q_cond) + 0. sin(P) + 0. sin(Phenol) − . cos(Top_Tol) − . cos(Bot_Tol) − . cos(V_Reb) + 0. cos(P)
d(Bot_MCH)/dt = 0. P^- Phenol^- + 0. Bot_Tol Phenol + 1. Bot_Tol P^- + 0. Bot_MCH^- V_Reb^- + 0. Bot_MCH Bot_Tol + 0. Bot_MCH^- Bot_Tol − . Bot_MCH Bot_Tol^- + 1. Bot_MCH − . Top_Tol Phenol + 0. Top_Tol^- Bot_Tol − . Top_Tol Bot_MCH − . Top_Ph Bot_MCH + 0. Top_Ph Top_Tol^- − . Top_MCH Bot_Tol^- + 0. Top_T V_Reb^- − . Top_F Bot_Tol^- − . sin(Top_Tol) − . sin(P) − . sin(Phenol) + 0. cos(Top_Tol) + 0. cos(Bot_Tol) + 0. cos(V_Reb)
d(Bot_Ph)/dt = − . Bot_Tol Phenol + 0. Bot_Tol^- P + 0. Bot_Tol^- P − . Bot_Tol P^- + 0. Bot_Tol − . Bot_MCH^- Phenol^- − . Bot_MCH Bot_Tol − . Bot_MCH^- Bot_Tol + 1. Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Top_Tol Phenol − . Top_Tol^- Bot_Tol − . Top_Tol^- Bot_MCH^- + 0. Top_Ph^- Top_Tol − . Top_T^- V_Reb + 0. Top_F Bot_Tol^- + 0. sin(Top_Tol) − . sin(Bot_MCH) − . sin(Q_cond) + 0. sin(Phenol) − . cos(Top_Tol) − . cos(Bot_Tol) + 0. cos(P)
d(Bot_Tol)/dt = − . Bot_Tol^- Phenol^- − . Bot_Tol^- P − . Bot_Tol^- + 0. Bot_Ph − . Bot_Ph^- − . Bot_Ph^- + 0. Bot_MCH^- Bot_Tol − . Bot_MCH + 0. Bot_T Bot_Ph − . log(Bot_MCH) + 0. log(Bot_Ph) + 0. sin(P) − . sin(P) + 0. cos(Top_Tol) + 0. sqrt(Bot_Ph)
d(Q_cond)/dt = − . V_Reb P^- + 0. Bot_Tol Phenol + 0. Bot_Tol P^- + 1. Bot_MCH Bot_Tol^- + 0. Bot_MCH − . Bot_T Bot_Tol^- − . Top_Tol Bot_MCH − . Top_Ph^- Phenol^- + 0. Top_Ph Bot_MCH^- − . Top_Ph^- + 0. Top_MCH^- Q_cond + 0. Top_F^- Bot_Tol + 0. sin(Top_Tol) + 0. sin(Bot_MCH)
d(V_Reb)/dt = 0. Bot_Tol^- Phenol^- − . Bot_Tol P^- + 1. Bot_Tol^- V_Reb + 0. Bot_Tol^- − . Bot_MCH^- Phenol − . Bot_MCH^- V_Reb^- + 0. Bot_MCH Bot_Tol − . Bot_MCH Bot_Tol^- + 0. Bot_MCH + 0. Top_Tol Bot_Tol + 0. Top_Tol^- Bot_Tol^- + 0. Top_Ph Top_Tol^- + 0. Top_Ph^- + 0. Top_T^- V_Reb − . sin(Bot_Ph) + 0. sin(P) + 0. cos(Top_MCH) + 0. cos(Bot_MCH) − . cos(Bot_Ph) + 0. cos(Bot_Tol)
d(P)/dt = − . P Phenol + 0. Bot_Tol^- Phenol^- − . Bot_Tol P^- + 0. Bot_MCH^- Bot_Tol − . Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Bot_T Bot_Ph + 0. Top_Tol^- Bot_Tol + 0. Top_Tol Bot_MCH + 0. Top_Ph^- Phenol^- − . log(Bot_MCH) + 0. exp(Bot_Ph) − . sin(Top_Tol) − . sin(Bot_MCH)
d(P)/dt = 0. Bot_Ph + 0. Bot_Ph + 0. Bot_MCH^- P + 0. Bot_MCH^- Bot_Tol − . Bot_MCH Bot_Tol^- − . Bot_MCH + 0. Bot_T Bot_Ph + 0. Top_Tol^- Bot_Tol + 0. Top_Tol Bot_MCH − . Top_Ph Phenol + 0. Top_Ph^- Bot_Tol + 0. log(Bot_Ph) + 0. exp(Bot_Ph) − . sin(Top_Tol) − . sin(Bot_MCH) + 0. sin(P)
B Common Terms across Systems
B.1 Top Stream Flow Rate
Table 9: High Regularization - Number of Terms Retained across Systems - Top_F
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6 F Bot_Tol^1 Phenol^1 Top_Tol^1 Bot_MCH^1Bot_Tol^1 P22^-1 sin(Top_Tol)Bot_MCH^1 Bot_Tol^-1Bot_MCH^2Top_Tol^-1 Phenol^-1Top_Tol^1 Bot_MCH^1Top_F^-1 Bot_Tol^1sin(Top_Tol)cos(Top_Tol)P1^1 Phenol^1Bot_MCH^1 Phenol^1Top_Tol^-1 Bot_MCH^-1Top_Ph^-2sin(P22)
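The Common and Total columns in the tables of this appendix compare the sets of terms retained for different column configurations. A minimal sketch of such a comparison follows. It assumes that Common counts the terms shared by two systems and Total counts the terms appearing in either, which the source does not spell out, and the two term lists are hypothetical rather than taken from the fitted models.

```python
def canonical(term):
    """Order-independent key for a product term such as 'Top_Tol^1 Bot_MCH^1'."""
    return tuple(sorted(term.split()))


def compare_terms(terms_a, terms_b):
    """Return (common, total): sizes of the intersection and union of two term lists."""
    a = {canonical(t) for t in terms_a}
    b = {canonical(t) for t in terms_b}
    return len(a & b), len(a | b)


# Hypothetical term lists (assumption), written in the notation of this appendix.
basic = ["Bot_Tol^1 Phenol^1", "Bot_Tol^1 P22^-1", "Bot_MCH^2", "sin(Top_Tol)"]
rr6 = ["Phenol^1 Bot_Tol^1", "Bot_MCH^2", "sin(P22)"]

common, total = compare_terms(basic, rr6)
print(common, total)  # prints: 2 5
```

Sorting the factors of each product before comparison means that Bot_Tol^1 Phenol^1 and Phenol^1 Bot_Tol^1 are treated as the same term.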
B.2 Top Stream Temperature
Table 12: High Regularization - Number of Terms Retained across Systems - Top T Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Table 14: Terms Retained across Systems (High Regularization) - Top_T
Bot_Tol^1 Phenol^1 Bot_Tol^1 P22^-1Bot_Tol^1 P22^-1 Top_Tol^1 Bot_MCH^1Bot_MCH^1 Bot_Tol^-1Top_Tol^-1 Phenol^-1Top_Tol^-1 P1^1Top_Tol^1 Bot_MCH^1sin(Top_Tol)P1^1 Phenol^1Bot_MCH^1 Phenol^1sin(P22)
B.3 Top Stream MCH Flow Rate
Table 15: High Regularization - Number of Terms Retained across Systems - Top_MCH
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
MCH
Bot_Tol^1 Phenol^1 Top_Tol^1 Bot_MCH^1Bot_Tol^1P22^-1Bot_MCH^1Bot_Tol^-1Bot_MCH^2Top_Tol^-1Phenol^-1Top_Tol^1Bot_MCH^1Top_Ph^-1P1^1Top_F^-1Bot_Tol^1sin(Top_Tol)cos(Top_Tol)P1^1Phenol^1
Bot_Tol^1P1^-1Bot_MCH^1Phenol^1Top_Tol^-1Bot_MCH^-1sin(P22)
B.4 Top Stream Phenol Flow Rate
Table 18: High Regularization - Number of Terms Retained across Systems - Top_Ph
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
P h
Top_Tol^1 Bot_MCH^1 Top_Tol^2Top_Tol^2 Top_T^1 V_Reb^-1Top_T^1 V_Reb^-1 sin(Top_Tol)sin(Top_Tol) sin(Bot_MCH)P1^-1 Phenol^-1 sin(Q_Cond)Bot_Tol^-2 sin(Phenol)Bot_MCH^1 Phenol^-1 cos(Top_T)Bot_MCH^1 Bot_Tol^-1 cos(Bot_Ph)Bot_MCH^2 cos(Q_Cond)Bot_T^1 Bot_Ph^1 cos(Phenol)exp(Phenol)sin(Top_T)sin(Bot_T)sin(Bot_MCH)sin(Bot_Ph)sin(Bot_Tol)sin(Q_Cond)sin(V_Reb)sin(P1)sin(Phenol)cos(Top_T)cos(Bot_T)cos(Bot_Ph)
cos(Q_Cond)cos(V_Reb)cos(P1)cos(Phenol)Top_Tol^1Phenol^1
B.5 Top Stream Toluene Flow Rate
Table 21: High Regularization - Number of Terms Retained across Systems - Top_Tol
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
T ol
Phenol^-2 Phenol^-2Bot_Tol^1 Phenol^1 Bot_Tol^1 Phenol^1Bot_MCH^1 Phenol^-1 Top_Tol^1 Bot_MCH^1Top_Tol^1 Bot_MCH^1 Top_Tol^1 Bot_MCH^-1Top_Tol^1 Bot_MCH^-1 sin(Top_Tol)sin(Top_Tol) sin(P22)sin(P22) cos(Top_Tol)cos(Top_Tol)
B.6 Bottom Stream Temperature
Table 24: High Regularization - Number of Terms Retained across Systems - Bot T Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Table 25: Low Regularization - Number of Terms Retained across Systems
          Excluding       MCH400          T400            RR6
          Common  Total   Common  Total   Common  Total   Common  Total
Basic     16      41      25      50      31      50      16      41
MCH400    13      41                      37      60      18      41
T400      13      41                                      21      41
RR6       21      50
Table 26: Terms Retained across Systems (High Regularization) - Bot_T
Bot_Tol^1 Phenol^1 sin(Bot_T)Bot_MCH^-1 Phenol^-1 cos(Q_Cond)Bot_MCH^2Top_Tol^1Phenol^1Top_Tol^-1Bot_MCH^-1Top_Ph^-1Top_Tol^1cos(Top_Tol)cos(Bot_Tol)cos(P22)Bot_MCH^-1Bot_Tol^-1Top_Tol^1Bot_MCH^1Top_Tol^-1Bot_MCH^1Top_Ph^1Top_Tol^-1sin(Bot_T)cos(Bot_Ph)cos(Q_Cond)cos(P1)Top_T^1V_Reb^-1sin(Top_F)cos(Top_T)
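The high- and low-regularization tables report how many candidate terms survive the sparsification step. As a rough illustration of why stronger regularization retains fewer terms, the sketch below applies sequentially thresholded least squares, the sparsifier of the original SINDy paper, to a toy one-variable system. Whether the models reported here used this exact scheme is an assumption, and the data, candidate library, and thresholds are hypothetical and unrelated to the distillation column.

```python
import numpy as np


def stlsq(theta, dxdt, threshold, n_iter=10):
    """Sequentially thresholded least squares for theta @ xi ~ dxdt."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0  # prune coefficients below the threshold
        if small.all():
            break
        # Refit using only the surviving library columns.
        xi[~small] = np.linalg.lstsq(theta[:, ~small], dxdt, rcond=None)[0]
    return xi


# Toy data (assumption, not from the paper): dx/dt = -2 x + 0.05 sin(x)
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.0, size=200)
dxdt = -2.0 * x + 0.05 * np.sin(x)

# Hypothetical candidate library mixing powers and trigonometric terms.
names = ["x", "x^2", "x^-1", "sin(x)", "cos(x)"]
theta = np.column_stack([x, x**2, 1.0 / x, np.sin(x), np.cos(x)])

results = {}
for threshold in (0.01, 0.5):  # low vs. high regularization
    xi = stlsq(theta, dxdt, threshold)
    results[threshold] = [n for n, c in zip(names, xi) if c != 0.0]
    print(threshold, results[threshold])
```

With the larger threshold the small sin(x) contribution is pruned and only the dominant term remains, which mirrors the shorter equations obtained under high regularization.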
B.7 Bottom Stream MCH Flow Rate
Table 27: High Regularization - Number of Terms Retained across Systems - Bot_MCH
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Table 28: Low Regularization - Number of Terms Retained across Systems
          Excluding       MCH400          T400            RR6
          Common  Total   Common  Total   Common  Total   Common  Total
Basic     20      48      30      45      27      45      16      45
MCH400    14      45                      34      63      24      48
T400      15      45                                      26      48
RR6       21      45
Table 29: Terms Retained across Systems (High Regularization) - Bot_MCH
P1^-1 Phenol^-1 Bot_MCH^1 Bot_Tol^1Bot_MCH^1 Bot_Tol^1 Bot_MCH^1 Bot_Tol^-1Bot_MCH^1 Bot_Tol^-1 Bot_MCH^2Bot_MCH^2 Top_Ph^1 Top_Tol^-1Top_Ph^1 Bot_MCH^1 Top_F^1 Bot_Tol^-1Top_Ph^1 Top_Tol^-1 sin(Bot_T)Top_MCH^1 Bot_Tol^-1 cos(Bot_T)Top_T^1 V_Reb^-1 cos(Q_Cond)Top_F^1Bot_Tol^-1sin(Top_Tol)sin(P1)cos(Top_Tol)cos(Bot_Tol)cos(V_Reb)Bot_T^1Bot_Ph^1sin(Bot_T)sin(P22)cos(Bot_T)cos(Bot_Ph)cos(Q_Cond)cos(Phenol)cos(Top_MCH)cos(Bot_MCH)cos(P22)
B.8 Bottom Stream Phenol Flow Rate
Table 30: High Regularization - Number of Terms Retained across Systems - Bot_Ph
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
T400 19 48 22 48
RR6 28 57
Table 32: Terms Retained across Systems (High Regularization) - Bot_Ph
Bot_Tol^1 Phenol^1 Bot_MCH^2Bot_Tol^-1 P1^1 Top_T^-1 V_Reb^1Bot_Tol^1 P1^-1 Top_F^1 Bot_Tol^-1Bot_Tol^2 sin(Q_Cond)Bot_MCH^-1 Phenol^-1 Top_Ph^1 Top_Tol^-1Bot_MCH^1 Bot_Tol^1 sin(Bot_T)Bot_MCH^2Top_Tol^-1Bot_Tol^1Top_Tol^-1Bot_MCH^-1Top_T^-1V_Reb^1Top_F^1Bot_Tol^-1sin(Top_Tol)sin(Bot_MCH)sin(Q_Cond)cos(Top_Tol)cos(Bot_Tol)Q_Cond^1P22^-1Bot_Tol^-1Phenol^-1Bot_MCH^-2Bot_T^1Bot_Ph^1Top_Tol^1Bot_Tol^-1Top_Tol^-1Bot_MCH^1Top_Tol^-2Top_Ph^1Top_Tol^-1Top_MCH^1Bot_Tol^-1sin(Bot_T)sin(Bot_Tol)sin(P1)sin(P22)cos(Top_MCH)cos(Bot_T)cos(Bot_Ph)Q_Cond^-1P22^1sin(Top_MCH)sin(V_Reb)cos(Phenol)
B.9 Bottom Stream Toluene Flow Rate
Table 33: High Regularization - Number of Terms Retained across Systems - Bot_Tol
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
T ol
Bot_Tol^-1 Phenol^-1 Bot_Tol^-1 Phenol^-1Bot_Tol^2 Bot_T^1 Bot_Ph^1Bot_MCH^1 Bot_Tol^-1 sin(Top_Tol)Bot_MCH^2 sin(Q_Cond)Bot_T^1 Bot_Ph^1 Top_Ph^1 Top_Tol^-1log(Bot_MCH)sin(Top_T)sin(Top_MCH)sin(Top_Tol)sin(Q_Cond)sin(V_Reb)sin(P22)cos(Bot_MCH)Q_Cond^1P22^-1Bot_Tol^1Phenol^1Bot_MCH^1Phenol^1Bot_MCH^-1Phenol^1Top_Tol^-1Bot_MCH^1Top_Ph^1Top_Tol^-1Top_T^1V_Reb^-1sin(Bot_T)sin(Bot_Ph)sin(Bot_Tol)sin(P1)cos(Bot_Ph)Top_Ph^2
B.10 Condenser Duty
Table 36: High Regularization - Number of Terms Retained across Systems - Q cond
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Cond
Bot_Tol^1 Phenol^1 Bot_Tol^1 P22^-1Bot_Tol^1 P22^-1 Top_Tol^1 Bot_MCH^1Bot_MCH^1 Bot_Tol^-1 Top_MCH^-1 Q_Cond^1Top_Tol^1 Bot_MCH^1 P1^1 Phenol^1Top_Ph^-2Top_MCH^-1Q_Cond^1Top_F^-1Bot_Tol^1sin(Top_Tol)P1^1Phenol^1Bot_MCH^1Phenol^1sin(P22)
B.11 Reboiler Vapor Flow Rate
Table 39: High Regularization - Number of Terms Retained across Systems - Vap_Reb
Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
MCH400 13 50 38 64 23 51
T400 11 50 23 51
RR6 18 50
Table 41: Terms Retained across Systems (High Regularization) - Vap_Reb
Bot_Tol^-1 Phenol^-1 Top_Ph^1 Top_Tol^-1Bot_Tol^1 P1^-1 Top_T^-1 V_Reb^1Bot_Tol^-1 V_Reb^1 sin(P22)Bot_MCH^-1 Phenol^1 cos(Top_MCH)Bot_MCH^1 Bot_Tol^1 cos(Bot_MCH)Bot_MCH^1 Bot_Tol^-1 cos(Bot_Ph)Bot_MCH^2Top_Tol^-1Bot_Tol^-1Top_Ph^1Top_Tol^-1Top_T^-1V_Reb^1sin(Bot_Ph)sin(P22)cos(Top_MCH)cos(Bot_MCH)cos(Bot_Ph)cos(Bot_Tol)Top_Tol^1Bot_MCH^1Top_Tol^-1Bot_MCH^-1sin(Bot_T)sin(Q_Cond)sin(Bot_MCH)cos(V_Reb)cos(Phenol)
B.12 Pressure Stage 1 (Condenser)
Table 42: High Regularization - Number of Terms Retained across Systems - P Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Table 44: Terms Retained across Systems (High Regularization) - P
Bot_Tol^-1 Phenol^-1 Bot_Tol^1 P22^-1Bot_Tol^1 P22^-1 Bot_MCH^1 Bot_Tol^-1Bot_MCH^1 Bot_Tol^-1 Top_Tol^-1 Bot_Tol^1Bot_MCH^2 Top_Tol^1 Bot_MCH^1Top_Tol^-1 Bot_Tol^1 sin(Bot_MCH)Top_Tol^1 Bot_MCH^1Top_Ph^-1 P1^1Top_Ph^-1 Top_Tol^-1Top_Ph^-2log(Bot_MCH)sin(Top_Tol)sin(Bot_MCH)Bot_MCH^-1 Phenol^1Bot_T^-1 Bot_Ph^-1Top_Ph^1 Top_Tol^-1sin(P22)
B.13 Pressure Stage 22 (Reboiler)
Table 45: High Regularization - Number of Terms Retained across Systems - P Excluding MCH400 T400 RR6
Common Total Common Total Common Total Common Total
Basic
MCH400
T400
RR6
Bot_MCH^1 Bot_Tol^-1 Bot_MCH^1 Bot_Tol^-1Bot_T^1 Bot_Ph^1 Top_Tol^-1 Bot_Tol^1Top_Tol^-1 Bot_Tol^1 sin(Bot_MCH)sin(Top_Tol)sin(Bot_MCH)Bot_Tol^1P22^-1Bot_MCH^-1Phenol^1Bot_T^-1Bot_Ph^-1