Closed-Loop Turbulence Control Using Machine Learning

Abstract

We propose a general model-free strategy for feedback control design of turbulent flows. This strategy called 'machine learning control' (MLC) is capable of exploiting nonlinear mechanisms in a systematic unsupervised manner. It relies on an evolutionary algorithm that is used to evolve an ensemble of feedback control laws until minimization of a targeted cost function. This methodology can be applied to any non-linear multiple-input multiple-output (MIMO) system to derive an optimal closed-loop control law. MLC is successfully applied to the stabilization of nonlinearly coupled oscillators exhibiting frequency cross-talk, to the maximization of the largest Lyapunov exponent of a forced Lorenz system, and to the mixing enhancement in an experimental mixing layer flow. We foresee numerous potential applications to most nonlinear MIMO control problems, particularly in experiments.


arXiv preprint [physics.flu-dyn]. Under consideration for publication in J. Fluid Mech.

Thomas Duriez†, Vladimir Parezanović, Laurent Cordier, Bernd R. Noack, Joël Delville, Jean-Paul Bonnet, Marc Segond and Markus Abel

Institut PPRIME, CNRS - Université de Poitiers - ENSMA, UPR 3346, Département Fluides, Thermique, Combustion, CEAT, 43, rue de l'Aérodrome, F-86036 Poitiers Cedex, France
Laboratoire de Mécanique de Lille, Boulevard Paul Langevin, 59655 Villeneuve d'Ascq Cedex, France
Ambrosys GmbH, Albert-Einstein-Str. 1-5, D-14469 Potsdam, Germany
LEMTA, 2 Avenue de la Forêt de Haye, F-54518 Vandoeuvre-lès-Nancy Cedex, France
University of Potsdam, Karl-Liebknecht-Str. 24/25, D-14476 Potsdam, Germany

(Received ?; revised ?; accepted ?. - To be entered by editorial office)


Key words: Nonlinear Dynamical Systems/Chaos, Flow control/Instability control, Turbulent flows/Turbulence control.

1. Introduction

Closed-loop turbulence control is a rapidly evolving field of fluid mechanics synergizing many different academic disciplines for engineering applications of epic proportion: drag reduction of transport vehicles, green energy harvesting of wind and water flows, and medical applications, just to name a few.

For many laminar flows, control theory has a well established framework for the stabilization of the steady Navier-Stokes solution based on a local linearization of the Navier-Stokes equation. Corresponding numerical and experimental stabilization studies include virtually any configuration, e.g. wakes (Roussopoulos 1993), cavity flows (Rowley & Williams 2006; Sipp & Lebedev 2007; Illingworth et al. [...]

† Email address for correspondence: [email protected]

[...] solution in contrast to laminar flow. Second, linear(ized) models cannot resolve important frequency cross-talk between the coherent structures, the mean flow and the stochastic small-scale fluctuations. Yet, frequency cross-talk is an important actuation opportunity as demonstrated by successful wake stabilization with high-frequency actuation (Glezer et al. [...]

In §2, the machine learning control strategy is described. The three chosen control problems and associated cost functions are defined in §3, and the results are presented in §4. Conclusions and future directions are provided in §5.

2. Machine learning control

In the following, we restrict the description to ordinary differential equations for reasons of comprehensibility. The system is represented in phase space by the vector $\mathbf{a} \in \mathbb{R}^{n_a}$, it is measured by sensors $\mathbf{s} \in \mathbb{R}^{n_s}$, and controlled by actuators $\mathbf{b} \in \mathbb{R}^{n_b}$,

$$\frac{d\mathbf{a}}{dt} = F(\mathbf{a}, \mathbf{b}), \quad \mathbf{s} = H(\mathbf{a}), \quad \mathbf{b} = K(\mathbf{s}), \tag{2.1}$$

with $F$ denoting a general nonlinear function, $H$ the measurement function, and $K$ the sensor-based control law. This law shall minimize the state- and actuation-dependent cost function:

$$J = J(\mathbf{a}, \mathbf{b}). \tag{2.2}$$

The cost function value grades how a given control law $K(\mathbf{s})$ performs relative to the problem at stake. The lower the value of the cost function, the better the control law solves the problem.

Figure 1. Left: Control design using MLC. During a learning phase, each control law candidate is evaluated by the dynamical system or experimental plant. This process is iterated over many generations of individuals. At convergence, the best individual (in grey) is determined and used for control. Right: Production of a new generation of individuals. Each individual $K_i^m$ is ranked by its cost $J_i^m$, $i$ pointing to the $i$th individual, $m$ to the $m$th generation. An individual of the subsequent generation can be a copy, a mutation or the result of the cross-over of individuals selected in the preceding generation according to their cost.

We propose a model-free design of the control law. Genetic programming (GP) is used to design the best control law $K(\mathbf{s})$ as a composition of elementary functions. A first set of control law candidates (called individuals) is generated by random compositions of selected elementary functions. The employed GP algorithm (Luke et al. [...] $J(\mathbf{a}, \mathbf{b})$. The next set of individuals (called generation) is generated by mutation, cross-over or replication of individuals with a specific rate for each process (see figure 1). The individuals used to produce the new generation are selected based on how well they minimize the cost function. A global extremum of the cost function is typically approximated well in a finite number of generations if the population contains enough diversity to explore the search space. The method has been shown to be successful (Lewis et al. [...]
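The learning loop described in this section (evaluation of each individual, ranking by cost, then replication, mutation and cross-over) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GP implementation employed in the paper: the scalar toy plant da/dt = a + b, the operator set, the rates (0.6 cross-over, 0.3 mutation, 0.1 replication), the tournament size, the depth limit and all function names are hypothetical choices made for the example.

```python
import math
import random

# Elementary functions used to compose control laws (assumed set for this sketch).
OPS = {"+": lambda x, y: x + y, "-": lambda x, y: x - y, "*": lambda x, y: x * y}

def random_tree(rng, depth=3):
    """Random composition of elementary functions of the sensor signal 's'."""
    if depth == 0 or rng.random() < 0.3:
        return "s" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, s):
    """Evaluate the control law b = K(s) encoded by an expression tree."""
    if tree == "s":
        return s
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, s), evaluate(right, s))
    return tree  # numeric constant

def cost(tree, gamma=0.01, dt=0.05, steps=200):
    """J = <a^2 + gamma*b^2> for the hypothetical plant da/dt = a + b, b = K(a)."""
    a, total = 1.0, 0.0
    for _ in range(steps):
        b = max(-10.0, min(10.0, evaluate(tree, a)))  # saturated actuation
        a += dt * (a + b)                             # explicit Euler step
        total += a * a + gamma * b * b
        if not math.isfinite(total) or abs(a) > 1e6:
            return float("inf")                       # diverged: worst possible grade
    return total / steps

def paths(tree, path=()):
    """Yield the path to every node of the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)

def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0

def evolve(pop_size=30, generations=15, seed=0):
    rng = random.Random(seed)
    population = [0.0] + [random_tree(rng) for _ in range(pop_size - 1)]  # 0.0 = no control
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=cost)          # rank individuals by their cost
        history.append(cost(ranked[0]))

        def select():                                  # tournament selection
            return min(rng.sample(ranked, 3), key=cost)

        nxt = [ranked[0]]                              # replication of the best individual
        while len(nxt) < pop_size:
            r, parent = rng.random(), select()
            if r < 0.6:                                # cross-over with a second parent
                other = select()
                donor = get(other, rng.choice(list(paths(other))))
                child = replace(parent, rng.choice(list(paths(parent))), donor)
            elif r < 0.9:                              # mutation: new random subtree
                child = replace(parent, rng.choice(list(paths(parent))), random_tree(rng, 2))
            else:                                      # plain replication
                child = parent
            nxt.append(child if depth(child) <= 8 else parent)
        population = nxt
    best = min(population, key=cost)
    return best, history + [cost(best)]
```

Calling `best, history = evolve()` returns the best expression tree and the best cost of each generation. Because the best individual is always replicated, the recorded best cost cannot increase from one generation to the next.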

3. Control problems

MLC is used to put the system in a desirable state, such as an equilibrium (§3.1), chaos (§3.2) or enhanced mixing (§3.3).

3.1. Generalized mean-field model

We first consider a generalized mean-field model describing frequency cross-talk for a variety of physical phenomena including fluid flows (Zielinska et al. [...]

$$\frac{d}{dt}\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} = \begin{pmatrix} \sigma_1 & \omega_1 & 0 & 0 \\ -\omega_1 & \sigma_1 & 0 & 0 \\ 0 & 0 & \sigma_2 & \omega_2 \\ 0 & 0 & -\omega_2 & \sigma_2 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ b \end{pmatrix} \tag{3.1}$$

with $\sigma_1 = \sigma_{10} - (a_1^2 + a_2^2 + a_3^2 + a_4^2)$. Hereafter, we denote the sum of squared amplitudes as energy to avoid linguistic sophistication. We set $\omega_1 = \omega_2/10 = 1$ and $\sigma_{10} = -\sigma_2$ = 0.[...] so that the first oscillator $(a_1, a_2)$ is unstable at the origin while the other one $(a_3, a_4)$ is stable. When uncontrolled ($b \equiv 0$), [...] and thus a cost function which measures the fluctuation energy of that unstable oscillator. For any useful application, the energy used for control is required to be small, hence, we penalize the actuation energy:

$$J = \left\langle a_1^2(t) + a_2^2(t) + \gamma b^2(t) \right\rangle_T, \tag{3.2}$$

with $\gamma = 0.01$ as penalization coefficient and $\langle \cdot \rangle_T$ denoting the average over the time interval $[0, T]$. Here, $T = 100 \times 2\pi/\omega_1$ is chosen to allow meaningful statistics. The quadratic form of the state and the actuation in the cost function is a standard choice in control theory. We apply MLC with full-state observation ($\mathbf{s} \equiv \mathbf{a}$) to exploit all potential nonlinear mechanisms to control the unstable oscillator.

Knowing the nonlinearity at stake, an open-loop strategy can be designed: exciting the stable oscillator at frequency $\omega_2$ will provoke an energy growth which stabilizes the first oscillator as soon as $a_1^2 + a_2^2 + a_3^2 + a_4^2 > \sigma_{10}$. Note that the linearization of (3.1) yields two uncoupled oscillators. Thus, the first oscillator is uncontrollable in a linear framework.

3.2. Lorenz system

As a second example, we consider the Lorenz system controlled in the third component:

$$\frac{da_1}{dt} = \sigma (a_2 - a_1), \quad \frac{da_2}{dt} = a_1 (\rho - a_3) - a_2, \quad \frac{da_3}{dt} = a_1 a_2 - \beta a_3 + b, \tag{3.3}$$

with full-state feedback $b = K(a_1, a_2, a_3)$, i.e. $\mathbf{s} \equiv \mathbf{a}$. The Lorenz system can be stable, periodic or chaotic depending on the set of used parameters. We employ $\sigma = 10$, $\beta = 8/3$ and $\rho = 20$, such that the uncontrolled system ($b \equiv 0$) is periodic. Instead of stabilizing an equilibrium, we demonstrate how to obtain a chaotic system. Existing strategies may stabilize or destabilize periodic orbits (Ott et al. [...] the largest Lyapunov exponent $\lambda_1$ while penalizing the actuation power with a factor $\gamma$. If $\lambda_1$ is positive, the system is chaotic and well-mixing. We define the cost function, which should be minimized, as:

$$\begin{aligned} J &= \exp(-\lambda_1) + \gamma \left\langle b^2(t) \right\rangle_T && \text{if } \sum_{i=1}^{3} \lambda_i < 0, \\ J &\to \infty && \text{if } \sum_{i=1}^{3} \lambda_i > 0, \end{aligned} \tag{3.4}$$

where $T = 100$ is the integration time and $\lambda_1 > \lambda_2 > \lambda_3$ are the Lyapunov exponents. These exponents are obtained by a standard algorithm (Wolf et al. [...] $J$ is assigned the largest computable real number on the computer if the sum of the Lyapunov exponents is positive or the states exceed the bounds we specify.

3.3. Experimental mixing layer

Figure 2. Experimental setup of the mixing layer. The hot-wire rake is placed 500 mm downstream of the separating plate to capture the structures in the shear layer. The spacing of the hot-wire probes is $\delta y = 8$ mm.

The TUCOROM mixing layer experimental demonstrator is a dual stream wind tunnel with independently controlled turbines. The test section after the trailing edge of the separating plate is of dimension width × height × length = 1.[...] × [...] × [...] m. In this experiment, the Reynolds number is $Re_\theta = U_c \theta/\nu = 500$ based on the convective velocity $U_c = (U_1 + U_2)/2$ [...] $\theta$. The sensors are made of a rake of 24 hot-wires to record velocity fluctuations on a vertical profile across the shear layer. The actuators are 96 micro-jets located at the tip of the separating plate, which can be triggered up to 500 Hz (see figure 2). The machine learning control strategy is applied to maximize the width of the mixing layer,

$$J = \frac{1}{W}, \quad \text{with} \quad W = \frac{\left\langle \sum_{i=1}^{24} s_i^2(t) \right\rangle_T}{\max_{i \in [1, 24]} \left( \left\langle s_i^2 \right\rangle_T \right)}, \tag{3.5}$$

where $s_i(t)$ is the velocity fluctuation as recorded by the hot-wire anemometer $i$, and $\langle \cdot \rangle_T$ is an average of all acquisitions during the evaluation time $T = 10$ s, corresponding to about 1000 Kelvin-Helmholtz periods. This cost function is minimized when the width $W$ of the fluctuation energy profile is maximized.
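As a hedged illustration of the cost function (3.5), the width $W$ can be computed from sampled hot-wire signals as the total fluctuation energy divided by the peak energy of a single probe. The function name `mixing_layer_cost`, the list-of-lists data layout and the handling of an all-zero signal are assumptions of this sketch; equal record lengths for all probes are also assumed.

```python
def mixing_layer_cost(signals):
    """Return (J, W) for signals[i][k], the velocity fluctuation of probe i at sample k.

    W = <sum_i s_i^2>_T / max_i <s_i^2>_T and J = 1 / W, following (3.5).
    """
    # Time-averaged fluctuation energy <s_i^2>_T of each probe.
    energies = [sum(x * x for x in s) / len(s) for s in signals]
    peak = max(energies)
    if peak == 0.0:
        return float("inf"), 0.0  # no fluctuation recorded: width undefined (sketch choice)
    # With equal record lengths, the average of the sum equals the sum of the averages.
    width = sum(energies) / peak
    return 1.0 / width, width
```

With this definition, a profile whose energy is concentrated on a single probe gives $W = 1$, while a profile spread evenly over all 24 probes gives $W = 24$, so minimizing $J$ favours broad fluctuation energy profiles.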

Figure 3. Controlled generalized mean-field model. When the energy contained in the first oscillator (top) is larger than $10^{-[\ldots]}$, the control (bottom) excites the second oscillator at frequency $\omega_2$; its energy grows so that $\sigma_1$ reaches approximately $-5$. This results in a fast decay of the energy in the first oscillator, after which the control goes into "standby mode". An animation of the controlled system can be found in §7.
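The generalized mean-field model (3.1) and the cost (3.2) of §3.1 can be explored numerically with a short fixed-step integration. The sketch below makes several assumptions that should be read as such: the value `SIGMA10 = 0.1` is an illustrative choice (the numerical value is not legible in this text), the actuation enters the fourth equation, the fourth-order Runge-Kutta scheme holds $b$ constant over each step, and the feedback `hypothetical_law` is a hand-written example of an energy-weighted phasor, not the control law found by MLC.

```python
import math

OMEGA1, OMEGA2 = 1.0, 10.0   # omega_1 = omega_2 / 10 = 1
SIGMA10 = 0.1                # assumed illustrative value
SIGMA2 = -SIGMA10            # sigma_10 = -sigma_2

def rhs(a, b):
    """Right-hand side of the generalized mean-field model (3.1)."""
    a1, a2, a3, a4 = a
    sigma1 = SIGMA10 - (a1 * a1 + a2 * a2 + a3 * a3 + a4 * a4)
    return (sigma1 * a1 + OMEGA1 * a2,
            -OMEGA1 * a1 + sigma1 * a2,
            SIGMA2 * a3 + OMEGA2 * a4,
            -OMEGA2 * a3 + SIGMA2 * a4 + b)

def simulate(control, t_end=100 * 2 * math.pi / OMEGA1, dt=0.01,
             gamma=0.01, a0=(0.1, 0.0, 0.01, 0.0)):
    """Integrate (3.1) with b = control(a); return (J, final state), J as in (3.2)."""
    a, total = a0, 0.0
    n = int(round(t_end / dt))
    for _ in range(n):
        b = max(-1.0, min(1.0, control(a)))   # actuation limited to [-1, 1]
        k1 = rhs(a, b)
        k2 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(a, k1)), b)
        k3 = rhs(tuple(x + 0.5 * dt * k for x, k in zip(a, k2)), b)
        k4 = rhs(tuple(x + dt * k for x, k in zip(a, k3)), b)
        total += a[0] ** 2 + a[1] ** 2 + gamma * b * b
        a = tuple(x + dt * (p + 2 * q + 2 * r + s) / 6
                  for x, p, q, r, s in zip(a, k1, k2, k3, k4))
    return total / n, a

def no_control(a):
    return 0.0

def hypothetical_law(a, gain=20.0):
    """Illustrative feedback: phasor a_4 weighted by the energy of the first oscillator."""
    return gain * a[3] * (a[0] ** 2 + a[1] ** 2)
```

For example, `simulate(no_control, t_end=200.0)` lets the first oscillator saturate at an energy close to `SIGMA10`, while `simulate(hypothetical_law, t_end=200.0)` feeds energy into the second oscillator and yields a lower cost. This only illustrates the cross-talk mechanism; it makes no claim about the performance of the MLC law.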

4. Results

In this section, we present the results of the MLC algorithm for the three examples discussed in §3: a system with frequency cross-talk (§4.1), the Lorenz system (§4.2) and the experimental mixing layer (§4.3).

4.1. Generalized mean-field model

The function space is explored by using a set of elementary ($+$, $-$, $\times$, $/$) and transcendental (e.g. exp, sin, ln) functions. The functions are 'protected' to allow them to take arbitrary arguments in $\mathbb{R}$. Additionally, the actuation command is limited to the range $[-1, 1]$ to emulate an experimental actuator. Up to 50 generations comprising 1000 individuals are processed.

The control law ultimately returned by the MLC process corresponds to the best individual of the last generation. The formula is given in §7. It can be summarized as follows:

$$b = K_1(a_4) \times K_2(a_1, a_2, a_3, a_4). \tag{4.1}$$

The function $K_1(a_4)$ describes a phasor control that destabilizes the stable oscillator. The function $K_2(a_1, a_2, a_3, a_4)$ acts as a gain dominated by the energy of the unstable oscillator. The performance and the behaviour of the control law are displayed in figure 3. The control law energizes the second oscillator up to 10 $\gg \sigma_{10}$ as soon as the first oscillator has an energy larger than $10^{-[\ldots]}$. This stabilizes the unstable oscillator very quickly, with a decay scaling roughly as $\exp(-t)$. After stabilization, the control stays at very low values. That keeps the stable oscillator at a correspondingly low energy $\approx$ [...], while the amplitude of the unstable oscillator is exponentially increasing with its initial growth rate $\sigma_{10}$. This control law exploits the frequency cross-talk and vanishes when not needed, i.e. $a_1 \approx a_2 \approx 0$. Such a control could not be derived from a linearized model of the system. Less energy is used as compared to the best periodic excitation.

4.2. Lorenz system

Figure 4. Controlled Lorenz systems with $\sigma = 10$, $\beta = 8/3$ and $\rho = 20$. For $\gamma = 1$ (left), the system exhibits chaotic behaviour ($\lambda_1$ = 0.[...] [...] $\rho = 28$ ($\lambda_1$ = 0.[...] [...] $\gamma = 0.01$ (center), the system exhibits more complex trajectories, the nature of the central fixed point has changed and $\lambda_1$ = 2.072. For $\gamma = 0$ (right), the nature of all fixed points has changed. The non-penalization of the actuation leads to a change in the scales ($\lambda_1$ = 17.[...] [...]
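To make the cost function (3.4) concrete, the following sketch integrates the controlled Lorenz system (3.3) and estimates the largest Lyapunov exponent from the divergence of two nearby trajectories with periodic renormalization. This two-trajectory estimate is swapped in for brevity and is not the algorithm of Wolf et al. used in the paper. Only the bound on the states is checked; the test on the sum of all three exponents is omitted. The function names, step size, transient time, initial condition and bound are assumptions of the example.

```python
import math

def lorenz_rhs(a, b, sigma=10.0, rho=20.0, beta=8.0 / 3.0):
    """Lorenz system (3.3) with the actuation b acting on the third component."""
    a1, a2, a3 = a
    return (sigma * (a2 - a1), a1 * (rho - a3) - a2, a1 * a2 - beta * a3 + b)

def rk4(a, control, dt, rho):
    """One Runge-Kutta step; the feedback b = K(a) is held constant over the step."""
    b = control(a)
    f = lambda state: lorenz_rhs(state, b, rho=rho)
    k1 = f(a)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(a, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(a, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(a, k3)))
    return tuple(x + dt * (p + 2 * q + 2 * r + s) / 6
                 for x, p, q, r, s in zip(a, k1, k2, k3, k4))

def largest_lyapunov(control, rho=20.0, t_end=100.0, dt=0.01, t_skip=20.0,
                     a0=(1.0, 1.0, 1.0), d0=1e-8, bound=1e3):
    """Return (lambda_1 estimate, <b^2>_T), or (None, None) if the state leaves the bound."""
    inside = lambda state: all(abs(x) <= bound for x in state)  # also rejects NaN
    a = a0
    for _ in range(int(round(t_skip / dt))):      # discard the initial transient
        a = rk4(a, control, dt, rho)
        if not inside(a):
            return None, None
    p = (a[0] + d0, a[1], a[2])                   # perturbed companion trajectory
    log_sum, b2_sum, n = 0.0, 0.0, int(round(t_end / dt))
    for _ in range(n):
        b2_sum += control(a) ** 2
        a, p = rk4(a, control, dt, rho), rk4(p, control, dt, rho)
        if not inside(a):
            return None, None
        d = math.dist(a, p)
        log_sum += math.log(d / d0)
        p = tuple(x + (y - x) * d0 / d for x, y in zip(a, p))   # renormalize separation
    return log_sum / (n * dt), b2_sum / n

def lorenz_cost(control, gamma, **kwargs):
    """J = exp(-lambda_1) + gamma <b^2>_T, and infinity if the state is unbounded."""
    lam, b2 = largest_lyapunov(control, **kwargs)
    if lam is None:
        return float("inf")
    return math.exp(-lam) + gamma * b2
```

As a sanity check, `largest_lyapunov(lambda a: 0.0, rho=28.0)` should return a clearly positive exponent for the canonical chaotic Lorenz system, whereas a strongly destabilizing feedback such as `lambda a: 100.0 * a[2]` drives the state out of the bound and is graded with an infinite cost.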

MLC is applied to the periodic Lorenz system to maximize the largest Lyapunov exponent while keeping the solution bounded. The basic operations that compose the control law are ($+$, $-$, $\times$, $/$) as well as randomly generated constants. The maximum number of generations is chosen as 50 with 1000 individuals each. We consider for $\gamma$ the values $\gamma_S = 1$, $\gamma_W = 0.01$ and $\gamma_N = 0$, representing strong, weak and no penalization of the actuation. This illustrates how the cost function definition influences the problem to be solved. After 50 generations, the best individuals (see §7) associated with strong, weak and no penalization have maximum Lyapunov exponents of $\lambda_1$ = 0.[...], 2.072 and 17.[...] [...] $\gamma_S$ and $\gamma_W$ are affine expressions of $a$ and the reduction of the actuation cost leads to a larger amplitude of the feedback. In those cases, the most efficient controls lead the system into behaviours close to the canonical Lorenz system ($\rho = 28$, $\lambda_1$ = 0.[...] [...] $\gamma_W$ the nature (from saddle point to spiral saddle point) and the position of the central fixed point of the actuated system are changed. If the actuation is not penalized ($\gamma = 0$), the feedback law is a complex nonlinear function of all states. The nature and position of all fixed points are changed as $\lambda_1$ reaches higher values.

4.3. Experimental mixing layer

MLC is applied to the TUCOROM experimental mixing layer demonstrator (Parezanović et al. [...] ($+$, $-$, $\times$, $/$, sin, cos, log, tanh). The micro-jets are turned on if the actuation command is positive and off otherwise. The number of generations is chosen to be 25 with 100 individuals each. The evaluation of one generation is achieved in 40 minutes of experiment. Minimizing the employed cost function (3.5) maximizes the width of the fluctuation energy profile in the shear layer. The control law ultimately returned by the MLC algorithm is compared to the reference open-loop control, a harmonic forcing at the most efficient frequency as determined by a parametric study (see figure 5). While the best open-loop forcing improves the cost function value by 55% (compared to the uncontrolled flow), MLC improves it by 67%. Moreover, the total flow rate through the actuation jets achieved by the MLC closed-loop control is reduced by 46%. These results will be further detailed in an upcoming publication.

Figure 5. Pseudo-visualizations of the TUCOROM experimental mixing layer demonstrator (Parezanović et al. [...]) for (a) the uncontrolled flow (width $W = 100\%$), (b) the best open-loop benchmark (width $W = 155\%$) and (c) MLC closed-loop control (width $W = 167\%$). The velocity fluctuations recorded by 24 hot-wire probes (see figure 2) are shown as a contour plot over the time $t$ (abscissa) and the sensor position $y$ (ordinate). The black stripes above the controlled cases indicate when the actuator is active (taking into account the convective time). The average actuation frequency achieved by the MLC control is comparable to the open-loop benchmark.

5. Conclusions and future directions

We propose a model-free optimization of sensor-based control laws for general multiple-input multiple-output (MIMO) plants, the 'machine learning control' (MLC). This strategy is based on genetic programming (GP). GP is one of the most versatile methods for function optimization in machine learning and includes genetic algorithms (GA). While GA performs a parameter identification of a given control law, GP also performs a structure identification of arbitrary nonlinear control laws. Thus, MLC comprises an increasingly popular control optimization based on GA. MLC is based on an ensemble (called 'generation') of general nonlinear functions (called 'individuals') and invests in an exploration of novel laws. Thus, MLC has a large chance to detect and exploit otherwise invisible local extrema. In contrast, model-free adaptive control is particularly suited for adjusting one or few parameters of prescribed open- or closed-loop control laws to changing flow conditions. Such online parameter adaptation is not part of the presented MLC method but could, in principle, be included.

As our first test-case, MLC has been successfully applied to a closed-loop stabilization of a generalized two-frequency mean-field model, detecting and exploiting frequency cross-talk in an unsupervised manner. Frequency cross-talk is of primary importance for large-scale turbulence control with complex interactions between the coherent structures at different dominant frequencies, the mean flow changing on large time scales and the cascade to small-scale structures with small associated time scales. By definition, frequency cross-talk is ignored in any linearized system. Another successful demonstration of MLC is closed-loop control for the maximization of the Lyapunov exponent (stretching) of the forced Lorenz equations. Again, this increase of unpredictability is a highly nonlinear phenomenon.

A challenging experimental closed-loop control demonstration is the increase of the mixing layer width in the TUCOROM wind-tunnel (Parezanović et al. [...]

6. Acknowledgements

We acknowledge funding of the French Science Foundation ANR (Chaire d'Excellence TUCOROM, SepaCoDe). MS and MA acknowledge the support of the LINC project (no. 289447) funded by EC's Marie-Curie ITN program (FP7-PEOPLE-2011-ITN). We thank Steven Brunton, Eurika Kaiser and Nathan Kutz for fruitful discussions and comments.

7. Supplementary material

A document displaying the control laws derived by MLC is available as supplementary material. A movie displaying the behaviour of the controlled generalized mean-field model is available as Movie 1. A movie displaying the behaviour of the controlled Lorenz system is available as Movie 2.

REFERENCES

Bagheri, S., Brandt, L. & Henningson, D. S. J. Fluid Mech., 263–298.

Choi, H., Moin, P. & Kim, J. J. Fluid Mech., 75–110.

Fleming, P. J. & Purshouse, R. C. Control Engineering Practice (11), 1223–1241.

de la Fraga, L. G. & Tlelo-Cuautle, E. Nonlinear Dynamics, pp. 1–13.

Glezer, A., Amitay, M. & Honohan, A. M. AIAA Journal (7), 1501–1511.

Hervé, A., Sipp, D., Schmid, P. J. & Samuelides, M. J. Fluid Mech., 26–58.

Högberg, M., Bewley, T. R. & Henningson, D. S. J. Fluid Mech., 149–175.

Illingworth, S. J., Morgans, A. S. & Rowley, C. W. J. Fluid Mech., 223–248.

King, R. Active Flow Control II: Papers Contributed to the Conference Active Flow Control II 2010, Berlin, Germany, May 26 to 28, 2010, vol. 108. Springer.

King, R., Becker, R., Feuerbach, G., Henning, L., Petz, R., Nitsche, W., Lemke, O. & Neise, W. pp. 1–6.

Koza, J. R. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press.

Lewis, M. A., Fagg, A. H. & Solidum, A. IEEE International Conference on Robotics and Automation, vol. 3, pp. 2618–2623.

Luchtenburg, D. M., Günter, B., Noack, B. R., King, R. & Tadmor, G. J. Fluid Mech., 283–316.

Luke, S., Panait, L., Balan, G., Paus, S., Skolicki, Z., Kicinger, R., Popovici, E., Sullivan, K., Harrison, J., Bassett, J., Hubley, R., Desai, A., Chircop, A., Compton, J., Haddon, W., Donnelly, S., Jamil, B., Zelibor, J., Kangas, E., Abidi, F., Mooers, H., O'Beirne, J., Talukder, K. A. & McDermott, J. http://cs.gmu.edu/~eclab/projects/ecj/.

Nordin, P. & Banzhaf, W. Adaptive Behavior (2), 107–140.

Noriega, J. R. & Wang, H. IEEE Trans. Neural Networks (1), 27–34.

Ott, E., Grebogi, C. & Yorke, J. A. Phys. Rev. Lett. (11), 1196.

Parezanović, V., Laurentie, J. C., Fourment, C., Cordier, L., Noack, B. R. & Shaqarin, T. Proceedings of the 8th International Symposium On Turbulent and Shear Flow Phenomena.

Pastoor, M., Henning, L., Noack, B. R., King, R. & Tadmor, G. J. Fluid Mech., 161–196.

Pyragas, K. Phys. Lett. A (6), 421–428.

Roussopoulos, K. J. Fluid Mech., 267–296.

Rowley, C. W. & Williams, D. R. Annu. Rev. Fluid Mech. (1), 251–276.

Samimy, M., Debiasi, M., Caraballo, E., Serrani, A., Yuan, X., Little, J. & Myatt, J. H. J. Fluid Mech., 315–346.

Schöll, E. & Schuster, H. G. Handbook of Chaos Control. Wiley-VCH, Weinheim.

Sipp, D. & Lebedev, A. J. Fluid Mech. (1), 333–358.

Thiria, B., Goujon-Durand, S. & Wesfreid, J. E. J. Fluid Mech., 123–147.

Wahde, M. Biologically Inspired Optimization Methods: An Introduction. WIT Press.

Wolf, A., Swift, J. B., Swinney, H. L. & Vastano, J. A. Physica (3), 285–317.

Zielinska, B. J. A., Goujon-Durand, S., Dušek, J. & Wesfreid, J. E. Phys. Rev. Lett. 79