Bilinear Optimal Control of an Advection-reaction-diffusion System
ROLAND GLOWINSKI, YONGCUN SONG, XIAOMING YUAN, AND HANGRUI YUE
To the memory of J. L. Lions (1928-2001), who suggested investigating problems like (BCP) below.
Abstract.
We consider the bilinear optimal control of an advection-reaction-diffusion system, where the control arises as the velocity field in the advection term. Such a problem is generally challenging from both theoretical analysis and algorithmic design perspectives, mainly because the state variable depends nonlinearly on the control variable and an additional divergence-free constraint on the control is coupled together with the state equation. Mathematically, the proof of the existence of optimal solutions is delicate, and up to now, only some results are known for a few special cases where additional restrictions are imposed on the space dimension and the regularity of the control. We prove the existence of optimal controls and derive the first-order optimality conditions in general settings without any extra assumption. Computationally, the well-known conjugate gradient (CG) method can be applied conceptually. However, due to the additional divergence-free constraint on the control variable and the nonlinear relation between the state and control variables, it is challenging to compute the gradient and the optimal stepsize at each CG iteration, and thus nontrivial to implement the CG method. To address these issues, we advocate a fast inner preconditioned CG method to ensure the divergence-free constraint and an efficient inexactness strategy to determine an appropriate stepsize. An easily implementable nested CG method is thus proposed for solving such a complicated problem. For the numerical discretization, we combine finite difference methods for the time discretization and finite element methods for the space discretization. Efficiency of the proposed nested CG method is promisingly validated by the results of some preliminary numerical experiments.

1. Introduction
1.1. Background and Motivation.
The optimal control of distributed parameter systems has important applications in various scientific areas, such as physics, chemistry, engineering, medicine, and finance. We refer to, e.g., [12, 13, 14, 22, 26, 28] for a few references. In a typical mathematical model of a controlled distributed parameter system, either boundary or internal locally distributed controls are usually used; these controls have localized support and are called additive controls because they arise in the model equations as additive terms. Optimal control problems with additive controls have received significant attention in past decades following the pioneering work of J. L. Lions [22], and many mathematical and computational tools have been developed; see, e.g., [12, 13, 14, 23, 27, 29]. However, it is worth noting that additive controls describe the effect of external added sources or forces and they do not change the principal intrinsic properties of the controlled system. Hence, they are not suitable for processes whose principal intrinsic properties should be changed by some control actions. For instance, if we aim at changing the reaction rate in
Date: January 7, 2021.
2020 Mathematics Subject Classification.
Key words and phrases.
Bilinear optimal control, advection-reaction-diffusion system, conjugate gradient method, nested iteration, finite element method, finite difference method.
The work of R.G. is supported by the US Department of Energy (ORNL) and the Hong Kong based Kennedy Wong Foundation. The work of X.Y. is supported by the General Research Fund 12302318 from the Hong Kong Research Grants Council.
some chain reaction-type processes from biomedical, nuclear, and chemical applications, additive controls amount to controlling the chain reaction by adding into or withdrawing out of a certain amount of the reactants, which is not realistic. To address this issue, a natural idea is to use certain catalysts or smart materials to control the systems, which can be mathematically modeled by optimal control problems with bilinear controls. We refer to [19] for more detailed discussions.

Bilinear controls, also known as multiplicative controls, enter the model as coefficients of the corresponding partial differential equations (PDEs). These bilinear controls can change some main physical characteristics of the system under investigation, such as a natural frequency response of a beam or the rate of a chemical reaction. In the literature, bilinear control of distributed parameter systems has become an increasingly popular topic, and bilinear optimal control problems constrained by various PDEs, such as elliptic equations [20], convection-diffusion equations [3], parabolic equations [18], the Schrödinger equation [16], and the Fokker-Planck equation [7], have been widely studied both mathematically and computationally.

In particular, bilinear controls play a crucial role in optimal control problems modeled by advection-reaction-diffusion systems. On one hand, the control can be the coefficient of the diffusion or the reaction term.
For instance, a system controlled by so-called catalysts that can accelerate or slow down various chemical or biological reactions can be modeled by a bilinear optimal control problem for an advection-reaction-diffusion equation where the control arises as the coefficient of the reaction term [18]; bilinear optimal control problems of this kind have been studied in, e.g., [3, 4, 18, 19]. On the other hand, the system can also be controlled by the velocity field in the advection term, which captures important applications in, e.g., bioremediation [15], environmental remediation processes [21], and mixing enhancement of different fluids [24]. We note that very limited research has been done on bilinear optimal control problems controlled by the velocity field; only some special one-dimensional cases have been studied in [15, 17, 21] for the existence of an optimal control and the derivation of first-order optimality conditions. To the best of our knowledge, no work has been done yet to develop efficient numerical methods for solving multi-dimensional bilinear optimal control problems controlled by the velocity field in the advection term. All these facts motivate us to study bilinear optimal control problems constrained by an advection-reaction-diffusion equation, where the control enters the model as the velocity field in the advection term. Actually, investigating this kind of problem was suggested to one of us (R. Glowinski), in the late 1990's, by J. L. Lions (1928-2001).
1.2. Model.
Let $\Omega$ be a bounded domain of $\mathbb{R}^d$ with $d \geq 1$, and let $\Gamma$ be the boundary of $\Omega$. We consider the following bilinear optimal control problem:
$$u \in \mathcal{U}, \quad J(u) \leq J(v), \ \forall v \in \mathcal{U}, \tag{BCP}$$
with the objective functional $J$ defined by
$$J(v) = \frac{1}{2}\iint_Q |v|^2\,dxdt + \frac{\alpha_1}{2}\iint_Q |y - y_d|^2\,dxdt + \frac{\alpha_2}{2}\int_\Omega |y(T) - y_T|^2\,dx, \tag{1.1}$$
and $y = y(t; v)$ the solution of the following advection-reaction-diffusion equation
$$\begin{cases} \dfrac{\partial y}{\partial t} - \nu \nabla^2 y + v \cdot \nabla y + a_0 y = f & \text{in } Q,\\ y = g & \text{on } \Sigma,\\ y(0) = \phi. \end{cases} \tag{1.2}$$
Above and below, $Q = \Omega \times (0,T)$ and $\Sigma = \Gamma \times (0,T)$ with $0 < T < +\infty$; $\alpha_1 \geq 0$, $\alpha_2 \geq 0$, $\alpha_1 + \alpha_2 > 0$; $y_d$ and $y_T$ are given in $L^2(Q)$ and $L^2(\Omega)$, respectively; the diffusion coefficient $\nu > 0$ and the reaction coefficient $a_0$ are assumed to be constants; the functions $f \in L^2(Q)$, $g \in L^2(0,T; H^{3/2}(\Gamma))$ and $\phi \in L^2(\Omega)$. The set $\mathcal{U}$ of the admissible controls is defined by
$$\mathcal{U} := \{ v \mid v \in [L^2(Q)]^d, \ \nabla \cdot v = 0 \}.$$
Clearly, the control variable $v$ arises in (BCP) as a flow velocity field in the advection term of (1.2), and the divergence-free constraint $\nabla \cdot v = 0$ implies that the flow is incompressible. One can control the system by changing the flow velocity $v$ in order that $y$ and $y(T)$ are good approximations to $y_d$ and $y_T$, respectively.

1.3. Difficulties and Goals.
In this paper, we intend to study the bilinear optimal control problem (BCP) in the general case of $d \geq 1$. Theoretically, we shall prove the existence of an optimal control $u$ and derive its first-order optimality condition. Then, computationally, we propose an efficient and relatively easy to implement numerical method to solve (BCP). For this purpose, we advocate combining a conjugate gradient (CG) method with a finite difference method (for the time discretization) and a finite element method (for the space discretization) for the numerical solution of (BCP). Although these numerical approaches have been well developed in the literature, it is nontrivial to implement them to solve (BCP), as discussed below, due to the complicated problem settings.

1.3.1. Difficulties in Algorithmic Design.
Conceptually, a CG method for solving (BCP) can be easily derived following [14]. However, CG algorithms are challenging to implement numerically for the following reasons: 1) the state $y$ depends nonlinearly on the control $v$, despite the fact that the state equation (1.2) is linear; 2) the additional divergence-free constraint on the control $v$, i.e., $\nabla \cdot v = 0$, is coupled together with the state equation (1.2).

To be more precise, the fact that the state $y$ is a nonlinear function of the control $v$ makes the optimality system a nonlinear problem. Hence, seeking a suitable stepsize in each CG iteration requires solving an optimization problem, and the stepsize cannot be computed as easily as in the linear case [14]. Note that commonly used line search strategies are too expensive to employ in our setting because they require evaluating the objective functional value $J(v)$ repeatedly, and every evaluation of $J(v)$ entails solving the state equation (1.2). The same concern about the computational cost also applies when the Newton method is employed to solve the corresponding optimization problem for finding a stepsize. To tackle this issue, we propose an efficient inexact stepsize strategy which requires solving only one additional linear parabolic problem and is cheap to implement, as shown in Section 3.

Furthermore, due to the divergence-free constraint $\nabla \cdot v = 0$, an extra projection onto the admissible set $\mathcal{U}$ is required when computing the first-order differential of $J$ at each CG iteration, in order that all iterates of the CG method are feasible. Generally, this projection subproblem has no closed-form solution and has to be solved iteratively. Here, we introduce a Lagrange multiplier associated with the constraint $\nabla \cdot v = 0$; then the computation of the first-order differential $DJ(v)$ of $J$ at $v$ is equivalent to solving a Stokes type problem.
Inspired by [9], we advocate employing a preconditioned CG method, which operates on the space of the Lagrange multiplier, to solve the resulting Stokes type problem. With an appropriately chosen preconditioner, fast convergence of the resulting preconditioned CG method can be expected in practice (and indeed, has been observed).

1.3.2. Difficulties in Numerical Discretization.
For the numerical discretization of (BCP), we note that if an implicit finite difference scheme is used for the time discretization of the state equation (1.2), a stationary advection-reaction-diffusion equation has to be solved at each time step. To solve this stationary advection-reaction-diffusion equation, it is well known that standard finite element techniques may lead to strongly oscillatory solutions unless the mesh-size is sufficiently
small with respect to the ratio between $\nu$ and $\|v\|$. In the context of optimal control problems, different stabilized finite element methods have been proposed and analyzed to overcome such difficulties; see, e.g., [1, 6]. Different from the above references, we implement the time discretization by a semi-implicit finite difference method for simplicity; namely, we treat the advection and reaction terms explicitly and the diffusion term implicitly. Consequently, only a simple linear elliptic equation has to be solved at each time step. We then implement the space discretization of the resulting elliptic equation at each time step by a standard piecewise linear finite element method, and the resulting linear system is very easy to solve.

Moreover, we recall that the divergence-free constraint $\nabla \cdot v = 0$ leads to a projection subproblem, which is equivalent to a Stokes type problem, at each iteration of the CG algorithm. As discussed in [8], to discretize a Stokes type problem, direct applications of standard finite element methods always lead to an ill-posed discrete problem. To overcome this difficulty, one can use different types of element approximations for pressure and velocity. Inspired by [8, 9], we employ the Bercovier-Pironneau finite element pair [2] (also known as the $P_1$-$P_1$ iso $P_2$ finite element) to approximate the control $v$ and the Lagrange multiplier associated with the divergence-free constraint. More concretely, we approximate the Lagrange multiplier by a piecewise linear finite element space which is twice coarser than the one for the control $v$. In this way, the discrete problem is well-posed and can be solved by a preconditioned CG method. As a byproduct of the above discretization, the total number of degrees of freedom of the discrete Lagrange multiplier is only about $\frac{1}{2^d d}$ of the number of degrees of freedom of the discrete control.
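As an illustration of the semi-implicit scheme just described, the following minimal sketch performs one time step with explicit advection and reaction and an implicit diffusion solve. It is only a one-dimensional finite difference caricature of the actual finite element setting of the paper, assuming a uniform grid, homogeneous Dirichlet data, and central differences for the advection term; all function and variable names are our own illustrative choices.

```python
import numpy as np

def semi_implicit_step(y, v, nu, a0, f, dx, dt):
    """One semi-implicit time step for y_t - nu*y_xx + v*y_x + a0*y = f (1D).

    Advection and reaction are treated explicitly, diffusion implicitly,
    so each step only requires one linear (tridiagonal) solve.
    y, v, f : values at the interior nodes; boundary values are taken as 0.
    """
    n = y.size
    # explicit advection term v * y_x (central differences, zero boundary data)
    dy = np.zeros(n)
    dy[1:-1] = (y[2:] - y[:-2]) / (2.0 * dx)
    dy[0] = y[1] / (2.0 * dx)
    dy[-1] = -y[-2] / (2.0 * dx)
    rhs = y + dt * (f - v * dy - a0 * y)
    # implicit diffusion: (I - dt*nu*D2) y_new = rhs, with D2 the 1D Laplacian
    off = -dt * nu / dx**2
    A = (np.eye(n) * (1.0 - 2.0 * off)
         + np.diag(np.full(n - 1, off), 1)
         + np.diag(np.full(n - 1, off), -1))
    return np.linalg.solve(A, rhs)
```

Marching (1.2) forward then amounts to calling such a routine once per time step; the adjoint equation (2.11) can be treated in the same way after the time reversal $t \to T - t$.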
Hence, the inner preconditioned CG method is implemented in a lower-dimensional space than that of the state equation (1.2), implying a reduction of the computational cost. With the above mentioned discretization schemes, we can relatively easily obtain the fully discrete version of (BCP) and derive the discrete analogue of our proposed nested CG method.

1.4. Organization.
An outline of this paper is as follows. In Section 2, we prove the existence of optimal controls for (BCP) and derive the associated first-order optimality conditions. An easily implementable nested CG method is proposed in Section 3 for solving (BCP) numerically. In Section 4, we discuss the numerical discretization of (BCP) by finite difference and finite element methods. Some preliminary numerical results are reported in Section 5 to validate the efficiency of our proposed numerical approach. Finally, some conclusions are drawn in Section 6.
2. Existence of optimal controls and first-order optimality conditions
In this section, we first present some notation and known results from the literature that will be used in the later analysis. Then, we prove the existence of optimal controls for (BCP) and derive the associated first-order optimality conditions. Without loss of generality, we assume that $f = 0$ and $g = 0$ in (1.2) for convenience.

2.1. Preliminaries.
Throughout, we denote by $L^s(\Omega)$ and $H^s(\Omega)$ the usual Sobolev spaces for any $s > 0$. The space $H_0^s(\Omega)$ denotes the completion of $C_0^\infty(\Omega)$ in $H^s(\Omega)$, where $C_0^\infty(\Omega)$ denotes the space of all infinitely differentiable functions over $\Omega$ with compact support in $\Omega$. In addition, we shall also use the following vector-valued function spaces:
$$\boldsymbol{L}^2(\Omega) := [L^2(\Omega)]^d, \qquad \boldsymbol{L}^2_{\mathrm{div}}(\Omega) := \{ v \in \boldsymbol{L}^2(\Omega),\ \nabla \cdot v = 0 \text{ in } \Omega \}.$$
Let $X$ be a Banach space with a norm $\|\cdot\|_X$; then the space $L^2(0,T;X)$ consists of all measurable functions $z : (0,T) \to X$ satisfying
$$\|z\|_{L^2(0,T;X)} := \left( \int_0^T \|z(t)\|_X^2\, dt \right)^{1/2} < +\infty.$$
With the above notation, it is clear that the admissible set $\mathcal{U}$ can be written as $\mathcal{U} := L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$. Moreover, the space $W(0,T)$ consists of all functions $z \in L^2(0,T; H_0^1(\Omega))$ such that $\frac{\partial z}{\partial t} \in L^2(0,T; H^{-1}(\Omega))$ exists in a weak sense, i.e.,
$$W(0,T) := \left\{ z \,\middle|\, z \in L^2(0,T; H_0^1(\Omega)),\ \frac{\partial z}{\partial t} \in L^2(0,T; H^{-1}(\Omega)) \right\},$$
where $H^{-1}(\Omega)\,(= H_0^1(\Omega)')$ is the dual space of $H_0^1(\Omega)$.

Next, we summarize some known results for the advection-reaction-diffusion equation (1.2) for the convenience of further analysis. The variational formulation of the state equation (1.2) reads: find $y \in W(0,T)$ such that $y(0) = \phi$ and, $\forall z \in L^2(0,T; H_0^1(\Omega))$,
$$\int_0^T \left\langle \frac{\partial y}{\partial t}, z \right\rangle_{H^{-1}(\Omega), H_0^1(\Omega)} dt + \nu \iint_Q \nabla y \cdot \nabla z\, dxdt + \iint_Q v \cdot \nabla y\, z\, dxdt + a_0 \iint_Q y z\, dxdt = 0, \tag{2.1}$$
where $\langle \cdot, \cdot \rangle_{H^{-1}(\Omega), H_0^1(\Omega)}$ denotes the duality pairing between $H^{-1}(\Omega)$ and $H_0^1(\Omega)$. The existence and uniqueness of the solution $y \in W(0,T)$ to problem (2.1) can be proved by standard arguments relying on the Lax-Milgram theorem; we refer to [22] for the details. Moreover, we can define the control-to-state operator $S : \mathcal{U} \to W(0,T)$, which maps $v$ to $y = S(v)$. Then, the objective functional $J$ in (BCP) can be reformulated as
$$J(v) = \frac{1}{2} \iint_Q |v|^2\, dxdt + \frac{\alpha_1}{2} \iint_Q |S(v) - y_d|^2\, dxdt + \frac{\alpha_2}{2} \int_\Omega |S(v)(T) - y_T|^2\, dx,$$
and the nonlinearity of the solution operator $S$ implies that (BCP) is nonconvex. For the solution $y \in W(0,T)$, we have the following estimate.

Lemma 2.1.
Let $v \in L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$; then the solution $y \in W(0,T)$ of the state equation (1.2) satisfies the following estimate:
$$\|y(t)\|^2_{L^2(\Omega)} + 2\nu \int_0^t \|\nabla y(s)\|^2_{L^2(\Omega)}\, ds + 2 a_0 \int_0^t \|y(s)\|^2_{L^2(\Omega)}\, ds = \|\phi\|^2_{L^2(\Omega)}. \tag{2.2}$$

Proof.
We first multiply the state equation (1.2) by $y(t)$; applying Green's formula in space then yields
$$\frac{1}{2} \frac{d}{dt} \|y(t)\|^2_{L^2(\Omega)} = -\nu \|\nabla y(t)\|^2_{L^2(\Omega)} - a_0 \|y(t)\|^2_{L^2(\Omega)}. \tag{2.3}$$
The desired result (2.2) can be directly obtained by integrating (2.3) over $[0,t]$. $\square$

The above estimate implies that
$$y \text{ is bounded in } L^2(0,T; H_0^1(\Omega)). \tag{2.4}$$
On the other hand, $\frac{\partial y}{\partial t} = \nu \nabla^2 y - v \cdot \nabla y - a_0 y$, and the right-hand side is bounded in $L^2(0,T; H^{-1}(\Omega))$. Hence,
$$\frac{\partial y}{\partial t} \text{ is bounded in } L^2(0,T; H^{-1}(\Omega)). \tag{2.5}$$
Furthermore, since $\nabla \cdot v = 0$, it is clear that
$$\iint_Q v \cdot \nabla y\, z\, dxdt = \iint_Q \nabla y \cdot (v z)\, dxdt = -\iint_Q y\, \nabla \cdot (v z)\, dxdt = -\iint_Q y\, (v \cdot \nabla z)\, dxdt, \quad \forall z \in L^2(0,T; H_0^1(\Omega)).$$
Hence, the variational formulation (2.1) can be equivalently written as: find $y \in W(0,T)$ such that $y(0) = \phi$ and, $\forall z \in L^2(0,T; H_0^1(\Omega))$,
$$\int_0^T \left\langle \frac{\partial y}{\partial t}, z \right\rangle_{H^{-1}(\Omega), H_0^1(\Omega)} dt + \nu \iint_Q \nabla y \cdot \nabla z\, dxdt - \iint_Q (v \cdot \nabla z)\, y\, dxdt + a_0 \iint_Q y z\, dxdt = 0.$$

2.2. Existence of Optimal Controls.
With the above preparations, we prove in this subsection the existence of optimal controls for (BCP). For this purpose, we first show that the objective functional $J$ is weakly lower semi-continuous.

Lemma 2.2.
The objective functional $J$ given by (1.1) is weakly lower semi-continuous. That is, if a sequence $\{v_n\}$ converges weakly to $\bar{v}$ in $L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$, then
$$J(\bar{v}) \leq \liminf_{n \to \infty} J(v_n).$$

Proof.
Let $\{v_n\}$ be a sequence that converges weakly to $\bar{v}$ in $L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$, and let $y_n := y(x,t; v_n)$ be the solution of the following variational problem: find $y_n \in W(0,T)$ such that $y_n(0) = \phi$ and, $\forall z \in L^2(0,T; H_0^1(\Omega))$,
$$\int_0^T \left\langle \frac{\partial y_n}{\partial t}, z \right\rangle_{H^{-1}(\Omega), H_0^1(\Omega)} dt + \nu \iint_Q \nabla y_n \cdot \nabla z\, dxdt - \iint_Q (v_n \cdot \nabla z)\, y_n\, dxdt + a_0 \iint_Q y_n z\, dxdt = 0. \tag{2.6}$$
Moreover, it follows from (2.4) and (2.5) that there exists a subsequence of $\{y_n\}$, still denoted by $\{y_n\}$ for convenience, such that
$$y_n \to \bar{y} \text{ weakly in } L^2(0,T; H_0^1(\Omega)), \quad \text{and} \quad \frac{\partial y_n}{\partial t} \to \frac{\partial \bar{y}}{\partial t} \text{ weakly in } L^2(0,T; H^{-1}(\Omega)).$$
Since $\Omega$ is bounded, it follows directly from the compactness property (also known as Rellich's theorem) that
$$y_n \to \bar{y} \text{ strongly in } L^2(0,T; L^2(\Omega)).$$
Taking $v_n \to \bar{v}$ weakly in $L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$ into account, we can pass to the limit in (2.6) and derive that $\bar{y}(0) = \phi$ and, $\forall z \in L^2(0,T; H_0^1(\Omega))$,
$$\int_0^T \left\langle \frac{\partial \bar{y}}{\partial t}, z \right\rangle_{H^{-1}(\Omega), H_0^1(\Omega)} dt + \nu \iint_Q \nabla \bar{y} \cdot \nabla z\, dxdt - \iint_Q (\bar{v} \cdot \nabla z)\, \bar{y}\, dxdt + a_0 \iint_Q \bar{y} z\, dxdt = 0,$$
which implies that $\bar{y}$ is the solution of the state equation (1.2) associated with $\bar{v}$.

Since any norm of a Banach space is weakly lower semi-continuous, we have that
$$\liminf_{n \to \infty} J(v_n) = \liminf_{n \to \infty} \left( \frac{1}{2} \iint_Q |v_n|^2\, dxdt + \frac{\alpha_1}{2} \iint_Q |y_n - y_d|^2\, dxdt + \frac{\alpha_2}{2} \int_\Omega |y_n(T) - y_T|^2\, dx \right)$$
$$\geq \frac{1}{2} \iint_Q |\bar{v}|^2\, dxdt + \frac{\alpha_1}{2} \iint_Q |\bar{y} - y_d|^2\, dxdt + \frac{\alpha_2}{2} \int_\Omega |\bar{y}(T) - y_T|^2\, dx = J(\bar{v}).$$
We thus obtain that the objective functional $J$ is weakly lower semi-continuous, and the proof is complete. $\square$

Now, we are in a position to prove the existence of an optimal solution $u$ to (BCP).
Theorem 2.3.
There exists at least one optimal control $u \in \mathcal{U} = L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$ such that $J(u) \leq J(v)$, $\forall v \in \mathcal{U}$.

Proof. We first observe that $J(v) \geq 0$, $\forall v \in \mathcal{U}$; hence the infimum of $J(v)$ exists, and we denote it by
$$j = \inf_{v \in \mathcal{U}} J(v),$$
and there is a minimizing sequence $\{v_n\} \subset \mathcal{U}$ such that
$$\lim_{n \to \infty} J(v_n) = j.$$
This fact, together with $\frac{1}{2}\|v_n\|^2_{L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))} \leq J(v_n)$, implies that $\{v_n\}$ is bounded in $L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$. Hence, there exists a subsequence, still denoted by $\{v_n\}$, that converges weakly to $u$ in $L^2(0,T; \boldsymbol{L}^2_{\mathrm{div}}(\Omega))$. It follows from Lemma 2.2 that $J$ is weakly lower semi-continuous, and we thus have
$$J(u) \leq \liminf_{n \to \infty} J(v_n) = j.$$
Since $u \in \mathcal{U}$, we must have $J(u) = j$, and $u$ is therefore an optimal control. $\square$

We note that the uniqueness of the optimal control $u$ cannot be guaranteed, and only a local optimal solution can be pursued, because the objective functional $J$ is nonconvex due to the nonlinear relationship between the state $y$ and the control $v$.

2.3. First-order Optimality Conditions.
Let $DJ(v)$ be the first-order differential of $J$ at $v$ and $u$ an optimal control of (BCP). It is clear that the first-order optimality condition of (BCP) reads
$$DJ(u) = 0.$$
In the remainder of this subsection, we discuss the computation of $DJ(v)$, which will play an important role in subsequent sections.

To compute $DJ(v)$, we employ a formal perturbation analysis as in [14]. First, let $\delta v \in \mathcal{U}$ be a perturbation of $v \in \mathcal{U}$; we clearly have
$$\delta J(v) = \iint_Q DJ(v) \cdot \delta v\, dxdt, \tag{2.7}$$
and also
$$\delta J(v) = \iint_Q v \cdot \delta v\, dxdt + \alpha_1 \iint_Q (y - y_d)\, \delta y\, dxdt + \alpha_2 \int_\Omega (y(T) - y_T)\, \delta y(T)\, dx, \tag{2.8}$$
in which $\delta y$ is the solution of
$$\begin{cases} \dfrac{\partial \delta y}{\partial t} - \nu \nabla^2 \delta y + \delta v \cdot \nabla y + v \cdot \nabla \delta y + a_0\, \delta y = 0 & \text{in } Q,\\ \delta y = 0 & \text{on } \Sigma,\\ \delta y(0) = 0. \end{cases} \tag{2.9}$$
Consider now a function $p$ defined over $\bar{Q}$ (the closure of $Q$), and assume that $p$ is a differentiable function of $x$ and $t$. Multiplying both sides of the first equation in (2.9) by $p$ and integrating over $Q$, we obtain
$$\iint_Q p\, \frac{\partial \delta y}{\partial t}\, dxdt - \nu \iint_Q p\, \nabla^2 \delta y\, dxdt + \iint_Q \delta v \cdot \nabla y\, p\, dxdt + \iint_Q v \cdot \nabla \delta y\, p\, dxdt + a_0 \iint_Q p\, \delta y\, dxdt = 0.$$
Integration by parts in time and application of Green's formula in space yield
$$\int_\Omega p(T)\, \delta y(T)\, dx - \int_\Omega p(0)\, \delta y(0)\, dx + \iint_Q \left[ -\frac{\partial p}{\partial t} - \nu \nabla^2 p - v \cdot \nabla p + a_0 p \right] \delta y\, dxdt$$
$$+ \iint_Q \delta v \cdot \nabla y\, p\, dxdt - \nu \iint_\Sigma \left( \frac{\partial \delta y}{\partial n} p - \frac{\partial p}{\partial n} \delta y \right) dxdt + \iint_\Sigma p\, \delta y\, v \cdot n\, dxdt = 0, \tag{2.10}$$
where $n$ is the unit outward normal vector at $\Gamma$.

Next, let us assume that the function $p$ is the solution to the following adjoint system
$$\begin{cases} -\dfrac{\partial p}{\partial t} - \nu \nabla^2 p - v \cdot \nabla p + a_0 p = \alpha_1 (y - y_d) & \text{in } Q,\\ p = 0 & \text{on } \Sigma,\\ p(T) = \alpha_2 (y(T) - y_T). \end{cases} \tag{2.11}$$
It follows from (2.8), (2.9), (2.10) and (2.11) that
$$\delta J(v) = \iint_Q (v - p \nabla y) \cdot \delta v\, dxdt,$$
which, together with (2.7), implies that
$$DJ(v) \in \mathcal{U}, \quad \iint_Q DJ(v) \cdot z\, dxdt = \iint_Q (v - p \nabla y) \cdot z\, dxdt, \quad \forall z \in \mathcal{U}. \tag{2.12}$$
From the discussion above, the first-order optimality condition of (BCP) can be summarized as follows.
Theorem 2.4.
Let $u \in \mathcal{U}$ be a solution of (BCP). Then, it satisfies the following optimality condition:
$$\iint_Q (u - p \nabla y) \cdot z\, dxdt = 0, \quad \forall z \in \mathcal{U},$$
where $y$ and $p$ are obtained from $u$ via the solutions of the following two parabolic equations:
$$\begin{cases} \dfrac{\partial y}{\partial t} - \nu \nabla^2 y + u \cdot \nabla y + a_0 y = f & \text{in } Q,\\ y = g & \text{on } \Sigma,\\ y(0) = \phi, \end{cases} \quad \text{(state equation)}$$
and
$$\begin{cases} -\dfrac{\partial p}{\partial t} - \nu \nabla^2 p - u \cdot \nabla p + a_0 p = \alpha_1 (y - y_d) & \text{in } Q,\\ p = 0 & \text{on } \Sigma,\\ p(T) = \alpha_2 (y(T) - y_T). \end{cases} \quad \text{(adjoint equation)}$$

3. An Implementable Nested Conjugate Gradient Method
In this section, we discuss the application of a CG strategy to solve (BCP). In particular, we elaborate on the computation of the gradient and the stepsize at each CG iteration, and thus obtain an easily implementable algorithm.
3.1. A Generic Conjugate Gradient Method for (BCP).
Conceptually, applying the CG method to (BCP), we readily obtain the following algorithm:

(a) Given $u^0 \in \mathcal{U}$.
(b) Compute $g^0 = DJ(u^0)$. If $DJ(u^0) = 0$, then $u = u^0$; otherwise set $w^0 = g^0$.

For $k \geq 0$, $u^k$, $g^k$ and $w^k$ being known, the last two different from $0$, one computes $u^{k+1}$, $g^{k+1}$ and $w^{k+1}$ as follows:

(c) Compute the stepsize $\rho_k$ by solving the following optimization problem:
$$\rho_k \in \mathbb{R}, \quad J(u^k - \rho_k w^k) \leq J(u^k - \rho w^k), \ \forall \rho \in \mathbb{R}. \tag{3.1}$$
(d) Update $u^{k+1}$ and $g^{k+1}$, respectively, by
$$u^{k+1} = u^k - \rho_k w^k, \quad \text{and} \quad g^{k+1} = DJ(u^{k+1}).$$
If $DJ(u^{k+1}) = 0$, take $u = u^{k+1}$; otherwise,
(e) Compute
$$\beta_k = \frac{\iint_Q |g^{k+1}|^2\, dxdt}{\iint_Q |g^k|^2\, dxdt},$$
and then update $w^{k+1} = g^{k+1} + \beta_k w^k$.
Do $k+1 \to k$ and return to (c).

The above iterative method looks very simple, but practically, the implementation of the CG method (a)-(e) for the solution of (BCP) is nontrivial. In particular, it is numerically challenging to compute $DJ(v)$, $\forall v \in \mathcal{U}$, and $\rho_k$, as illustrated below. We shall discuss how to address these two issues in the rest of this section.

3.2. Computation of $DJ(v)$.
It is clear that the implementation of the generic CG method (a)-(e) for the solution of (BCP) requires the knowledge of $DJ(v)$ for various $v \in \mathcal{U}$, and this has been conceptually provided in (2.12). However, it is numerically challenging to compute $DJ(v)$ by (2.12) due to the restriction $\nabla \cdot DJ(v) = 0$, which ensures that all iterates $u^k$ of the CG method meet the additional divergence-free constraint $\nabla \cdot u^k = 0$. In this subsection, we show that equation (2.12) can be reformulated as a saddle point problem by introducing a Lagrange multiplier associated with the constraint $\nabla \cdot DJ(v) = 0$. Then, a preconditioned CG method is proposed to solve this saddle point problem.

We first note that equation (2.12) can be equivalently reformulated as
$$DJ(v)(t) \in \mathcal{S}, \quad \int_\Omega DJ(v)(t) \cdot z\, dx = \int_\Omega (v(t) - p(t) \nabla y(t)) \cdot z\, dx, \quad \forall z \in \mathcal{S}, \tag{3.2}$$
where $\mathcal{S} = \{ z \mid z \in [L^2(\Omega)]^d,\ \nabla \cdot z = 0 \}$.
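In finite dimensions, the loop (a)-(e) is the classical Fletcher-Reeves iteration. The following sketch shows its structure only: `grad` stands in for $DJ$ and `stepsize` for the line search (3.1), while the PDE-constrained machinery of this paper (state/adjoint solves, projection onto $\mathcal{U}$) is abstracted away; the quadratic objective at the end is a toy example of ours, not the functional (1.1).

```python
import numpy as np

def cg_minimize(grad, stepsize, u0, tol=1e-10, max_iter=200):
    """Generic CG loop mirroring steps (a)-(e):
    (a)-(b) initialize, (c) stepsize, (d) update, (e) Fletcher-Reeves beta."""
    u = u0.copy()
    g = grad(u)                       # (b) g^0 = DJ(u^0)
    w = g.copy()                      # w^0 = g^0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:  # DJ(u^k) ~ 0: stop
            break
        rho = stepsize(u, w)          # (c)
        u = u - rho * w               # (d)
        g_new = grad(u)
        beta = np.dot(g_new, g_new) / np.dot(g, g)  # (e)
        w = g_new + beta * w
        g = g_new
    return u

# toy quadratic J(u) = 0.5 u.A u - b.u, for which the exact line search
# along w is rho = (g.w)/(w.A w); CG then converges in <= dim(u) steps
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
u_star = cg_minimize(lambda u: A @ u - b,
                     lambda u, w: np.dot(A @ u - b, w) / np.dot(w, A @ w),
                     np.zeros(2))
```

For (BCP), each call to `grad` corresponds to one state solve, one adjoint solve, and one projection, which is exactly why the two ingredients discussed in this section need care.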
Clearly, problem (3.2) is a particular case of
$$g \in \mathcal{S}, \quad \int_\Omega g \cdot z\, dx = \int_\Omega f \cdot z\, dx, \quad \forall z \in \mathcal{S}, \tag{3.3}$$
with $f$ given in $[L^2(\Omega)]^d$. Introducing a Lagrange multiplier $\lambda \in H_0^1(\Omega)$ associated with the constraint $\nabla \cdot z = 0$, it is clear that problem (3.3) is equivalent to the following saddle point problem:
$$(g, \lambda) \in [L^2(\Omega)]^d \times H_0^1(\Omega),$$
$$\int_\Omega g \cdot z\, dx = \int_\Omega f \cdot z\, dx + \int_\Omega \lambda\, \nabla \cdot z\, dx, \quad \forall z \in [L^2(\Omega)]^d, \tag{3.4}$$
$$\int_\Omega \nabla \cdot g\, q\, dx = 0, \quad \forall q \in H_0^1(\Omega),$$
which is actually a Stokes type problem.

In order to solve problem (3.4), we advocate a CG method inspired by [9, 10]. For this purpose, one has to specify the inner product to be used over $H_0^1(\Omega)$. As discussed in [9], the usual $L^2$-inner product, namely $\{q, q'\} \to \int_\Omega q q'\, dx$, leads to a CG method with poor convergence properties. Indeed, using arguments similar to those in [8, 9], we can show that the saddle point problem (3.4) can be reformulated as a linear variational problem in terms of the Lagrange multiplier $\lambda$. The corresponding coefficient matrix after space discretization with mesh size $h$ has a condition number of order $h^{-2}$; the problem is thus ill-conditioned, especially for small $h$, and the CG method converges fairly slowly. Hence, preconditioning is necessary for solving problem (3.4). As suggested in [9], we choose $-\nabla \cdot \nabla$ as a preconditioner for problem (3.4), and the corresponding preconditioned CG method operates in the space $H_0^1(\Omega)$ equipped with the inner product $\{q, q'\} \to \int_\Omega \nabla q \cdot \nabla q'\, dx$ and the associated norm $\|q\|_{H_0^1(\Omega)} = (\int_\Omega |\nabla q|^2\, dx)^{1/2}$, $\forall q, q' \in H_0^1(\Omega)$. The resulting algorithm reads as follows:

G1. Choose $\lambda^0 \in H_0^1(\Omega)$.
G2. Solve
$$g^0 \in [L^2(\Omega)]^d, \quad \int_\Omega g^0 \cdot z\, dx = \int_\Omega f \cdot z\, dx + \int_\Omega \lambda^0\, \nabla \cdot z\, dx, \quad \forall z \in [L^2(\Omega)]^d,$$
and
$$r^0 \in H_0^1(\Omega), \quad \int_\Omega \nabla r^0 \cdot \nabla q\, dx = \int_\Omega \nabla \cdot g^0\, q\, dx, \quad \forall q \in H_0^1(\Omega).$$
If $\dfrac{\int_\Omega |\nabla r^0|^2\, dx}{\max\{1, \int_\Omega |\nabla \lambda^0|^2\, dx\}} \leq tol$, take $\lambda = \lambda^0$ and $g = g^0$; otherwise set $w^0 = r^0$.
For $k \geq 0$, $\lambda^k$, $g^k$, $r^k$ and $w^k$ being known, with the last two different from 0, we compute $\lambda^{k+1}$, $g^{k+1}$, $r^{k+1}$ and, if necessary, $w^{k+1}$, as follows:

G3. Solve
$$\bar{g}^k \in [L^2(\Omega)]^d, \quad \int_\Omega \bar{g}^k \cdot z\, dx = \int_\Omega w^k\, \nabla \cdot z\, dx, \quad \forall z \in [L^2(\Omega)]^d,$$
and
$$\bar{r}^k \in H_0^1(\Omega), \quad \int_\Omega \nabla \bar{r}^k \cdot \nabla q\, dx = \int_\Omega \nabla \cdot \bar{g}^k\, q\, dx, \quad \forall q \in H_0^1(\Omega),$$
and compute the stepsize via
$$\eta_k = \frac{\int_\Omega |\nabla r^k|^2\, dx}{\int_\Omega \nabla \bar{r}^k \cdot \nabla w^k\, dx}.$$
G4. Update $\lambda^k$, $g^k$ and $r^k$ via
$$\lambda^{k+1} = \lambda^k - \eta_k w^k, \quad g^{k+1} = g^k - \eta_k \bar{g}^k, \quad \text{and} \quad r^{k+1} = r^k - \eta_k \bar{r}^k.$$
If $\dfrac{\int_\Omega |\nabla r^{k+1}|^2\, dx}{\max\{1, \int_\Omega |\nabla r^0|^2\, dx\}} \leq tol$, take $\lambda = \lambda^{k+1}$ and $g = g^{k+1}$; otherwise,
G5. Compute
$$\gamma_k = \frac{\int_\Omega |\nabla r^{k+1}|^2\, dx}{\int_\Omega |\nabla r^k|^2\, dx},$$
and update $w^k$ via $w^{k+1} = r^{k+1} + \gamma_k w^k$.
Do $k+1 \to k$ and return to G3.

Clearly, one only needs to solve two simple linear equations at each iteration of the preconditioned CG algorithm (G1)-(G5), which implies that the algorithm is easy and cheap to implement. Moreover, due to the well-chosen preconditioner $-\nabla \cdot \nabla$, one can expect the above preconditioned CG algorithm to converge fast; this will be validated by the numerical experiments reported in Section 5.

3.3. Computation of the Stepsize $\rho_k$.
Another crucial step in implementing the CG method (a)-(e) is the computation of the stepsize $\rho_k$. It is the solution of the optimization problem (3.1), which is numerically expensive to solve exactly or to high accuracy. For instance, to solve (3.1), one may consider the Newton method applied to the solution of
$$H_k'(\rho_k) = 0, \quad \text{where} \quad H_k(\rho) = J(u^k - \rho w^k).$$
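In discrete form, (G1)-(G5) is a standard preconditioned CG iteration on the multiplier $\lambda$. The following matrix sketch keeps only that structure: `apply_S` stands in for the discrete operator acting on $\lambda$ (the Schur complement of (3.4)) and `prec_solve` for one solve with the discrete preconditioner $-\nabla \cdot \nabla$; both, as well as the small test problem suggested in the note below, are illustrative assumptions rather than the paper's finite element operators.

```python
import numpy as np

def pcg(apply_S, prec_solve, rhs, tol=1e-12, max_iter=100):
    """Preconditioned CG for S*lam = rhs, mirroring (G1)-(G5):
    g is the residual, r the preconditioned residual, w the search
    direction; the stopping test uses the preconditioner norm of g."""
    lam = np.zeros_like(rhs)          # (G1) lambda^0 = 0
    g = apply_S(lam) - rhs            # (G2) residual
    r = prec_solve(g)                 #      preconditioned residual
    w = r.copy()
    rz = np.dot(g, r)                 # squared preconditioner-norm of g
    for _ in range(max_iter):
        if rz <= tol:
            break
        Sw = apply_S(w)               # (G3)
        eta = rz / np.dot(Sw, w)      #      stepsize eta_k
        lam = lam - eta * w           # (G4)
        g = g - eta * Sw
        r = prec_solve(g)
        rz_new = np.dot(g, r)
        w = r + (rz_new / rz) * w     # (G5) gamma_k update
        rz = rz_new
    return lam
```

For a small SPD matrix `S` with, say, a Jacobi preconditioner `lambda x: x / np.diag(S)`, a handful of iterations reproduce the solution of `S @ lam = rhs`; in the paper's setting the same loop runs with the two variational solves of G3 in place of the matrix-vector products.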
The Newton method requires the second-order derivative $H_k''(\rho)$, which can be computed via an iterated adjoint technique requiring the solution of four parabolic problems per Newton iteration. Hence, the implementation of the Newton method is numerically expensive. The high computational load of solving (3.1) motivates us to implement a stepsize rule that determines an approximation of $\rho_k$. Here, we advocate the following procedure to compute an approximate stepsize $\hat{\rho}_k$.

For a given $w^k \in \mathcal{U}$, we replace the state $y = S(u^k - \rho w^k)$ in the objective functional $J(u^k - \rho w^k)$ by
$$S(u^k) - \rho S'(u^k) w^k,$$
which is indeed the linearization of the mapping $\rho \mapsto S(u^k - \rho w^k)$ at $\rho = 0$. We thus obtain the following quadratic approximation of $H_k(\rho)$:
$$Q_k(\rho) := \frac{1}{2} \iint_Q |u^k - \rho w^k|^2\, dxdt + \frac{\alpha_1}{2} \iint_Q |y^k - \rho z^k - y_d|^2\, dxdt + \frac{\alpha_2}{2} \int_\Omega |y^k(T) - \rho z^k(T) - y_T|^2\, dx, \tag{3.5}$$
where $y^k = S(u^k)$ is the solution of the state equation (1.2) associated with $u^k$, and $z^k = S'(u^k) w^k$ satisfies the following linear parabolic problem:
$$\begin{cases} \dfrac{\partial z^k}{\partial t} - \nu \nabla^2 z^k + w^k \cdot \nabla y^k + u^k \cdot \nabla z^k + a_0 z^k = 0 & \text{in } Q,\\ z^k = 0 & \text{on } \Sigma,\\ z^k(0) = 0. \end{cases} \tag{3.6}$$
Then, it is easy to show that the equation $Q_k'(\rho) = 0$ admits the unique solution
$$\hat{\rho}_k = \frac{\iint_Q g^k \cdot w^k\, dxdt}{\iint_Q |w^k|^2\, dxdt + \alpha_1 \iint_Q |z^k|^2\, dxdt + \alpha_2 \int_\Omega |z^k(T)|^2\, dx}, \tag{3.7}$$
and we take $\hat{\rho}_k$, which is clearly an approximation of $\rho_k$, as the stepsize in each CG iteration. Altogether, with the stepsize given by (3.7), every iteration of the resulting CG algorithm requires solving only three parabolic problems, namely the state equation (1.2) forward in time and the associated adjoint equation (2.11) backward in time for the computation of $g^k$, and the linearized parabolic equation (3.6) forward in time for the stepsize $\hat{\rho}_k$.
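On a uniform space-time grid, the quotient (3.7) reduces to a few array reductions. In the sketch below (1D in space for simplicity; `g`, `w` hold nodal values of $g^k$ and $w^k$, `z` those of the solution $z^k$ of (3.6), and plain Riemann sums stand in for the integrals; all names are illustrative):

```python
import numpy as np

def inexact_stepsize(g, w, z, alpha1, alpha2, dx, dt):
    """Approximate stepsize (3.7) on a uniform grid.

    g, w, z : arrays of shape (nt, nx); z[-1] approximates z^k(T).
    Returns  <g,w>_Q / (|w|_Q^2 + alpha1*|z|_Q^2 + alpha2*|z(T)|^2).
    """
    num = np.sum(g * w) * dx * dt
    den = (np.sum(w ** 2) * dx * dt
           + alpha1 * np.sum(z ** 2) * dx * dt
           + alpha2 * np.sum(z[-1] ** 2) * dx)
    return num / den
```

Note that the denominator is positive whenever $w^k \neq 0$, so the stepsize is always well defined, at the cost of the single extra parabolic solve (3.6) producing `z`.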
For comparison, if the Newton method is employed to compute the stepsize ρ_k by solving (3.1), at least six parabolic problems have to be solved at each iteration of the CG method, which is much more expensive numerically.

Remark. To find an appropriate stepsize, a natural idea is to employ a line search strategy, such as backtracking based on the Armijo–Goldstein condition or the Wolfe condition; see e.g. [25]. It is worth noting that these line search strategies require repeated evaluation of J(v), which is numerically expensive because every evaluation of J(v) for a given v requires solving the state equation (1.2). Moreover, we have implemented the CG method for solving (BCP) with various line search strategies and observed from the numerical results that line search strategies always lead to tiny stepsizes, making the convergence of the CG method extremely slow.

3.4. A Nested CG Method for Solving (BCP).
Following Sections 3.2 and 3.3, we advocate the following nested CG method for solving (BCP):

I. Given u^0 ∈ U.

II. Compute y^0 and p^0 by solving the state equation (1.2) and the adjoint equation (2.11) corresponding to u^0. Then, for a.e. t ∈ (0, T), solve
\[ g^0(t) \in S, \quad \int_\Omega g^0(t) \cdot z \, dx = \int_\Omega \big(u^0(t) - p^0(t) \nabla y^0(t)\big) \cdot z \, dx, \quad \forall z \in S, \]
by the preconditioned CG algorithm (G1)–(G5); and set w^0 = g^0.

For k ≥ 0, u^k, g^k and w^k being known, the last two different from 0, one computes u^{k+1}, g^{k+1} and w^{k+1} as follows:

III. Compute the stepsize ρ̂_k by (3.7).

IV. Update u^{k+1} via u^{k+1} = u^k − ρ̂_k w^k. Compute y^{k+1} and p^{k+1} by solving the state equation (1.2) and the adjoint equation (2.11) corresponding to u^{k+1}; and, for a.e. t ∈ (0, T), solve
\[ g^{k+1}(t) \in S, \quad \int_\Omega g^{k+1}(t) \cdot z \, dx = \int_\Omega \big(u^{k+1}(t) - p^{k+1}(t) \nabla y^{k+1}(t)\big) \cdot z \, dx, \quad \forall z \in S, \]
by the preconditioned CG algorithm (G1)–(G5). If
\[ \frac{\iint_Q |g^{k+1}|^2 \, dxdt}{\iint_Q |g^0|^2 \, dxdt} \le tol, \]
take u = u^{k+1}; otherwise,

V. Compute
\[ \beta_k = \frac{\iint_Q |g^{k+1}|^2 \, dxdt}{\iint_Q |g^k|^2 \, dxdt}, \]
and set w^{k+1} = g^{k+1} + β_k w^k. Do k+1 → k and return to III.
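Stripped of the PDE solves, steps I–V form a standard Fletcher–Reeves loop. The sketch below is our own schematic (the helper names are invented here): the `gradient` callable hides the state, adjoint, and divergence-free projection solves of steps II and IV, and `stepsize` hides the approximate rule (3.7); only the outer iteration is shown.

```python
import numpy as np

def nested_cg(u0, gradient, stepsize, tol=1e-12, max_iter=500):
    """Outer loop of steps I-V.  `gradient(u)` returns DJ(u); `stepsize(u, w)`
    returns an approximation of argmin_rho J(u - rho*w), e.g. via (3.7).
    The stopping test and beta_k follow the Fletcher-Reeves rule of step V."""
    u = np.asarray(u0, dtype=float).copy()
    g = gradient(u)                      # step II
    w = g.copy()                         # w^0 = g^0
    g0_sq = np.vdot(g, g)
    if g0_sq == 0.0:
        return u                         # u0 is already stationary
    for _ in range(max_iter):
        u = u - stepsize(u, w) * w       # steps III-IV
        g_new = gradient(u)
        if np.vdot(g_new, g_new) / g0_sq <= tol:
            break
        beta = np.vdot(g_new, g_new) / np.vdot(g, g)   # step V
        w = g_new + beta * w
        g = g_new
    return u
```

On a purely quadratic model problem with an exact stepsize, this loop coincides with the classical linear CG method, which is a convenient way to sanity-check an implementation before attaching the PDE solvers.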
4. Space and time discretizations
In this section, we first discuss the numerical discretization of the bilinear optimal control problem (BCP). We achieve the time discretization by a semi-implicit finite difference method and the space discretization by a piecewise linear finite element method. Then we present an implementable nested CG method for solving the fully discrete bilinear optimal control problem.

4.1.
Time Discretization of (BCP).
First, we define a time discretization step Δt by Δt = T/N, with N a positive integer. Then we approximate the control space U = L²(0, T; S) by U^{Δt} := (S)^N, and equip U^{Δt} with the inner product
\[ (v, w)_{\Delta t} = \Delta t \sum_{n=1}^{N} \int_\Omega v_n \cdot w_n \, dx, \quad \forall\, v = \{v_n\}_{n=1}^{N},\ w = \{w_n\}_{n=1}^{N} \in U^{\Delta t}, \]
and the norm
\[ \|v\|_{\Delta t} = \Big( \Delta t \sum_{n=1}^{N} \int_\Omega |v_n|^2 \, dx \Big)^{1/2}, \quad \forall\, v = \{v_n\}_{n=1}^{N} \in U^{\Delta t}. \]
Then (BCP) is approximated by the following semi-discrete bilinear control problem (BCP)^{Δt}:
\[ \text{(BCP)}^{\Delta t} \qquad \begin{cases} u^{\Delta t} \in U^{\Delta t}, \\ J^{\Delta t}(u^{\Delta t}) \le J^{\Delta t}(v), \quad \forall\, v = \{v_n\}_{n=1}^{N} \in U^{\Delta t}, \end{cases} \]
where the cost functional J^{Δt} is defined by
\[ J^{\Delta t}(v) = \frac{1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |v_n|^2 \, dx + \frac{\alpha_1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |y_n - y_d^n|^2 \, dx + \frac{\alpha_2}{2} \int_\Omega |y_N - y_T|^2 \, dx, \]
with {y_n}_{n=1}^{N} the solution of the following semi-discrete state equation: y_0 = φ; then, for n = 1, ..., N, with y_{n−1} being known, we obtain y_n from the solution of the linear elliptic problem
\[ (4.1) \qquad \begin{cases} \dfrac{y_n - y_{n-1}}{\Delta t} - \nu \nabla^2 y_n + v_n \cdot \nabla y_{n-1} + a_0 y_{n-1} = f_n & \text{in } \Omega, \\ y_n = g_n & \text{on } \Gamma. \end{cases} \]

Remark. For simplicity, we have chosen a one-step semi-explicit scheme to discretize system (1.2). This scheme is first-order accurate and reasonably robust once combined with an appropriate space discretization. The application of second-order accurate time discretization schemes to optimal control problems has been discussed in e.g. [5].
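To make the cost of one step of (4.1) concrete, here is a one-dimensional finite difference sketch of the same semi-explicit marching (our own simplification, not the paper's discretization: homogeneous Dirichlet data on (0, 1), a uniform grid, and a scalar velocity per step; all names are invented here). Because only diffusion is implicit, the system matrix is constant in time and can be factorized once.

```python
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import splu

def semi_implicit_march(y0, v, f, nu, a0, dt, dx, n_steps):
    """1D analogue of scheme (4.1) on the interior nodes of (0, 1):
        (y_n - y_{n-1})/dt - nu*y_n'' + v_n*y_{n-1}' + a0*y_{n-1} = f_n.
    Diffusion is implicit; advection and reaction are explicit, so the
    matrix I/dt - nu*D2 is the same at every step and is factorized once."""
    m = y0.size
    D2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(m, m)) / dx**2
    D1 = diags([-1.0, 1.0], [-1, 1], shape=(m, m)) / (2.0 * dx)  # centered y'
    solver = splu((identity(m) / dt - nu * D2).tocsc())          # factorize once
    y = np.asarray(y0, dtype=float).copy()
    for n in range(n_steps):
        y = solver.solve(y / dt - v[n] * (D1 @ y) - a0 * y + f[n])
    return y
```

The single sparse factorization reused across all N steps is what makes the remark above ("no particular difficulty") quantitative: each step costs one triangular solve plus a few matrix-vector products.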
Remark. At each step of scheme (4.1), we only need to solve a simple linear elliptic problem to obtain y_n from y_{n−1}, and there is no particular difficulty in solving such a problem.

The existence of a solution to the semi-discrete bilinear optimal control problem (BCP)^{Δt} can be proved in a similar way as in the continuous case. Let u^{Δt} be a solution of (BCP)^{Δt}; then it verifies the first-order optimality condition
\[ DJ^{\Delta t}(u^{\Delta t}) = 0, \]
where DJ^{Δt}(v) is the first-order differential of the functional J^{Δt} at v ∈ U^{Δt}. Proceeding as in the continuous case, we can show that DJ^{Δt}(v) = {g_n}_{n=1}^{N} ∈ U^{Δt}, where
\[ g_n \in S, \quad \int_\Omega g_n \cdot w \, dx = \int_\Omega (v_n - p_n \nabla y_{n-1}) \cdot w \, dx, \quad \forall w \in S, \]
and the vector-valued function {p_n}_{n=1}^{N} is the solution of the semi-discrete adjoint system below: p_{N+1} = α₂(y_N − y_T); for n = N, solve
\[ \begin{cases} \dfrac{p_N - p_{N+1}}{\Delta t} - \nu \nabla^2 p_N = \alpha_1 (y_N - y_d^N) & \text{in } \Omega, \\ p_N = 0 & \text{on } \Gamma, \end{cases} \]
and for n = N−1, ..., 1, solve
\[ \begin{cases} \dfrac{p_n - p_{n+1}}{\Delta t} - \nu \nabla^2 p_n - v_{n+1} \cdot \nabla p_{n+1} + a_0 p_{n+1} = \alpha_1 (y_n - y_d^n) & \text{in } \Omega, \\ p_n = 0 & \text{on } \Gamma. \end{cases} \]

4.2. Space Discretization of (BCP)^{Δt}. In this subsection, we discuss the space discretization of (BCP)^{Δt}, obtaining thus a full space-time discretization of (BCP). For simplicity, we suppose from now on that Ω is a polygonal domain of R² (or has been approximated by a family of such domains). Let T_H be a classical triangulation of Ω, with H the largest length of the edges of the triangles of T_H. From T_H we construct T_h, with h = H/2, by subdividing the triangles of T_H. We first consider the finite element space V_h defined by
\[ V_h = \{\varphi_h \mid \varphi_h \in C^0(\bar\Omega);\ \varphi_h|_T \in P_1,\ \forall T \in \mathcal{T}_h\}, \]
with P_1 the space of the polynomials of two variables of degree ≤
1. Two useful subspaces of V_h are
\[ V_{0h} = \{\varphi_h \mid \varphi_h \in V_h,\ \varphi_h|_\Gamma = 0\} := V_h \cap H_0^1(\Omega), \]
and (assuming that g(t) ∈ C⁰(Γ))
\[ V_{gh}(t) = \{\varphi_h \mid \varphi_h \in V_h,\ \varphi_h(Q) = g(Q, t),\ \forall Q \ \text{vertex of} \ \mathcal{T}_h \ \text{located on} \ \Gamma\}. \]
In order to construct the discrete control space, we first introduce
\[ \Lambda_H = \{\varphi_H \mid \varphi_H \in C^0(\bar\Omega);\ \varphi_H|_T \in P_1,\ \forall T \in \mathcal{T}_H\}, \]
and
\[ \Lambda_{0H} = \{\varphi_H \mid \varphi_H \in \Lambda_H,\ \varphi_H|_\Gamma = 0\}. \]
Then the discrete control space U_h^{Δt} is defined by U_h^{Δt} = (S_h)^N, with
\[ S_h = \Big\{ v_h \mid v_h \in V_h \times V_h,\ \int_\Omega \nabla\cdot v_h \, q_H \, dx \Big( = -\int_\Omega v_h \cdot \nabla q_H \, dx \Big) = 0,\ \forall q_H \in \Lambda_H \Big\}. \]
With the above finite element spaces, we approximate (BCP) and (BCP)^{Δt} by (BCP)_h^{Δt} defined by
\[ \text{(BCP)}_h^{\Delta t} \qquad \begin{cases} u_h^{\Delta t} \in U_h^{\Delta t}, \\ J_h^{\Delta t}(u_h^{\Delta t}) \le J_h^{\Delta t}(v_h^{\Delta t}), \quad \forall\, v_h^{\Delta t} \in U_h^{\Delta t}, \end{cases} \]
where the fully discrete cost functional J_h^{Δt} is defined by
\[ (4.2) \qquad J_h^{\Delta t}(v_h^{\Delta t}) = \frac{1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |v_{n,h}|^2 \, dx + \frac{\alpha_1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |y_{n,h} - y_d^n|^2 \, dx + \frac{\alpha_2}{2} \int_\Omega |y_{N,h} - y_T|^2 \, dx, \]
with {y_{n,h}}_{n=1}^{N} the solution of the following fully discrete state equation: y_{0,h} = φ_h ∈ V_h, where φ_h verifies
\[ \varphi_h \in V_h, \ \forall h > 0, \quad \text{and} \quad \lim_{h \to 0} \varphi_h = \varphi \ \text{in} \ L^2(\Omega); \]
then, for n = 1, ..., N, with y_{n−1,h} being known, we obtain y_{n,h} ∈ V_{gh}(nΔt) from the solution of the linear variational problem
\[ (4.3) \qquad \int_\Omega \frac{y_{n,h} - y_{n-1,h}}{\Delta t} \varphi \, dx + \nu \int_\Omega \nabla y_{n,h} \cdot \nabla \varphi \, dx + \int_\Omega v_n \cdot \nabla y_{n-1,h} \, \varphi \, dx + \int_\Omega a_0 y_{n-1,h} \varphi \, dx = \int_\Omega f_n \varphi \, dx, \quad \forall \varphi \in V_{0h}. \]
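In one space dimension the variational step (4.3) takes a particularly transparent matrix form. The sketch below is our own illustration (P1 elements on a uniform mesh of (0, 1) with homogeneous Dirichlet data; all names are invented here, and the paper itself works in two dimensions): mass, stiffness, and advection matrices are assembled once, and each time step is one sparse solve.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

def p1_matrices(m, dx):
    """P1 mass, stiffness, and advection matrices on the m interior nodes of
    a uniform 1D mesh (a one-dimensional stand-in for V_{0h}):
    M_ij = (phi_j, phi_i), K_ij = (phi_j', phi_i'), C_ij = (phi_j', phi_i)."""
    M = diags([dx / 6, 2 * dx / 3, dx / 6], [-1, 0, 1], shape=(m, m)).tocsr()
    K = diags([-1 / dx, 2 / dx, -1 / dx], [-1, 0, 1], shape=(m, m)).tocsr()
    C = diags([-0.5, 0.5], [-1, 1], shape=(m, m)).tocsr()
    return M, K, C

def fem_step(y_prev, v_n, f_n, nu, a0, dt, M, K, C):
    """One step of a 1D analogue of (4.3):
        (M/dt + nu*K) y_n = M y_{n-1}/dt - v_n*C y_{n-1} - a0*M y_{n-1} + M f_n.
    (For repeated stepping one would factorize M/dt + nu*K once, since it
    does not change with n; a fresh solve is kept here for brevity.)"""
    A = (M / dt + nu * K).tocsc()
    rhs = M @ (y_prev / dt + f_n) - v_n * (C @ y_prev) - a0 * (M @ y_prev)
    return spsolve(A, rhs)
```

As in the strong form (4.1), the advection and reaction terms are evaluated at y_{n−1} and therefore only enter the right-hand side.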
In the following discussion, the subscript h in all variables will be omitted for simplicity. In a similar way as in the continuous case, one can show that the first-order differential of J_h^{Δt} at v ∈ U_h^{Δt} is DJ_h^{Δt}(v) = {g_n}_{n=1}^{N} ∈ (S_h)^N, where
\[ (4.4) \qquad g_n \in S_h, \quad \int_\Omega g_n \cdot z \, dx = \int_\Omega (v_n - p_n \nabla y_{n-1}) \cdot z \, dx, \quad \forall z \in S_h, \]
and the vector-valued function {p_n}_{n=1}^{N} is the solution of the following fully discrete adjoint system:
\[ (4.5) \qquad p_{N+1} = \alpha_2 (y_N - y_T); \]
for n = N, solve
\[ (4.6) \qquad p_N \in V_{0h}, \quad \int_\Omega \frac{p_N - p_{N+1}}{\Delta t} \varphi \, dx + \nu \int_\Omega \nabla p_N \cdot \nabla \varphi \, dx = \int_\Omega \alpha_1 (y_N - y_d^N) \varphi \, dx, \quad \forall \varphi \in V_{0h}; \]
then, for n = N−1, ..., 1, solve
\[ (4.7) \qquad p_n \in V_{0h}, \quad \int_\Omega \frac{p_n - p_{n+1}}{\Delta t} \varphi \, dx + \nu \int_\Omega \nabla p_n \cdot \nabla \varphi \, dx - \int_\Omega v_{n+1} \cdot \nabla p_{n+1} \, \varphi \, dx + a_0 \int_\Omega p_{n+1} \varphi \, dx = \int_\Omega \alpha_1 (y_n - y_d^n) \varphi \, dx, \quad \forall \varphi \in V_{0h}. \]
It is worth mentioning that the so-called discretize-then-optimize strategy is employed here: we first discretize (BCP), and, to compute the gradient in a discrete setting, the fully discrete adjoint equation (4.5)–(4.7) has been derived from the fully discrete cost functional J_h^{Δt}(v) in (4.2) and the fully discrete state equation (4.3). Hence the fully discrete state equation (4.3) and the fully discrete adjoint equation (4.5)–(4.7) are strictly in duality. This fact guarantees that −DJ_h^{Δt}(v) is a descent direction for the fully discrete bilinear optimal control problem (BCP)_h^{Δt}.

Remark. A natural alternative has been advocated in the literature: (i) derive the adjoint equation to compute the first-order differential of the cost functional in a continuous setting; (ii) discretize the state and adjoint state equations by certain numerical schemes; (iii) use the resulting discrete analogs of y and p to compute a discretization of the differential of the cost functional.
The main problem with this optimize-then-discretize approach is that it may not preserve a strict duality between the discrete state equation and the discrete adjoint equation. This fact implies in turn that the resulting discretization of the continuous gradient may not be a gradient of the discrete optimal control problem. As a consequence, the resulting algorithm is not a descent algorithm, and divergence may take place (see [11] for a related discussion).

4.3. A Nested CG Method for Solving the Fully Discrete Problem (BCP)_h^{Δt}. In this subsection, we propose a nested CG method for solving the fully discrete problem (BCP)_h^{Δt}. As discussed in Section 3, the implementation of CG requires the knowledge of DJ_h^{Δt}(v) and an appropriate stepsize. In the following discussion, we address these two issues by extending the results for the continuous case in Sections 3.2 and 3.3 to the fully discrete setting, and derive the corresponding CG algorithm.

First, it is clear that one can compute DJ_h^{Δt}(v) via the solution of the N linear variational problems encountered in (4.4). For this purpose, we introduce a Lagrange multiplier λ ∈ Λ_H associated with the divergence-free constraint; then problem (4.4) is equivalent to the following saddle point system:
\[ (4.8) \qquad (g_n, \lambda) \in (V_h \times V_h) \times \Lambda_H, \quad \begin{cases} \displaystyle\int_\Omega g_n \cdot z \, dx = \int_\Omega (v_n - p_n \nabla y_{n-1}) \cdot z \, dx + \int_\Omega \lambda \, \nabla\cdot z \, dx, & \forall z \in V_h \times V_h, \\[1ex] \displaystyle\int_\Omega \nabla\cdot g_n \, q \, dx = 0, & \forall q \in \Lambda_H. \end{cases} \]
As discussed in Section 3.2, problem (4.8) can be solved by the following preconditioned CG algorithm, which is actually a discrete analogue of (G1)–(G5).

DG1
Choose λ^0 ∈ Λ_H.

DG2 Solve
\[ g_n^0 \in V_h \times V_h, \quad \int_\Omega g_n^0 \cdot z \, dx = \int_\Omega (v_n - p_n \nabla y_{n-1}) \cdot z \, dx + \int_\Omega \lambda^0 \, \nabla\cdot z \, dx, \quad \forall z \in V_h \times V_h, \]
and
\[ r^0 \in \Lambda_H, \quad \int_\Omega \nabla r^0 \cdot \nabla q \, dx = \int_\Omega (\nabla\cdot g_n^0) \, q \, dx, \quad \forall q \in \Lambda_H. \]
If \( \int_\Omega |\nabla r^0|^2 dx \,/\, \max\{1, \int_\Omega |\nabla \lambda^0|^2 dx\} \le tol \), take λ = λ^0 and g_n = g_n^0; otherwise set w^0 = r^0.

For k ≥ 0, λ^k, g_n^k, r^k and w^k being known with the last two different from 0, we define λ^{k+1}, g_n^{k+1}, r^{k+1} and, if necessary, w^{k+1}, as follows:

DG3 Solve
\[ \bar g_n^k \in V_h \times V_h, \quad \int_\Omega \bar g_n^k \cdot z \, dx = \int_\Omega w^k \, \nabla\cdot z \, dx, \quad \forall z \in V_h \times V_h, \]
and
\[ \bar r^k \in \Lambda_H, \quad \int_\Omega \nabla \bar r^k \cdot \nabla q \, dx = \int_\Omega (\nabla\cdot \bar g_n^k) \, q \, dx, \quad \forall q \in \Lambda_H, \]
and compute
\[ \eta_k = \frac{\int_\Omega |\nabla r^k|^2 \, dx}{\int_\Omega \nabla \bar r^k \cdot \nabla w^k \, dx}. \]

DG4 Update λ^k, g_n^k and r^k via
\[ \lambda^{k+1} = \lambda^k - \eta_k w^k, \quad g_n^{k+1} = g_n^k - \eta_k \bar g_n^k, \quad r^{k+1} = r^k - \eta_k \bar r^k. \]
If \( \int_\Omega |\nabla r^{k+1}|^2 dx \,/\, \max\{1, \int_\Omega |\nabla r^0|^2 dx\} \le tol \), take λ = λ^{k+1} and g_n = g_n^{k+1}; otherwise,

DG5 Compute
\[ \gamma_k = \frac{\int_\Omega |\nabla r^{k+1}|^2 \, dx}{\int_\Omega |\nabla r^k|^2 \, dx}, \]
and update w^k via w^{k+1} = r^{k+1} + γ_k w^k. Do k+1 → k and return to DG3.

To find an appropriate stepsize in the CG iteration for the solution of (BCP)_h^{Δt}, we note that, for any {w_n}_{n=1}^{N} ∈ (S_h)^N, the fully discrete analogue of Q_k(ρ) in (3.5) reads as
\[ Q_h^{\Delta t}(\rho) = \frac{1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |u_n - \rho w_n|^2 \, dx + \frac{\alpha_1}{2} \Delta t \sum_{n=1}^{N} \int_\Omega |y_n - \rho z_n - y_d^n|^2 \, dx + \frac{\alpha_2}{2} \int_\Omega |y_N - \rho z_N - y_T|^2 \, dx, \]
where the vector-valued function {z_n}_{n=1}^{N} is obtained as follows: z_0 = 0; then, for n = 1, ..., N, with z_{n−1} being known, z_n is obtained from the solution of the linear variational problem
\[ z_n \in V_{0h}, \quad \int_\Omega \frac{z_n - z_{n-1}}{\Delta t} \varphi \, dx + \nu \int_\Omega \nabla z_n \cdot \nabla \varphi \, dx + \int_\Omega w_n \cdot \nabla y_n \, \varphi \, dx + \int_\Omega u_n \cdot \nabla z_{n-1} \, \varphi \, dx + a_0 \int_\Omega z_{n-1} \varphi \, dx = 0, \quad \forall \varphi \in V_{0h}. \]
As discussed in Section 3.3 for the continuous case, we take the unique solution of (Q_h^{Δt})'(ρ) = 0 as the stepsize in each CG iteration, that is,
\[ (4.9) \qquad \hat\rho_h^{\Delta t} = \frac{\Delta t \sum_{n=1}^{N} \int_\Omega g_n \cdot w_n \, dx}{\Delta t \sum_{n=1}^{N} \int_\Omega |w_n|^2 \, dx + \alpha_1 \Delta t \sum_{n=1}^{N} \int_\Omega |z_n|^2 \, dx + \alpha_2 \int_\Omega |z_N|^2 \, dx}. \]
Finally, with the above preparations, we propose the following nested CG algorithm for the solution of the fully discrete control problem (BCP)_h^{Δt}.

DI.
Given u^0 := {u_n^0}_{n=1}^{N} ∈ (S_h)^N.

DII.
Compute {y_n^0}_{n=0}^{N} and {p_n^0}_{n=1}^{N+1} by solving the fully discrete state equation (4.3) and the fully discrete adjoint equation (4.5)–(4.7) corresponding to u^0. Then, for n = 1, ..., N, solve
\[ g_n^0 \in S_h, \quad \int_\Omega g_n^0 \cdot z \, dx = \int_\Omega (u_n^0 - p_n^0 \nabla y_{n-1}^0) \cdot z \, dx, \quad \forall z \in S_h, \]
by the preconditioned CG algorithm (DG1)–(
DG5), and set w_n^0 = g_n^0.

For k ≥ 0, u^k, g^k and w^k being known, the last two different from 0, one computes u^{k+1}, g^{k+1} and w^{k+1} as follows:

DIII.
Compute the stepsize ρ̂_k by (4.9).

DIV.
Update u^{k+1} via u^{k+1} = u^k − ρ̂_k w^k. Compute {y_n^{k+1}}_{n=0}^{N} and {p_n^{k+1}}_{n=1}^{N+1} by solving the fully discrete state equation (4.3) and the fully discrete adjoint equation (4.5)–(4.7) corresponding to u^{k+1}. Then, for n = 1, ..., N, solve
\[ (4.10) \qquad g_n^{k+1} \in S_h, \quad \int_\Omega g_n^{k+1} \cdot z \, dx = \int_\Omega (u_n^{k+1} - p_n^{k+1} \nabla y_{n-1}^{k+1}) \cdot z \, dx, \quad \forall z \in S_h, \]
by the preconditioned CG algorithm (DG1)–(
DG5). If
\[ \frac{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^{k+1}|^2 \, dx}{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^0|^2 \, dx} \le tol, \]
take u = u^{k+1}; otherwise,

DV.
Compute
\[ \beta_k = \frac{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^{k+1}|^2 \, dx}{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^k|^2 \, dx}, \]
and set w^{k+1} = g^{k+1} + β_k w^k. Do k+1 → k and return to DIII.

Despite its apparent complexity, the CG algorithm (DI)–(DV) is easy to implement. Actually, one of the main computational difficulties in its implementation seems to be the solution of the N linear systems (4.10), which is time-consuming. However, it is worth noting that the linear systems (4.10) are separable with respect to different n and can be solved in parallel. As a consequence, one can compute the gradient {g_n^k}_{n=1}^{N} simultaneously, and the computation time can be reduced significantly.

Moreover, it is clear that the computation of {g_n^k}_{n=1}^{N} requires the storage of the solutions of (4.3) and (4.5)–(4.7) at all points in space and time. For large-scale problems, especially in three space dimensions, it will be very memory demanding and maybe even impossible to store the full sets {y_n^k}_{n=0}^{N} and {p_n^k}_{n=1}^{N+1} simultaneously. To tackle this issue, one can employ the strategy described in e.g. [14, Section 1.12], which can drastically reduce the storage requirements at the expense of a small increase in CPU time.

5. Numerical Experiments
In this section, we report some preliminary numerical results validating the efficiency of the proposed CG algorithm (DI)–(DV) for (BCP). All codes were written in MATLAB R2016b, and the numerical experiments were conducted on a Surface Pro 5 laptop with a 64-bit Windows 10.0 operating system, an Intel(R) Core(TM) i7-7660U CPU (2.50 GHz), and 16 GB RAM.

Example 1.
We consider the bilinear optimal control problem (BCP) on the domain Q = Ω × (0, T) with Ω = (0, 1)² and T = 1. In particular, we take the control v(x, t) in a finite-dimensional space, i.e., v ∈ L²(0, T; R²). In addition, we set α₂ = 0 in (1.1) and consider the following tracking-type bilinear optimal control problem:
\[ (5.1) \qquad \min_{v \in L^2(0,T;\mathbb{R}^2)} J(v) = \frac{1}{2} \int_0^T |v(t)|^2 \, dt + \frac{\alpha_1}{2} \iint_Q |y - y_d|^2 \, dxdt, \]
where \( |v(t)| = \sqrt{v_1(t)^2 + v_2(t)^2} \) is the canonical 2-norm, and y is obtained from v via the solution of the state equation (1.2).
Since the control v is considered in a finite-dimensional space, the divergence-free constraint ∇·v = 0 is verified automatically. As a consequence, the first-order differential DJ(v) can be easily computed. Indeed, it is easy to show that
\[ (5.2) \qquad DJ(v) = \Big\{ v_i(t) + \int_\Omega y(t) \, \frac{\partial p(t)}{\partial x_i} \, dx \Big\}_{i=1}^{2}, \quad \text{a.e. on } (0, T), \ \forall v \in L^2(0,T;\mathbb{R}^2), \]
where p(t) is the solution of the adjoint equation (2.11). The inner preconditioned CG algorithm (DG1)–(
DG5) for the computation of the gradient {g_n}_{n=1}^{N} is thus avoided.

In order to examine the efficiency of the proposed CG algorithm (DI)–(DV), we construct an example with a known exact solution. To this end, we set ν = 1 and a₀ = 1 in (1.2), and

y = e t ( − πx ) sin( πx ) + 1 . πx ) sin(2 πx )), p = ( T − t ) sin πx sin πx .

Substituting these two functions into the optimality condition DJ(u(t)) = 0, we have

u = (u₁, u₂)^⊤ = (2 e^t (T − t), − e^t (T − t))^⊤.

We further set
\[ f = \frac{\partial y}{\partial t} - \nabla^2 y + u \cdot \nabla y + y, \qquad y_d = y - \frac{1}{\alpha_1} \Big( -\frac{\partial p}{\partial t} - \nabla^2 p - u \cdot \nabla p + p \Big), \]
φ = y(·, 0), and g = 0. Then it is easy to verify that u is a solution point of problem (5.1). We display the solution u and the target function y_d at different instants of time in Figure 1 and Figure 2, respectively.

Figure 1.
The exact optimal control u for Example 1.

The stopping criterion of the CG algorithm (DI)–(DV) is set as
\[ \frac{\Delta t \sum_{n=1}^{N} |g_n^{k+1}|^2}{\Delta t \sum_{n=1}^{N} |g_n^0|^2} \le tol. \]
The initial value is chosen as u^0 = (0, 0)^⊤; and we denote by u^{Δt} and y_h^{Δt} the computed control and state, respectively.

First, we take h = 2^{−i}, i = 5, 6, 7, 8, Δt = h and α₁ = 10, and implement the proposed CG algorithm (DI)–(DV) for solving problem (5.1). The numerical results reported in Table 1 show that the CG algorithm converges fairly fast and is robust with respect to different mesh sizes. We also observe that the target function y_d has been reached within a good accuracy. Similar comments hold for the approximation of the optimal control u and of the state y of problem (5.1).

Figure 2.
The target function y_d at t = 0.25, 0.5, 0.75 (from left to right) for Example 1.

By taking h = and Δt = , the computed state y_h^{Δt} and y_h^{Δt} − y_d at t = 0.25, 0.5, 0.75 are reported in Figures 3, 4 and 5, respectively; and the computed control u^{Δt} and the error u^{Δt} − u are visualized in Figure 6.

Table 1.
Results of the CG algorithm (DI)–(DV) with different h and Δt for Example 1.

Mesh sizes | Iter | ‖u^{Δt} − u‖_{L²(0,T;R²)} | ‖y_h^{Δt} − y‖_{L²(Q)} | ‖y_h^{Δt} − y_d‖_{L²(Q)} / ‖y_d‖_{L²(Q)}
h = 1/ , Δt = 1/
117 2.8820 × − × − × − h = 1 / , ∆ t = 1 /
48 1.3912 × − × − × − h = 1 / , ∆ t = 1 /
48 6.9095 × − × − × − h = 1 / , ∆ t = 1 /
31 3.4845 × − × − × − Figure 3.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d (from left to right) at t = 0.25 for Example 1.

Furthermore, we tested the proposed CG algorithm (DI)–(DV) with h = and Δt = for different values of the penalty parameter α₁. The results reported in Table 2 show that the performance of the proposed CG algorithm is robust with respect to the penalty parameter, at least for the example being considered. We also observe that, as α₁ increases, the value of ‖y_h^{Δt} − y_d‖_{L²(Q)} / ‖y_d‖_{L²(Q)} decreases. This implies that, as expected, the computed state y_h^{Δt} is closer to the target function y_d when the penalty parameter gets larger.

Example 2.
As in Example 1, we consider the bilinear optimal control problem (BCP) on the domain Q = Ω × (0, T) with Ω = (0, 1)² and T = 1. Now we take the control v(x, t) in the infinite-dimensional space U = {v | v ∈ [L²(Q)]², ∇·v = 0}. We set α₂ = 0 in (1.1), ν = 1 and
Figure 4.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d (from left to right) at t = 0.5 for Example 1.

Figure 5.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d (from left to right) at t = 0.75 for Example 1.

Figure 6.
Computed optimal control u^{Δt} and error u^{Δt} − u for Example 1.

Table 2.
Results of the CG algorithm (DI)–(DV) with different α₁ for Example 1.

α₁ | Iter | CPU (s) | ‖u^{Δt} − u‖_{L²(0,T;R²)} | ‖y_h^{Δt} − y‖_{L²(Q)} | ‖y_h^{Δt} − y_d‖_{L²(Q)} / ‖y_d‖_{L²(Q)}
46 126.0666 1.3872 × − × − × −
48 126.4185 1.3908 × − × − × −
48 128.2346 1.3912 × − × − × −
48 127.1858 1.3912 × − × − × −
48 124.1160 1.3912 × − × − × −

a₀ = 1 in (1.2), and consider the following tracking-type bilinear optimal control problem:
\[ (5.3) \qquad \min_{v \in \mathcal{U}} J(v) = \frac{1}{2} \iint_Q |v|^2 \, dxdt + \frac{\alpha_1}{2} \iint_Q |y - y_d|^2 \, dxdt, \]
where y is obtained from v via the solution of the state equation (1.2).

First, we let

y = e t ( − πx ) sin( πx ) + 1 . πx ) sin(2 πx )), p = ( T − t ) sin πx sin πx ,

and u = P_U(p∇y), where P_U(·) is the projection onto the set U. We further set
\[ f = \frac{\partial y}{\partial t} - \nabla^2 y + u \cdot \nabla y + y, \qquad y_d = y - \frac{1}{\alpha_1} \Big( -\frac{\partial p}{\partial t} - \nabla^2 p - u \cdot \nabla p + p \Big), \]
φ = y(·, 0), and g = 0. Then it is easy to show that u is a solution point of problem (5.3). We note that u = P_U(p∇y) has no analytical expression and can only be computed numerically. Here, we solve u = P_U(p∇y) by the preconditioned CG algorithm (DG1)–(
DG5) with h = and Δt = , and use the resulting control u as a reference solution for the example considered.

Figure 7.
The target function y_d with h = and Δt = at t = 0.25, 0.5, 0.75 (from left to right) for Example 2.

The stopping criteria of the outer CG algorithm (DI)–(DV) and the inner preconditioned CG algorithm (DG1)–(
DG5) are respectively set as
\[ \frac{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^{k+1}|^2 \, dx}{\Delta t \sum_{n=1}^{N} \int_\Omega |g_n^0|^2 \, dx} \le tol_1, \quad \text{and} \quad \frac{\int_\Omega |\nabla r^{k+1}|^2 \, dx}{\max\{1, \int_\Omega |\nabla r^0|^2 \, dx\}} \le tol_2. \]
The initial values are chosen as u^0 = (0, 0)^⊤ and λ^0 = 0; and we denote by u_h^{Δt} and y_h^{Δt} the computed control and state, respectively.

First, we take h = 2^{−i}, i = 6, 7,
8, Δt = h, α₁ = 10, and implement the proposed nested CG algorithm (DI)–(DV) for solving problem (5.3). The numerical results reported in Table 3 show that the CG algorithm converges fast and is robust with respect to different mesh sizes. In addition, the preconditioned CG algorithm (DG1)–(
DG5) converges within 10 iterations in all cases and is thus efficient for computing the gradient {g_n}_{n=1}^{N}. We also observe that the target function y_d has been reached within a good accuracy. Similar comments hold for the approximation of the optimal control u and of the state y of problem (5.3).

Taking h = and Δt = , the computed state y_h^{Δt}, the error y_h^{Δt} − y, and y_h^{Δt} − y_d at t = 0.25, 0.5, 0.75 are reported in Figures 8, 9 and 10, respectively; and the computed control u_h^{Δt},
Table 3.
Results of the nested CG algorithm (DI)–(DV) with different h and Δt for Example 2.

Mesh sizes | Iter_CG | MaxIter_PCG | ‖u_h^{Δt} − u‖_{L²(Q)} | ‖y_h^{Δt} − y‖_{L²(Q)} | ‖y_h^{Δt} − y_d‖_{L²(Q)} / ‖y_d‖_{L²(Q)}
h = 1/ , Δt = 1/
443 9 3.7450 × − × − × − h = 1 / , ∆ t = 1 /
410 9 1.8990 × − × − × − h = 1 / , ∆ t = 1 /
405 8 1.1223 × − × − × −

the exact control u, and the error u_h^{Δt} − u at t = 0.25, 0.5, 0.75 are presented in Figures 11, 12 and 13.
Figure 8.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d with h = and Δt = (from left to right) at t = 0.25 for Example 2.
Figure 9.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d with h = and Δt = (from left to right) at t = 0.5 for Example 2.

Figure 10.
Computed state y_h^{Δt}, error y_h^{Δt} − y and y_h^{Δt} − y_d with h = and Δt = (from left to right) at t = 0.75 for Example 2.
Figure 11.
Computed control u_h^{Δt} and exact control u (left, from top to bottom) and the error u_h^{Δt} − u (right) with h = and Δt = at t = 0.25 for Example 2.
Figure 12.
Computed control u_h^{Δt} and exact control u (left, from top to bottom) and the error u_h^{Δt} − u (right) with h = and Δt = at t = 0.5 for Example 2.

Figure 13.
Computed control u_h^{Δt} and exact control u (left, from top to bottom) and the error u_h^{Δt} − u (right) with h = and Δt = at t = 0.75 for Example 2.
Conclusion and Outlook
We studied the bilinear optimal control of an advection-reaction-diffusion system, where the control variable enters the model as the velocity field in the advection term. Mathematically, we proved the existence of optimal controls and derived the associated first-order optimality conditions. Computationally, the conjugate gradient (CG) method was suggested, although its implementation is nontrivial. In particular, the additional divergence-free constraint on the control variable leads to a projection subproblem in the computation of the gradient, and the computation of a stepsize at each CG iteration requires solving the state equation repeatedly because of the nonlinear relation between the state and control variables. To resolve these issues, we reformulated the gradient computation as a Stokes-type problem and proposed a fast preconditioned CG method to solve it. We also proposed an efficient inexactness strategy to determine the stepsize, which requires the solution of only one additional linear parabolic equation. An easily implementable nested CG method was thus obtained. For the numerical discretization, we employed the standard piecewise linear finite element method and the Bercovier–Pironneau finite element method for the space discretization of the bilinear optimal control problem and of the Stokes-type problem, respectively, together with a semi-implicit finite difference method for the time discretization. The resulting algorithm was shown to be numerically efficient in some preliminary numerical experiments.

We focused in this paper on an advection-reaction-diffusion system controlled by a velocity field of general form. In a real physical system, the velocity field may be determined by some partial differential equations (PDEs), such as the Navier–Stokes equations; one then encounters bilinear optimal control problems constrained by coupled PDE systems. Moreover, instead of (1.1), one can also consider other types of objective functionals in the bilinear optimal control of an advection-reaction-diffusion system.
For instance, one can incorporate \( \iint_Q |\nabla v|^2 \, dxdt \) and \( \iint_Q |\frac{\partial v}{\partial t}|^2 \, dxdt \) into the objective functional to promote an optimal velocity field that has little rotation and is almost steady, respectively, properties that are essential in e.g. mixing enhancement for different flows [24]. All these problems are of practical interest but more challenging from the algorithmic design perspective, and they have not yet been well addressed numerically in the literature. Our current work lays a solid foundation for solving these problems, and we leave them for future work.

References
1. R. Becker and B. Vexler,
Optimal control of the convection-diffusion equation using stabilized finite elementmethods , Numerische Mathematik, 106 (2007), pp. 349–367.2. M. Bercovier and O. Pironneau,
Error estimates for finite element method solution of the Stokes problem in theprimitive variables . Numerische Mathematik, 33 (1979), pp. 211–224.3. A. Borz`ı, E.-J. Park and M. Vallejos Lass,
Multigrid optimization methods for the optimal control of convection-diffusion problems with bilinear control , Journal of Optimization Theory and Applications, 168 (2016), pp. 510–533.4. P. Cannarsa, G. Floridia and A. Y. Khapalov,
Multiplicative controllability for semilinear reaction–diffusionequations with finitely many changes of sign , Journal de Math´ematiques Pures et Appliqu´ees 108 (2017), pp.425–458.5. C. Carthel, R. Glowinski and J. L. Lions,
On exact and approximate boundary controllabilities for the heatequation: a numerical approach , Journal of Optimization Theory and Applications, 82 (1994), pp. 429–484.6. L. Dede’ and A. Quarteroni,
Optimal control and numerical adaptivity for advection-diffusion equations , ESAIM:Mathematical Modelling and Numerical Analysis, 39 (2005), pp. 1019–1040.7. A. Fleig and R. Guglielmi,
Optimal control of the Fokker–Planck equation with space-dependent controls , Journalof Optimization Theory and Applications, 174 (2017), pp. 408–427.8. R. Glowinski,
Ensuring well-posedness by analogy; Stokes problem and boundary control for the wave equation ,Journal of Computational Physics, 103 (1992), pp. 189–221.9. R. Glowinski,
Finite Element Methods for Incompressible Viscous Flow , Handbook of Numerical Analysis, Vol.9, Elsevier, Amsterdam, 2003, pp. 3-1176.
10. R. Glowinski,
Variational Methods for the Numerical Solution of Nonlinear Elliptic Problems , Society for Indus-trial and Applied Mathematics, Philadelphia, 2015.11. R. Glowinski and J. He,
On shape optimization and related issues , In Computational Methods for Optimal Designand Control, J. Borggaard, J. Burns, E. Cliff & S. Schreck (eds.), Birkh¨auser, Boston, MA, 1998, pp. 151–179.12. R. Glowinski and J. L. Lions,
Exact and approximate controllability for distributed parameter systems, Part I ,Acta Numerica, 3 (1994), pp. 269–378.13. R. Glowinski and J. L. Lions,
Exact and approximate controllability for distributed parameter systems, Part II ,Acta Numerica, 4 (1995), pp. 159–328.14. R. Glowinski, J. L. Lions and J. He,
Exact and Approximate Controllability for Distributed Parameter Systems:A Numerical Approach (Encyclopedia of Mathematics and its Applications) , Cambridge University Press, 2008.15. N. Handagama and S. Lenhart,
Optimal control of a PDE/ODE system modeling a gas-phase bioreactor , InMathematical Models in Medical and Health Sciences, M. A. Horn, G. Simonett, and G. Webb (eds.), VanderbiltUniversity Press, Nashville, TN, 1998.16. K. Ito and K. Kunisch,
Optimal bilinear control of an abstract Schr¨odinger equation , SIAM Journal on Controland Optimization, 46 (2007), pp. 274–287.17. H. R. Joshi,
Optimal control of the convective velocity coefficient in a parabolic problem , Nonlinear Analysis:Theory, Methods & Applications, 63 (2005), pp. e1383–e1390.18. A. Y. Khapalov,
Controllability of the semilinear parabolic equation governed by a multiplicative control in thereaction term: a qualitative approach , SIAM Journal on Control and Optimization, 41 (2003), pp. 1886–1900.19. A. Y. Khapalov,
Controllability of Partial Differential Equations Governed by Multiplicative Controls , Springer,2010.20. A. Kr¨oner and B. Vexler,
A priori error estimates for elliptic optimal control problems with a bilinear stateequation , Journal of Computational and Applied Mathematics, 230 (2009), pp. 781–802.21. S. Lenhart,
Optimal control of a convective-diffusive fluid problem , Mathematical Models and Methods in AppliedSciences, 5 (1995), pp. 225–237.22. J. L. Lions,
Optimal Control of Systems Governed by Partial Differential Equations (Grundlehren der Mathema-tischen Wissenschaften) , Vol. 170, Springer Berlin, 1971.23. J. L. Lions,
Exact controllability, stabilization and perturbations for distributed systems , SIAM Review, 30 (1988),pp. 1–68.24. W. Liu.
Mixing enhancement by optimal flow advection , SIAM Journal on Control and Optimization, 47 (2008),pp. 624–638.25. J. Nocedal, and S.J. Wright,
Numerical Optimization , Second Edition, Springer, 2006.26. F. Tr¨oltzsch,
Optimal Control of Partial Differential Equations: Theory, Methods, and Applications , Vol. 112,American Mathematical Society, 2010.27. E. Zuazua,
Propagation, observation, and control of waves approximated by finite difference methods , SIAMReview, 47 (2005), pp. 197–243.28. E. Zuazua,
Controllability of Partial Differential Equations , 3rd cycle, Castro Urdiales (Espagne), 2006, pp.311.cel-00392196.29. E. Zuazua,
Controllability and observability of partial differential equations: some results and open problems ,Handbook of differential equations: evolutionary equations, Vol. 3, North-Holland, 2007, pp. 527–621.
Department of Mathematics, University of Houston, 4800 Calhoun Road, Houston, TX 77204, USA.
E-mail address : [email protected] Department of Mathematics, The University of Hong Kong, Pok Fu Lam, Hong Kong, PRC.
E-mail address: [email protected].

Department of Mathematics, The University of Hong Kong, Pok Fu Lam, Hong Kong, PRC.
E-mail address: [email protected].

Department of Mathematics, The University of Hong Kong, Pok Fu Lam, Hong Kong, PRC.
E-mail address : yuehangrui @@