Adaptive finite element method for elliptic optimal control problems: convergence and optimality
WEI GONG∗ AND NINGNING YAN⋄

Abstract:
In this paper we consider the convergence analysis of the adaptive finite element method for elliptic optimal control problems with pointwise control constraints. We use the variational discretization concept to discretize the control variable and piecewise linear and continuous finite elements to approximate the state variable. Based on the well-established convergence theory of AFEM for elliptic boundary value problems, we rigorously prove the convergence and quasi-optimality of AFEM for optimal control problems with respect to the state and adjoint state variables, by using the so-called perturbation argument. Numerical experiments confirm our theoretical analysis.
Keywords: optimal control problem, elliptic equation, control constraint, adaptive finite element method, convergence and optimality
Subject Classification:
1. Introduction
The adaptive finite element method (AFEM for short), which goes back to the pioneering work of Babuška and Rheinboldt ([2]), has nowadays become a popular approach in the engineering and scientific computing communities. It aims at distributing more mesh nodes around the areas where singularities occur, in order to save computational cost. Various types of reliable and efficient a posteriori error estimators, which are used to detect the location of singularities and are essential for the success of AFEM, have been developed over the last decades for different kinds of problems; we refer to [36] for an overview.

Although AFEM has been successfully applied for more than three decades, its convergence analysis is rather recent: it started with Dörfler [13] and was further studied in [6, 32, 33, 31, 7]. Besides convergence, optimality is another important issue in AFEM, which was first addressed by Binev et al. [6] and further studied by Stevenson ([34, 35]). The so-called Dörfler marking proposed in [13] and the quasi-error introduced in [7], consisting of the sum of the energy error and the scaled estimator, are crucial to prove the contraction of the errors and the quasi-optimal cardinality of the standard AFEM, which avoids marking for oscillation ([13]) and circumvents the interior node property of mesh refinement ([32, 33]).

AFEM has also found successful application in optimal control problems governed by partial differential equations, starting from Liu and Yan [26] and Becker, Kapp and Rannacher [3]. In [3] the authors proposed a dual-weighted goal-oriented adaptivity for optimal control problems, while in [26] residual-type a posteriori error estimates were derived. We refer to [17, 18, 24, 27, 28, 29, 30] for more details of recent advances.
Recently, Kohls, Rösch and Siebert derived in [22] an error equivalence property which enables one to derive reliable and efficient a posteriori error estimators for optimal control problems with either variational discretization or full control discretization.

There also exist some attempts to prove the convergence of AFEM for optimal control problems. In [14] the authors considered the piecewise constant approximation of the control variable and gave an error reduction property for the quadruplet $(u, y, p, \sigma)$, where $u$, $y$, $p$ denote the optimal control, state and adjoint state variables and $\sigma$ the associated co-control variable. However, additional requirements on the strict complementarity of the continuous problem and the non-degeneracy property of the discrete control problem are assumed, and the marking strategy is extended to include the discrete free boundary between the active and inactive control sets. In [4] the authors viewed the control problem as a nonlinear elliptic system in the state and adjoint variables and gave a convergence proof for an adaptive algorithm involving the marking of data oscillation. In [23] the authors proved that the sequence of adaptively generated discrete solutions converges to the true solutions of optimal control problems, but obtained only the plain convergence of the adaptive algorithm, without convergence rate and optimality. In this paper we intend to give a rigorous convergence proof for the adaptive finite element algorithm of an elliptic optimal control problem in an optimal control framework.

Date: April 16, 2018.
∗ LSEC, Institute of Computational Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China. Email: [email protected].
⋄ NCMIS, LSEC, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China. Email: [email protected].
We want to stress that the AFEM adopted in the current paper uses Dörfler's marking ([13]) and is a standard algorithm in the sense that it employs only the error indicators and does not use the oscillation indicators.

Inspired by the work [11] of Dai, Xu and Zhou, where the convergence and optimality of AFEM for an elliptic eigenvalue problem are proved by exploiting a certain relationship between the finite element eigenvalue approximation and the associated finite element boundary value approximation, in this paper we provide a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by a linear elliptic equation. Under a mild assumption on the initial mesh from which the adaptive algorithm starts, we show that the energy norm errors of the state and adjoint state variables are equivalent to those of the boundary value approximations of the state and adjoint state equations up to a higher order term. Then, based on the well-known convergence results of AFEM for elliptic boundary value problems, we are able to prove the convergence of AFEM for the optimal control problems (OCPs for short). To be more specific, the AFEM for OCPs is a contraction, for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$, between two consecutive adaptive loops. We also show that the AFEM yields a decay rate of the energy errors of the state $y$ and the adjoint state $p$ plus oscillations of the state and adjoint state equations, in terms of the number of degrees of freedom. This result is an improvement over the plain convergence result presented in [23].

The rest of the paper is organised as follows. In Section 2 we recall some well-known results on the convergence analysis of AFEM for elliptic boundary value problems. In Section 3 we introduce the finite element approximation of the optimal control problems and derive a posteriori error estimates.
The adaptive finite element algorithm for the optimal control problems based on Dörfler's marking is also presented. In Section 4 we give a rigorous convergence analysis of the AFEM for optimal control problems, and the quasi-optimal cardinality is proved in Section 5. Numerical experiments are carried out in Section 6 to validate our theoretical results. Finally, we give a conclusion in Section 7 and outline possible extensions and future work.

Let $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$) be a bounded polygonal or polyhedral domain. We denote by $W^{m,q}(\Omega)$ the usual Sobolev space of order $m \geq 0$, $1 \leq q \leq \infty$, with norm $\|\cdot\|_{m,q,\Omega}$ and seminorm $|\cdot|_{m,q,\Omega}$. For $q = 2$ we denote $W^{m,2}(\Omega)$ by $H^m(\Omega)$ with $\|\cdot\|_{m,\Omega} = \|\cdot\|_{m,2,\Omega}$, which is a Hilbert space. Note that $H^0(\Omega) = L^2(\Omega)$ and $H_0^1(\Omega) = \{v \in H^1(\Omega) : v = 0 \text{ on } \partial\Omega\}$. We denote by $C$ a generic positive constant which may stand for different values at its different occurrences but does not depend on the mesh size. We use the symbol $A \lesssim B$ to denote $A \leq CB$ for some constant $C$ that is independent of the mesh size.

2. Preliminaries
In this section, we recall some well-known results on the adaptive finite element approximation of a linear elliptic boundary value problem, which are then used for the convergence analysis of AFEM for optimal control problems. Some of the results are collected from [7] and [11]; see also [16].

Consider the following second order elliptic equation:
$$
\begin{cases} Ly = f & \text{in } \Omega, \\ y = 0 & \text{on } \partial\Omega, \end{cases} \tag{2.1}
$$
where $L$ is a linear second order elliptic operator of the following form:
$$
Ly := -\sum_{i,j=1}^{d} \frac{\partial}{\partial x_j}\Big(a_{ij}\frac{\partial y}{\partial x_i}\Big) + cy.
$$
We denote by $L^*$ the adjoint operator of $L$:
$$
L^* y := -\sum_{i,j=1}^{d} \frac{\partial}{\partial x_j}\Big(a_{ji}\frac{\partial y}{\partial x_i}\Big) + cy.
$$
Here $a_{ij} \in W^{1,\infty}(\Omega)$ ($i,j = 1,\dots,d$) is symmetric and positive definite, and $0 \leq c < \infty$. We denote $A = (a_{ij})_{d\times d}$ and by $A^*$ its adjoint. Let
$$
a(y,v) = \int_\Omega \Big(\sum_{i,j=1}^{d} a_{ij}\frac{\partial y}{\partial x_i}\frac{\partial v}{\partial x_j} + cyv\Big), \quad \forall\, y, v \in H_0^1(\Omega).
$$
It is clear that $a(\cdot,\cdot)$ is a bounded bilinear form over $H_0^1(\Omega)$ and defines a norm $\|\cdot\|_{a,\Omega} = \sqrt{a(\cdot,\cdot)}$ which is equivalent to $\|\cdot\|_{1,\Omega}$.

The standard weak form of (2.1) reads as follows: Find $y \in H_0^1(\Omega)$ such that
$$
a(y,v) = (f,v) \quad \forall\, v \in H_0^1(\Omega). \tag{2.2}
$$
For each $f \in H^{-1}(\Omega)$ the above problem admits a unique solution by the well-known Lax–Milgram theorem. Since the elliptic equation (2.2) is linear with respect to the right hand side $f$, we can define a linear solution operator $S : L^2(\Omega) \to H_0^1(\Omega)$ such that $y = Sf$.

Let $\mathcal{T}_h$ be a regular triangulation of $\Omega$ such that $\bar\Omega = \cup_{T\in\mathcal{T}_h}\bar T$. We assume that $\mathcal{T}_h$ is shape regular in the sense that there exists a constant $\gamma^* > 0$ such that $\frac{h_T}{\rho_T} \leq \gamma^*$ for all $T \in \mathcal{T}_h$, where $h_T$ denotes the diameter of $T$ and $\rho_T$ is the diameter of the biggest ball contained in $T$. We set $h = \max_{T\in\mathcal{T}_h} h_T$. In this paper, we use $\mathcal{E}_h$ to denote the set of interior faces (edges or sides) of $\mathcal{T}_h$ and $\#\mathcal{T}_h$ to denote the number of elements of $\mathcal{T}_h$. On $\mathcal{T}_h$ we construct a family of nested finite element spaces $V_h$ consisting of piecewise linear and continuous polynomials such that $V_h \subset C(\bar\Omega) \cap H_0^1(\Omega)$.
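As a concrete illustration of the Galerkin discretization just described, the following is a minimal one-dimensional sketch (an illustration only, not from the paper, which works on 2-D/3-D triangulations): it assembles and solves the P1 finite element system for $-(a y')' + c y = f$ on $(0,1)$ with homogeneous Dirichlet boundary conditions on a uniform mesh.

```python
import numpy as np

def solve_p1(f, n, a=1.0, c=0.0):
    # P1 Galerkin approximation of -(a y')' + c y = f on (0,1),
    # y(0) = y(1) = 0, on a uniform mesh with n interior nodes (1-D sketch only).
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)                 # interior nodes
    # exact element integrals of hat functions: stiffness + mass contributions
    main = 2.0 * a / h + 2.0 * c * h / 3.0
    off = -a / h + c * h / 6.0
    A = (np.diag(np.full(n, main))
         + np.diag(np.full(n - 1, off), 1)
         + np.diag(np.full(n - 1, off), -1))
    b = h * f(x)                                   # trapezoidal-rule load vector
    return x, np.linalg.solve(A, b)

# smooth test problem with known solution y = sin(pi x)
x, yh = solve_p1(lambda t: np.pi**2 * np.sin(np.pi * t), 199)
err = np.max(np.abs(yh - np.sin(np.pi * x)))
```

For this smooth right-hand side the nodal error decays like $h^2$, while the energy-norm error decays like $h$, in line with $\kappa(h) = O(h)$ for $H^2$-regular solutions.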
We define the standard Galerkin projection operator $R_h : H_0^1(\Omega) \to V_h$ by ([9])
$$
a(y - R_h y, v_h) = 0 \quad \forall\, v_h \in V_h, \tag{2.3}
$$
which satisfies the following stability result:
$$
\|R_h y\|_{a,\Omega} \lesssim \|y\|_{a,\Omega} \quad \forall\, y \in H_0^1(\Omega). \tag{2.4}
$$
A standard finite element approximation of (2.2) can then be formulated as: Find $y_h \in V_h$ such that
$$
a(y_h, v_h) = (f, v_h) \quad \forall\, v_h \in V_h. \tag{2.5}
$$
Similarly, we can define a discrete solution operator $S_h : L^2(\Omega) \to V_h$ such that $y_h = S_h f$. Thus, we have $y_h = R_h y = R_h S f$.

For later use, we follow the idea of [11] and introduce the quantity $\kappa(h)$ as follows:
$$
\kappa(h) = \sup_{f \in L^2(\Omega),\, \|f\|_{0,\Omega}=1}\ \inf_{v_h \in V_h} \|Sf - v_h\|_{a,\Omega}. \tag{2.6}
$$
We note that the quantity $\kappa(h)$ is determined by the regularity of $Sf$, which is in turn influenced by the properties of the domain $\Omega$. Indeed, if the boundary of $\Omega$ is smooth, say $C^2$, the additional regularity $Sf \in H^2(\Omega)$ holds and thus $\kappa(h) = O(h)$. This is still true for polygonal or polyhedral boundaries if the domain is convex. The regularity is reduced, however, in the vicinity of nonconvex portions of polygonal or polyhedral boundaries. Grisvard proved in [15] precise regularity results (Theorem 2.4.3 for the two-dimensional case and Corollary 2.6.7 for the three-dimensional case): there exists an $\varepsilon \in (0, 1]$, which depends on the shape of the domain, such that $Sf \in H^{1+\varepsilon}(\Omega)$ for each $f \in L^2(\Omega)$. Obviously, $\kappa(h) \ll 1$ for all $h \in (0, h_0)$ if $h_0 \ll 1$.

Proposition 2.1.
For each $f \in L^2(\Omega)$, there hold
$$
\|Sf - S_h f\|_{a,\Omega} \lesssim \kappa(h)\|f\|_{0,\Omega} \tag{2.7}
$$
and
$$
\|Sf - S_h f\|_{0,\Omega} \lesssim \kappa(h)\|Sf - S_h f\|_{a,\Omega}. \tag{2.8}
$$

Now we are in a position to review the residual type a posteriori error estimator for the finite element approximation of the elliptic boundary value problem. We define the element residual $\tilde r_T(y_h)$ and the jump residual $\tilde j_E(y_h)$ by
$$
\tilde r_T(y_h) := f - Ly_h = f + \nabla\cdot(A\nabla y_h) - cy_h \quad \text{in } T \in \mathcal{T}_h, \tag{2.9}
$$
$$
\tilde j_E(y_h) := [A\nabla y_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h, \tag{2.10}
$$
where $[A\nabla y_h]_E \cdot n_E$ denotes the jump of $A\nabla y_h$ across the common side $E$ of elements $T^+$ and $T^-$, and $n_E$ denotes the outward normal oriented to $T^-$. For each element $T \in \mathcal{T}_h$, we define the local error indicator $\tilde\eta_h(y_h, T)$ by
$$
\tilde\eta_h(y_h, T) := \Big( h_T^2 \|\tilde r_T(y_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E \|\tilde j_E(y_h)\|_{0,E}^2 \Big)^{1/2}. \tag{2.11}
$$
Then on a subset $\omega \subset \Omega$, we define the error estimator $\tilde\eta_h(y_h, \omega)$ by
$$
\tilde\eta_h(y_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \tilde\eta_h^2(y_h, T) \Big)^{1/2}. \tag{2.12}
$$
Thus, $\tilde\eta_h(y_h, \Omega)$ constitutes the error estimator on $\Omega$ with respect to $\mathcal{T}_h$.

For $f \in L^2(\Omega)$ we also need to define the data oscillation as (see [32, 33])
$$
\mathrm{osc}(f, T) := \|h_T(f - \bar f_T)\|_{0,T}, \qquad \mathrm{osc}(f, \mathcal{T}_h) := \Big( \sum_{T\in\mathcal{T}_h} \mathrm{osc}^2(f, T) \Big)^{1/2}, \tag{2.13}
$$
where $\bar f_T$ denotes the $L^2$-projection of $f$ onto the piecewise constant space on $T$. It is easy to see that
$$
\mathrm{osc}(f_1 + f_2, \mathcal{T}_h) \leq \mathrm{osc}(f_1, \mathcal{T}_h) + \mathrm{osc}(f_2, \mathcal{T}_h) \quad \forall\, f_1, f_2 \in L^2(\Omega). \tag{2.14}
$$
For the above defined data oscillation we have the following lemma, whose proof can be found in [11, Lemma 2.4].

Lemma 2.2.
There exists a constant $C^*$, which depends on $A$, the mesh regularity constant $\gamma^*$ and the coefficient $c$, such that
$$
\mathrm{osc}(Lv, \mathcal{T}_h) \leq C^* \|v\|_{a,\Omega}, \qquad \mathrm{osc}(L^* v, \mathcal{T}_h) \leq C^* \|v\|_{a,\Omega} \quad \forall\, v \in V_h. \tag{2.15}
$$

Now we can formulate the following global upper and lower bounds for the a posteriori error estimators of elliptic boundary value problems (see, e.g., [13, 36]):
$$
\|y - y_h\|_{a,\Omega} \leq \tilde C_1 \tilde\eta_h(y_h, \Omega), \tag{2.16}
$$
$$
\tilde C_2 \tilde\eta_h^2(y_h, \Omega) \leq \|y - y_h\|_{a,\Omega}^2 + \tilde C_3\, \mathrm{osc}^2(f - Ly_h, \mathcal{T}_h). \tag{2.17}
$$
For our following purposes we also need to study the adjoint equation of the elliptic boundary value problem (2.1). For each $g \in L^2(\Omega)$, let $p \in H_0^1(\Omega)$ be the solution of the following adjoint equation:
$$
a(v, p) = (g, v) \quad \forall\, v \in H_0^1(\Omega), \tag{2.18}
$$
with its finite element approximation
$$
a(v_h, p_h) = (g, v_h) \quad \forall\, v_h \in V_h. \tag{2.19}
$$
We can also give the a posteriori global upper and lower error bounds:
$$
\|p - p_h\|_{a,\Omega} \leq \tilde C_1 \tilde\eta_h(p_h, \Omega), \tag{2.20}
$$
$$
\tilde C_2 \tilde\eta_h^2(p_h, \Omega) \leq \|p - p_h\|_{a,\Omega}^2 + \tilde C_3\, \mathrm{osc}^2(g - L^* p_h, \mathcal{T}_h). \tag{2.21}
$$

To analyse the adaptive finite element approximation of the optimal control problem, we introduce a system of two source problems associated with the state and adjoint state equations, which is a rather direct extension of the existing results on the adaptive finite element approximation of a scalar problem (see [7]). Specifically, we introduce an adaptive finite element algorithm to solve the system of elliptic boundary value problems (2.2) and (2.18). There are different kinds of adaptive algorithms, which differ in their marking strategies (see [31, 32, 33]). Here we follow the Dörfler marking introduced in [13], which marks according to the error estimator only and avoids the marking for oscillation:

Algorithm 2.3.
The Dörfler marking strategy for BVPs:
(1) Given a parameter $0 < \theta < 1$;
(2) Construct a minimal subset $\tilde{\mathcal{T}}_h \subset \mathcal{T}_h$ such that
$$
\sum_{T\in\tilde{\mathcal{T}}_h} \big( \tilde\eta_h^2(y_h, T) + \tilde\eta_h^2(p_h, T) \big) \geq \theta \big( \tilde\eta_h^2(y_h, \Omega) + \tilde\eta_h^2(p_h, \Omega) \big);
$$
(3) Mark all the elements in $\tilde{\mathcal{T}}_h$.

The adaptive algorithm for solving elliptic boundary value problems can then be described as follows (see [7]):
Algorithm 2.4.
Adaptive finite element method for BVPs:
(1) Given an initial mesh $\mathcal{T}_{h_0}$ with mesh size $h_0$, construct the finite element space $V_{h_0}$.
(2) Set $k = 0$ and solve (2.5) and (2.19) to obtain $(y_{h_k}, p_{h_k}) \in V_{h_k} \times V_{h_k}$.
(3) Compute the local error indicators $\tilde\eta_{h_k}(y_{h_k}, T)$ and $\tilde\eta_{h_k}(p_{h_k}, T)$ for each $T \in \mathcal{T}_{h_k}$.
(4) Construct $\tilde{\mathcal{T}}_{h_k} \subset \mathcal{T}_{h_k}$ by the marking Algorithm 2.3.
(5) Refine $\tilde{\mathcal{T}}_{h_k}$ to get a new conforming mesh $\mathcal{T}_{h_{k+1}}$.
(6) Construct the finite element space $V_{h_{k+1}}$ and solve (2.5) and (2.19) to obtain $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}} \times V_{h_{k+1}}$.
(7) Set $k = k + 1$ and go to Step (3).

We denote by $\mathbb{T}$ the class of all conforming refinements by bisection of $\mathcal{T}_{h_0}$ (see [7] for more details). Given a fixed number $b \geq 1$, for any $\mathcal{T}_{h_k} \in \mathbb{T}$ and subset $\mathcal{M}_{h_k} \subset \mathcal{T}_{h_k}$ of marked elements,
$$
\mathcal{T}_{h_{k+1}} = \mathrm{REFINE}(\mathcal{T}_{h_k}, \mathcal{M}_{h_k})
$$
outputs a conforming triangulation $\mathcal{T}_{h_{k+1}} \in \mathbb{T}$, where at least all elements of $\mathcal{M}_{h_k}$ are bisected $b$ times. We define $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}} := \mathcal{T}_{h_k} \setminus (\mathcal{T}_{h_k} \cap \mathcal{T}_{h_{k+1}})$ as the set of refined elements; it satisfies $\mathcal{M}_{h_k} \subset \mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$. Then we can formulate the following standard result on the complexity of refinement; see [7, Lemma 2.3] and [35] for more details.

Lemma 2.5.
Assume that $\mathcal{T}_{h_0}$ verifies condition (b) of Section 4 in [35]. Let $\mathcal{T}_{h_k}$ ($k \geq 1$) be a sequence of conforming and nested triangulations of $\Omega$ generated by REFINE starting from the initial mesh $\mathcal{T}_{h_0}$. Assume that $\mathcal{T}_{h_{k+1}}$ is generated from $\mathcal{T}_{h_k}$ by $\mathcal{T}_{h_{k+1}} = \mathrm{REFINE}(\mathcal{T}_{h_k}, \mathcal{M}_{h_k})$ with a subset $\mathcal{M}_{h_k} \subset \mathcal{T}_{h_k}$. Then there exists a constant $\hat C$ depending on $\mathcal{T}_{h_0}$ and $b$ such that
$$
\#\mathcal{T}_{h_{k+1}} - \#\mathcal{T}_{h_0} \leq \hat C \sum_{i=0}^{k} \#\mathcal{M}_{h_i} \quad \forall\, k \geq 1. \tag{2.22}
$$

We define
$$
\|(y, p)\|_a^2 = a(y, y) + a(p, p).
$$
The convergence of Algorithm 2.4 based on the marking Algorithm 2.3 is proven in [7] and has now become a standard theory for the convergence analysis of AFEM for different kinds of boundary value problems. The following Theorem 2.6, Lemma 2.7 and Lemma 2.8 are extensions, by some elementary manipulations, of the corresponding results for a single elliptic equation in [7]. We remark that in [10] the authors used a similar idea to prove the convergence of adaptive finite element computations for multiple eigenvalues.

Theorem 2.6.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k} \times V_{h_k}$ be a sequence of finite element solutions of problems (2.2) and (2.18) based on the adaptively refined meshes $\mathcal{T}_{h_k}$ produced by Algorithm 2.4. Then there exist constants $\tilde\gamma > 0$ and $\tilde\beta \in (0, 1)$, depending only on the shape regularity of the meshes, the data and the parameters used in Algorithm 2.4, such that for any two consecutive iterates $k$ and $k+1$ we have
$$
\|(y - y_{h_{k+1}}, p - p_{h_{k+1}})\|_a^2 + \tilde\gamma \big( \tilde\eta_{h_{k+1}}^2(y_{h_{k+1}}, \Omega) + \tilde\eta_{h_{k+1}}^2(p_{h_{k+1}}, \Omega) \big)
\leq \tilde\beta^2 \Big( \|(y - y_{h_k}, p - p_{h_k})\|_a^2 + \tilde\gamma \big( \tilde\eta_{h_k}^2(y_{h_k}, \Omega) + \tilde\eta_{h_k}^2(p_{h_k}, \Omega) \big) \Big). \tag{2.23}
$$
Here
$$
\tilde\gamma := \frac{1}{(1 + \delta^{-1})\, C^{*2}} \tag{2.24}
$$
with some constant $\delta \in (0, 1)$.

To prove the optimal complexity of the adaptive algorithm we need further results. The following lemma presents a localised upper bound estimate for the distance between two nested solutions of the elliptic boundary value problems (2.2) and (2.18) (see [7, Lemma 3.6] and [11, Lemma 6.2]).
Lemma 2.7.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k}\times V_{h_k}$ and $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}}\times V_{h_{k+1}}$ be the discrete solutions of problems (2.2) and (2.18) over a mesh $\mathcal{T}_{h_k}$ and its refinement $\mathcal{T}_{h_{k+1}}$ with marked elements $\mathcal{M}_{h_k}\subset\mathcal{T}_{h_k}$. Let $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$ be the set of refined elements. Then the following localised upper bound is valid:
$$
\|(y_{h_k} - y_{h_{k+1}}, p_{h_k} - p_{h_{k+1}})\|_a^2 \leq \tilde C_1 \sum_{T\in\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big). \tag{2.25}
$$

Consequently, we can show the optimality of the Dörfler marking strategy in the following lemma (see [7, Lemma 5.9] and [11, Proposition 6.3] for the proof).

Lemma 2.8.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k}\times V_{h_k}$ and $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}}\times V_{h_{k+1}}$ be the discrete solutions of problems (2.2) and (2.18) over a mesh $\mathcal{T}_{h_k}$ and its refinement $\mathcal{T}_{h_{k+1}}$ with marked elements $\mathcal{M}_{h_k}\subset\mathcal{T}_{h_k}$. Suppose that they satisfy the energy decrease property
$$
\|(y - y_{h_{k+1}}, p - p_{h_{k+1}})\|_a^2 + \tilde\gamma \big( \mathrm{osc}^2(f - Ly_{h_{k+1}}, \mathcal{T}_{h_{k+1}}) + \mathrm{osc}^2(g - L^* p_{h_{k+1}}, \mathcal{T}_{h_{k+1}}) \big)
\leq \tilde\beta^2 \Big( \|(y - y_{h_k}, p - p_{h_k})\|_a^2 + \tilde\gamma \big( \mathrm{osc}^2(f - Ly_{h_k}, \mathcal{T}_{h_k}) + \mathrm{osc}^2(g - L^* p_{h_k}, \mathcal{T}_{h_k}) \big) \Big) \tag{2.26}
$$
with a constant $\tilde\gamma > 0$ and $\tilde\beta \in (0, \frac{1}{\sqrt{2}})$. Then the set $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$ of refined elements satisfies the Dörfler property
$$
\sum_{T\in\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big) \geq \tilde\theta \sum_{T\in\mathcal{T}_{h_k}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big) \tag{2.27}
$$
with $\tilde\theta = \frac{\tilde C_2 (1 - 2\tilde\beta^2)}{\tilde C_0 \big( \tilde C_1 + (1 + 2 C^{*2}\tilde C_1)\tilde\gamma \big)}$, where $\tilde C_0 = \max(1, \tilde C_3 / \tilde\gamma)$.

3. Adaptive finite element method for optimal control problem
In this section we consider the following elliptic optimal control problem:
$$
\min_{u\in U_{ad}} J(y, u) = \frac{1}{2}\|y - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u\|_{0,\Omega}^2 \tag{3.1}
$$
subject to
$$
\begin{cases} Ly = u & \text{in } \Omega, \\ y = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.2}
$$
where $\alpha > 0$ and $U_{ad}$ is the admissible control set with bilateral control constraints:
$$
U_{ad} := \big\{ u \in L^2(\Omega) : a \leq u \leq b \ \text{a.e. in } \Omega \big\},
$$
where $a, b \in \mathbb{R}$ and $a < b$.

Remark 3.1.
We remark that all the theories presented below can be generalised to the case where the control acts on a subdomain $\omega \subset \Omega$. In this case the governing equation reads $Ly = Bu$ with the control operator $B : L^2(\omega) \to L^2(\Omega)$ an extension-by-zero operator from $\omega$ to $\Omega$.

With the solution operator $S$ of the elliptic equation (3.2) introduced in the last section, we can formulate a reduced optimization problem:
$$
\min_{u\in U_{ad}} \hat J(u) := J(Su, u) = \frac{1}{2}\|Su - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u\|_{0,\Omega}^2. \tag{3.3}
$$
Since the above optimization problem is linear and strictly convex, there exists a unique solution $u \in U_{ad}$ by a standard argument (see [25]). Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
$$
\hat J'(u)(v - u) = (\alpha u + S^*(Su - y_d), v - u) \geq 0 \quad \forall\, v \in U_{ad}, \tag{3.4}
$$
where $S^*$ is the adjoint of $S$ ([21]). Introducing the adjoint state $p := S^*(Su - y_d) \in H_0^1(\Omega)$, we are led to the following optimality system:
$$
\begin{cases}
a(y, v) = (u, v) & \forall\, v \in H_0^1(\Omega), \\
a(w, p) = (y - y_d, w) & \forall\, w \in H_0^1(\Omega), \\
(\alpha u + p, v - u) \geq 0 & \forall\, v \in U_{ad}.
\end{cases} \tag{3.5}
$$
Hereafter, we call $u$, $y$ and $p$ the optimal control, state and adjoint state, respectively. From the last inequality of (3.5) we have the pointwise representation of $u$ (see [25]):
$$
u(x) = P_{[a,b]}\Big\{ -\frac{1}{\alpha}\, p(x) \Big\}, \tag{3.6}
$$
where $P_{[a,b]}$ is the orthogonal projection operator from $L^2(\Omega)$ onto $U_{ad}$.

Next, let us consider the finite element approximation of (3.1)-(3.2). In this paper, we use piecewise linear finite elements to approximate the state $y$, and variational discretization for the optimal control $u$ (see [20]). Based on the finite element space $V_h$, we can define the finite dimensional approximation of the optimal control problem (3.1)-(3.2) as follows: Find $(u_h, y_h) \in U_{ad} \times V_h$ such that
$$
\min_{u_h\in U_{ad}} J_h(y_h, u_h) = \frac{1}{2}\|y_h - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u_h\|_{0,\Omega}^2 \tag{3.7}
$$
subject to
$$
a(y_h, v_h) = (u_h, v_h) \quad \forall\, v_h \in V_h. \tag{3.8}
$$
Similar to the continuous case we have $y_h = S_h u_h$.
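The projection in formula (3.6) is a pointwise clipping of the scaled adjoint state; a minimal numerical sketch (with illustrative sample values, not taken from the paper) is:

```python
import numpy as np

def project_control(p_vals, alpha, a, b):
    # pointwise projection P_[a,b]{ -p/alpha } as in (3.6):
    # scale the adjoint state and clip to the control bounds [a, b]
    return np.clip(-p_vals / alpha, a, b)

# adjoint-state values at some sample points (illustrative only)
p_vals = np.array([-3.0, -0.5, 0.2, 2.5])
u_vals = project_control(p_vals, alpha=1.0, a=-1.0, b=1.0)
# -> [1.0, 0.5, -0.2, -1.0]: interior values pass through, the rest are clipped
```

With variational discretization the same formula is applied to the discrete adjoint state, which is why the resulting control is in general not a piecewise linear finite element function.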
With this notation we can formulate a reduced discrete optimization problem:
$$
\min_{u_h\in U_{ad}} \hat J_h(u_h) := J_h(S_h u_h, u_h) = \frac{1}{2}\|S_h u_h - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u_h\|_{0,\Omega}^2. \tag{3.9}
$$
We note that the above optimization problem can be solved by the projected gradient method or a semi-smooth Newton method; see [19], [21] and [30] for more details.

Similar to the continuous problem (3.1)-(3.2), the above discretized optimization problem also admits a unique solution $u_h \in U_{ad}$. Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
$$
\hat J_h'(u_h)(v_h - u_h) = (\alpha u_h + S_h^*(S_h u_h - y_d), v_h - u_h) \geq 0 \quad \forall\, v_h \in U_{ad}, \tag{3.10}
$$
where $S_h^*$ is the adjoint of $S_h$. Introducing the discrete adjoint state $p_h := S_h^*(S_h u_h - y_d) \in V_h$, the discretized first order necessary and sufficient optimality condition is equivalent to:
$$
\begin{cases}
a(y_h, v_h) = (u_h, v_h) & \forall\, v_h \in V_h, \\
a(w_h, p_h) = (y_h - y_d, w_h) & \forall\, w_h \in V_h, \\
(\alpha u_h + p_h, v_h - u_h) \geq 0 & \forall\, v_h \in U_{ad}.
\end{cases} \tag{3.11}
$$
Hereafter, we call $u_h$, $y_h$ and $p_h$ the discrete optimal control, state and adjoint state, respectively. Similar to the continuous case (3.6) we have
$$
u_h(x) = P_{[a,b]}\Big\{ -\frac{1}{\alpha}\, p_h(x) \Big\}. \tag{3.12}
$$
It should be noticed that $u_h$ is in general not a finite element function in $V_h$.

For convenience we define $y^h := S u_h$ and $p^h := S^*(S_h u_h - y_d)$. It is obvious that $y_h$ and $p_h$ are the standard Galerkin projections of $y^h$ and $p^h$, i.e., $y_h = R_h y^h$ and $p_h = R_h p^h$. The following equivalence property is established in [22].

Theorem 3.2.
Let $(u, y, p) \in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h, y_h, p_h) \in U_{ad}\times V_h\times V_h$ be the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then the following equivalence property holds:
$$
\|u - u_h\|_{0,\Omega} + \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \approx \|y^h - y_h\|_{a,\Omega} + \|p^h - p_h\|_{a,\Omega}. \tag{3.13}
$$

Proof.
For completeness we include a brief proof. Setting $v = u_h$ in (3.4) and $v_h = u$ in (3.10), we are led to
$$
(\alpha u + S^*(Su - y_d), u_h - u) \geq 0, \tag{3.14}
$$
$$
(\alpha u_h + S_h^*(S_h u_h - y_d), u - u_h) \geq 0. \tag{3.15}
$$
Adding the above two inequalities, we conclude from (3.5) and (3.11) that
$$
\begin{aligned}
\alpha\|u - u_h\|_{0,\Omega}^2 &\leq (S_h^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S_h u_h - Su, Su - Su_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S_h u_h - Su, Su - S_h u_h) + (S_h u_h - Su, S_h u_h - Su_h).
\end{aligned} \tag{3.16}
$$
It follows from the $\varepsilon$-Young inequality that
$$
\alpha\|u - u_h\|_{0,\Omega}^2 \leq C\|Su_h - S_h u_h\|_{a,\Omega}^2 + C\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega}^2. \tag{3.17}
$$
Moreover, we have
$$
\|y - y_h\|_{a,\Omega} \leq \|y - Su_h\|_{a,\Omega} + \|Su_h - y_h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega} + \|Su_h - y_h\|_{a,\Omega}
$$
and
$$
\begin{aligned}
\|p - p_h\|_{a,\Omega} &\leq \|p - S^*(S_h u_h - y_d)\|_{a,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} \\
&\leq C\|Su - S_h u_h\|_{0,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} \\
&\leq C\|u - u_h\|_{0,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} + C\|Su_h - y_h\|_{a,\Omega}.
\end{aligned}
$$
Combining the above estimates we prove the upper bound.

Now we prove the lower bound. Note that
$$
\|Su_h - S_h u_h\|_{a,\Omega} \leq \|Su_h - Su\|_{a,\Omega} + \|Su - S_h u_h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega} + \|y - y_h\|_{a,\Omega}. \tag{3.18}
$$
Similarly, we can derive that
$$
\begin{aligned}
\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} &\leq \|S^*(S_h u_h - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S^*(Su - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} \\
&\leq C\|S_h u_h - Su\|_{0,\Omega} + \|p - p_h\|_{a,\Omega} \\
&\leq C\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega} + \|p - p_h\|_{a,\Omega}.
\end{aligned} \tag{3.19}
$$
Thus, we conclude from the above estimates the lower bound. This completes the proof.
□

Next, we will prove a compact equivalence property which exhibits a certain relationship between the finite element optimal control approximation and the associated finite element boundary value approximation.
Theorem 3.3.
Let $h \in (0, h_0)$, and let $(u, y, p) \in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h, y_h, p_h) \in U_{ad}\times V_h\times V_h$ be the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then, provided $h_0 \ll 1$, the following equivalence properties hold:
$$
\|y - y_h\|_{a,\Omega} = \|y^h - y_h\|_{a,\Omega} + O(\kappa(h))\big( \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \big), \tag{3.20}
$$
$$
\|p - p_h\|_{a,\Omega} = \|p^h - p_h\|_{a,\Omega} + O(\kappa(h))\big( \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \big). \tag{3.21}
$$

Proof. It is obvious that
$$
y - y_h = y^h - y_h + y - y^h, \qquad p - p_h = p^h - p_h + p - p^h. \tag{3.22}
$$
Moreover, it follows from the stability results for the elliptic equation that
$$
\|y - y^h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega}, \qquad \|p - p^h\|_{a,\Omega} \leq C\|y - y_h\|_{0,\Omega}. \tag{3.23}
$$
In the following we estimate $\|y - y_h\|_{0,\Omega}$. Let $\psi \in H_0^1(\Omega)$ be the solution of the following auxiliary problem:
$$
\begin{cases} L^*\psi = y - y_h & \text{in } \Omega, \\ \psi = 0 & \text{on } \partial\Omega. \end{cases} \tag{3.24}
$$
Let $\psi_h \in V_h$ be the finite element approximation of $\psi$. Then we can conclude from (2.7) and the standard duality argument (see, e.g., [9]) that
$$
\begin{aligned}
\|y - y_h\|_{0,\Omega}^2 &= a(y - y_h, \psi) \\
&= a(y - y_h, \psi - \psi_h) + a(y - y_h, \psi_h) \\
&= a(y - y_h, \psi - \psi_h) + (u - u_h, \psi_h - \psi) + (u - u_h, \psi) \\
&\leq C\big( \kappa(h)\|y - y_h\|_{a,\Omega} + \|u - u_h\|_{0,\Omega} \big)\|y - y_h\|_{0,\Omega},
\end{aligned}
$$
which in turn implies
$$
\|y - y_h\|_{0,\Omega} \leq C\kappa(h)\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega}. \tag{3.25}
$$
Considering (3.23) we have
$$
\|p - p^h\|_{a,\Omega} \leq C\kappa(h)\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega}. \tag{3.26}
$$
It remains to estimate $\|u - u_h\|_{0,\Omega}$. Note that it follows from (3.14) and (3.15) that
$$
\begin{aligned}
\alpha\|u - u_h\|_{0,\Omega}^2 &\leq (S_h^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S_h^*(S_h u - y_d), u - u_h) + (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h(u_h - u), S_h(u - u_h)) + (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h) \\
&\leq (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h),
\end{aligned}
$$
which yields
$$
\|u - u_h\|_{0,\Omega} \leq C\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}. \tag{3.27}
$$
Let $\phi \in H_0^1(\Omega)$ be the solution of the following auxiliary problem:
$$
\begin{cases} L\phi = S_h^*(S_h u - y_d) - S^*(Su - y_d) & \text{in } \Omega, \\ \phi = 0 & \text{on } \partial\Omega. \end{cases} \tag{3.28}
$$
Then from the standard duality argument we have
$$
\begin{aligned}
\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}^2 &= a(\phi, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + a(\phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + (\phi_h, S_h u - Su) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + (\phi_h - \phi, S_h u - Su) + (\phi, S_h u - Su),
\end{aligned} \tag{3.29}
$$
where $\phi_h \in V_h$ is the finite element approximation of $\phi$. We can conclude from (2.7)-(2.8) that
$$
a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} \tag{3.30}
$$
and
$$
(\phi_h - \phi, S_h u - Su) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h u - Su\|_{a,\Omega}, \tag{3.31}
$$
$$
(\phi, S_h u - Su) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h u - Su\|_{a,\Omega}. \tag{3.32}
$$
Then we are able to derive that
$$
\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega} \leq C\kappa(h)\big( \|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big). \tag{3.33}
$$
Combining (3.27) and (3.33) we are led to
$$
\begin{aligned}
\|u - u_h\|_{0,\Omega} &\lesssim \kappa(h)\big( \|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h^*(S_h u - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h u - S_h u_h\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h u_h - Su\|_{a,\Omega} + \|S_h u - S_h u_h\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|y - y_h\|_{a,\Omega} + \|u - u_h\|_{0,\Omega} \big). \tag{3.34}
\end{aligned}
$$
If $h_0 \ll 1$, then $\kappa(h) \ll 1$ for $h \in (0, h_0)$, and we arrive at
$$
\|u - u_h\|_{0,\Omega} \lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|y - y_h\|_{a,\Omega} \big). \tag{3.35}
$$
Inserting the above estimate into (3.23) and (3.26), we conclude from (3.22) the desired results (3.20)-(3.21).
This completes the proof. □

Now we are in a position to consider the adaptive finite element method for the optimal control problem (3.1)-(3.2). First we derive a posteriori error estimates for the above optimal control problem. To begin with, we introduce some notation. Similar to the definitions (2.9) and (2.10), we define the element residuals $r_{y,T}(y_h)$, $r_{p,T}(p_h)$ and the jump residuals $j_{y,E}(y_h)$, $j_{p,E}(p_h)$ by
$$
r_{y,T}(y_h) := u_h - Ly_h = u_h + \nabla\cdot(A\nabla y_h) - cy_h \quad \text{in } T \in \mathcal{T}_h, \tag{3.36}
$$
$$
r_{p,T}(p_h) := y_h - y_d - L^* p_h = y_h - y_d + \nabla\cdot(A^*\nabla p_h) - cp_h \quad \text{in } T \in \mathcal{T}_h, \tag{3.37}
$$
$$
j_{y,E}(y_h) := [A\nabla y_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h, \tag{3.38}
$$
$$
j_{p,E}(p_h) := [A^*\nabla p_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h. \tag{3.39}
$$
For each element $T \in \mathcal{T}_h$, we define the local error indicators $\eta_{y,h}(y_h, T)$ and $\eta_{p,h}(p_h, T)$ by
$$
\eta_{y,h}(y_h, T) := \Big( h_T^2\|r_{y,T}(y_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E\|j_{y,E}(y_h)\|_{0,E}^2 \Big)^{1/2}, \tag{3.40}
$$
$$
\eta_{p,h}(p_h, T) := \Big( h_T^2\|r_{p,T}(p_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E\|j_{p,E}(p_h)\|_{0,E}^2 \Big)^{1/2}. \tag{3.41}
$$
Then on a subset $\omega \subset \Omega$, we define the error estimators $\eta_{y,h}(y_h, \omega)$ and $\eta_{p,h}(p_h, \omega)$ by
$$
\eta_{y,h}(y_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \eta_{y,h}^2(y_h, T) \Big)^{1/2}, \tag{3.42}
$$
$$
\eta_{p,h}(p_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \eta_{p,h}^2(p_h, T) \Big)^{1/2}. \tag{3.43}
$$
Thus, $\eta_{y,h}(y_h, \Omega)$ and $\eta_{p,h}(p_h, \Omega)$ constitute the error estimators for the state equation and the adjoint state equation on $\Omega$ with respect to $\mathcal{T}_h$.

Note that $S_h u_h$ and $S_h^*(S_h u_h - y_d)$ are the standard Galerkin projections of $Su_h$ and $S^*(S_h u_h - y_d)$, respectively. Similar to (2.16)-(2.17), standard a posteriori error estimates for elliptic boundary value problems give the following upper bounds (see, e.g., [36]), which show the reliability of the error estimators.

Lemma 3.4.
Let $S$ and $S_h$ be the continuous and discrete solution operators defined above. Then the following a posteriori error estimates hold:
$$
\|Su_h - S_h u_h\|_{a,\Omega} \leq \tilde C_1 \eta_{y,h}(y_h, \Omega), \tag{3.44}
$$
$$
\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} \leq \tilde C_1 \eta_{p,h}(p_h, \Omega). \tag{3.45}
$$

We can also derive the following global a posteriori error lower bounds, i.e., the global efficiency of the error estimators.

Lemma 3.5.
Let $S$ and $S_h$ be the continuous and discrete solution operators defined above. Then the following a posteriori error lower bounds hold:

$\tilde C_2\,\eta_{y,h}(y_h,\Omega) \le \|Su_h - S_hu_h\|_{a,\Omega} + \tilde C_3\,{\rm osc}(u_h - Ly_h,\mathcal{T}_h)$,  (3.46)
$\tilde C_2\,\eta_{p,h}(p_h,\Omega) \le \|S^*(S_hu_h-y_d) - S_h^*(S_hu_h-y_d)\|_{a,\Omega} + \tilde C_3\,{\rm osc}(y_h - y_d - L^*p_h,\mathcal{T}_h)$.  (3.47)

Let $h_0\in(0,1)$ be the mesh size of the initial mesh $\mathcal{T}_{h_0}$ and define $\tilde\kappa(h_0) := \sup_{h\in(0,h_0]}\kappa(h)$. It is obvious that $\tilde\kappa(h_0)\ll 1$ provided $h_0\ll 1$. For ease of exposition we also define the quantities

$\eta_h^2((y_h,p_h),T) = \eta_{y,h}^2(y_h,T) + \eta_{p,h}^2(p_h,T)$,
${\rm osc}^2((y_h,p_h),T) = {\rm osc}^2(u_h - Ly_h,T) + {\rm osc}^2(y_h - y_d - L^*p_h,T)$,

and the straightforward modifications for $\eta_h((y_h,p_h),\Omega)$ and ${\rm osc}((y_h,p_h),\mathcal{T}_h)$.

Now we state the following a posteriori error estimates for the finite element approximation of the optimal control problem.

Theorem 3.6. Let $h\in(0,h_0)$. Assume that $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h$ are the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then there exist positive constants $C_1$, $C_2$ and $C_3$, independent of the mesh size $h$, such that

$\|(y-y_h,p-p_h)\|_a \le C_1\,\eta_h((y_h,p_h),\Omega)$  (3.48)

and

$C_2\,\eta_h((y_h,p_h),\Omega) \le \|(y-y_h,p-p_h)\|_a + C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h)$  (3.49)

provided $h_0\ll 1$.

Proof. Note that $y_h^1 = Su_h$, $y_h = S_hu_h$, $p_h^1 = S^*(S_hu_h - y_d)$ and $p_h = S_h^*(S_hu_h - y_d)$. From the estimates (3.20)-(3.21) and Lemmas 3.4 and 3.5 we have

$\|(y-y_h,p-p_h)\|_a \le (\|y_h^1 - y_h\|_{a,\Omega} + \|p_h^1 - p_h\|_{a,\Omega}) + \hat C\kappa(h)\|(y-y_h,p-p_h)\|_a \le \tilde C_1\,\eta_h((y_h,p_h),\Omega) + \hat C\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a$

and

$\tilde C_2\,\eta_h((y_h,p_h),\Omega) \le (\|y_h^1 - y_h\|_{a,\Omega} + \|p_h^1 - p_h\|_{a,\Omega}) + \tilde C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h) \le \|(y-y_h,p-p_h)\|_{a,\Omega} + \tilde C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h) + \hat C\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a$.

We obtain the desired results by choosing

$C_1 = \frac{\tilde C_1}{1-\hat C\tilde\kappa(h_0)},\qquad C_2 = \frac{\tilde C_2}{1+\hat C\tilde\kappa(h_0)},\qquad C_3 = \frac{\tilde C_3}{1+\hat C\tilde\kappa(h_0)}$.  (3.50)  □

The adaptive finite element procedure consists of the following loop:

SOLVE → ESTIMATE → MARK → REFINE.

The ESTIMATE step is based on the a posteriori error estimators presented in Theorem 3.6, while the step REFINE can be done by using iterative or recursive bisection of elements with the minimal refinement condition (see [34, 36]).
Due to [7], the procedure REFINE here is not required to satisfy the interior node property of [32]. Note that there are two error estimators, $\eta_{y,h}(y_h,T)$ and $\eta_{p,h}(p_h,T)$, corresponding to the state approximation and the adjoint state approximation, respectively. We use the sum of the two estimators as the indicator for the marking strategy. The marking algorithm, based on Dörfler's strategy, for optimal control problems can be described as follows.

Algorithm 3.7.
The Dörfler marking strategy for OCPs:

(1) Given a parameter $0 < \theta < 1$;
(2) Construct a minimal subset $\tilde{\mathcal{T}}_h\subset\mathcal{T}_h$ such that
$\sum_{T\in\tilde{\mathcal{T}}_h}\eta_h^2((y_h,p_h),T) \ge \theta\,\eta_h^2((y_h,p_h),\Omega)$;
(3) Mark all the elements in $\tilde{\mathcal{T}}_h$.

Then we can present the adaptive finite element algorithm for the optimal control problem (3.7)-(3.8) as follows:
Algorithm 3.8.
Adaptive finite element algorithm for OCPs:

(1) Given an initial mesh $\mathcal{T}_{h_0}$ with mesh size $h_0$, construct the finite element space $V_{h_0}$.
(2) Set $k=0$ and solve the optimal control problem (3.7)-(3.8) to obtain $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$.
(3) Compute the local error indicators $\eta_{h_k}((y_{h_k},p_{h_k}),T)$.
(4) Construct $\tilde{\mathcal{T}}_{h_k}\subset\mathcal{T}_{h_k}$ by the marking Algorithm 3.7.
(5) Refine $\tilde{\mathcal{T}}_{h_k}$ to get a new conforming mesh $\mathcal{T}_{h_{k+1}}$ by the procedure REFINE.
(6) Construct the finite element space $V_{h_{k+1}}$ and solve the optimal control problem (3.7)-(3.8) to obtain $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$.
(7) Set $k=k+1$ and go to Step (3).

Convergence of AFEM for optimal control problem
In this section we prove the convergence of the adaptive Algorithm 3.8. The proof uses some ideas of [11, 16] and some results of [7]. Following Theorem 3.3, we first establish some relationships between the approximations on two consecutive levels, which will be used in our analysis of both convergence and optimal complexity.
Theorem 4.1.
Let $h, H\in(0,h_0)$ and let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2). Assume that $(u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h$ and $(u_H,y_H,p_H)\in U_{ad}\times V_H\times V_H$ are the solutions of problem (3.7)-(3.8) on the respective meshes. Define $y_H^1 := Su_H$ and $p_H^1 := S^*(S_Hu_H-y_d)$, and for brevity set $E(h,H):=\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}+\|y-y_H\|_{a,\Omega}+\|p-p_H\|_{a,\Omega}$. Then the following properties hold:

$\|y-y_h\|_{a,\Omega} = \|y_H^1-R_hy_H^1\|_{a,\Omega} + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.1)
$\|p-p_h\|_{a,\Omega} = \|p_H^1-R_hp_H^1\|_{a,\Omega} + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.2)
${\rm osc}(u_h-Ly_h,\mathcal T_h) = {\rm osc}(u_H-LR_hy_H^1,\mathcal T_h) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.3)
${\rm osc}(y_h-y_d-L^*p_h,\mathcal T_h) = {\rm osc}(y_H^1-y_d-L^*R_hp_H^1,\mathcal T_h) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.4)

and

$\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_H^1,\Omega) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.5)
$\eta_{p,h}(p_h,\Omega) = \tilde\eta_h(R_hp_H^1,\Omega) + O(\tilde\kappa(h_0))\,E(h,H)$  (4.6)

provided $h_0\ll1$.

Proof. Write $y_h^1 := Su_h$ and $p_h^1 := S^*(S_hu_h-y_d)$. Note that

$y-y_h = (y_H^1-R_hy_H^1) + R_h(y_H^1-y_h^1) + (y-y_H^1)$  (4.7)

and

$p-p_h = (p_H^1-R_hp_H^1) + R_h(p_H^1-p_h^1) + (p-p_H^1)$,  (4.8)

since $R_hy_h^1=y_h$ and $R_hp_h^1=p_h$. On the other hand, it follows from (2.4) that

$\|R_h(y_H^1-y_h^1)+(y-y_H^1)\|_{a,\Omega} \lesssim \|y_H^1-y_h^1\|_{a,\Omega}+\|y-y_H^1\|_{a,\Omega} \lesssim \|y-y_h^1\|_{a,\Omega}+\|y-y_H^1\|_{a,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}$  (4.9)

and

$\|R_h(p_H^1-p_h^1)+(p-p_H^1)\|_{a,\Omega} \lesssim \|p_H^1-p_h^1\|_{a,\Omega}+\|p-p_H^1\|_{a,\Omega} \lesssim \|y-y_h\|_{0,\Omega}+\|y-y_H\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\kappa(h)\|y-y_h\|_{a,\Omega}+\|u-u_H\|_{0,\Omega}+\kappa(H)\|y-y_H\|_{a,\Omega}$,  (4.10)

where in the last inequality we used (3.25). It follows from (3.35) that

$\|R_h(y_H^1-y_h^1)+(y-y_H^1)\|_{a,\Omega}+\|R_h(p_H^1-p_h^1)+(p-p_H^1)\|_{a,\Omega} \lesssim \kappa(h)\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big)+\kappa(H)\big(\|y-y_H\|_{a,\Omega}+\|p-p_H\|_{a,\Omega}\big) \lesssim \tilde\kappa(h_0)\,E(h,H)$  (4.11)

provided $h_0\ll1$. This, combined with (4.7)-(4.8), yields (4.1) and (4.2).

Then we prove (4.3)-(4.4). Note that

$u_h-Ly_h = (u_H-LR_hy_H^1)+LR_h(y_H^1-y_h^1)+(u_h-u_H)$,  (4.12)
$y_h-y_d-L^*p_h = (y_H^1-y_d-L^*R_hp_H^1)+L^*R_h(p_H^1-p_h^1)+(y_h-y_H^1)$.  (4.13)

From Lemma 2.2 we have

${\rm osc}(LR_h(y_H^1-y_h^1),\mathcal T_h) \lesssim \|R_h(y_H^1-y_h^1)\|_{a,\Omega}$,  ${\rm osc}(L^*R_h(p_H^1-p_h^1),\mathcal T_h) \lesssim \|R_h(p_H^1-p_h^1)\|_{a,\Omega}$,

which together with (4.11) imply

${\rm osc}(LR_h(y_H^1-y_h^1),\mathcal T_h)+{\rm osc}(L^*R_h(p_H^1-p_h^1),\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.14)

Moreover, since $\bar f_T$ is the $L^2$-projection of $f$ onto piecewise polynomials on $T$, there holds

${\rm osc}(f,\mathcal T_h) = \big(\sum_{T\in\mathcal T_h}\|h_T(f-\bar f_T)\|_{0,T}^2\big)^{1/2} \lesssim \|f\|_{0,\Omega}$.

In view of (3.25) we thus have

${\rm osc}(u_h-u_H,\mathcal T_h) \lesssim \|u_h-u_H\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}$,
${\rm osc}(y_h-y_H^1,\mathcal T_h) \lesssim \|y_h-y_H^1\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}+\kappa(h)\|y-y_h\|_{a,\Omega}+\kappa(H)\|y-y_H\|_{a,\Omega}$,

which together with (3.35) yield

${\rm osc}(u_h-u_H,\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$,  (4.15)
${\rm osc}(y_h-y_H^1,\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.16)

We conclude the desired results (4.3)-(4.4) from the definition of the data oscillation and (4.12)-(4.16).

Now it remains to prove (4.5) and (4.6). From the definitions of $y_h^1$ and $y_H^1$ we know that $y_h^1-y_H^1$ solves an elliptic boundary value problem with right hand side $u_h-u_H$. It follows from (2.17) and (4.9) that

$\tilde\eta_h(R_h(y_h^1-y_H^1),\Omega) \lesssim \|(y_h^1-y_H^1)-R_h(y_h^1-y_H^1)\|_{a,\Omega}+{\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h) \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}+{\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h)$.  (4.17)

From (2.14), (3.35), (4.14) and (4.15) we are led to

${\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.18)

Note that $\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_h^1,\Omega) = \tilde\eta_h(R_hy_H^1+R_h(y_h^1-y_H^1),\Omega)$. This, combined with (4.17) and (4.18), gives

$\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_H^1,\Omega)+O(\tilde\kappa(h_0))\,E(h,H)$,

which proves (4.5). Similarly we can prove (4.6). This completes the proof of the theorem. □

Now we are ready to prove the error reduction for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$ between two consecutive adaptive loops.

Theorem 4.2.
Let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2) and let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ be the sequence of solutions to problem (3.7)-(3.8) produced by Algorithm 3.8. Then there exist constants $\gamma>0$ and $\beta\in(0,1)$, depending only on the shape regularity of the meshes and the parameter $\theta$ used in Algorithm 3.7, such that for any two consecutive iterates $k$ and $k+1$ we have

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma\,\eta_{h_{k+1}}^2((y_{h_{k+1}},p_{h_{k+1}}),\Omega) \le \beta^2\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)$  (4.19)

provided $h_0\ll1$. Therefore, Algorithm 3.8 converges with a linear rate $\beta$, namely, the $k$-th iterate solution $(u_{h_k},y_{h_k},p_{h_k})$ of Algorithm 3.8 satisfies

$\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le C_0\beta^{2k}$,  (4.20)

where $C_0 = \|(y-y_{h_0},p-p_{h_0})\|_a^2+\gamma\,\eta_{h_0}^2((y_{h_0},p_{h_0}),\Omega)$.

Proof. For convenience, we use $(u_H,y_H,p_H)$ and $(u_h,y_h,p_h)$ to denote $(u_{h_k},y_{h_k},p_{h_k})$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})$, respectively. So it suffices to prove that

$\|(y-y_h,p-p_h)\|_a^2+\gamma\,\eta_h^2((y_h,p_h),\Omega) \le \beta^2\big(\|(y-y_H,p-p_H)\|_a^2+\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)$.  (4.21)

Recall that $y_H^1 := Su_H$, $y_h^1 := Su_h$ and $p_H^1 := S^*(S_Hu_H-y_d)$, $p_h^1 := S^*(S_hu_h-y_d)$. It follows from Algorithm 3.7 that the Dörfler marking strategy of Algorithm 2.3 is satisfied for $(y_H^1,p_H^1)$. So we conclude from Theorem 2.6 that there exist constants $\tilde\gamma>0$ and $\tilde\beta\in(0,1)$ satisfying

$\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big) \le \tilde\beta^2\big(\|(y_H^1-R_Hy_H^1,p_H^1-R_Hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_H^2(R_Hy_H^1,\Omega)+\tilde\eta_H^2(R_Hp_H^1,\Omega)\big)\big)$.  (4.22)

Note that $R_Hy_H^1=y_H$ and $R_Hp_H^1=p_H$; we thus have

$\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big) \le \tilde\beta^2\big(\|(y_H^1-y_H,p_H^1-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)$.  (4.23)

We conclude from (4.1)-(4.2) and (4.5)-(4.6), together with the $\delta$-Young inequality, that there exists a constant $\hat C_1>0$ such that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+(1+\delta)\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big)+\hat C_1(1+\delta^{-1})(1+\tilde\gamma)\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$,

where $\delta\in(0,1)$ is chosen such that

$(1+\delta)\tilde\beta^2<1$.  (4.24)

Thus, there exists a positive constant $\hat C_2$ depending on $\hat C_1$ and $\tilde\gamma$ such that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\big(\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$.  (4.25)

It follows from (4.23) and (4.25) that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big(\|(y_H^1-y_H,p_H^1-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$.  (4.26)

Then using Theorem 3.3 we arrive at

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big((1+\hat C_3\tilde\kappa(h_0))\|(y-y_H,p-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$,

and thus

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big(\|(y-y_H,p-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+C_4\tilde\kappa(h_0)\|(y-y_H,p-p_H)\|_a^2+C_4\delta^{-1}\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a^2$,  (4.27)

where $C_4$ is a positive constant depending on $\hat C_2$ and $\hat C_3$, provided $h_0\ll1$. So we can derive

$\big(1-C_4\delta^{-1}\tilde\kappa(h_0)\big)\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le \big((1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)\big)\|(y-y_H,p-p_H)\|_a^2+(1+\delta)\tilde\beta^2\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)$,  (4.28)

or equivalently,

$\|(y-y_h,p-p_h)\|_a^2+\frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_h^2((y_h,p_h),\Omega) \le \frac{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\|(y-y_H,p-p_H)\|_a^2+\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_H^2((y_H,p_H),\Omega)$.  (4.29)

Since $\tilde\kappa(h_0)\ll1$ for $h_0\ll1$, we can define the constant $\beta$ by

$\beta := \Big(\frac{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\Big)^{1/2}$,  (4.30)

which satisfies $\beta\in(0,1)$ if $h_0\ll1$. Then

$\|(y-y_h,p-p_h)\|_a^2+\frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_h^2((y_h,p_h),\Omega) \le \beta^2\Big(\|(y-y_H,p-p_H)\|_a^2+\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}\,\eta_H^2((y_H,p_H),\Omega)\Big)$.  (4.31)

Now we choose

$\gamma := \frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$;  (4.32)

it is obvious that

$\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)} = \frac{(1+\delta)\tilde\beta^2\big(1-C_4\delta^{-1}\tilde\kappa(h_0)\big)}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}\,\gamma < \gamma$.

Then we obtain (4.21), which completes the proof. □
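To make the loop of Algorithm 3.8 and the error reduction just proved concrete, here is a self-contained toy sketch (our own construction, not the paper's implementation): it runs SOLVE → ESTIMATE → MARK → REFINE for the one-dimensional model problem $-y''=f$ with a nearly singular source, using Dörfler marking of the squared indicators, and records the global estimator at each iteration.

```python
import numpy as np

def fem_solve(x, f):
    # SOLVE: P1 FEM for -y'' = f on (0,1), y(0) = y(1) = 0 (dense, illustration only)
    h = np.diff(x); n = len(x) - 1
    A = np.zeros((n - 1, n - 1)); b = np.zeros(n - 1)
    for i in range(n - 1):
        A[i, i] = 1.0 / h[i] + 1.0 / h[i + 1]
        if i > 0:
            A[i, i - 1] = -1.0 / h[i]
        if i < n - 2:
            A[i, i + 1] = -1.0 / h[i + 1]
        b[i] = 0.5 * (h[i] + h[i + 1]) * f(x[i + 1])
    y = np.zeros(n + 1); y[1:-1] = np.linalg.solve(A, b)
    return y

def eta2(x, y, f):
    # ESTIMATE: squared local indicators, 1D analogue of (3.40)
    h = np.diff(x); g = np.diff(y) / h
    j = np.r_[0.0, np.diff(g), 0.0]
    mid = 0.5 * (x[:-1] + x[1:])
    return h ** 3 * f(mid) ** 2 + h * (j[:-1] ** 2 + j[1:] ** 2)

def dorfler(e2, theta):
    # MARK: minimal set M with sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2
    order = np.argsort(e2)[::-1]
    m = int(np.searchsorted(np.cumsum(e2[order]), theta * e2.sum())) + 1
    return order[:m]

def bisect(x, marked):
    # REFINE: bisect every marked element
    return np.sort(np.r_[x, 0.5 * (x[marked] + x[marked + 1])])

f = lambda t: 1.0 / np.sqrt(t + 1e-3)      # data large near t = 0
x = np.linspace(0.0, 1.0, 5)
estimators = []
for _ in range(10):                         # the adaptive loop of Algorithm 3.8
    y = fem_solve(x, f)
    e2 = eta2(x, y, f)
    estimators.append(np.sqrt(e2.sum()))
    x = bisect(x, dorfler(e2, theta=0.5))
```

In runs of this sketch the global estimator decreases overall and the refined nodes cluster near $t=0$, where the data are nearly singular, mirroring the mesh concentration at the reentrant corner observed in Section 6.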
Remark 4.3.
We remark that the requirement $h_0\ll1$ on the initial mesh $\mathcal{T}_{h_0}$ is not restrictive for the convergence analysis of AFEM for nonlinear problems, such as the optimal control problems studied in this paper; see, e.g., [14]. For similar requirements we refer to [10, 11] for the convergence analysis of adaptive finite element eigenvalue computations and to [31] for adaptive finite element computations for nonsymmetric boundary value problems; we should also mention [16] for the adaptive finite element method for a semilinear elliptic equation.

Remark 4.4.

In the adaptive Algorithm 3.8 we use the sum of the error estimators $\eta_{y,h}(y_h,T)$, contributed by the state approximation, and $\eta_{p,h}(p_h,T)$, contributed by the adjoint state approximation, as the indicator to select the subset $\tilde{\mathcal{T}}_h$ for refinement. This marking strategy enables us to prove the convergence and quasi-optimality (see Section 5) of AFEM for optimal control problems. We remark that it is also possible to use separate marking for the contributions of $\eta_{y,h}(y_h,T)$ and $\eta_{p,h}(p_h,T)$ as follows:

• Construct a minimal subset $\tilde{\mathcal{T}}_{h,1}\subset\mathcal{T}_h$ such that $\sum_{T\in\tilde{\mathcal{T}}_{h,1}}\eta_{y,h}^2(y_h,T)\ge\theta\,\eta_{y,h}^2(y_h,\Omega)$.
• Construct another minimal subset $\tilde{\mathcal{T}}_{h,2}\subset\mathcal{T}_h$ such that $\sum_{T\in\tilde{\mathcal{T}}_{h,2}}\eta_{p,h}^2(p_h,T)\ge\theta\,\eta_{p,h}^2(p_h,\Omega)$.
• Set $\tilde{\mathcal{T}}_h := \tilde{\mathcal{T}}_{h,1}\cup\tilde{\mathcal{T}}_{h,2}$ and mark all the elements in $\tilde{\mathcal{T}}_h$.

With this marking strategy we can also prove the convergence of AFEM for optimal control problems by using the results of [7, 11] for a single boundary value problem. To be more specific, the error reduction (4.22) can be derived separately for the state and the adjoint state approximations. However, the resulting over-refinement of this marking strategy prevents us from proving the quasi-optimality of the adaptive algorithm.

Complexity of AFEM for optimal control problem
In this section we analyse the complexity of the adaptive finite element algorithm for optimal control problems, based on known complexity results for elliptic boundary value problems. The proof uses some ideas of [11, 16] and some results of [7].

Similar to [7] and [11], for the purpose of analysing the complexity of AFEM for optimal control problems we introduce the function approximation class

$\mathcal A_\gamma^s := \big\{(y,p,y_d)\in H_0^1(\Omega)\times H_0^1(\Omega)\times L^2(\Omega):\ |(y,p,y_d)|_{s,\gamma}<+\infty\big\}$,

where $\gamma>0$ and

$|(y,p,y_d)|_{s,\gamma} = \sup_{\varepsilon>0}\ \varepsilon\,\inf\big\{(\#\mathcal T-\#\mathcal T_{h_0})^s:\ \mathcal T\subset\mathcal T_{h_0},\ \inf_{(y_{\mathcal T},p_{\mathcal T})}\big(\|(y-y_{\mathcal T},p-p_{\mathcal T})\|_a+(\gamma+1)\,{\rm osc}((y_{\mathcal T},p_{\mathcal T}),\mathcal T)\big)\le\varepsilon\big\}$.

Here $\mathcal T\subset\mathcal T_{h_0}$ means that $\mathcal T$ is a refinement of $\mathcal T_{h_0}$, and $y_{\mathcal T}$, $p_{\mathcal T}$ are elements of the finite element space corresponding to the partition $\mathcal T$. It is seen from the definition that $\mathcal A_\gamma^s=\mathcal A^s$ for all $\gamma>0$, so we use $\mathcal A^s$ throughout the paper, with corresponding norm $|\cdot|_s$. Thus $\mathcal A^s$ is the class of triples $v=(y,p,y_d)$ that can be approximated, to any given tolerance $\varepsilon$, by continuous piecewise linear finite element functions over a partition $\mathcal T$ with $\#\mathcal T-\#\mathcal T_{h_0}\lesssim\varepsilon^{-1/s}|v|_s^{1/s}$ degrees of freedom.

Now we are in a position to prepare the proof of the optimal complexity of Algorithm 3.8 for the optimal control problem (3.1)-(3.2). First we define $y_{h_k}^1 := Su_{h_k}$ and $p_{h_k}^1 := S^*(S_{h_k}u_{h_k}-y_d)$. Then we have the following result.

Lemma 5.1.
Let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$ be discrete solutions of problem (3.7)-(3.8) over the mesh $\mathcal T_{h_k}$ and its refinement $\mathcal T_{h_{k+1}}$ with marked elements $\mathcal M_{h_k}$. Suppose they satisfy the property

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_{k+1}},p_{h_{k+1}}),\mathcal T_{h_{k+1}}) \le \beta_*\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.1)

with some positive constants $\gamma_*$ and $\beta_*$. Then for the associated state and adjoint state approximations we have

$\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2+\tilde\gamma_*\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}}) \le \tilde\beta_*\big(\|(y_{h_k}^1-R_{h_k}y_{h_k}^1,p_{h_k}^1-R_{h_k}p_{h_k}^1)\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.2)

with

$\tilde\beta_* := \frac{(1+\delta)\beta_*+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$,  $\tilde\gamma_* := \frac{\gamma_*}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$,

where $C_4$ is a constant depending on $C_*$, $\hat C_2$ and $\hat C_3$, and $\hat C_2$, $\hat C_3$ and $\delta\in(0,1)$ are the constants from the proof of Theorem 4.2.

Proof. The proof follows the same lines as that of Theorem 4.2, with (4.5)-(4.6) replaced by (4.3)-(4.4). More precisely, in the proof of Theorem 4.2 we used (4.22), Theorem 3.3 and Theorem 4.1 to derive (4.21); conversely, here we derive the reduction (5.2) on the level of the state and adjoint state approximations from the reduction (5.1), Theorem 3.3 and Theorem 4.1. □
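The data oscillation entering (5.1)-(5.2) measures the part of the data that is not resolved by piecewise polynomials on the current mesh. As a hypothetical one-dimensional illustration (the helper below and its simple trapezoidal quadrature are our own, not from the paper):

```python
import numpy as np

def oscillation(x, f, nq=9):
    """1D data oscillation osc(f, T_h) = (sum_T ||h_T (f - fbar_T)||_{0,T}^2)^(1/2),
    where fbar_T is the elementwise mean (L2-projection onto constants)."""
    osc2 = 0.0
    for a, b in zip(x[:-1], x[1:]):
        t = np.linspace(a, b, nq)
        w = np.full(nq, (b - a) / (nq - 1))
        w[0] *= 0.5
        w[-1] *= 0.5                          # trapezoidal quadrature weights
        fbar = np.sum(w * f(t)) / (b - a)     # elementwise mean of f
        osc2 += (b - a) ** 2 * np.sum(w * (f(t) - fbar) ** 2)
    return np.sqrt(osc2)
```

For smooth data the oscillation is of higher order than the estimator itself (second order in the mesh size in this 1D sketch), which is why it enters the analysis only through the lower bounds.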
Next we derive a result, analogous to Lemma 2.8, concerning the optimality of Dörfler's marking strategy for optimal control problems.
Corollary 5.2.
Let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$ be discrete solutions of problem (3.7)-(3.8) over the mesh $\mathcal T_{h_k}$ and its refinement $\mathcal T_{h_{k+1}}$ with marked elements $\mathcal M_{h_k}$. Suppose they satisfy

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_{k+1}},p_{h_{k+1}}),\mathcal T_{h_{k+1}}) \le \beta_*\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$

with constants $\gamma_*>0$ and $\beta_*$ such that the corresponding constant $\tilde\beta_*$ of Lemma 5.1 belongs to $(0,\frac12)$. Then the set $\mathcal R := \mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_{k+1}}}$ of refined elements satisfies the Dörfler property

$\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \ge \hat\theta\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$  (5.3)

with $\hat\theta = \frac{\tilde C_2(1-2\tilde\beta_*)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)}$ and $\tilde C_0 = \max(1,\tilde C_3/\tilde\gamma_*)$, where $\tilde\beta_*$ and $\tilde\gamma_*$ are given by Lemma 5.1.

Proof. From Lemma 5.1 we conclude (5.2) from (5.1). Note that $y_{h_k}=R_{h_k}y_{h_k}^1$ and $p_{h_k}=R_{h_k}p_{h_k}^1$. By the lower bounds of Lemma 3.5 we have

$\tilde C_2\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le \|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde C_3\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k}) \le \tilde C_0\big(\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$.

Thus, it follows from (5.2) that

$(1-2\tilde\beta_*)\frac{\tilde C_2}{\tilde C_0}\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \le (1-2\tilde\beta_*)\big(\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big) \le \|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2-2\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2+\tilde\gamma_*\big({\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})-2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}})\big)$.  (5.4)

Note that $y_{h_k}$ and $R_{h_{k+1}}y_{h_k}^1$ are the Galerkin projections of $y_{h_k}^1$ onto $V_{h_k}$ and $V_{h_{k+1}}$, respectively. From the standard Galerkin orthogonality we have

$\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2-\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2 = \|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2$,  (5.5)

so the energy part of (5.4) is bounded by the right hand side of (5.5). By (2.15), the triangle and the Young inequalities we have, for $T\in\mathcal T_{h_k}\cap\mathcal T_{h_{k+1}}$,

${\rm osc}^2((y_{h_k},p_{h_k}),T) \le 2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),T)+2C_*^2\|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2$,

which together with the dominance of the indicator over the oscillation (see [7, Remark 2.1]),

${\rm osc}^2(u_{h_k}-Ly_{h_k},T)\le\eta_{y,h_k}^2(y_{h_k},T)$,  (5.6)
${\rm osc}^2(y_{h_k}-y_d-L^*p_{h_k},T)\le\eta_{p,h_k}^2(p_{h_k},T)$,  (5.7)

implies

${\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})-2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}}) \le \sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)+2C_*^2\|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2 \le (1+2C_*^2\tilde C_1)\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$,  (5.8)

where we used (2.25) in the last inequality. Combining (5.4)-(5.8) and (2.25) we obtain

$(1-2\tilde\beta_*)\frac{\tilde C_2}{\tilde C_0}\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \le \big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$.  (5.9)

By choosing

$\hat\theta := \frac{\tilde C_2(1-2\tilde\beta_*)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)}$

we complete the proof. □

Lemma 5.3.
Let $(y,p,y_d)\in\mathcal A^s$ and let $\mathcal T_{h_k}$ ($k\ge0$) be the sequence of meshes generated by Algorithm 3.8 starting from the initial mesh $\mathcal T_{h_0}$. Let $\mathcal T_{h_{k+1}}={\rm REFINE}(\mathcal T_{h_k},\mathcal M_{h_k})$, where $\mathcal M_{h_k}$ is produced by Algorithm 3.7 with $\theta\in(0,\theta_*)$ and $\theta_* := \frac{C_2\gamma}{C_3\big(C_2+(1+2C_*^2C_2)\gamma\big)}$. Then

$\#\mathcal M_{h_k} \le C\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,  (5.10)

where the constant $C$ depends on the discrepancy between $\theta$ and $\theta_*$.

Proof. Let $\rho_1,\rho_2\in(0,1)$ satisfy $\rho_1\in(0,\rho_2)$ and

$\theta < \theta_*(1-\rho_2)$.

Choose $\delta\in(0,1)$ to satisfy (4.24) and

$(1+\delta)\rho_1\le\rho_2$,  (5.11)

which implies

$(1+\delta)\rho_1<1$.  (5.12)

Set

$\varepsilon := \Big(\rho_1\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)\Big)^{1/2}$

and let $\mathcal T_{h_\varepsilon}$ be a refinement of $\mathcal T_{h_0}$ with minimal degrees of freedom satisfying

$\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+1)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\le\varepsilon^2$.  (5.13)

We can conclude from the definition of $\mathcal A^s$ that

$\#\mathcal T_{h_\varepsilon}-\#\mathcal T_{h_0}\lesssim\varepsilon^{-1/s}\,|(y,p,y_d)|_s^{1/s}$.

Let $\mathcal T_{h_*}:=\mathcal T_{h_\varepsilon}\oplus\mathcal T_{h_k}$ be the smallest common refinement of $\mathcal T_{h_\varepsilon}$ and $\mathcal T_{h_k}$, and let $V_{h_\varepsilon}\subset H_0^1(\Omega)$ and $V_{h_*}\subset H_0^1(\Omega)$ be the finite element spaces defined on $\mathcal T_{h_\varepsilon}$ and $\mathcal T_{h_*}$, respectively. Assume that $(u_{h_\varepsilon},y_{h_\varepsilon},p_{h_\varepsilon})\in U_{ad}\times V_{h_\varepsilon}\times V_{h_\varepsilon}$ is the solution of problem (3.7)-(3.8), and define $y_{h_\varepsilon}^1 := Su_{h_\varepsilon}$ and $p_{h_\varepsilon}^1 := S^*(S_{h_\varepsilon}u_{h_\varepsilon}-y_d)$. From the definition of the oscillation we can conclude from Lemma 2.2 that

${\rm osc}(u_{h_\varepsilon}-LR_{h_*}y_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(u_{h_\varepsilon}-LR_{h_\varepsilon}y_{h_\varepsilon}^1,\mathcal T_{h_*})+{\rm osc}(L(R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(u_{h_\varepsilon}-LR_{h_\varepsilon}y_{h_\varepsilon}^1,\mathcal T_{h_*})+C_*\|(R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1\|_a$

and

${\rm osc}(y_{h_\varepsilon}^1-y_d-L^*R_{h_*}p_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(y_{h_\varepsilon}^1-y_d-L^*R_{h_\varepsilon}p_{h_\varepsilon}^1,\mathcal T_{h_*})+C_*\|(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1\|_a$.

Then from Young's inequality we have

${\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le 2\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})+2C_*^2\|((R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1)\|_a^2$.

Due to the orthogonality

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2 = \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2-\|((R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1)\|_a^2$,

we arrive at

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2+\frac{1}{2C_*^2}\,{\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2+\frac{1}{C_*^2}\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})$.

From Theorem 2.6 we can see that $\tilde\gamma\le\frac{1}{2C_*^2}$, which implies

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2+\tilde\gamma\,{\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2+(\tilde\gamma+\sigma)\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})$

with $\sigma := \frac{1}{C_*^2}-\tilde\gamma\in(0,1)$. Arguing as in the proofs of Theorem 4.2 and Lemma 5.1, this yields

$\|(y-y_{h_*},p-p_{h_*})\|_a^2+\gamma\,{\rm osc}^2((y_{h_*},p_{h_*}),\mathcal T_{h_*}) \le \beta_1\big(\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+\sigma)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\big) \le \beta_1\big(\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+1)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\big)$,  (5.14)

where

$\beta_1 := \frac{(1+\delta)+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$

and $C_4$ is the constant appearing in the proof of Theorem 4.2. Thus, by (5.13) and (5.14) it follows that

$\|(y-y_{h_*},p-p_{h_*})\|_a^2+\gamma\,{\rm osc}^2((y_{h_*},p_{h_*}),\mathcal T_{h_*}) \le \beta_2\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.15)

with $\beta_2 = \beta_1\rho_1$. In view of (5.12) we have $\beta_2\in(0,1)$ provided $h_0\ll1$, and $\beta_2$ is small enough for Corollary 5.2 to apply. It follows from Corollary 5.2 that

$\sum_{T\in\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)\ge\hat\theta_1\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$,  (5.16)

where $\hat\theta_1 = \frac{\tilde C_2(1-2\tilde\beta_2)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_1\big)}$ with $\tilde\gamma_1 = \frac{\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$, $\tilde C_0 = \max(1,\tilde C_3/\tilde\gamma_1)$ and $\tilde\beta_2 = \frac{(1+\delta)\beta_2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$.

Since $h_0\ll1$, we obtain $\tilde\gamma_1>\gamma$ and $\tilde\beta_2\in(0,\sqrt{\rho_2})$ from (5.11). It is then easy to see from (3.50) and $\tilde\gamma_1>\gamma$ that

$\hat\theta_1 \ge \theta_*(1-\rho_2) > \theta$  (5.17)

provided $h_0\ll1$. This implies

$\sum_{T\in\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \ge \theta\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$.

Note that Algorithm 3.7 selects a minimal set $\mathcal M_{h_k}=\tilde{\mathcal T}_{h_k}$ satisfying this property. Thus,

$\#\mathcal M_{h_k}\le\#\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}\le\#\mathcal T_{h_*}-\#\mathcal T_{h_k}\le\#\mathcal T_{h_\varepsilon}-\#\mathcal T_{h_0} \lesssim \varepsilon^{-1/s}\,|(y,p,y_d)|_s^{1/s} = \rho_1^{-\frac{1}{2s}}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,

which is the desired result, with an explicit dependence on the discrepancy between $\theta$ and $\theta_*$. □

We are now ready to prove that Algorithm 3.8 possesses optimal complexity for the state and adjoint state approximations.
Theorem 5.4.
Let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2) and let $(u_{h_n},y_{h_n},p_{h_n})\in U_{ad}\times V_{h_n}\times V_{h_n}$ be a sequence of solutions of problem (3.7)-(3.8) corresponding to a sequence of finite element spaces $V_{h_n}$ with partitions $\mathcal T_{h_n}$ produced by Algorithm 3.8. Then the $n$-th iterate solution $(y_{h_n},p_{h_n})$ of Algorithm 3.8 satisfies the optimal bound

$\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,{\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n}) \lesssim (\#\mathcal T_{h_n}-\#\mathcal T_{h_0})^{-2s}$,  (5.18)

where the hidden constant depends on the exact solution $(u,y,p)$ and on the discrepancy between $\theta$ and $\frac{C_2\gamma}{C_3(C_2+(1+2C_*^2C_2)\gamma)}$.

Proof. It follows from (2.22) and (5.10) that

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \sum_{k=0}^{n-1}\#\mathcal M_{h_k} \lesssim \sum_{k=0}^{n-1}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$.  (5.19)

From the lower bound (3.49) we have

$\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le C_5\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$,

where $C_5 = \max(1+2\gamma C_2^{-2},\,2C_2^{-2}C_3^2)$. Then we arrive at

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \sum_{k=0}^{n-1}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$.  (5.20)

Due to (4.19), we obtain for $0\le k<n$ that

$\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega) \le \beta^{2(n-k)}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)$.

Thus,

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}\sum_{k=0}^{n-1}\beta^{\frac{n-k}{s}} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,  (5.21)

where the last inequality holds due to the fact that $\beta<1$, so that $\sum_{k=0}^{n-1}\beta^{(n-k)/s}$ is bounded. Finally, the dominance ${\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n})\le\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)$ together with (5.21) yields

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,{\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n})\big)^{-\frac{1}{2s}}$,  (5.22)

and this completes the proof. □

Remark 5.5.
From (3.35) and the equivalence property (3.13) we can conclude that Theorem 4.2 also implies the convergence of $\|u-u_{h_k}\|_{0,\Omega}$; namely, for the $n$-th iterate solution $u_{h_n}$ of Algorithm 3.8 there holds

$\|u-u_{h_n}\|_{0,\Omega} \lesssim \beta^n$.  (5.23)

We remark that the control variable can also be included in the complexity analysis of AFEM for optimal control problems, to obtain

$\|u-u_{h_n}\|_{0,\Omega} \lesssim (\#\mathcal T_{h_n}-\#\mathcal T_{h_0})^{-s}$.  (5.24)

However, the above results are sub-optimal for the optimal control, as illustrated by the numerical results in Section 6. To prove the optimality of AFEM for the control variable it seems that we need to work with AFEM based on $L^2$-norm error estimators; we refer to [20] for the optimal a priori error estimate. We expect that the results in [12] will enable us to prove the optimal convergence of AFEM for the optimal control $u$; this is postponed to future work.

Numerical experiments
In this section we carry out some numerical tests in two dimensions to support the theoretical results obtained in this paper. We take the elliptic operator $L$ to be $-\Delta$ with homogeneous Dirichlet boundary conditions for all the examples.

Example 6.1.
This example is taken from [1]. The domain $\Omega$ is described in polar coordinates by $\Omega=\{(r,\vartheta):\ 0<r<1,\ 0<\vartheta<\omega\}$, with a reentrant angle $\omega\in(\pi,2\pi)$ as specified in [1]. We take the exact solutions as

$y(r,\vartheta)=(r^\lambda-r^\nu)\sin(\lambda\vartheta)$,  $p(r,\vartheta)=\alpha(r^\lambda-r^\nu)\sin(\lambda\vartheta)$,  $u(r,\vartheta)=P_{U_{ad}}\big(-\tfrac{p}{\alpha}\big)$,

with exponents $\lambda$ and $\nu$ chosen as in [1]. We set $b=1$ and choose the parameters $\alpha$ and $a<0$ as in [1]. An additional right hand side $f$ is assumed for the state equation.
5. Figure 1 shows the profiles of the numerically computed optimalstate and adjoint state. We present in Figure 2 the triangulations by Algorithm 3.8 after 8 and 10adaptive iterations. We can see that the meshes are concentrated on the reentrant corner wherethe singularities located. −1 −0.5 0 0.5 1−10100.10.20.30.40.5 −1 −0.5 0 0.5 1−1−0.500.5100.010.020.030.040.05
Figure 1.
The profiles of the discretised optimal state y h (left) and adjoint state p h (right) for Example 6.1 on adaptively refined mesh. −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1−0.8−0.6−0.4−0.200.20.40.60.81 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1−0.8−0.6−0.4−0.200.20.40.60.81 Figure 2.
The meshes after 8 (left) and 10 (right) adaptive iterations for Example 6.1, generated by Algorithm 3.8.

In the left plot of Figure 3 we present the convergence history of the errors $\|y-y_h\|$, $\|p-p_h\|$ and $\|u-u_h\|$ (the latter in the $L^2$-norm) on uniformly refined meshes, together with reference lines of slope $-1$ and $-1/2$. In the right plot of Figure 3 we present the convergence behaviour of the optimal control, state and adjoint state, as well as the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations under adaptive refinement. In Figure 4 we present the convergence of the error $\|(y-y_h,p-p_h)\|_a$ and the error indicator $\eta_h((y_h,p_h),\Omega)$ for the two choices of $\theta$, respectively. It can be seen from Figure 4 that the error $\|(y-y_h,p-p_h)\|_a$ is proportional to the a posteriori error estimator, which indicates the efficiency of the a posteriori error estimators given in Section 3. Moreover, we can also observe that the decay of the error $\|(y-y_h,p-p_h)\|_a$ is approximately parallel to the line with slope $-1/2$. For $\|u-u_h\|_{0,\Omega}$ we observe a reduction with slope $-1$, which is better than the result presented in Remark 5.5 and strongly suggests that the proved convergence rate for the optimal control is not optimal.

Figure 3.
The convergence history of the optimal control, state and adjoint stateon uniformly refined meshes (left), and the convergence of the errors and estima-tors on adaptively refined meshes (right) for Example 6.1 generated by Algorithm3.8.
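The slopes reported in the figures are empirical convergence orders read off the error-versus-number-of-elements history in log-log scale. A minimal sketch of how such an order can be computed (names and data are illustrative):

```python
import math

def empirical_rate(num_elements, errors):
    """Slope of log(error) versus log(#elements) between the first and last level.

    A slope of -1/2 matches the optimal energy-norm rate O(N^{-1/2}) in 2D,
    a slope of -1 the L2 rate O(N^{-1})."""
    return (math.log(errors[-1] / errors[0])
            / math.log(num_elements[-1] / num_elements[0]))

# Synthetic error history decaying exactly like N^{-1/2}:
N = [100, 400, 1600, 6400]
err = [n ** -0.5 for n in N]
print(round(empirical_rate(N, err), 2))  # -0.5
```

In practice one would fit all levels, not just the endpoints, but the endpoint slope already reproduces the reference lines shown in the plots.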
Example 6.2.
In the second example we consider an optimal control problem without explicit solutions. We set $\Omega=(-1,1)^2$ and $b=10$; the parameter $\alpha$ is a small power of ten and the lower bound $a$ is negative. The desired state $y_d$ is chosen piecewise constant, positive in the first and second quadrants and negative in the third and fourth.

Similar to the above example, Figure 5 shows the profiles of the numerically computed optimal state and adjoint state. We present in the left plot of Figure 6 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations with parameter $\theta=0.5$. Since there are no explicit solutions, we cannot show the convergence of the error $\|(y-y_h,p-p_h)\|_a$ as in Example 6.1. Instead we show in the right plot of Figure 6 the convergence of the error indicator $\eta_h((y_h,p_h),\Omega)$ and the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations. We can observe the error reduction with slope $-1/2$.

Example 6.3.
In the third example we also consider an optimal control problem without explicit solutions, defined on an L-shaped domain: $\Omega$ is obtained from the square $(-1,1)^2$ by removing a rectangle attached to the origin, so that $\Omega$ has a reentrant corner there. We set $a=0$ and $b=8$, with a small parameter $\alpha$, and take the desired state $y_d=2$.

We show in Figure 7 the profiles of the numerically computed optimal state and adjoint state; singularities of both the state and the adjoint state can be observed around the reentrant corner. We present in the left plot of Figure 8 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations.
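The adaptive loops in all three examples are driven by Algorithm 3.8, whose marking step uses Dörfler's strategy with parameter $\theta$: mark a minimal set of elements carrying at least the fraction $\theta$ of the total squared error indicator. A minimal sketch (function name and data are illustrative, not the paper's implementation):

```python
def doerfler_mark(indicators, theta):
    """Doerfler marking: return indices of a minimal set M of elements with
    sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2."""
    total = sum(eta ** 2 for eta in indicators)
    # Greedily take elements with the largest indicators first.
    order = sorted(range(len(indicators)),
                   key=lambda i: indicators[i], reverse=True)
    marked, acc = [], 0.0
    for i in order:
        marked.append(i)
        acc += indicators[i] ** 2
        if acc >= theta * total:
            break
    return marked

# Four elements with indicators 3, 1, 2, 0.5; theta = 0.5 marks only element 0.
print(doerfler_mark([3.0, 1.0, 2.0, 0.5], theta=0.5))  # [0]
```

A larger $\theta$ marks more elements per loop (fewer, coarser iterations), while a smaller $\theta$ refines more selectively, which is the trade-off behind the two choices of $\theta$ compared in the experiments.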
Figure 4.
The convergence history of the optimal control, the state and adjoint state, and the error indicator on adaptively refined meshes for the two choices of the marking parameter $\theta$ (left and right).

Figure 5.
The profiles of the discretised optimal state $y_h$ (left) and adjoint state $p_h$ (right) for Example 6.2 on the adaptively refined mesh.

The right plot of Figure 8 shows the convergence of the error indicator $\eta_h((y_h,p_h),\Omega)$ and the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations. We can also observe the error reduction with slope $-1/2$.

Conclusion and outlook
In this paper we give a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by linear elliptic equations. We prove that the AFEM is a contraction, for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$, between two consecutive adaptive loops. We also show that the AFEM yields a quasi-optimal decay rate of the energy errors of the state $y$ and the adjoint state $p$ plus oscillations of the state and adjoint state equations, in terms of the number of degrees of freedom.

Figure 6. The mesh (left) after 8 adaptive iterations and the convergence history of the error estimators on adaptively refined meshes (right) with $\theta=0.5$.

Figure 7. The profiles of the discretised optimal state $y_h$ (left) and adjoint state $p_h$ (right) for Example 6.3 on the adaptively refined mesh.

We expect that the results should also be valid for optimal Neumann boundary control problems (see [27]) by the following observations. The key point of the convergence analysis is the equivalence property presented in Theorem 3.3, where the relation between the finite element optimal control approximation and the standard finite element boundary value approximation is established. Consider the governing equation of the Neumann boundary control problem:
$$Ly=f\ \ \text{in}\ \Omega,\qquad A\nabla y\cdot n=u\ \ \text{on}\ \partial\Omega.$$
Similar to the proof of Theorem 3.3, we can conclude from the trace theorem that
$$\|u-u_h\|_{0,\partial\Omega}\lesssim \kappa(h)\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
where $u_h$ is the discrete optimal control. Then we can obtain the counterpart of (3.20)-(3.21) for Neumann boundary control problems,
$$\|y-y_h\|_{a,\Omega}=\|y^h-y_h\|_{a,\Omega}+O(\kappa(h))\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
$$\|p-p_h\|_{a,\Omega}=\|p^h-p_h\|_{a,\Omega}+O(\kappa(h))\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
provided $h\ll 1$. Thus, the convergence and complexity analysis of AFEM carries over to Neumann boundary control problems.

Figure 8. The mesh (left) after 10 adaptive iterations and the convergence history of the error estimators on adaptively refined meshes (right).

There are many important issues that remain unsolved in the convergence analysis of AFEM for optimal control problems, compared with AFEM for boundary value problems. Firstly, at this moment we only prove the optimality of AFEM for the energy errors of the state and adjoint state variables; the convergence for the optimal control $u$ is sub-optimal. To prove the optimality of AFEM for the optimal control $u$ it seems that we should work on the optimality of AFEM for boundary value problems under $L^2$-norms, as done in [12]. This complicates the convergence analysis with additional restrictions on the adaptive algorithms, and is postponed to future work.

Secondly, the convergence analysis of the adaptive finite element algorithm for other kinds of optimal control problems, such as Stokes control problems (see [28]), and for non-standard finite element algorithms such as mixed finite element methods (see [8]), remains open and will be addressed in forthcoming papers.

Thirdly, we only prove the convergence of AFEM for optimal control problems with control constraints using the variational control discretization. The full control discretization concept, using piecewise constant or piecewise linear finite elements, is also very important among the numerical methods for control problems. This kind of control discretization results in an additional discretised control space and an additional contribution to the a posteriori error estimators (see [22]), which should be incorporated into the adaptive algorithm and the corresponding convergence analysis. We intend to generalise the approach of this paper to the convergence analysis of AFEM for optimal control problems with full control discretization in the future.

Acknowledgements
The first author was supported by the National Basic Research Program of China under grant 2012CB821204 and the National Natural Science Foundation of China under grant 11201464. The second author acknowledges the support of the National Natural Science Foundation of China under grant 11171337.

References

[1] T. Apel, A. Rösch and G. Winkler, Optimal control in non-convex domains: a priori discretization error estimates, Calcolo, 44 (2007), pp. 137-158.
[2] I. Babuška and W.C. Rheinboldt, Error estimates for adaptive finite element computations, SIAM J. Numer. Anal., 15 (1978), pp. 736-754.
[3] R. Becker, H. Kapp and R. Rannacher, Adaptive finite element methods for optimal control of partial differential equations: Basic concept, SIAM J. Control Optim., 39 (2000), pp. 113-132.
[4] R. Becker and S.P. Mao, Quasi-optimality of an adaptive finite element method for an optimal control problem, Comput. Methods Appl. Math., 11 (2011), pp. 107-128.
[5] M. Bergounioux, K. Ito and K. Kunisch, Primal-dual strategy for constrained optimal control problems, SIAM J. Control Optim., 37 (1999), pp. 1176-1194.
[6] P. Binev, W. Dahmen and R. DeVore, Adaptive finite element methods with convergence rates, Numer. Math., 97 (2004), pp. 219-268.
[7] J.M. Cascon, C. Kreuzer, R.H. Nochetto and K.G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM J. Numer. Anal., 46 (2008), no. 5, pp. 2524-2550.
[8] Y.P. Chen and W.B. Liu, A posteriori error estimates for mixed finite element solutions of convex optimal control problems, J. Comput. Appl. Math., 211 (2008), no. 1, pp. 76-89.
[9] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[10] X.Y. Dai, L.H. He and A.H. Zhou, Convergence and quasi-optimal complexity of adaptive finite element computations for multiple eigenvalues, IMA J. Numer. Anal., 2014, DOI:10.1093/imanum/dru059.
[11] X.Y. Dai, J.C. Xu and A.H. Zhou, Convergence and optimal complexity of adaptive finite element eigenvalue computations, Numer. Math., 110 (2008), pp. 313-355.
[12] A. Demlow and R. Stevenson, Convergence and quasi-optimality of an adaptive finite element method for controlling L2 errors, Numer. Math., 117 (2011), no. 2, pp. 185-218.
[13] W. Dörfler, A convergent adaptive algorithm for Poisson's equation, SIAM J. Numer. Anal., 33 (1996), pp. 1106-1124.
[14] A. Gaevskaya, R.H.W. Hoppe, Y. Iliash and M. Kieweg, Convergence analysis of an adaptive finite element method for distributed control problems with control constraints, in Control of Coupled Partial Differential Equations, vol. 155 of Internat. Ser. Numer. Math., Birkhäuser, Basel, 2007, pp. 47-68.
[15] P. Grisvard, Singularities in Boundary Value Problems, Masson, Paris, and Springer-Verlag, Berlin, 1992.
[16] L.H. He and A.H. Zhou, Convergence and complexity of adaptive finite element methods for elliptic partial differential equations, Inter. J. Numer. Anal. Model., 8 (2011), no. 4, pp. 615-640.
[17] M. Hintermüller and R.H.W. Hoppe, Goal-oriented adaptivity in control constrained optimal control of partial differential equations, SIAM J. Control Optim., 47 (2008), no. 4, pp. 1721-1743.
[18] M. Hintermüller, R.H.W. Hoppe, Y. Iliash and M. Kieweg, An a posteriori error analysis of adaptive finite element methods for distributed elliptic control problems with control constraints, ESAIM: Control Optim. Calc. Var., 14 (2008), pp. 540-560.
[19] M. Hintermüller, K. Ito and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method, SIAM J. Optim., 13 (2003), pp. 865-888.
[20] M. Hinze, A variational discretization concept in control constrained optimization: The linear-quadratic case, Comput. Optim. Appl., 30 (2005), pp. 45-63.
[21] M. Hinze, R. Pinnau, M. Ulbrich and S. Ulbrich, Optimization with PDE Constraints, Math. Model. Theo. Appl., 23, Springer, New York, 2009.
[22] K. Kohls, A. Rösch and K.G. Siebert, A posteriori error analysis of optimal control problems with control constraints, SIAM J. Control Optim., 52 (2014), pp. 1832-1861.
[23] K. Kohls, A. Rösch and K.G. Siebert, Convergence of adaptive finite elements for control constrained optimal control problems, Preprint-Nr.: SPP1253-153, 2013.
[24] R. Li, W.B. Liu, H.P. Ma and T. Tang, Adaptive finite element approximation for distributed elliptic optimal control problems, SIAM J. Control Optim., 41 (2002), no. 5, pp. 1321-1349.
[25] J.L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, Berlin, 1971.
[26] W.B. Liu and N.N. Yan, A posteriori error analysis for convex distributed optimal control problems, Adv. Comp. Math., 15 (2001), no. 1-4, pp. 285-309.
[27] W.B. Liu and N.N. Yan, A posteriori error estimates for convex boundary control problems, SIAM J. Numer. Anal., 39 (2001), no. 1, pp. 73-99.
[28] W.B. Liu and N.N. Yan, A posteriori error estimates for optimal problems governed by Stokes equations, SIAM J. Numer. Anal., 40 (2003), pp. 1850-1869.
[29] W.B. Liu and N.N. Yan, A posteriori error estimates for optimal control problems governed by parabolic equations, Numer. Math., 93 (2003), pp. 497-521.
[30] W.B. Liu and N.N. Yan, Adaptive Finite Element Methods for Optimal Control Governed by PDEs, Science Press, Beijing, 2008.
[31] K. Mekchay and R.H. Nochetto, Convergence of adaptive finite element methods for general second order linear elliptic PDEs, SIAM J. Numer. Anal., 43 (2005), pp. 1803-1827.
[32] P. Morin, R.H. Nochetto and K.G. Siebert, Data oscillation and convergence of adaptive FEM, SIAM J. Numer. Anal., 38 (2000), pp. 466-488.
[33] P. Morin, R.H. Nochetto and K.G. Siebert, Convergence of adaptive finite element methods, SIAM Rev., 44 (2002), pp. 631-658.
[34] R. Stevenson, Optimality of a standard adaptive finite element method, Found. Comput. Math., 7 (2007), pp. 245-269.
[35] R. Stevenson, The completion of locally refined simplicial partitions created by bisection, Math. Comput., 77 (2008), pp. 227-241.
[36] R. Verfürth, A Review of a Posteriori Error Estimates and Adaptive Mesh Refinement Techniques, Wiley-Teubner, New York, 1996.