Adaptive finite element method for elliptic optimal control problems: convergence and optimality
WEI GONG∗ AND NINGNING YAN⋄

Abstract:
In this paper we consider the convergence analysis of the adaptive finite element method for elliptic optimal control problems with pointwise control constraints. We use the variational discretization concept to discretize the control variable and piecewise linear and continuous finite elements to approximate the state variable. Based on the well-established convergence theory of AFEM for elliptic boundary value problems, we rigorously prove the convergence and quasi-optimality of AFEM for optimal control problems with respect to the state and adjoint state variables, by using the so-called perturbation argument. Numerical experiments confirm our theoretical analysis.
Keywords: optimal control problem, elliptic equation, control constraint, adaptive finite element method, convergence and optimality
Subject Classification:
1. Introduction
The adaptive finite element method (AFEM for short), which goes back to the pioneering work of Babuška and Rheinboldt ([2]), has nowadays become a popular approach in the engineering and scientific computing communities. It aims at distributing more mesh nodes around the areas where singularities occur, in order to save computational cost. Various types of reliable and efficient a posteriori error estimators, which are used to detect the location of singularities and are essential for the success of AFEM, have been developed over the last decades for different kinds of problems; we refer to [36] for an overview.

Although AFEM has been successfully applied for more than three decades, its convergence analysis is rather recent: it started with Dörfler [13] and was further studied in [6, 32, 33, 31, 7]. Besides convergence, optimality is another important issue in AFEM, which was first addressed by Binev et al. [6] and further studied by Stevenson ([34, 35]). The so-called Dörfler marking proposed in [13] and the quasi-error introduced in [7], consisting of the sum of the energy error and the scaled estimator, are crucial to prove the contraction of the errors and the quasi-optimal cardinality of the standard AFEM, which avoids marking for oscillation ([13]) and circumvents the interior node property of mesh refinement ([32, 33]).

AFEM has also found successful application in optimal control problems governed by partial differential equations, starting from Liu and Yan [26] and Becker, Kapp and Rannacher [3]. In [3] the authors proposed a dual-weighted goal-oriented adaptivity for optimal control problems, while in [26] residual-type a posteriori error estimates were derived. We refer to [17, 18, 24, 27, 28, 29, 30] for more details of recent advances.
Recently, Kohls, Rösch and Siebert derived in [22] an error equivalence property which enables one to derive reliable and efficient a posteriori error estimators for optimal control problems with either variational discretization or full control discretization.

There also exist some attempts to prove the convergence of AFEM for optimal control problems. In [14] the authors considered the piecewise constant approximation of the control variable and gave an error reduction property for the quadruplet $(u, y, p, \sigma)$, where $u$, $y$, $p$ denote the optimal control, state and adjoint state variables and $\sigma$ the associated co-control variable. However, additional requirements on the strict complementarity of the continuous problem and the non-degeneracy property of the discrete control problem are assumed, and the marking strategy is extended to include the discrete free boundary between the active and inactive control sets. In [4] the authors viewed the control problem as a nonlinear elliptic system in the state and adjoint variables and gave a convergence proof for an adaptive algorithm involving the marking of data oscillation. In [23] the authors proved that the sequence of adaptively generated discrete solutions converges to the true solutions of optimal control problems, but obtained only the plain convergence of the adaptive algorithm, without convergence rate and optimality. In this paper we intend to give a rigorous convergence proof for the adaptive finite element algorithm of an elliptic optimal control problem in an optimal control framework.

Date: April 16, 2018.
∗ LSEC, Institute of Computational Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China. Email: [email protected].
⋄ NCMIS, LSEC, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China. Email: [email protected].
We want to stress that the AFEM adopted in the current paper uses Dörfler's marking ([13]) and is a standard algorithm in the sense that it employs only the error indicators and does not use the oscillation indicators.

Inspired by the work [11] of Dai, Xu and Zhou, where the convergence and optimality of AFEM for an elliptic eigenvalue problem are proved by exploiting a certain relationship between the finite element eigenvalue approximation and the associated finite element boundary value approximation, in this paper we provide a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by a linear elliptic equation. Under a mild assumption on the initial mesh from which the adaptive algorithm starts, we show that the energy norm errors of the state and adjoint state variables are equivalent to those of the boundary value approximations of the state and adjoint state equations up to a higher order term. Then, based on the well-known convergence results of AFEM for elliptic boundary value problems, we are able to prove the convergence of AFEM for the optimal control problems (OCPs for short). To be more specific, the AFEM for OCPs is a contraction, for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$, between two consecutive adaptive loops. We also show that the AFEM yields a decay rate of the energy errors of the state $y$ and the adjoint state $p$ plus oscillations of the state and adjoint state equations, in terms of the number of degrees of freedom. This result is an improvement over the plain convergence result presented in [23].

The rest of the paper is organised as follows. In Section 2 we recall some well-known results on the convergence analysis of AFEM for elliptic boundary value problems. In Section 3 we introduce the finite element approximation of the optimal control problems and derive a posteriori error estimates.
The adaptive finite element algorithm for the optimal control problems based on Dörfler's marking is also presented. In Section 4 we give a rigorous convergence analysis of the AFEM for optimal control problems, and the quasi-optimal cardinality is proved in Section 5. Numerical experiments are carried out in Section 6 to validate our theoretical results. Finally, we give a conclusion in Section 7 and outline possible extensions and future work.

Let $\Omega \subset \mathbb{R}^d$ ($d = 2, 3$) be a bounded polygonal or polyhedral domain. We denote by $W^{m,q}(\Omega)$ the usual Sobolev space of order $m \geq 0$, $1 \leq q \leq \infty$, with norm $\|\cdot\|_{m,q,\Omega}$ and seminorm $|\cdot|_{m,q,\Omega}$. For $q = 2$ we denote $W^{m,2}(\Omega)$ by $H^m(\Omega)$ with $\|\cdot\|_{m,\Omega} = \|\cdot\|_{m,2,\Omega}$, which is a Hilbert space. Note that $H^0(\Omega) = L^2(\Omega)$ and $H_0^1(\Omega) = \{v \in H^1(\Omega) : v = 0 \text{ on } \partial\Omega\}$. We denote by $C$ a generic positive constant which may stand for different values at its different occurrences but does not depend on the mesh size. We use the symbol $A \lesssim B$ to denote $A \leq CB$ for some constant $C$ that is independent of the mesh size.

2. Preliminaries
In this section, we recall some well-known results on the adaptive finite element approximation of a linear elliptic boundary value problem, which are then used for the convergence analysis of AFEM for optimal control problems. Some of the results are collected from [7] and [11]; see also [16].

Consider the following second order elliptic equation:
$$
\begin{cases} Ly = f & \text{in } \Omega, \\ y = 0 & \text{on } \partial\Omega, \end{cases} \tag{2.1}
$$
where $L$ is a linear second order elliptic operator of the following form:
$$
Ly := -\sum_{i,j=1}^{d} \frac{\partial}{\partial x_j}\Big(a_{ij}\frac{\partial y}{\partial x_i}\Big) + cy.
$$
We denote by $L^*$ the adjoint operator of $L$:
$$
L^* y := -\sum_{i,j=1}^{d} \frac{\partial}{\partial x_j}\Big(a_{ji}\frac{\partial y}{\partial x_i}\Big) + cy.
$$
Here $a_{ij} \in W^{1,\infty}(\Omega)$ ($i,j = 1,\dots,d$) is symmetric and positive definite, and $0 \leq c < \infty$. We denote $A = (a_{ij})_{d\times d}$ and by $A^*$ its adjoint. Let
$$
a(y,v) = \int_\Omega \Big(\sum_{i,j=1}^{d} a_{ij}\frac{\partial y}{\partial x_i}\frac{\partial v}{\partial x_j} + cyv\Big), \quad \forall\, y, v \in H_0^1(\Omega).
$$
It is clear that $a(\cdot,\cdot)$ is a bounded bilinear form over $H_0^1(\Omega)$ and defines a norm $\|\cdot\|_{a,\Omega} = \sqrt{a(\cdot,\cdot)}$ which is equivalent to $\|\cdot\|_{1,\Omega}$.

The standard weak form of (2.1) reads as follows: Find $y \in H_0^1(\Omega)$ such that
$$
a(y,v) = (f,v) \quad \forall\, v \in H_0^1(\Omega). \tag{2.2}
$$
For each $f \in H^{-1}(\Omega)$ the above problem admits a unique solution by the well-known Lax–Milgram theorem. Since the elliptic equation (2.2) is linear with respect to the right hand side $f$, we can define a linear solution operator $S : L^2(\Omega) \to H_0^1(\Omega)$ such that $y = Sf$.

Let $\mathcal{T}_h$ be a regular triangulation of $\Omega$ such that $\bar\Omega = \cup_{T\in\mathcal{T}_h}\bar T$. We assume that $\mathcal{T}_h$ is shape regular in the sense that there exists a constant $\gamma^* > 0$ such that $\frac{h_T}{\rho_T} \leq \gamma^*$ for all $T \in \mathcal{T}_h$, where $h_T$ denotes the diameter of $T$ and $\rho_T$ is the diameter of the biggest ball contained in $T$. We set $h = \max_{T\in\mathcal{T}_h} h_T$. In this paper, we use $\mathcal{E}_h$ to denote the set of interior faces (edges or sides) of $\mathcal{T}_h$ and $\#\mathcal{T}_h$ to denote the number of elements of $\mathcal{T}_h$. On $\mathcal{T}_h$ we construct a family of nested finite element spaces $V_h$ consisting of piecewise linear and continuous polynomials such that $V_h \subset C(\bar\Omega) \cap H_0^1(\Omega)$.
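As a concrete illustration of the Galerkin discretization just described, the following is a minimal one-dimensional sketch (an illustration only, not from the paper, which works on 2-D/3-D triangulations): it assembles and solves the P1 finite element system for $-(a y')' + c y = f$ on $(0,1)$ with homogeneous Dirichlet boundary conditions on a uniform mesh.

```python
import numpy as np

def solve_p1(f, n, a=1.0, c=0.0):
    # P1 Galerkin approximation of -(a y')' + c y = f on (0,1),
    # y(0) = y(1) = 0, on a uniform mesh with n interior nodes (1-D sketch only).
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)                 # interior nodes
    # exact element integrals of hat functions: stiffness + mass contributions
    main = 2.0 * a / h + 2.0 * c * h / 3.0
    off = -a / h + c * h / 6.0
    A = (np.diag(np.full(n, main))
         + np.diag(np.full(n - 1, off), 1)
         + np.diag(np.full(n - 1, off), -1))
    b = h * f(x)                                   # trapezoidal-rule load vector
    return x, np.linalg.solve(A, b)

# smooth test problem with known solution y = sin(pi x)
x, yh = solve_p1(lambda t: np.pi**2 * np.sin(np.pi * t), 199)
err = np.max(np.abs(yh - np.sin(np.pi * x)))
```

For this smooth right-hand side the nodal error decays like $h^2$, while the energy-norm error decays like $h$, in line with $\kappa(h) = O(h)$ for $H^2$-regular solutions.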
We define the standard Galerkin projection operator $R_h : H_0^1(\Omega) \to V_h$ by ([9])
$$
a(y - R_h y, v_h) = 0 \quad \forall\, v_h \in V_h, \tag{2.3}
$$
which satisfies the following stability result:
$$
\|R_h y\|_{a,\Omega} \lesssim \|y\|_{a,\Omega} \quad \forall\, y \in H_0^1(\Omega). \tag{2.4}
$$
A standard finite element approximation of (2.2) can then be formulated as: Find $y_h \in V_h$ such that
$$
a(y_h, v_h) = (f, v_h) \quad \forall\, v_h \in V_h. \tag{2.5}
$$
Similarly, we can define a discrete solution operator $S_h : L^2(\Omega) \to V_h$ such that $y_h = S_h f$. Thus, we have $y_h = R_h y = R_h S f$.

For later use, we follow the idea of [11] and introduce the quantity $\kappa(h)$ as follows:
$$
\kappa(h) = \sup_{f \in L^2(\Omega),\, \|f\|_{0,\Omega}=1}\ \inf_{v_h \in V_h} \|Sf - v_h\|_{a,\Omega}. \tag{2.6}
$$
We note that the quantity $\kappa(h)$ is determined by the regularity of $Sf$, which is in turn influenced by the properties of the domain $\Omega$. Indeed, if the boundary of $\Omega$ is smooth, say $C^2$, the additional regularity $Sf \in H^2(\Omega)$ holds and thus $\kappa(h) = O(h)$. This is still true for polygonal or polyhedral boundaries if the domain is convex. The regularity is reduced, however, in the vicinity of nonconvex portions of polygonal or polyhedral boundaries. Grisvard proved in [15] precise regularity results (Theorem 2.4.3 for the two-dimensional case and Corollary 2.6.7 for the three-dimensional case): there exists an $\varepsilon \in (0, 1]$, which depends on the shape of the domain, such that $Sf \in H^{1+\varepsilon}(\Omega)$ for each $f \in L^2(\Omega)$. Obviously, $\kappa(h) \ll 1$ for all $h \in (0, h_0)$ if $h_0 \ll 1$.

Proposition 2.1.
For each $f \in L^2(\Omega)$, there hold
$$
\|Sf - S_h f\|_{a,\Omega} \lesssim \kappa(h)\|f\|_{0,\Omega} \tag{2.7}
$$
and
$$
\|Sf - S_h f\|_{0,\Omega} \lesssim \kappa(h)\|Sf - S_h f\|_{a,\Omega}. \tag{2.8}
$$

Now we are in a position to review the residual type a posteriori error estimator for the finite element approximation of the elliptic boundary value problem. We define the element residual $\tilde r_T(y_h)$ and the jump residual $\tilde j_E(y_h)$ by
$$
\tilde r_T(y_h) := f - Ly_h = f + \nabla\cdot(A\nabla y_h) - cy_h \quad \text{in } T \in \mathcal{T}_h, \tag{2.9}
$$
$$
\tilde j_E(y_h) := [A\nabla y_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h, \tag{2.10}
$$
where $[A\nabla y_h]_E \cdot n_E$ denotes the jump of $A\nabla y_h$ across the common side $E$ of elements $T^+$ and $T^-$, and $n_E$ denotes the outward normal oriented to $T^-$. For each element $T \in \mathcal{T}_h$, we define the local error indicator $\tilde\eta_h(y_h, T)$ by
$$
\tilde\eta_h(y_h, T) := \Big( h_T^2 \|\tilde r_T(y_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E \|\tilde j_E(y_h)\|_{0,E}^2 \Big)^{1/2}. \tag{2.11}
$$
Then on a subset $\omega \subset \Omega$, we define the error estimator $\tilde\eta_h(y_h, \omega)$ by
$$
\tilde\eta_h(y_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \tilde\eta_h^2(y_h, T) \Big)^{1/2}. \tag{2.12}
$$
Thus, $\tilde\eta_h(y_h, \Omega)$ constitutes the error estimator on $\Omega$ with respect to $\mathcal{T}_h$.

For $f \in L^2(\Omega)$ we also need to define the data oscillation as (see [32, 33])
$$
\mathrm{osc}(f, T) := \|h_T(f - \bar f_T)\|_{0,T}, \qquad \mathrm{osc}(f, \mathcal{T}_h) := \Big( \sum_{T\in\mathcal{T}_h} \mathrm{osc}^2(f, T) \Big)^{1/2}, \tag{2.13}
$$
where $\bar f_T$ denotes the $L^2$-projection of $f$ onto the piecewise constant space on $T$. It is easy to see that
$$
\mathrm{osc}(f_1 + f_2, \mathcal{T}_h) \leq \mathrm{osc}(f_1, \mathcal{T}_h) + \mathrm{osc}(f_2, \mathcal{T}_h) \quad \forall\, f_1, f_2 \in L^2(\Omega). \tag{2.14}
$$
For the above defined data oscillation we have the following lemma, whose proof can be found in [11, Lemma 2.4].

Lemma 2.2.
There exists a constant $C^*$, which depends on $A$, the mesh regularity constant $\gamma^*$ and the coefficient $c$, such that
$$
\mathrm{osc}(Lv, \mathcal{T}_h) \leq C^* \|v\|_{a,\Omega}, \qquad \mathrm{osc}(L^* v, \mathcal{T}_h) \leq C^* \|v\|_{a,\Omega} \quad \forall\, v \in V_h. \tag{2.15}
$$

Now we can formulate the following global upper and lower bounds for the a posteriori error estimators of elliptic boundary value problems (see, e.g., [13, 36]):
$$
\|y - y_h\|_{a,\Omega} \leq \tilde C_1 \tilde\eta_h(y_h, \Omega), \tag{2.16}
$$
$$
\tilde C_2 \tilde\eta_h^2(y_h, \Omega) \leq \|y - y_h\|_{a,\Omega}^2 + \tilde C_3\, \mathrm{osc}^2(f - Ly_h, \mathcal{T}_h). \tag{2.17}
$$
For our following purposes we also need to study the adjoint equation of the elliptic boundary value problem (2.1). For each $g \in L^2(\Omega)$, let $p \in H_0^1(\Omega)$ be the solution of the following adjoint equation:
$$
a(v, p) = (g, v) \quad \forall\, v \in H_0^1(\Omega), \tag{2.18}
$$
with its finite element approximation
$$
a(v_h, p_h) = (g, v_h) \quad \forall\, v_h \in V_h. \tag{2.19}
$$
We can also give the a posteriori global upper and lower error bounds:
$$
\|p - p_h\|_{a,\Omega} \leq \tilde C_1 \tilde\eta_h(p_h, \Omega), \tag{2.20}
$$
$$
\tilde C_2 \tilde\eta_h^2(p_h, \Omega) \leq \|p - p_h\|_{a,\Omega}^2 + \tilde C_3\, \mathrm{osc}^2(g - L^* p_h, \mathcal{T}_h). \tag{2.21}
$$

To analyse the adaptive finite element approximation of the optimal control problem, we introduce a system of two source problems associated with the state and adjoint state equations, which is a rather direct extension of the existing results on the adaptive finite element approximation of a scalar problem (see [7]). Specifically, we introduce an adaptive finite element algorithm to solve the system of elliptic boundary value problems (2.2) and (2.18). There are different kinds of adaptive algorithms, which differ in their marking strategies (see [31, 32, 33]). Here we follow the Dörfler marking introduced in [13], which marks according to the error estimator only and avoids the marking for oscillation:

Algorithm 2.3.
The Dörfler marking strategy for BVPs:
(1) Given a parameter $0 < \theta < 1$;
(2) Construct a minimal subset $\tilde{\mathcal{T}}_h \subset \mathcal{T}_h$ such that
$$
\sum_{T\in\tilde{\mathcal{T}}_h} \big( \tilde\eta_h^2(y_h, T) + \tilde\eta_h^2(p_h, T) \big) \geq \theta \big( \tilde\eta_h^2(y_h, \Omega) + \tilde\eta_h^2(p_h, \Omega) \big);
$$
(3) Mark all the elements in $\tilde{\mathcal{T}}_h$.

The adaptive algorithm for solving elliptic boundary value problems can then be described as follows (see [7]):
Algorithm 2.4.
Adaptive finite element method for BVPs:
(1) Given an initial mesh $\mathcal{T}_{h_0}$ with mesh size $h_0$, construct the finite element space $V_{h_0}$.
(2) Set $k = 0$ and solve (2.5) and (2.19) to obtain $(y_{h_k}, p_{h_k}) \in V_{h_k} \times V_{h_k}$.
(3) Compute the local error indicators $\tilde\eta_{h_k}(y_{h_k}, T)$ and $\tilde\eta_{h_k}(p_{h_k}, T)$ for each $T \in \mathcal{T}_{h_k}$.
(4) Construct $\tilde{\mathcal{T}}_{h_k} \subset \mathcal{T}_{h_k}$ by the marking Algorithm 2.3.
(5) Refine $\tilde{\mathcal{T}}_{h_k}$ to get a new conforming mesh $\mathcal{T}_{h_{k+1}}$.
(6) Construct the finite element space $V_{h_{k+1}}$ and solve (2.5) and (2.19) to obtain $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}} \times V_{h_{k+1}}$.
(7) Set $k = k + 1$ and go to Step (3).

We denote by $\mathbb{T}$ the class of all conforming refinements by bisection of $\mathcal{T}_{h_0}$ (see [7] for more details). Given a fixed number $b \geq 1$, for any $\mathcal{T}_{h_k} \in \mathbb{T}$ and subset $\mathcal{M}_{h_k} \subset \mathcal{T}_{h_k}$ of marked elements,
$$
\mathcal{T}_{h_{k+1}} = \mathrm{REFINE}(\mathcal{T}_{h_k}, \mathcal{M}_{h_k})
$$
outputs a conforming triangulation $\mathcal{T}_{h_{k+1}} \in \mathbb{T}$, where at least all elements of $\mathcal{M}_{h_k}$ are bisected $b$ times. We define $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}} := \mathcal{T}_{h_k} \setminus (\mathcal{T}_{h_k} \cap \mathcal{T}_{h_{k+1}})$ as the set of refined elements; it satisfies $\mathcal{M}_{h_k} \subset \mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$. Then we can formulate the following standard result on the complexity of refinement; see [7, Lemma 2.3] and [35] for more details.

Lemma 2.5.
Assume that $\mathcal{T}_{h_0}$ verifies condition (b) of Section 4 in [35]. Let $\mathcal{T}_{h_k}$ ($k \geq 1$) be a sequence of conforming and nested triangulations of $\Omega$ generated by REFINE starting from the initial mesh $\mathcal{T}_{h_0}$. Assume that $\mathcal{T}_{h_{k+1}}$ is generated from $\mathcal{T}_{h_k}$ by $\mathcal{T}_{h_{k+1}} = \mathrm{REFINE}(\mathcal{T}_{h_k}, \mathcal{M}_{h_k})$ with a subset $\mathcal{M}_{h_k} \subset \mathcal{T}_{h_k}$. Then there exists a constant $\hat C$ depending on $\mathcal{T}_{h_0}$ and $b$ such that
$$
\#\mathcal{T}_{h_{k+1}} - \#\mathcal{T}_{h_0} \leq \hat C \sum_{i=0}^{k} \#\mathcal{M}_{h_i} \quad \forall\, k \geq 1. \tag{2.22}
$$

We define
$$
\|(y, p)\|_a^2 = a(y, y) + a(p, p).
$$
The convergence of Algorithm 2.4 based on the marking Algorithm 2.3 is proven in [7] and has now become a standard theory for the convergence analysis of AFEM for different kinds of boundary value problems. The following Theorem 2.6, Lemma 2.7 and Lemma 2.8 are extensions, by some elementary manipulations, of the corresponding results for a single elliptic equation in [7]. We remark that in [10] the authors used a similar idea to prove the convergence of adaptive finite element computations for multiple eigenvalues.

Theorem 2.6.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k} \times V_{h_k}$ be a sequence of finite element solutions of problems (2.2) and (2.18) based on the adaptively refined meshes $\mathcal{T}_{h_k}$ produced by Algorithm 2.4. Then there exist constants $\tilde\gamma > 0$ and $\tilde\beta \in (0, 1)$, depending only on the shape regularity of the meshes, the data and the parameters used in Algorithm 2.4, such that for any two consecutive iterates $k$ and $k+1$ we have
$$
\|(y - y_{h_{k+1}}, p - p_{h_{k+1}})\|_a^2 + \tilde\gamma \big( \tilde\eta_{h_{k+1}}^2(y_{h_{k+1}}, \Omega) + \tilde\eta_{h_{k+1}}^2(p_{h_{k+1}}, \Omega) \big)
\leq \tilde\beta^2 \Big( \|(y - y_{h_k}, p - p_{h_k})\|_a^2 + \tilde\gamma \big( \tilde\eta_{h_k}^2(y_{h_k}, \Omega) + \tilde\eta_{h_k}^2(p_{h_k}, \Omega) \big) \Big). \tag{2.23}
$$
Here
$$
\tilde\gamma := \frac{1}{(1 + \delta^{-1})\, C^{*2}} \tag{2.24}
$$
with some constant $\delta \in (0, 1)$.

To prove the optimal complexity of the adaptive algorithm we need further results. The following lemma presents a localised upper bound estimate for the distance between two nested solutions of the elliptic boundary value problems (2.2) and (2.18) (see [7, Lemma 3.6] and [11, Lemma 6.2]).
Lemma 2.7.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k}\times V_{h_k}$ and $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}}\times V_{h_{k+1}}$ be the discrete solutions of problems (2.2) and (2.18) over a mesh $\mathcal{T}_{h_k}$ and its refinement $\mathcal{T}_{h_{k+1}}$ with marked elements $\mathcal{M}_{h_k}\subset\mathcal{T}_{h_k}$. Let $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$ be the set of refined elements. Then the following localised upper bound is valid:
$$
\|(y_{h_k} - y_{h_{k+1}}, p_{h_k} - p_{h_{k+1}})\|_a^2 \leq \tilde C_1 \sum_{T\in\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big). \tag{2.25}
$$

Consequently, we can show the optimality of the Dörfler marking strategy in the following lemma (see [7, Lemma 5.9] and [11, Proposition 6.3] for the proof).

Lemma 2.8.
Let $(y_{h_k}, p_{h_k}) \in V_{h_k}\times V_{h_k}$ and $(y_{h_{k+1}}, p_{h_{k+1}}) \in V_{h_{k+1}}\times V_{h_{k+1}}$ be the discrete solutions of problems (2.2) and (2.18) over a mesh $\mathcal{T}_{h_k}$ and its refinement $\mathcal{T}_{h_{k+1}}$ with marked elements $\mathcal{M}_{h_k}\subset\mathcal{T}_{h_k}$. Suppose that they satisfy the energy decrease property
$$
\|(y - y_{h_{k+1}}, p - p_{h_{k+1}})\|_a^2 + \tilde\gamma \big( \mathrm{osc}^2(f - Ly_{h_{k+1}}, \mathcal{T}_{h_{k+1}}) + \mathrm{osc}^2(g - L^* p_{h_{k+1}}, \mathcal{T}_{h_{k+1}}) \big)
\leq \tilde\beta^2 \Big( \|(y - y_{h_k}, p - p_{h_k})\|_a^2 + \tilde\gamma \big( \mathrm{osc}^2(f - Ly_{h_k}, \mathcal{T}_{h_k}) + \mathrm{osc}^2(g - L^* p_{h_k}, \mathcal{T}_{h_k}) \big) \Big) \tag{2.26}
$$
with a constant $\tilde\gamma > 0$ and $\tilde\beta \in (0, \frac{1}{\sqrt{2}})$. Then the set $\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}$ of refined elements satisfies the Dörfler property
$$
\sum_{T\in\mathcal{R}_{\mathcal{T}_{h_k}\to\mathcal{T}_{h_{k+1}}}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big) \geq \tilde\theta \sum_{T\in\mathcal{T}_{h_k}} \big( \tilde\eta_{h_k}^2(y_{h_k}, T) + \tilde\eta_{h_k}^2(p_{h_k}, T) \big) \tag{2.27}
$$
with $\tilde\theta = \frac{\tilde C_2 (1 - 2\tilde\beta^2)}{\tilde C_0 \big( \tilde C_1 + (1 + 2 C^{*2}\tilde C_1)\tilde\gamma \big)}$, where $\tilde C_0 = \max(1, \tilde C_3 / \tilde\gamma)$.

3. Adaptive finite element method for optimal control problem
In this section we consider the following elliptic optimal control problem:
$$
\min_{u\in U_{ad}} J(y, u) = \frac{1}{2}\|y - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u\|_{0,\Omega}^2 \tag{3.1}
$$
subject to
$$
\begin{cases} Ly = u & \text{in } \Omega, \\ y = 0 & \text{on } \partial\Omega, \end{cases} \tag{3.2}
$$
where $\alpha > 0$ and $U_{ad}$ is the admissible control set with bilateral control constraints:
$$
U_{ad} := \big\{ u \in L^2(\Omega) : a \leq u \leq b \ \text{a.e. in } \Omega \big\},
$$
where $a, b \in \mathbb{R}$ and $a < b$.

Remark 3.1.
We remark that all the theories presented below can be generalised to the case where the control acts on a subdomain $\omega \subset \Omega$. In this case the governing equation reads $Ly = Bu$ with the control operator $B : L^2(\omega) \to L^2(\Omega)$ an extension-by-zero operator from $\omega$ to $\Omega$.

With the solution operator $S$ of the elliptic equation (3.2) introduced in the last section, we can formulate a reduced optimization problem:
$$
\min_{u\in U_{ad}} \hat J(u) := J(Su, u) = \frac{1}{2}\|Su - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u\|_{0,\Omega}^2. \tag{3.3}
$$
Since the above optimization problem is linear and strictly convex, there exists a unique solution $u \in U_{ad}$ by a standard argument (see [25]). Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
$$
\hat J'(u)(v - u) = (\alpha u + S^*(Su - y_d), v - u) \geq 0 \quad \forall\, v \in U_{ad}, \tag{3.4}
$$
where $S^*$ is the adjoint of $S$ ([21]). Introducing the adjoint state $p := S^*(Su - y_d) \in H_0^1(\Omega)$, we are led to the following optimality system:
$$
\begin{cases}
a(y, v) = (u, v) & \forall\, v \in H_0^1(\Omega), \\
a(w, p) = (y - y_d, w) & \forall\, w \in H_0^1(\Omega), \\
(\alpha u + p, v - u) \geq 0 & \forall\, v \in U_{ad}.
\end{cases} \tag{3.5}
$$
Hereafter, we call $u$, $y$ and $p$ the optimal control, state and adjoint state, respectively. From the last inequality of (3.5) we have the pointwise representation of $u$ (see [25]):
$$
u(x) = P_{[a,b]}\Big\{ -\frac{1}{\alpha}\, p(x) \Big\}, \tag{3.6}
$$
where $P_{[a,b]}$ is the orthogonal projection operator from $L^2(\Omega)$ onto $U_{ad}$.

Next, let us consider the finite element approximation of (3.1)-(3.2). In this paper, we use piecewise linear finite elements to approximate the state $y$, and variational discretization for the optimal control $u$ (see [20]). Based on the finite element space $V_h$, we can define the finite dimensional approximation of the optimal control problem (3.1)-(3.2) as follows: Find $(u_h, y_h) \in U_{ad} \times V_h$ such that
$$
\min_{u_h\in U_{ad}} J_h(y_h, u_h) = \frac{1}{2}\|y_h - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u_h\|_{0,\Omega}^2 \tag{3.7}
$$
subject to
$$
a(y_h, v_h) = (u_h, v_h) \quad \forall\, v_h \in V_h. \tag{3.8}
$$
Similar to the continuous case we have $y_h = S_h u_h$.
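The projection in formula (3.6) is a pointwise clipping of the scaled adjoint state; a minimal numerical sketch (with illustrative sample values, not taken from the paper) is:

```python
import numpy as np

def project_control(p_vals, alpha, a, b):
    # pointwise projection P_[a,b]{ -p/alpha } as in (3.6):
    # scale the adjoint state and clip to the control bounds [a, b]
    return np.clip(-p_vals / alpha, a, b)

# adjoint-state values at some sample points (illustrative only)
p_vals = np.array([-3.0, -0.5, 0.2, 2.5])
u_vals = project_control(p_vals, alpha=1.0, a=-1.0, b=1.0)
# -> [1.0, 0.5, -0.2, -1.0]: interior values pass through, the rest are clipped
```

With variational discretization the same formula is applied to the discrete adjoint state, which is why the resulting control is in general not a piecewise linear finite element function.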
With this notation we can formulate a reduced discrete optimization problem:
$$
\min_{u_h\in U_{ad}} \hat J_h(u_h) := J_h(S_h u_h, u_h) = \frac{1}{2}\|S_h u_h - y_d\|_{0,\Omega}^2 + \frac{\alpha}{2}\|u_h\|_{0,\Omega}^2. \tag{3.9}
$$
We note that the above optimization problem can be solved by the projected gradient method or a semi-smooth Newton method; see [19], [21] and [30] for more details.

Similar to the continuous problem (3.1)-(3.2), the above discretized optimization problem also admits a unique solution $u_h \in U_{ad}$. Moreover, the first order necessary and sufficient optimality condition can be stated as follows:
$$
\hat J_h'(u_h)(v_h - u_h) = (\alpha u_h + S_h^*(S_h u_h - y_d), v_h - u_h) \geq 0 \quad \forall\, v_h \in U_{ad}, \tag{3.10}
$$
where $S_h^*$ is the adjoint of $S_h$. Introducing the discrete adjoint state $p_h := S_h^*(S_h u_h - y_d) \in V_h$, the discretized first order necessary and sufficient optimality condition is equivalent to:
$$
\begin{cases}
a(y_h, v_h) = (u_h, v_h) & \forall\, v_h \in V_h, \\
a(w_h, p_h) = (y_h - y_d, w_h) & \forall\, w_h \in V_h, \\
(\alpha u_h + p_h, v_h - u_h) \geq 0 & \forall\, v_h \in U_{ad}.
\end{cases} \tag{3.11}
$$
Hereafter, we call $u_h$, $y_h$ and $p_h$ the discrete optimal control, state and adjoint state, respectively. Similar to the continuous case (3.6) we have
$$
u_h(x) = P_{[a,b]}\Big\{ -\frac{1}{\alpha}\, p_h(x) \Big\}. \tag{3.12}
$$
It should be noticed that $u_h$ is in general not a finite element function in $V_h$.

For convenience we define $y^h := S u_h$ and $p^h := S^*(S_h u_h - y_d)$. It is obvious that $y_h$ and $p_h$ are the standard Galerkin projections of $y^h$ and $p^h$, i.e., $y_h = R_h y^h$ and $p_h = R_h p^h$. The following equivalence property is established in [22].

Theorem 3.2.
Let $(u, y, p) \in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h, y_h, p_h) \in U_{ad}\times V_h\times V_h$ be the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then the following equivalence property holds:
$$
\|u - u_h\|_{0,\Omega} + \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \approx \|y^h - y_h\|_{a,\Omega} + \|p^h - p_h\|_{a,\Omega}. \tag{3.13}
$$

Proof.
For completeness we include a brief proof. Setting $v = u_h$ in (3.4) and $v_h = u$ in (3.10), we are led to
$$
(\alpha u + S^*(Su - y_d), u_h - u) \geq 0, \tag{3.14}
$$
$$
(\alpha u_h + S_h^*(S_h u_h - y_d), u - u_h) \geq 0. \tag{3.15}
$$
Adding the above two inequalities, we conclude from (3.5) and (3.11) that
$$
\begin{aligned}
\alpha\|u - u_h\|_{0,\Omega}^2 &\leq (S_h^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S_h u_h - Su, Su - Su_h) \\
&= (S_h^*(S_h u_h - y_d) - S^*(S_h u_h - y_d), u - u_h) + (S_h u_h - Su, Su - S_h u_h) + (S_h u_h - Su, S_h u_h - Su_h).
\end{aligned} \tag{3.16}
$$
It follows from the $\varepsilon$-Young inequality that
$$
\alpha\|u - u_h\|_{0,\Omega}^2 \leq C\|Su_h - S_h u_h\|_{a,\Omega}^2 + C\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega}^2. \tag{3.17}
$$
Moreover, we have
$$
\|y - y_h\|_{a,\Omega} \leq \|y - Su_h\|_{a,\Omega} + \|Su_h - y_h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega} + \|Su_h - y_h\|_{a,\Omega}
$$
and
$$
\begin{aligned}
\|p - p_h\|_{a,\Omega} &\leq \|p - S^*(S_h u_h - y_d)\|_{a,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} \\
&\leq C\|Su - S_h u_h\|_{0,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} \\
&\leq C\|u - u_h\|_{0,\Omega} + \|S^*(S_h u_h - y_d) - p_h\|_{a,\Omega} + C\|Su_h - y_h\|_{a,\Omega}.
\end{aligned}
$$
Combining the above estimates we prove the upper bound.

Now we prove the lower bound. Note that
$$
\|Su_h - S_h u_h\|_{a,\Omega} \leq \|Su_h - Su\|_{a,\Omega} + \|Su - S_h u_h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega} + \|y - y_h\|_{a,\Omega}. \tag{3.18}
$$
Similarly, we can derive that
$$
\begin{aligned}
\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} &\leq \|S^*(S_h u_h - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S^*(Su - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} \\
&\leq C\|S_h u_h - Su\|_{0,\Omega} + \|p - p_h\|_{a,\Omega} \\
&\leq C\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega} + \|p - p_h\|_{a,\Omega}.
\end{aligned} \tag{3.19}
$$
Thus, we conclude from the above estimates the lower bound. This completes the proof.
□

Next, we will prove a compact equivalence property which exhibits a certain relationship between the finite element optimal control approximation and the associated finite element boundary value approximation.
Theorem 3.3.
Let $h \in (0, h_0)$, and let $(u, y, p) \in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h, y_h, p_h) \in U_{ad}\times V_h\times V_h$ be the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then, provided $h_0 \ll 1$, the following equivalence properties hold:
$$
\|y - y_h\|_{a,\Omega} = \|y^h - y_h\|_{a,\Omega} + O(\kappa(h))\big( \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \big), \tag{3.20}
$$
$$
\|p - p_h\|_{a,\Omega} = \|p^h - p_h\|_{a,\Omega} + O(\kappa(h))\big( \|y - y_h\|_{a,\Omega} + \|p - p_h\|_{a,\Omega} \big). \tag{3.21}
$$

Proof. It is obvious that
$$
y - y_h = y^h - y_h + y - y^h, \qquad p - p_h = p^h - p_h + p - p^h. \tag{3.22}
$$
Moreover, it follows from the stability results for the elliptic equation that
$$
\|y - y^h\|_{a,\Omega} \leq C\|u - u_h\|_{0,\Omega}, \qquad \|p - p^h\|_{a,\Omega} \leq C\|y - y_h\|_{0,\Omega}. \tag{3.23}
$$
In the following we estimate $\|y - y_h\|_{0,\Omega}$. Let $\psi \in H_0^1(\Omega)$ be the solution of the following auxiliary problem:
$$
\begin{cases} L^*\psi = y - y_h & \text{in } \Omega, \\ \psi = 0 & \text{on } \partial\Omega. \end{cases} \tag{3.24}
$$
Let $\psi_h \in V_h$ be the finite element approximation of $\psi$. Then we can conclude from (2.7) and the standard duality argument (see, e.g., [9]) that
$$
\begin{aligned}
\|y - y_h\|_{0,\Omega}^2 &= a(y - y_h, \psi) \\
&= a(y - y_h, \psi - \psi_h) + a(y - y_h, \psi_h) \\
&= a(y - y_h, \psi - \psi_h) + (u - u_h, \psi_h - \psi) + (u - u_h, \psi) \\
&\leq C\big( \kappa(h)\|y - y_h\|_{a,\Omega} + \|u - u_h\|_{0,\Omega} \big)\|y - y_h\|_{0,\Omega},
\end{aligned}
$$
which in turn implies
$$
\|y - y_h\|_{0,\Omega} \leq C\kappa(h)\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega}. \tag{3.25}
$$
Considering (3.23) we have
$$
\|p - p^h\|_{a,\Omega} \leq C\kappa(h)\|y - y_h\|_{a,\Omega} + C\|u - u_h\|_{0,\Omega}. \tag{3.26}
$$
It remains to estimate $\|u - u_h\|_{0,\Omega}$. Note that it follows from (3.14) and (3.15) that
$$
\begin{aligned}
\alpha\|u - u_h\|_{0,\Omega}^2 &\leq (S_h^*(S_h u_h - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h^*(S_h u_h - y_d) - S_h^*(S_h u - y_d), u - u_h) + (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h) \\
&= (S_h(u_h - u), S_h(u - u_h)) + (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h) \\
&\leq (S_h^*(S_h u - y_d) - S^*(Su - y_d), u - u_h),
\end{aligned}
$$
which yields
$$
\|u - u_h\|_{0,\Omega} \leq C\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}. \tag{3.27}
$$
Let $\phi \in H_0^1(\Omega)$ be the solution of the following auxiliary problem:
$$
\begin{cases} L\phi = S_h^*(S_h u - y_d) - S^*(Su - y_d) & \text{in } \Omega, \\ \phi = 0 & \text{on } \partial\Omega. \end{cases} \tag{3.28}
$$
Then from the standard duality argument we have
$$
\begin{aligned}
\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}^2 &= a(\phi, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + a(\phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + (\phi_h, S_h u - Su) \\
&= a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) + (\phi_h - \phi, S_h u - Su) + (\phi, S_h u - Su),
\end{aligned} \tag{3.29}
$$
where $\phi_h \in V_h$ is the finite element approximation of $\phi$. We can conclude from (2.7)-(2.8) that
$$
a(\phi - \phi_h, S_h^*(S_h u - y_d) - S^*(Su - y_d)) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} \tag{3.30}
$$
and
$$
(\phi_h - \phi, S_h u - Su) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h u - Su\|_{a,\Omega}, \tag{3.31}
$$
$$
(\phi, S_h u - Su) \leq C\kappa(h)\,\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega}\,\|S_h u - Su\|_{a,\Omega}. \tag{3.32}
$$
Then we are able to derive that
$$
\|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{0,\Omega} \leq C\kappa(h)\big( \|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big). \tag{3.33}
$$
Combining (3.27) and (3.33) we are led to
$$
\begin{aligned}
\|u - u_h\|_{0,\Omega} &\lesssim \kappa(h)\big( \|S_h^*(S_h u - y_d) - S^*(Su - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h^*(S_h u - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h u - S_h u_h\|_{a,\Omega} + \|S_h u - Su\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|S_h u_h - Su\|_{a,\Omega} + \|S_h u - S_h u_h\|_{a,\Omega} \big) \\
&\lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|y - y_h\|_{a,\Omega} + \|u - u_h\|_{0,\Omega} \big). \tag{3.34}
\end{aligned}
$$
If $h_0 \ll 1$, then $\kappa(h) \ll 1$ for $h \in (0, h_0)$, and we arrive at
$$
\|u - u_h\|_{0,\Omega} \lesssim \kappa(h)\big( \|p - p_h\|_{a,\Omega} + \|y - y_h\|_{a,\Omega} \big). \tag{3.35}
$$
Inserting the above estimate into (3.23) and (3.26), we conclude from (3.22) the desired results (3.20)-(3.21).
This completes the proof. □

Now we are in a position to consider the adaptive finite element method for the optimal control problem (3.1)-(3.2). First we derive a posteriori error estimates for the above optimal control problem. To begin with, we introduce some notation. Similar to the definitions (2.9) and (2.10), we define the element residuals $r_{y,T}(y_h)$, $r_{p,T}(p_h)$ and the jump residuals $j_{y,E}(y_h)$, $j_{p,E}(p_h)$ by
$$
r_{y,T}(y_h) := u_h - Ly_h = u_h + \nabla\cdot(A\nabla y_h) - cy_h \quad \text{in } T \in \mathcal{T}_h, \tag{3.36}
$$
$$
r_{p,T}(p_h) := y_h - y_d - L^* p_h = y_h - y_d + \nabla\cdot(A^*\nabla p_h) - cp_h \quad \text{in } T \in \mathcal{T}_h, \tag{3.37}
$$
$$
j_{y,E}(y_h) := [A\nabla y_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h, \tag{3.38}
$$
$$
j_{p,E}(p_h) := [A^*\nabla p_h]_E \cdot n_E \quad \text{on } E \in \mathcal{E}_h. \tag{3.39}
$$
For each element $T \in \mathcal{T}_h$, we define the local error indicators $\eta_{y,h}(y_h, T)$ and $\eta_{p,h}(p_h, T)$ by
$$
\eta_{y,h}(y_h, T) := \Big( h_T^2\|r_{y,T}(y_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E\|j_{y,E}(y_h)\|_{0,E}^2 \Big)^{1/2}, \tag{3.40}
$$
$$
\eta_{p,h}(p_h, T) := \Big( h_T^2\|r_{p,T}(p_h)\|_{0,T}^2 + \sum_{E\in\mathcal{E}_h,\, E\subset\partial T} h_E\|j_{p,E}(p_h)\|_{0,E}^2 \Big)^{1/2}. \tag{3.41}
$$
Then on a subset $\omega \subset \Omega$, we define the error estimators $\eta_{y,h}(y_h, \omega)$ and $\eta_{p,h}(p_h, \omega)$ by
$$
\eta_{y,h}(y_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \eta_{y,h}^2(y_h, T) \Big)^{1/2}, \tag{3.42}
$$
$$
\eta_{p,h}(p_h, \omega) := \Big( \sum_{T\in\mathcal{T}_h,\, T\subset\omega} \eta_{p,h}^2(p_h, T) \Big)^{1/2}. \tag{3.43}
$$
Thus, $\eta_{y,h}(y_h, \Omega)$ and $\eta_{p,h}(p_h, \Omega)$ constitute the error estimators for the state equation and the adjoint state equation on $\Omega$ with respect to $\mathcal{T}_h$.

Note that $S_h u_h$ and $S_h^*(S_h u_h - y_d)$ are the standard Galerkin projections of $Su_h$ and $S^*(S_h u_h - y_d)$, respectively. Similar to (2.16)-(2.17), standard a posteriori error estimates for elliptic boundary value problems give the following upper bounds (see, e.g., [36]), which show the reliability of the error estimators.

Lemma 3.4.
Let $S$ and $S_h$ be the continuous and discrete solution operators defined above. Then the following a posteriori error estimates hold:
$$
\|Su_h - S_h u_h\|_{a,\Omega} \leq \tilde C_1 \eta_{y,h}(y_h, \Omega), \tag{3.44}
$$
$$
\|S^*(S_h u_h - y_d) - S_h^*(S_h u_h - y_d)\|_{a,\Omega} \leq \tilde C_1 \eta_{p,h}(p_h, \Omega). \tag{3.45}
$$

We can also derive the following global a posteriori error lower bounds, i.e., the global efficiency of the error estimators.

Lemma 3.5.
Let $S$ and $S_h$ be the continuous and discrete solution operators defined above. Then the following a posteriori error lower bounds hold:

$\tilde C_2\,\eta_{y,h}(y_h,\Omega) \le \|Su_h - S_hu_h\|_{a,\Omega} + \tilde C_3\,{\rm osc}(u_h - Ly_h,\mathcal{T}_h)$,  (3.46)
$\tilde C_2\,\eta_{p,h}(p_h,\Omega) \le \|S^*(S_hu_h-y_d) - S_h^*(S_hu_h-y_d)\|_{a,\Omega} + \tilde C_3\,{\rm osc}(y_h - y_d - L^*p_h,\mathcal{T}_h)$.  (3.47)

Let $h_0\in(0,1)$ be the mesh size of the initial mesh $\mathcal{T}_{h_0}$ and define $\tilde\kappa(h_0) := \sup_{h\in(0,h_0]}\kappa(h)$. It is obvious that $\tilde\kappa(h_0)\ll 1$ provided $h_0\ll 1$. For ease of exposition we also define the quantities

$\eta_h^2((y_h,p_h),T) = \eta_{y,h}^2(y_h,T) + \eta_{p,h}^2(p_h,T)$,
${\rm osc}^2((y_h,p_h),T) = {\rm osc}^2(u_h - Ly_h,T) + {\rm osc}^2(y_h - y_d - L^*p_h,T)$,

and the straightforward modifications for $\eta_h((y_h,p_h),\Omega)$ and ${\rm osc}((y_h,p_h),\mathcal{T}_h)$.

Now we state the following a posteriori error estimates for the finite element approximation of the optimal control problem.

Theorem 3.6. Let $h\in(0,h_0)$. Assume that $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ and $(u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h$ are the solutions of problems (3.1)-(3.2) and (3.7)-(3.8), respectively. Then there exist positive constants $C_1$, $C_2$ and $C_3$, independent of the mesh size $h$, such that

$\|(y-y_h,p-p_h)\|_a \le C_1\,\eta_h((y_h,p_h),\Omega)$  (3.48)

and

$C_2\,\eta_h((y_h,p_h),\Omega) \le \|(y-y_h,p-p_h)\|_a + C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h)$  (3.49)

provided $h_0\ll 1$.

Proof. Note that $y_h^1 = Su_h$, $y_h = S_hu_h$, $p_h^1 = S^*(S_hu_h - y_d)$ and $p_h = S_h^*(S_hu_h - y_d)$. From the estimates (3.20)-(3.21) and Lemmas 3.4 and 3.5 we have

$\|(y-y_h,p-p_h)\|_a \le (\|y_h^1 - y_h\|_{a,\Omega} + \|p_h^1 - p_h\|_{a,\Omega}) + \hat C\kappa(h)\|(y-y_h,p-p_h)\|_a \le \tilde C_1\,\eta_h((y_h,p_h),\Omega) + \hat C\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a$

and

$\tilde C_2\,\eta_h((y_h,p_h),\Omega) \le (\|y_h^1 - y_h\|_{a,\Omega} + \|p_h^1 - p_h\|_{a,\Omega}) + \tilde C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h) \le \|(y-y_h,p-p_h)\|_{a,\Omega} + \tilde C_3\,{\rm osc}((y_h,p_h),\mathcal{T}_h) + \hat C\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a$.

We obtain the desired results by choosing

$C_1 = \frac{\tilde C_1}{1-\hat C\tilde\kappa(h_0)},\qquad C_2 = \frac{\tilde C_2}{1+\hat C\tilde\kappa(h_0)},\qquad C_3 = \frac{\tilde C_3}{1+\hat C\tilde\kappa(h_0)}$.  (3.50)  □

The adaptive finite element procedure consists of the following loop:

SOLVE → ESTIMATE → MARK → REFINE.

The ESTIMATE step is based on the a posteriori error estimators presented in Theorem 3.6, while the step REFINE can be done by using iterative or recursive bisection of elements with the minimal refinement condition (see [34, 36]).
Due to [7], the procedure REFINE here is not required to satisfy the interior node property of [32]. Note that there are two error estimators, $\eta_{y,h}(y_h,T)$ and $\eta_{p,h}(p_h,T)$, corresponding to the state approximation and the adjoint state approximation, respectively. We use the sum of the two estimators as the indicator for the marking strategy. The marking algorithm, based on Dörfler's strategy, for optimal control problems can be described as follows.

Algorithm 3.7.
The Dörfler marking strategy for OCPs:

(1) Given a parameter $0 < \theta < 1$;
(2) Construct a minimal subset $\tilde{\mathcal{T}}_h\subset\mathcal{T}_h$ such that
$\sum_{T\in\tilde{\mathcal{T}}_h}\eta_h^2((y_h,p_h),T) \ge \theta\,\eta_h^2((y_h,p_h),\Omega)$;
(3) Mark all the elements in $\tilde{\mathcal{T}}_h$.

Then we can present the adaptive finite element algorithm for the optimal control problem (3.7)-(3.8) as follows:
Algorithm 3.8.
Adaptive finite element algorithm for OCPs:

(1) Given an initial mesh $\mathcal{T}_{h_0}$ with mesh size $h_0$, construct the finite element space $V_{h_0}$.
(2) Set $k=0$ and solve the optimal control problem (3.7)-(3.8) to obtain $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$.
(3) Compute the local error indicators $\eta_{h_k}((y_{h_k},p_{h_k}),T)$.
(4) Construct $\tilde{\mathcal{T}}_{h_k}\subset\mathcal{T}_{h_k}$ by the marking Algorithm 3.7.
(5) Refine $\tilde{\mathcal{T}}_{h_k}$ to get a new conforming mesh $\mathcal{T}_{h_{k+1}}$ by the procedure REFINE.
(6) Construct the finite element space $V_{h_{k+1}}$ and solve the optimal control problem (3.7)-(3.8) to obtain $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$.
(7) Set $k=k+1$ and go to Step (3).

Convergence of AFEM for optimal control problem
In this section we prove the convergence of the adaptive Algorithm 3.8. The proof uses some ideas of [11, 16] and some results of [7]. Following Theorem 3.3, we first establish some relationships between the approximations on two consecutive levels, which will be used in our analysis of both convergence and optimal complexity.
Theorem 4.1.
Let $h, H\in(0,h_0)$ and let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2). Assume that $(u_h,y_h,p_h)\in U_{ad}\times V_h\times V_h$ and $(u_H,y_H,p_H)\in U_{ad}\times V_H\times V_H$ are the solutions of problem (3.7)-(3.8) on the respective meshes. Define $y_H^1 := Su_H$ and $p_H^1 := S^*(S_Hu_H-y_d)$, and for brevity set $E(h,H):=\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}+\|y-y_H\|_{a,\Omega}+\|p-p_H\|_{a,\Omega}$. Then the following properties hold:

$\|y-y_h\|_{a,\Omega} = \|y_H^1-R_hy_H^1\|_{a,\Omega} + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.1)
$\|p-p_h\|_{a,\Omega} = \|p_H^1-R_hp_H^1\|_{a,\Omega} + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.2)
${\rm osc}(u_h-Ly_h,\mathcal T_h) = {\rm osc}(u_H-LR_hy_H^1,\mathcal T_h) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.3)
${\rm osc}(y_h-y_d-L^*p_h,\mathcal T_h) = {\rm osc}(y_H^1-y_d-L^*R_hp_H^1,\mathcal T_h) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.4)

and

$\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_H^1,\Omega) + O(\tilde\kappa(h_0))\,E(h,H)$,  (4.5)
$\eta_{p,h}(p_h,\Omega) = \tilde\eta_h(R_hp_H^1,\Omega) + O(\tilde\kappa(h_0))\,E(h,H)$  (4.6)

provided $h_0\ll1$.

Proof. Write $y_h^1 := Su_h$ and $p_h^1 := S^*(S_hu_h-y_d)$. Note that

$y-y_h = (y_H^1-R_hy_H^1) + R_h(y_H^1-y_h^1) + (y-y_H^1)$  (4.7)

and

$p-p_h = (p_H^1-R_hp_H^1) + R_h(p_H^1-p_h^1) + (p-p_H^1)$,  (4.8)

since $R_hy_h^1=y_h$ and $R_hp_h^1=p_h$. On the other hand, it follows from (2.4) that

$\|R_h(y_H^1-y_h^1)+(y-y_H^1)\|_{a,\Omega} \lesssim \|y_H^1-y_h^1\|_{a,\Omega}+\|y-y_H^1\|_{a,\Omega} \lesssim \|y-y_h^1\|_{a,\Omega}+\|y-y_H^1\|_{a,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}$  (4.9)

and

$\|R_h(p_H^1-p_h^1)+(p-p_H^1)\|_{a,\Omega} \lesssim \|p_H^1-p_h^1\|_{a,\Omega}+\|p-p_H^1\|_{a,\Omega} \lesssim \|y-y_h\|_{0,\Omega}+\|y-y_H\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\kappa(h)\|y-y_h\|_{a,\Omega}+\|u-u_H\|_{0,\Omega}+\kappa(H)\|y-y_H\|_{a,\Omega}$,  (4.10)

where in the last inequality we used (3.25). It follows from (3.35) that

$\|R_h(y_H^1-y_h^1)+(y-y_H^1)\|_{a,\Omega}+\|R_h(p_H^1-p_h^1)+(p-p_H^1)\|_{a,\Omega} \lesssim \kappa(h)\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big)+\kappa(H)\big(\|y-y_H\|_{a,\Omega}+\|p-p_H\|_{a,\Omega}\big) \lesssim \tilde\kappa(h_0)\,E(h,H)$  (4.11)

provided $h_0\ll1$. This, combined with (4.7)-(4.8), yields (4.1) and (4.2).

Then we prove (4.3)-(4.4). Note that

$u_h-Ly_h = (u_H-LR_hy_H^1)+LR_h(y_H^1-y_h^1)+(u_h-u_H)$,  (4.12)
$y_h-y_d-L^*p_h = (y_H^1-y_d-L^*R_hp_H^1)+L^*R_h(p_H^1-p_h^1)+(y_h-y_H^1)$.  (4.13)

From Lemma 2.2 we have

${\rm osc}(LR_h(y_H^1-y_h^1),\mathcal T_h) \lesssim \|R_h(y_H^1-y_h^1)\|_{a,\Omega}$,  ${\rm osc}(L^*R_h(p_H^1-p_h^1),\mathcal T_h) \lesssim \|R_h(p_H^1-p_h^1)\|_{a,\Omega}$,

which together with (4.11) imply

${\rm osc}(LR_h(y_H^1-y_h^1),\mathcal T_h)+{\rm osc}(L^*R_h(p_H^1-p_h^1),\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.14)

Moreover, since $\bar f_T$ is the $L^2$-projection of $f$ onto piecewise polynomials on $T$, there holds

${\rm osc}(f,\mathcal T_h) = \big(\sum_{T\in\mathcal T_h}\|h_T(f-\bar f_T)\|_{0,T}^2\big)^{1/2} \lesssim \|f\|_{0,\Omega}$.

In view of (3.25) we thus have

${\rm osc}(u_h-u_H,\mathcal T_h) \lesssim \|u_h-u_H\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}$,
${\rm osc}(y_h-y_H^1,\mathcal T_h) \lesssim \|y_h-y_H^1\|_{0,\Omega} \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}+\kappa(h)\|y-y_h\|_{a,\Omega}+\kappa(H)\|y-y_H\|_{a,\Omega}$,

which together with (3.35) yield

${\rm osc}(u_h-u_H,\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$,  (4.15)
${\rm osc}(y_h-y_H^1,\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.16)

We conclude the desired results (4.3)-(4.4) from the definition of the data oscillation and (4.12)-(4.16).

Now it remains to prove (4.5) and (4.6). From the definitions of $y_h^1$ and $y_H^1$ we know that $y_h^1-y_H^1$ solves an elliptic boundary value problem with right hand side $u_h-u_H$. It follows from (2.17) and (4.9) that

$\tilde\eta_h(R_h(y_h^1-y_H^1),\Omega) \lesssim \|(y_h^1-y_H^1)-R_h(y_h^1-y_H^1)\|_{a,\Omega}+{\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h) \lesssim \|u-u_h\|_{0,\Omega}+\|u-u_H\|_{0,\Omega}+{\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h)$.  (4.17)

From (2.14), (3.35), (4.14) and (4.15) we are led to

${\rm osc}(u_h-u_H-LR_h(y_h^1-y_H^1),\mathcal T_h) \lesssim \tilde\kappa(h_0)\,E(h,H)$.  (4.18)

Note that $\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_h^1,\Omega) = \tilde\eta_h(R_hy_H^1+R_h(y_h^1-y_H^1),\Omega)$. This, combined with (4.17) and (4.18), gives

$\eta_{y,h}(y_h,\Omega) = \tilde\eta_h(R_hy_H^1,\Omega)+O(\tilde\kappa(h_0))\,E(h,H)$,

which proves (4.5). Similarly we can prove (4.6). This completes the proof of the theorem. □

Now we are ready to prove the error reduction for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$ between two consecutive adaptive loops.

Theorem 4.2.
Let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2) and let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ be the sequence of solutions to problem (3.7)-(3.8) produced by Algorithm 3.8. Then there exist constants $\gamma>0$ and $\beta\in(0,1)$, depending only on the shape regularity of the meshes and the parameter $\theta$ used in Algorithm 3.7, such that for any two consecutive iterates $k$ and $k+1$ we have

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma\,\eta_{h_{k+1}}^2((y_{h_{k+1}},p_{h_{k+1}}),\Omega) \le \beta^2\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)$  (4.19)

provided $h_0\ll1$. Therefore, Algorithm 3.8 converges with a linear rate $\beta$, namely, the $k$-th iterate solution $(u_{h_k},y_{h_k},p_{h_k})$ of Algorithm 3.8 satisfies

$\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le C_0\beta^{2k}$,  (4.20)

where $C_0 = \|(y-y_{h_0},p-p_{h_0})\|_a^2+\gamma\,\eta_{h_0}^2((y_{h_0},p_{h_0}),\Omega)$.

Proof. For convenience, we use $(u_H,y_H,p_H)$ and $(u_h,y_h,p_h)$ to denote $(u_{h_k},y_{h_k},p_{h_k})$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})$, respectively. So it suffices to prove that

$\|(y-y_h,p-p_h)\|_a^2+\gamma\,\eta_h^2((y_h,p_h),\Omega) \le \beta^2\big(\|(y-y_H,p-p_H)\|_a^2+\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)$.  (4.21)

Recall that $y_H^1 := Su_H$, $y_h^1 := Su_h$ and $p_H^1 := S^*(S_Hu_H-y_d)$, $p_h^1 := S^*(S_hu_h-y_d)$. It follows from Algorithm 3.7 that the Dörfler marking strategy of Algorithm 2.3 is satisfied for $(y_H^1,p_H^1)$. So we conclude from Theorem 2.6 that there exist constants $\tilde\gamma>0$ and $\tilde\beta\in(0,1)$ satisfying

$\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big) \le \tilde\beta^2\big(\|(y_H^1-R_Hy_H^1,p_H^1-R_Hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_H^2(R_Hy_H^1,\Omega)+\tilde\eta_H^2(R_Hp_H^1,\Omega)\big)\big)$.  (4.22)

Note that $R_Hy_H^1=y_H$ and $R_Hp_H^1=p_H$; we thus have

$\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big) \le \tilde\beta^2\big(\|(y_H^1-y_H,p_H^1-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)$.  (4.23)

We conclude from (4.1)-(4.2) and (4.5)-(4.6), together with the $\delta$-Young inequality, that there exists a constant $\hat C_1>0$ such that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+(1+\delta)\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big)+\hat C_1(1+\delta^{-1})(1+\tilde\gamma)\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$,

where $\delta\in(0,1)$ is chosen such that

$(1+\delta)\tilde\beta^2<1$.  (4.24)

Thus, there exists a positive constant $\hat C_2$ depending on $\hat C_1$ and $\tilde\gamma$ such that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\big(\|(y_H^1-R_hy_H^1,p_H^1-R_hp_H^1)\|_a^2+\tilde\gamma\big(\tilde\eta_h^2(R_hy_H^1,\Omega)+\tilde\eta_h^2(R_hp_H^1,\Omega)\big)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$.  (4.25)

It follows from (4.23) and (4.25) that

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big(\|(y_H^1-y_H,p_H^1-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$.  (4.26)

Then using Theorem 3.3 we arrive at

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big((1+\hat C_3\tilde\kappa(h_0))\|(y-y_H,p-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+\hat C_2\delta^{-1}\tilde\kappa^2(h_0)\big(\|(y-y_h,p-p_h)\|_a^2+\|(y-y_H,p-p_H)\|_a^2\big)$,

and thus

$\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le (1+\delta)\tilde\beta^2\big(\|(y-y_H,p-p_H)\|_a^2+\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)\big)+C_4\tilde\kappa(h_0)\|(y-y_H,p-p_H)\|_a^2+C_4\delta^{-1}\tilde\kappa(h_0)\|(y-y_h,p-p_h)\|_a^2$,  (4.27)

where $C_4$ is a positive constant depending on $\hat C_2$ and $\hat C_3$, provided $h_0\ll1$. So we can derive

$\big(1-C_4\delta^{-1}\tilde\kappa(h_0)\big)\|(y-y_h,p-p_h)\|_a^2+\tilde\gamma\,\eta_h^2((y_h,p_h),\Omega) \le \big((1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)\big)\|(y-y_H,p-p_H)\|_a^2+(1+\delta)\tilde\beta^2\tilde\gamma\,\eta_H^2((y_H,p_H),\Omega)$,  (4.28)

or equivalently,

$\|(y-y_h,p-p_h)\|_a^2+\frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_h^2((y_h,p_h),\Omega) \le \frac{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\|(y-y_H,p-p_H)\|_a^2+\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_H^2((y_H,p_H),\Omega)$.  (4.29)

Since $\tilde\kappa(h_0)\ll1$ for $h_0\ll1$, we can define the constant $\beta$ by

$\beta := \Big(\frac{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\Big)^{1/2}$,  (4.30)

which satisfies $\beta\in(0,1)$ if $h_0\ll1$. Then

$\|(y-y_h,p-p_h)\|_a^2+\frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}\,\eta_h^2((y_h,p_h),\Omega) \le \beta^2\Big(\|(y-y_H,p-p_H)\|_a^2+\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}\,\eta_H^2((y_H,p_H),\Omega)\Big)$.  (4.31)

Now we choose

$\gamma := \frac{\tilde\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$;  (4.32)

it is obvious that

$\frac{(1+\delta)\tilde\beta^2\tilde\gamma}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)} = \frac{(1+\delta)\tilde\beta^2\big(1-C_4\delta^{-1}\tilde\kappa(h_0)\big)}{(1+\delta)\tilde\beta^2+C_4\tilde\kappa(h_0)}\,\gamma < \gamma$.

Then we obtain (4.21), which completes the proof. □
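To make the loop of Algorithm 3.8 and the error reduction just proved concrete, here is a self-contained toy sketch (our own construction, not the paper's implementation): it runs SOLVE → ESTIMATE → MARK → REFINE for the one-dimensional model problem $-y''=f$ with a nearly singular source, using Dörfler marking of the squared indicators, and records the global estimator at each iteration.

```python
import numpy as np

def fem_solve(x, f):
    # SOLVE: P1 FEM for -y'' = f on (0,1), y(0) = y(1) = 0 (dense, illustration only)
    h = np.diff(x); n = len(x) - 1
    A = np.zeros((n - 1, n - 1)); b = np.zeros(n - 1)
    for i in range(n - 1):
        A[i, i] = 1.0 / h[i] + 1.0 / h[i + 1]
        if i > 0:
            A[i, i - 1] = -1.0 / h[i]
        if i < n - 2:
            A[i, i + 1] = -1.0 / h[i + 1]
        b[i] = 0.5 * (h[i] + h[i + 1]) * f(x[i + 1])
    y = np.zeros(n + 1); y[1:-1] = np.linalg.solve(A, b)
    return y

def eta2(x, y, f):
    # ESTIMATE: squared local indicators, 1D analogue of (3.40)
    h = np.diff(x); g = np.diff(y) / h
    j = np.r_[0.0, np.diff(g), 0.0]
    mid = 0.5 * (x[:-1] + x[1:])
    return h ** 3 * f(mid) ** 2 + h * (j[:-1] ** 2 + j[1:] ** 2)

def dorfler(e2, theta):
    # MARK: minimal set M with sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2
    order = np.argsort(e2)[::-1]
    m = int(np.searchsorted(np.cumsum(e2[order]), theta * e2.sum())) + 1
    return order[:m]

def bisect(x, marked):
    # REFINE: bisect every marked element
    return np.sort(np.r_[x, 0.5 * (x[marked] + x[marked + 1])])

f = lambda t: 1.0 / np.sqrt(t + 1e-3)      # data large near t = 0
x = np.linspace(0.0, 1.0, 5)
estimators = []
for _ in range(10):                         # the adaptive loop of Algorithm 3.8
    y = fem_solve(x, f)
    e2 = eta2(x, y, f)
    estimators.append(np.sqrt(e2.sum()))
    x = bisect(x, dorfler(e2, theta=0.5))
```

In runs of this sketch the global estimator decreases overall and the refined nodes cluster near $t=0$, where the data are nearly singular, mirroring the mesh concentration at the reentrant corner observed in Section 6.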
Remark 4.3.
We remark that the requirement $h_0\ll1$ on the initial mesh $\mathcal{T}_{h_0}$ is not restrictive for the convergence analysis of AFEM for nonlinear problems, such as the optimal control problems studied in this paper; see, e.g., [14]. For similar requirements we refer to [10, 11] for the convergence analysis of adaptive finite element eigenvalue computations and to [31] for adaptive finite element computations for nonsymmetric boundary value problems; we should also mention [16] for the adaptive finite element method for a semilinear elliptic equation.

Remark 4.4.

In the adaptive Algorithm 3.8 we use the sum of the error estimators $\eta_{y,h}(y_h,T)$, contributed by the state approximation, and $\eta_{p,h}(p_h,T)$, contributed by the adjoint state approximation, as the indicator to select the subset $\tilde{\mathcal{T}}_h$ for refinement. This marking strategy enables us to prove the convergence and quasi-optimality (see Section 5) of AFEM for optimal control problems. We remark that it is also possible to use separate marking for the contributions of $\eta_{y,h}(y_h,T)$ and $\eta_{p,h}(p_h,T)$ as follows:

• Construct a minimal subset $\tilde{\mathcal{T}}_{h,1}\subset\mathcal{T}_h$ such that $\sum_{T\in\tilde{\mathcal{T}}_{h,1}}\eta_{y,h}^2(y_h,T)\ge\theta\,\eta_{y,h}^2(y_h,\Omega)$.
• Construct another minimal subset $\tilde{\mathcal{T}}_{h,2}\subset\mathcal{T}_h$ such that $\sum_{T\in\tilde{\mathcal{T}}_{h,2}}\eta_{p,h}^2(p_h,T)\ge\theta\,\eta_{p,h}^2(p_h,\Omega)$.
• Set $\tilde{\mathcal{T}}_h := \tilde{\mathcal{T}}_{h,1}\cup\tilde{\mathcal{T}}_{h,2}$ and mark all the elements in $\tilde{\mathcal{T}}_h$.

With this marking strategy we can also prove the convergence of AFEM for optimal control problems by using the results of [7, 11] for a single boundary value problem. To be more specific, the error reduction (4.22) can be derived separately for the state and the adjoint state approximations. However, the resulting over-refinement of this marking strategy prevents us from proving the quasi-optimality of the adaptive algorithm.

Complexity of AFEM for optimal control problem
In this section we analyse the complexity of the adaptive finite element algorithm for optimal control problems, based on known complexity results for elliptic boundary value problems. The proof uses some ideas of [11, 16] and some results of [7].

Similar to [7] and [11], for the purpose of analysing the complexity of AFEM for optimal control problems we introduce the function approximation class

$\mathcal A_\gamma^s := \big\{(y,p,y_d)\in H_0^1(\Omega)\times H_0^1(\Omega)\times L^2(\Omega):\ |(y,p,y_d)|_{s,\gamma}<+\infty\big\}$,

where $\gamma>0$ and

$|(y,p,y_d)|_{s,\gamma} = \sup_{\varepsilon>0}\ \varepsilon\,\inf\big\{(\#\mathcal T-\#\mathcal T_{h_0})^s:\ \mathcal T\subset\mathcal T_{h_0},\ \inf_{(y_{\mathcal T},p_{\mathcal T})}\big(\|(y-y_{\mathcal T},p-p_{\mathcal T})\|_a+(\gamma+1)\,{\rm osc}((y_{\mathcal T},p_{\mathcal T}),\mathcal T)\big)\le\varepsilon\big\}$.

Here $\mathcal T\subset\mathcal T_{h_0}$ means that $\mathcal T$ is a refinement of $\mathcal T_{h_0}$, and $y_{\mathcal T}$, $p_{\mathcal T}$ are elements of the finite element space corresponding to the partition $\mathcal T$. It is seen from the definition that $\mathcal A_\gamma^s=\mathcal A^s$ for all $\gamma>0$, so we use $\mathcal A^s$ throughout the paper, with corresponding norm $|\cdot|_s$. Thus $\mathcal A^s$ is the class of triples $v=(y,p,y_d)$ that can be approximated, to any given tolerance $\varepsilon$, by continuous piecewise linear finite element functions over a partition $\mathcal T$ with $\#\mathcal T-\#\mathcal T_{h_0}\lesssim\varepsilon^{-1/s}|v|_s^{1/s}$ degrees of freedom.

Now we are in a position to prepare the proof of the optimal complexity of Algorithm 3.8 for the optimal control problem (3.1)-(3.2). First we define $y_{h_k}^1 := Su_{h_k}$ and $p_{h_k}^1 := S^*(S_{h_k}u_{h_k}-y_d)$. Then we have the following result.

Lemma 5.1.
Let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$ be discrete solutions of problem (3.7)-(3.8) over the mesh $\mathcal T_{h_k}$ and its refinement $\mathcal T_{h_{k+1}}$ with marked elements $\mathcal M_{h_k}$. Suppose they satisfy the property

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_{k+1}},p_{h_{k+1}}),\mathcal T_{h_{k+1}}) \le \beta_*\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.1)

with some positive constants $\gamma_*$ and $\beta_*$. Then for the associated state and adjoint state approximations we have

$\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2+\tilde\gamma_*\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}}) \le \tilde\beta_*\big(\|(y_{h_k}^1-R_{h_k}y_{h_k}^1,p_{h_k}^1-R_{h_k}p_{h_k}^1)\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.2)

with

$\tilde\beta_* := \frac{(1+\delta)\beta_*+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$,  $\tilde\gamma_* := \frac{\gamma_*}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$,

where $C_4$ is a constant depending on $C_*$, $\hat C_2$ and $\hat C_3$, and $\hat C_2$, $\hat C_3$ and $\delta\in(0,1)$ are the constants from the proof of Theorem 4.2.

Proof. The proof follows the same lines as that of Theorem 4.2, with (4.5)-(4.6) replaced by (4.3)-(4.4). More precisely, in the proof of Theorem 4.2 we used (4.22), Theorem 3.3 and Theorem 4.1 to derive (4.21); conversely, here we derive the reduction (5.2) on the level of the state and adjoint state approximations from the reduction (5.1), Theorem 3.3 and Theorem 4.1. □
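The data oscillation entering (5.1)-(5.2) measures the part of the data that is not resolved by piecewise polynomials on the current mesh. As a hypothetical one-dimensional illustration (the helper below and its simple trapezoidal quadrature are our own, not from the paper):

```python
import numpy as np

def oscillation(x, f, nq=9):
    """1D data oscillation osc(f, T_h) = (sum_T ||h_T (f - fbar_T)||_{0,T}^2)^(1/2),
    where fbar_T is the elementwise mean (L2-projection onto constants)."""
    osc2 = 0.0
    for a, b in zip(x[:-1], x[1:]):
        t = np.linspace(a, b, nq)
        w = np.full(nq, (b - a) / (nq - 1))
        w[0] *= 0.5
        w[-1] *= 0.5                          # trapezoidal quadrature weights
        fbar = np.sum(w * f(t)) / (b - a)     # elementwise mean of f
        osc2 += (b - a) ** 2 * np.sum(w * (f(t) - fbar) ** 2)
    return np.sqrt(osc2)
```

For smooth data the oscillation is of higher order than the estimator itself (second order in the mesh size in this 1D sketch), which is why it enters the analysis only through the lower bounds.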
Next we derive a result, analogous to Lemma 2.8, concerning the optimality of Dörfler's marking strategy for optimal control problems.
Corollary 5.2.
Let $(u_{h_k},y_{h_k},p_{h_k})\in U_{ad}\times V_{h_k}\times V_{h_k}$ and $(u_{h_{k+1}},y_{h_{k+1}},p_{h_{k+1}})\in U_{ad}\times V_{h_{k+1}}\times V_{h_{k+1}}$ be discrete solutions of problem (3.7)-(3.8) over the mesh $\mathcal T_{h_k}$ and its refinement $\mathcal T_{h_{k+1}}$ with marked elements $\mathcal M_{h_k}$. Suppose they satisfy

$\|(y-y_{h_{k+1}},p-p_{h_{k+1}})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_{k+1}},p_{h_{k+1}}),\mathcal T_{h_{k+1}}) \le \beta_*\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$

with constants $\gamma_*>0$ and $\beta_*$ such that the corresponding constant $\tilde\beta_*$ of Lemma 5.1 belongs to $(0,\frac12)$. Then the set $\mathcal R := \mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_{k+1}}}$ of refined elements satisfies the Dörfler property

$\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \ge \hat\theta\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$  (5.3)

with $\hat\theta = \frac{\tilde C_2(1-2\tilde\beta_*)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)}$ and $\tilde C_0 = \max(1,\tilde C_3/\tilde\gamma_*)$, where $\tilde\beta_*$ and $\tilde\gamma_*$ are given by Lemma 5.1.

Proof. From Lemma 5.1 we conclude (5.2) from (5.1). Note that $y_{h_k}=R_{h_k}y_{h_k}^1$ and $p_{h_k}=R_{h_k}p_{h_k}^1$. By the lower bounds of Lemma 3.5 we have

$\tilde C_2\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le \|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde C_3\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k}) \le \tilde C_0\big(\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$.

Thus, it follows from (5.2) that

$(1-2\tilde\beta_*)\frac{\tilde C_2}{\tilde C_0}\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \le (1-2\tilde\beta_*)\big(\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2+\tilde\gamma_*\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big) \le \|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2-2\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2+\tilde\gamma_*\big({\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})-2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}})\big)$.  (5.4)

Note that $y_{h_k}$ and $R_{h_{k+1}}y_{h_k}^1$ are the Galerkin projections of $y_{h_k}^1$ onto $V_{h_k}$ and $V_{h_{k+1}}$, respectively. From the standard Galerkin orthogonality we have

$\|(y_{h_k}^1-y_{h_k},p_{h_k}^1-p_{h_k})\|_a^2-\|(y_{h_k}^1-R_{h_{k+1}}y_{h_k}^1,p_{h_k}^1-R_{h_{k+1}}p_{h_k}^1)\|_a^2 = \|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2$,  (5.5)

so the energy part of (5.4) is bounded by the right hand side of (5.5). By (2.15), the triangle and the Young inequalities we have, for $T\in\mathcal T_{h_k}\cap\mathcal T_{h_{k+1}}$,

${\rm osc}^2((y_{h_k},p_{h_k}),T) \le 2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),T)+2C_*^2\|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2$,

which together with the dominance of the indicator over the oscillation (see [7, Remark 2.1]),

${\rm osc}^2(u_{h_k}-Ly_{h_k},T)\le\eta_{y,h_k}^2(y_{h_k},T)$,  (5.6)
${\rm osc}^2(y_{h_k}-y_d-L^*p_{h_k},T)\le\eta_{p,h_k}^2(p_{h_k},T)$,  (5.7)

implies

${\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})-2\,{\rm osc}^2((R_{h_{k+1}}y_{h_k}^1,R_{h_{k+1}}p_{h_k}^1),\mathcal T_{h_{k+1}}) \le \sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)+2C_*^2\|(y_{h_k}-R_{h_{k+1}}y_{h_k}^1,p_{h_k}-R_{h_{k+1}}p_{h_k}^1)\|_a^2 \le (1+2C_*^2\tilde C_1)\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$,  (5.8)

where we used (2.25) in the last inequality. Combining (5.4)-(5.8) and (2.25) we obtain

$(1-2\tilde\beta_*)\frac{\tilde C_2}{\tilde C_0}\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \le \big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)\sum_{T\in\mathcal R}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$.  (5.9)

By choosing

$\hat\theta := \frac{\tilde C_2(1-2\tilde\beta_*)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_*\big)}$

we complete the proof. □

Lemma 5.3.
Let $(y,p,y_d)\in\mathcal A^s$ and let $\mathcal T_{h_k}$ ($k\ge0$) be the sequence of meshes generated by Algorithm 3.8 starting from the initial mesh $\mathcal T_{h_0}$. Let $\mathcal T_{h_{k+1}}={\rm REFINE}(\mathcal T_{h_k},\mathcal M_{h_k})$, where $\mathcal M_{h_k}$ is produced by Algorithm 3.7 with $\theta\in(0,\theta_*)$ and $\theta_* := \frac{C_2\gamma}{C_3\big(C_2+(1+2C_*^2C_2)\gamma\big)}$. Then

$\#\mathcal M_{h_k} \le C\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,  (5.10)

where the constant $C$ depends on the discrepancy between $\theta$ and $\theta_*$.

Proof. Let $\rho_1,\rho_2\in(0,1)$ satisfy $\rho_1\in(0,\rho_2)$ and

$\theta < \theta_*(1-\rho_2)$.

Choose $\delta\in(0,1)$ to satisfy (4.24) and

$(1+\delta)\rho_1\le\rho_2$,  (5.11)

which implies

$(1+\delta)\rho_1<1$.  (5.12)

Set

$\varepsilon := \Big(\rho_1\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)\Big)^{1/2}$

and let $\mathcal T_{h_\varepsilon}$ be a refinement of $\mathcal T_{h_0}$ with minimal degrees of freedom satisfying

$\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+1)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\le\varepsilon^2$.  (5.13)

We can conclude from the definition of $\mathcal A^s$ that

$\#\mathcal T_{h_\varepsilon}-\#\mathcal T_{h_0}\lesssim\varepsilon^{-1/s}\,|(y,p,y_d)|_s^{1/s}$.

Let $\mathcal T_{h_*}:=\mathcal T_{h_\varepsilon}\oplus\mathcal T_{h_k}$ be the smallest common refinement of $\mathcal T_{h_\varepsilon}$ and $\mathcal T_{h_k}$, and let $V_{h_\varepsilon}\subset H_0^1(\Omega)$ and $V_{h_*}\subset H_0^1(\Omega)$ be the finite element spaces defined on $\mathcal T_{h_\varepsilon}$ and $\mathcal T_{h_*}$, respectively. Assume that $(u_{h_\varepsilon},y_{h_\varepsilon},p_{h_\varepsilon})\in U_{ad}\times V_{h_\varepsilon}\times V_{h_\varepsilon}$ is the solution of problem (3.7)-(3.8), and define $y_{h_\varepsilon}^1 := Su_{h_\varepsilon}$ and $p_{h_\varepsilon}^1 := S^*(S_{h_\varepsilon}u_{h_\varepsilon}-y_d)$. From the definition of the oscillation we can conclude from Lemma 2.2 that

${\rm osc}(u_{h_\varepsilon}-LR_{h_*}y_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(u_{h_\varepsilon}-LR_{h_\varepsilon}y_{h_\varepsilon}^1,\mathcal T_{h_*})+{\rm osc}(L(R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(u_{h_\varepsilon}-LR_{h_\varepsilon}y_{h_\varepsilon}^1,\mathcal T_{h_*})+C_*\|(R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1\|_a$

and

${\rm osc}(y_{h_\varepsilon}^1-y_d-L^*R_{h_*}p_{h_\varepsilon}^1,\mathcal T_{h_*}) \le {\rm osc}(y_{h_\varepsilon}^1-y_d-L^*R_{h_\varepsilon}p_{h_\varepsilon}^1,\mathcal T_{h_*})+C_*\|(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1\|_a$.

Then from Young's inequality we have

${\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le 2\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})+2C_*^2\|((R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1)\|_a^2$.

Due to the orthogonality

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2 = \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2-\|((R_{h_*}-R_{h_\varepsilon})y_{h_\varepsilon}^1,(R_{h_*}-R_{h_\varepsilon})p_{h_\varepsilon}^1)\|_a^2$,

we arrive at

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2+\frac{1}{2C_*^2}\,{\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2+\frac{1}{C_*^2}\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})$.

From Theorem 2.6 we can see that $\tilde\gamma\le\frac{1}{2C_*^2}$, which implies

$\|(y_{h_\varepsilon}^1-R_{h_*}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_*}p_{h_\varepsilon}^1)\|_a^2+\tilde\gamma\,{\rm osc}^2((R_{h_*}y_{h_\varepsilon}^1,R_{h_*}p_{h_\varepsilon}^1),\mathcal T_{h_*}) \le \|(y_{h_\varepsilon}^1-R_{h_\varepsilon}y_{h_\varepsilon}^1,p_{h_\varepsilon}^1-R_{h_\varepsilon}p_{h_\varepsilon}^1)\|_a^2+(\tilde\gamma+\sigma)\,{\rm osc}^2((R_{h_\varepsilon}y_{h_\varepsilon}^1,R_{h_\varepsilon}p_{h_\varepsilon}^1),\mathcal T_{h_*})$

with $\sigma := \frac{1}{C_*^2}-\tilde\gamma\in(0,1)$. Arguing as in the proofs of Theorem 4.2 and Lemma 5.1, this yields

$\|(y-y_{h_*},p-p_{h_*})\|_a^2+\gamma\,{\rm osc}^2((y_{h_*},p_{h_*}),\mathcal T_{h_*}) \le \beta_1\big(\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+\sigma)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\big) \le \beta_1\big(\|(y-y_{h_\varepsilon},p-p_{h_\varepsilon})\|_a^2+(\gamma+1)\,{\rm osc}^2((y_{h_\varepsilon},p_{h_\varepsilon}),\mathcal T_{h_\varepsilon})\big)$,  (5.14)

where

$\beta_1 := \frac{(1+\delta)+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$

and $C_4$ is the constant appearing in the proof of Theorem 4.2. Thus, by (5.13) and (5.14) it follows that

$\|(y-y_{h_*},p-p_{h_*})\|_a^2+\gamma\,{\rm osc}^2((y_{h_*},p_{h_*}),\mathcal T_{h_*}) \le \beta_2\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$  (5.15)

with $\beta_2 = \beta_1\rho_1$. In view of (5.12) we have $\beta_2\in(0,1)$ provided $h_0\ll1$, and $\beta_2$ is small enough for Corollary 5.2 to apply. It follows from Corollary 5.2 that

$\sum_{T\in\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)\ge\hat\theta_1\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$,  (5.16)

where $\hat\theta_1 = \frac{\tilde C_2(1-2\tilde\beta_2)}{\tilde C_0\big(\tilde C_1+(1+2C_*^2\tilde C_1)\tilde\gamma_1\big)}$ with $\tilde\gamma_1 = \frac{\gamma}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$, $\tilde C_0 = \max(1,\tilde C_3/\tilde\gamma_1)$ and $\tilde\beta_2 = \frac{(1+\delta)\beta_2+C_4\tilde\kappa(h_0)}{1-C_4\delta^{-1}\tilde\kappa(h_0)}$.

Since $h_0\ll1$, we obtain $\tilde\gamma_1>\gamma$ and $\tilde\beta_2\in(0,\sqrt{\rho_2})$ from (5.11). It is then easy to see from (3.50) and $\tilde\gamma_1>\gamma$ that

$\hat\theta_1 \ge \theta_*(1-\rho_2) > \theta$  (5.17)

provided $h_0\ll1$. This implies

$\sum_{T\in\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T) \ge \theta\sum_{T\in\mathcal T_{h_k}}\eta_{h_k}^2((y_{h_k},p_{h_k}),T)$.

Note that Algorithm 3.7 selects a minimal set $\mathcal M_{h_k}=\tilde{\mathcal T}_{h_k}$ satisfying this property. Thus,

$\#\mathcal M_{h_k}\le\#\mathcal R_{\mathcal T_{h_k}\to\mathcal T_{h_*}}\le\#\mathcal T_{h_*}-\#\mathcal T_{h_k}\le\#\mathcal T_{h_\varepsilon}-\#\mathcal T_{h_0} \lesssim \varepsilon^{-1/s}\,|(y,p,y_d)|_s^{1/s} = \rho_1^{-\frac{1}{2s}}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,

which is the desired result, with an explicit dependence on the discrepancy between $\theta$ and $\theta_*$. □

We are now ready to prove that Algorithm 3.8 possesses optimal complexity for the state and adjoint state approximations.
Theorem 5.4.
Let $(u,y,p)\in U_{ad}\times H_0^1(\Omega)\times H_0^1(\Omega)$ be the solution of problem (3.1)-(3.2) and let $(u_{h_n},y_{h_n},p_{h_n})\in U_{ad}\times V_{h_n}\times V_{h_n}$ be a sequence of solutions of problem (3.7)-(3.8) corresponding to a sequence of finite element spaces $V_{h_n}$ with partitions $\mathcal T_{h_n}$ produced by Algorithm 3.8. Then the $n$-th iterate solution $(y_{h_n},p_{h_n})$ of Algorithm 3.8 satisfies the optimal bound

$\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,{\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n}) \lesssim (\#\mathcal T_{h_n}-\#\mathcal T_{h_0})^{-2s}$,  (5.18)

where the hidden constant depends on the exact solution $(u,y,p)$ and on the discrepancy between $\theta$ and $\frac{C_2\gamma}{C_3(C_2+(1+2C_*^2C_2)\gamma)}$.

Proof. It follows from (2.22) and (5.10) that

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \sum_{k=0}^{n-1}\#\mathcal M_{h_k} \lesssim \sum_{k=0}^{n-1}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$.  (5.19)

From the lower bound (3.49) we have

$\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega) \le C_5\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,{\rm osc}^2((y_{h_k},p_{h_k}),\mathcal T_{h_k})\big)$,

where $C_5 = \max(1+2\gamma C_2^{-2},\,2C_2^{-2}C_3^2)$. Then we arrive at

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \sum_{k=0}^{n-1}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$.  (5.20)

Due to (4.19), we obtain for $0\le k<n$ that

$\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega) \le \beta^{2(n-k)}\big(\|(y-y_{h_k},p-p_{h_k})\|_a^2+\gamma\,\eta_{h_k}^2((y_{h_k},p_{h_k}),\Omega)\big)$.

Thus,

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}\sum_{k=0}^{n-1}\beta^{\frac{n-k}{s}} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)\big)^{-\frac{1}{2s}}\,|(y,p,y_d)|_s^{1/s}$,  (5.21)

where the last inequality holds due to the fact that $\beta<1$, so that $\sum_{k=0}^{n-1}\beta^{(n-k)/s}$ is bounded. Finally, the dominance ${\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n})\le\eta_{h_n}^2((y_{h_n},p_{h_n}),\Omega)$ together with (5.21) yields

$\#\mathcal T_{h_n}-\#\mathcal T_{h_0} \lesssim \big(\|(y-y_{h_n},p-p_{h_n})\|_a^2+\gamma\,{\rm osc}^2((y_{h_n},p_{h_n}),\mathcal T_{h_n})\big)^{-\frac{1}{2s}}$,  (5.22)

and this completes the proof. □

Remark 5.5.
From (3.35) and the equivalence property (3.13) we can conclude that Theorem 4.2 also implies the convergence of $\|u-u_{h_k}\|_{0,\Omega}$; namely, for the $n$-th iterate solution $u_{h_n}$ of Algorithm 3.8 there holds

$\|u-u_{h_n}\|_{0,\Omega} \lesssim \beta^n$.  (5.23)

We remark that the control variable can also be included in the complexity analysis of AFEM for optimal control problems, to obtain

$\|u-u_{h_n}\|_{0,\Omega} \lesssim (\#\mathcal T_{h_n}-\#\mathcal T_{h_0})^{-s}$.  (5.24)

However, the above results are sub-optimal for the optimal control, as illustrated by the numerical results in Section 6. To prove the optimality of AFEM for the control variable it seems that we need to work with AFEM based on $L^2$-norm error estimators; we refer to [20] for the optimal a priori error estimate. We expect that the results in [12] will enable us to prove the optimal convergence of AFEM for the optimal control $u$; this is postponed to future work.

Numerical experiments
In this section we carry out some numerical tests in two dimensions to support the theoretical results obtained in this paper. We take the elliptic operator $L$ to be $-\Delta$ with homogeneous Dirichlet boundary conditions for all the examples.

Example 6.1.
This example is taken from [1]. The domain $\Omega$ is described in polar coordinates by $\Omega=\{(r,\vartheta):\ 0<r<1,\ 0<\vartheta<\omega\}$, with a reentrant angle $\omega\in(\pi,2\pi)$ as specified in [1]. We take the exact solutions as

$y(r,\vartheta)=(r^\lambda-r^\nu)\sin(\lambda\vartheta)$,  $p(r,\vartheta)=\alpha(r^\lambda-r^\nu)\sin(\lambda\vartheta)$,  $u(r,\vartheta)=P_{U_{ad}}\big(-\tfrac{p}{\alpha}\big)$,

with exponents $\lambda$ and $\nu$ chosen as in [1]. We set $b=1$ and choose the parameters $\alpha$ and $a<0$ as in [1]. An additional right hand side $f$ is assumed for the state equation.
5. Figure 1 shows the profiles of the numerically computed optimalstate and adjoint state. We present in Figure 2 the triangulations by Algorithm 3.8 after 8 and 10adaptive iterations. We can see that the meshes are concentrated on the reentrant corner wherethe singularities located. −1 −0.5 0 0.5 1−10100.10.20.30.40.5 −1 −0.5 0 0.5 1−1−0.500.5100.010.020.030.040.05
Figure 1.
The profiles of the discretised optimal state y h (left) and adjoint state p h (right) for Example 6.1 on adaptively refined mesh. −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1−0.8−0.6−0.4−0.200.20.40.60.81 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1−0.8−0.6−0.4−0.200.20.40.60.81 Figure 2.
The meshes after 8 (left) and 10 (right) adaptive iterations for Example 6.1, generated by Algorithm 3.8.

In the left plot of Figure 3 we present the convergence history of the errors $\|y-y_h\|$, $\|p-p_h\|$ and $\|u-u_h\|$ (the latter in the $L^2$-norm) on uniformly refined meshes, together with reference lines of slope $-1$ and $-1/2$. In the right plot of Figure 3 we present the convergence behaviour of the optimal control, state and adjoint state, as well as the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations under adaptive refinement. In Figure 4 we present the convergence of the error $\|(y-y_h,p-p_h)\|_a$ and the error indicator $\eta_h((y_h,p_h),\Omega)$ for the two choices of $\theta$, respectively. It can be seen from Figure 4 that the error $\|(y-y_h,p-p_h)\|_a$ is proportional to the a posteriori error estimator, which indicates the efficiency of the a posteriori error estimators given in Section 3. Moreover, we can also observe that the decay of the error $\|(y-y_h,p-p_h)\|_a$ is approximately parallel to the line with slope $-1/2$. For $\|u-u_h\|_{0,\Omega}$ we observe a reduction with slope $-1$, which is better than the result presented in Remark 5.5 and strongly suggests that the proved convergence rate for the optimal control is not optimal.

Figure 3.
The convergence history of the optimal control, state and adjoint stateon uniformly refined meshes (left), and the convergence of the errors and estima-tors on adaptively refined meshes (right) for Example 6.1 generated by Algorithm3.8.
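The slopes reported in the figures are empirical convergence orders read off the error-versus-number-of-elements history in log-log scale. A minimal sketch of how such an order can be computed (names and data are illustrative):

```python
import math

def empirical_rate(num_elements, errors):
    """Slope of log(error) versus log(#elements) between the first and last level.

    A slope of -1/2 matches the optimal energy-norm rate O(N^{-1/2}) in 2D,
    a slope of -1 the L2 rate O(N^{-1})."""
    return (math.log(errors[-1] / errors[0])
            / math.log(num_elements[-1] / num_elements[0]))

# Synthetic error history decaying exactly like N^{-1/2}:
N = [100, 400, 1600, 6400]
err = [n ** -0.5 for n in N]
print(round(empirical_rate(N, err), 2))  # -0.5
```

In practice one would fit all levels, not just the endpoints, but the endpoint slope already reproduces the reference lines shown in the plots.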
Example 6.2.
In the second example we consider an optimal control problem without explicit solutions. We set $\Omega=(-1,1)^2$ and $b=10$; the parameter $\alpha$ is a small power of ten and the lower bound $a$ is negative. The desired state $y_d$ is chosen piecewise constant, positive in the first and second quadrants and negative in the third and fourth.

Similar to the above example, Figure 5 shows the profiles of the numerically computed optimal state and adjoint state. We present in the left plot of Figure 6 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations with parameter $\theta=0.5$. Since there are no explicit solutions, we cannot show the convergence of the error $\|(y-y_h,p-p_h)\|_a$ as in Example 6.1. Instead we show in the right plot of Figure 6 the convergence of the error indicator $\eta_h((y_h,p_h),\Omega)$ and the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations. We can observe the error reduction with slope $-1/2$.

Example 6.3.
In the third example we also consider an optimal control problem without explicit solutions, defined on an L-shaped domain: $\Omega$ is obtained from the square $(-1,1)^2$ by removing a rectangle attached to the origin, so that $\Omega$ has a reentrant corner there. We set $a=0$ and $b=8$, with a small parameter $\alpha$, and take the desired state $y_d=2$.

We show in Figure 7 the profiles of the numerically computed optimal state and adjoint state; singularities of both the state and the adjoint state can be observed around the reentrant corner. We present in the left plot of Figure 8 the triangulation generated by Algorithm 3.8 after 8 adaptive iterations.
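The adaptive loops in all three examples are driven by Algorithm 3.8, whose marking step uses Dörfler's strategy with parameter $\theta$: mark a minimal set of elements carrying at least the fraction $\theta$ of the total squared error indicator. A minimal sketch (function name and data are illustrative, not the paper's implementation):

```python
def doerfler_mark(indicators, theta):
    """Doerfler marking: return indices of a minimal set M of elements with
    sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2."""
    total = sum(eta ** 2 for eta in indicators)
    # Greedily take elements with the largest indicators first.
    order = sorted(range(len(indicators)),
                   key=lambda i: indicators[i], reverse=True)
    marked, acc = [], 0.0
    for i in order:
        marked.append(i)
        acc += indicators[i] ** 2
        if acc >= theta * total:
            break
    return marked

# Four elements with indicators 3, 1, 2, 0.5; theta = 0.5 marks only element 0.
print(doerfler_mark([3.0, 1.0, 2.0, 0.5], theta=0.5))  # [0]
```

A larger $\theta$ marks more elements per loop (fewer, coarser iterations), while a smaller $\theta$ refines more selectively, which is the trade-off behind the two choices of $\theta$ compared in the experiments.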
Figure 4.
The convergence history of the optimal control, the state and adjoint state, and the error indicator on adaptively refined meshes for the two choices of the marking parameter $\theta$ (left and right).

Figure 5.
The profiles of the discretised optimal state $y_h$ (left) and adjoint state $p_h$ (right) for Example 6.2 on the adaptively refined mesh.

The right plot of Figure 8 shows the convergence of the error indicator $\eta_h((y_h,p_h),\Omega)$ and the error estimators $\eta_{y,h}(y_h,\Omega)$ and $\eta_{p,h}(p_h,\Omega)$ for the state and adjoint state equations. We can also observe the error reduction with slope $-1/2$.

Conclusion and outlook
In this paper we give a rigorous convergence analysis of the adaptive finite element algorithm for optimal control problems governed by linear elliptic equations. We prove that the AFEM is a contraction, for the sum of the energy errors and the scaled error estimators of the state $y$ and the adjoint state $p$, between two consecutive adaptive loops. We also show that the AFEM yields a quasi-optimal decay rate of the energy errors of the state $y$ and the adjoint state $p$ plus oscillations of the state and adjoint state equations, in terms of the number of degrees of freedom.

Figure 6. The mesh (left) after 8 adaptive iterations and the convergence history of the error estimators on adaptively refined meshes (right) with $\theta=0.5$.

Figure 7. The profiles of the discretised optimal state $y_h$ (left) and adjoint state $p_h$ (right) for Example 6.3 on the adaptively refined mesh.

We expect that the results should also be valid for optimal Neumann boundary control problems (see [27]) by the following observations. The key point of the convergence analysis is the equivalence property presented in Theorem 3.3, where the relation between the finite element optimal control approximation and the standard finite element boundary value approximation is established. Consider the governing equation of the Neumann boundary control problem:
$$Ly=f\ \ \text{in}\ \Omega,\qquad A\nabla y\cdot n=u\ \ \text{on}\ \partial\Omega.$$
Similar to the proof of Theorem 3.3, we can conclude from the trace theorem that
$$\|u-u_h\|_{0,\partial\Omega}\lesssim \kappa(h)\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
where $u_h$ is the discrete optimal control. Then we can obtain the counterpart of (3.20)-(3.21) for Neumann boundary control problems,
$$\|y-y_h\|_{a,\Omega}=\|y^h-y_h\|_{a,\Omega}+O(\kappa(h))\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
$$\|p-p_h\|_{a,\Omega}=\|p^h-p_h\|_{a,\Omega}+O(\kappa(h))\big(\|y-y_h\|_{a,\Omega}+\|p-p_h\|_{a,\Omega}\big),$$
provided $h\ll 1$. Thus, the convergence and complexity analysis of AFEM carries over to Neumann boundary control problems.

Figure 8. The mesh (left) after 10 adaptive iterations and the convergence history of the error estimators on adaptively refined meshes (right).

There are many important issues that remain unsolved in the convergence analysis of AFEM for optimal control problems, compared with AFEM for boundary value problems. Firstly, at this moment we only prove the optimality of AFEM for the energy errors of the state and adjoint state variables; the convergence for the optimal control $u$ is sub-optimal. To prove the optimality of AFEM for the optimal control $u$ it seems that we should work on the optimality of AFEM for boundary value problems under $L^2$-norms, as done in [12]. This complicates the convergence analysis with additional restrictions on the adaptive algorithms, and is postponed to future work.

Secondly, the convergence analysis of the adaptive finite element algorithm for other kinds of optimal control problems, such as Stokes control problems (see [28]), and for non-standard finite element algorithms such as mixed finite element methods (see [8]), remains open and will be addressed in forthcoming papers.

Thirdly, we only prove the convergence of AFEM for optimal control problems with control constraints using the variational control discretization. The full control discretization concept, using piecewise constant or piecewise linear finite elements, is also very important among the numerical methods for control problems. This kind of control discretization results in an additional discretised control space and an additional contribution to the a posteriori error estimators (see [22]), which should be incorporated into the adaptive algorithm and the corresponding convergence analysis. We intend to generalise the approach of this paper to the convergence analysis of AFEM for optimal control problems with full control discretization in the future.

Acknowledgements
The first author was supported by the National Basic Research Program of China under grant 2012CB821204 and the National Natural Science Foundation of China under grant 11201464. The second author acknowledges the support of the National Natural Science Foundation of China under grant 11171337.

References

[1] T. Apel, A. Rösch and G. Winkler, Optimal control in non-convex domains: a priori discretization error estimates, Calcolo, 44 (2007), pp. 137-158.
[2] I. Babuška and W.C. Rheinboldt, Error estimates for adaptive finite element computations, SIAM J. Numer. Anal., 15 (1978), pp. 736-754.
[3] R. Becker, H. Kapp and R. Rannacher, Adaptive finite element methods for optimal control of partial differential equations: Basic concept, SIAM J. Control Optim., 39 (2000), pp. 113-132.
[4] R. Becker and S.P. Mao, Quasi-optimality of an adaptive finite element method for an optimal control problem, Comput. Methods Appl. Math., 11 (2011), pp. 107-128.
[5] M. Bergounioux, K. Ito and K. Kunisch, Primal-dual strategy for constrained optimal control problems, SIAM J. Control Optim., 37 (1999), pp. 1176-1194.
[6] P. Binev, W. Dahmen and R. DeVore, Adaptive finite element methods with convergence rates, Numer. Math., 97 (2004), pp. 219-268.
[7] J.M. Cascon, C. Kreuzer, R.H. Nochetto and K.G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM J. Numer. Anal., 46 (2008), no. 5, pp. 2524-2550.
[8] Y.P. Chen and W.B. Liu, A posteriori error estimates for mixed finite element solutions of convex optimal control problems, J. Comput. Appl. Math., 211 (2008), no. 1, pp. 76-89.
[9] P.G. Ciarlet, The Finite Element Method for Elliptic Problems, North-Holland, Amsterdam, 1978.
[10] X.Y. Dai, L.H. He and A.H. Zhou, Convergence and quasi-optimal complexity of adaptive finite element computations for multiple eigenvalues, IMA J. Numer. Anal., 2014, DOI:10.1093/imanum/dru059.
[11] X.Y. Dai, J.C. Xu and A.H. Zhou, Convergence and optimal complexity of adaptive finite element eigenvalue computations, Numer. Math., 110 (2008), pp. 313-355.
[12] A. Demlow and R. Stevenson, Convergence and quasi-optimality of an adaptive finite element method for controlling L2 errors, Numer. Math., 117 (2011), no. 2, pp. 185-218.
[13] W. Dörfler, A convergent adaptive algorithm for Poisson's equation, SIAM J. Numer. Anal., 33 (1996), pp. 1106-1124.
[14] A. Gaevskaya, R.H.W. Hoppe, Y. Iliash and M. Kieweg, Convergence analysis of an adaptive finite element method for distributed control problems with control constraints, in Control of Coupled Partial Differential Equations, vol. 155 of Internat. Ser. Numer. Math., Birkhäuser, Basel, 2007, pp. 47-68.
[15] P. Grisvard, Singularities in Boundary Value Problems, Masson, Paris, and Springer-Verlag, Berlin, 1992.
[16] L.H. He and A.H. Zhou, Convergence and complexity of adaptive finite element methods for elliptic partial differential equations, Inter. J. Numer. Anal. Model., 8 (2011), no. 4, pp. 615-640.
[17] M. Hintermüller and R.H.W. Hoppe, Goal-oriented adaptivity in control constrained optimal control of partial differential equations, SIAM J. Control Optim., 47 (2008), no. 4, pp. 1721-1743.
[18] M. Hintermüller, R.H.W. Hoppe, Y. Iliash and M. Kieweg, An a posteriori error analysis of adaptive finite element methods for distributed elliptic control problems with control constraints, ESAIM: Control Optim. Calc. Var., 14 (2008), pp. 540-560.
[19] M. Hintermüller, K. Ito and K. Kunisch, The primal-dual active set strategy as a semismooth Newton method, SIAM J. Optim., 13 (2003), pp. 865-888.
[20] M. Hinze, A variational discretization concept in control constrained optimization: The linear-quadratic case, Comput. Optim. Appl., 30 (2005), pp. 45-63.
[21] M. Hinze, R. Pinnau, M. Ulbrich and S. Ulbrich, Optimization with PDE Constraints, Math. Model. Theo. Appl., 23, Springer, New York, 2009.
[22] K. Kohls, A. Rösch and K.G. Siebert, A posteriori error analysis of optimal control problems with control constraints, SIAM J. Control Optim., 52 (2014), pp. 1832-1861.
[23] K. Kohls, A. Rösch and K.G. Siebert, Convergence of adaptive finite elements for control constrained optimal control problems, Preprint-Nr.: SPP1253-153, 2013.
[24] R. Li, W.B. Liu, H.P. Ma and T. Tang, Adaptive finite element approximation for distributed elliptic optimal control problems, SIAM J. Control Optim., 41 (2002), no. 5, pp. 1321-1349.
[25] J.L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, Berlin, 1971.
[26] W.B. Liu and N.N. Yan, A posteriori error analysis for convex distributed optimal control problems, Adv. Comp. Math., 15 (2001), no. 1-4, pp. 285-309.
[27] W.B. Liu and N.N. Yan, A posteriori error estimates for convex boundary control problems, SIAM J. Numer. Anal., 39 (2001), no. 1, pp. 73-99.
[28] W.B. Liu and N.N. Yan, A posteriori error estimates for optimal problems governed by Stokes equations, SIAM J. Numer. Anal., 40 (2003), pp. 1850-1869.
[29] W.B. Liu and N.N. Yan, A posteriori error estimates for optimal control problems governed by parabolic equations, Numer. Math., 93 (2003), pp. 497-521.
[30] W.B. Liu and N.N. Yan, Adaptive Finite Element Methods for Optimal Control Governed by PDEs, Science Press, Beijing, 2008.
[31] K. Mekchay and R.H. Nochetto, Convergence of adaptive finite element methods for general second order linear elliptic PDEs, SIAM J. Numer. Anal., 43 (2005), pp. 1803-1827.
[32] P. Morin, R.H. Nochetto and K.G. Siebert, Data oscillation and convergence of adaptive FEM, SIAM J. Numer. Anal., 38 (2000), pp. 466-488.
[33] P. Morin, R.H. Nochetto and K.G. Siebert, Convergence of adaptive finite element methods, SIAM Rev., 44 (2002), pp. 631-658.
[34] R. Stevenson, Optimality of a standard adaptive finite element method, Found. Comput. Math., 7 (2007), pp. 245-269.
[35] R. Stevenson, The completion of locally refined simplicial partitions created by bisection, Math. Comput., 77 (2008), pp. 227-241.
[36] R. Verfürth, A Review of a Posteriori Error Estimates and Adaptive Mesh Refinement Techniques, Wiley-Teubner, New York, 1996.