Constraint Splitting and Projection Methods for Optimal Control of Double Integrator
arXiv [math.OC]
Heinz H. Bauschke∗  Regina S. Burachik†  C. Yalçın Kaya‡
April 12, 2018
Dedicated to the memory of Jonathan M. Borwein
Abstract
We consider the minimum-energy control of a car, which is modelled as a point mass sliding on the ground in a fixed direction, and so can be mathematically described as the double integrator. The control variable, representing the acceleration or the deceleration, is constrained by simple bounds from above and below. Despite the simplicity of the problem, it is not possible to find an analytical solution to it, because of the constrained control variable. To find a numerical solution to this problem, we apply three different projection-type methods: (i) Dykstra's algorithm, (ii) the Douglas–Rachford (DR) method and (iii) the Aragón Artacho–Campoy (AAC) algorithm. To the knowledge of the authors, these kinds of (projection) methods have not previously been applied to continuous-time optimal control problems, which are infinite-dimensional optimization problems. The problem we study in this article is posed in infinite-dimensional Hilbert spaces. The behaviour of the DR and AAC algorithms is explored via numerical experiments with respect to their parameters. An error analysis is also carried out numerically, for a particular instance of the problem, for each of the algorithms.
Key words: Optimal control, Dykstra's projection method, Douglas–Rachford method, Aragón Artacho–Campoy algorithm, linear-quadratic optimal control, control constraints, numerical methods.
In this paper, we provide (to the best of our knowledge, also the first) application of various best approximation algorithms to solve a continuous-time optimal control problem. Operator splitting methods were applied previously to discrete-time optimal control problems [17, 24], which are finite-dimensional problems. In [24], for example, the state difference equations comprise the constraint set A, and the box constraints on the state and control variables comprise B. The conditions of belonging to the sets A and B are then appended to the objective function via indicator functions. The original objective function considered in [24] is quadratic in the state and control variables. In the next step in [24], the new objective function is split into its quadratic and convex parts, and the Douglas–Rachford splitting method is applied to solve the problem.

In the current paper, we deal with continuous-time optimal control problems, which are infinite-dimensional optimization problems set in Hilbert spaces. After splitting the constraints of the problem, we apply Dykstra's algorithm [9], the Douglas–Rachford (DR) method [4, 7, 15, 16, 23, 27], and the Aragón Artacho–Campoy (AAC) algorithm [2], all of which solve the underlying best approximation problem. The exposition of the current paper is more in the style of a tutorial.

∗ Department of Mathematics, University of British Columbia, Kelowna, B.C. V1V 1V7, Canada. E-mail: [email protected].
† School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, S.A. 5095, Australia. E-mail: [email protected].
‡ School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, S.A. 5095, Australia. E-mail: [email protected].
We pose the problem of minimum-energy control of a simplified model of a car, amounting to the double integrator, where the control variable has simple lower and upper bounds and the initial and terminal state variables are specified. We split the constraints into two sets, A and B, representing, respectively, the state differential equations (the double integrator) along with their boundary conditions, and the constraints on the control variable. We define two subproblems, one subject to A and the other subject to B. We take advantage of the relatively simple form of the optimal control problem, derive analytical expressions for the optimality conditions, and implement these in defining the projections onto A and B. The solutions of these subproblems provide the projections of a given point in the control variable space onto the constraint sets A and B, respectively, in some optimal way. By performing these projections in the way prescribed by the above-listed algorithms, we can ensure convergence to a solution of the original optimal control problem.

Note that, while the minimum-energy control of the double integrator without any constraints on the control variable can be solved analytically, the same problem with (even simple bound, i.e., box) constraints on the control variable can in general be solved only numerically. This problem should be considered within the framework of control-constrained linear-quadratic optimal control problems, for which new numerical methods are constantly being developed – see, for example, [1, 10] and the references therein.

The current paper is a prototype for future applications of projection methods to solving more general optimal control problems. Indeed, the minimum-energy control of the double integrator is a special case of linear-quadratic optimal control problems; so, with the reporting of the current study, an extension to more general problems will be imminent. The paper is organized as follows.
In Section 2, we state the control-constrained minimum-energy problem for the double integrator and write down the optimality conditions. We provide the analytical solution for the unconstrained problem. For the control-constrained case, we briefly describe the standard numerical approach and consider an instance of the problem which we use in the numerical experiments in the rest of the paper. We define the constraint sets A and B. In Section 3, we provide the expressions for the projections onto A and B. We describe the algorithms in Section 4 and in the beginning of Section 5. In the remaining part of Section 5, we present numerical experiments to study the parametric behaviour of the algorithms as well as the errors in the state and control variables with each algorithm. In Section 6, we provide concluding remarks and list some open problems.

We consider the minimum-energy control of a car with a constrained control variable. Consider the car as a point unit mass, moving on frictionless ground along a fixed line of action. Let the position of the car at time t be given by y(t) and the velocity by ẏ(t) := (dy/dt)(t). By Newton's second law of motion, ÿ(t) = u(t), where u(t) is the sum of all the external forces applied to the car, in this case simply representing the acceleration and deceleration of the car. This differential equation model is referred to as the double integrator in the systems theory literature, since y(t) can be obtained by integrating u(t) twice.

Suppose that the total force on the car, i.e., the acceleration or deceleration of the car, is constrained in magnitude by a > 0. Let x_1 := y and x_2 := ẏ. Then the problem of minimizing the energy of the car, which starts at position x_1(0) = s_0 with velocity x_2(0) = v_0 and finishes at some other position x_1(1) = s_f with velocity x_2(1) = v_f, within one unit of time, can be posed as follows.

(P)  min (1/2) ∫₀¹ u²(t) dt
     subject to  ẋ_1(t) = x_2(t), x_1(0) = s_0, x_1(1) = s_f,
                 ẋ_2(t) = u(t),  x_2(0) = v_0, x_2(1) = v_f,
                 |u(t)| ≤ a.

Here, the functions x_1 and x_2 are referred to as the state variables and u the control variable. As a first step in writing the conditions of optimality for this optimization problem, define the Hamiltonian function H for Problem (P) simply as

H(x_1, x_2, u, λ_1, λ_2) := (1/2) u² + λ_1 x_2 + λ_2 u,  (1)

where λ(t) := (λ_1(t), λ_2(t)) ∈ ℝ² is the adjoint variable (or costate) vector such that (see [19])

λ̇_1 = −∂H/∂x_1  and  λ̇_2 = −∂H/∂x_2.  (2)

The equations in (2) simply reduce to

λ_1(t) = c_1  and  λ_2(t) = −c_1 t − c_2,  (3)

where c_1 and c_2 are real constants. Let the state variable vector be x(t) := (x_1(t), x_2(t)) ∈ ℝ².

Maximum Principle. If u is an optimal control for Problem (P), then there exists a continuously differentiable vector of adjoint variables λ, as defined in (2), such that λ(t) ≠ 0 for all t ∈ [0, t_f], and, for a.e. t ∈ [0, t_f],

u(t) = argmin_{v ∈ [−a, a]} H(x(t), v, λ(t)),  (4)

i.e.,

u(t) = argmin_{v ∈ [−a, a]} (1/2) v² + λ_2(t) v;  (5)

see, e.g., [19]. Condition (5) implies that the optimal control is given by

u(t) = −λ_2(t), if −a ≤ λ_2(t) ≤ a;  u(t) = a, if λ_2(t) ≤ −a;  u(t) = −a, if λ_2(t) ≥ a.  (6)

From (6), we can also conclude that the optimal control u for Problem (P) is continuous. When a is large enough, the control constraint does not become active, so the optimal control is simply −λ_2, and it is a straightforward classroom exercise to find the analytical solution as

u(t) = c_1 t + c_2,
x_1(t) = (1/6) c_1 t³ + (1/2) c_2 t² + v_0 t + s_0,
x_2(t) = (1/2) c_1 t² + c_2 t + v_0,

for all t ∈ [0, 1], where the boundary conditions yield

c_1 = −12 (s_f − s_0) + 6 (v_0 + v_f),  c_2 = 6 (s_f − s_0) − 2 (2 v_0 + v_f).

The solution of an instance of Problem (P), with s_0 = 0, s_f = 0, v_0 = 1, v_f = 0, and large a, say a = 9, is depicted in Figure 1.

[Figure 1: Solution of Problem (P) with large a (so that u(t) is unconstrained), s_0 = 0, s_f = 0, v_0 = 1, v_f = 0. (a) Optimal state variables. (b) Optimal control variable.]

Note that, for all t ∈ [0, 1], λ_2(t) = −u(t) = −6 t + 4 and λ_1(t) = c_1 = 6. The graphs of λ_1 and λ_2 are not displayed for this particular instance.

When a is not so large, say a = 2.5, as we will consider next so that the control constraint becomes active, it is usually not possible to find an analytical solution; i.e., a solution has to be found numerically, as described below.
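The closed-form expressions above are easy to check in code. The following is an illustrative sketch of ours (not from the paper), using the instance s_0 = s_f = v_f = 0, v_0 = 1:

```python
# Sketch (ours): the closed-form unconstrained solution of Problem (P).
# Symbols follow the paper: boundary data s0, sf, v0, vf; constants c1, c2.

def unconstrained_solution(s0, sf, v0, vf):
    """Return c1, c2 and callables u, x1, x2 solving (P) when |u| <= a is inactive."""
    c1 = -12.0 * (sf - s0) + 6.0 * (v0 + vf)
    c2 = 6.0 * (sf - s0) - 2.0 * (2.0 * v0 + vf)
    u = lambda t: c1 * t + c2
    x2 = lambda t: 0.5 * c1 * t**2 + c2 * t + v0
    x1 = lambda t: c1 * t**3 / 6.0 + 0.5 * c2 * t**2 + v0 * t + s0
    return c1, c2, u, x1, x2

# Instance of Figure 1: s0 = sf = vf = 0, v0 = 1 gives u(t) = 6t - 4.
c1, c2, u, x1, x2 = unconstrained_solution(0.0, 0.0, 1.0, 0.0)
print(c1, c2)            # 6.0 -4.0
print(x1(1.0), x2(1.0))  # terminal conditions are met: 0.0 0.0
```

Note that |u(0)| = 4 for this instance, so the constraint |u(t)| ≤ a is indeed inactive precisely when a ≥ 4.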
Numerical Approach.
A straightforward and popular numerical approach to solving Problem (P) is to discretize it over a partition of the time horizon [0, 1] and then use finite-dimensional optimization software to get a discrete (finite-dimensional) solution for the state and control variables x(t) and u(t). The discrete solution is an approximation of the continuous-time solution. This approach is often referred to as the direct method or the (first-)discretize-then-optimize approach. A survey and discussion of Euler discretization of linear-quadratic optimal control problems, and of the convergence of the discretized solutions to their continuous-time counterparts, can be found in [10, Section 5].

Figure 2 depicts the discrete solution of Problem (P) for the instance where a = 2.5, s_0 = 0, s_f = 0, v_0 = 1, v_f = 0. The solution was obtained by pairing up the optimization modelling language AMPL [18] and the finite-dimensional optimization software Ipopt [28]. The number of discretization nodes was taken to be 2000. The multipliers of the (Euler approximation of the) state differential equation constraints are provided by Ipopt when it finds an optimal solution to the discretized (finite-dimensional) problem. These multipliers are plotted in Figure 2(c). It should be noted that the graph of the adjoint variable λ_2(t) given in Figure 2(c) verifies the graph of the optimal control u(t) in Figure 2(b) via the optimal control rule in (6). In Figures 2(b) and (c), the bounds ±a = ±2.5 are also indicated.

Function Spaces.
[Figure 2: Solution of the direct discretization of Problem (P), with a = 2.5, s_0 = 0, s_f = 0, v_0 = 1, v_f = 0. (a) Optimal state variables. (b) Optimal control variable. (c) Adjoint variables.]

For the numerical methods, we consider projection/reflection methods in Hilbert spaces. The spaces associated with Problem (P) are set up as follows. Let q ∈ ℕ and let L²(0, 1; ℝ^q) be the Banach space of Lebesgue measurable functions z : [0, 1] → ℝ^q, t ↦ (z_1(t), …, z_q(t)), with finite L² norm. Namely, define

‖z‖_2 := ( Σ_{i=1}^{q} ‖z_i‖²_2 )^{1/2},  where  ‖z_i‖_2 := ( ∫₀¹ |z_i(t)|² dt )^{1/2}  for i = 1, …, q,

with |·| the modulus or absolute value. In other words,

L²(0, 1; ℝ^q) := { z : [0, 1] → ℝ^q : ‖z‖_2 < ∞ }.

Furthermore, W^{1,2}(0, 1; ℝ^q) is the Sobolev space of absolutely continuous functions, namely

W^{1,2}(0, 1; ℝ^q) := { z ∈ L²(0, 1; ℝ^q) | ż = dz/dt ∈ L²(0, 1; ℝ^q) },

endowed with the norm

‖z‖_{W^{1,2}} := ( Σ_{i=1}^{q} [ ‖z_i‖²_2 + ‖ż_i‖²_2 ] )^{1/2}.

In Problem (P), the state variable x ∈ W^{1,2}(0, 1; ℝ²) and the control variable u ∈ L²(0, 1; ℝ).
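As a quick numerical illustration of ours (not the paper's), the L² norm just defined can be approximated with the composite trapezoidal rule; the test function below is the unconstrained optimal control u(t) = 6t − 4 of the instance in Section 2, whose L² norm is exactly 2:

```python
import numpy as np

# Approximate the L2 norm of z(t) = 6t - 4 on [0, 1] by the trapezoidal rule.
# Grid size is an illustrative choice.
t = np.linspace(0.0, 1.0, 100001)
z = 6.0 * t - 4.0
h = t[1] - t[0]
norm_sq = h * (0.5 * z[0]**2 + np.sum(z[1:-1]**2) + 0.5 * z[-1]**2)
print(np.sqrt(norm_sq))   # ∫_0^1 (6t-4)^2 dt = 4, so the L2 norm is 2
```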
Constraint Splitting.
Next, we split the constraints of Problem (P) into two subsets, A and B. The subset A collects all the feasible control functions satisfying only the dynamics of the car. The subset B, on the other hand, collects all the control functions whose values are constrained by −a and a:

A := { u ∈ L²(0, 1; ℝ) | ∃ x ∈ W^{1,2}(0, 1; ℝ²) which solves ẋ_1(t) = x_2(t), x_1(0) = s_0, x_1(1) = s_f, and ẋ_2(t) = u(t), x_2(0) = v_0, x_2(1) = v_f, for all t ∈ [0, 1] },  (7)

B := { u ∈ L²(0, 1; ℝ) | −a ≤ u(t) ≤ a, for all t ∈ [0, 1] }.  (8)

The rationale behind this sort of splitting is as follows. The problem of minimizing the energy of the car subject to only A or only B is much easier to solve – in fact, the solutions can be written analytically in each case. If, for some given u, a solution exists to the two-point boundary-value problem (TPBVP) in (7), then that solution is unique by the linearity of the TPBVP [3, 26]. Note that a control solution u as in (7) exists by the (Kalman) controllability of the double integrator – see [25]. So the set A is nonempty. Note also that the constraint set A is an affine subspace and B a box. All of the projection methods that we will consider involve projections onto the sets A and B.

The projection onto A from a current iterate u⁻ is the point u solving the following problem:

(P1)  min (1/2) ∫₀¹ (u(t) − u⁻(t))² dt  subject to  u ∈ A.

In (P1), we minimize the squared L²-norm distance between u⁻ and u. The projection onto B from a current iterate u⁻ is similarly the point u solving the following problem:

(P2)  min (1/2) ∫₀¹ (u(t) − u⁻(t))² dt  subject to  u ∈ B.

Proposition 1 (Projection onto A) The projection P_A of u⁻ ∈ L²(0,
1; ℝ) onto the constraint set A, as the solution of Problem (P1), is given by

P_A(u⁻)(t) = u⁻(t) + c_1 t + c_2,  (9)

for all t ∈ [0, 1], where

c_1 = 12 (x_1(1) − s_f) − 6 (x_2(1) − v_f),  (10)
c_2 = −6 (x_1(1) − s_f) + 2 (x_2(1) − v_f),  (11)

and x_1(1) and x_2(1) are obtained by solving the initial value problem

ẋ_1(t) = x_2(t), x_1(0) = s_0,  (12)
ẋ_2(t) = u⁻(t), x_2(0) = v_0,  (13)

for all t ∈ [0, 1].

Proof. The Hamiltonian function for Problem (P1) is

H_1(x_1, x_2, u, λ_1, λ_2, t) := (1/2) (u − u⁻)² + λ_1 x_2 + λ_2 u,

where the adjoint variables λ_1 and λ_2 are defined as in (2), with H replaced by H_1, and their solutions are given as in (3). The optimality condition for Problem (P1) is akin to that in (4) for Problem (P) and, owing to the fact that the control u is now unconstrained, can more simply be written as

∂H_1/∂u (x, u, λ, t) = 0,

which yields the optimal control as u(t) = u⁻(t) − λ_2(t), i.e.,

u(t) = u⁻(t) + c_1 t + c_2,  (14)

for all t ∈ [0, 1], where c_1 and c_2 are found as in (10)–(11). Using (14) in (7) yields the following time-varying, linear two-point boundary-value problem:

ẋ_1(t) = x_2(t), x_1(0) = s_0, x_1(1) = s_f,  (15)
ẋ_2(t) = u⁻(t) + c_1 t + c_2, x_2(0) = v_0, x_2(1) = v_f,  (16)

for all t ∈ [0, 1], in the unknown parameters c_1 and c_2. Once c_1 and c_2 are found, the projected point u in (14) is found. Since Equations (15)–(16) are linear in x_1 and x_2, a simple shooting technique [3, 26] provides the solution for c_1 and c_2 in just one iteration. The essence of this technique is that the initial-value problem (IVP)

∂z_1(t, c)/∂t = z_2(t, c), z_1(0, c) = s_0,  (17)
∂z_2(t, c)/∂t = u⁻(t) + c_1 t + c_2, z_2(0, c) = v_0,  (18)

for all t ∈ [0, 1], is solved so as to make the discrepancy at t = 1 vanish; namely, we seek a parameter c := (c_1, c_2) such that z_1(1, c) − s_f = 0 and z_2(1, c) − v_f = 0. The procedure is as follows.
For a given c, there exists a unique solution z(t, c) := (z_1(t, c), z_2(t, c)) of (17)–(18). Define the near-miss (vector) function φ : ℝ² → ℝ² as follows:

φ(c) := ( z_1(1, c) − s_f , z_2(1, c) − v_f ).  (19)

The Jacobian of the near-miss function is the 2×2 matrix (rows separated by semicolons)

J_φ(c) := [ ∂z_1(1, c)/∂c_1  ∂z_1(1, c)/∂c_2 ;  ∂z_2(1, c)/∂c_1  ∂z_2(1, c)/∂c_2 ].

The shooting method looks for a pair c such that φ(c) = 0, i.e., a pair c such that the final boundary conditions are met. Expanding φ about, say, c = 0, and discarding the terms of order 2 or higher, we obtain

φ(c) ≈ φ(0) + J_φ(0) c.

Substituting φ(c) = 0 in the above expression, replacing "≈" with "=", and re-arranging gives the single (Newton) iteration of the shooting method:

c = −[J_φ(0)]⁻¹ φ(0).  (20)

The components (∂z_i/∂c_j)(1, c), i, j = 1, 2, of J_φ(c) can be obtained by solving the variational equations for (15)–(16) with respect to c_1 and c_2, i.e., by solving the following system for (∂z_i/∂c_j)(·, c):

∂/∂t (∂z_1/∂c_1)(t, c) = (∂z_2/∂c_1)(t, c),  (∂z_1/∂c_1)(0, c) = 0,
∂/∂t (∂z_1/∂c_2)(t, c) = (∂z_2/∂c_2)(t, c),  (∂z_1/∂c_2)(0, c) = 0,
∂/∂t (∂z_2/∂c_1)(t, c) = t,  (∂z_2/∂c_1)(0, c) = 0,
∂/∂t (∂z_2/∂c_2)(t, c) = 1,  (∂z_2/∂c_2)(0, c) = 0.

Elementary calculations lead to the following solution of the above system:

(∂z/∂c)(t, c) = [ t³/6  t²/2 ;  t²/2  t ],

which is independent of c. Hence,

J_φ(0) = (∂z/∂c)(1, 0) = [ 1/6  1/2 ;  1/2  1 ],

with inverse

[J_φ(0)]⁻¹ = [ −12  6 ;  6  −2 ].  (21)

Setting (x_1(·), x_2(·)) := (z_1(·, 0), z_2(·, 0)), substituting (21) and c = 0 into Equation (20), and expanding out, yield (10)–(11). The proof is complete. ✷

Proposition 2 (Projection onto B) The projection P_B of u⁻ ∈ L²(0,
1; ℝ) onto the constraint set B, as the solution of Problem (P2), is given by

P_B(u⁻)(t) = u⁻(t), if −a ≤ u⁻(t) ≤ a;  −a, if u⁻(t) ≤ −a;  a, if u⁻(t) ≥ a,  (22)

for all t ∈ [0, 1].

Proof. The expression in (22) is the straightforward solution of Problem (P2). ✷

In this section, we discuss best approximation algorithms. In the following,

X is a real Hilbert space  (23)

with inner product ⟨·, ·⟩ and induced norm ‖·‖. We also assume that

A is a closed affine subspace of X, and B is a nonempty closed convex subset of X.  (24)

Given z ∈ X, our aim is to find

P_{A∩B}(z),  (25)

the projection of z onto the intersection A ∩ B, which we assume to be nonempty. We also assume that we are able to compute the projectors P_A and P_B onto the constraint sets. Many algorithms are known which could be employed to find P_{A∩B}(z); here, however, we focus on three simple methods that do not require a product space set-up, unlike some of those considered in, e.g., [4, 5, 11, 12]. In the next section, we will numerically test these algorithms with X = L²(0, 1; ℝ), with A and B as defined in (7)–(8), and with z = 0.

We start with Dykstra's algorithm (see [9]), which operates as follows: Set a_0 := z and q_0 := 0. Given a_n and q_n, where n ≥
0, update

b_n := P_B(a_n + q_n),  a_{n+1} := P_A(b_n),  and  q_{n+1} := a_n + q_n − b_n.  (26)

It is known that both (a_n)_{n∈ℕ} and (b_n)_{n∈ℕ} converge strongly to P_{A∩B}(z).

Given β >
0, we specialize the Douglas–Rachford algorithm (see [15], [23] and [16]) to minimize the sum of the two functions f(x) := ι_B(x) + β ‖x − z‖² and g := ι_A, which have the respective proximal mappings (see [4, Proposition 24.8(i)])

Prox_f(x) = P_B( (1/(2β+1)) x + (2β/(2β+1)) z )  and  Prox_g = P_A.

Set λ := 1/(2β+1) ∈ ]0, 1[. Then the operator T := Id − Prox_f + Prox_g(2 Prox_f − Id) turns into

T x = x − P_B(λ x + (1 − λ) z) + P_A( 2 P_B(λ x + (1 − λ) z) − x ).  (27)

Now let x_0 ∈ X and, given x_n ∈ X, where n ≥
0, update

b_n := P_B(λ x_n + (1 − λ) z),  x_{n+1} := T x_n = x_n − b_n + P_A(2 b_n − x_n).  (28)

Then it is known (see [27] or [7]) that (b_n)_{n∈ℕ} converges weakly to P_{A∩B}(z). Note that (28) simplifies to

x_{n+1} := x_n − P_B(λ x_n) + P_A( 2 P_B(λ x_n) − x_n )  provided that z = 0.  (29)

(In the general case of Dykstra's algorithm above, there is also an auxiliary sequence (p_n) associated with A; however, because A is an affine subspace, it is not needed in our setting.)

The Aragón Artacho–Campoy (AAC) Algorithm was recently presented in [2]. Given two fixed parameters α and β in ]0, 1[, its governing operator is

T x = (1 − α) x + α ( 2β ( P_A( 2β (P_B(x + z) − z) − x + z ) − z ) + x + 2β ( z − P_B(x + z) ) )
    = x + 2αβ ( P_A( 2β (P_B(x + z) − z) − x + z ) − P_B(x + z) ).  (30)

Now let x_0 ∈ X and, given x_n ∈ X, where n ≥
0, update

b_n := P_B(x_n + z),  (31)

and

x_{n+1} := T x_n = x_n + 2αβ ( P_A( 2β (b_n − z) − x_n + z ) − b_n ).  (32)

By [2, Theorem 4.1(iii)], the sequence (b_n)_{n∈ℕ} converges strongly to P_{A∩B}(z), provided that z − P_{A∩B}(z) ∈ (N_A + N_B)(P_{A∩B}(z)). Note that (32) simplifies to

x_{n+1} := T x_n = x_n + 2αβ ( P_A(2β P_B x_n − x_n) − P_B x_n )  provided that z = 0.  (33)

In this section, we gather the algorithms considered abstractly above and explain how we implemented them. We start with Dykstra's algorithm from Subsection 4.1.
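As a concrete illustration before the listings, here is a discretized sketch of Dykstra's iteration (26) in our own Python code (not the authors'), with P_A and P_B implemented from Propositions 1 and 2 via Euler integration of (12)–(13); the instance data, grid size and tolerance are illustrative choices:

```python
import numpy as np

# Discretized sketch (ours) of Dykstra's iteration (26) for Problem (P).
N = 2000
h = 1.0 / N
t = np.linspace(0.0, 1.0, N, endpoint=False)   # grid points t_0, ..., t_{N-1}
s0, sf, v0, vf, a = 0.0, 0.0, 1.0, 0.0, 2.5

def terminal_state(u):
    """Euler-integrate (12)-(13) and return (x1(1), x2(1))."""
    x2 = v0 + h * np.concatenate(([0.0], np.cumsum(u)))   # x2 at t_0, ..., t_N
    x1 = s0 + h * np.sum(x2[:-1])                         # x1 at t_N
    return x1, x2[-1]

def proj_A(u):
    """Projection onto A, formulas (9)-(11)."""
    x1N, x2N = terminal_state(u)
    c1 = 12.0 * (x1N - sf) - 6.0 * (x2N - vf)
    c2 = -6.0 * (x1N - sf) + 2.0 * (x2N - vf)
    return u + c1 * t + c2

def proj_B(u):
    """Projection onto B, formula (22): pointwise truncation to [-a, a]."""
    return np.clip(u, -a, a)

def dykstra(z, eps=1e-6, kmax=50000):
    """Dykstra's iteration (26) started at z; returns the B-feasible iterate b_n."""
    u, q = z.copy(), np.zeros_like(z)
    for _ in range(kmax):
        b = proj_B(u + q)
        u_new = proj_A(b)
        q = u + q - b
        if np.abs(u_new - u).max() <= eps:
            return b
        u = u_new
    return b

u_star = dykstra(np.zeros(N))   # approximates P_{A∩B}(0), the solution of (P)
print(np.abs(u_star).max())     # ≈ 2.5: the bound is active near t = 0
```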
Algorithm 1 (Dykstra).
Step 1 (Initialization) Choose the initial iterates u_0 = 0 and q_0 = 0. Choose a small parameter ε > 0, and set k = 0.
Step 2 (Projection onto B) Set u⁻ = u_k + q_k. Compute ũ = P_B(u⁻) by using (22).
Step 3 (Projection onto A) Set u⁻ := ũ. Compute û = P_A(u⁻) by using (9).
Step 4 (Update) Set u_{k+1} := û and q_{k+1} := u_k + q_k − ũ.
Step 5 (Stopping criterion) If ‖u_{k+1} − u_k‖_{L∞} ≤ ε, then return ũ and stop. Otherwise, set k := k + 1 and go to Step 2.

Next is the Douglas–Rachford method from Subsection 4.2.

Algorithm 2 (DR).
Step 1 (Initialization) Choose a parameter λ ∈ ]0, 1[ and the initial iterate u_0 arbitrarily. Choose a small parameter ε > 0, and set k = 0.
Step 2 (Projection onto B) Set u⁻ = λ u_k. Compute ũ = P_B(u⁻) by using (22).
Step 3 (Projection onto A) Set u⁻ := 2ũ − u_k. Compute û = P_A(u⁻) by using (9).
Step 4 (Update) Set u_{k+1} := u_k + û − ũ.
Step 5 (Stopping criterion) If ‖u_{k+1} − u_k‖_{L∞} ≤ ε, then return ũ and stop. Otherwise, set k := k + 1 and go to Step 2.

Finally, we describe the Aragón Artacho–Campoy algorithm from Subsection 4.3.

Algorithm 3 (AAC).
Step 1 (Initialization) Choose the initial iterate u_0 arbitrarily. Choose a small parameter ε > 0, two parameters α and β in ]0, 1[, and set k = 0.
Step 2 (Projection onto B) Set u⁻ = u_k. Compute ũ = P_B(u⁻) by using (22).
Step 3 (Projection onto A) Set u⁻ = 2βũ − u_k. Compute û = P_A(u⁻) by using (9).
Step 4 (Update) Set u_{k+1} := u_k + 2αβ(û − ũ).
Step 5 (Stopping criterion) If ‖u_{k+1} − u_k‖_{L∞} ≤ ε, then return ũ and stop. Otherwise, set k := k + 1 and go to Step 2.

It appears that the constraint qualification required for the convergence of the AAC algorithm is not easy to check in our setting.

We provide another version of each of Algorithms 1–3, as Algorithms 1b–3b, in Appendix A. In Algorithm 1b, we monitor the sequence of iterates which are the projections onto the set A, instead of monitoring the projections onto the set B as in Algorithm 1. In Algorithms 2b–3b, on the other hand, the order in which the projections are carried out is reversed: the first projection is onto the set A and the second onto B.

Although the order of projections does not matter in view of the existing results stating that convergence is achieved under any order – see [6, Proposition 2.5(i)] – the order does make a difference in early iterations (as well as in the number of iterations required for the convergence of Algorithms 2 and 2b, as we will elaborate on later). If our intent is to stop the algorithm early so that we can use the current iterate as an initial guess in more accurate computational optimal control algorithms, which can find the junction times with high precision (see [20–22]), then it is desirable to implement Algorithms 1–3 above, rather than Algorithms 1b–3b, because any iterate of Algorithms 1–3 will satisfy the constraints on the control variable, while that of Algorithms 1b–3b in general will not.

In what follows, we study the workings of Algorithms 1–3 for an instance of Problem (P). Suppose that the car is initially at a reference position 0 and has unit speed. It is desired that the car come back to the reference position and be at rest after one unit of time; namely, s_0 = 0, s_f = 0, v_0 = 1, v_f = 0.
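Under the same discretization as before, Algorithms 2 and 3 can be sketched as follows (our code, not the authors', with z = 0 as in (29) and (33); the bound a = 2.5 is the one used in the sequel, while the parameter values λ = 0.6, α = 0.9, β = 0.8, the grid size and the tolerance are illustrative choices, not the tuned values reported below):

```python
import numpy as np

# Sketch (ours) of Algorithms 2 (DR) and 3 (AAC) on the instance just described.
N = 2000
h = 1.0 / N
t = np.linspace(0.0, 1.0, N, endpoint=False)
s0, sf, v0, vf, a = 0.0, 0.0, 1.0, 0.0, 2.5

def terminal_state(u):
    x2 = v0 + h * np.concatenate(([0.0], np.cumsum(u)))
    return s0 + h * np.sum(x2[:-1]), x2[-1]

def proj_A(u):                         # Proposition 1, Euler-discretized
    x1N, x2N = terminal_state(u)
    c1 = 12.0 * (x1N - sf) - 6.0 * (x2N - vf)
    c2 = -6.0 * (x1N - sf) + 2.0 * (x2N - vf)
    return u + c1 * t + c2

def proj_B(u):                         # Proposition 2
    return np.clip(u, -a, a)

def dr(lam=0.6, eps=1e-6, kmax=50000):         # Algorithm 2 with z = 0, cf. (29)
    u = np.zeros(N)
    for _ in range(kmax):
        b = proj_B(lam * u)
        u_new = u - b + proj_A(2.0 * b - u)
        if np.abs(u_new - u).max() <= eps:
            return b
        u = u_new
    return b

def aac(alpha=0.9, beta=0.8, eps=1e-6, kmax=50000):   # Algorithm 3 with z = 0, cf. (33)
    u = np.zeros(N)
    for _ in range(kmax):
        b = proj_B(u)
        u_new = u + 2.0 * alpha * beta * (proj_A(2.0 * beta * b - u) - b)
        if np.abs(u_new - u).max() <= eps:
            return b
        u = u_new
    return b

u_dr, u_aac = dr(), aac()
# Both approximate P_{A∩B}(0), the constrained minimum-energy control.
print(np.abs(u_dr - u_aac).max())
```

Both returned iterates are outputs of P_B, so they satisfy the control bounds by construction; the tests below also check that they (approximately) satisfy the boundary conditions.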
For these boundary conditions, no solution exists if the control variable bound a is taken too small; a solution does exist with the bound a = 2.5, and that is the value we use in what follows. Aragón Artacho and Campoy give recommendations in [2] for choosing the parameters α and β of their algorithm. In the ensuing discussions, we use a fixed small stopping tolerance ε, unless otherwise stated.

Discretization.
Algorithms 1–3, as well as 1b–3b, carry out iterations with functions. For computations, we consider discrete approximations of the functions over the partition 0 = t_0 < t_1 < … < t_N = 1 such that t_{i+1} = t_i + h, i = 0, 1, …, N−1, where h := 1/N and N is the number of subdivisions. Let u_i be an approximation of u(t_i), i.e., u_i ≈ u(t_i), i = 0, 1, …, N−1; similarly, x_{1,i} ≈ x_1(t_i) and x_{2,i} ≈ x_2(t_i), or x_i := (x_{1,i}, x_{2,i}) ≈ x(t_i), i = 0, 1, …, N. In other words, the function u is approximated by the N-dimensional array u_h, with components u_i, i = 0, 1, …, N−1, and the functions x_1 and x_2 by the (N+1)-dimensional arrays x_{1,h} and x_{2,h}, with components x_{1,i} and x_{2,i}, i = 0, 1, …, N, respectively. We define a discretization P_A^h of the projection P_A as follows:

P_A^h(u_h⁻) = u_h⁻ + c_1 t_h + c_2,  (34)

where t_h := (t_0, t_1, …, t_{N−1}),

c_1 = 12 (x_{1,N} − s_f) − 6 (x_{2,N} − v_f),  (35)
c_2 = −6 (x_{1,N} − s_f) + 2 (x_{2,N} − v_f),  (36)

and x_{1,N} and x_{2,N} are obtained from the Euler discretization of (12)–(13): given x_{1,0} = s_0 and x_{2,0} = v_0,

x_{1,i+1} = x_{1,i} + h x_{2,i},  (37)
x_{2,i+1} = x_{2,i} + h u_i⁻,  (38)

for i = 0, 1, …, N−1. A discretization P_B^h of the projection P_B can be defined in a straightforward manner, by simply replacing u⁻ in (22) with the discrete components u_i⁻ of u_h⁻.

Parametric Behaviour.
Obviously, the behaviour of Algorithms 2 and 2b, the Douglas–Rachford method, depends on the parameter λ, and the behaviour of Algorithms 3 and 3b on the two parameters α and β. Figure 3 displays how the number of iterations needed for convergence depends on these parameters, for various values of a. The dependence for a given value of a appears to be continuous, albeit with the presence of downward spikes.

[Figure 3: Numerical experiments with s_0 = 0, s_f = 0, v_0 = 1, v_f = 0. (a) Algorithm 2 (DR). (b) Algorithms 3 (AAC) and 3b (AAC-b) for α = 1. (c) Algorithm 2b (DR-b). (d) Algorithms 3 (AAC) and 3b (AAC-b).]

The graphs for Algorithms 2 and 2b, shown in parts (a) and (c) of Figure 3, respectively, differ significantly from one another. The bound a = 4 corresponds to the case when the control constraint becomes active only at t = 0 – see Figure 1; in other words, when a > 4, the control constraint is never active. For a = 4, the best value of λ is 1 for Algorithm 2, yielding the solution in just 6 iterations. For Algorithm 2b, the best value of λ is 0.5, as can be seen in (c), producing the solution in 30 iterations. Going back to Algorithm 2, with decreasing values of a, the values of λ minimizing the number of iterations shift to the right. For a = 2.5, for example, the minimum number of iterations is 91, attained at a value of λ near which the iteration count is rather sensitive: rounding that λ off yields 444 iterations for convergence. The number of iterations is less sensitive around a second, local minimizer of the iteration count, which results in 132 iterations. It is interesting to note that the graph for a = 4 appears to be an envelope for the number of iterations over all λ ∈ ]0, 1[.

Part (d) of Figure 3 displays, for Algorithms 3 and 3b, the number of iterations as a surface over the parameters α and β, for the same values of a as in the rest of the graphs in the figure. It is interesting to observe that the surfaces appear cascaded, with (roughly) the outermost surface being the one corresponding to a = 4. The surface plot suggests that, for a minimum number of iterations, one must have α = 1. Although the theory requires α < 1, the choice α = 1 seems to cause no concerns in this particular instance; so, we set α = 1 for the rest of the paper. The cross-sectional curves at α = 1 are shown with much more precision in part (b) of the figure. The spikes observed in part (d) can also be seen in the graph in part (b).

In fact, the first observation to make here is that, for a = 4, convergence can be achieved in merely one iteration, with β = 0.5. This is quite remarkable, compared with Algorithms 2 and 2b. The graphs in (b) appear to be enveloped as well by the graph for a = 4, as in part (c). For values of a other than 4, the globally minimum number of iterations seems to be achieved at a downward spike and, as a result, is very sensitive to changes in β. For example, for a = 2.5, the optimal β value is 0.78249754, for a minimum of 35 iterations; the rounded-off value β = 0.782 results in 111 iterations, and coarser roundings of β increase the iteration count further.

Behaviour in Early Iterations.
Figure 4(a)–(c) illustrates the workings of all three algorithms for the same instance. All three algorithms converge to the optimal solution under the stopping tolerance ε, with the optimal values of the algorithmic parameters found in the parametric study above: the best λ for Algorithm 2, and α = 1 with the best β for Algorithm 3.

Error Analysis via Numerical Experiments.
For a fixed value of N, Algorithms 1–3 converge only to some approximate solution of the original problem. Therefore, the question as to how the algorithms behave as the time partition is refined, i.e., as N is increased, needs to be investigated. For the purpose of a numerical investigation we define, at the kth iteration, the following errors. Suppose that the pair (u*, x*) is the optimal solution of Problem (P) and (u_h^k, x_h^k) an approximate solution of Problem (P) at the kth iteration of a given algorithm. Define

σ_u^k := max_{0 ≤ i ≤ N−1} |u_i^k − u*(t_i)|  and  σ_x^k := max_{0 ≤ i ≤ N} ‖x_i^k − x*(t_i)‖_∞,

where ‖·‖_∞ is the ℓ∞-norm in ℝ². For large N, these expressions are reminiscent of the L∞-norm, and therefore they will be referred to as the L∞-error.

For (u*, x*) in the error expressions, we have used the discretized (approximate) solution obtained for the Euler-discretized Problem (P) utilizing the Ipopt–AMPL suite, with a much finer partition and a much tighter tolerance.

For N = 2000, these errors are depicted in Figure 4(d) and (e). From the graphs it is immediately clear that, no matter how much smaller the stopping tolerance is selected, the error achievable with N = 2000 plateaus, for the control variable as well as for the state variable vector. In fact, the graphs also tell us that a larger stopping threshold would have achieved the same approximation to the continuous-time solution of Problem (P): by just looking at the graphs, one can see that Algorithm 1 could have been run for only about 300 iterations instead of 530, and Algorithms 2 and 3 for about 50 iterations, to achieve the best possible approximation with N = 2000.

In Figure 5, we depict the same errors for N = 10³ (parts (a) and (b)), N = 10⁴ (parts (c) and (d)) and N = 10⁵ (parts (e) and (f)).
It is observed that, with a ten-fold increase in N (which is a ten-fold decrease in h), the errors in both u and x are reduced ten-fold, implying that the error (both in x and in u) depends linearly on the stepsize h. This is in line with the theory of Euler discretization of optimal control problems; see, for example, [13, 14]. Furthermore, even for very large values of N, it can be seen from these graphs that a stopping threshold slightly smaller than 10− would suffice to get even more stringent error levels, such as around 10− for the control variable and around 10− for the state variable vector. A larger stopping threshold would obviously result in a smaller number of iterations.

Figure 4: Numerical experiments with a = 2., s0 = 0, sf = 0, v0 = 1, vf = 0, and the number of discretization subintervals N = 2000. The graphs in (a)–(c) show approximations of the optimal control function with Algorithms 1–3, after k = 3, , iterations, with ε = 10−: (a) Algorithm 1 (Dykstra), which stops after 530 iterations; (b) Algorithm 2 (DR, λ = 0.); (c) Algorithm 3 (AAC, α = 1, β = 0.). All algorithms are observed to converge, at various rates, to the optimal solution indicated by k → ∞. The semi-log graphs in (d) and (e) show the L∞-errors in the control and state variables, respectively, in each iteration of the three algorithms.

Figure 5: Numerical experiments with a = 2., s0 = 0, sf = 0, v0 = 1, vf = 0. The semi-log graphs show the L∞-errors in the control (panels (a), (c), (e)) and state (panels (b), (d), (f)) variables in each iteration of the three algorithms, with various N from coarse (N = 1000) to fine (N = 100000).

Table 1: Least errors that can be achieved by Algorithms 1–3 and Ipopt, with ε = 10−: (a) the L∞-error in control, σ_u^k, and (b) the L∞-error in states, σ_x^k, tabulated against N for each of Dykstra, DR, AAC and Ipopt.

Table 1 displays the values of the errors, separately in u and x, after the stopping criterion with ε = 10− was satisfied, for each of the three algorithms. A precise 10-fold reduction in error with a 10-fold increase in N can be verified with these numbers, as discussed in the previous paragraph. We have added the experiments we carried out with Ipopt, version 3.12, an interior point optimization software [28], which solved the direct Euler discretization of Problem (P), with the same values of N and the same tolerance 10−. Ipopt, running with the linear solver MA57, was paired with the optimization modelling language AMPL [18]. The same 10-fold decreases in error cannot be observed with Ipopt, unless one sets its tolerance much smaller than 10−, say 10− (which also means longer computational times). With the tolerance set at 10−, the error values with Ipopt become pretty much the same as those with Dykstra (still with ε = 10−), which is interesting to note. As we pointed out earlier, the same errors listed in Table 1 can be achieved with bigger stopping thresholds.
For N = 1000, 10000 and 100000, respectively: with ε = 10−, 10−, 10−, Algorithm 1 converges in 281, 359 and 454 iterations; with ε = 10−, 10−, 10−, Algorithm 2 converges in 65, 50 and 101 iterations; and with ε = 10−, 10−, 10−, Algorithm 3 converges in 49, 60 and 70 iterations.

In Table 2, the CPU times (in seconds) each algorithm takes, with the respective ε values listed above, are tabulated. Note that Algorithms 1–3 were coded and run in Matlab, 64-bit (maci64) version R2017b. All software, including AMPL and Ipopt, was run on a MacBook Pro with operating system macOS Sierra version 10.12.6, a 3.3 GHz Intel Core i7 processor and 6 GB of 2133 MHz LPDDR3 memory. In Table 2, the CPU times for Ipopt are listed with the tolerance 10−, since only with this fine a tolerance is it possible to obtain error magnitudes of the same order as those obtained by Algorithms 1–3. With ε = 10−, the CPU times for Ipopt are 0.06, 0.45 and 4.4 seconds, respectively, which are significantly higher than the times taken by Algorithms 1–3, in addition to yielding worse errors. Numerical observations suggest two joint winners, in both accuracy and speed: Algorithms 2 and 3, i.e., the Douglas–Rachford method and the Aragón Artacho–Campoy algorithm.

Table 2: CPU times taken by Algorithms 1–3 and Ipopt, tabulated against N. For N = 1000, 10000 and 100000, respectively, ε = 10−, 10−, 10− for Algorithm 1, ε = 10−, 10−, 10− for Algorithm 2, and ε = 10−, 10−, 10− for Algorithm 3 have been used. The tolerance for Ipopt was set as 10−.

Concluding Remarks. We have applied three well-known projection methods to solve an optimal control problem, namely the control-constrained minimum-energy control of the double integrator. We have derived the projectors for the optimal control problem and demonstrated that they can be used effectively in Dykstra's algorithm, the Douglas–Rachford (DR) method and the Aragón Artacho–Campoy (AAC) algorithm.
We carried out extensive numerical experiments for an instance of the problem and concluded that the DR and AAC algorithms (Algorithms 2 and 3) were jointly the most successful. We also made comparisons with the standard discretization approach, only to witness the benefit of using projection methods.

It is interesting to note that when we apply alternating projections, we also seem to converge to P_{A∩B}(0), even though this is not supported by existing theory.

To the best of the authors' knowledge, the current paper constitutes the first of its kind involving projection methods for continuous-time optimal control problems. It can be considered a prototype for future studies in this direction. Some possible directions are listed as follows.

• The setting we have introduced could be extended to general control-constrained linear-quadratic problems.

• We have used a discretization of the projector as well as the associated IVP in (34)–(38). This might be extended to problems in more general form. On the other hand, for the particular problem we have dealt with in the present paper, one might take into account the fact that if u−(t) is piecewise linear then its projection is piecewise linear. This might simplify further the expressions given in Proposition 1.

• Although theory for projection methods can in principle vouch for convergence only in convex problems, it is well known that the DR method can be successful for nonconvex problems; see, for example, [8]. It would be interesting to extend the formulations in the current paper to nonconvex optimal control problems.

• For a certain value of an algorithmic parameter, Figure 3 exhibits downward spikes. It would be interesting to see whether this phenomenon is also observed in other control-constrained optimal control problems.

Appendix
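To make the algorithm listings in this appendix concrete, the following self-contained sketch runs all three b-variants on a hypothetical finite-dimensional toy problem of our own: A is the line x1 = x2 and B the box [1, 2] × [0, 3] in ℝ², with elementary projectors standing in for the paper's P_A in (9) and P_B in (22). Dykstra's iterates converge to the nearest point of A ∩ B to the starting point, while the DR and AAC shadow iterates ũ = P_A(u^k) land in A ∩ B. The DR relaxation is written as u^{k+1} = u^k + 2λ(û − ũ), which at λ = 1/2 is the classical Douglas–Rachford step.

```python
# Toy sketch of the three b-variants on hypothetical stand-in sets:
#   A = {(x, y) : x = y}  (projector: average the coordinates)
#   B = [1, 2] x [0, 3]   (projector: componentwise clipping)
def add(u, v):   return (u[0] + v[0], u[1] + v[1])
def sub(u, v):   return (u[0] - v[0], u[1] - v[1])
def scale(c, u): return (c * u[0], c * u[1])
def norm(u):     return (u[0] ** 2 + u[1] ** 2) ** 0.5

def P_A(u):
    m = (u[0] + u[1]) / 2
    return (m, m)

def P_B(u):
    return (min(max(u[0], 1.0), 2.0), min(max(u[1], 0.0), 3.0))

def dykstra(u0, tol=1e-10, kmax=100000):
    """Boyle-Dykstra iteration; converges to the projection of u0 onto A∩B."""
    x, p, q = u0, (0.0, 0.0), (0.0, 0.0)
    for _ in range(kmax):
        y = P_A(add(x, p)); p = sub(add(x, p), y)    # A-step with increment p
        xn = P_B(add(y, q)); q = sub(add(y, q), xn)  # B-step with increment q
        if norm(sub(xn, x)) <= tol:
            return xn
        x = xn
    return x

def douglas_rachford(u0, lam=0.5, tol=1e-10, kmax=100000):
    """DR-b; the returned shadow P_A(u_k) approaches a point of A∩B."""
    u = u0
    for _ in range(kmax):
        ut = P_A(u)                                  # u-tilde
        uh = P_B(sub(scale(2.0, ut), u))             # u-hat = P_B(2*ut - u)
        un = add(u, scale(2.0 * lam, sub(uh, ut)))   # relaxed DR update
        if norm(sub(un, u)) <= tol:
            return P_A(un)
        u = un
    return P_A(u)

def aac(u0, alpha=0.9, beta=0.8, tol=1e-10, kmax=100000):
    """AAC-b (averaged alternating modified reflections)."""
    u = u0
    for _ in range(kmax):
        ut = P_A(u)
        uh = P_B(sub(scale(2.0 * beta, ut), u))          # modified reflection
        un = add(u, scale(2.0 * alpha * beta, sub(uh, ut)))
        if norm(sub(un, u)) <= tol:
            return P_A(un)
        u = un
    return P_A(u)
```

Starting from u0 = (0, 0), Dykstra returns (approximately) the nearest intersection point (1, 1), whereas the DR and AAC shadows merely reach some point of A ∩ B, mirroring the distinction between best approximation and plain feasibility.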
Algorithm 1b (Dykstra-b)

Steps 1–4 (Initialization) Do as in Steps 1–4 of Algorithm 1.

Step 5 (Stopping criterion) If ‖u^{k+1} − u^k‖_{L∞} ≤ ε, then return u^{k+1} and stop. Otherwise, set k := k + 1 and go to Step 2.

Algorithm 2b (DR-b)

Step 1 (Initialization) Choose a parameter λ ∈ ]0, 1[ and the initial iterate u^0 arbitrarily. Choose a small parameter ε > 0, and set k = 0.

Step 2 (Projection onto A) Set u^- := u^k. Compute ũ = P_A(u^-) by using (9).

Step 3 (Projection onto B) Set u^- := 2ũ − u^k. Compute û = P_B(u^-) by using (22).

Step 4 (Update) Set u^{k+1} := u^k + 2λ(û − ũ).

Step 5 (Stopping criterion) If ‖u^{k+1} − u^k‖_{L∞} ≤ ε, then return ũ and stop. Otherwise, set k := k + 1 and go to Step 2.

Algorithm 3b (AAC-b)

Step 1 (Initialization) Choose the initial iterate u^0 arbitrarily. Choose a small parameter ε > 0 and two parameters α and β in ]0, 1[, and set k = 0.

Step 2 (Projection onto A) Set u^- := u^k. Compute ũ = P_A(u^-) by using (9).

Step 3 (Projection onto B) Set u^- := 2βũ − u^k. Compute û = P_B(u^-) by using (22).

Step 4 (Update) Set u^{k+1} := u^k + 2αβ(û − ũ).

Step 5 (Stopping criterion) If ‖u^{k+1} − u^k‖_{L∞} ≤ ε, then return ũ and stop. Otherwise, set k := k + 1 and go to Step 2.

References

[1]
W. Alt, C. Y. Kaya and C. Schneider, Dualization and discretization of linear-quadratic control problems with bang–bang solutions. EURO J. Comput. Optim., 4 (2016), 47–77.

[2] F. J. Aragón Artacho and R. Campoy, A new projection method for finding the closest point in the intersection of convex sets. Comput. Optim. Appl., 69 (2018), 99–132.

[3] U. M. Ascher, R. M. M. Mattheij and R. D. Russell, Numerical Solution of Boundary Value Problems for Ordinary Differential Equations. SIAM Publications, Philadelphia, 1995.

[4] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, second edition. Springer, 2017.

[5] H. H. Bauschke and V. R. Koch, Projection methods: Swiss Army knives for solving feasibility and best approximation problems with halfspaces. Infinite Products of Operators and Their Applications, 2012, 1–40.

[6] H. H. Bauschke and W. M. Moursi, On the order of the operators in the Douglas–Rachford algorithm. Optimization Letters, 10 (2016), 447–455.

[7] H. H. Bauschke and W. M. Moursi, On the Douglas–Rachford algorithm. Math. Program., Ser. A, 164 (2017), 263–284.

[8] J. M. Borwein and B. Sims, The Douglas–Rachford algorithm in the absence of convexity. In: H. Bauschke, R. Burachik, P. Combettes, V. Elser, D. Luke and H. Wolkowicz (eds), Fixed-Point Algorithms for Inverse Problems in Science and Engineering, Springer Optimization and Its Applications, vol. 49. Springer, New York, NY.

[9] J. P. Boyle and R. L. Dykstra, A method for finding projections onto the intersection of convex sets in Hilbert spaces. In: Advances in Order Restricted Statistical Inference, Lecture Notes in Statistics, vol. 37. Springer, 1986, 28–47.

[10] R. S. Burachik, C. Y. Kaya and S. N. Majeed, A duality approach for solving control-constrained linear-quadratic optimal control problems. SIAM J. Control Optim., 52 (2014), 1771–1782.

[11] P. L. Combettes, A block-iterative surrogate constraint splitting method for quadratic signal recovery. IEEE Trans. Sig. Proc., 51 (2003), 2432–2442.

[12] P. L. Combettes, Iterative construction of the resolvent of a sum of maximal monotone operators. J. Convex Anal., 16 (2009), 727–748.

[13] A. L. Dontchev and W. W. Hager, The Euler approximation in state constrained optimal control. Math. Comp., 70 (2001), 173–203.

[14] A. L. Dontchev, W. W. Hager and K. Malanowski, Error bound for Euler approximation of a state and control constrained optimal control problem. Numer. Funct. Anal. Optim., 21 (2000), 653–682.

[15] J. Douglas and H. H. Rachford, On the numerical solution of heat conduction problems in two and three space variables. Trans. Amer. Math. Soc., 82 (1956), 421–439.

[16] J. Eckstein and D. P. Bertsekas, On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program., Ser. A, 55 (1992), 293–318.

[17] J. Eckstein and M. C. Ferris, Operator-splitting methods for monotone affine variational inequalities, with a parallel application to optimal control. INFORMS J. Comput., 10 (1998), 218–235.

[18] R. Fourer, D. M. Gay and B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, second edition. Brooks/Cole Publishing Company / Cengage Learning, 2003.

[19] M. R. Hestenes, Calculus of Variations and Optimal Control Theory. John Wiley & Sons, New York, 1966.

[20] C. Y. Kaya, S. K. Lucas and S. T. Simakov, Computations for bang–bang constrained optimal control using a mathematical programming formulation. Optim. Contr. Appl. Meth., 25(6) (2004), 295–308.

[21] C. Y. Kaya and J. L. Noakes, Computations and time-optimal controls. Optim. Contr. Appl. Meth., 17 (1996), 171–185.

[22] C. Y. Kaya and J. L. Noakes, Computational algorithm for time-optimal switching control. J. Optim. Theory App., 117 (2003), 69–92.

[23] P.-L. Lions and B. Mercier, Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal., 16 (1979), 964–979.

[24] B. O'Donoghue, G. Stathopoulos and S. Boyd, A splitting method for optimal control. IEEE Trans. Contr. Sys. Tech., 21 (2013), 2432–2442.

[25] W. J. Rugh, Linear System Theory, second edition. Pearson, 1995.

[26] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, second edition. Springer-Verlag, New York, 1993.

[27] B. F. Svaiter, On weak convergence of the Douglas–Rachford method. SIAM J. Control Optim., 49 (2011), 280–287.

[28]