Semi-Definite Relaxation Based ADMM for Cooperative Planning and Control of Connected Autonomous Vehicles
Xiaoxue Zhang, Zilong Cheng, Jun Ma, Sunan Huang, Frank L. Lewis, Life Fellow, IEEE, and Tong Heng Lee
Abstract—This paper investigates the cooperative planning and control problem for multiple connected autonomous vehicles (CAVs) in different scenarios. Most methods in the existing literature suffer from significant problems in computational efficiency. Besides, as the optimization problem is nonlinear and nonconvex, it typically poses great difficulty in determining the optimal solution. To address this issue, this work proposes a novel and completely parallel computation framework by leveraging the alternating direction method of multipliers (ADMM). The nonlinear and nonconvex optimization problem in the autonomous driving task can be divided into two manageable subproblems, and the resulting subproblems can be solved with effective optimization methods in a parallel framework. Here, the differential dynamic programming (DDP) algorithm addresses the nonlinearity of the system dynamics rather effectively, and the nonconvex coupling constraints with small dimensions can be approximated by invoking the notion of semi-definite relaxation (SDR), which can also be solved in a very short time. Due to the parallel computation and the efficient relaxation of the nonconvex constraints, the proposed approach realizes real-time implementation and thus provides extra assurance of driving safety. In addition, two transportation scenarios for multiple CAVs are used to illustrate the effectiveness and efficiency of the proposed method.
Index Terms—Autonomous driving, multi-agent system, model predictive control (MPC), alternating direction method of multipliers (ADMM), semi-definite relaxation (SDR), cooperative planning and control (CPaC).
I. INTRODUCTION
With the rapid development of information and communication technologies as well as the improvement of computational resources, connected automated vehicles (CAVs) have become one of the critical components in the context of intelligent transportation systems [1]. Generally, vehicles are equipped with advanced sensors that provide detailed information about the driving environment, as well as onboard computing chips for efficient computation. Besides, vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication modules are equipped to enable the sharing of information with other participants through vehicular ad hoc networks [2]. All these technologies and equipment make substantial contributions to the development of intelligent transportation systems. However, cooperative planning and control (CPaC) in dense traffic flow is still a long-standing challenge in the domain of autonomous driving [3]. Notably, the main task of CPaC is to generate high-quality trajectories that satisfy the requirements resulting from the road geometry, collision avoidance, vehicle dynamics, and traffic rules during different driving tasks [4].

Research on single-vehicle planning and control has already been conducted extensively, as can be seen in numerous works [5]–[11]. On the other hand, the CPaC problem can generally be solved by learning-based and optimization-based approaches. Among the learning-based approaches, reinforcement learning (RL) is quite effective in obtaining the optimal or near-optimal action sequence through searching and evaluation [12]–[14]. However, it still suffers from certain shortcomings.

X. Zhang, Z. Cheng, and T. H. Lee are with the NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore 119077 (e-mail: [email protected]; [email protected]; [email protected]). J. Ma is with the Department of Mechanical Engineering, University of California, Berkeley, CA 94720 USA (e-mail: [email protected]). S. Huang is with the Temasek Laboratories, National University of Singapore, Singapore 117411 (e-mail: [email protected]). F. L. Lewis is with the Automation and Robotics Research Institute, University of Texas at Arlington, Arlington, TX 76118 USA (e-mail: [email protected]). This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
For example, the number of public datasets for autonomous driving is insufficient, and the data obtained from simulators cannot be generalized to all driving scenarios; thus, how to obtain a training dataset that covers all road conditions and driving scenarios is still an open question. Besides, in some complex driving scenarios, determining a proper reward function, which is a critical component of RL, is also not straightforward [15]. Some research utilizes neural networks to fit real samples, which might ignore some corner cases in autonomous driving. This is because the value function in a real traffic environment is closely related to the specific scenario: not only do the current states of all traffic participants need to be estimated, but the future states of all traffic participants also need to be considered carefully [16].

On the other hand, in terms of optimization-based approaches, the CPaC problem can generally be formulated as a constrained optimization problem that aims to generate collision-free trajectories. The trajectories should satisfy the constraints resulting from the vehicle dynamics, road geometry, collision avoidance, and traffic rules, while certain criteria such as comfort and safety should also be taken into account. There are already several optimization-based approaches available in the literature. For instance, [17] utilizes mixed integer quadratic programming (MIQP) to solve the cooperative trajectory planning problem for CAVs. [18] proposes a distributed cooperative control scheme that obtains collision-free trajectories for CAVs by formulating the task as a mixed-integer nonlinear programming problem. In [19], a nonlinear constrained optimization problem is formulated for an intersection scenario and solved by the active set method. However, most of these works merely focus on simple motion tasks with simplified vehicle models.
Such approaches are no longer effective in more realistic and complex situations involving frequent interactions and coordination, where the strong nonlinearity of the vehicle model leads to expensive computation. Additionally, one major challenge in the cooperative optimization problem lies in how to handle the coupling constraints among connected vehicles with efficient computation. In general, this problem is nonlinear and nonconvex, which significantly increases the difficulty of deriving the optimal solution. As a result, the high complexity and nonconvexity of these constraints induce heavy burdens on the computational efficiency [20].

Nowadays, the alternating direction method of multipliers (ADMM), which solves a convex optimization problem by breaking it into smaller and more manageable ones, has become a notable technique with remarkable scalability. It has recently found broad applicability in various areas, such as optimal control, distributed computation, and machine learning [21]–[24]. Remarkably, the ADMM can solve a convex optimization problem with convergence to a global optimum and achieve parallel computation after decomposition, which greatly alleviates the computational burden resulting from the growing dimension of the optimization problem. Such advantages of the ADMM have prompted many researchers to turn their attention to nonconvex optimization problems. For example, [25] investigates the practical performance of the ADMM on several nonconvex applications and indicates that the ADMM also performs well on various nonconvex problems. [26] analyzes the convergence of the ADMM for some specific nonconvex consensus and sharing problems. Moreover, an accelerated hierarchical ADMM algorithm is proposed in [27] for nonconvex optimization problems. In [28], a multi-block ADMM is presented to minimize a nonconvex and nonsmooth objective function subject to specific coupling linear equality constraints.
In these past research works, the ADMM has been reasonably well established at the theoretical level. Indeed, these advanced optimization techniques bring promising prospects to the area of autonomous driving.

This paper presents a novel ADMM-based approach for solving a nonconvex optimization problem with coupling constraints for the CPaC of multiple CAVs. In this work, a nonlinear and nonconvex optimization problem is formulated, and a consensus ADMM is utilized to split the optimization problem into two small-scale subproblems, one with nonlinear dynamics constraints and the other with nonconvex coupling constraints. The first subproblem, which involves the strong nonlinearity of the vehicle dynamics, can be solved by differential dynamic programming (DDP) in a parallel manner. The second subproblem, with the coupling nonconvex constraints for multiple CAVs, is suitably addressed using several methods in parallel, including semi-definite relaxation (SDR) and MIQP. Compared with some of the optimization approaches in the literature, this work effectively relieves the computational burden arising from the nonlinearity and nonconvexity, which makes real-time implementation possible and provides extra assurance of driving safety.

The remainder of this paper is organized as follows. Section II gives the notation and the preliminaries related to the DDP. Section III defines the consensus nonlinear and nonconvex optimization problem, with the introduction of the dynamic model, the objective function, and the constraints in the autonomous driving task. Section IV proposes the ADMM algorithm for solving such a consensus optimization problem in a parallel framework. In Section V, two complex driving scenarios in autonomous driving are used to show the effectiveness of the proposed methodology. At last, the discussion and conclusion of this work are given in Section VI.

II. PRELIMINARIES
A. Notation
The following notation is used in the remainder of this paper. $\mathbb{R}^{a\times b}$ denotes the set of real matrices with $a$ rows and $b$ columns, and $\mathbb{R}^a$ denotes the set of $a$-dimensional real column vectors. $A^\top$ and $x^\top$ denote the transpose of the matrix $A$ and the vector $x$, respectively. $x \succ y$ and $x \succeq y$ denote that the vector $x$ is element-wise greater than and no less than the vector $y$, respectively. $X \succ 0$ and $X \succeq 0$ represent that the matrix $X$ is positive definite and positive semi-definite, respectively. $0_a$ and $0_{(a,b)}$ represent the $a$-dimensional all-zero vector and the $a$-by-$b$ all-zero matrix, respectively; $1_a$ and $1_{(a,b)}$ are the $a$-dimensional all-one vector and the $a$-by-$b$ all-one matrix, respectively. $I_a$ denotes the $a$-dimensional identity matrix. The Frobenius inner product is denoted by $\langle X, Y\rangle$, i.e., $\langle X, Y\rangle = \mathrm{Tr}(X^\top Y)$ for all $X, Y \in \mathbb{R}^{a\times b}$. The operator $\|X\|$ is the Euclidean norm of the matrix $X$. The Kronecker product is denoted by $\otimes$. $\mathbb{Z}_a$ and $\mathbb{Z}_a^b$ represent the sets of positive integers $\{1, 2, \cdots, a\}$ and $\{a, a+1, \cdots, b\}$ with $a < b$, respectively. $\mathrm{blockdiag}(X_1, X_2, \cdots, X_n)$ denotes a block diagonal matrix with diagonal blocks $X_1, X_2, \cdots, X_n$; $\mathrm{diag}(a_1, a_2, \cdots, a_n)$ is a diagonal matrix with diagonal entries $a_1, a_2, \cdots, a_n$. $\big(\{x_i\}_{\forall i \in \mathbb{Z}_n}\big)$ denotes the concatenation of the vectors $x_i$ for all $i \in \mathbb{Z}_n$, i.e., $\big(\{x_i\}_{\forall i \in \mathbb{Z}_n}\big) = [x_1^\top\; x_2^\top\; \cdots\; x_n^\top]^\top = (x_1, x_2, \cdots, x_n)$.

B. Differential Dynamic Programming
The DDP is a second-order shooting method with a quadratic convergence rate, and it is typically deployed in trajectory optimization problems for nonlinear systems [29]. Notably, it makes use of the second-order derivatives of the dynamic function and the cost function.

For a discrete-time dynamic model $x_{\tau+1} = f(x_\tau, u_\tau)$, the cost function is defined as
$$J_\tau(x, U_\tau) = \sum_{i=\tau}^{T-1} \ell(x_i, u_i) + \ell_T(x_T), \qquad (1)$$
where $U_\tau = \{u_\tau, u_{\tau+1}, \cdots, u_{T-1}\}$ denotes the sequence of control inputs from the time stamp $\tau$ to $T-1$, and $T$ is the prediction horizon. Define the value function at the time stamp $\tau$ as the optimal cost, i.e.,
$$V_\tau(x) = \min_{U_\tau} J_\tau(x, U_\tau). \qquad (2)$$
Also, it is pertinent to note that the value function at the terminal time stamp $T$ is $V_T(x) = \ell_T(x_T)$. According to the dynamic programming principle, the minimization over $U_\tau$ can be reduced to a sequence of minimization problems over the one-step control, which is given by
$$V_\tau(x) = \min_{u_\tau} \big[\ell(x_\tau, u_\tau) + V_{\tau+1}(f(x_\tau, u_\tau))\big]. \qquad (3)$$
Then, the perturbed $Q$-function is given by
$$Q_\tau(\delta x_\tau, \delta u_\tau) = V_{\tau+1}(f(x_\tau + \delta x_\tau, u_\tau + \delta u_\tau)) - V_{\tau+1}(f(x_\tau, u_\tau)) + \ell_\tau(x_\tau + \delta x_\tau, u_\tau + \delta u_\tau) - \ell_\tau(x_\tau, u_\tau), \qquad (4)$$
where $\delta x_\tau$ and $\delta u_\tau$ denote the changes of the state and input at the time stamp $\tau$. Here, we expand (4) to its second-order form as
$$Q_\tau(\delta x_\tau, \delta u_\tau) \approx \frac{1}{2} \begin{bmatrix} 1 \\ \delta x_\tau \\ \delta u_\tau \end{bmatrix}^\top \begin{bmatrix} 0 & (Q_\tau)_x^\top & (Q_\tau)_u^\top \\ (Q_\tau)_x & (Q_\tau)_{xx} & (Q_\tau)_{xu} \\ (Q_\tau)_u & (Q_\tau)_{ux} & (Q_\tau)_{uu} \end{bmatrix} \begin{bmatrix} 1 \\ \delta x_\tau \\ \delta u_\tau \end{bmatrix}, \qquad (5)$$
where
$$(Q_\tau)_x = (\ell_\tau)_x + f_x^\top (V_{\tau+1})_x$$
$$(Q_\tau)_u = (\ell_\tau)_u + f_u^\top (V_{\tau+1})_x$$
$$(Q_\tau)_{xx} = (\ell_\tau)_{xx} + f_x^\top (V_{\tau+1})_{xx} f_x + (V_{\tau+1})_x \cdot f_{xx}$$
$$(Q_\tau)_{ux} = (\ell_\tau)_{ux} + f_u^\top (V_{\tau+1})_{xx} f_x + (V_{\tau+1})_x \cdot f_{ux}$$
$$(Q_\tau)_{uu} = (\ell_\tau)_{uu} + f_u^\top (V_{\tau+1})_{xx} f_u + (V_{\tau+1})_x \cdot f_{uu}. \qquad (6)$$
Minimizing (5) with respect to $\delta u_\tau$, we have $\delta u_\tau^\star = \arg\min_{\delta u_\tau} Q_\tau(\delta x_\tau, \delta u_\tau) = k_\tau + K_\tau \delta x_\tau$, where
$$k_\tau = -(Q_\tau)_{uu}^{-1}(Q_\tau)_u, \qquad K_\tau = -(Q_\tau)_{uu}^{-1}(Q_\tau)_{ux}. \qquad (7)$$
By substituting the control policy (7) into (5), we have
$$\Delta V_\tau = -\frac{1}{2} Q_{u,\tau}^\top Q_{uu,\tau}^{-1} Q_{u,\tau}$$
$$V_{x,\tau} = Q_{x,\tau} - Q_{xu,\tau} Q_{uu,\tau}^{-1} Q_{u,\tau}$$
$$V_{xx,\tau} = Q_{xx,\tau} - Q_{xu,\tau} Q_{uu,\tau}^{-1} Q_{ux,\tau}. \qquad (8)$$
Computing (8) and the control policy terms $k_\tau$, $K_\tau$ recursively until $\tau = 0$ constitutes the process named the backward pass. Subsequently, a forward pass is carried out to compute a new trajectory by
$$u_\tau = \hat{u}_\tau + \alpha k_\tau + K_\tau(x_\tau - \hat{x}_\tau), \qquad x_{\tau+1} = f(x_\tau, u_\tau), \qquad (9)$$
where $\alpha$ is a backtracking line-search parameter. Normally, it is set to 1 and then reduced gradually. Given an initial nominal trajectory $\{\hat{x}_\tau, \hat{u}_\tau\}$, the trajectory is refined towards the optimal one after a certain number of iterations of the backward pass and the forward pass.

III. PROBLEM FORMULATION
A. Network of Connected Vehicles
In this paper, an undirected graph $G(V, E)$ is utilized to represent the constraint or information topology of multiple CAVs. The node set $V = \{1, 2, \cdots, N\}$ denotes the agents, where $N$ is the number of agents. The edge set $E = \{1, 2, \cdots, M\}$ denotes the coupling constraints (information flow) between two interconnected vehicles, where $M$ is the number of coupling constraints in the multi-agent system. The edge set $E$ is defined by
$$(i,j) \in E(t) \text{ if } d_{\mathrm{safe}} \le \|p_i - p_j\| \le d_{\mathrm{cmu}}; \quad (i,j) \notin E(t) \text{ otherwise}, \qquad (10)$$
where $d_{\mathrm{cmu}}$ and $d_{\mathrm{safe}}$ denote the maximum communication distance and the minimum safe distance between two agents, respectively; $p_i$ and $p_j$ are the position vectors of the $i$th agent and the $j$th agent. According to the communication topology, an adjacency matrix $D$ can be defined as a square symmetric matrix representing the finite undirected graph $G(V, E)$; its elements indicate whether pairs of vertices are adjacent (connected) in the graph. Therefore, the neighbor nodes of the $i$th vehicle are the indexes of the nonzero elements in the $i$th row of $D$, which is represented by $\nu(i) = \{j \mid (i,j) \in E, \forall j \in V\}$.

B. Problem Description

1) Dynamic Model:
For an autonomous vehicle, we define the state vector as $x \in \mathbb{R}^n$ and the input vector as $u \in \mathbb{R}^m$. The general vehicle dynamic model of the $i$th vehicle can be represented by
$$x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}), \qquad (11)$$
where $f$ denotes the kinematic dynamics of the $i$th vehicle, and $\tau$ is the time stamp. Note that the position vector $x_p \in \mathbb{R}^{n_p}$ is included in the state vector $x$, i.e., $x = (x_p, \cdots)$.
2) Cost Function:
The cost function of the $i$th vehicle is
$$\sum_{\tau=0}^{T-1} \|x_{i(\tau+1)} - x_{r,i(\tau+1)}\|_{Q_i}^2 + \|u_{i\tau}\|_{R_i}^2, \qquad (12)$$
where $Q_i$ and $R_i$ are the weighting matrices and $x_{r,i}$ is the reference state vector that the vehicle needs to track. Here, the first term of the cost function penalizes the deviation between the state vector $x_i$ and the corresponding reference state vector $x_{r,i}$, and the second term penalizes the magnitude of the control input $u_i$. Note that the cost function is a convex, closed, and proper function.
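As a concrete illustration of the weighted norms in (12), a minimal sketch in Python follows; the horizon length and weight values are made up for the example, not taken from the paper:

```python
import numpy as np

def wnorm2(v, W):
    """Weighted squared norm ||v||_W^2 = v^T W v."""
    return float(v @ W @ v)

def tracking_cost(x, x_ref, u, Q, R):
    """Cost (12) for a single vehicle: sum over the horizon of
    ||x_{tau+1} - x_{r,tau+1}||_Q^2 + ||u_tau||_R^2."""
    return sum(wnorm2(x[t] - x_ref[t], Q) + wnorm2(u[t], R)
               for t in range(len(u)))

T = 3                                  # hypothetical horizon
Q = np.diag([1.0, 1.0])                # hypothetical state weights
R = np.diag([0.1])                     # hypothetical input weight
x = np.ones((T, 2))                    # predicted states x_1 .. x_T
x_ref = np.zeros((T, 2))               # reference states to track
u = np.full((T, 1), 2.0)               # inputs u_0 .. u_{T-1}
J = tracking_cost(x, x_ref, u, Q, R)   # 3 * (2.0 + 0.4) = 7.2
```

Because $Q_i \succeq 0$ and $R_i \succ 0$, each summand is a convex quadratic, which is why (12) is convex, closed, and proper.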
3) Constraints:
The restrictions on the state and input variables should be taken into consideration; hence the box constraint on the input variables is introduced as
$$\underline{u}_{i\tau} \preceq u_{i\tau} \preceq \overline{u}_{i\tau}, \qquad (13)$$
where $\underline{u}_{i\tau}$ and $\overline{u}_{i\tau}$ denote the minimum and maximum values of the input variables, respectively. Besides the box constraints, the collision avoidance constraints for the connected vehicles also need to be satisfied, which gives
$$\|p_{i(\tau+1)} - p_{j(\tau+1)}\| \ge d_{\mathrm{safe}}, \; \forall j \in \nu(i), \; \forall \tau \in \mathbb{Z}_{T-1}, \qquad (14)$$
where $p_i$ is the position vector of the $i$th vehicle, i.e., $p_{i\tau} = [p_{x,i\tau}\; p_{y,i\tau}]^\top$.
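The box constraint (13) and the pairwise collision constraint (14) are straightforward to check numerically. A small sketch follows (for brevity it checks all pairs rather than only the neighbor pairs in $\nu(i)$; the positions and thresholds are illustrative):

```python
import numpy as np

def box_ok(u, u_lo, u_hi):
    """Box constraint (13): u_lo <= u <= u_hi element-wise."""
    return bool(np.all(u_lo <= u) and np.all(u <= u_hi))

def collision_free(P, d_safe):
    """Collision constraint (14) at one time stamp: every pair of the
    N stacked positions P (shape N x 2) is at least d_safe apart."""
    N = len(P)
    return all(np.linalg.norm(P[i] - P[j]) >= d_safe
               for i in range(N) for j in range(i + 1, N))

P = np.array([[0.0, 0.0], [5.0, 0.0], [5.0, 4.0]])  # hypothetical positions
ok = collision_free(P, d_safe=3.0)                   # min pair distance is 4.0
bad = collision_free(P, d_safe=4.5)                  # 4.0 < 4.5 -> violated
```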
4) Problem Formulation:

For all connected vehicles in the network, i.e., $\forall i \in V$, each vehicle is required to satisfy the dynamic constraint and the box constraints mentioned above. Besides, the CAVs need to satisfy the collision avoidance constraints. Hence, the multi-agent cooperative automation problem can be formulated as an optimal control problem, which is defined as
$$\min \sum_{i \in V} \sum_{\tau=0}^{T-1} \|x_{i(\tau+1)} - x_{r,i(\tau+1)}\|_{Q_i}^2 + \|u_{i\tau}\|_{R_i}^2$$
$$\mathrm{s.t.}\quad x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}),$$
$$\underline{u}_{i\tau} \preceq u_{i\tau} \preceq \overline{u}_{i\tau},$$
$$\|p_{i(\tau+1)} - p_{j(\tau+1)}\| \ge d_{\mathrm{safe}},$$
$$\forall \tau \in \mathbb{Z}_{T-1}, \; \forall j \in \nu(i), \; \forall i \in V. \qquad (15)$$
Furthermore, we define the optimization variable $y \in \mathbb{R}^{NT(m+n)}$ as
$$y = (\underbrace{y_{11}, \cdots, y_{1T}}_{y_1}, \cdots, \underbrace{y_{i1}, \cdots, y_{iT}}_{y_i}, \cdots, \underbrace{y_{N1}, \cdots, y_{NT}}_{y_N}),$$
where $y_i = \{y_{i1}, y_{i2}, \cdots, y_{i\tau}, \cdots, y_{iT}\} \in \mathbb{R}^{T(m+n)}$ for the $i$th vehicle, and $y_{i\tau} = (x_{i\tau}, u_{i(\tau-1)}) \in \mathbb{R}^{m+n}$.

Remark 1.
In order to compute this problem in parallel, the host constraints and the coupling constraints in this collision avoidance multi-agent system are separated into two sets $Y$ and $Z$. The first set addresses the host constraints of each agent, and the other one deals with the coupling constraints and the box constraints.

Define the two sets $Y$ and $Z$ for the variable $y$ as
$$Y = \big\{y \in \mathbb{R}^{NT(m+n)} \;\big|\; x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}), \; \forall i \in V\big\}$$
$$Z = \big\{z \in \mathbb{R}^{NT(m+n_p)} \;\big|\; \underline{u}_{i\tau} \preceq z_{u,i\tau} \preceq \overline{u}_{i\tau}, \; \|z_{p,i\tau} - z_{p,j\tau}\| \ge d_{\mathrm{safe}}, \; \forall \tau \in \mathbb{Z}_T, \forall j \in \nu(i), \forall i \in V\big\},$$
where $z = (\underbrace{z_{11}, \cdots, z_{1T}}_{z_1}, \cdots, \underbrace{z_{i1}, \cdots, z_{iT}}_{z_i}, \cdots, \underbrace{z_{N1}, \cdots, z_{NT}}_{z_N})$, with $z_{i\tau} = (z_{p,i\tau}, z_{u,i(\tau-1)}) \in \mathbb{R}^{m+n_p}$; $z_u = \big(\{z_{u,i\tau}\}_{\forall \tau \in \mathbb{Z}_{T-1}, \forall i \in V}\big)$ is the input component of $z$, whose lower bound is $\underline{z}_u = \big(\{\underline{u}_{i\tau}\}_{\forall \tau \in \mathbb{Z}_T, \forall i \in V}\big)$ and whose upper bound is $\overline{z}_u = \big(\{\overline{u}_{i\tau}\}_{\forall \tau \in \mathbb{Z}_T, \forall i \in V}\big)$.

Remark 2. $x_{i(\tau+1)}$ and $u_{i\tau}$ are the state and control input components of the variable $y$; $z_{p,i\tau}$ and $z_u$ are the corresponding components of the variable $z$.

Definition 1.
For a non-empty convex set $C$ and a variable $z$, the indicator function is defined as
$$\delta_C(z) = \begin{cases} 0, & \text{if } z \in C \\ \infty, & \text{otherwise}. \end{cases}$$

Then, the optimal control problem (15) can be converted into a more compact form as
$$\min \; \sum_{i \in V} J_i(y_i) + \delta_Y(y) + \delta_Z(\mathcal{T} y), \qquad (16)$$
where the matrix $\mathcal{T} = I_{NT} \otimes A \in \mathbb{R}^{NT(m+n_p) \times NT(m+n)}$ and the matrix
$$A = \begin{bmatrix} I_{n_p} & 0_{(n_p,\, n-n_p)} & 0_{(n_p,\, m)} \\ 0_{(m,\, n_p)} & 0_{(m,\, n-n_p)} & I_m \end{bmatrix}.$$
By introducing a consensus variable $z \in \mathbb{R}^{NT(m+n_p)}$, we can rewrite (16) as
$$\min \; \sum_{i \in V} J_i(x_i, u_i) + \delta_Y(y) + \delta_Z(z) \qquad \mathrm{s.t.}\; \mathcal{T} y - z = 0. \qquad (17)$$

IV. OPTIMIZATION IN PARALLEL
The augmented Lagrangian function of problem (17) is
$$\mathcal{L}_\sigma(y, z; \lambda) = J(y) + \delta_Y(y) + \delta_Z(z) + \langle \lambda, \mathcal{T} y - z \rangle + \frac{\sigma}{2}\|\mathcal{T} y - z\|^2$$
$$= J(y) + \delta_Y(y) + \delta_Z(z) + \frac{\sigma}{2}\Big\|\mathcal{T} y - z + \frac{\lambda}{\sigma}\Big\|^2 - \frac{1}{2\sigma}\|\lambda\|^2, \qquad (18)$$
where $\lambda = (\lambda_1, \lambda_2, \cdots, \lambda_N) \in \mathbb{R}^{NT(m+n_p)}$ is the dual variable, and $\sigma$ is the penalty parameter. It is straightforward to see that the ADMM algorithm can be written as
$$y^{k+1} = \arg\min_y \mathcal{L}_\sigma(y, z^k; \lambda^k) \qquad (19\mathrm{a})$$
$$z^{k+1} = \arg\min_z \mathcal{L}_\sigma(y^{k+1}, z; \lambda^k) \qquad (19\mathrm{b})$$
$$\lambda^{k+1} = \lambda^k + \sigma(\mathcal{T} y^{k+1} - z^{k+1}), \qquad (19\mathrm{c})$$
where the superscript $k$ denotes the corresponding variable or parameter in the $k$th iteration of the ADMM algorithm. The stopping criterion is chosen in terms of the primal residual error, i.e.,
$$\|\mathcal{T} y - z\| \le \epsilon. \qquad (20)$$

A. Solving the First Sub-problem
The first sub-problem of the ADMM algorithm is to determine the variable $y$ by
$$\arg\min_y \mathcal{L}_\sigma(y, z^k; \lambda^k) = \arg\min_y \; J(y) + \delta_Y(y) + \frac{\sigma}{2}\Big\|\mathcal{T} y - z^k + \frac{\lambda^k}{\sigma}\Big\|^2. \qquad (21)$$
Since there is no coupling term, this optimization sub-problem can be treated in a distributed manner for each agent $i \in V$. For each agent $i$, we need to solve
$$\min_{y_i} \; J_i(y_i) + \frac{\sigma}{2}\Big\|\mathcal{T}_i y_i - z_i + \frac{\lambda_i}{\sigma}\Big\|^2 \qquad \mathrm{s.t.}\; x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}), \; \forall \tau \in \mathbb{Z}_{T-1}, \qquad (22)$$
where the matrix $\mathcal{T}_i = I_T \otimes A \in \mathbb{R}^{T(m+n_p) \times T(m+n)}$, the variable $x_i = \big(\{x_{i(\tau+1)}\}_{\forall \tau \in \mathbb{Z}_{T-1}}\big) \in \mathbb{R}^{Tn}$ and $u_i = \big(\{u_{i\tau}\}_{\forall \tau \in \mathbb{Z}_{T-1}}\big) \in \mathbb{R}^{Tm}$. Since $y = (y_1, y_2, \cdots, y_i, \cdots, y_N)$, the variable $y_i \in \mathbb{R}^{T(m+n)}$ of the $i$th vehicle is
$$y_i = \big\{\underbrace{(x_{i1}, u_{i0})}_{y_{i1}}, \cdots, \underbrace{(x_{i\tau}, u_{i(\tau-1)})}_{y_{i\tau}}, \cdots, \underbrace{(x_{iT}, u_{i(T-1)})}_{y_{iT}}\big\}.$$
Based on the definition of the set $Y$, the first sub-problem (22) can be rewritten as
$$\min \sum_{i \in V} \|x_i - x_{r,i}\|_{\hat{Q}_i}^2 + \|u_i\|_{\hat{R}_i}^2 + \delta_{X_i}(x_i) + \delta_{U_i}(u_i) \qquad \mathrm{s.t.}\; x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}), \; \forall \tau \in \mathbb{Z}_{T-1}, \forall i \in V, \qquad (23)$$
where the weighting matrices $\hat{Q}_i = I_T \otimes Q_i$ and $\hat{R}_i = I_T \otimes R_i$; the vector $x_{r,i} = \big(\{x_{r,i(\tau+1)}\}_{\forall \tau \in \mathbb{Z}_{T-1}}\big) \in \mathbb{R}^{Tn}$ is the reference state vector; $\delta_{X_i}(\cdot)$ and $\delta_{U_i}(\cdot)$ denote the indicator functions with respect to the non-empty sets $X_i = \{x_i \mid \underline{x}_i \preceq x_i \preceq \overline{x}_i\}$ and $U_i = \{u_i \mid \underline{u}_i \preceq u_i \preceq \overline{u}_i\}$, respectively, where $\underline{x}_i = \big(\{\underline{x}_{i\tau}\}_{\forall \tau \in \mathbb{Z}_{T-1}}\big)$ and $\underline{u}_i = \big(\{\underline{u}_{i\tau}\}_{\forall \tau \in \mathbb{Z}_{T-1}}\big)$; the definitions of $\overline{x}_i$ and $\overline{u}_i$ follow in a similar way. Hence, the problem (23) is in the standard format such that the DDP algorithm can be adopted directly. The pseudocode of the DDP algorithm is shown in Algorithm 1.

Algorithm 1
DDP Algorithm for the $i$th Vehicle

Initialization: initial nominal trajectory $\{\tilde{x}_{i\tau}, \tilde{u}_{i\tau}\}_{\forall \tau \in \mathbb{Z}_T}$; derivatives of the cost-to-go function $\ell_i$ and of the dynamic model $f_i$ for the $i$th vehicle; the maximum DDP iteration number $r_{\mathrm{ddp}}$.
Set the initial iteration step $r = 0$.
while $r \le r_{\mathrm{ddp}}$ and the stopping criterion is not met do
  {▷ Backward pass.}
  for $\tau = T-1, \cdots, 0$ do
    Compute (6), (7), and (8).
  end for
  Set the backtracking line-search parameter $\alpha = 1$.
  {▷ Forward pass.}
  Use (9) to compute a new nominal trajectory; decrease $\alpha$ if necessary.
  $r = r + 1$.
end while

B. Solving the Second Sub-problem
The second sub-problem is
$$\min_z \mathcal{L}_\sigma(y^{k+1}, z; \lambda^k) = \min_z \; \delta_Z(z) + \frac{\sigma}{2}\Big\|\mathcal{T} y^{k+1} - z + \frac{\lambda^k}{\sigma}\Big\|^2. \qquad (24)$$
Here, the second sub-problem (24) is equivalent to
$$\min_z \; \Big\|\mathcal{T} y^{k+1} - z + \frac{\lambda^k}{\sigma}\Big\|^2 \qquad \mathrm{s.t.}\; \underline{z}_u \preceq z_u \preceq \overline{z}_u, \; \|z_{p,i\tau} - z_{p,j\tau}\| \ge d_{\mathrm{safe}}, \; \forall \tau \in \mathbb{Z}_T, \forall j \in \nu(i), \forall i \in V, \qquad (25)$$
where $z_u$ and its lower and upper bounds $\underline{z}_u$ and $\overline{z}_u$ are as defined in Remark 1. Besides, $z_{p,i\tau}$ is the position component of $z_{i\tau}$, i.e., $z_{i\tau} = (z_{p,i\tau}, z_{u,i\tau})$.

On the other hand, the variable $z$ can be rewritten as $z = (z_1, z_2, \cdots, z_N) \in \mathbb{R}^{NT(m+n_p)}$ with $z_i = (z_{i1}, z_{i2}, \cdots, z_{i\tau}, \cdots, z_{iT}) \in \mathbb{R}^{T(m+n_p)}$ and $z_{i\tau} = (z_{p,i\tau}, z_{u,i\tau})$. Thus, the variable $z$ can also be separated with respect to the subscript $\tau$, i.e., $z_\tau = \big(\{z_{i\tau}\}_{\forall i \in V}\big) \in \mathbb{R}^{N(m+n_p)}$; therefore, we can derive $z_{p,\tau} = \big(\{z_{p,i\tau}\}_{\forall i \in V}\big) \in \mathbb{R}^{Nn_p}$ and $z_{u,\tau} = \big(\{z_{u,i\tau}\}_{\forall i \in V}\big) \in \mathbb{R}^{Nm}$. For all $\tau \in \mathbb{Z}_T$, we can thus separate this problem into $T$ problems. At the time stamp $\tau$, the problem is defined as
$$\min_{z_\tau} \; \Big\|z_\tau - \mathcal{T}_\tau y_\tau^{k+1} - \frac{\lambda_\tau^k}{\sigma}\Big\|^2 + \delta_{Z_{u,\tau}}(z_{u,\tau}) \qquad \mathrm{s.t.}\; \|z_{p,i\tau} - z_{p,j\tau}\| \ge d_{\mathrm{safe}}, \; \forall i \in V, \forall j \in \nu(i), \qquad (26)$$
where $y_\tau = \big(\{y_{i\tau}\}_{\forall i \in V}\big) \in \mathbb{R}^{N(m+n)}$, $\lambda_\tau = \big(\{\lambda_{i\tau}\}_{\forall i \in V}\big) \in \mathbb{R}^{N(m+n_p)}$, and the set $Z_{u,\tau} = \{z_{u,\tau} \in \mathbb{R}^{Nm} \mid \underline{z}_{u,\tau} \preceq z_{u,\tau} \preceq \overline{z}_{u,\tau}\}$. Since the constraints only involve the position component $z_{p,\tau}$ of the variable $z_\tau$, we can divide the problem (26) into two separate parts according to the two components $z_{p,\tau}$ and $z_{u,\tau}$ of the variable $z_\tau$.

The first part, in terms of the variable $z_{u,\tau}$, is
$$\min_{z_{u,\tau}} \; \Big\|z_{u,\tau} - \mathcal{T}_{u,\tau} y_\tau^{k+1} - \frac{\lambda_{u,\tau}^k}{\sigma}\Big\|^2 + \delta_{Z_{u,\tau}}(z_{u,\tau}), \qquad (27)$$
where the matrix $\mathcal{T}_{u,\tau} = I_N \otimes [0_{(m,n)}\; I_m]$ is used to extract $y_{u,\tau}$ from $y_\tau$. It is straightforward to obtain
$$z_{u,\tau} = \mathrm{Proj}_{Z_{u,\tau}}\Big(\mathcal{T}_{u,\tau} y_\tau^{k+1} + \frac{\lambda_{u,\tau}^k}{\sigma}\Big), \qquad (28)$$
where the operator $\mathrm{Proj}_C(z)$ denotes the projection of the variable $z$ onto the set $C$. Here, (28) can be evaluated simply by confining all elements of the vector $\mathcal{T}_{u,\tau} y_\tau^{k+1} + \lambda_{u,\tau}^k/\sigma$ to the set $Z_{u,\tau}$: elements outside $Z_{u,\tau}$ are clipped to its bounds, while elements inside $Z_{u,\tau}$ keep their values.

The second part, for the variable $z_{p,\tau}$, is
$$\min_{z_{p,\tau}} \; \Big\|z_{p,\tau} - \mathcal{T}_{p,\tau} y_\tau^{k+1} - \frac{\lambda_{p,\tau}^k}{\sigma}\Big\|^2 \qquad \mathrm{s.t.}\; \|z_{p,i\tau} - z_{p,j\tau}\| \ge d_{\mathrm{safe}}, \; \forall i \in V, \forall j \in \nu(i), \qquad (29)$$
where the matrix $\mathcal{T}_{p,\tau} = I_N \otimes [I_{n_p}\; 0_{(n_p,\, m+n-n_p)}]$ is used to extract $y_{p,\tau}$ from $y_\tau$.

Since the inequality constraints are nonconvex, the SDR can be used to solve the problem (29). Although it is much easier to solve a relaxed optimization problem, the result of the relaxed problem only bounds the optimal value of the original nonconvex problem. The problem (29) can be formulated as a quadratically constrained quadratic programming (QCQP) problem, which is defined as
$$\min_{z_{p,\tau}} \; (z_{p,\tau} - c_{p,\tau})^\top (z_{p,\tau} - c_{p,\tau}) \qquad \mathrm{s.t.}\; (z_{p,i\tau} - z_{p,j\tau})^\top (z_{p,i\tau} - z_{p,j\tau}) \ge d_{\mathrm{safe}}^2, \; \forall i \in V, \forall j \in \nu(i), \qquad (30)$$
where $c_{p,\tau} = \mathcal{T}_{p,\tau} y_\tau^{k+1} + \lambda_{p,\tau}^k/\sigma$. By defining a matrix $M_{ij} \in \mathbb{R}^{n_p \times Nn_p}$ such that $M_{ij} z_{p,\tau} = z_{p,i\tau} - z_{p,j\tau}$, we can rewrite (30) as
$$\min_{z_{p,\tau}} \; z_{p,\tau}^\top z_{p,\tau} - 2 c_{p,\tau}^\top z_{p,\tau} \qquad \mathrm{s.t.}\; z_{p,\tau}^\top K_{ij} z_{p,\tau} \ge d_{\mathrm{safe}}^2, \; \forall i \in V, \forall j \in \nu(i), \qquad (31)$$
where $K_{ij} = M_{ij}^\top M_{ij} \in \mathbb{S}^{Nn_p}$. Note that two identities used in deriving an SDR problem are
$$z_{p,\tau}^\top z_{p,\tau} = \mathrm{Tr}(z_{p,\tau} z_{p,\tau}^\top), \qquad z_{p,\tau}^\top K_{ij} z_{p,\tau} = \mathrm{Tr}(K_{ij} z_{p,\tau} z_{p,\tau}^\top). \qquad (32)$$
Thus, by introducing a new variable $Z_\tau = z_{p,\tau} z_{p,\tau}^\top \in \mathbb{S}^{Nn_p}$, the QCQP problem (31) can be rewritten as
$$\min_{z_{p,\tau}} \; \mathrm{Tr}(Z_\tau) - 2 c_{p,\tau}^\top z_{p,\tau} \qquad \mathrm{s.t.}\; \mathrm{Tr}(K_{ij} Z_\tau) \ge d_{\mathrm{safe}}^2, \; \forall i \in V, \forall j \in \nu(i), \quad Z_\tau = z_{p,\tau} z_{p,\tau}^\top. \qquad (33)$$
Here, the quadratic terms in (31) have been converted into linear ones, at the price of the nonlinear equality constraint introduced in (33). Then, the problem (33) can be relaxed to a convex one by replacing the nonconvex equality constraint $Z_\tau = z_{p,\tau} z_{p,\tau}^\top$ with the convex inequality constraint $Z_\tau - z_{p,\tau} z_{p,\tau}^\top \succeq 0$. Notice that $Z_\tau - z_{p,\tau} z_{p,\tau}^\top \succeq 0$ can be formulated via a Schur complement as a positive semi-definiteness condition on the symmetric matrix
$$\begin{pmatrix} Z_\tau & z_{p,\tau} \\ z_{p,\tau}^\top & 1 \end{pmatrix}.$$

Remark 3.
By using the Schur complement, the nonconvex equality constraint is relaxed into a positive semi-definite (linear matrix inequality) constraint, which is easy to handle.

Thus, the relaxed problem can be obtained as
$$\min_{z_{p,\tau},\, Z_\tau \in \mathbb{S}^{Nn_p}} \; \mathrm{Tr}(Z_\tau) - 2 c_{p,\tau}^\top z_{p,\tau} \qquad \mathrm{s.t.}\; \mathrm{Tr}(K_{ij} Z_\tau) \ge d_{\mathrm{safe}}^2, \; \forall i \in V, \forall j \in \nu(i), \quad \begin{pmatrix} Z_\tau & z_{p,\tau} \\ z_{p,\tau}^\top & 1 \end{pmatrix} \succeq 0, \qquad (34)$$
which is a semi-definite programming (SDP) problem, since one of the constraints has been replaced by a looser one. It is apparent that the optimal value of (34) is not greater than the optimal value of (30), because the cost function is minimized over a larger feasible domain in (34). Besides, if $Z_\tau = z_{p,\tau} z_{p,\tau}^\top$ holds at the optimum of the problem (34), then $z_{p,\tau}$ is optimal for (30). Note that a feasible result of the relaxed problem indicates that the collision avoidance constraints are satisfied.

Remark 4.
Note that the optimality and the feasibility of a nonconvex QCQP problem cannot, in general, be guaranteed by the SDR. Otherwise, an NP-hard problem could be solved in polynomial time, which is impossible given the current state of the science. Nevertheless, the result is still a non-trivial approximate solution of the nonconvex QCQP problem.
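The relaxation step behind (34), replacing $Z_\tau = z_{p,\tau} z_{p,\tau}^\top$ by the bordered positive semi-definite constraint, relies on the Schur complement equivalence: $Z - zz^\top \succeq 0$ if and only if the bordered matrix with unit corner block is positive semi-definite. This can be sanity-checked numerically; a small sketch with synthetic data (not from the paper's experiments):

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """Numerical PSD test via the smallest eigenvalue."""
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

def bordered(Z, z):
    """Schur-complement form [[Z, z], [z^T, 1]] used to relax Z = z z^T."""
    n = len(z)
    M = np.empty((n + 1, n + 1))
    M[:n, :n] = Z
    M[:n, n] = z
    M[n, :n] = z
    M[n, n] = 1.0
    return M

rng = np.random.default_rng(0)
z = rng.standard_normal(4)

Z_tight = np.outer(z, z)                 # Z = z z^T: boundary of the relaxation
Z_loose = Z_tight + 0.5 * np.eye(4)      # Z - z z^T = 0.5 I, strictly feasible
Z_bad = Z_tight - 0.5 * np.eye(4)        # violates Z - z z^T >= 0

# Z - z z^T >= 0 holds exactly when the bordered matrix is PSD.
checks = [is_psd(bordered(Zc, z)) == is_psd(Zc - np.outer(z, z))
          for Zc in (Z_tight, Z_loose, Z_bad)]
```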
Remark 5.
Extensive practical experience has indicated that the SDR can provide accurate or near-optimal approximations. In some cases, the solution of the SDP is not a feasible solution to the nonconvex problem; in that case, we can employ randomization, which uses the optimal solution of the SDP to extract a feasible solution to the nonconvex problem [30].
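A minimal sketch of one common Gaussian randomization scheme follows (the exact procedure in [30] may differ; the SDP solution $(z^\star, Z^\star)$ below is synthetic stand-in data rather than the output of an actual solver, and all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d_safe = 2.0
c = np.array([0.0, 0.0, 1.0, 0.0])   # stacked (p_1, p_2): only 1.0 apart -> infeasible

# Hypothetical SDR output: z* at c, with some "spread" Z* - z* z*^T left over.
z_star = c.copy()
Z_star = np.outer(z_star, z_star) + 0.25 * np.eye(4)

def feasible(z):
    """Collision constraint for the two stacked positions."""
    return np.linalg.norm(z[:2] - z[2:]) >= d_safe

# Gaussian randomization: sample around z* with covariance Z* - z* z*^T,
# keep the feasible sample that stays closest to the unconstrained target c.
cov = Z_star - np.outer(z_star, z_star)
samples = rng.multivariate_normal(z_star, cov, size=500)
cands = [s for s in samples if feasible(s)]
best = min(cands, key=lambda s: np.sum((s - c) ** 2))
```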
1) Comparison with Mixed Integer Quadratic Programming:
Here, the MIQP can be used as a method to solve the nonconvex problem (29) for comparative purposes. Assume the collision region is a polyhedron, which can be expressed as the intersection of $s$ half-planes $\{x \in \mathbb{R}^{n_p} \mid Px \le q\}$, where $P = [P_1^\top\; P_2^\top\; \cdots\; P_s^\top]^\top \in \mathbb{R}^{s \times n_p}$ with $P_i \in \mathbb{R}^{n_p}$ for $i \in \mathbb{Z}_s$, and $q = [q_1\; q_2\; \cdots\; q_s]^\top \in \mathbb{R}^s$. Each row $P_i^\top x \le q_i$ of $Px \le q$ denotes one of the half-planes that form the polyhedron. In order to avoid the collision, the big-M method is used to relax this problem. Here, to stay outside this polyhedron (the collision region), $x$ must violate at least one of the defining half-planes, i.e., $P_1^\top x \ge q_1$, or $P_2^\top x \ge q_2$, or $\cdots$, or $P_s^\top x \ge q_s$. Since the disjunctive "or" logic is hard to compute, it should be transformed into the conjunctive "and" logic, which admits a convex form in an optimization problem. Thus, binary variables are introduced as
$$P_1^\top x \ge q_1 - M e_1$$
$$P_2^\top x \ge q_2 - M e_2$$
$$\vdots$$
$$P_s^\top x \ge q_s - M e_s$$
$$\sum_{i=1}^{s} e_i \le s - 1, \qquad (35)$$
where $e_i \in \{0, 1\}$ is a binary variable and $M$ is a sufficiently large positive number. If $e_i = 0$, the corresponding constraint is enforced, and if $e_i = 1$, it is relaxed. The last constraint $\sum_{i=1}^{s} e_i \le s-1$ guarantees that at least one constraint is enforced. For example, if $s = 4$ or $8$, an obstacle can be represented as a rectangle or an octagon by using $s$ binary variables. In the MIQP, the computational time largely depends on the number of integer or binary variables, i.e., $s$; thus, the number of binary variables should be as small as possible to decrease the computational time. Here, we set $s = 4$. We further define the matrices $P_1$, $P_2$, which are used to extract the relative position components $x$, $y$, and the matrices $P_3$, $P_4$, which are used to extract the relative position components with negative signs $-x$, $-y$; $q_i = d_{\mathrm{safe}}, \forall i \in \mathbb{Z}_s$, and $e_i$ are the binary variables.
Therefore, we can rewrite the constraints $\|z_{p,\tau}^i - z_{p,\tau}^j\| \ge d_{\text{safe}}$ as
$$P_{ijr} z_{p,\tau} \ge d_{\text{safe}} - M e_{ijr} \tag{36}$$
by introducing binary variables $e_{ijr}$ for each original inequality constraint $g_{ij}$. Thus, the subproblem can be transformed into
$$\min_{z_{p,\tau}} \left\| z_{p,\tau} - T_{p,\tau} y_\tau^{k+1} - \frac{\lambda_{p,\tau}^k}{\sigma} \right\|^2 + \delta_{\mathcal{C}}(e) \quad \text{s.t.}\;\; P_{ijr} z_{p,\tau} \ge d_{\text{safe}} - M e_{ijr},\;\; \sum_{r=1}^{s} e_{ijr} \le s - 1,\;\; \forall i \in \mathcal{V},\, \forall j \in \nu(i), \tag{37}$$
where $e = \big(\{e_{ijr}\}\; \forall r \in \mathbb{Z}_s, \forall j \in \nu(i), \forall i \in \mathcal{V}\big) \in \mathbb{R}^{s \sum_i r_i}$, $P_{ijr}$ extracts the position variable of $z_\tau^i - z_\tau^j$ in one dimension with positive or negative sign, and the set $\mathcal{C}$ is the nonconvex binary set $\mathcal{C} = \{0, 1\}^{s \sum_i r_i}$.
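As a concrete illustration of the disjunction behind (35), the following sketch checks numerically whether a point avoids an obstacle. The obstacle is a hypothetical axis-aligned square of half-width $d_{\text{safe}} = 3$ m with $s = 4$ half-planes (matching the $P_1, \ldots, P_4$ construction described above); the big-M constant is an assumed value:

```python
import numpy as np

# Sketch of the big-M disjunction in (35): an axis-aligned square obstacle
# with s = 4 half-planes. Rows of P extract x, y, -x, -y; q_i = d_safe.
P = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
q = np.full(4, 3.0)   # d_safe = 3 m
M = 1e3               # sufficiently large big-M constant (assumed value)

def bigM_feasible(x, e):
    """Check P_i^T x >= q_i - M*e_i for all i, together with sum(e) <= s - 1."""
    return np.all(P @ x >= q - M * np.asarray(e)) and sum(e) <= len(q) - 1

def avoids_obstacle(x):
    """x avoids the obstacle iff some binary assignment e is feasible,
    i.e., at least one half-plane constraint P_i^T x >= q_i is enforced."""
    s = len(q)
    # relax every constraint except i -> feasible iff P_i^T x >= q_i holds
    return any(bigM_feasible(x, [0 if j == i else 1 for j in range(s)])
               for i in range(s))
```

For instance, a point 4 m to the right of the obstacle center satisfies the first half-plane constraint and is therefore collision-free, while the center itself violates all four and is rejected.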
2) Comparison with Interior Point Method:
Another comparative study is performed by solving the nonconvex problem (29) using the Interior Point OPTimizer (IPOPT), a widely used and comprehensive solver for nonlinear nonconvex programming problems.
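For intuition, a toy instance of this type of nonconvex subproblem can be handled by a generic interior-point style solver. The sketch below uses SciPy's `trust-constr` method as a stand-in for IPOPT (which has its own interfaces, e.g., cyipopt); all numbers are made up, and the real subproblem (29) is higher-dimensional. The toy task: project a target point onto the exterior of a safety disc, a nonconvex feasible set:

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

# Toy nonconvex projection: stay at least d_safe away from the origin
# while getting as close as possible to a desired target position.
d_safe = 3.0
target = np.array([1.0, 0.0])   # desired position, inside the safety disc

res = minimize(
    fun=lambda z: np.sum((z - target) ** 2),        # quadratic objective
    x0=np.array([5.0, 0.0]),                        # feasible starting point
    method="trust-constr",                          # interior-point style solver
    constraints=NonlinearConstraint(                # ||z||^2 >= d_safe^2 (nonconvex)
        lambda z: z @ z, lb=d_safe ** 2, ub=np.inf),
)
# the minimizer sits on the disc boundary nearest to the target
```

Here the optimum lies on the boundary of the safety disc, i.e., exactly at distance $d_{\text{safe}}$ from the origin, analogous to an active collision-avoidance constraint.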
C. Proposed Algorithm
The pseudocode of our proposed algorithm is shown in Algorithm 2.

V. SIMULATION RESULTS
A. Dynamic Model of the Vehicle
The dynamic model of the $i$th vehicle can be characterized by
$$p_{x,i(\tau+1)} = p_{x,i\tau} + f_r(v_{i\tau}, \delta_{i\tau}) \cos(\theta_{i\tau}) \tag{38a}$$
$$p_{y,i(\tau+1)} = p_{y,i\tau} + f_r(v_{i\tau}, \delta_{i\tau}) \sin(\theta_{i\tau}) \tag{38b}$$
$$\theta_{i(\tau+1)} = \theta_{i\tau} + \sin^{-1}\!\left(\frac{\tau_s v_{i\tau} \sin(\delta_{i\tau})}{b_i}\right) \tag{38c}$$
$$v_{i(\tau+1)} = v_{i\tau} + \tau_s a_{i\tau} \tag{38d}$$
where the subscript $i$ denotes the corresponding parameters or variables of the $i$th vehicle, the subscript $\tau$ denotes the corresponding parameters or variables at time stamp $\tau$, $p_x$ and $p_y$ are the positions of the center point of the vehicle in the X and Y dimensions of the Cartesian coordinates, respectively, $\theta$ represents the heading angle of the vehicle (with 0 pointing in the positive X direction), $v$ denotes the velocity of the vehicle, $\delta$ is the steering angle, $a$ is the acceleration of the vehicle, $b$ denotes the wheelbase of the vehicle, $\tau_s$ is the sampling time, and the function $f_r(v, \delta)$ is
$$f_r(v, \delta) = b + \tau_s v \cos(\delta) - \sqrt{b^2 - (\tau_s v \sin(\delta))^2}.$$
Define the state vector as $x = [p_x \; p_y \; \theta \; v]^\top$ and the input vector as $u = [\delta \; a]^\top$. The vehicle dynamic model can be rewritten as
$$x_{i(\tau+1)} = f(x_{i\tau}, u_{i\tau}). \tag{39}$$

Algorithm 2: ADMM for Multi-Vehicle Cooperative Automation
Initialization: dynamic model for all agents $i \in \mathcal{V}$; communication network $\mathcal{G}(\mathcal{V}, \mathcal{E})$; parameters $M$, $P_{ij\tau}$; weighting matrices $Q_i$ and $R_i$ in the cost function; upper and lower bounds of the state $x$ and control input $u$, i.e., $\underline{x}, \overline{x}$ and $\underline{u}, \overline{u}$, respectively; the maximum iteration number of ADMM, i.e., $k_{\text{admm}}$; initial state $x_{i,0}$ at initial time $\tau = 0$ for all vehicles $i \in \mathcal{V}$; penalty parameter $\sigma > 0$; variables $y \in \mathbb{R}^{NT(m+n)}$, $z \in \mathbb{R}^{NT(m+n_p)}$; dual variable $\lambda \in \mathbb{R}^{NT(m+n_p)}$. Set the outer iteration $k = 0$.
while $k \le k_{\text{admm}}$ or stopping criterion (20) is violated do
  Update $y^{k+1}$ in parallel for $i \in \mathcal{V}$ via the DDP algorithm.
  Update $z_u^{k+1}$ in parallel for $\tau \in \mathbb{Z}_T$ by (28).
  if using the SDR then
    Update $z_p^{k+1}$ in parallel for $\tau \in \mathbb{Z}_T$ by solving problem (34).
  else if using the MIQP then
    Update $z_p^{k+1}$ in parallel for $\tau \in \mathbb{Z}_T$ by solving problem (37).
  else if using the IPOPT then
    Update $z_p^{k+1}$ in parallel for $\tau \in \mathbb{Z}_T$ by solving problem (29) directly.
  end if
  Update $\lambda^{k+1}$ by (20).
  $k = k + 1$.
end while

B. Simulation Results
Here, we focus on CPaC for multiple connected vehicles under intersection-type scenarios in autonomous driving. In this paper, two scenarios are considered: a three-way junction scenario and an intersection scenario. The optimization algorithm is implemented on a PC with an Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60 GHz, and all programs are written in Python 3.7.

In the simulation, the steering angle $\delta$ is confined to the range $[-., .]$ rad, and the acceleration is within $[-, ]$ m/s$^2$. The length and width of all vehicles are 2.5 m and 1.6 m, respectively. The penalty parameter $\sigma$ of the augmented Lagrangian function is set to 10. The sampling time $\tau_s$ is 0.1 s. The maximum iteration numbers of the DDP algorithm and the ADMM algorithm are both set to 100. The safety distance $d_{\text{safe}}$ is set to 3 m. The lane width of all roads is set to 4 m. The initial trajectory of the DDP algorithm is the initial state held constant, with all control inputs set to 0. The prediction horizon $T$ is set to 100. The tolerance of the stopping criterion in (20) is set to 0.01.
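Under the settings above ($\tau_s = 0.1$ s), one step of the vehicle model (38) can be sketched as follows; the wheelbase value is an assumption for illustration only:

```python
import numpy as np

tau_s = 0.1   # sampling time [s], as in the simulation settings
b = 2.0       # wheelbase [m] (assumed value, not given in the text)

def f_r(v, delta):
    # arc-length term of model (38)
    return b + tau_s * v * np.cos(delta) - np.sqrt(b**2 - (tau_s * v * np.sin(delta))**2)

def step(x, u):
    """One step of (38a)-(38d): x = [p_x, p_y, theta, v], u = [delta, a]."""
    p_x, p_y, theta, v = x
    delta, a = u
    fr = f_r(v, delta)
    return np.array([
        p_x + fr * np.cos(theta),                         # (38a)
        p_y + fr * np.sin(theta),                         # (38b)
        theta + np.arcsin(tau_s * v * np.sin(delta) / b), # (38c)
        v + tau_s * a,                                    # (38d)
    ])
```

As a sanity check, with zero steering ($\delta = 0$) the arc-length term reduces to $f_r(v, 0) = \tau_s v$, so the model degenerates to straight-line motion at constant heading.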
1) Scenario 1 (Three-way Junction):
In the scenario of a three-way junction, three vehicles are used to show the performance in cooperative planning and control. Here, the three vehicles exhibit three types of behaviors (turning left, turning right, and going straight) in three lanes. All subfigures in Fig. 1 illustrate the trajectories of the three vehicles over all ADMM iterations, with different solvers used for the second subproblem.

Fig. 1. Trajectories of the 3 vehicles in all ADMM iterations when using different methods to solve the second subproblem: (a) SDP-ADMM, (b) MIQP-ADMM, (c) IPOPT-ADMM.

In this figure, the thick solid black lines are the road boundaries, and the thin dotted gray lines are the road center-lines separating lanes in different directions. The circle and diamond markers denote the start and end points of a vehicle, respectively. The reference trajectories of the three vehicles are represented by dashed lines. All trajectories of one vehicle over all ADMM iterations are drawn in a group of similar colors; for example, the group of purple lines denotes one vehicle's trajectories over all ADMM iterations, and the darker the color, the later the iteration. We can observe that the trajectories become smoother as the iterations increase. Here, the SDP-ADMM (our proposed method), the MIQP-ADMM (comparison 1), and the IPOPT-ADMM (comparison 2) denote the ADMM approach with the SDP, MIQP, and IPOPT, respectively, used to solve the second subproblem in the ADMM scheme. In particular, Fig. 1 (a) and (b) show the trajectories solved by the SDP-ADMM and the MIQP-ADMM, respectively. It is evident that the SDP-ADMM requires far fewer iterations than the MIQP-ADMM. Fig. 1 (c) shows the trajectories obtained with a widely used interior point solver (IPOPT) for the second subproblem; the IPOPT-ADMM requires considerably more iterations than both the SDP-ADMM and the MIQP-ADMM.

Fig. 2 shows how the three vehicles are controlled to reach their end points and how they avoid collisions with each other when using the SDP-ADMM. The trajectories generated in the last ADMM iteration (which meets the predefined stopping criterion) are shown in Fig. 2. The six subfigures represent the states of the vehicles at different time stamps $\tau$ in the final ADMM iteration, and the curves with different colors denote the history trajectories from 0 to $\tau$. According to Fig. 2, all vehicles complete their driving tasks while collisions are avoided. Since the three vehicles' trajectories under the MIQP-ADMM and the IPOPT-ADMM are very similar to those under the SDP-ADMM, only the resulting driving process of the SDP-ADMM is shown in this figure.

Fig. 2. Trajectories at different time stamps $\tau$ for all vehicles in the last ADMM iteration.

Fig. 3. Distance among all vehicles in the last ADMM iteration: (a) SDP-ADMM, (b) MIQP-ADMM, (c) IPOPT-ADMM.
During the driving process, the safety distance among all vehicles should be maintained to avoid potential collisions. Fig. 3 shows the distance among all vehicles in the last ADMM iteration under the three methods (SDP-ADMM, MIQP-ADMM, and IPOPT-ADMM) for solving the second subproblem. In this figure, the safety distance $d_{\text{safe}} = 3$ m is represented by the gray solid line. Based on Fig. 3, we can observe that the inter-vehicle distances remain greater than the safety distance over the whole prediction horizon.
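The safety condition visualized in Fig. 3 amounts to a simple numerical check: the minimum same-timestamp pairwise distance between vehicles must stay at or above $d_{\text{safe}}$ over the whole horizon. A minimal sketch, with made-up parallel-lane trajectories standing in for solver output:

```python
import itertools
import numpy as np

# Each trajectory has shape (T, 2): one (x, y) position per time stamp.
d_safe = 3.0

def min_pairwise_distance(trajs):
    """Smallest inter-vehicle distance at any shared time stamp."""
    return min(
        np.min(np.linalg.norm(ti - tj, axis=1))
        for ti, tj in itertools.combinations(trajs, 2)
    )

t = np.linspace(0.0, 10.0, 101)
trajs = [
    np.stack([t, np.zeros_like(t)], axis=1),       # vehicle 1: along the X axis
    np.stack([t, 4.0 * np.ones_like(t)], axis=1),  # vehicle 2: parallel lane, 4 m away
]
safe = min_pairwise_distance(trajs) >= d_safe      # lanes 4 m apart: condition holds
```

The same check, run over the solver's final trajectories, is what the gray $d_{\text{safe}}$ line in Fig. 3 expresses graphically.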
2) Scenario 2 (Intersection):
In this scenario, there are 12 vehicles driving to pass through the intersection from 4 lanes. Note that the vehicles are represented by different colors. Similarly, there are 3 vehicles in each lane carrying out three driving behaviors (turning right, turning left, and going straight).

Fig. 4. Trajectories at different time stamps $\tau$ for all vehicles in the last ADMM iteration with the use of the SDP-ADMM.

Fig. 4 shows the driving process at different time stamps $\tau$ for all vehicles, based on the last ADMM iteration when using the SDP-ADMM. It is easy to observe that all vehicles successfully avoid each other by keeping a safe distance.

Similarly, Fig. 5 demonstrates the distance among all vehicles in the last ADMM iteration under the three methods (SDP-ADMM, MIQP-ADMM, and IPOPT-ADMM) for solving the second subproblem. The gray solid line denotes the safety distance $d_{\text{safe}} = 3$ m. Clearly, the inter-vehicle distances remain greater than the safety distance over the whole prediction horizon.
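The alternation that produces these results follows the generic ADMM pattern of Algorithm 2. A scalar toy version (not the paper's actual subproblems, which involve DDP and SDR/MIQP/IPOPT) makes the $y$-update, $z$-update, and dual-update structure explicit:

```python
import numpy as np

# Toy splitting: min f(y) + g(z) s.t. y = z, with f(y) = (y - 5)^2 and
# g the indicator of the feasible interval [0, 2]. The constrained
# minimizer is y = z = 2.
sigma = 10.0                # penalty parameter of the augmented Lagrangian
y, z, u = 0.0, 0.0, 0.0     # primal variables and scaled dual u = lambda/sigma

for k in range(100):
    # first subproblem: minimize f(y) + (sigma/2)(y - z + u)^2
    # (closed form here; the paper uses DDP for the dynamics-constrained version)
    y = (10.0 + sigma * (z - u)) / (2.0 + sigma)
    # second subproblem: project y + u onto the constraint set
    # (the paper uses SDR/MIQP/IPOPT for the nonconvex collision constraints)
    z = np.clip(y + u, 0.0, 2.0)
    # dual ascent on the consensus constraint y = z
    u = u + y - z
# y and z agree on the constrained minimizer, here 2
```

The same three-step rhythm, with each update parallelized over vehicles and time stamps, is what Algorithm 2 iterates until the stopping criterion (20) is met.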
3) Comparison of Computational Time:
Table I shows the average iteration numbers of the SDP-ADMM, MIQP-ADMM, and IPOPT-ADMM in the two driving scenarios over 20 trials.

Fig. 5. Distance among all vehicles in the last ADMM iteration: (a) SDP-ADMM, (b) MIQP-ADMM, (c) IPOPT-ADMM.

TABLE I
AVERAGE ITERATION NUMBER OF THE ADMM WITH USE OF THREE METHODS TO SOLVE THE SECOND SUBPROBLEM IN THE TWO DRIVING SCENARIOS FOR 20 TRIALS.
SDP-ADMM | MIQP-ADMM | IPOPT-ADMM
4) Comparison with Solvers:
Instead of using our proposed ADMM algorithm to compute in parallel, the original nonlinear and nonconvex problem (15) can be solved directly by a widely used nonlinear programming solver, i.e., IPOPT. The approach that only uses the IPOPT is called the pure-IPOPT. Note that here, the pure-IPOPT is compared with the whole ADMM-based solving approach; that is, the comparison method uses the IPOPT to solve (15) directly, instead of solving only the second subproblem within the ADMM algorithm.

The trajectories of all vehicles obtained by using the IPOPT to solve the original problem in the two scenarios are shown in Fig. 8 (a) and (b), respectively. In scenario 1 (three-way junction), the pure-IPOPT can successfully solve the problem, but it yields a low-quality solution with a higher cost, compared with all three ADMM-based approaches. In scenario 2 (intersection), the solution of the IPOPT solver is trapped in a local minimum and cannot accomplish the driving task. On the contrary, our proposed approach finds a (sub-)optimal solution and finishes the defined driving task successfully in both scenarios. Table II compares the average computation time of the ADMM algorithm with the three methods for solving the second subproblem against the pure-IPOPT, which uses only the IPOPT without the ADMM scheme, in the two driving scenarios over 20 trials. From Table II, it is clear that our proposed approach shows the best time efficiency compared with the MIQP-ADMM, IPOPT-ADMM, and pure-IPOPT. Note that the computation time of the pure-IPOPT in scenario 2 refers to the computation time of the unsuccessful trajectories shown in Fig. 8 (b).

Fig. 6. Computational time of solving the first subproblem in the ADMM for the three methods (the subfigures in the first row show the results in scenario 1, and the subfigures in the second row show the results in scenario 2).

Fig. 7. Computational time of solving the second subproblem in the ADMM for the three methods (the subfigures in the first row show the results in scenario 1, and the subfigures in the second row show the results in scenario 2).
Our proposed approach (SDP-ADMM) also shows its effectiveness in solving such optimization problems, compared with an approach that only uses the SDP (which cannot address the nonlinear and nonconvex optimization problem (15)): the SDR can relax the nonconvex constraints, but the nonlinear constraints, i.e., the dynamics constraints, cannot be handled by the SDP alone. Besides, the dimension of the second subproblem (29) in our proposed approach is much smaller than that of the original problem, which results from the ADMM scheme separating the original problem into two manageable subproblems and computing these subproblems in a parallel manner. A similar reason explains why the IPOPT-ADMM is much faster than the pure-IPOPT. Therefore, our proposed ADMM-based approach can achieve real-time computation due to the parallel computation and the effective separation of the original optimization problem.

Fig. 8. Trajectories for all vehicles when using the pure-IPOPT to solve the original problem in the two scenarios: (a) scenario 1, (b) scenario 2.

TABLE II
COMPARISON OF THE AVERAGE COMPUTATION TIME OF THE FOUR METHODS (SDP-ADMM, MIQP-ADMM, IPOPT-ADMM, AND PURE-IPOPT) IN THE TWO DRIVING SCENARIOS FOR 20 TRIALS.
SDP-ADMM | MIQP-ADMM | IPOPT-ADMM | pure-IPOPT

VI. DISCUSSION AND CONCLUSION
A. Discussion
Based on [31], the worst-case time complexity of solving the SDP is $O(\max\{m, n\}^4 n^{1/2} \log(1/\epsilon))$, where $\epsilon \in \mathbb{R}_+$ denotes the numerical solution accuracy, $n$ is the dimension of the decision variables, and $m$ is the number of constraints. Note that this complexity does not assume sparsity or any specific structure in the matrices, which could otherwise be exploited to further reduce the computation time. Thus, the SDP is a computationally efficient approximation method for the nonconvex QCQP, because its time complexity is polynomial in the problem size $n$ and the number of constraints $m$.

As for the MIQP, the branch-and-cut algorithm is applied in most solvers. In the branch-and-cut algorithm, the set of feasible solutions is repeatedly divided into smaller and smaller subsets (branching); a lower bound on the objective (for a minimization problem) is computed for each subset (bounding); and any subset whose bound exceeds the objective value of the best known feasible solution is discarded from further consideration (pruning). Mixed-integer quadratic programming is NP-hard in general; thus, it is rather difficult to give a useful time complexity bound for the MIQP.

B. Conclusion
This paper investigates the cooperative planning and control problem for multiple CAVs in autonomous driving. A nonlinear nonconvex constrained optimization problem is suitably formulated, considering the nonlinear dynamics of the vehicle model and various coupling constraints (regarding the communications) among all CAVs. Next, we propose an ADMM-based approach to split the optimization problem into several small-scale subproblems, and these subproblems can be efficiently solved in a parallel manner. Here, the nonlinearity of the system dynamics is addressed efficiently by the DDP algorithm, and the SDR approximates the nonconvexity of the coupling constraints with small dimensions, which can also be solved in a very short time. As a result, real-time computation and implementation can be realized through our proposed approach. Two complex driving scenarios in autonomous driving are used to validate the effectiveness and computational efficiency of our proposed approach in cooperative planning and control for multiple CAVs.

REFERENCES
[1] Y. Rizk, M. Awad, and E. W. Tunstel, "Decision making in multiagent systems: A survey," IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 3, pp. 514–529, 2018.
[2] S. E. Li, Y. Zheng, K. Li, Y. Wu, J. K. Hedrick, F. Gao, and H. Zhang, "Dynamical modeling and distributed control of connected and automated vehicles: Challenges and opportunities," IEEE Intelligent Transportation Systems Magazine, vol. 9, no. 3, pp. 46–58, 2017.
[3] J. Duan, S. E. Li, Y. Guan, Q. Sun, and B. Cheng, "Hierarchical reinforcement learning for self-driving decision-making without reliance on labelled driving data," IET Intelligent Transport Systems, vol. 14, no. 5, pp. 297–305, 2020.
[4] B. Xu, X. J. Ban, Y. Bian, W. Li, J. Wang, S. E. Li, and K. Li, "Cooperative method of traffic signal optimization and speed control of connected vehicles at isolated intersections," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 4, pp. 1390–1403, 2018.
[5] Y. Rasekhipour, A. Khajepour, S.-K. Chen, and B. Litkouhi, "A potential field-based model predictive path-planning controller for autonomous road vehicles," IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 5, pp. 1255–1267, 2016.
[6] K. Chu, M. Lee, and M. Sunwoo, "Local path planning for off-road autonomous driving with avoidance of static obstacles," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 4, pp. 1599–1616, 2012.
[7] J. Ma, Z. Cheng, X. Zhang, A. A. Mamun, C. W. de Silva, and T. H. Lee, "Data-driven predictive control for multi-agent decision making with chance constraints," arXiv preprint arXiv:2011.03213, 2020.
[8] J. Chen, W. Zhan, and M. Tomizuka, "Constrained iterative LQR for on-road autonomous driving motion planning," IEEE, 2017, pp. 1–7.
[9] B. Amos, I. Jimenez, J. Sacks, B. Boots, and J. Z. Kolter, "Differentiable MPC for end-to-end planning and control," in Advances in Neural Information Processing Systems, 2018, pp. 8289–8300.
[10] J. Ma, Z. Cheng, X. Zhang, M. Tomizuka, and T. H. Lee, "Alternating direction method of multipliers for constrained iterative LQR in autonomous driving," arXiv preprint arXiv:2011.00462, 2020.
[11] X. Zhang, J. Ma, Z. Cheng, S. Huang, S. S. Ge, and T. H. Lee, "Trajectory generation by chance constrained nonlinear MPC with probabilistic prediction," IEEE Transactions on Cybernetics, 2020.
[12] Y. Guan, Y. Ren, S. E. Li, Q. Sun, L. Luo, and K. Li, "Centralized cooperation for connected and automated vehicles at intersections by proximal policy optimization," IEEE Transactions on Vehicular Technology, 2020.
[13] T. Chu, J. Wang, L. Codecà, and Z. Li, "Multi-agent deep reinforcement learning for large-scale traffic signal control," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 3, pp. 1086–1095, 2019.
[14] Y. Ren, J. Duan, Y. Guan, and S. E. Li, "Improving generalization of reinforcement learning with minimax distributional soft actor-critic," arXiv preprint arXiv:2002.05502, 2020.
[15] M. Bouton, A. Nakhaei, K. Fujimura, and M. J. Kochenderfer, "Cooperation-aware reinforcement learning for merging in dense traffic," IEEE, 2019, pp. 3441–3447.
[16] J. Chen, B. Yuan, and M. Tomizuka, "Model-free deep reinforcement learning for urban autonomous driving," IEEE, 2019, pp. 2765–2771.
[17] C. Burger and M. Lauer, "Cooperative multiple vehicle trajectory planning using MIQP," IEEE, 2018, pp. 602–607.
[18] A. Mirheli, M. Tajalli, L. Hajibabai, and A. Hajbabaie, "A consensus-based distributed trajectory control in a signal-free intersection," Transportation Research Part C: Emerging Technologies, vol. 100, pp. 161–176, 2019.
[19] J. Lee and B. Park, "Development and evaluation of a cooperative vehicle intersection control algorithm under the connected vehicles environment," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 1, pp. 81–90, 2012.
[20] F. Borrelli, D. Subramanian, A. U. Raghunathan, and L. T. Biegler, "MILP and NLP techniques for centralized trajectory planning of multiple unmanned air vehicles," IEEE, 2006, 6 pp.
[21] J. Ma, Z. Cheng, X. Zhang, M. Tomizuka, and T. H. Lee, "On symmetric Gauss-Seidel ADMM algorithm for $H_\infty$ guaranteed cost control with convex parameterization," arXiv preprint arXiv:2001.00708, 2020.
[22] Z. Zhou, J. Feng, Z. Chang, and X. Shen, "Energy-efficient edge computing service provisioning for vehicular networks: A consensus ADMM approach," IEEE Transactions on Vehicular Technology, vol. 68, no. 5, pp. 5087–5099, 2019.
[23] H. Zheng, R. R. Negenborn, and G. Lodewijks, "Fast ADMM for distributed model predictive control of cooperative waterborne AGVs," IEEE Transactions on Control Systems Technology, vol. 25, no. 4, pp. 1406–1413, 2016.
[24] J. Ma, Z. Cheng, X. Zhang, M. Tomizuka, and T. H. Lee, "Optimal decentralized control for uncertain systems by symmetric Gauss-Seidel semi-proximal ALM," arXiv preprint arXiv:2001.00306, 2020.
[25] Z. Xu, S. De, M. Figueiredo, C. Studer, and T. Goldstein, "An empirical study of ADMM for nonconvex problems," arXiv preprint arXiv:1612.03349, 2016.
[26] M. Hong, Z.-Q. Luo, and M. Razaviyayn, "Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems," SIAM Journal on Optimization, vol. 26, no. 1, pp. 337–364, 2016.
[27] X. Zhang, J. Ma, Z. Cheng, S. Huang, C. W. de Silva, and T. H. Lee, "Accelerated hierarchical ADMM for nonconvex optimization in multi-agent decision making," arXiv preprint arXiv:2011.00463, 2020.
[28] Y. Wang, W. Yin, and J. Zeng, "Global convergence of ADMM in nonconvex nonsmooth optimization," Journal of Scientific Computing, vol. 78, no. 1, pp. 29–63, 2019.
[29] Y. Tassa, T. Erez, and E. Todorov, "Synthesis and stabilization of complex behaviors through online trajectory optimization," IEEE, 2012, pp. 4906–4913.
[30] J. Park and S. Boyd, "General heuristics for nonconvex quadratically constrained quadratic programming," arXiv preprint arXiv:1703.07870, 2017.
[31] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, "Nonconvex quadratic optimization, semidefinite relaxation, and applications," IEEE Signal Processing Magazine, 2010.