A Sequential Learning Algorithm for Probabilistically Robust Controller Tuning
Robert Chin a,b,*, Chris Manzie a, Iman Shames c, Dragan Nešić a, Jonathan E. Rowe b,d

a Department of Electrical and Electronic Engineering, The University of Melbourne, Australia
b School of Computer Science, University of Birmingham, United Kingdom
c College of Engineering & Computer Science, Australian National University, Australia
d The Alan Turing Institute, United Kingdom
Abstract
In this paper, we introduce a sequential learning algorithm to address a probabilistically robust controller tuning problem. The algorithm leverages ideas from the areas of randomised algorithms and ordinal optimisation, which have both been proposed to find approximate solutions for difficult design problems in control. We formally prove that our algorithm yields a controller which meets a specified probabilistic performance specification, assuming a Gaussian or near-Gaussian copula model for the controller performances. Additionally, we characterise the computational requirement of the algorithm by using a lower bound on the distribution function of the algorithm's stopping time. To validate our work, the algorithm is then demonstrated for the purpose of tuning model predictive controllers on a diesel engine air-path. It is shown that the algorithm is able to successfully tune a single controller to meet a desired performance threshold, even in the presence of uncertainty in the diesel engine model that is inherent when a single representation is used across a fleet of vehicles.
Key words:
Randomized algorithms; Robust control; Control of uncertain systems; Statistical learning theory; Ordinal optimization.
Email address: [email protected] (Robert Chin). Corresponding author: R. Chin.

1 Introduction

When there is probabilistic plant uncertainty, the use of randomised algorithms (RA) is a well-established technique for finding approximate solutions to otherwise difficult computational problems [1]. Some key results from this area of probabilistic robust control pertain to the sample complexity, i.e. the number of simulations needed for the RA to perform analysis or design until desired probabilistic specifications are met. In control analysis, the techniques used to obtain sample complexities were originally pioneered in [2, 3], and based on well-known concentration inequalities such as the Chernoff bound. In control design, the first sample complexities for RA were provided in [4], using results from the paradigm of empirical risk minimisation in statistical learning theory (namely, using the Vapnik-Chervonenkis dimension). These randomised algorithms and variants thereof have seen numerous control applications. In fault detection, randomised algorithms have been used for finding a solution which ensures a low false alarm rate with high confidence [5]. In [6], randomised algorithms were used for the estimation and analysis of the probability of stability in high-speed communication networks.

In another line of literature, the premise of ordinal optimisation (OO) is to find approximate solutions to difficult stochastic optimisation problems [7]. Introduced in [8] for the optimisation of discrete-event dynamic systems, ordinal optimisation primarily operates under two principles: firstly, that comparison by order (as opposed to comparison based on numerical difference) is more 'robust' against noise; and secondly, that by goal softening, we can improve our chances of finding a successful solution. These advantages of ordinal comparison and goal softening have been theoretically demonstrated in [9] and [10] respectively. Noteworthy applications of OO include finding an approximate solution to the Witsenhausen problem (an unsolved problem in nonlinear stochastic optimal control) [11], and reducing the computational burden for rare event simulation of overflow probabilities in queuing systems [12].

We observe several commonalities between methods in OO and methods in RA for controller tuning. On the surface, they both seek to find approximate solutions to difficult design problems that are rendered intractable due to uncertainty. Moreover, they both employ a philosophy which can be roughly summarised as "randomly sample many candidate solutions, simulate their performances, and pick the best observed one". In OO, the practice of selecting the best is called the 'horse race' rule, and its optimality was formally shown in [13]. Additionally, [14] discusses that although this strategy seems intuitively to be the best thing to do (and is what has been done for decades), the usage of RA is justified through rigorous sample complexity estimates. A further similarity shared by OO and RA is the notion of goal softening, which can be used to control the degree of sub-optimality of the obtained solution.

This apparent connection between OO and RA has been recognised and briefly touched on in [2, 15], but as of yet has not been fully explored in the literature. In this paper, we address a controller tuning/design problem based on both OO and RA. That is, we present a randomised algorithm for solving a control design problem to meet a desired probabilistic performance specification, and characterise the sample complexity using recent work obtained in ordinal optimisation for a Gaussian copula model [16]. The Gaussian copula is a common choice for modelling the dependence structure in multivariate distributions [17], and our results are best suited (i.e. the least conservative) for Gaussian or near-Gaussian copula scenarios.
To obtain our bounds, we also employ concentration inequalities akin to those often used in the analysis of RA [18].

However, our results are distinct from existing algorithms, in that our description of performance specification differs from those typically considered in RA. The randomised algorithms in [14] produce a 'probably approximate near minimiser'. This is whereby, given some risk, accuracy and level, then with high confidence (prescribed by the risk), there is only a small chance (prescribed by the level) that a randomly sampled controller would outperform the performance observed in simulation of the obtained solution (within slack prescribed by the accuracy). In our formulation, we provide an 'end-to-end' result which says that given a risk and nominal performance threshold, there is only a small chance (prescribed by the risk) that the performance of the obtained solution tested on the true system does not outperform the nominal performance threshold.

Our algorithm uses a stopping rule, so that the sample complexity is not known in advance, but rather is a random variable induced by the randomness over each run of the algorithm. As the decision of whether to stop is learned by the algorithm from drawing sequential samples, we refer to our algorithm as a sequential learning algorithm. Another sequential learning algorithm appeared in [19], which built upon the work of [4] with less conservative sample complexities; their algorithm is based on the Rademacher bootstrap technique. Stopping rules in RA were also studied in [20] for designing linear quadratic regulators, while [21] investigated another class of sequential algorithms. A stopping rule is also considered by [22] for solving stochastic programs, in which the algorithm stops when the computed confidence widths of estimated quantities become sufficiently small; this is similar in nature to our algorithm.

This paper is organised as follows.
In Section 2, we introduce the OO success probability and state the problem formulation. In Section 3, a lower confidence bound for the OO success probability is developed. This is followed in Section 4 by our sequential learning algorithm, which applies the lower confidence bounds developed in the previous section. Additionally, we provide a lower bound for the distribution function of the stopping time of the algorithm. The algorithm is then specialised to probabilistically robust controller tuning in Algorithm 2, for which we state and prove our main result in Theorem 4. Lastly, in Section 5, we apply our algorithm to a numerical example, which considers probabilistically robust tuning of model predictive controllers on a diesel engine air-path for a fleet of vehicles.

1.1 Notation

Throughout this paper, $\mathbb{R}$ denotes the set of real numbers and $\mathbb{N}$ denotes the set of natural numbers. We let $\lfloor\cdot\rfloor$ denote the integer floor operator, and $\odot$ is the Hadamard (element-wise) product between matrices. If $Q$ is a matrix, then $Q \succ 0$ denotes that $Q$ is positive definite. The function $\exp(\cdot)$ is the exponential function, the logarithm $\log(\cdot)$ is taken to be the natural logarithm, while the cotangent function is denoted by $\cot(\cdot)$. The $N \times N$ identity matrix is denoted $I_{N \times N}$. The probability of an event and the expectation operator are denoted by $\Pr(\cdot)$ and $\mathbb{E}[\cdot]$ respectively, with context as to which probability space provided in subscripts (whenever further context is required). Equality in law between random elements is denoted with the binary relation $\overset{\mathrm{st}}{=}$. The indicator variable for the event $A$ is given by $\mathbb{I}_A$. The standard Gaussian cumulative distribution function (CDF) is denoted by $\Phi(\cdot)$, while its inverse (i.e. quantile function) is denoted by $\Phi^{-1}(\cdot)$. The exponential distribution with rate parameter $\lambda$ is represented by $\mathrm{Exp}(\lambda)$. The abbreviation i.i.d. stands for mutually independent and identically distributed.
2 Problem Formulation

Consider a measurable system performance function
$$J(\psi, \theta) : \Psi \times \Theta \to \mathbb{R}, \qquad (1)$$
with controller parameter $\theta \in \Theta$ from a topological space $\Theta$, and uncertain plant parameter $\psi \in \Psi$ from a topological space $\Psi$. The uncertainty over $\psi$ is represented by some probability distribution $P_\psi$ over $\Psi$. Without loss of generality, we use the convention that a lower $J$ indicates better performance.

Our motivation is to find a controller $\theta^*$ so that the system will perform 'well' with high probability. To this end, suppose there is a mechanism $P_\theta$ to randomly sample a candidate controller $\theta_i \in \Theta$. Then to evaluate candidate controllers, introduce the ordinal comparison function
$$\bar{J}(\theta) : \Theta \to \mathbb{R}, \qquad (2)$$
whose role is such that in order to find $\theta^*$, we first sample $n$ i.i.d. $\theta_i \sim P_\theta$ and let
$$\theta^* = \operatorname*{argmin}_{\theta_i \in \{\theta_1, \ldots, \theta_n\}} \bar{J}(\theta_i). \qquad (3)$$
As a concrete example for $\bar{J}(\theta)$, we could take $\bar{J}(\theta) = J(\bar\psi, \theta)$ for some nominal value $\bar\psi$, such as $\bar\psi = \mathbb{E}_{P_\psi}[\psi]$. An alternative example is to take $\bar{J}(\theta) = \mathbb{E}_{P_\psi}[J(\psi, \theta)]$, supposing this expectation can be evaluated.

Once $\theta^*$ has been obtained, a single 'test' of the system yields the performance $J(\psi^*, \theta^*)$, from an independently realised plant $\psi^* \sim P_\psi$. This test performance naturally predicates on how well the two random variables $\bar{J}(\theta_i)$ and $J(\psi^*, \theta_i)$ are correlated, via their dependence on $\theta_i$. A strong correlation should suggest that a well-performing $\bar{J}(\theta_i)$ is highly indicative of a well-performing $J(\psi^*, \theta_i)$; thus we would reasonably anticipate the test $J(\psi^*, \theta^*)$ to also perform well.

To formalise the concept of dependence between $\bar{J}(\theta_i)$ and $J(\psi^*, \theta_i)$, we use copulas [17]. A copula is a multivariate distribution with uniform $(0,1)$ marginals, so that via the inverse probability integral transform, a multivariate distribution may be represented with just its marginal distributions and a copula. A common choice for a copula model is the Gaussian copula, defined in the bivariate case as follows.
Definition 1 (Bivariate Gaussian copula)
Let $(Z, X)$ be a bivariate standard Gaussian with correlation $\rho \in [-1, 1]$, i.e.
$$\begin{bmatrix} Z \\ X \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \right). \qquad (4)$$
Then the Gaussian copula with correlation $\rho$ is the distribution of $(\Phi(Z), \Phi(X))$.

For a Gaussian copula, the dependence is neatly summarised with the correlation parameter $\rho$. However, not every family of copula is parametrised with a correlation. Instead, a well-defined notion of correlation valid for any bivariate distribution is the Kendall correlation.

Definition 2 (Population Kendall correlation) For a bivariate distribution $(Z, X)$, the population Kendall correlation is defined as
$$\kappa = \mathbb{E}\left[ \operatorname{sign}\left(Z - \grave{Z}\right) \operatorname{sign}\left(X - \grave{X}\right) \right], \qquad (5)$$
where $(\grave{Z}, \grave{X})$ is an independent copy of $(Z, X)$.

In this paper, it will be convenient to associate every bivariate distribution with a bivariate Gaussian copula, which we do through the Kendall correlation.
Definition 3 (Associated Gaussian copula)
For any bivariate distribution $(Z, X)$ with population Kendall correlation $\kappa$, the Gaussian copula associated with this distribution is defined as the bivariate Gaussian copula with correlation $\rho = \sin(\pi\kappa/2)$.

The formula $\rho = \sin(\pi\kappa/2)$ is from the relation between $\kappa$ and $\rho$ for a Gaussian copula [23, Equation (9.11)]. As such, any bivariate distribution with a Gaussian copula has its own copula as the associated Gaussian copula.

We are ready to list the standing assumptions of the paper, on which the main results rely.
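The relation $\rho = \sin(\pi\kappa/2)$ can be checked numerically. The sketch below is an illustrative aside (not from the paper; sample size and seed are arbitrary choices): it draws from a bivariate Gaussian, estimates the Kendall correlation, and maps it back to $\rho$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho = 0.7  # Gaussian copula correlation, chosen for illustration

# Draw from a bivariate standard Gaussian with correlation rho (Definition 1)
cov = np.array([[1.0, rho], [rho, 1.0]])
z, x = rng.multivariate_normal([0.0, 0.0], cov, size=20000).T

# Sample Kendall correlation, then map back via rho = sin(pi * kappa / 2)
kappa, _ = stats.kendalltau(z, x)
rho_recovered = np.sin(np.pi * kappa / 2)
print(rho_recovered)  # close to rho (= 0.7) for a large sample
```

Since the copula of $(Z, X)$ is invariant to monotone transformations of the marginals, the same recovery would hold after any strictly increasing distortion of $z$ and $x$.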
Assumption 1
The bivariate distribution for the performances $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ is continuous, and has population Kendall correlation $\kappa > 0$; however, the value of $\kappa$ itself is unknown.

Remark 1
By Sklar's theorem [17, Theorem 1.1], the continuity property in Assumption 1 ensures that $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ has a unique copula.

We also assume the following bound between the copula of $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ and its associated Gaussian copula.

Assumption 2 Let $(\widetilde{Z}, \widetilde{X})$ denote the copula of $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ and let $(\widetilde{Z}, \widetilde{X}')$ denote the Gaussian copula associated with $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$. For a given $\nu \in [0, 1]$, then for all $z \in (0, 1)$ we have
$$\sup_{x \in (0,1)} \left\{ \Pr\left( \widetilde{X}' \le x \,\middle|\, \widetilde{Z} = z \right) - \Pr\left( \widetilde{X} \le x \,\middle|\, \widetilde{Z} = z \right) \right\} \le \nu. \qquad (6)$$

Remark 2 The condition (6) in Assumption 2 is slightly weaker than saying that the copula of $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ is 'near' to that of a Gaussian copula. To elaborate further, a given bound on the Kolmogorov-Smirnov distance (i.e. supremum norm) or total variation distance [24] between $\Pr( \widetilde{X}' \le x \mid \widetilde{Z} = z )$ and $\Pr( \widetilde{X} \le x \mid \widetilde{Z} = z )$ will imply (6). Moreover, if $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$ is assumed to have a Gaussian copula, then (6) is satisfied with $\nu = 0$.

Now let $J^* \in \mathbb{R}$ denote a nominal performance threshold, which is used to benchmark the test performance $J(\psi^*, \theta^*)$. We require this nominal performance threshold to be feasible, in the following sense.

Assumption 3
The nominal performance threshold $J^*$ satisfies
$$\Pr_{\psi^*, \theta_i}\left( J(\psi^*, \theta_i) \le J^* \right) > 0. \qquad (7)$$

Lastly, we can forego exact knowledge of the distributions $P_\psi$ and $P_\theta$, but the standing assumption is that they can at the very least be sampled from (e.g. via a computer simulation).

Assumption 4 Samples can be drawn i.i.d. from the distributions $P_\psi$ and $P_\theta$.

As a consequence, we can produce an i.i.d. sample from the distribution of $(\bar{J}(\theta_i), J(\psi^*, \theta_i))$, which we denote
$$\left(\bar{J}(\theta_1), J(\psi_1, \theta_1)\right), \ldots, \left(\bar{J}(\theta_n), J(\psi_n, \theta_n)\right). \qquad (8)$$

We may now state the main problem addressed in this paper.

Problem 1
Suppose Assumptions 1, 2, 3 and 4 hold. Given $\gamma \in (\nu, 1)$ and nominal performance threshold $J^* \in \mathbb{R}$, find a controller $\theta^*$ such that
$$\Pr_{\psi^*, \theta^*}\left( J(\psi^*, \theta^*) \le J^* \right) \ge 1 - \gamma. \qquad (9)$$

In Section 4, we propose Algorithm 2 to address this problem, with the formal statement contained in Theorem 4.

We briefly review some results from [16], which studied a particular ordinal optimisation success probability.
Definition 4 (OO success probability)
Consider $n$ i.i.d. copies of $(Z_i, X_i)$ drawn from the distribution of $(Z, X)$, which is continuous and has population Kendall correlation $\kappa > 0$. We observe $Z_1, \ldots, Z_n$, and order these observations from best to worst, denoted by $Z_{1:n} \le \cdots \le Z_{n:n}$. The best $m$ are selected, given by $Z_{1:n} \le \cdots \le Z_{m:n}$, with respective $X$-values denoted as $X_{\langle 1 \rangle}, \ldots, X_{\langle m \rangle}$, which are initially unobserved. More explicitly, we have selected the pairs $(Z_{1:n}, X_{\langle 1 \rangle}), \ldots, (Z_{m:n}, X_{\langle m \rangle})$. The success probability is defined as
$$p_{\mathrm{success}}(n, m, \alpha) := \Pr\left( \bigcup_{i=1}^{m} \left\{ X_{\langle i \rangle} \le x^*_\alpha \right\} \right) \qquad (10)$$
$$= \Pr\left( \min_{i \in \{1, \ldots, m\}} \left\{ X_{\langle i \rangle} \right\} \le x^*_\alpha \right), \qquad (11)$$
where $x^*_\alpha$ with $\alpha \in (0, 1)$ is the $\alpha$ percentile of the distribution of $X$, i.e. $\Pr(X \le x^*_\alpha) = \alpha$.

Several properties were obtained in [16] pertaining to the OO success probability in the case of Gaussian copulas.
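As an aside, the success probability in Definition 4 is straightforward to estimate by Monte Carlo simulation, which is a useful sanity check on the bounds discussed in this section. The helper below is a sketch (function name, trial count and seed are our own choices), assuming a Gaussian copula with standard normal marginals, so that $x^*_\alpha = \Phi^{-1}(\alpha)$.

```python
import numpy as np
from scipy.stats import norm

def p_success_mc(n, m, alpha, rho, trials=2000, seed=0):
    """Monte Carlo estimate of the OO success probability (10) when
    (Z, X) has a Gaussian copula with correlation rho and standard
    normal marginals (illustrative helper, not from the paper)."""
    rng = np.random.default_rng(seed)
    x_star = norm.ppf(alpha)  # alpha-quantile of the standard normal X
    successes = 0
    for _ in range(trials):
        z = rng.standard_normal(n)
        # Conditionally on Z = z, X is Gaussian with mean rho*z, var 1 - rho^2
        x = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
        # Select the X-values paired with the m smallest observed Z-values
        selected = x[np.argsort(z)[:m]]
        successes += selected.min() <= x_star
    return successes / trials
```

Empirically, the estimate increases with $m$, $n$ and $\rho$, consistent with the monotonicity properties stated next.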
Theorem 1 (Gaussian OO success probability)
If the distribution of $(Z, X)$ in Definition 4 has a Gaussian copula with correlation $\rho > 0$, then the ordinal optimisation success probability (10), here denoted $p^N_{\mathrm{success}}(n, m, \alpha, \rho)$, satisfies the following properties.

(a) (Monotonicity in $m$) Given the triplet $(\bar{n}, \bar\alpha, \bar\rho) \in \mathbb{N} \times (0,1) \times (0,1)$, then
$$p^N_{\mathrm{success}}(\bar{n}, m, \bar\alpha, \bar\rho) \le p^N_{\mathrm{success}}(\bar{n}, m', \bar\alpha, \bar\rho) \qquad (12)$$
for all $m \in [1, \bar{n})$ and $m' \in [m, \bar{n}]$.

(b) (Monotonicity in $n$) Given the triplet $(\bar{m}, \bar\alpha, \bar\rho) \in \mathbb{N} \times (0,1) \times (0,1)$, then for all $\bar{m} \le n \le n'$, we have
$$p^N_{\mathrm{success}}(n, \bar{m}, \bar\alpha, \bar\rho) \le p^N_{\mathrm{success}}(n', \bar{m}, \bar\alpha, \bar\rho). \qquad (13)$$

(c) (Convergence of success probability) Given the triplet $(\bar{m}, \bar\alpha, \bar\rho) \in \mathbb{N} \times (0,1) \times (0,1)$, then
$$\lim_{n \to \infty} p^N_{\mathrm{success}}(n, \bar{m}, \bar\alpha, \bar\rho) = 1. \qquad (14)$$

(d) (High probability of success) Given the triplet $(\bar{m}, \bar\alpha, \bar\rho) \in \mathbb{N} \times (0,1) \times (0,1)$, and for any $\delta \in (0,1)$, there exists an $\widetilde{n}_\delta(\bar\alpha, \bar\rho) < \infty$ such that
$$p^N_{\mathrm{success}}(n, \bar{m}, \bar\alpha, \bar\rho) \ge 1 - \delta \qquad (15)$$
for all $n \ge \widetilde{n}_\delta(\bar\alpha, \bar\rho)$.

(e) (Lower bound for success probability) For any $\omega \in (0, \pi/2)$, let
$$c_1 = \frac{1}{2} - \frac{\omega}{\pi} \qquad (16)$$
$$c_2 = \frac{\cot \omega}{\pi - \omega} \qquad (17)$$
and
$$\mu_n = -\sqrt{\frac{\log(n c_1)}{c_2}} \qquad (18)$$
$$\sigma_n = \frac{-\log \log 2}{2 \sqrt{c_2 \left( \log(n c_1) - \log \log 2 \right)}}. \qquad (19)$$
Then there exists some $n^*(\omega) \in \mathbb{N}$ such that for all $n \ge n^*(\omega)$, $m \in [1, n]$, $\rho \in (0, 1)$, $\alpha \in (0, 1)$, we have
$$p^N_{\mathrm{success}}(n, m, \alpha, \rho) \ge \Phi\left( \frac{\Phi^{-1}(\alpha) - \rho \mu_n}{\sqrt{1 - \rho^2 + \rho^2 \sigma_n^2}} \right) =: \widehat{p}^N_{\mathrm{success},\omega}(n, m, \alpha, \rho). \qquad (20)$$

From Theorem 1(e), an optimised lower bound (optimised with respect to $\omega$) can be derived as
$$p^N_{\mathrm{success}}(n, m, \alpha, \rho) \ge \sup_{\omega \in \Omega_n} \widehat{p}^N_{\mathrm{success},\omega}(n, m, \alpha, \rho) =: \widehat{p}^N_{\mathrm{success}}(n, m, \alpha, \rho), \qquad (21)$$
where $\Omega_n \subset (0, \pi/2)$ is the set of all $\omega$ such that $n \ge n^*(\omega)$. A time complexity $O(1)$ numerical implementation of (21) is detailed in [16].

3 Lower Confidence Bound for the OO Success Probability

Problem 1 can be framed in the context of OO, by taking
$$(Z_i, X_i) \overset{\mathrm{st}}{=} \left( \bar{J}(\theta_i), J(\psi^*, \theta_i) \right) \qquad (22)$$
in Definition 4 with $m = 1$. However, a value for $\alpha = \Pr_{\psi^*, \theta_i}(J(\psi^*, \theta_i) \le J^*)$ is not explicitly given in Problem 1, nor can the results in Theorem 1 be readily applied, since $(Z, X)$ may generally not have a Gaussian copula. In this section, we overcome these obstacles by developing a lower confidence bound for the OO success probability. This is derived from lower confidence bounds for $\alpha$, and for $\rho$, the latter being the correlation of the associated Gaussian copula.

To facilitate this, we will work more abstractly with a continuous bivariate distribution for $(Z, X)$ with Kendall correlation $\kappa > 0$ and associated Gaussian copula correlation $\rho$. It is to be kept in mind that we can take (22) to bring the context back into controller tuning. Also, the standing assumptions can be stated in an analogous way for the distribution $(Z, X)$. In particular, the analogy to Assumption 4 is to let an i.i.d. sample of size $n$ be denoted by
$$(Z_1, X_1), \ldots, (Z_n, X_n) \overset{\mathrm{st}}{=} \left( \bar{J}(\theta_1), J(\psi_1, \theta_1) \right), \ldots, \left( \bar{J}(\theta_n), J(\psi_n, \theta_n) \right). \qquad (23)$$

First, we consider the following point estimators for $\alpha$ and $\rho$.

Definition 5 (Point estimator for $\alpha$) From the sample (23), a point estimate of $\alpha = \Pr(X \le x^*)$ for performance threshold $x^*$ is
$$\widehat{\alpha}_n := \frac{1}{n} \sum_{i=1}^{n} \mathbb{I}_{\{X_i \le x^*\}}. \qquad (24)$$

Definition 6 (Point estimator for $\rho$) From the sample (23), a point estimate of the correlation $\rho$ for the associated Gaussian copula is
$$\widehat{\rho}_n := \sin\left( \frac{\pi \max\{0, \widehat{\kappa}_n\}}{2} \right), \qquad (25)$$
where $\widehat{\kappa}_n$ is the sample Kendall correlation
$$\widehat{\kappa}_n := \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{sign}\left( (X_i - X_j)(Z_i - Z_j) \right).$$
$\qquad (26)$

We also provide monotonicity properties of the OO success probability in $\alpha$ and $\rho$, analogous to Theorem 1(a), (b).

Lemma 1 (Monotonicity in $\alpha$) Given the pair $(\bar{n}, \bar{m}) \in \mathbb{N} \times [1, \bar{n}]$, then for all $\alpha \le \alpha'$ such that $\alpha \in (0,1)$ and $\alpha' \in (0,1)$, we have for the OO success probability (10) that
$$p_{\mathrm{success}}(\bar{n}, \bar{m}, \alpha) \le p_{\mathrm{success}}(\bar{n}, \bar{m}, \alpha'). \qquad (27)$$

PROOF. By De Morgan's laws (i.e. the complement of the union is the intersection of the complements), put the definition of $p_{\mathrm{success}}(n, m, \alpha)$ from (10) in terms of
$$p_{\mathrm{success}}(n, m, \alpha) = 1 - \Pr\left( \bigcap_{i=1}^{m} \left\{ X_{\langle i \rangle} > x^*_\alpha \right\} \right). \qquad (28)$$
Then apply the properties that $x^*_\alpha$ is non-decreasing in $\alpha$ and $\Pr\left( \bigcap_{i=1}^{m} \{ X_{\langle i \rangle} > x^*_\alpha \} \right)$ is non-increasing in $x^*_\alpha$. □

Lemma 2 (Monotonicity in $\rho$) Given the triplet $(\bar{n}, \bar{m}, \bar\alpha) \in \mathbb{N} \times [1, \bar{n}] \times (0,1)$, then for all $\rho \le \rho'$ such that $\rho \in (0,1)$ and $\rho' \in (0,1)$, we have for the Gaussian copula OO success probability
$$p^N_{\mathrm{success}}(\bar{n}, \bar{m}, \bar\alpha, \rho) \le p^N_{\mathrm{success}}(\bar{n}, \bar{m}, \bar\alpha, \rho'). \qquad (29)$$

PROOF.
Provided in Appendix A. □
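The point estimators in Definitions 5 and 6 admit a direct implementation. The sketch below is illustrative (function names are our own); the pairwise $O(n^2)$ form of the sample Kendall correlation mirrors (26).

```python
import numpy as np

def alpha_hat(x, x_star):
    """Point estimator (24): empirical fraction of X_i at or below x_star."""
    return np.mean(np.asarray(x) <= x_star)

def kendall_hat(z, x):
    """Sample Kendall correlation (26), pairwise O(n^2) form."""
    z, x = np.asarray(z), np.asarray(x)
    n = len(z)
    # sign((X_i - X_j)(Z_i - Z_j)) over all pairs; diagonal terms are zero
    s = np.sign(np.subtract.outer(x, x) * np.subtract.outer(z, z))
    return s.sum() / (n * (n - 1))

def rho_hat(z, x):
    """Point estimator (25): clip the Kendall estimate at zero, then map
    through rho = sin(pi * kappa / 2)."""
    return np.sin(np.pi * max(0.0, kendall_hat(z, x)) / 2)
```

For perfectly concordant data (e.g. `z = x = [1, 2, 3, 4]`), `kendall_hat` returns 1 and `rho_hat` returns 1, as expected. The clipping at zero in (25) reflects the standing assumption $\kappa > 0$.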
Confidence bounds for $\alpha$ and $\rho$ can be obtained from the following concentration inequalities.

Lemma 3 (Concentration inequalities for $\alpha$) For $a > 0$, we have
$$\Pr(\widehat{\alpha}_n - \alpha < -a) \le \exp\left( -2 n a^2 \right) \qquad (30)$$
$$\Pr(\widehat{\alpha}_n - \alpha > a) \le \exp\left( -2 n a^2 \right). \qquad (31)$$

PROOF. Recognising that $n \widehat{\alpha}_n$ is a sum of independent Bernoulli random variables (each bounded between 0 and 1) with mean $\alpha$, use Hoeffding's inequality [18, Theorem 1.1] to obtain
$$\Pr(\widehat{\alpha}_n - \alpha > a) = \Pr(n \widehat{\alpha}_n - n\alpha > na) \qquad (32)$$
$$\le \exp\left( -2 n a^2 \right), \qquad (33)$$
and analogously for the lower tail bound. □

Lemma 4 (Concentration inequalities for $\rho$) Under Assumption 1 with the substitution (22), for $r > 0$, we have
$$\Pr(\widehat{\rho}_n - \rho < -r) \le \exp\left( -\frac{2 \lfloor n/2 \rfloor r^2}{\pi^2} \right) \qquad (34)$$
$$\Pr(\widehat{\rho}_n - \rho > r) \le \exp\left( -\frac{2 \lfloor n/2 \rfloor r^2}{\pi^2} \right). \qquad (35)$$

PROOF.
Provided in Appendix B. □
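The inversion of these tail bounds into lower confidence bounds can be sketched as follows (illustrative helper functions, with the tail exponents taken as in (30) and (35); the widths $b_1$, $b_2$ come from solving the tail probability for the deviation at a prescribed level $\beta$).

```python
import numpy as np

def alpha_lower_bound(alpha_hat_n, n, beta1):
    """Lower confidence bound for alpha at confidence >= 1 - beta1,
    by setting exp(-2 n a^2) = beta1 and solving for the width a."""
    b1 = np.sqrt(np.log(1.0 / beta1) / (2 * n))
    return alpha_hat_n - b1

def rho_lower_bound(rho_hat_n, n, beta2):
    """Lower confidence bound for rho at confidence >= 1 - beta2, by the
    analogous inversion of the Kendall-based bound with its floor(n/2)
    term."""
    b2 = np.pi * np.sqrt(np.log(1.0 / beta2) / (2 * (n // 2)))
    return rho_hat_n - b2
```

Both widths shrink at rate $O(n^{-1/2})$, which is what eventually drives the termination of the sequential algorithm in Section 4.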
Remark 3
A two-tailed bound similar to Lemma 4, with a slightly different exponent, can be found in [25, Theorem 4.2]. By lower-bounding $\lfloor n/2 \rfloor$ linearly in $n$, one can eliminate the floor operator in (34), (35) and recover the same exponent as found in [25].

From the upper-tailed concentration inequalities for $\alpha$ and $\rho$, we may then derive lower confidence bounds. To derive a lower confidence bound for $\alpha$ with confidence level at least $1 - \beta_1$, equate $\exp(-2na^2) = \beta_1$ and rearrange in the upper-tailed bound (31) to obtain
$$\Pr\left( \widehat{\alpha}_n - \alpha > \sqrt{\frac{\log(1/\beta_1)}{2n}} \right) \le \beta_1. \qquad (36)$$
Let
$$b_1 := \sqrt{\frac{\log(1/\beta_1)}{2n}}, \qquad (37)$$
so that
$$\Pr\left( \alpha > \widehat{\alpha}_n - b_1 \right) \ge 1 - \beta_1. \qquad (38)$$
Thus the lower confidence bound for $\alpha$ with confidence at least $1 - \beta_1$ is obtained as
$$\underline{\widehat{\alpha}}_n := \widehat{\alpha}_n - b_1. \qquad (39)$$
To derive a lower confidence bound for $\rho$ with confidence level at least $1 - \beta_2$, equate $\exp\left( -2\lfloor n/2 \rfloor r^2 / \pi^2 \right) = \beta_2$ and rearrange in the upper-tailed bound (35) to obtain
$$\Pr\left( \widehat{\rho}_n - \rho > \pi \sqrt{\frac{\log(1/\beta_2)}{2 \lfloor n/2 \rfloor}} \right) \le \beta_2. \qquad (40)$$
Let
$$b_2 := \pi \sqrt{\frac{\log(1/\beta_2)}{2 \lfloor n/2 \rfloor}}, \qquad (41)$$
so that
$$\Pr\left( \rho > \widehat{\rho}_n - b_2 \right) \ge 1 - \beta_2. \qquad (42)$$
Thus the lower confidence bound for $\rho$ with confidence at least $1 - \beta_2$ is obtained as
$$\underline{\widehat{\rho}}_n := \widehat{\rho}_n - b_2. \qquad (43)$$
We may also bound the difference in the success probability from that of its associated Gaussian copula.

Lemma 5 (Difference in OO success probability)
Consider the OO success probability (10) from Definition 4, and let $\rho$ be the correlation of the associated Gaussian copula. If Assumption 2 holds under the substitution (22), then for all $n \in \mathbb{N}$ and $\alpha \in (0,1)$ we have
$$p^N_{\mathrm{success}}(n, 1, \alpha, \rho) - p_{\mathrm{success}}(n, 1, \alpha) \le \nu. \qquad (44)$$

PROOF. Let $(\widetilde{Z}, \widetilde{X})$ denote the copula of $(Z, X)$ and let $(\widetilde{Z}, \widetilde{X}')$ denote the associated Gaussian copula, where the marginal $\widetilde{Z}$ can be shared since it is a uniform $(0,1)$ random variable. Using the fact that the first order statistic of an i.i.d. uniform $(0,1)$ sample is $\mathrm{Beta}(1, n)$ distributed [26], we have for $m = 1$ that
$$p_{\mathrm{success}}(n, 1, \alpha) = \int_0^1 \Pr\left( \widetilde{X} \le \alpha \,\middle|\, \widetilde{Z} = z \right) f_{U_n}(z) \, dz, \qquad (45)$$
where $f_{U_n}(\cdot)$ is the density of the $\mathrm{Beta}(1, n)$ distribution. Likewise,
$$p^N_{\mathrm{success}}(n, 1, \alpha, \rho) = \int_0^1 \Pr\left( \widetilde{X}' \le \alpha \,\middle|\, \widetilde{Z} = z \right) f_{U_n}(z) \, dz. \qquad (46)$$
The difference between these is
$$p^N_{\mathrm{success}}(n, 1, \alpha, \rho) - p_{\mathrm{success}}(n, 1, \alpha) \qquad (47)$$
$$= \int_0^1 \left( \Pr\left( \widetilde{X}' \le \alpha \,\middle|\, \widetilde{Z} = z \right) - \Pr\left( \widetilde{X} \le \alpha \,\middle|\, \widetilde{Z} = z \right) \right) f_{U_n}(z) \, dz \qquad (48)$$
$$\le \int_0^1 \sup_{\alpha \in (0,1)} \left\{ \Pr\left( \widetilde{X}' \le \alpha \,\middle|\, \widetilde{Z} = z \right) - \Pr\left( \widetilde{X} \le \alpha \,\middle|\, \widetilde{Z} = z \right) \right\} f_{U_n}(z) \, dz \qquad (49)$$
$$\le \nu \int_0^1 f_{U_n}(z) \, dz \qquad (50)$$
$$= \nu, \qquad (51)$$
where the second inequality is from (6) in Assumption 2. □

Using Lemma 5, the lower confidence bounds derived from Lemmas 3 and 4, as well as the lower bound for $p^N_{\mathrm{success}}$ in (21), we are ready to establish a lower confidence bound on the OO success probability.

Theorem 2 (Lower confidence bound for $p_{\mathrm{success}}$) Consider the OO success probability (10) from Definition 4. If Assumption 2 holds under the substitution (22), then from the sample (23), with confidence at least $1 - \beta_1 - \beta_2$, we have
$$\widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) - \nu \le p_{\mathrm{success}}(n, m, \alpha). \qquad (52)$$

PROOF. As $\widehat{p}^N_{\mathrm{success}}$ from (21) is a lower bound, then
$$\widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) \le p^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right). \qquad (53)$$
Applying Lemma 5 (which requires Assumption 2), this implies that whenever $p^N_{\mathrm{success}}( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n ) \le p^N_{\mathrm{success}}(n, 1, \alpha, \rho)$, we have
$$\widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) - \nu \le p_{\mathrm{success}}(n, 1, \alpha). \qquad (54)$$
Note that the property of monotonicity in $m$ from Theorem 1(a) also applies to any copula, because from (10) we see that
$$\Pr\left( \bigcup_{i=1}^{m} \left\{ X_{\langle i \rangle} \le x^*_\alpha \right\} \right) \le \Pr\left( \bigcup_{i=1}^{m+1} \left\{ X_{\langle i \rangle} \le x^*_\alpha \right\} \right). \qquad (55)$$
Hence we have
$$p_{\mathrm{success}}(n, 1, \alpha) \le p_{\mathrm{success}}(n, m, \alpha). \qquad (56)$$
Therefore
$$\begin{aligned}
&\Pr\left( \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) - \nu \le p_{\mathrm{success}}(n, m, \alpha) \right) \\
&\quad\ge \Pr\left( \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) - \nu \le p_{\mathrm{success}}(n, 1, \alpha) \right) \\
&\quad\ge \Pr\left( \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) \le p^N_{\mathrm{success}}(n, 1, \alpha, \rho) \right) \\
&\quad\ge \Pr\left( p^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) \le p^N_{\mathrm{success}}(n, 1, \alpha, \rho) \right) \\
&\quad\ge \Pr\left( \underline{\widehat{\alpha}}_n \le \alpha,\ \underline{\widehat{\rho}}_n \le \rho \right) \\
&\quad= 1 - \Pr\left( \underline{\widehat{\alpha}}_n > \alpha \text{ or } \underline{\widehat{\rho}}_n > \rho \right) \\
&\quad\ge 1 - \Pr\left( \underline{\widehat{\alpha}}_n > \alpha \right) - \Pr\left( \underline{\widehat{\rho}}_n > \rho \right) \\
&\quad\ge 1 - \beta_1 - \beta_2, \qquad (57)
\end{aligned}$$
where the first inequality is from applying (56), the second inequality is due to the implication (54), the third inequality is from (53), the fourth inequality is by applying the monotonicity properties from Lemmas 1 and 2, the fifth inequality is by the union bound (Boole's inequality), and the last inequality is from the lower confidence bound properties (38), (39), (42), (43). □

Remark 4 A $1 - \beta_1 - \beta_2$ lower confidence bound for the OO success probability under the associated Gaussian copula is $\widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right)$, i.e.
$$\Pr\left( \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) \le p^N_{\mathrm{success}}(n, 1, \alpha, \rho) \right) \ge 1 - \beta_1 - \beta_2. \qquad (58)$$

4 Sequential Learning Algorithm
In view of Remark 4, we present Algorithm 1, which sequentially draws samples from $(Z, X)$ and stops after a random $\tau$ samples, once an associated Gaussian copula OO success probability of at least $1 - \delta$ is reached with confidence of at least $1 - \beta_1 - \beta_2$. Note that this algorithm works irrespective of the value of $\nu$ in Assumption 2, because the algorithm considers only the associated Gaussian copula.

Algorithm 1
Sequential learning for Gaussian copula OO success probability

Require: $\delta \in (0,1)$, $\beta_1 \in (0,1)$, $\beta_2 \in (0,1)$, $x^*$, initial sample (23) of size $n_0$
1: $n \leftarrow n + 1$
2: Independently sample $(Z, X)$ and add to existing samples
3: Compute $\widehat{\alpha}_n$, $\widehat{\rho}_n$ via (24), (25), (26)
4: Compute $\underline{\widehat{\alpha}}_n$, $\underline{\widehat{\rho}}_n$ via (39), (43) using $\beta_1$, $\beta_2$ respectively
5: $p \leftarrow \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right)$
6: If $p \ge 1 - \delta$, continue; otherwise go to step 1
7: $\tau \leftarrow n$

Qualitatively, as $n$ increases, the confidence widths $b_1$ and $b_2$ decrease to zero. The lower bound from Theorem 1(e) also stipulates that $\widehat{p}^N_{\mathrm{success}}$ is increasing in $n$. Thus, we intuitively reason that Algorithm 1 eventually terminates once $n$ is sufficiently large. This intuition can be made precise with the following theorem and subsequent corollary, which use the concentration inequalities for $\alpha$ and $\rho$ to bound the distribution of the time at which Algorithm 1 stops.

Theorem 3 (Bound on stopping time)
Fix $\delta$, $\beta_1$, $\beta_2$ in Algorithm 1. Given some $n \in \mathbb{N}$, suppose the pair $(\alpha^*, \rho^*)$ satisfies $\widehat{p}^N_{\mathrm{success}}(n, 1, \alpha^*, \rho^*) \ge 1 - \delta$. Also let
$$a := \alpha - \alpha^* \qquad (59)$$
$$r := \rho - \rho^*, \qquad (60)$$
where $\alpha$, $\rho$ are the actual values of $\alpha$, $\rho$ respectively. Then, for all $n$ greater than the initial sample size, we have
$$\Pr(\tau \le n) \ge 1 - \exp\left( -2n \left( \alpha - \alpha^* - b_1 \right)^2 \right) - \exp\left( -\frac{2 \lfloor n/2 \rfloor \left( \rho - \rho^* - b_2 \right)^2}{\pi^2} \right), \qquad (61)$$
provided $\alpha - \alpha^* - b_1 > 0$ and $\rho - \rho^* - b_2 > 0$.

PROOF.
We may bound
$$\begin{aligned}
\Pr(\tau \le n) &\ge \Pr\left( \widehat{p}^N_{\mathrm{success}}\left( n, 1, \underline{\widehat{\alpha}}_n, \underline{\widehat{\rho}}_n \right) \ge 1 - \delta \right) \qquad (62) \\
&\ge \Pr\left( \underline{\widehat{\alpha}}_n \ge \alpha^*,\ \underline{\widehat{\rho}}_n \ge \rho^* \right) \qquad (63) \\
&= 1 - \Pr\left( \underline{\widehat{\alpha}}_n < \alpha^* \text{ or } \underline{\widehat{\rho}}_n < \rho^* \right) \qquad (64) \\
&\ge 1 - \Pr\left( \underline{\widehat{\alpha}}_n < \alpha^* \right) - \Pr\left( \underline{\widehat{\rho}}_n < \rho^* \right) \qquad (65) \\
&= 1 - \Pr\left( \underline{\widehat{\alpha}}_n - \alpha < -a \right) - \Pr\left( \underline{\widehat{\rho}}_n - \rho < -r \right) \qquad (66) \\
&= 1 - \Pr\left( \widehat{\alpha}_n - \alpha < -a + b_1 \right) - \Pr\left( \widehat{\rho}_n - \rho < -r + b_2 \right) \qquad (67) \\
&\ge 1 - \exp\left( -2n (a - b_1)^2 \right) - \exp\left( -\frac{2 \lfloor n/2 \rfloor (r - b_2)^2}{\pi^2} \right), \qquad (68)
\end{aligned}$$
where the first inequality holds because of the stopping condition, the second inequality is by the definition of $\alpha^*$ and $\rho^*$ along with the monotonicity properties from Lemmas 1 and 2, the third inequality is from the union bound (Boole's inequality), and the fourth inequality is by application of the lower-tailed concentration inequalities (30), (34) from Lemmas 3 and 4 respectively. Substituting (59), (60) completes the proof. □
Under Assumptions 1 and 3 with the substitution (22), the stopping time $\tau$ from Algorithm 1 satisfies
$$\Pr(\tau < \infty) = 1. \qquad (69)$$

PROOF.
Assumptions 1 and 3 ensure that $\alpha > 0$ and $\rho > 0$. By Theorem 1(b), (d), for any $\delta > 0$ there exists a pair $(\alpha^*, \rho^*)$ such that $\alpha - \alpha^* - b_1 > 0$ and $\rho - \rho^* - b_2 > 0$ for all $n$ greater than some sufficiently large number. Hence, from the monotone convergence theorem [27, Theorem 4.8], we have
$$\Pr(\tau < \infty) = \lim_{n\to\infty} \Pr(\tau < n + 1) = \lim_{n\to\infty} \Pr(\tau \le n) \ge \lim_{n\to\infty}\left[1 - \exp\left(-2n\left(\alpha - \alpha^* - b_1\right)^2\right) - \exp\left(-\left\lfloor\tfrac{n}{2}\right\rfloor \frac{2\left(\rho - \rho^* - b_2\right)^2}{\pi^2}\right)\right] = 1, \qquad \text{(70)}$$
where the inequality is by applying Theorem 3. $\Box$

Remark 5 (Optimised bound on stopping time)
We can also numerically optimise the bound (61) with respect to $(\alpha^*, \rho^*)$. This can be useful for characterising the computational requirement (i.e. the number of samples that need to be simulated) of the algorithm. Further details on optimising the bound are provided in Appendix C.

4.1 Controller Tuning Algorithm

Next, we specialise Algorithm 1 to the context of controller tuning, in order to explicitly address Problem 1. This is presented in Algorithm 2, which now also outputs the tuned controller $\theta^*_\tau$.

Algorithm 2
Probabilistically robust controller tuning
Require: $\delta \in (0,1)$, $\beta_1 \in (0,1)$, $\beta_2 \in (0,1)$, performance threshold $J^*$, initial sample (8) of size $n_0$.
1. $n \leftarrow n + 1$.
2. Independently sample $\theta_i$ and $\psi_i$.
3. Form $(Z_i, X_i) \leftarrow \left(\bar J(\theta_i), J(\psi_i, \theta_i)\right)$ and add to the existing samples.
4. Compute $\hat\alpha_n$ via (24) with performance threshold $J^*$.
5. Compute $\hat\rho_n$ via (25), (26).
6. Compute $\underline{\hat\alpha}_n$, $\underline{\hat\rho}_n$ via (39), (43), using $\beta_1$, $\beta_2$ respectively.
7. $p \leftarrow \hat p^N_{\text{success}}\left(n, 1, \underline{\hat\alpha}_n, \underline{\hat\rho}_n\right)$.
8. If $p \ge 1 - \delta$, continue; otherwise go to step 1.
9. $\tau \leftarrow n$.
10. $\theta^*_\tau \leftarrow \arg\min_{i \in \{1, \dots, \tau\}} \bar J(\theta_i)$.

By chaining the confidence level with the OO success probability, we demonstrate how Algorithm 2 addresses Problem 1, via the following theorem.

Theorem 4
Suppose Algorithm 2 is applied to tuning controllers of a system with a performance function $J(\psi, \theta)$. Let $\theta^*_\tau$ denote the candidate solution output by the algorithm. Under Assumptions 1, 2, 3 and 4, given any $\gamma \in (\nu, 1)$ and any $\delta > 0$, $\beta_1 > 0$, $\beta_2 > 0$ with $\delta + \beta_1 + \beta_2 = \gamma - \nu$, we have
$$\Pr_{\psi^*, \theta^*_\tau}\left(J(\psi^*, \theta^*_\tau) \le J^*\right) \ge 1 - \gamma. \qquad \text{(71)}$$

PROOF.
Let $\tilde n_\delta(\alpha, \rho)$ denote the smallest integer $n$ such that $p^N_{\text{success}}(n, 1, \alpha, \rho) \ge 1 - \delta$. Combining this with Lemma 3 (requiring Assumption 2), we have
$$\Pr_{\psi^*, \theta^*_\tau}\left(J(\psi^*, \theta^*_\tau) \le J^* \,\middle|\, \tau \ge \tilde n_\delta(\alpha, \rho)\right) \ge 1 - \delta - \nu. \qquad \text{(72)}$$
Recognise that, for any $\tau$ such that $p^N_{\text{success}}(\tau, 1, \alpha, \rho) \ge 1 - \delta$, this implies
$$\tau \ge \tilde n_\delta(\alpha, \rho), \qquad \text{(73)}$$
by the definition of $\tilde n_\delta(\alpha, \rho)$ and due to monotonicity in $n$ (Theorem 1(b)). As noted in Corollary 1 (requiring Assumptions 1 and 3), the algorithm stops at time $\tau$ with probability one, such that
$$\hat p^N_{\text{success}}\left(\tau, 1, \underline{\hat\alpha}_\tau, \underline{\hat\rho}_\tau\right) \ge 1 - \delta. \qquad \text{(74)}$$
Then, letting $\beta = \beta_1 + \beta_2$, we have
$$1 - \beta_1 - \beta_2 = 1 - \beta \le \Pr\left(\hat p^N_{\text{success}}\left(\tau, 1, \underline{\hat\alpha}_\tau, \underline{\hat\rho}_\tau\right) \le p^N_{\text{success}}(\tau, 1, \alpha, \rho)\right) \le \Pr\left(1 - \delta \le p^N_{\text{success}}(\tau, 1, \alpha, \rho)\right) \le \Pr\left(\tau \ge \tilde n_\delta(\alpha, \rho)\right), \qquad \text{(75)}$$
where the first inequality is via Remark 4, the second inequality is due to the stopping condition (74), and the third inequality is due to the implication (73). Thus
$$\Pr\left(J(\psi^*, \theta^*_\tau) \le J^*\right) \ge \Pr\left(J(\psi^*, \theta^*_\tau) \le J^*,\ \tau \ge \tilde n_\delta(\alpha, \rho)\right) = \Pr\left(J(\psi^*, \theta^*_\tau) \le J^* \,\middle|\, \tau \ge \tilde n_\delta(\alpha, \rho)\right) \Pr\left(\tau \ge \tilde n_\delta(\alpha, \rho)\right) \ge (1 - \delta - \nu)(1 - \beta) > 1 - \delta - \nu - \beta = 1 - \gamma, \qquad \text{(76)}$$
because $\delta$, $\beta_1$, $\beta_2$ were chosen such that
$$\gamma = \delta + \beta_1 + \beta_2 + \nu. \qquad \text{(77)}$$
$\Box$

Remark 6
An explicit choice of the algorithm settings may, for instance, be $\delta = \beta_1 = \beta_2 = (\gamma - \nu)/3$, or $\delta = (\gamma - \nu)/2$, $\beta_1 = \beta_2 = (\gamma - \nu)/4$. The lower bound on the stopping time in Theorem 3 will generally depend on $\delta$, $\beta_1$, $\beta_2$, so by choosing an appropriate combination of $\delta$, $\beta_1$, $\beta_2$, the performance of the algorithm can potentially be improved. However, doing so would not be reasonable in practice, since it also requires the actual values of $\alpha$ and $\rho$ to be known.

Remark 7: If $\left(\bar J(\theta_i), J(\psi^*, \theta_i)\right)$ is assumed to have a Gaussian copula, we may take $\nu = 0$ as per Remark 2, so $\Pr_{\psi^*, \theta^*_\tau}\left(J(\psi^*, \theta^*_\tau) \le J^*\right)$ can be made arbitrarily close to one. This is because the lower tail boundary conditional CDF of the bivariate Gaussian copula is degenerate at zero for $\rho > 0$ [17], hence in (46) for the OO success probability, $\lim_{n\to\infty} p^N_{\text{success}}(n, 1, \alpha, \rho) = 1$. However, there exist families of bivariate copulas whose lower tail boundary conditional CDF is non-degenerate (e.g. the bivariate Frank copula [17]), in which case in (45), generally $\lim_{n\to\infty} p_{\text{success}}(n, 1, \alpha) \ne 1$. Therefore, in the case $\nu > 0$, we generally cannot make $\Pr_{\psi^*, \theta^*_\tau}\left(J(\psi^*, \theta^*_\tau) \le J^*\right)$ arbitrarily close to one.

Remark 8
Although the value of $\nu$ is treated as given in Theorem 4, it might not be explicitly known a priori in practice, and must instead be assumed. However, once the algorithm has finished running, an a posteriori value for $\nu$ can be estimated from the collected sample. This is demonstrated in the numerical example.

We demonstrate our proposed approach on a numerical example: tuning model predictive controllers (MPC) for automotive diesel engines. Typically in production, a single controller is tuned for an entire fleet of vehicles; however, the exact model representing the dynamics of any individual engine may differ slightly. Here, an individual engine is modelled by a nominal linear process model
$$x_{k+1} = A x_k + B u_k, \qquad \text{(78)}$$
with the nominal matrices $\bar A \in \mathbb{R}^{n\times n}$, $\bar B \in \mathbb{R}^{n\times m}$ subjected to a Gaussian disturbance
$$A = \bar A + S_A \odot Z_{n\times n}, \qquad \text{(79)}$$
$$B = \bar B + S_B \odot Z_{n\times m}, \qquad \text{(80)}$$
where $Z_{N\times M}$ is an $N \times M$ matrix of independent standard Gaussian entries, while $S_A$, $S_B$ contain the standard deviations of the disturbances. These disturbances model the variation between vehicles in the fleet. We apply Algorithm 2 to tune engine controllers so that the performance (measured in settling time) is robust to these variations.

The air-path of an automotive diesel engine can be locally represented by a reduced-order linear model with 4 states and 3 inputs [28]. The state vector is denoted by
$$x = \begin{bmatrix} p_{\text{im}} & p_{\text{em}} & W_{\text{comp}} & y_{\text{EGR}} \end{bmatrix}^\top, \qquad \text{(81)}$$
where $p_{\text{im}}$ is the engine intake manifold pressure, $p_{\text{em}}$ is the exhaust manifold pressure, $W_{\text{comp}}$ is the compressor mass flow rate, and $y_{\text{EGR}}$ is known as the exhaust gas recirculation (EGR) rate. The inputs comprise the actuation signals
$$u = \begin{bmatrix} u_{\text{thr}} & u_{\text{EGR}} & u_{\text{VGT}} \end{bmatrix}^\top, \qquad \text{(82)}$$
where $u_{\text{thr}}$ is for the throttle valve, $u_{\text{EGR}}$ is for the EGR valve, and $u_{\text{VGT}}$ is for the variable geometry turbine (VGT) vane.
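The fleet sampling in (79)-(80) amounts to an element-wise (Hadamard) Gaussian perturbation of the nominal matrices. A minimal sketch, using small hypothetical placeholder values for $\bar A$, $\bar B$, $S_A$, $S_B$ rather than the identified air-path values:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical placeholder nominal model and perturbation scales; the
# identified diesel air-path values are the ones referred to in the text.
A_bar = np.array([[0.9, 0.05],
                  [0.0, 0.8]])
B_bar = np.array([[0.1],
                  [0.5]])
S_A = 1e-2 * np.ones_like(A_bar)  # entry-wise standard deviations for A
S_B = 1e-2 * np.ones_like(B_bar)  # entry-wise standard deviations for B

def sample_plant():
    """Draw one fleet member psi = (A, B) as in (79)-(80):
    A = A_bar + S_A (Hadamard product) Z, with Z i.i.d. standard Gaussian."""
    A = A_bar + S_A * rng.standard_normal(A_bar.shape)
    B = B_bar + S_B * rng.standard_normal(B_bar.shape)
    return A, B

A, B = sample_plant()
```

Each call yields an independent plant $\psi_i$; Algorithm 2 pairs each such plant with an independently sampled controller $\theta_i$.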
At a particular operating condition, the nominal model (78), with normalised units and variables trimmed about the equilibrium point, has been obtained; the identified numerical matrices are $\bar A \in \mathbb{R}^{4\times4}$ (83) and $\bar B \in \mathbb{R}^{4\times3}$ (84). We consider the task of regulating the states to the origin from an initial condition $x_0$ (85), which constitutes a step change from a 'medium' engine operating condition to a 'high' engine operating condition. Let the system performance function for our regulation problem be defined as
$$J(\psi, \theta) = \mathrm{SettlingTime}\left(Y_{\psi,\theta}\right), \qquad \text{(86)}$$
where $Y_{\psi,\theta}$ is the discrete-time closed-loop trajectory of $y_{\text{EGR}}$ under controller $\theta$ on plant $\psi$, and $\mathrm{SettlingTime}(\cdot)$ is the 2% settling time. The settling time function is implemented in Matlab; it linearly interpolates between discrete-time samples, and is thus continuous-valued. Here, the MPC law is obtained by solving the receding-horizon quadratic cost problem
$$\begin{aligned}
\min_{u_{k|0}, \dots, u_{k|M-1}} \quad & \sum_{i=0}^{M-1}\left(x_{k|i}^\top Q x_{k|i} + u_{k|i}^\top R u_{k|i}\right) + x_{k|M}^\top P x_{k|M} \\
\text{subject to} \quad & x_{k|i+1} = A x_{k|i} + B u_{k|i}, \quad i = 0, \dots, M-1 \\
& x_{k|i} \le f, \quad i = 1, \dots, M \\
& E u_{k|i} \le h, \quad i = 0, \dots, M-1 \\
& \left|u_{k|i} - u_{k|i-1}\right| \le \tilde u, \quad i = 0, \dots, M-1,
\end{aligned} \qquad \text{(87)}$$
where $Q \succ 0$, $P \succ 0$, $R \succ 0$, with prediction horizon $M$ (88), state constraint values $f$ (89) (representing physical constraints on the signals), input constraint values
$$E = \begin{bmatrix} I_{3\times3} \\ -I_{3\times3} \end{bmatrix} \qquad \text{(90)}$$
and $h$ (91), and slew rate limit $\tilde u$ (92), while the initial input begins at a fixed $u_0$. At state $x_k$ for $k \ge 1$, the optimal solution to (87) with $x_{k|0} = x_k$ is obtained as $\left(u^*_{k|0}, \dots, u^*_{k|M-1}\right)$, and the control law applied at time $k$ is $u_k = u^*_{k|0}$.

The tunable controller parameters are the positive definite cost matrices
$$\theta = (Q, P, R). \qquad \text{(93)}$$
The mechanism we use for randomly generating the $Q$, $R$ matrices, following an approach similar to [29, 30], is given by
$$Q = W_Q D_Q W_Q^\top, \qquad \text{(94)}$$
$$R = W_R D_R W_R^\top, \qquad \text{(95)}$$
where:
• $W_Q$ is a uniformly random orthogonal matrix of dimension $4 \times 4$;
• $W_R$ is a uniformly random orthogonal matrix of dimension $3 \times 3$;
• $D_Q$ is a diagonal matrix whose diagonal elements are independently $\mathrm{Exp}(1)$ distributed;
• $D_R$ is a diagonal matrix whose diagonal elements are independently exponentially distributed;
• $P$ is fixed with respect to $A$, $B$, $Q$, $R$ by solving the discrete-time algebraic Riccati equation.

Uncertainty quantification for the diesel engine air-path has been performed using the methodology detailed in [31], which we assume, for the purpose of this example, represents the uncertainty over a fleet of vehicles. Each random plant is characterised by the pair
$$\psi = (A, B), \qquad \text{(96)}$$
and $\psi$ can be sampled by perturbing the nominal model (78) with small noise, using the identified standard deviation matrices $S_A$ (97) and $S_B$ (98) in (79), (80).

Let the performance comparison function $\bar J(\theta)$ be the settling time obtained when the nominal model $\bar\psi = (\bar A, \bar B)$ is used as the plant dynamics in closed loop under controller $\theta$, i.e.
$$\bar J(\theta) = \mathrm{SettlingTime}\left(Y_{\bar\psi,\theta}\right). \qquad \text{(99)}$$
We also set a desired nominal performance threshold of $J^* = 6.75$ seconds for the settling time.
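The original setup computes the 2% settling time in Matlab. The sketch below shows one way the interpolating convention described above could be implemented; it assumes regulation to zero and a band equal to 2% of the initial deviation, and is not the exact Matlab routine:

```python
import numpy as np

def settling_time(y, dt=1.0, frac=0.02):
    """2% settling time of a sampled trajectory regulated to zero.

    Returns the time after which |y| remains inside the band
    frac * |y[0]|, linearly interpolating the final band crossing so
    the result varies continuously with the trajectory (a sketch of
    one common convention).
    """
    y = np.asarray(y, dtype=float)
    band = frac * abs(y[0])
    outside = np.abs(y) > band
    if not outside.any():
        return 0.0                        # already settled at t = 0
    k = int(np.flatnonzero(outside)[-1])  # last sample outside the band
    if k == len(y) - 1:
        return float("inf")               # never settles within the data
    # interpolate where |y| crosses the band between samples k and k+1
    y0, y1 = abs(y[k]), abs(y[k + 1])
    theta = (y0 - band) / (y0 - y1)       # y0 > band >= y1, so 0 < theta <= 1
    return (k + theta) * dt
```

With $Y_{\psi,\theta}$ the closed-loop $y_{\text{EGR}}$ trajectory, $J(\psi, \theta)$ would then be `settling_time(Y)` evaluated at the simulation sample period.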
Assume a value of $\nu = 0.1$. By running Algorithm 2 with settings $\delta = 0.1$, $\beta_1 = 0.05$, $\beta_2 = 0.05$, then from Theorem 4 the prescribed probability of the nominal performance threshold being met in a single test is at least $1 - \gamma = 0.7$. We ran the algorithm once, which stopped after $\tau = 1544$ samples. Histograms of the performances $\bar J(\theta_i)$ and $J(\psi_i, \theta_i)$ are plotted in Figure 1.

Fig. 1. Histograms of $\bar J(\theta_i)$ and $J(\psi_i, \theta_i)$ for the 1544 samples in a single run of Algorithm 2.

The best controller $\theta^*_\tau$, when evaluated on the performance comparison function $\bar J(\theta)$, was found to have a settling time of just over 2 seconds. Testing $\theta^*_\tau$ on a randomly generated plant $\psi^*$, we obtained a trajectory for $y_{\text{EGR}}$ with a settling time of just over 3 seconds, within the nominal threshold of $J^* = 6.75$ seconds. We then repeated this test on 10000 independently sampled plants; Theorem 4 prescribes the probability of outperforming $J^* = 6.75$ in each test to be at least $1 - \gamma = 0.7$. We found that all 10000 of the tests outperformed the nominal threshold, which far exceeds 0.7, with the minimum, mean and maximum performance times likewise all well below the threshold.

Aggregate results were also obtained for 1000 independent runs of Algorithm 2, with identical tuning procedure and settings as described above, producing 1000 tuned controllers. Each controller was then tested on its own randomly generated plant. It was found that all 1000 tests succeeded in outperforming the nominal threshold of $J^* = 6.75$ seconds, again with minimum, mean and maximum performance times well below the threshold.

For this example, we may validate Theorem 3 by plotting in Figure 2 the numerically optimised lower bound for the CDF of the stopping time $\tau$ (using the point estimates $\hat\alpha_\tau$, $\hat\rho_\tau$ in place of the actual $\alpha$, $\rho$), against the empirical CDF of the stopping time over the 1000 runs.

Fig. 2. The empirical algorithm run times compared against those suggested by Theorem 3.

As the curves in Figure 2 are within less than an order of magnitude of each other on the horizontal scale, this suggests that Theorem 3 is not overly conservative. However, our empirical results also indicate that $1 - \gamma$ can be conservative for the actual probability of outperforming $J^*$ (which appears to be much closer to 1 than to 0.7). This is partly attributable to the estimation of $\alpha$ and $\rho$: the lower confidence bounds (39), (43) are non-asymptotic, and such bounds tend to be more conservative than their asymptotic counterparts. This conservativeness can be somewhat reduced in absolute terms by choosing a lower $\gamma$. For instance, one could select $\delta$, $\beta_1$, $\beta_2$ such that $\gamma = 0.15$. However, this comes at a tradeoff of longer stopping times, hence more computation. Using point estimates in place of the actual $\alpha$ and $\rho$, applying Theorem 3 suggests that Algorithm 2 will have stopped by $n = 40000$ with high probability; with settings chosen such that $\gamma = 0.11$, this value for $n$ changes to 480000.

Decreasing $J^*$ rather than $\gamma$ also comes at a tradeoff of more computation. We performed additional simulations identical to Section 5.2 with a single tuned controller, except with $J^* = 6.5$. After 10000 independent tests, the minimum, mean and maximum settling times were again all below the original threshold of $J^* = 6.75$. However, this run of Algorithm 2 took 37144 samples. Using point estimates in place of the actual $\alpha$ and $\rho$, applying Theorem 3 suggests that Algorithm 2 will have stopped by $n = 6800$ with probability at least 0.999 for $J^* = 6.75$ (evident from Figure 2). On the other hand, with $J^* = 6.5$, Theorem 3 suggests that Algorithm 2 will have stopped by $n = 93000$ with comparably high probability. While $J^*$ does seem to be conservative of the achieved performance in this example, decreasing $J^*$ by a small amount does not appear to improve the test performance, yet results in an order-of-magnitude increase in the computation time.

From the relatively large sample of size 37144 obtained in the previous paragraph, we can also numerically validate the assumption that $\nu = 0.1$. A nonparametric copula kernel density estimate was fitted from this sample, using the kdecopula package in R [32], which implements the transformation local likelihood estimation method [33]. A Gaussian copula was also fitted with the point estimate for $\rho$. Performing a sweep over $z$ and $x$ values in (6), we computed
$$\sup_{z, x \in (0,1)} \left\{ \Pr\left(\tilde X' \le x \,\middle|\, \tilde Z = z\right) - \Pr\left(\tilde X \le x \,\middle|\, \tilde Z = z\right) \right\}, \qquad \text{(100)}$$
and found the value to be consistent with the assumption of $\nu = 0.1$.

In this paper, we addressed a probabilistically robust control design problem using a sequential learning algorithm based on ordinal optimisation. Our results were enabled by relating the OO success probability to previous work on Gaussian copula OO, via the associated Gaussian copula. The algorithm was illustrated on a numerical example involving the tuning of MPC for automotive diesel engines, and showed that the performance of the tuned controllers was robust to plant uncertainty, both in a multi-plant setting over a fleet of vehicles with a single algorithm run, and in a multi-controller setting over many algorithm runs.

One avenue for future research is to investigate the role that the distribution $P_\theta$ (for sampling candidate controllers) plays in the tuned controller performance. In our formulation, $P_\theta$ is an arbitrary choice left to the practitioner. In Section 5, $P_\theta$ was chosen by explicitly constructing a distribution; however, another option would have been to let $\theta_i$ be the solution output by running a randomised optimisation algorithm with objective $\bar J(\theta)$.
Modifying $P_\theta$ affects the distribution of $\left(\bar J(\theta_i), J(\psi^*, \theta_i)\right)$, and consequently the value of $\nu$. Hence it is perhaps possible to find guiding principles for designing $P_\theta$ which lead to more favourable probabilistic performance specifications, or alternatively, to reduced computational requirements for a fixed performance specification.

The first author acknowledges the support of the Elizabeth & Vernon Puzey Scholarship. Computations described in this paper were performed using the University of Birmingham's BEAR [34] and The University of Melbourne's Spartan [35] high performance computing services.
A Proof of Lemma 2
Let $G$ be the random variable counting the number of $X$-values less than or equal to the threshold $x^*_\alpha$. Conditional on $G = g$, we can write the order statistics of the $X$-values as
$$X_{1:n} \le \cdots \le X_{g:n} \le x^*_\alpha < X_{(g+1):n} \le \cdots \le X_{n:n}. \qquad \text{(A.1)}$$
Using the additive Gaussian noise representation of the Gaussian copula [16, Theorem 4.1], we can assume without loss of generality that each $X \sim \mathcal{N}(0, 1)$ and that $Z$ is formed by
$$Z = X + Y, \qquad \text{(A.2)}$$
where $Y \sim \mathcal{N}(0, \xi^2)$ is independent of $X$, and $\xi^2 = 1/\rho^2 - 1$. Introduce the following indexing of the $Z$-values according to the ordering of the $X$-values. We denote
$$Z_{\{i\}} := X_{i:n} + Y_i, \qquad \text{(A.3)}$$
where the $Y_i$ are i.i.d. $\mathcal{N}(0, \xi^2)$, since the $Y$-values are independent of the ordering of the $X$-values. Let $p^N_{\text{success}|g}(n, m, \alpha, \rho)$ denote the conditional success probability, given $G = g$. An equivalent characterisation of the conditional success probability can be obtained from [7, Equation (2.19)]. This way, we may write
$$p^N_{\text{success}|g}(n, m, \alpha, \rho) = \Pr\left( \min\left\{Z_{\{1\}}, \dots, Z_{\{g\}}\right\} \le \min{}^{(m)}\left\{Z_{\{g+1\}}, \dots, Z_{\{n\}}\right\} \,\middle|\, G = g \right), \qquad \text{(A.4)}$$
where $\min^{(m)}\{\cdot\}$ denotes the $m$th smallest value of its arguments. Substituting (A.3), and noting that subtracting the constant $x^*_\alpha$ from every term leaves the event unchanged, we have
$$p^N_{\text{success}|g}(n, m, \alpha, \rho) = \Pr\left( \min\left\{Y_1 + \Delta X_{1:n}, \dots, Y_g + \Delta X_{g:n}\right\} \le \min{}^{(m)}\left\{Y_{g+1} + \Delta X_{(g+1):n}, \dots, Y_n + \Delta X_{n:n}\right\} \,\middle|\, G = g \right),$$
where $\Delta X_{i:n} := X_{i:n} - x^*_\alpha$. Now let $\tilde Y_i \sim \mathcal{N}(0, 1)$ represent a standardised random variable, so that $Y_i \overset{\text{st}}{=} \xi \tilde Y_i$; substituting and dividing through by $\xi$ (A.5)-(A.7), we obtain
$$p^N_{\text{success}|g}(n, m, \alpha, \rho) = \Pr\left( \min\left\{\tilde Y_1 + \tfrac{\Delta X_{1:n}}{\xi}, \dots, \tilde Y_g + \tfrac{\Delta X_{g:n}}{\xi}\right\} \le \min{}^{(m)}\left\{\tilde Y_{g+1} + \tfrac{\Delta X_{(g+1):n}}{\xi}, \dots, \tilde Y_n + \tfrac{\Delta X_{n:n}}{\xi}\right\} \,\middle|\, G = g \right),$$
because $\xi > 0$. Let any fixed realisation of the random variables $\tilde Y_1, \dots, \tilde Y_n, \Delta X_{1:n}, \dots, \Delta X_{n:n}$ be denoted $\tilde y_1, \dots, \tilde y_n, \Delta x_{1:n}, \dots, \Delta x_{n:n}$ respectively. Observe that $\Delta X_{i:n} \le 0$ for $i \le g$, and $\Delta X_{i:n} > 0$ for $i > g$. So for any $\xi' < \xi$, we have
$$\tilde y_i + \frac{\Delta x_{i:n}}{\xi'} \le \tilde y_i + \frac{\Delta x_{i:n}}{\xi}, \quad \forall i \le g, \qquad \text{(A.8)}$$
$$\tilde y_i + \frac{\Delta x_{i:n}}{\xi'} > \tilde y_i + \frac{\Delta x_{i:n}}{\xi}, \quad \forall i > g. \qquad \text{(A.9)}$$
Therefore it follows that
$$\min\left\{\tilde y_1 + \tfrac{\Delta x_{1:n}}{\xi'}, \dots, \tilde y_g + \tfrac{\Delta x_{g:n}}{\xi'}\right\} \le \min\left\{\tilde y_1 + \tfrac{\Delta x_{1:n}}{\xi}, \dots, \tilde y_g + \tfrac{\Delta x_{g:n}}{\xi}\right\}, \qquad \text{(A.10)}$$
$$\min{}^{(m)}\left\{\tilde y_{g+1} + \tfrac{\Delta x_{(g+1):n}}{\xi}, \dots, \tilde y_n + \tfrac{\Delta x_{n:n}}{\xi}\right\} \le \min{}^{(m)}\left\{\tilde y_{g+1} + \tfrac{\Delta x_{(g+1):n}}{\xi'}, \dots, \tilde y_n + \tfrac{\Delta x_{n:n}}{\xi'}\right\}. \qquad \text{(A.11)}$$
Denote $\rho' = \left(\xi'^2 + 1\right)^{-1/2}$, so that $\rho < \rho'$. Then
$$p^N_{\text{success}|g}(n, m, \alpha, \rho') \ge p^N_{\text{success}|g}(n, m, \alpha, \rho),$$
where the inequality is from applying (A.10) and (A.11). The random variable $G$ is binomially distributed with parameters $n$, $\alpha$ (i.e. not affected by the value of $\rho$), thus
$$p^N_{\text{success}}(\bar n, \bar m, \bar\alpha, \rho') = \sum_{g=0}^{\bar n} p^N_{\text{success}|g}(\bar n, \bar m, \bar\alpha, \rho') \Pr(G = g) \ge \sum_{g=0}^{\bar n} p^N_{\text{success}|g}(\bar n, \bar m, \bar\alpha, \rho) \Pr(G = g) = p^N_{\text{success}}(\bar n, \bar m, \bar\alpha, \rho). \qquad \text{(A.12)}$$
$\Box$

B Proof of Lemma 4
We prove the upper tail concentration bound; the steps for the lower tail are similar. From Definition 3, the population Kendall correlation $\kappa$ and the associated Gaussian copula correlation $\rho$ are related by $\rho = \sin(\pi\kappa/2)$. For $r > 0$,
$$\Pr\left(\hat\rho_n - \rho > r\right) = \Pr\left( \sin\left(\tfrac{\pi}{2}\max\{\hat\kappa_n, 0\}\right) - \sin\left(\tfrac{\pi}{2}\kappa\right) > r \right) = \Pr\left( \sin\left(\tfrac{\pi}{2}\hat\kappa_n\right) - \sin\left(\tfrac{\pi}{2}\kappa\right) > r \right),$$
where we are able to take $\max\{\hat\kappa_n, 0\} = \hat\kappa_n$, since on the event $\hat\rho_n - \rho > r$ with $\rho > 0$ we have $\hat\rho_n > 0$ and hence $\hat\kappa_n > 0$; moreover, this event together with $r > 0$ implies $\hat\kappa_n - \kappa > 0$. Since $\sin(\cdot)$ is 1-Lipschitz continuous, in general
$$\left| \sin\left(\tfrac{\pi}{2}\hat\kappa_n\right) - \sin\left(\tfrac{\pi}{2}\kappa\right) \right| \le \left| \tfrac{\pi}{2}\hat\kappa_n - \tfrac{\pi}{2}\kappa \right|. \qquad \text{(B.1)}$$
However, as we have established the sign of $\hat\kappa_n - \kappa$, on this event
$$\sin\left(\tfrac{\pi}{2}\hat\kappa_n\right) - \sin\left(\tfrac{\pi}{2}\kappa\right) \le \tfrac{\pi}{2}\left(\hat\kappa_n - \kappa\right). \qquad \text{(B.2)}$$
Thus
$$\Pr\left( \hat\kappa_n - \kappa > \tfrac{2r}{\pi} \right) = \Pr\left( \tfrac{\pi}{2}\left(\hat\kappa_n - \kappa\right) > r \right) \ge \Pr\left( \sin\left(\tfrac{\pi}{2}\hat\kappa_n\right) - \sin\left(\tfrac{\pi}{2}\kappa\right) > r \right) = \Pr\left(\hat\rho_n - \rho > r\right). \qquad \text{(B.3)}$$
Using the fact that $\hat\kappa_n$ is an unbiased estimator of $\kappa$, and moreover a U-statistic with a second-order kernel bounded between $-1$ and $1$, Hoeffding's inequality for U-statistics [36] gives
$$\Pr\left( \hat\kappa_n - \kappa > \tfrac{2r}{\pi} \right) \le \exp\left( -\left\lfloor\tfrac{n}{2}\right\rfloor \frac{2r^2}{\pi^2} \right). \qquad \text{(B.4)}$$
Combining with (B.3) completes the proof. $\Box$

C Optimised Bound for Theorem 3
The lower bound (61) for the distribution of the stopping time can be optimised as
$$\Pr(\tau \le n) \ge 1 - \min_{(\alpha^*, \rho^*) \in \mathcal{A}} \left\{ \exp\left(-2n\left(\alpha - \alpha^* - b_1\right)^2\right) + \exp\left(-\left\lfloor\tfrac{n}{2}\right\rfloor \frac{2\left(\rho - \rho^* - b_2\right)^2}{\pi^2}\right) \right\}, \qquad \text{(C.1)}$$
where
$$\mathcal{A} := \left\{ (\alpha^*, \rho^*) \in (0,1)^2 : \alpha - b_1 > \alpha^*,\ \rho - b_2 > \rho^*,\ \hat p^N_{\text{success}}(n, 1, \alpha^*, \rho^*) \ge 1 - \delta \right\} \qquad \text{(C.2)}$$
is the region of $(0,1)^2$ where the gaps $(\alpha - b_1) - \alpha^*$ and $(\rho - b_2) - \rho^*$ are positive. Moreover, because the bound improves with these gaps, and because of the monotonicity properties in Lemmas 1 and 2, the optimum will lie on the Pareto front $\hat p^N_{\text{success}}(n, 1, \alpha^*, \rho^*) = 1 - \delta$ for $(\alpha^*, \rho^*) \in (0,1)^2$. For a given $\omega$, we can instead analytically determine the Pareto front along $\hat p^N_{\text{success},\omega}(n, 1, \alpha^*, \rho^*) = 1 - \delta$ using (20). Letting
$$\Phi\left(\frac{\Phi^{-1}(\alpha) - \rho \mu_n}{\sqrt{1 - \rho^2 + \rho^2 \sigma_n^2}}\right) = 1 - \delta, \qquad \text{(C.3)}$$
and substituting the definitions of $\mu_n$, $\sigma_n$ and rearranging, we obtain a quadratic form in $\rho$ and $\Phi^{-1}(\alpha)$ given by
$$d_1 \rho^2 + d_2 \Phi^{-1}(\alpha)\, \rho + d_3 \Phi^{-1}(\alpha)^2 + d_4 = 0, \qquad \text{(C.4)}$$
whose coefficients $d_1, \dots, d_4$ ((C.5)-(C.8)) are explicit functions of $n$, $\delta$ and $c$, through $\log(nc)$, $\log\log 2$ and $\Phi^{-1}(1 - \delta)$. Thus, given $\omega$, $n$, $\delta$, we can solve for $\rho$ in terms of $\alpha$ with
$$\rho_\omega(\alpha) = \frac{1}{2 d_1} \left[ -d_2 \Phi^{-1}(\alpha) + \sqrt{\left(d_2 \Phi^{-1}(\alpha)\right)^2 - 4 d_1 \left(d_3 \Phi^{-1}(\alpha)^2 + d_4\right)} \right]. \qquad \text{(C.9)}$$
Alternatively, given $\omega$, $n$, $\delta$, we can solve for $\alpha$ in terms of $\rho$ with
$$\alpha_\omega(\rho) = \Phi\left( \frac{-d_2 \rho + \sqrt{\left(d_2 \rho\right)^2 - 4 d_3 \left(d_1 \rho^2 + d_4\right)}}{2 d_3} \right), \qquad \text{(C.10)}$$
where we have taken the positive solutions of the quadratics, since $\alpha > 0$ and $\rho > 0$. We may then proceed to optimise with respect to $\alpha^*$ (with $\rho^*$ implicitly a function of $\alpha^*$) in an inner minimisation for a given $\omega$, and then optimise with respect to $\omega$ in an outer minimisation. Explicitly, (C.1) becomes
$$\Pr(\tau \le n) \ge 1 - \inf_{\omega \in \Omega_n} \min_{\alpha^* \in \mathcal{A}'_\omega} \left\{ \exp\left(-2n\left(\alpha - \alpha^* - b_1\right)^2\right) + \exp\left(-\left\lfloor\tfrac{n}{2}\right\rfloor \frac{2\left(\rho - \rho_\omega(\alpha^*) - b_2\right)^2}{\pi^2}\right) \right\}, \qquad \text{(C.11)}$$
where
$$\mathcal{A}'_\omega := \left\{ \alpha \in (0, 1] : \alpha_\omega(\rho - b_2) \le \alpha \le \alpha - b_1 \right\}, \qquad \text{(C.12)}$$
and $\omega \in \Omega_n \subset (0, \pi/2)$ is defined the same as in (21). The inner minimisation in (C.11) is quasiconvex, so the optimised bound is not difficult to implement numerically. This is further illustrated in Figure C.1.
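The bound (61), which forms the inner objective of (C.11), is inexpensive to evaluate. As a sketch, the function below treats the ingredients of the gaps, the actual $\alpha$, $\rho$, a feasible pair $(\alpha^*, \rho^*)$ and the confidence widths $b_1$, $b_2$, as given inputs (the function and parameter names are illustrative; restricting $(\alpha^*, \rho^*)$ to the Pareto front via (C.9)-(C.10) is described above):

```python
import math

def stopping_time_bound(n, alpha, rho, alpha_star, rho_star, b1, b2):
    """Lower bound on Pr(tau <= n) from Theorem 3, eq. (61).

    Returns 0.0 when the gap conditions alpha - alpha* - b1 > 0 and
    rho - rho* - b2 > 0 fail, where the bound is uninformative.
    """
    gap_a = alpha - alpha_star - b1
    gap_r = rho - rho_star - b2
    if gap_a <= 0 or gap_r <= 0:
        return 0.0
    term_a = math.exp(-2 * n * gap_a ** 2)
    term_r = math.exp(-(n // 2) * 2 * gap_r ** 2 / math.pi ** 2)
    return max(0.0, 1.0 - term_a - term_r)
```

Optimising (C.11) then amounts to a quasiconvex minimisation of `term_a + term_r` over feasible $\alpha^*$ for each $\omega$, followed by the outer search over $\omega$.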
Fig. C.1. A visual depiction of how the bound (C.11) is optimised. The solid curve is a plot of the original bound (61) along the Pareto front (dashed curve), for a given $\omega$ with $n = 7500$. The dotted lines indicate how the gaps $(\alpha - b_1) - \alpha^*$ and $(\rho - b_2) - \rho^*$ must be positive in order for the bound to be informative.

References

[1] R. Tempo, G. Calafiore, and F. Dabbene, Randomized Algorithms for Analysis and Control of Uncertain Systems: With Applications. Springer, 2nd ed., 2013.
[2] P. Khargonekar and A. Tikku, "Randomized algorithms for robust control analysis and synthesis have polynomial complexity," IEEE, 1996.
[3] R. Tempo, E. W. Bai, and F. Dabbene, "Probabilistic robustness analysis: explicit bounds for the minimum number of samples," IEEE, 1996.
[4] M. Vidyasagar, "Statistical learning theory and randomized algorithms for control," IEEE Control Systems Magazine, vol. 18, no. 6, pp. 69-85, 1998.
[5] S. X. Ding, L. Li, and M. Krüger, "Application of randomized algorithms to assessment and design of observer-based fault detection systems," Automatica, vol. 107, pp. 175-182, 2019.
[6] T. Alpcan, T. Basar, and R. Tempo, "Randomized algorithms for stability and robustness analysis of high speed communication networks," in IEEE Conference on Control Applications, IEEE, 2003.
[7] Y.-C. Ho, Q.-C. Zhao, and Q.-S. Jia, Ordinal Optimization: Soft Optimization for Hard Problems. Springer, 2007.
[8] Y. C. Ho, R. S. Sreenivas, and P. Vakili, "Ordinal optimization of DEDS," Discrete Event Dynamic Systems, vol. 2, no. 1, pp. 61-88, 1992.
[9] X. Xie, "Dynamics and convergence rate of ordinal comparison of stochastic discrete-event systems," IEEE Transactions on Automatic Control, vol. 42, no. 4, pp. 586-590, 1997.
[10] L. H. Lee, T. W. E. Lau, and Y. C. Ho, "Explanation of goal softening in ordinal optimization," IEEE Transactions on Automatic Control, vol. 44, no. 1, pp. 94-99, 1999.
[11] M. Deng and Y.-C. Ho, "An ordinal optimization approach to optimal control problems," Automatica, vol. 35, no. 2, pp. 331-338, 1999.
[12] Y.-C. Ho and M. E. Larson, "Ordinal optimization approach to rare event probability problems," Discrete Event Dynamic Systems: Theory and Applications, vol. 5, no. 2-3, pp. 281-301, 1995.
[13] M. Yang and L. Lee, "Ordinal optimization with subset selection rule," Journal of Optimization Theory and Applications, vol. 113, no. 3, pp. 597-620, 2002.
[14] M. Vidyasagar, "Randomized algorithms for robust controller synthesis using statistical learning theory," Automatica, vol. 37, no. 10, pp. 1515-1528, 2001.
[15] H. Ishii and R. Tempo, "Las Vegas randomized algorithms in distributed consensus problems," in American Control Conference, IEEE, 2008.
[16] R. Chin, J. E. Rowe, I. Shames, C. Manzie, and D. Nešić, "Ordinal optimisation for the Gaussian copula model." https://arxiv.org/abs/1911.01993, 2020.
[17] H. Joe, Dependence Modeling with Copulas. CRC Press, 2014.
[18] D. P. Dubhashi and A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press, 2009.
[19] V. Koltchinskii, C. T. Abdallah, M. Ariola, P. Dorato, and D. Panchenko, "Improved sample complexity estimates for statistical learning control of uncertain systems," IEEE Transactions on Automatic Control, vol. 45, no. 12, pp. 2383-2388, 2000.
[20] Y. Fujisaki and Y. Oishi, "Guaranteed cost regulator design: A probabilistic solution and a randomized algorithm," Automatica, vol. 43, no. 2, pp. 317-324, 2007.
[21] T. Alamo, R. Tempo, A. Luque, and D. R. Ramirez, "Randomized methods for design of uncertain systems: Sample complexity and sequential algorithms," Automatica, vol. 52, pp. 160-172, 2015.
[22] G. Bayraksan and P. Pierre-Louis, "Fixed-width sequential stopping rules for a class of stochastic programs," SIAM Journal on Optimization, vol. 22, no. 4, pp. 1518-1548, 2012.
[23] M. Kendall and J. D. Gibbons, Rank Correlation Methods. Oxford University Press, 5th ed., 1990.
[24] R. M. Gray, Entropy and Information Theory. Springer, 2011.
[25] H. Liu, F. Han, M. Yuan, J. Lafferty, and L. Wasserman, "High-dimensional semiparametric Gaussian copula graphical models," The Annals of Statistics, vol. 40, no. 4, pp. 2293-2326, 2012.
[26] B. C. Arnold, N. Balakrishnan, and H. N. Nagaraja, A First Course in Order Statistics. SIAM, 2008.
[27] M. Capinski and P. E. Kopp, Measure, Integral and Probability. Springer, 2004.
[28] R. C. Shekhar, G. S. Sankar, C. Manzie, and H. Nakada, "Efficient calibration of real-time model-based controllers for diesel engines, part I: Approach and drive cycle results," in IEEE 56th Annual Conference on Decision and Control (CDC), IEEE, 2017.
[29] A. S. Ira, C. Manzie, I. Shames, R. Chin, D. Nešić, H. Nakada, and T. Sano, "Tuning of multivariable model predictive controllers through expert bandit feedback," International Journal of Control, 2020.
[30] A. I. Maass, C. Manzie, I. Shames, R. Chin, D. Nešić, N. Ulapane, and H. Nakada, "Tuning of model predictive engine controllers over transient drive cycles," 2020.
[31] R. Chin, A. I. Maass, N. Ulapane, C. Manzie, I. Shames, D. Nešić, J. E. Rowe, and H. Nakada, "Active learning for linear parameter-varying system identification," in IFAC World Congress, 2020.
[32] T. Nagler and K. Wen, kdecopula: Kernel Smoothing for Bivariate Copula Densities, 2018. R package version 0.9.2.
[33] G. Geenens, A. Charpentier, and D. Paindaveine, "Probit transformation for nonparametric kernel estimation of the copula density," Bernoulli.
[34] University of Birmingham, BEAR (Birmingham Environment for Academic Research) high performance computing service.
[35] The University of Melbourne, Spartan high performance computing service, in OpenStack Summit, 2016.
[36] W. Hoeffding, "Probability inequalities for sums of bounded random variables," Journal of the American Statistical Association, vol. 58, no. 301, pp. 13-30, 1963.