Sample Reuse Techniques of Randomized Algorithms for Control under Uncertainty
Xinjia Chen, Jorge L. Aravena and Kemin Zhou
Department of Electrical and Computer Engineering
Louisiana State University
Baton Rouge, LA 70803
{chan,aravena,kemin}@ece.lsu.edu
Tel: (225) 578- , Fax: (225) 578-5200

November 20, 2018
Abstract
Sample reuse techniques have significantly reduced the numerical complexity of probabilistic robustness analysis. Existing results show that, for a nested collection of hyper-spheres, the complexity of the problem of performing $N$ equivalent i.i.d. (independent and identically distributed) experiments for each sphere is absolutely bounded, independent of the number of spheres and depending only on the initial and final radii.

In this chapter we elevate sample reuse to a new level of generality and establish that the numerical complexity of performing $N$ equivalent i.i.d. experiments for a chain of sets is absolutely bounded if the sets are nested. Each set does not even have to be connected, as long as the nested property holds. Thus, for example, the result permits the integration of deterministic and probabilistic analysis to eliminate regions from an uncertainty set and reduce even further the complexity of some problems. With a more general view, the result enables the analysis of complex decision problems mixing real-valued and discrete-valued random variables.

Introduction

The results presented in this chapter evolved from our previous work in probabilistic robustness analysis. For completeness we give a brief overview of the problem originally considered and show how it is embedded in our present, more general, formulation.

Probabilistic robust control methods have been proposed with the goal of overcoming the NP-hard complexity and the conservatism associated with the deterministic worst-case framework of robust control (see [1]–[26] and the references therein). At the heart of the probabilistic control paradigm is the idea of sacrificing the extreme instances of uncertainty. This is in sharp contrast to deterministic robust control, which approaches the issue of uncertainty with a "worst case" philosophy. Due to the obvious possibility of violating robustness requirements, it has been the common contention that applying the probabilistic method for control design may be more dangerous than using the deterministic worst-case approach. Interestingly, it has been demonstrated (Chen, Aravena and Zhou, [8]) that it is not uncommon for a probabilistic controller (which guarantees only most instances of the uncertainty bounding set assumed in the design) to be significantly less risky than a deterministic worst-case controller. The reasons are the "uncertainty in modeling uncertainties" and the fact that the worst-case design cannot, in some instances, be "all encompassing." Although this philosophy was proposed in the context of robust design, a direct consequence for robustness analysis is that it is not necessary to evaluate system robustness in a deterministic worst-case framework. This is because a system certified to be robust in a deterministic worst-case framework is not necessarily less risky than a system for which there is a probability that the robustness requirement is not always satisfied.

While worst-case control theory uses the deterministic robustness margin to evaluate system robustness, probabilistic control theory introduced the robustness function as a tool to measure the robustness properties of a control system subject to uncertainties. This function is defined as
$$\mathbb{P}(r) = \frac{\mathrm{vol}(\{X \in B_r \mid P \text{ is guaranteed for } X\})}{\mathrm{vol}(B_r)}$$
where $\mathrm{vol}(\cdot)$ is the Lebesgue measure, $P$ denotes the robustness requirement, and $B_r$ denotes the uncertainty bounding set with radius $r$.
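For illustration only, the following minimal sketch (ours, not part of the original development) estimates $\mathbb{P}(r)$ at a single radius by straightforward Monte Carlo; the toy requirement `check` and all parameter names are hypothetical stand-ins:

```python
import numpy as np

def estimate_robustness(r, N, check, dim, rng):
    """Monte Carlo estimate of P(r): draw N i.i.d. uniform samples from
    the ball B_r and return the fraction satisfying the requirement."""
    # Uniform sampling in a dim-dimensional ball of radius r:
    # uniform direction on the sphere, radius distributed as r * U**(1/dim).
    g = rng.standard_normal((N, dim))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    radii = r * rng.random(N) ** (1.0 / dim)
    samples = directions * radii[:, None]
    return float(np.mean([check(q) for q in samples]))

# Hypothetical robustness requirement P: a toy stability-margin test.
check = lambda q: 1.0 - 0.8 * q[0] - 0.5 * q[1] ** 2 > 0.0

rng = np.random.default_rng(0)
print(estimate_robustness(r=1.0, N=10_000, check=check, dim=2, rng=rng))
```

In practice `check` would encapsulate the (possibly expensive) robustness test, e.g., a stability or norm-bound verification for the perturbed system.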
This function describes quantitatively the relationship between the proportion of systems guaranteeing the robustness requirement and the radius of the uncertainty set. Such a function has been proposed by a number of researchers. For example, Barmish and Lagoa [3] have constructed a curve of robustness margin amplification versus risk in a probabilistic setting.

The so-called robustness function can serve as a guide for control engineers in evaluating the robustness of a control system once a controller design is completed. In addition to overcoming the issues of conservatism and NP complexity of worst-case robustness analysis, probabilistic robustness analysis based on the robustness function has the following advantages.

First, the robustness function can address problems which are intractable by deterministic worst-case methods. For many real-world control problems, robust performance is more appropriately captured by multiple objectives such as stability, transient response (specified, for example, in terms of overshoot, rise time and settling time), disturbance rejection measured by an $H_\infty$ or $H_2$ norm, etc. Thus, for a more insightful analysis of the robust performance of uncertain systems, the robustness requirement is usually multi-objective. The complexity of such a robustness requirement can easily make robustness problems intractable by deterministic worst-case methods. For example, existing methods fail to solve robustness analysis problems when the robustness requirement is a combination of an $H_\infty$ norm bound and stability. However, the robustness curve can still be constructed and provides sufficient insight into the robustness of the system.

Second, the probability that the robustness requirement is guaranteed can be inferred from the robustness function, while the deterministic margin has no relationship to such a probability. Under the assumption that the density function of the uncertainty is radially symmetric and non-increasing with respect to the norm of the uncertainty, it has been shown in [2] that the probability that $P$ is guaranteed is no less than $\inf_{\rho \in (0, r]} \mathbb{P}(\rho)$ when the uncertainty is contained in a bounding set with radius $r$. The underlying assumption is in agreement with conventional modeling and manufacturing practices, which consider uncertainty as unstructured, with all directions equally likely, and make small perturbations more likely than large perturbations. It was discovered in [2] that the robustness function is not necessarily monotonically decreasing. Hence, the lower bound of the probability depends on $\mathbb{P}(\rho)$ for all $\rho \in (0, r]$.

At first glance, it may seem difficult or infeasible to estimate $\inf_{\rho \in (0, r]} \mathbb{P}(\rho)$, since the estimation of $\mathbb{P}(\rho)$ for every $\rho$ relies on Monte Carlo simulation. For such a probabilistic method to overcome the NP-hardness of worst-case methods, it is necessary to show that the complexity of estimating $\inf_{\rho \in (0, r]} \mathbb{P}(\rho)$ for a given $r$ is polynomial in terms of computer running time and memory space. Recently, sample reuse techniques have been developed in [7, 9, 10], and it has been demonstrated that the complexity in terms of space and time is surprisingly low: it is linear in the uncertainty dimension and in the logarithm of the relative width of the range of uncertainty radii.
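For instance (a toy sketch with synthetic numbers of our choosing), once the robustness curve has been estimated on a grid of radii, $\inf_{\rho \in (0, r]} \mathbb{P}(\rho)$ is approximated by a running minimum along the grid, since the curve need not be monotone:

```python
import numpy as np

# Hypothetical estimated robustness curve on a radius grid; in practice
# each value would come from a Monte Carlo run as sketched above.
radii = np.linspace(0.1, 2.0, 20)
p_hat = np.clip(np.exp(-0.5 * radii**2) * (1.0 + 0.05 * np.sin(8 * radii)), 0.0, 1.0)

# inf over rho in (0, r] of P(rho), approximated on the grid by the
# running minimum up to each radius r.
lower_bound = np.minimum.accumulate(p_hat)
for r, lb in zip(radii, lower_bound):
    print(f"r = {r:.2f}: inf P over (0, r] ~= {lb:.3f}")
```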
Third, using the robustness function for the evaluation of system robustness allows the designer to make more accurate statements than using just the robustness margins. Here, by robustness margins, we mean both the deterministic robustness margin and its risk-adjusted version, the probabilistic robustness margin, defined as $\rho_\varepsilon = \sup\{r \mid \mathbb{P}(r) \geq 1 - \varepsilon\}$. For virtually all practical systems, the deterministic robustness margin can be viewed as the special case of the probabilistic robustness margin $\rho_\varepsilon$ with $\varepsilon = 0$. This property should not be confused with the numerical accuracy in evaluating margins, nor with the issue of conservatism. The fundamental reason is the limited information that is available from the robustness margins. It has been demonstrated in [9, 10] that both the deterministic and probabilistic robustness margins have inherent limitations. In other words, using $\rho_\varepsilon$ as a measure of robustness can be misleading. Figure 1 shows conceptual robustness functions for two controllers. From the figure it is apparent that the robustness margin $\rho_\varepsilon^A$ for controller A is much larger than the corresponding value, $\rho_\varepsilon^B$, for controller B. Then, based on the comparison of $\rho_\varepsilon$, control system A is certainly more robust and should be recommended for safety purposes.

Figure 1: Comparison of Controller Alternatives (robustness curves for Controllers A and B; vertical axis: proportion of systems guaranteeing the requirement; horizontal axis: uncertainty radius).
However, if the coverage probability of the uncertainty set $B_{\rho_\varepsilon^A}$ is low and the robustness curve (i.e., the graphical representation of the robustness function) of control system A rolls off rapidly beyond $\rho_\varepsilon^A$, then the robustness of system A may be poor. On the other hand, if the robustness curve of control system B maintains a high level over a wide range of uncertainty radii, then control system B may actually be more robust than system A.

In general, the evaluation of the robustness function requires extensive Monte Carlo simulation. In applications, many iterations of robust design and analysis may be needed in the development of a satisfactory control system; it is therefore crucial to improve the efficiency of estimating the robustness function. Complexity has been reduced by considering models for the uncertainties that depend on a single "uncertainty radius." In this case, the formal evaluation of the robustness function requires $N$ i.i.d. uncertain parameter selections for each of a sequence $r_1 < r_2 < \cdots < r_m$ of uncertainty radii, which is still a daunting task. The sample reuse principle allows carrying out the evaluation to any degree of accuracy and with absolute bounds on complexity (see [7, 9, 10]).

The use of uncertainty bounding sets with a given radius can still be viewed as a limitation, since one may have to include situations that never arise in practice. This is the limitation addressed in this work. Moreover, we cast the result as a general problem in decision-making under uncertainty. We show that the sample reuse principle can be applied with equal effectiveness in a much more general scenario. We shall be concerned with an arbitrary sequence of nested sets $B_1 \subset B_2 \subset \cdots \subset B_m$ where we need to perform $N$ experiments for elements uniformly and independently drawn from each set. For each element it is necessary to verify whether a certain statement $P$ is true or not.

The idea of the sample reuse principle is to start the experiments from the largest set; if a sample also belongs to a smaller set, the experimental result is saved for later use in that smaller set. The experimental result that can be saved includes not only the samples themselves but also the outcome of the evaluation of the statement $P$. We note that this formulation enables the efficient use of Monte Carlo simulation for the evaluation of multi-dimensional distributions and for combinations of continuous and discrete variables. A sketch of the resulting procedure is given below.

Consider a sequence of nested sets $B_1 \subset B_2 \subset \cdots \subset B_m$. If one needs to perform $N$ experiments for each set, a conventional approach would require a total of $Nm$ experiments. However, due to sample reuse, the actual number of experiments for set $B_i$ is a random number $n_i$, which is usually less than $N$. Our main result, which depends only on the nested property, shows that this strategy saves a significant amount of experimental or computational effort.
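The procedure just described can be sketched as follows (our illustration; the callbacks `sample_from`, `contains` and `evaluate` are hypothetical placeholders for the uniform sampler, the membership test and the experiment):

```python
def sample_reuse(sets, N, sample_from, contains, evaluate, rng):
    """Obtain N equivalent i.i.d. experiments for each of the nested sets,
    given largest first: sets[0] = B_m, ..., sets[-1] = B_1. Samples and
    their recorded outcomes are reused whenever they fall in a smaller set."""
    pool = []          # (sample, outcome) pairs available for reuse
    results = []       # per-set list of N outcomes
    fresh_counts = []  # n_i: experiments actually performed for each set
    for B in sets:
        # Reused samples are uniform on B: a uniform sample on a larger
        # set, conditioned on falling in B, is uniform on B.
        kept = [(q, y) for (q, y) in pool if contains(B, q)]
        n_fresh = N - len(kept)
        for _ in range(n_fresh):
            q = sample_from(B, rng)        # fresh uniform sample from B
            kept.append((q, evaluate(q)))  # run the experiment once
        pool = kept                        # all N pairs are uniform on B
        results.append([y for (_, y) in pool])
        fresh_counts.append(n_fresh)
    return results, fresh_counts
```

The total cost is `sum(fresh_counts)`, the quantity $n$ bounded by Theorem 1 below; reused samples incur only a membership test, never a repetition of the experiment itself.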
Theorem 1

Let $V_{\min}$ and $V_{\max}$ be constants such that $0 < V_{\min} \leq V_{\max} < \infty$. For an arbitrary sequence of nested sets $B_i$, $i = 1, \ldots, m$, such that $B_1 \subset B_2 \subset \cdots \subset B_m$ and $V_{\min} \leq \mathrm{vol}(B_1) \leq \mathrm{vol}(B_m) \leq V_{\max}$, the expected total number of experiments, $n$, needed to obtain $N$ experiments for each set is absolutely bounded, independent of the number, $m$, of sets in the chain, and given by
$$E[n] < \left(1 + \ln \frac{V_{\max}}{V_{\min}}\right) N$$
where $E[\cdot]$ denotes the expectation of a random variable.

Remark 1
The fact that the result is independent of the number of sets in the nested chain may appear surprising, but it is a direct consequence of the power of the sample reuse principle. Loosely speaking, the more sets there are in the chain, the more chances there are that an experiment can be reused. In fact, this characteristic makes the result especially powerful when the demand for accuracy, indicated by a large number of sets, is high.
As a special case of Theorem 1, we have the following result, reported by Chen, Zhou and Aravena [9, 10] and presented here as a corollary to our main result.
Corollary 1
Let $r_{\min}$ and $r_{\max}$ be constants such that $0 < r_{\min} \leq r_{\max} < \infty$. Let $B_r$ denote the uncertainty bounding set with radius $r$. Suppose that $\mathrm{vol}(B_r) = r^d \,\mathrm{vol}(B_1)$ for any radius $r$. Then, for any sequence of radii $r_1 < r_2 < \cdots < r_m$ such that $r_{\min} \leq r_1 < r_m \leq r_{\max}$,
$$E[n] < \left(1 + d \ln \frac{r_{\max}}{r_{\min}}\right) N.$$

Observations about the result

In the result presented here, the only requirement on the uncertainty sets is that they must be nested. This is in sharp contrast to the existing model of uncertainty, wherein one defines an uncertainty "radius" and larger uncertainty sets are simply amplified versions of the smaller sets, defining a chain of sets of essentially the same shape. That limitation is now completely eliminated.

Another significant feature of the new result is that the uncertainty sets can have "holes" in them; i.e., one can easily eliminate situations, or values, that cannot physically take place. In a later section we examine this option in more detail and show the advantage provided by the general result.

In fact, as long as the sets are nested, they do not even have to be connected. This permits modeling of situations that were previously not feasible, for example, combinations of discrete and continuous-valued random variables.

Finally, the power of the result lies in the efficient use of experiments. The property that is being tested is not germane to the result. In this sense, we have provided a tool for decision making in complex environments.
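To give a feel for the savings, consider a quick numeric sketch (toy numbers of our choosing) comparing the naive cost $Nm$ with the bound of Corollary 1, which does not grow with $m$:

```python
import math

d, r_min, r_max, N, m = 10, 0.1, 10.0, 1000, 500   # toy values (ours)

naive = N * m                                  # no reuse: N experiments per set
bound = (1 + d * math.log(r_max / r_min)) * N  # Corollary 1, independent of m

print(f"naive total: {naive}, sample-reuse bound on E[n]: {bound:.0f}")
# naive total: 500000, sample-reuse bound on E[n]: 47052
```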
Proof of the Main Result

This section provides a formal proof of our main result. First we establish some preliminary results that will be needed in the proof.
Lemma 1
For $i = 2, \ldots, m$,
$$E[n_{i-1}] = N - \sum_{j=i}^{m} \frac{v_{i-1}}{v_j} E[n_j]$$
where $v_j = \mathrm{vol}(B_j)$, $j = 1, \ldots, m$.

Proof. Let $m \geq j \geq i \geq 2$.
Let $q_1, q_2, \ldots, q_{n_j}$ be the samples generated from $B_j$. For $\ell = 1, \ldots, n_j$, define the random variable $X^{\ell}_{j,i-1}$ such that
$$X^{\ell}_{j,i-1} = \begin{cases} 1 & \text{if } q_\ell \text{ falls in } B_{i-1}, \\ 0 & \text{otherwise.} \end{cases}$$
Based on the principle of sample reuse, we have
$$n_m = N, \qquad n_j = N - \sum_{k=j+1}^{m} \sum_{\ell=1}^{n_k} X^{\ell}_{k,j}, \quad j = 1, \ldots, m-1.$$
Note that $n_j$ depends only on the samples generated from the sets $B_k$, $j+1 \leq k \leq m$. Hence, the event $\{n_j = n\}$ is independent of the event $\{X^{\ell}_{j,i-1} = 1\}$. It follows that
$$\Pr\{X^{\ell}_{j,i-1} = 1,\ n_j = n\} = \Pr\{X^{\ell}_{j,i-1} = 1\} \Pr\{n_j = n\}$$
where $\Pr\{\cdot\}$ denotes the probability of an event. Recalling that $q_\ell$ is a random variable with uniform distribution over $B_j$, we have
$$\Pr\{X^{\ell}_{j,i-1} = 1\} = \frac{v_{i-1}}{v_j}, \quad \ell = 1, \ldots, N.$$
By the principle of sample reuse,
$$N = n_{i-1} + \sum_{j=i}^{m} \sum_{\ell=1}^{n_j} X^{\ell}_{j,i-1}.$$
Thus for $i = 2, \ldots, m$,
$$\begin{aligned}
E[n_{i-1}] &= N - \sum_{j=i}^{m} E\left[\sum_{\ell=1}^{n_j} X^{\ell}_{j,i-1}\right] \\
&= N - \sum_{j=i}^{m} \sum_{n=1}^{N} \sum_{\ell=1}^{n} \Pr\{X^{\ell}_{j,i-1} = 1,\ n_j = n\} \\
&= N - \sum_{j=i}^{m} \sum_{n=1}^{N} \sum_{\ell=1}^{n} \Pr\{X^{\ell}_{j,i-1} = 1\} \Pr\{n_j = n\} \\
&= N - \sum_{j=i}^{m} \sum_{n=1}^{N} n \, \frac{v_{i-1}}{v_j} \Pr\{n_j = n\} \\
&= N - \sum_{j=i}^{m} \frac{v_{i-1}}{v_j} \sum_{n=1}^{N} n \Pr\{n_j = n\} \\
&= N - \sum_{j=i}^{m} \frac{v_{i-1}}{v_j} E[n_j]. \qquad ✷
\end{aligned}$$

This result gives the expected number of experiments for a set, $B_{i-1}$, in terms of the expected values for all the sets that contain it. The recursion can be solved as follows. Since all the experiments must belong to the set $B_m$, we have $E[n_m] = N$; now for $i < m$ we can write
$$E[n_i] = N - \sum_{j=i+1}^{m} \frac{v_i}{v_j} E[n_j] \;\Longrightarrow\; \sum_{j=i+1}^{m} \frac{v_i}{v_j} E[n_j] = N - E[n_i]$$
and
$$\begin{aligned}
E[n_{i-1}] &= N - \sum_{j=i}^{m} \frac{v_{i-1}}{v_j} E[n_j] \\
&= N - \frac{v_{i-1}}{v_i} E[n_i] - \sum_{j=i+1}^{m} \frac{v_{i-1}}{v_j} E[n_j] \\
&= N - \frac{v_{i-1}}{v_i} E[n_i] - \frac{v_{i-1}}{v_i} \sum_{j=i+1}^{m} \frac{v_i}{v_j} E[n_j].
\end{aligned}$$
Therefore,
$$E[n_{i-1}] = N - \frac{v_{i-1}}{v_i} E[n_i] - \frac{v_{i-1}}{v_i} \left[N - E[n_i]\right] = N - \frac{v_{i-1}}{v_i} N.$$
Thus we have established
Lemma 2
Under the sample reuse principle, for an arbitrary sequence of nested sets $B_i$, $i = 1, \ldots, m$, such that $B_1 \subset B_2 \subset \cdots \subset B_m$ and $0 < \mathrm{vol}(B_1) \leq \mathrm{vol}(B_m) < \infty$, the expected number of experiments, $E[n_i]$, needed to obtain $N$ experiments for the set $B_i$ is
$$E[n_i] = N - \frac{v_i}{v_{i+1}} N, \quad i = 1, 2, \ldots, m-1.$$

Remark 2
We note that if we use the convention $v_{m+1} = \infty$ then the previous expression can be made valid for $i = m$.

Once more one can see the power of the sample reuse principle: if any two consecutive sets in the chain are "very similar," then most of the experiments for the larger set can be reused.
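Lemma 2 is also easy to check empirically. The following sketch (ours) uses nested intervals $B_i = [0, v_i]$, for which uniform sampling and membership tests are trivial, and compares the observed average number of fresh samples per set with $N(1 - v_i/v_{i+1})$:

```python
import numpy as np

rng = np.random.default_rng(1)
v = np.array([1.0, 2.0, 5.0, 10.0])    # toy volumes v_1 < ... < v_m
N, trials = 200, 2000
m = len(v)

counts = np.zeros(m)
for _ in range(trials):
    pool = rng.random(N) * v[-1]       # N uniform samples from B_m = [0, v_m]
    counts[m - 1] += N
    for i in range(m - 2, -1, -1):     # descend from B_{m-1} to B_1
        reused = pool[pool <= v[i]]    # samples falling in B_i are reused
        n_fresh = N - len(reused)      # fresh experiments for this set
        counts[i] += n_fresh
        pool = np.concatenate([reused, rng.random(n_fresh) * v[i]])

for i in range(m - 1):
    print(f"mean n_{i + 1} ~= {counts[i] / trials:6.1f}   "
          f"Lemma 2 predicts {N * (1 - v[i] / v[i + 1]):6.1f}")
```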
Now we establish a basic inequality that will be used to prove the main result.

Lemma 3
For any $x > 1$,
$$\frac{1}{x} + \ln x > 1.$$

Proof.
Let $f(x) = \frac{1}{x} + \ln x$. Then $f(1) = 1$ and
$$\frac{d f(x)}{dx} = \frac{x - 1}{x^2} > 0, \quad \forall x > 1.$$
It follows that $f(x) > 1$, $\forall x > 1$. ✷

Using the previous result we can now prove

Lemma 4
For an arbitrary sequence of numbers $0 < r_1 < r_2 < \cdots < r_m$,
$$m - \sum_{i=1}^{m-1} \frac{r_i}{r_{i+1}} < 1 + \ln\left(\frac{r_m}{r_1}\right).$$

Proof.
Observing that
$$\frac{r_m}{r_1} = \prod_{i=1}^{m-1} \frac{r_{i+1}}{r_i},$$
we have
$$\ln\left(\frac{r_m}{r_1}\right) = \sum_{i=1}^{m-1} \ln\left(\frac{r_{i+1}}{r_i}\right).$$
Therefore,
$$\sum_{i=1}^{m-1} \frac{r_i}{r_{i+1}} + \ln\left(\frac{r_m}{r_1}\right) = \sum_{i=1}^{m-1} \left[\frac{r_i}{r_{i+1}} + \ln\left(\frac{r_{i+1}}{r_i}\right)\right].$$
Since $\frac{r_{i+1}}{r_i} > 1$, $i = 1, \ldots, m-1$,
it follows from Lemma 3 that
$$\frac{r_i}{r_{i+1}} + \ln\left(\frac{r_{i+1}}{r_i}\right) > 1, \quad i = 1, \ldots, m-1.$$
Hence,
$$\sum_{i=1}^{m-1} \frac{r_i}{r_{i+1}} + \ln\left(\frac{r_m}{r_1}\right) > m - 1.$$
The lemma is thus proved. ✷

We are now in a position to prove Theorem 1. By Lemma 2, we have
$$E[n] = E\left[\sum_{i=1}^{m} n_i\right] = N + \sum_{i=1}^{m-1} \left[N - N \frac{v_i}{v_{i+1}}\right] = Nm - N \sum_{i=1}^{m-1} \frac{v_i}{v_{i+1}}.$$
Therefore, by Lemma 4,
$$\frac{E[n]}{N} = m - \sum_{i=1}^{m-1} \frac{v_i}{v_{i+1}} < 1 + \ln\left(\frac{v_m}{v_1}\right) \leq 1 + \ln\left(\frac{V_{\max}}{V_{\min}}\right),$$
and thus the proof of Theorem 1 is completed.

Combination with Deterministic Methods
In this section we demonstrate the flexibility allowed by the general nested conditions by examining a situation that could not be properly handled with existing tools. Specifically, we consider uncertainty sets in which, for example by deterministic analysis, one can establish subsets that are not feasible; i.e., the uncertainty set has "holes" in it.

There exist rich results for computing exact or conservative bounds on the robustness margins, e.g., structured singular value ($\mu$) theory or Kharitonov-type methods. Let $S_r$ be a hyper-sphere with radius $r$. Suppose the robustness requirement is satisfied for the nominal system. By the deterministic approach, in some situations, it may be possible to determine a radius $r_0$ such that the robustness requirement is satisfied for all of $S_{r_0}$. Then, to estimate
$$\mathbb{P}(r) = \frac{\mathrm{vol}(\{q \in S_r \mid P \text{ is guaranteed for } q\})}{\mathrm{vol}(S_r)}$$
for $r_1 < r_2 < \cdots < r_m$ with $r_1 > r_0$, we can apply the sample reuse techniques over a nested chain of "donut" sets $D_1 \subset D_2 \subset \cdots \subset D_m$ with
$$D_i = S_{r_i} \setminus S_{r_0}, \quad i = 1, \ldots, m,$$
where "$\setminus$" denotes set subtraction. Instead of directly estimating $\mathbb{P}(r_i)$, we can estimate
$$\wp_i = \frac{\mathrm{vol}(\{q \in D_i \mid P \text{ is guaranteed for } q\})}{\mathrm{vol}(D_i)}$$
and obtain
$$\mathbb{P}(r_i) = \frac{\wp_i \,\mathrm{vol}(D_i) + \mathrm{vol}(S_{r_0})}{\mathrm{vol}(S_{r_i})}, \quad i = 1, \ldots, m.$$
Let $\widehat{\wp}_i$ be the estimate of $\wp_i$. It can be shown that
$$E\left[\frac{\widehat{\wp}_i \,\mathrm{vol}(D_i) + \mathrm{vol}(S_{r_0})}{\mathrm{vol}(S_{r_i})}\right] = \mathbb{P}(r_i)$$
and
$$E\left[\left(\frac{\widehat{\wp}_i \,\mathrm{vol}(D_i) + \mathrm{vol}(S_{r_0})}{\mathrm{vol}(S_{r_i})} - \mathbb{P}(r_i)\right)^2\right] = \frac{(1 - \wp_i)\, \wp_i\, \lambda_i^2}{N}$$
where
$$\lambda_i = \frac{\mathrm{vol}(D_i)}{\mathrm{vol}(S_{r_i})}, \quad i = 1, \ldots, m.$$
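As a sketch of how such a "donut" estimator might be implemented for hyper-spheres (our illustration; the function name, $r_0$ and `check` are assumptions, and the inner sphere $S_{r_0}$ is taken as deterministically certified):

```python
import numpy as np

def estimate_P_with_certified_core(r_i, r0, d, N, check, rng):
    r"""Estimate P(r_i) when every point of the inner sphere S_{r0} is known
    (deterministically) to satisfy P: sample only the shell D_i = S_{r_i} \ S_{r0}."""
    # Uniform radius in the shell via inverse CDF: volume grows like rho**d.
    u = rng.random(N)
    radii = (r0**d + u * (r_i**d - r0**d)) ** (1.0 / d)
    g = rng.standard_normal((N, d))
    q = g / np.linalg.norm(g, axis=1, keepdims=True) * radii[:, None]
    wp_hat = np.mean([check(x) for x in q])   # estimate of wp_i on the donut
    lam = 1.0 - (r0 / r_i) ** d               # lambda_i = vol(D_i) / vol(S_{r_i})
    # P(r_i) = wp_i * lambda_i + vol(S_{r0}) / vol(S_{r_i})
    return wp_hat * lam + (r0 / r_i) ** d
```

The variance comparison below shows that, for the same $N$, this estimator is never less accurate than sampling $S_{r_i}$ directly.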
If we obtain an estimate $\widehat{\mathbb{P}}(r_i)$ of $\mathbb{P}(r_i)$ without applying any deterministic technique, then
$$E\left[\left(\widehat{\mathbb{P}}(r_i) - \mathbb{P}(r_i)\right)^2\right] = \frac{(1 - \wp_i)\, \lambda_i \left[1 - (1 - \wp_i)\lambda_i\right]}{N}.$$
It can be shown that the ratio of the variances of the two estimates is
$$\frac{E\left[\left(\frac{\widehat{\wp}_i \,\mathrm{vol}(D_i) + \mathrm{vol}(S_{r_0})}{\mathrm{vol}(S_{r_i})} - \mathbb{P}(r_i)\right)^2\right]}{E\left[\left(\widehat{\mathbb{P}}(r_i) - \mathbb{P}(r_i)\right)^2\right]} = \frac{\wp_i \lambda_i}{1 - (1 - \wp_i)\lambda_i} < 1.$$
Hence, for the same sample size $N$, the estimation is more accurate when the deterministic results and the probabilistic techniques are combined. Since accuracy is exchangeable with computational effort, we conclude that the computational effort can be reduced by blending the power of deterministic methods and randomized algorithms with the sample reuse mechanism.

Conclusion

Sample reuse has made possible the evaluation of robustness functions with, essentially, arbitrary accuracy and bounded complexity. In this work we have expanded the power of the sample reuse concept and shown that it can be applied to the evaluation of complex decision problems with the only requirement that the uncertainty sets be nested. We have demonstrated the power of the generalization by integrating deterministic analysis and randomized algorithms, showing that one can develop even more efficient computational approaches for the evaluation of robustness functions.