Parameter Estimation of Nonlinearly Parameterized Regressions without Overparameterization nor Persistent Excitation: Application to System Identification and Adaptive Control
Romeo Ortega, Vladislav Gromov, Emmanuel Nuño, Anton Pyrkin, Jose Guadalupe Romero
Romeo Ortega†,∗, Vladislav Gromov†, Emmanuel Nuño‡, Anton Pyrkin† and Jose Guadalupe Romero§

October 18, 2019
Abstract
In this paper we propose a solution to the problem of parameter estimation of nonlinearly parameterized regressions, continuous or discrete time, and apply it to system identification and adaptive control. We restrict our attention to parameterizations that can be factorized as the product of two functions: a measurable one and a nonlinear function of the parameters to be estimated. Although in this case it is possible to define an extended vector of unknown parameters to get a linear regression, it is well known that overparameterization suffers from some severe shortcomings. Another feature of the proposed estimator is that parameter convergence is ensured without a persistency of excitation assumption. It is assumed that, after a coordinate change, some of the elements of the transformed function satisfy a monotonicity condition. The proposed estimators are applied to design identifiers and adaptive controllers for nonlinearly parameterized systems. In continuous time we consider a general class of nonlinear systems and those described by Euler-Lagrange models, while in discrete time we apply the method to the challenging problems of direct and indirect adaptive pole-placement. The effectiveness of our approach is illustrated with several classical examples, which are traditionally tackled using overparameterization and assuming persistency of excitation.
It is well known that nonlinear parameterizations are inevitable in any realistic practical problem [4, 7, 15, 22, 23, 27]. Unfortunately, designing adaptive (identification or control) algorithms for nonlinearly parameterized systems is a difficult, poorly understood problem. Some results for gradient estimators have been reported in the literature for convexly parameterized continuous-time (CT) systems. It was first reported in [9] (see also [26]) that convexity is enough to ensure that the gradient search "goes in the right direction" in a certain region of the estimated parameter space. The idea is then to apply a standard adaptive scheme in this region, while in the "bad" region either the adaptation is frozen and a robust constant-parameter controller is switched on [10] or, as proposed in [1], the adaptation is running all the time and stability is ensured with a high-gain mechanism which is suitably adjusted incorporating prior knowledge on the parameters. In [24] a reparametrization to convexify an otherwise non-convexly parameterized system is proposed. See also [25] and [35] for some interesting results along these lines, where the controller and the estimator switch between over/underbounding convex/concave functions.

∗ R. Ortega is with the Laboratoire des Signaux et Systèmes, CNRS-SUPELEC, Gif-sur-Yvette, France, e-mail: [email protected]
† V. Gromov and A. Pyrkin are with the Faculty of Control Systems and Robotics, ITMO University, Saint Petersburg, Russia, e-mail: gromov{pyrkin}@itmo.ru
‡ E. Nuño is with the Department of Computer Science, CUCEI, University of Guadalajara, Guadalajara, Mexico, e-mail: [email protected]
§ J. G. Romero is with the Departamento Académico de Sistemas Digitales, ITAM, Ciudad de México, México, e-mail: [email protected]
On the other hand, using the Immersion and Invariance adaptation laws proposed in [3], stronger results were obtained in [20, 21] invoking the property of monotonicity, see also [35, 36] for related results. The main advantage of using monotonicity, instead of convexity, is that in the former case the parameter search "goes in the right direction" in all regions of the estimated parameter space. This is in contrast to the convexity-based designs where, as pointed out above, this only happens in some regions of this space.

Since this important difference is not always appreciated, let us illustrate it with the simple case of a scalar, CT, nonlinearly parameterized regression equation (NPRE) $y = H(t, \theta)$, where $y(t) \in \mathbb{R}$, $H : \mathbb{R}_{>0} \times \mathbb{R}^q \to \mathbb{R}$ and $\theta \in \mathbb{R}^q$ is the vector of unknown parameters. If we assume that $H$ is convex in $\theta$, the gradient descent search

$\dot{\hat\theta} = -\left[\frac{\partial H(t, \hat\theta)}{\partial \hat\theta}\right]^\top [H(t, \hat\theta) - y]$

ensures $\tilde\theta^\top \dot{\tilde\theta} \le 0$ provided $H(t, \hat\theta) \ge H(t, \theta)$, where we defined the parameter error vector $\tilde\theta := \hat\theta - \theta$. On the other hand, if we assume that $H$ is monotone decreasing in $\theta$, the simple estimator

$\dot{\hat\theta} = H(t, \hat\theta) - y$

ensures $\tilde\theta^\top \dot{\tilde\theta} \le 0$ all the time.

To the best of the authors' knowledge, no developments similar to the ones mentioned above have been reported for the case of nonlinearly parameterized discrete-time (DT) regressions which, in spite of their great practical importance, have attracted less attention in the identification and adaptive control community. One of the objectives of our paper is to contribute, if modestly, towards the development of estimation algorithms for DT NPRE. In particular, we provide solutions to the, essentially open, problems of direct and indirect adaptive pole-placement control (APPC) without overparameterization nor persistency of excitation (PE) requirements.
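The contrast above can be checked numerically. The following is a minimal sketch under an assumed illustrative regression, $H(t, \theta) = (2 + \sin t)\, e^{-\theta}$ with scalar $\theta$, which is monotone decreasing in $\theta$; the monotone estimator is integrated with forward Euler and the parameter error magnitude is verified to be non-increasing:

```python
import numpy as np

# Assumed illustrative NPRE: y = H(t, theta), with H monotone decreasing in theta
def H(t, theta):
    return (2.0 + np.sin(t)) * np.exp(-theta)

theta = 1.5          # true (unknown) parameter
theta_hat = -1.0     # initial estimate
dt, T = 1e-3, 20.0
err = [abs(theta_hat - theta)]
for n in range(int(T / dt)):
    t = n * dt
    y = H(t, theta)
    # monotone estimator: theta_hat_dot = H(t, theta_hat) - y
    theta_hat += dt * (H(t, theta_hat) - y)
    err.append(abs(theta_hat - theta))

# |theta_tilde| is non-increasing at every step and shrinks towards zero
assert all(e2 <= e1 + 1e-12 for e1, e2 in zip(err, err[1:]))
assert err[-1] < 1e-2
```

Note that no excitation assumption is used here: since $2 + \sin t \ge 1 > 0$, the error decays for any initial condition, which is exactly the "goes in the right direction everywhere" property of the monotone design.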
It should be pointed out that a solution to the direct APPC problem using overparameterization, hence requiring some excitation conditions, has been recently reported in [29].

A very important drawback of the aforementioned approaches is that the monotonicity or convexity conditions are imposed on functions that depend not only on the parameters, but also on external signals, e.g., time or the system state.¹ This renders the verification of the condition very hard to carry out. This unfortunate situation happens even in the case when the uncertain terms appear as products of a function of the unknown parameters times a known function, the so-called factorizable mappings, that is, NPRE of the form

$y = \Omega\, \mathcal{S}(\theta),$

with $\mathcal{S} : \mathbb{R}^q \to \mathbb{R}^p$, $p > q$. Although in this case it is possible to define the extended parameter vector $\theta_a := \mathcal{S}(\theta)$ to obtain a linear parametrization, overparametrization suffers from the following well-known shortcomings [14, 22, 31].

S1 Performance degradation, e.g., slower convergence, due to the need of a search in a larger parameter space.

S2 More stringent conditions imposed on the reference signals to ensure the PE requirement² needed for convergence of the new parameters.

S3 Inability to recover the true parameters, except for injective mappings. This stymies the application of this approach in situations where the actual parameters are needed, e.g., in direct adaptive control.

S4 Conservativeness introduced when incorporating prior knowledge in restricted parameter estimation.

¹ The relation between these two approaches follows invoking Kachurovskii's Theorem, which establishes the equivalence between convexity of a function and monotonicity of its gradient [13, Theorem 4.1.4], see also [6].
² We recall that a bounded vector signal $\Omega \in \mathbb{R}^q$ is said to be PE if there exists $\delta \in \mathbb{R}_{>0}$ such that $\int_t^{t+T} \Omega(\tau)\Omega^\top(\tau)\, d\tau \ge \delta I_q$ for some $T \in \mathbb{R}_{>0}$ and all $t \in \mathbb{R}_{\ge 0}$ in CT, or $\sum_{j=k+1}^{k+K} \Omega(j)\Omega^\top(j) \ge \delta I_q$, $\forall k \in \mathbb{N}_{\ge 0}$, for some $K \in \mathbb{N}_{>0}$, with $K \ge q$, in DT.
S5 Reduction of the domain of validity of the estimates stemming from the, in general only local, invertibility of the overparameterization mappings.

In this paper we propose a parameter estimator for monotonic, factorizable NPRE that achieves the following objectives.

O1 It does not rely on overparameterization.

O2 It imposes the monotonicity property directly on the function $\mathcal{S}(\theta)$.

O3 It ensures parameter convergence without the stringent PE requirement.

CT estimators for NPRE with factorizable mappings that avoid overparameterization and rely on monotonicity have been reported in [21, Section 3] and [2, Section III]. In [21] neither the second nor the third objectives above are achieved. On the other hand, in [2] these objectives are achieved via the use of a dynamic regressor extension and mixing (DREM) estimator. As is well known, the main feature of DREM is that it generates, out of a $q$-dimensional regression equation, one scalar equation for each of the $q$ unknown parameters. Another important feature of DREM is that parameter convergence is ensured without assuming PE.

In this paper, we also use DREM to derive both CT and DT parameter estimators. We obtain simpler and stronger results than [2] due to the following three key modifications.

M1 Generate the extended regressor matrix using the linear time-varying (LTV) operators first introduced in [19]. This avoids the need to select several linear, scalar operators, whose choice is difficult to decide, and provides sharper convergence results.

M2 Directly apply the "mixing" operation, that is, the multiplication by the adjugate of the extended matrix, to generate the scalar regressions. This is in contrast to the unnecessarily complex matrix factorization proposed in [2].

M3 Incorporate the possibility of adding a change of coordinates to the original parameters to satisfy the required monotonicity property.

The remainder of the paper is organized as follows.
First, we present in Section 2 a general result of "monotonizability" of factorizable NPRE. In Section 3 we apply DREM to generate the scalar regressors. In Section 4 the CT and DT DREM-based estimators are presented. Section 5 is devoted to the application of this estimator to the problem of adaptive control of CT nonlinearly parameterized, nonlinear systems, with particular emphasis on Euler-Lagrange (EL) models. The case of DT NPRE is illustrated in Section 6 with the example of identification of a solar heated house proposed in [22, pp. 130] and with the classical problems of direct and indirect APPC [12]. The paper is wrapped up with concluding remarks in Section 7.
Notation. $I_n$ is the $n \times n$ identity matrix. $\mathbb{R}_{>0}$, $\mathbb{R}_{\ge 0}$, $\mathbb{N}_{>0}$ and $\mathbb{N}_{\ge 0}$ denote the positive and non-negative real and integer numbers, respectively. For $n \in \mathbb{N}_{>0}$ we define the set $\bar n := \{1, 2, \dots, n\}$. For $x \in \mathbb{R}^n$, we denote $|x|^2 := x^\top x$. CT signals $s : \mathbb{R}_{\ge 0} \to \mathbb{R}$ are denoted $s(t)$, while for DT sequences $s : \mathbb{N}_{\ge 0} \to \mathbb{R}$ we use $s(k) := s(kT_s)$, with $T_s \in \mathbb{R}_{>0}$ the sampling time. When a formula is applicable to CT signals and DT sequences the time argument is omitted. The action of an operator $\mathcal{H} : \mathcal{L}_\infty \to \mathcal{L}_\infty$ on a CT signal $u(t)$ is denoted $\mathcal{H}[u](t)$, while for an operator $\mathcal{H} : \ell_\infty \to \ell_\infty$ and a sequence $u(k)$ we use $\mathcal{H}[u](k)$. With $i \in \mathbb{N}_{>0}$ we define the shift operator for DT sequences $q^{\pm i} u(k) := u(k \pm i)$, and the differentiation operator for CT signals $p^i[u](t) := \frac{d^i u}{dt^i}$. All mappings and reference signals are assumed smooth. Given a function $F : \mathbb{R}^n \to \mathbb{R}$ we define the differential operators $\nabla F := \left(\frac{\partial F}{\partial x}\right)^\top$ and $\nabla^2 F := \frac{\partial^2 F}{\partial x^2}$. For general mappings $\mathcal{S} : \mathbb{R}^n \to \mathbb{R}^n$, the $(i,j)$-th element of its Jacobian is defined as $(\nabla \mathcal{S})_{(ij)} := \frac{\partial \mathcal{S}_j}{\partial x_i}$, $(i,j) \in \bar n \times \bar n$.

In this section we identify the class of NPRE that we consider in the paper. Namely, factorizable NPRE, where the mapping dependent on the unknown parameters verifies a monotonicity condition.
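The kind of monotonicity condition used below (cf. Definition 1 and Lemma 1) is easy to check numerically. The following is a minimal sketch with an assumed illustrative mapping $\mathcal{L}(x) = x + \tanh(x)$ (applied elementwise) and $P = I_2$; the mean-value argument turns the pointwise Jacobian condition into the pairwise inequality with modulus $\rho/2$:

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.eye(2)                      # assumed P for this illustration

def L(x):
    # assumed example mapping L(x) = x + tanh(x), elementwise
    return x + np.tanh(x)

def jac(x):
    # Jacobian of L: I + diag(1 - tanh(x)^2)
    return np.eye(2) + np.diag(1.0 - np.tanh(x) ** 2)

rho = 2.0
# Demidovich-type condition P*J + J^T*P >= rho*I holds everywhere for this L
for x in rng.normal(size=(100, 2)):
    J = jac(x)
    assert np.linalg.eigvalsh(P @ J + J.T @ P).min() >= rho - 1e-9

# ... and implies the monotonicity inequality with modulus rho/2 on any pair
for _ in range(100):
    a, b = rng.normal(size=2), rng.normal(size=2)
    assert (a - b) @ P @ (L(a) - L(b)) >= 0.5 * rho * np.dot(a - b, a - b) - 1e-9
```

This grid-based check is, of course, only a sanity test: a proof of the condition must bound the Jacobian globally, as done analytically for the example above.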
In many system identification and adaptive control applications one is confronted with the problem of estimation of the parameters appearing in a NPRE of the form

$y = \Omega\, \mathcal{S}(\theta) + \varepsilon$  (1)

where $y \in \mathbb{R}^n$, $\Omega \in \mathbb{R}^{n \times p}$ are measurable signals, $\theta \in \mathbb{R}^q$ is a constant vector of unknown parameters,

$\mathcal{S} : \mathbb{R}^q \to \mathbb{R}^p, \quad \text{with } p > q,$  (2)

and $\varepsilon$ is a (generic) exponentially decaying term. The task is to estimate on-line the parameters $\theta$ from the measurements of $y$ and $\Omega$.

Remark 1.
The NPRE (1) is, of course, a particular case of the more general, non-factorizable, regression $y(\cdot) = H(\cdot, \theta)$, with $(\cdot) = t$ in CT or $k$ in DT. But it is more often encountered than the classical linear regression $y = \Omega\,\theta$, and the solution of the associated estimation problem is far more complicated. Although in the factorizable case it is possible to introduce extra parameters to obtain a linear parametrization, e.g., to define a higher-dimensional vector $\theta_a := \mathcal{S}(\theta) \in \mathbb{R}^p$, overparametrization suffers from the well-known shortcomings S1-S5 mentioned in the Introduction.

Remark 2.
For the sake of simplicity, we present $y$ and $\Omega$ as functions of time, in the understanding that they may be functions of measurable signals evaluated at time $t$ in CT or $k$ in DT, for instance, the state of a dynamical system, as shown below. Also, following standard practice, in the sequel we disregard the presence of the term $\varepsilon$, stemming from the effect of the initial conditions of the various filters used to generate the regression; see [2] for a discussion on this assumption.

Similarly to [2, 20, 21], the key property of the parameterization that we will exploit is $P$-monotonicity, which is defined as follows.

Definition 1.
Given a positive definite matrix $P \in \mathbb{R}^{q \times q}$, a mapping $\mathcal{L} : \mathbb{R}^q \to \mathbb{R}^q$ is strongly $P$-monotone if and only if there exists a constant $\rho \in \mathbb{R}_{>0}$ such that

$(a - b)^\top P\, [\mathcal{L}(a) - \mathcal{L}(b)] \ge \rho\, |a - b|^2, \quad \forall a, b \in \mathbb{R}^q,\; a \neq b.$  (3)

The following interesting result of Demidovich [8], see also [28], provides a simple way to verify $P$-monotonicity.

Lemma 1.
A sufficient condition for a differentiable mapping $\mathcal{L} : \mathbb{R}^q \to \mathbb{R}^q$ to be strongly $P$-monotone is

$P\, \nabla\mathcal{L} + (\nabla\mathcal{L})^\top P \ge \rho I_q > 0.$  (4)

The following "monotonizability" assumption via coordinate change is instrumental for our further developments.

Assumption 1.
Consider the mapping $\mathcal{S}(\theta)$. There exist:

(i) a bijective mapping $\mathcal{D} : \mathbb{R}^q \to \mathbb{R}^q$, $\theta \mapsto \eta$, with right inverse $\mathcal{D}^I : \mathbb{R}^q \to \mathbb{R}^q$, $\eta \mapsto \theta$;

(ii) a permutation matrix $T \in \mathbb{R}^{p \times p}$; and

(iii) a positive definite matrix $P \in \mathbb{R}^{q \times q}$ such that

$P\, \nabla\mathcal{W}(\eta)\, C^\top + C\, [\nabla\mathcal{W}(\eta)]^\top P \ge \rho I_q > 0,$  (5)

where

$\mathcal{W}(\eta) := \mathcal{S}(\mathcal{D}^I(\eta)), \quad C := \left[\, I_q \;\; 0_{q \times (p - q)} \,\right] T.$  (6)

In words, the construction associated with Assumption 1 proceeds as follows. First, introduce a bijective coordinate change for the parameters $\theta$, namely $\eta = \mathcal{D}(\theta)$, with inverse $\theta = \mathcal{D}^I(\eta)$. Second, write the original mapping $\mathcal{S}(\theta)$ in terms of the parameters $\eta$ via the definition of the new mapping $\mathcal{W}(\eta) := \mathcal{S}(\mathcal{D}^I(\eta))$. Third, assuming that this mapping contains $q$ elements that are "good", a term to be defined below, place them at the top with the permutation matrix $T$ and select them with the fat matrix $[\, I_q \;\; 0_{q \times (p-q)}\,]$. Whence, define the new "good" mapping $\mathcal{G} : \mathbb{R}^q \to \mathbb{R}^q$ as

$\mathcal{G}(\eta) := C\, \mathcal{W}(\eta).$  (7)

Observing that $\nabla\mathcal{G} = \nabla\mathcal{W}\, C^\top$, and invoking Lemma 1, the condition (5) ensures that this "good" mapping is strongly $P$-monotone. For future reference we rewrite this condition in terms of the "good" mapping as

$P\, \nabla\mathcal{G}(\eta) + [\nabla\mathcal{G}(\eta)]^\top P \ge \rho I_q > 0.$  (8)

Using the definitions above in the NPRE (1) we obtain the new NPRE in terms of the parameters $\eta$ as

$y = \Omega\, \mathcal{W}(\eta).$  (9)

Remark 3.
To obtain a NPRE containing only "good" functions, but without the essential parameter change $\theta \mapsto \eta$, a complicated reordering and mixing of the NPRE (9) is proposed in [2]. In the next section we show that direct application of DREM generates an alternative, much simpler, procedure to carry out this task.

In this section we apply DREM [2], with an LTV operator, to the $p$-dimensional NPRE (9), and then select the "good" terms via (7) to generate $q$ scalar NPRE. First, we present the construction for CT signals and then treat the case of DT sequences.

3.1 Continuous-time case
Proposition 1.
Consider the NPRE (9) for CT signals. Define the signals

$\dot Y(t) = -\lambda Y(t) + \Omega^\top(t)\, y(t)$
$\dot\Phi(t) = -\lambda \Phi(t) + \Omega^\top(t)\, \Omega(t)$
$\mathcal{Y}(t) = C\, \operatorname{adj}\{\Phi(t)\}\, Y(t)$
$\Delta(t) = \det\{\Phi(t)\}.$  (10)

Then, the $q$ scalar NPRE

$\mathcal{Y}_i(t) = \Delta(t)\, \mathcal{G}_i(\eta),\; i \in \bar q \quad \Longleftrightarrow \quad \mathcal{Y}(t) = \Delta(t)\, \mathcal{G}(\eta)$  (11)

hold.

Proof.
Multiplying (9) by $\Omega^\top(t)$ and applying the stable, linear time-invariant (LTI) filter

$\mathcal{H}(p) = \frac{1}{p + \lambda},$  (12)

where $\lambda \in \mathbb{R}_{>0}$, we get

$\mathcal{H}(p)[\Omega^\top y](t) = \mathcal{H}(p)[\Omega^\top \Omega](t)\, \mathcal{W}(\eta),$

whose state realization is given in (10) and yields the relation $Y(t) = \Phi(t)\, \mathcal{W}(\eta)$. Now, multiplying this equation by the adjugate of the extended regressor matrix $\Phi(t)$ we obtain

$\operatorname{adj}\{\Phi(t)\}\, Y(t) = \Delta(t)\, \mathcal{W}(\eta),$

where we used the fact that for all, possibly singular, $p \times p$ matrices $A$ we have $\operatorname{adj}\{A\}\, A = \det\{A\}\, I_p$. The proof is completed multiplying the last equation by $C$, invoking (7), and noting that $\Delta(t)$ is a scalar. ∎

Remark 4.
The construction of the extended regressor $\Phi(t)$ proposed above is done following verbatim the DREM procedure of [2] with LTV operators. This construction was first proposed in [19] and is sometimes called Memory Regressor Extension [11].

Proposition 2.
Consider the NPRE (9) for DT sequences. Fix $0 < \alpha < 1$ and define the sequences

$Y(k) = -\alpha Y(k-1) + \Omega^\top(k-1)\, y(k-1)$
$\Phi(k) = -\alpha \Phi(k-1) + \Omega^\top(k-1)\, \Omega(k-1)$
$\mathcal{Y}(k) = C\, \operatorname{adj}\{\Phi(k)\}\, Y(k)$
$\Delta(k) = \det\{\Phi(k)\}.$  (13)

Then, the $q$ scalar NPRE

$\mathcal{Y}_i(k) = \Delta(k)\, \mathcal{G}_i(\eta),\; i \in \bar q \quad \Longleftrightarrow \quad \mathcal{Y}(k) = \Delta(k)\, \mathcal{G}(\eta)$  (14)

hold.

Proof.
The proof follows verbatim the one given in Proposition 1, replacing the CT filter (12) by the stable, LTI, DT filter $\frac{1}{q + \alpha}$. ∎

Remark 5.
The construction of the extended regressor $\Phi(k)$ given above is the discrete-time version of the one proposed in [19] and may be found in [11].

4 Parameter Estimators: Convergence Analysis
In this section we present the CT and DT estimation laws for the parameters $\eta$ of the NPRE (11) and (14), respectively.

Proposition 3.
Consider the NPRE (11) satisfying (8) of Assumption 1, together with the parameter estimator

$\dot{\hat\eta}(t) = \Gamma P\, \Delta(t)\, [\mathcal{Y}(t) - \Delta(t)\, \mathcal{G}(\hat\eta(t))],$  (15)

with $\Gamma \in \mathbb{R}^{q \times q}$, $\Gamma > 0$. Then:

(i) The norm of the parameter estimation error $\tilde\eta(t) := \hat\eta(t) - \eta$ is monotonically non-increasing, that is,

$|\tilde\eta(t)| \le |\tilde\eta(t_0)|, \quad \forall t \ge t_0 \in \mathbb{R}_{\ge 0}.$  (16)

(ii) The following implication holds:

$\Delta(t) \notin \mathcal{L}_2 \;\Rightarrow\; \lim_{t \to \infty} |\tilde\eta(t)| = 0.$

Proof.
Replacing (11) in (15) we get the error equation

$\dot{\tilde\eta}(t) = -\Delta^2(t)\, \Gamma P\, [\mathcal{G}(\hat\eta(t)) - \mathcal{G}(\eta)].$

To analyse its stability define the Lyapunov function candidate $V(\tilde\eta) = \frac{1}{2}\, \tilde\eta^\top \Gamma^{-1} \tilde\eta$, whose derivative yields

$\dot V(t) = -\Delta^2(t)\, (\hat\eta(t) - \eta)^\top P\, [\mathcal{G}(\hat\eta(t)) - \mathcal{G}(\eta)] \le -\Delta^2(t)\, \rho\, |\tilde\eta(t)|^2 \le -\frac{2\rho}{\lambda_{\max}\{\Gamma^{-1}\}}\, \Delta^2(t)\, V(t),$

where we invoked Assumption 1 to get the first bound, and $\lambda_{\max}\{\cdot\}$ denotes the maximum eigenvalue. The fact that $V(t)$ is non-increasing proves the first claim.

To prove the second one, we invoke the Comparison Lemma [16, Lemma 3.4], which yields the bound

$V(t) \le e^{-\frac{2\rho}{\lambda_{\max}\{\Gamma^{-1}\}} \int_0^t \Delta^2(s)\, ds}\, V(0),$

which ensures that $|\tilde\eta(t)| \to 0$ as $t \to \infty$ if $\Delta(t) \notin \mathcal{L}_2$. ∎

Remark 6.
As is well known, convergence in all parameter estimators, as well as in state observers, can only be ensured under some kind of excitation conditions [22]. In particular, for standard gradient and least-squares estimators this property is encrypted in the well-known PE requirement on the regressor [12, 14, 31]. As has been shown in [2], convergence of DREM estimators can be ensured without requiring PE, replacing it, instead, by the assumption $\Delta(t) \notin \mathcal{L}_2$, which is necessary and sufficient for parameter convergence for linear regression equations. As shown in Proposition 3, this condition is sufficient for NPRE of the form (1) with a $P$-"monotonizable" regressor $\mathcal{S}(\theta)$. Notice, on the other hand, that the nice property of element-by-element monotonicity of the parameter estimation errors of linear regressions is lost, and we can only ensure that the norm of this vector is monotonically non-increasing.

Remark 7. It is interesting to note that the following important implication for the CT DREM given above was recently proven in [18]:

$\Omega(t) \in PE \;\Rightarrow\; \Delta(t) \in PE.$
Hence, if the standard gradient estimator for the overparameterized linear regression $y(t) = \Omega(t)\, \eta_a$, with $\eta_a := \mathcal{W}(\eta)$, is globally exponentially stable (GES), then the DREM estimator is also GES. However, asymptotic convergence of DREM is ensured under the condition $\Delta(t) \notin \mathcal{L}_2$, which is strictly weaker than $\Delta(t) \in PE$.

Remark 8.
We have assumed that the mapping $\mathcal{G}(\eta)$ is strongly $P$-monotone. It is clear from the derivations above that this requirement can be relaxed to strictly $P$-monotone by adding some further assumptions on $\Delta(t)$.

In this subsection we present the estimation law for the parameters $\eta$ of the DT NPRE (14). Towards this end, the following is needed.

Assumption 2.
The mapping $\mathcal{G}(\eta)$ satisfies the Lipschitz condition

$|\mathcal{G}(a) - \mathcal{G}(b)| \le \nu\, |a - b|, \quad \forall a, b \in \mathbb{R}^q,$  (17)

for some $\nu > 0$.

Proposition 4.
Consider the DT NPRE (14) with $\mathcal{G}(\eta)$ satisfying (8) of Assumption 1 and Assumption 2, together with the DT parameter estimator

$\hat\eta(k+1) = \hat\eta(k) + \gamma P\, \frac{\Delta(k)}{1 + \kappa\, \Delta^2(k)}\, [\mathcal{Y}(k) - \Delta(k)\, \mathcal{G}(\hat\eta(k))],$  (18)

with $\gamma > 0$ such that

$\sigma := 2\gamma\rho - \gamma^2 \nu^2 \lambda^2_{\max}\{P\} > 0,$  (19)

and the constant $\kappa$ verifying

$\kappa \ge \max\{1, \sigma\}.$  (20)

Then:

P1 The norm of the parameter estimation error $\tilde\eta(k) := \hat\eta(k) - \eta$ is monotonically non-increasing, that is,

$|\tilde\eta(k)| \le |\tilde\eta(k_0)|, \quad \forall k \ge k_0 \in \mathbb{N}_{\ge 0}.$  (21)

P2 The following implication is true:

$\prod_{i=0}^{\infty} \frac{1 + (\kappa - \sigma)\, \Delta^2(i)}{1 + \kappa\, \Delta^2(i)} = 0 \;\Rightarrow\; \lim_{k \to \infty} |\tilde\eta(k)| = 0.$

P3 The following implication is true:

$\lim_{k \to \infty} \Delta(k) =: \Delta(\infty) \neq 0 \;\Rightarrow\; \lim_{k \to \infty} |\tilde\eta(k)| = 0.$

Clearly, for any positive $\rho$, $\nu$ and $P > 0$, the condition (19) is satisfied with $\gamma < \frac{2\rho}{\nu^2 \lambda^2_{\max}\{P\}}$.

P4 Assume, moreover, that

$\nu\, \lambda_{\max}\{P\} \le \rho,$  (22)

and pick $\gamma$ in the interval

$\frac{1}{\nu^2 \lambda^2_{\max}\{P\}} \left[ \rho - \sqrt{\rho^2 - \nu^2 \lambda^2_{\max}\{P\}} \right] \le \gamma \le \frac{1}{\nu^2 \lambda^2_{\max}\{P\}} \left[ \rho + \sqrt{\rho^2 - \nu^2 \lambda^2_{\max}\{P\}} \right].$  (23)

Then the following holds:

$\Delta(k) \notin \ell_2 \;\Rightarrow\; \lim_{k \to \infty} |\tilde\eta(k)| = 0.$

Proof.
First, we observe that the condition (20), on the one hand, ensures the following bound for the normalized scalar regressor

$\bar\Delta^2(k) := \frac{\Delta^2(k)}{1 + \kappa\, \Delta^2(k)} \le \frac{1}{\kappa} \le 1,$  (24)

and, on the other hand, shows that

$\kappa - \sigma \ge 0.$  (25)

Replacing (14) in (18) we get the error equation

$\tilde\eta(k+1) = \tilde\eta(k) - \gamma P\, \bar\Delta^2(k)\, [\mathcal{G}(\hat\eta(k)) - \mathcal{G}(\eta)],$

where we invoked the definition in (24). To analyze the stability of this equation define the Lyapunov function candidate

$V(k) = \frac{1}{2\gamma}\, |\tilde\eta(k)|^2,$  (26)

which satisfies

$V(k+1) = V(k) - \bar\Delta^2(k)\, \tilde\eta^\top(k)\, P\, [\mathcal{G}(\hat\eta(k)) - \mathcal{G}(\eta)] + \frac{\gamma}{2}\, \bar\Delta^4(k)\, [\mathcal{G}(\hat\eta(k)) - \mathcal{G}(\eta)]^\top P^2\, [\mathcal{G}(\hat\eta(k)) - \mathcal{G}(\eta)]$
$\le V(k) - \rho\, \bar\Delta^2(k)\, |\tilde\eta(k)|^2 + \frac{\gamma}{2}\, \nu^2 \lambda^2_{\max}\{P\}\, \bar\Delta^4(k)\, |\tilde\eta(k)|^2$
$\le V(k) - \rho\, \bar\Delta^2(k)\, |\tilde\eta(k)|^2 + \frac{\gamma}{2}\, \nu^2 \lambda^2_{\max}\{P\}\, \bar\Delta^2(k)\, |\tilde\eta(k)|^2$
$= V(k) - \left[ \rho - \frac{\gamma}{2}\, \nu^2 \lambda^2_{\max}\{P\} \right] \bar\Delta^2(k)\, |\tilde\eta(k)|^2$
$= \left[ 1 - \sigma\, \bar\Delta^2(k) \right] V(k)$
$= \frac{1 + (\kappa - \sigma)\, \Delta^2(k)}{1 + \kappa\, \Delta^2(k)}\, V(k),$  (27)

where we invoked Assumption 1 and (17) to get the first bound, inequality (24) for the second bound, (19) and (26) in the third identity, and the definition of $\bar\Delta^2(k)$ for the last identity.

[Proof of Property P1] The proof is completed observing that (25) ensures

$\frac{1 + (\kappa - \sigma)\, \Delta^2(k)}{1 + \kappa\, \Delta^2(k)} \le 1,$

consequently $V(k)$ is a non-increasing sequence.

[Proof of Property P2] From

$V(k+1) \le \prod_{i=0}^{k} \frac{1 + (\kappa - \sigma)\, \Delta^2(i)}{1 + \kappa\, \Delta^2(i)}\, V(0),$

the claim follows immediately.

[Proof of Property P3] To prove this claim we first notice that, from the second inequality in (27) and (19), we get the bound

$V(k+1) \le V(k) - \frac{\sigma}{2\gamma}\, \bar\Delta^2(k)\, |\tilde\eta(k)|^2.$

Summing the inequality above we get

$V(k) - V(0) \le -\sum_{j=1}^{k} \frac{\sigma}{2\gamma}\, \bar\Delta^2(j)\, |\tilde\eta(j)|^2 \;\Rightarrow\; \frac{2\gamma\, V(0)}{\sigma} \ge \sum_{j=1}^{k} \bar\Delta^2(j)\, |\tilde\eta(j)|^2.$

Taking the limit as $k \to \infty$ in the right-hand-side inequality we conclude that

$\bar\Delta(k)\, |\tilde\eta(k)| \in \ell_2 \;\Rightarrow\; \bar\Delta(k)\, |\tilde\eta(k)| \to 0,$  (28)

independently of the behaviour of $\Delta(k)$. Now, from the Algebraic Limit Theorem [30, Theorem 3.3] we know that the limit of the product of two convergent sequences is the product of their limits. On the other hand, from the fact that

$V(k+1) \le V(k) \le V(0), \quad \forall k \in \mathbb{N}_{>0},$

we have that $|\tilde\eta(k)|$ is a bounded monotonic sequence, hence it converges [30, Theorem 3.14]. Finally, if $\Delta(k)$ converges to a non-zero limit, $\bar\Delta(k)$ also converges to a non-zero limit and we conclude from (28) that $|\tilde\eta(k)| \to 0$.

[Proof of Property P4] To prove this claim we first observe that the condition (22) ensures that the upper and lower limits of $\gamma$ given in (23) are well defined. Some lengthy, but straightforward, calculations show then that the conditions (22) and (23) guarantee that $\sigma \ge 1$. Hence, in view of (20), the bound (24), as well as the derivations in (27), still hold. Then, setting $\kappa = \sigma$ in the last equation of (27) we get

$V(k+1) \le \prod_{i=0}^{k} \frac{1}{1 + \kappa\, \Delta^2(i)}\, V(0).$

The proof is completed recalling that

$\prod_{i=0}^{\infty} \frac{1}{1 + \kappa\, \Delta^2(i)} = 0 \;\Longleftrightarrow\; \Delta(k) \notin \ell_2.$ ∎

Remark 9.
Similarly to the observation made in Remark 6, the sufficient conditions for parameter convergence of Properties P2-P4 should be interpreted as excitation requirements imposed on $\Delta(k)$. Notice that the condition of Property P3 is sufficient to ensure $\Delta(k) \notin \ell_2$ and necessary for it to be PE. In Property P4 we prove that $\Delta(k) \notin \ell_2$ is sufficient for parameter convergence but, unfortunately, we need to impose the rather "unnatural" condition (22). Indeed, roughly speaking, the Lipschitz constant $\nu$ is related to an "upper bound" on the derivative of $\mathcal{G}(\eta)$ [30, Theorem 9.19], while at the same time a high monotonicity degree $\rho$ requires this derivative to be large, which is in contradiction with (22).

5 Application to CT Nonlinearly Parameterized Nonlinear Systems
In this section we apply the results on parameter estimation of CT NPRE of the previous section to tackle the problem of adaptive control of uncertain, nonlinearly parameterized, nonlinear systems. First, we treat the case of a rather general class of systems; then we specialize the result to EL models.
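Before specializing to particular systems, the complete CT scheme, the extension (10) feeding the estimator (15), can be sketched end-to-end. The following is a minimal sketch under assumed toy data: a NPRE $y = \Omega\,\mathcal{W}(\eta)$ with $p = 3$, $q = 2$, whose first two entries form the monotone "good" map $\mathcal{G}(\eta) = \eta + \tanh(\eta)$ (so $T = I$ and $C = [I_2 \;\; 0]$), $\Gamma = P = I_2$, forward-Euler integration, and a small-determinant guard that is an implementation detail, not part of (15):

```python
import numpy as np

p, q, lam, dt, N = 3, 2, 0.5, 1e-3, 40000
eta = np.array([0.7, -0.4])                       # true transformed parameters

def W(v):
    # assumed W(eta); its first q entries are the "good" map G = C W
    return np.array([v[0] + np.tanh(v[0]), v[1] + np.tanh(v[1]), v[0] * v[1]])

C = np.hstack([np.eye(q), np.zeros((q, p - q))])
G = lambda v: C @ W(v)

Y = np.zeros(p)
Phi = np.zeros((p, p))
eta_hat = np.zeros(q)
Delta = 0.0
for n in range(N):
    t = n * dt
    Omega = np.array([[np.sin(t), np.cos(t), 1.0]])   # 1 x p measurable regressor
    y = Omega @ W(eta)                                # NPRE (9)
    # extension (10), forward Euler
    Y += dt * (-lam * Y + Omega.T @ y)
    Phi += dt * (-lam * Phi + Omega.T @ Omega)
    Delta = np.linalg.det(Phi)
    if abs(Delta) > 1e-9:                # adj{Phi} Y = Delta * Phi^{-1} Y
        Ycal = C @ (Delta * np.linalg.solve(Phi, Y))
        # estimator (15) with Gamma = P = I
        eta_hat += dt * Delta * (Ycal - Delta * G(eta_hat))

assert abs(Delta) > 0.1                  # Delta stays bounded away from zero here
assert np.linalg.norm(eta_hat - eta) < 1e-2
```

The chosen regressor keeps $\Delta(t)$ bounded away from zero, hence $\Delta \notin \mathcal{L}_2$ and the estimate converges, consistently with Proposition 3.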
Consider CT systems described by the state equations

$\dot x(t) = F(x(t), u(t)) + R(x(t))\, \mathcal{S}(\theta)$  (29)

where $x(t) \in \mathbb{R}^n$ is the measurable state, $u(t) \in \mathbb{R}^m$, with $n \ge m$, is the control signal, the mappings $F : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$, $R : \mathbb{R}^n \to \mathbb{R}^{n \times p}$ and $\mathcal{S} : \mathbb{R}^q \to \mathbb{R}^p$ are known, with $p > q$, and $\theta \in \mathbb{R}^q$ is a constant vector of unknown parameters.

To streamline the formulation of the adaptive control problem we require the following sine qua non stabilizability condition.

Assumption 3.
There exists a mapping $\beta : \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}^m$ such that the system

$\dot x(t) = F(x(t), \beta(x(t), \theta)) + R(x(t))\, \mathcal{S}(\theta) =: f^\star(x(t))$  (30)

has a globally exponentially stable (GES) equilibrium at a desired value $x^\star \in \mathbb{R}^n$.

The control objective is then to design a parameter estimator such that the (certainty-equivalent) adaptive control $u = \beta(x(t), \hat\theta(t))$ ensures the asymptotic convergence

$\lim_{t \to \infty} x(t) = x^\star,$  (31)

with all signals bounded. To solve this problem we will impose Assumption 1 on the mapping $\mathcal{S}(\theta)$ and apply the estimator of Proposition 3 to generate the adaptive controller.

A first step in the design is the derivation of the NPRE (1) for the system (29). This is easily obtained applying to (29) the stable, LTI filter (12) and defining

$y(t) := p\mathcal{H}(p)[x](t) - \mathcal{H}(p)[F(x, u)](t)$
$\Omega(t) := \mathcal{H}(p)[R(x)](t),$  (32)

where $\varepsilon(t)$ is the solution of $\mathcal{H}(p)[\varepsilon](t) = 0$. A state-space realization of (32) is given by

$\dot z(t) = -\lambda\, (z(t) + x(t)) - F(x(t), u(t))$
$\dot\Omega(t) = -\lambda\, \Omega(t) + R(x(t))$
$y(t) = z(t) + x(t).$  (33)

We are in position to state the main result of this subsection.

Proposition 5.
Consider the nonlinearly parameterized, nonlinear system (29) satisfying Assumptions 1 and 3. Let the adaptive control be given by

$u(t) = \beta(x(t), \mathcal{D}^I(\hat\eta(t))),$

together with the parameter estimator (10), (15) and (33). If $\Delta(t) \notin \mathcal{L}_2$, then (31) holds with all signals bounded.

Proof. First, notice that the closed-loop system takes the form

$\dot x(t) = F(x(t), \beta(x(t), \hat\theta(t))) + R(x(t))\, \mathcal{S}(\theta) = f^\star(x(t)) + \xi(x(t), \tilde\theta(t)),$

where we defined the perturbation term

$\xi(x(t), \tilde\theta(t)) := F(x(t), \beta(x(t), \tilde\theta(t) + \theta)) - F(x(t), \beta(x(t), \theta)).$

Using the fact that $\tilde\theta(t) = \mathcal{D}^I(\tilde\eta(t))$, we see that the closed-loop system takes the cascade form

$\dot x(t) = f^\star(x(t)) + \xi(x(t), \mathcal{D}^I(\tilde\eta(t)))$
$\dot{\tilde\eta}(t) = -\Delta^2(t)\, \Gamma P\, [\mathcal{G}(\tilde\eta(t) + \eta) - \mathcal{G}(\eta)],$  (34)

with $\xi(x(t), 0) = 0$. Assumption 3 ensures that $x^\star$ is a GES equilibrium of the unperturbed system. Therefore, by [16, Lemma 4.6], the perturbed system is ISS with respect to the input $\tilde\eta(t)$. Now, the condition $\Delta \notin \mathcal{L}_2$ ensures that the origin of the $\tilde\eta(t)$ subsystem is globally asymptotically stable (GAS). Hence, by [16, Lemma 4.7], the cascaded system (34) is GAS and, consequently, (31) holds with all signals bounded. ∎

Remark 10.
To simplify the presentation we have restricted ourselves to regulation tasks with static state-feedback controllers and aimed at global properties. The extension to tracking with dynamic controllers, and to local results, follows verbatim. In particular, local asymptotic stability follows replacing GES by GAS in Assumption 3.
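The regression-generation step (32)-(33) can be checked numerically. The following is a minimal sketch for an assumed scalar instance of (29): $F(x,u) = -x + u$, $R(x) = [\sin x \;\; \cos x]$, $\mathcal{S}(\theta) = \operatorname{col}(e^{-\theta}, \theta^2)$. With the compatible initial conditions $z(0) = -x(0)$ and $\Omega(0) = 0$ the decaying term $\varepsilon$ vanishes, and under a common forward-Euler discretization of plant and filters the identity $y = \Omega\,\mathcal{S}(\theta)$ holds to machine precision:

```python
import numpy as np

theta = 0.7
S = np.array([np.exp(-theta), theta ** 2])      # assumed S(theta), p = 2 > q = 1
lam, dt, N = 2.0, 1e-3, 4000

x = 1.0
z = -x                                          # z(0) = -x(0)  =>  y(0) = 0
Omega = np.zeros(2)
res = []
for n in range(N):
    u = np.sin(0.003 * n)                       # any bounded input
    F = -x + u
    R = np.array([np.sin(x), np.cos(x)])
    # forward-Euler step of the realization (33) and of the plant (29)
    z += dt * (-lam * (z + x) - F)
    Omega += dt * (-lam * Omega + R)
    x += dt * (F + R @ S)
    res.append(abs((z + x) - Omega @ S))        # y - Omega*S(theta), stays ~0

assert max(res) < 1e-9
```

This also illustrates the remark after (32): for incompatible initial conditions the residual would instead be a transient decaying like $e^{-\lambda t}$, which is the term $\varepsilon$ disregarded in the analysis.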
In this subsection we specialize the result of the previous subsection to the practically important case of CT EL systems. Moreover, we extend the scenario to treat the problem of tracking a reference for the state vector. To simplify the notation, throughout this section we omit the time dependence of all signals.
We consider $n_q$ degrees-of-freedom (dof), possibly underactuated, EL systems with generalized coordinates $q(t) \in \mathbb{R}^{n_q}$ and control vector $u(t) \in \mathbb{R}^m$, $m \le n_q$, whose dynamics is described by the EL equations of motion

$\frac{d}{dt}\left[ \nabla_{\dot q} L(q, \dot q) \right] - \nabla_q L(q, \dot q) = G(q)\, u,$  (35)

where $L : \mathbb{R}^{n_q} \times \mathbb{R}^{n_q} \to \mathbb{R}$ is the Lagrangian function defined as

$L(q, \dot q) := T(q, \dot q) - U(q),$

with $T : \mathbb{R}^{n_q} \times \mathbb{R}^{n_q} \to \mathbb{R}$ the kinetic co-energy function, $U : \mathbb{R}^{n_q} \to \mathbb{R}$ the potential energy function and $G : \mathbb{R}^{n_q} \to \mathbb{R}^{n_q \times m}$ the full-rank input matrix. We restrict our attention to simple EL systems, whose kinetic co-energy is of the form

$T(q, \dot q) = \frac{1}{2}\, \dot q^\top M(q)\, \dot q,$

where $M : \mathbb{R}^{n_q} \to \mathbb{R}^{n_q \times n_q}$ is the generalized inertia matrix, which is positive definite and assumed to be bounded. See [27] for additional details on this model and many practical examples.

For future reference we find it convenient to write the dynamics of the EL system (35) as

$\frac{d}{dt}\left[ M(q)\, \dot q \right] - \frac{1}{2}\, \nabla_q \left[ \dot q^\top M(q)\, \dot q \right] + \nabla U(q) = G(q)\, u,$  (36)

with the more explicit form

$M(q)\, \ddot q + C(q, \dot q)\, \dot q + \nabla U(q) = G(q)\, u,$  (37)

where $C : \mathbb{R}^{n_q} \times \mathbb{R}^{n_q} \to \mathbb{R}^{n_q \times n_q}$ represents the Coriolis and centrifugal forces matrix. As is well known [27, Lemma 2.8], if the matrix $C(q, \dot q)$ is defined via the Christoffel symbols of the first kind, the key skew-symmetry property

$z^\top \left[ \dot M(q) - 2\, C(q, \dot q) \right] z = 0, \quad \forall z \in \mathbb{R}^{n_q},$  (38)

holds.

Similarly to the previous subsection, we require the existence of a global tracking controller.

Assumption 4.
Given a desired bounded trajectory for the state vector $(q^\star(t), \dot q^\star(t)) \in \mathbb{R}^{n_q} \times \mathbb{R}^{n_q}$, define the state tracking error

$\operatorname{col}(\tilde q, \dot{\tilde q}) := \operatorname{col}(q - q^\star, \dot q - \dot q^\star).$

There exists a mapping $\beta : \mathbb{R}^{n_q} \times \mathbb{R}^{n_q} \times \mathbb{R}^q \times \mathbb{R}_{\ge 0} \to \mathbb{R}^m$ such that the system

$M(q)\, \ddot q + C(q, \dot q)\, \dot q + \nabla U(q) = G(q)\, \beta(q, \dot q, \theta, t)$

has an error dynamics

$\begin{bmatrix} \dot{\tilde q} \\ \ddot{\tilde q} \end{bmatrix} = f^\star(\tilde q, \dot{\tilde q}, t)$

whose origin is GES.

The control objective is then to design a parameter estimator such that the (certainty-equivalent) adaptive control $u = \beta(q, \dot q, \hat\theta, t)$ ensures global asymptotic tracking, that is,

$\lim_{t \to \infty} \operatorname{col}(\tilde q(t), \dot{\tilde q}(t)) = 0,$  (39)

with all signals bounded.

A first step in the design is the derivation of the NPRE (1) for the system (37), which was already reported in [33]. Towards this end, we introduce the following parameterization of the inertia matrix $M(q)$ and the potential energy $U(q)$:

$M(q) = \sum_{i=1}^{\ell} m_i(q)\, \mathcal{S}^m_i(\theta), \qquad U(q) = \sum_{j=1}^{r} U_j(q)\, \mathcal{S}^U_j(\theta),$  (40)

with known matrices $m_i : \mathbb{R}^{n_q} \to \mathbb{R}^{n_q \times n_q}$, known functions $U_j : \mathbb{R}^{n_q} \to \mathbb{R}$, and functions $\mathcal{S}^m_i, \mathcal{S}^U_j : \mathbb{R}^q \to \mathbb{R}$ of the unknown physical parameters $\theta \in \mathbb{R}^q$. We group together all functions $\mathcal{S}^m_i(\theta)$, $\mathcal{S}^U_j(\theta)$ in a single vector mapping $\mathcal{S} : \mathbb{R}^q \to \mathbb{R}^p$ as

$\mathcal{S}(\theta) := \operatorname{col}(\mathcal{S}^m_1(\theta), \dots, \mathcal{S}^m_\ell(\theta), \mathcal{S}^U_1(\theta), \dots, \mathcal{S}^U_r(\theta)) \in \mathbb{R}^p,$  (41)

where $p := \ell + r > q$. We are in position to present the following.

Proposition 6.
There exists a regressor matrix Ω : R^{n_q} × R^{n_q} → R^{n_q × p} such that the EL system (37) satisfies the NPRE

y = Ω(q, \dot q) S(θ), (42)

where

y := H(p)[G(q)u], (43)

with θ and S(θ) defined via (40) and (41).

Proof. Applying the LTI filter (12) to both sides of (36) we get

H(p)[p\,M(q)\dot q] - \frac{1}{2}H(p)[\nabla_q(\dot q^\top M(q)\dot q)] + H(p)[\nabla U(q)] = y, (44)

where we have used (43). Now, using the parameterization (40), the left-hand side of (44) can be written as

\sum_{i=1}^{ℓ} H(p)\left[p\,m_i(q)\dot q - \frac{1}{2}\nabla_q(\dot q^\top m_i(q)\dot q)\right] S_i^m(θ) + \sum_{j=1}^{r} H(p)[\nabla U_j(q)] S_j^U(θ) = Ω(q, \dot q) S(θ), (45)

where we used (41) and defined the regressor matrix

Ω(q, \dot q) := H(p)\Big[\, p\,m_1(q)\dot q - \tfrac{1}{2}\nabla_q(\dot q^\top m_1(q)\dot q) \,\Big|\, \cdots \,\Big|\, p\,m_ℓ(q)\dot q - \tfrac{1}{2}\nabla_q(\dot q^\top m_ℓ(q)\dot q) \,\Big|\, \nabla U_1(q) \,\Big|\, \cdots \,\Big|\, \nabla U_r(q) \,\Big], (46)

which completes the proof. □ Remark 11.
Notice that the terms H(p)[p\,m_i(q)\dot q], i ∈ \bar ℓ, may be written as \frac{p}{p+λ}[m_i(q)\dot q], hence they can be computed without differentiation. Remark 12.
In [33] an alternative parameterization of the EL system (35) is proposed. Indeed, applying the filter (12) to the well-known power-balance equation [27, Proposition 2.5]

\dot H = \dot q^\top G(q) u,

where H(q, \dot q) := T(q, \dot q) + U(q) is the total energy function, it is possible to obtain an NPRE of the form (42) with scalar y and Ω : R^{n_q} × R^{n_q} → R^p. As argued in [33] this is a much simpler parameterization than the one given in Proposition 6. However, extensive simulated evidence shows that it yields a non-identifiable parameterization. We are now in position to present the main result of this subsection, whose proof follows verbatim the proof of Proposition 5 and is therefore omitted.
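The filtering operation used in these parameterizations (cf. Remark 11) can be implemented without differentiating the measured signal, by realizing p/(p+λ) = 1 − λ/(p+λ) through its strictly proper part. A minimal numerical sketch, with an illustrative forward-Euler discretization and a made-up test signal:

```python
import numpy as np

def filtered_derivative(x, dt, lam):
    """Apply y = (p/(p+lam))[x] without differentiating x.

    Identity: p/(p+lam) = 1 - lam/(p+lam), so it suffices to integrate
    the strictly proper state z = (1/(p+lam))[x] and output y = x - lam*z.
    """
    z = 0.0
    y = np.empty_like(x)
    for k, xk in enumerate(x):
        y[k] = xk - lam * z          # y = x - lam*z
        z += dt * (xk - lam * z)     # z' = -lam*z + x (forward Euler)
    return y

# sanity check on a ramp x(t) = t: since 1/(p+lam) has DC gain 1/lam,
# y settles near (d/dt x)/lam = 1/lam
dt, lam = 1e-3, 2.0
t = np.arange(0.0, 5.0, dt)
y = filtered_derivative(t, dt, lam)
```

After the filter transient, y settles close to 1/λ = 0.5 for this ramp. The same realization is what makes the terms H(p)[p m_i(q)q̇] of Remark 11 computable from position and velocity measurements alone.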
Proposition 7.
Consider the EL system (37) with NPRE (42) verifying Assumptions 1 and 4. Let the adaptive control be given by

u = β(q, \dot q, D^I(\hat η), t), (47)

together with the parameter estimator (10), (15), (43) and (46). If Δ ∉ L_2 we have that (39) holds with all signals bounded.

In what follows we present two well-known choices of β(q, \dot q, θ, t) for fully actuated systems, i.e., m = n_q, and prove that they satisfy the key GES Assumption 4.

• The
Computed Torque Controller, which in the known parameter case is given by

β(q, \dot q, θ, t) = M(q)[\ddot q_⋆ - K_d \dot{\tilde q} - K_p \tilde q] + C(q, \dot q)\dot q + g(q),

where g(q) := ∇U(q), resulting in the closed loop

\ddot{\tilde q} + K_d \dot{\tilde q} + K_p \tilde q = 0,

that, obviously, has a GES equilibrium at the origin for all positive definite control gains K_p, K_d ∈ R^{n_q × n_q}.

• The
Slotine-Li Controller, which in the known parameter case is given by [32]

β(q, \dot q, θ, t) = M(q)\ddot q_r + C(q, \dot q)\dot q_r + g(q) - Ks, (48)

where we defined the signals

\dot q_r := \dot q_⋆ - Λ\tilde q, \quad s := \dot{\tilde q} + Λ\tilde q, (49)

with positive definite gains K, Λ ∈ R^{n_q × n_q}. The closed-loop system is then

M(q)\dot s + [C(q, \dot q) + K]s = 0, \quad \dot{\tilde q} + Λ\tilde q = s,

that (as indicated in [27, Remark 4.5], see also [34]) has a GES equilibrium at the origin. Remark 13.
To the best of our knowledge, the proof of global stability of the adaptive version of the computed torque scheme proposed above is the first one reported in the literature.

2-DOF robot manipulator

In this subsubsection we show that the "monotonizability" Assumption 1 is verified for a 2-dof robot manipulator. The equations of motion of the robot are given by (37) with

M(q) = \begin{bmatrix} S_1(θ) + 2S_2(θ)\cos(q_2) & S_3(θ) + S_2(θ)\cos(q_2) \\ S_3(θ) + S_2(θ)\cos(q_2) & S_3(θ) \end{bmatrix},

U(q) = S_4(θ)\,g\,(1 + \sin(q_1 + q_2)) + S_5(θ)\,g\,(1 + \sin(q_1)), (50)

with g the gravitational constant and the physical parameters θ := col(l_1, l_2, m_1, m_2), where l_i > 0 is the length of link i with mass m_i > 0, i = 1,
2, and the mappings

S^m(θ) := col\big(θ_2^2 θ_4 + θ_1^2(θ_3 + θ_4),\; θ_1 θ_2 θ_4,\; θ_2^2 θ_4\big), \quad S^U(θ) := col\big(θ_2 θ_4,\; θ_1(θ_3 + θ_4)\big), \quad S(θ) := col\big(S^m(θ), S^U(θ)\big). (51)

In the following lemma we verify Assumption 1 for the mapping S(θ). Lemma 2.
Consider the vector θ ∈ R^4_{>0} and the mapping S : R^4_{>0} → R^5_{>0} given by (51). Assume the bounds

θ_1 ≤ θ_1^M, \quad θ_2^m ≤ θ_2 ≤ θ_2^M, \quad θ_4^m ≤ θ_4. (52)

The mapping D : R^4_{>0} → R^4_{>0},

η = D(θ) = col\big(θ_1, θ_2, θ_2 θ_4, θ_1(θ_3 + θ_4)\big),

with right inverse D^I : R^4_{>0} → R^4,

θ = D^I(η) = col\Big(η_1, η_2, \frac{η_4}{η_1} - \frac{η_3}{η_2}, \frac{η_3}{η_2}\Big), (53)

verifies Assumption 1 with T the cyclic permutation matrix that moves the first component to the last position and P = diag\{1, 1, a, a\} for any

a ≥ \frac{1}{4 θ_2^m θ_4^m}\big[(θ_1^M)^2 + (θ_2^M)^2\big].

Proof. From (6) compute the mapping

W(η) = S(D^I(η)) = col\big(η_2 η_3 + η_1 η_4,\; η_1 η_3,\; η_2 η_3,\; η_3,\; η_4\big),

and the matrix C = [I_4 \mid 0_{4×1}]T. Hence the "good" mapping is G(η) = col(W_2(η), W_3(η), W_4(η), W_5(η)) = col(η_1 η_3, η_2 η_3, η_3, η_4), whose Jacobian yields

∇G(η) = \begin{bmatrix} η_3 & 0 & η_1 & 0 \\ 0 & η_3 & η_2 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

Since the real parts of the eigenvalues of this matrix are positive and it is a Metzler matrix, it admits a diagonal matrix P such that (8) holds [5]. Computing the matrix

P∇G(η) + [∇G(η)]^\top P = \begin{bmatrix} 2η_3 & 0 & η_1 & 0 \\ 0 & 2η_3 & η_2 & 0 \\ η_1 & η_2 & 2a & 0 \\ 0 & 0 & 0 & 2a \end{bmatrix},

we see that it is positive definite if and only if its Schur complement of the (2,
2) block, given as

\begin{bmatrix} 2η_3 - \frac{η_1^2}{2a} & -\frac{η_1 η_2}{2a} \\ -\frac{η_1 η_2}{2a} & 2η_3 - \frac{η_2^2}{2a} \end{bmatrix},

is positive definite. This, in its turn, is true if and only if

a > \frac{1}{4η_3}(η_1^2 + η_2^2).

The proof is completed bounding the right-hand side from above, replacing η by D(θ) and using the bounds (52). □

2-DOF robot manipulator

In this subsubsection we present in detail the adaptive controller of Proposition 7 with the Slotine-Li scheme for the 2-DOF robot manipulator. We show simulation results comparing the proposed scheme with the classical one relying on overparameterization. To derive the NPRE (42) we invoke (40) and (50) and define

m_1 := \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad m_2(q) := \cos(q_2)\begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix}, \quad m_3 := \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix},

U_1(q) := g[1 + \sin(q_1 + q_2)], \quad U_2(q) := g[1 + \sin(q_1)].

Thus, the regressor matrix in (42) takes the form

Ω(q, \dot q) = H(p)\begin{bmatrix} p[\dot q_1] & p[\cos(q_2)(2\dot q_1 + \dot q_2)] & p[\dot q_2] & g\cos(q_1 + q_2) & g\cos(q_1) \\ 0 & p[\cos(q_2)\dot q_1] + \sin(q_2)(\dot q_1^2 + \dot q_1 \dot q_2) & p[\dot q_1 + \dot q_2] & g\cos(q_1 + q_2) & 0 \end{bmatrix}.

The Slotine-Li control (48) may be written as

β(q, \dot q, θ, t) := W(q, \dot q, t) S(θ) - Ks,

with the matrix

W(q, \dot q, t) := \begin{bmatrix} \ddot q_{r1} & \cos(q_2)(2\ddot q_{r1} + \ddot q_{r2}) - \sin(q_2)(\dot q_2 \dot q_{r1} + (\dot q_1 + \dot q_2)\dot q_{r2}) & \ddot q_{r2} & g\cos(q_1 + q_2) & g\cos(q_1) \\ 0 & \cos(q_2)\ddot q_{r1} + \sin(q_2)\dot q_1 \dot q_{r1} & \ddot q_{r1} + \ddot q_{r2} & g\cos(q_1 + q_2) & 0 \end{bmatrix},

where \dot q_r and s are defined in (49). In its standard version [32], to get a linear parametrization, the adaptive implementation is obtained estimating the vector S, yielding

β(q, \dot q, \hat S, t) := W(q, \dot q, t)\hat S - Ks.
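The change of coordinates of Lemma 2 can be sanity-checked numerically. The sketch below assumes the index pattern η = D(θ) = col(θ₁, θ₂, θ₂θ₄, θ₁(θ₃+θ₄)) together with S(θ) = col(θ₂²θ₄ + θ₁²(θ₃+θ₄), θ₁θ₂θ₄, θ₂²θ₄, θ₂θ₄, θ₁(θ₃+θ₄)), which is one consistent reading of (51) and (53); the numerical values are arbitrary:

```python
import numpy as np

def D(theta):                       # eta = D(theta)
    t1, t2, t3, t4 = theta
    return np.array([t1, t2, t2 * t4, t1 * (t3 + t4)])

def D_inv(eta):                     # right inverse D^I
    e1, e2, e3, e4 = eta
    return np.array([e1, e2, e4 / e1 - e3 / e2, e3 / e2])

def S(theta):                       # col(S^m, S^U) for the 2-dof arm
    t1, t2, t3, t4 = theta
    return np.array([t2**2 * t4 + t1**2 * (t3 + t4), t1 * t2 * t4,
                     t2**2 * t4, t2 * t4, t1 * (t3 + t4)])

theta = np.array([1.0, 2.0, 3.0, 0.5])    # (l1, l2, m1, m2), arbitrary
eta = D(theta)
assert np.allclose(D_inv(eta), theta)     # D^I inverts D
# in the new coordinates, W(eta) = S(D^I(eta)) is polynomial in eta:
W = np.array([eta[1] * eta[2] + eta[0] * eta[3], eta[0] * eta[2],
              eta[1] * eta[2], eta[2], eta[3]])
assert np.allclose(W, S(D_inv(eta)))
```

The last four components of W form the "good" mapping G(η) = col(η₁η₃, η₂η₃, η₃, η₄) of the lemma, whose Jacobian is triangular with positive diagonal for η in the positive orthant.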
The parameter estimator is given as

\dot{\hat S} := -ΓW^\top(q, \dot q, t)s,

that, as shown in [34], yields a globally stable closed-loop system and ensures global tracking of the desired references. In the proposed approach we estimate directly θ, that is, the adaptive control is

β(q, \dot q, \hat θ, t) := W(q, \dot q, t)S(\hat θ) - Ks,

with the parameter estimator (10), (15), (43) and (46), combined with \hat θ = D^I(\hat η), where the mapping D^I(·) is given in (53).

Now we present some simulations comparing both approaches. For both controllers the gains are set as K = 3I, Λ = I and Γ = 5I. For the DREM-based controller the filter (12) is implemented with λ = 2 in Proposition 1 and λ = 1 in Proposition 6, both filters with zero initial conditions. The unknown parameters are set as θ_1 = 0.…, θ_2 = 0.…, θ_3 = 1.…, θ_4 = 0.…, with initial conditions q(0) = [0.…π; 0.…π] rad. The initial estimates are \hat θ_i(0) = 0.01 and \hat S_i(0) = 0.01. The desired trajectory is

q_⋆(t) = col(0.…π \sin(2t) + 0.…π,\; 0.…π \cos(t) + 0.…π).

Figure 1 shows the results of the simulations of the DREM-based and the standard schemes, from which we can observe that the trajectory tracking and the parameter estimation capabilities of our proposal clearly outperform those of the classical adaptive controller. In this figure it can also be seen that consistent parameter estimation is quickly achieved. However, as indicated in Remark 6, the individual estimation errors \tilde θ_i are not monotonically decreasing.

In Figure 2 we change the initial conditions of the estimated parameters. From this figure we conclude that these initial conditions strongly affect the excitation of the system, encrypted in the signal Δ in (15). Notice that, although there is a "pattern" in the behavior of Δ as a function of the initial conditions, it is hard to predict. A similar "sensitivity" to variations in the estimator and controller gains was observed, rendering difficult their tuning to achieve a satisfactory transient performance. The figure also shows that the norm of the estimation error \tilde η is monotonically decreasing, as indicated in Proposition 3.

Figure 1: Simulation results for the DREM-based adaptive scheme (left column) and the classical adaptive Slotine-Li (right column).
Figure 2: "Measure of excitation" (Δ) and norm of the parameter estimation error (|\tilde η|) for different initial conditions of the estimated parameters.

In this section we show how the proposed DREM-based parameter estimator can be applied to the problems of identification of a nonlinearly parameterized DT plant and to solve the direct and indirect versions of APPC.
In [22, Example 1.1] the problem of identification of the parameters of a solar-heated house model is discussed. The system operates in such a way that the sun heats the air in a solar panel, and this air is then fanned into a heat storage. The stored energy can later be transferred to the house. The model of how the storage temperature y_p(k) is affected by the fan control u(k) and the solar intensity I(k) is given in [22, Example 5.1] as

y_p(k) = (1 - θ_1)y_p(k-1) + (1 - θ_2)y_p(k-1)u(k-1)u(k-2) + θ_2(θ_1 - θ_2)y_p(k-2)u(k-1)u(k-2) + θ_2 θ_3 u(k-1)I(k-1) - θ_4 u(k-1)y_p(k-1) + θ_4(1 + θ_2)u(k-2)y_p(k-2), (54)

where y_p(k), u(k), I(k) are measurable scalar variables and θ := col(θ_1, …, θ_4) is a vector of constant, unknown, physical parameters of the system to be estimated. See [22, Example 5.1] for an explanation of the physical meaning of the parameters θ. Defining

Ω^\top(k) := col\big(y_p(k-1),\; y_p(k-1)u(k-1)u(k-2),\; y_p(k-2)u(k-1)u(k-2),\; u(k-1)I(k-1),\; u(k-1)y_p(k-1),\; u(k-2)y_p(k-2)\big),

S(θ) := col\big({-θ_1},\; {-θ_2},\; θ_2(θ_1 - θ_2),\; θ_2 θ_3,\; {-θ_4},\; θ_4(1 + θ_2)\big), (55)

the model (54) can be rewritten as the NLPRE (1) that, as shown below, verifies the required assumptions for the direct estimation of θ. Lemma 3.
The NLPRE (1), (55) verifies Assumptions 1 and 2 with the mapping D : R^4 → R^4,

η = D(θ) = col(-θ_1, -θ_2, θ_2 θ_3, -θ_4),

with right inverse D^I : R^4 → R^4,

θ = D^I(η) = col\Big({-η_1}, {-η_2}, {-\frac{η_3}{η_2}}, {-η_4}\Big), (56)

a suitable permutation matrix T, P = I_4, and the constants ν = 1, ρ = 2 and κ = 3. Proof.
Compute the mapping

W(η) := S(D^I(η)) = col\big(η_1,\; η_2,\; η_2(η_1 - η_2),\; η_3,\; η_4,\; η_4(η_2 - 1)\big),

and the matrix C := [I_4 \mid 0_{4×2}]T. Hence the "good" mapping is

G(η) = col(W_1(η), W_2(η), W_4(η), W_5(η)) = η,

with obvious Jacobian ∇G(η) = I_4, which clearly satisfies (8) and (17) with the constants ν = 1 and ρ = 2, respectively. □
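The mappings of Lemma 3 can also be checked numerically. The sketch below assumes the index pattern η = D(θ) = col(−θ₁, −θ₂, θ₂θ₃, −θ₄) and S(θ) = col(−θ₁, −θ₂, θ₂(θ₁−θ₂), θ₂θ₃, −θ₄, θ₄(1+θ₂)), one consistent reading of (55)-(56); the test values are arbitrary:

```python
import numpy as np

def D(theta):
    t1, t2, t3, t4 = theta
    return np.array([-t1, -t2, t2 * t3, -t4])

def D_inv(eta):
    e1, e2, e3, e4 = eta
    return np.array([-e1, -e2, -e3 / e2, -e4])

def S(theta):
    t1, t2, t3, t4 = theta
    return np.array([-t1, -t2, t2 * (t1 - t2), t2 * t3, -t4, t4 * (1 + t2)])

theta = np.array([0.3, 0.7, 1.2, 0.5])    # arbitrary test values
eta = D(theta)
assert np.allclose(D_inv(eta), theta)     # D^I inverts D
W = S(D_inv(eta))
# "good" mapping: components 1, 2, 4, 5 of W reproduce eta, so G(eta) = eta
assert np.allclose(W[[0, 1, 3, 4]], eta)
```

Since G(η) = η, the Jacobian is the identity, which trivially satisfies the strong monotonicity condition (8).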
In [22, Fig. 1.4] an experimental record of the signals y_p(k), u(k), I(k) over a 16-hour period, sampled every 10 minutes, is given. The solar intensity I(k) changes periodically with a decaying form from the beginning till the end of the day, while the fan control u(k) acts like a pulse signal with only two possible values. For simulation purposes a similar behavior of these signals was recreated and is presented in Fig. 3. Figure 3:
External signals I(k) and u(k) used for the simulation of the solar-heated house model.

The DREM-based estimator of Propositions 2 and 4, with the filter pole at α = 0.… and adaptation gain γ = 1, was simulated. To comply with (20) we fixed κ = 3. The value of the system parameters used in the simulations was θ_i = 0.…, i = 1, …, 4, and the estimator initial conditions were chosen as \hat η_i(0) = η_i − 0.…, i = 1, …, 4. The transient behavior of the parameter estimation errors \tilde η_i(k) is presented in Fig. 4. The plot shows that convergence is achieved after the second pulse in u(k). Also, although not predicted by the theory, we observe a monotonic behavior of each error signal. Using the inverse transformation (56) it is possible to calculate the estimates of the model parameters \hat θ_i(k), which are shown in Fig. 4 as well. It was observed that the behavior of the estimator remains unchanged for other values of these parameters and other initial conditions.
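The overparameterized scheme used below for comparison relies on the standard normalized gradient estimator Ŝ(k) = Ŝ(k−1) + Ω⊤(k)[γ + Ω(k)Ω⊤(k)]⁻¹[y(k) − Ω(k)Ŝ(k−1)]. A self-contained sketch on synthetic data (the regressor and the true parameter vector are made up for illustration):

```python
import numpy as np

def gradient_step(S_hat, omega, y, gamma=1.0):
    """One step of the normalized gradient estimator for y = omega @ S."""
    err = y - omega @ S_hat
    return S_hat + omega * err / (gamma + omega @ omega)

rng = np.random.default_rng(1)
S_true = np.array([0.5, -0.3])            # made-up "true" parameters
S_hat = np.zeros(2)
for k in range(2000):                     # sufficiently exciting regressor
    omega = rng.standard_normal(2)
    S_hat = gradient_step(S_hat, omega, omega @ S_true)
```

With an exciting regressor like this one the recursion converges to S_true; with the pulse-like u(k) of the solar-heated house, the same recursion converges to wrong values, as the comparison in Fig. 5 shows.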
Figure 4:
Transient behaviour of the estimation errors \tilde η_i(k) and \tilde θ_i(k) of the solar-heated house model using the DREM-based estimator.

In [22] it is proposed to overparameterize the NPRE to obtain a linear regression. As indicated there, the price that is paid is that the value of the physical parameters θ (which might be of interest in some applications) cannot be recovered from the knowledge of S(θ). Clearly, this is not the case for the proposed scheme, since θ can be calculated with the inverse transformation (56). In any case, for performance comparison purposes a simulation was carried out with the overparameterized model (55) using the standard gradient estimator

\hat S(k) = \hat S(k-
1) + \frac{Ω^\top(k)}{γ + Ω(k)Ω^\top(k)}\left[y(k) - Ω(k)\hat S(k-1)\right],

with γ = 1. Simulation results are shown in Fig. 5. As seen from the plots, the parameters converge faster than with the DREM estimator, but they converge to wrong values. Figure 5:
Transient behaviour of the estimation errors \tilde S_i(k) of the overparameterized solar-heated house model using the gradient estimator.

6.2 Adaptive Pole Placement Control of LTI Systems

We are interested in this subsection in the problem of APPC of LTI DT systems represented by the pulse transfer function

A(q^{-1})y_p(k) = B(q^{-1})u(k), (57)

where the polynomials

A(q^{-1}) = 1 + a_1 q^{-1} + \cdots + a_{n_A} q^{-n_A}, \quad B(q^{-1}) = b_0 + b_1 q^{-1} + \cdots + b_{n_B} q^{-n_B},

are coprime, with a known upper bound on their order, say v, but with unknown coefficients a_i, b_i. The pole-placement problem consists of designing a controller

L(q^{-1})u(k) + P(q^{-1})y_p(k) = r(k) (58)

such that the closed-loop system takes the form

y_p(k) = \frac{B(q^{-1})}{A_m(q^{-1})}r(k),

where r(k) is a bounded external signal and

A_m(q^{-1}) = 1 + a_1^m q^{-1} + \cdots + a_{n_{A_m}}^m q^{-n_{A_m}}

is a desired closed-loop polynomial whose roots are inside the unit circle. That is, the controller relocates the poles of the system to desired positions but preserves the open-loop zeros. For a lucid exposition of this problem see [12, Section 5.3] and [29] for a review of the recent literature. Computing (57) in closed loop with (58) we get
1, there exists unique polynomials L ( q − ) and P ( q − ), both of order ( v − S ( a i , b i ) η = col( a m , a m , . . . , a m v − ) , (61)where η := col( l , l , . . . , l v − , p , p , . . . , p v − ) (62)and S ( a i , b i ) ∈ R v × v —called the Sylvester matrix—is linearly dependent on the coefficients a i , b i , and is full rank if and only if A ( q − ) and B ( q − ) are coprime.It is well-known that the adaptive version of the previous controller, called APPC, suffers from seriousdrawbacks [12, 29]. In its indirect version—that is when we estimate the parameters of the plant a i , b i andthen compute from them, via the solution of (61), the parameters of the controller l i , p i —the problem isthat the Sylvester matrix with the estimated parameters ˆ a i ( k ) , ˆ b i ( k ) may loose rank during the transientbehavior. Although this phenomenon can be avoided adding parameter projections, the prior knowledgerequired to implement this efficiently is never available in practice and relies on the availability of PE, see[29, Section 1]. 23n the other hand, in its direct version the estimation of the controller parameters involves a NPRE.Indeed, applying (60) to the output of the plant y p ( k ) we get L ( q − ) A ( q − ) y p ( k ) + P ( q − ) B ( q − ) y p ( k ) = A m ( q − ) y p ( k ) ⇔ L ( q − ) B ( q − ) u ( k ) + P ( q − ) B ( q − ) y p ( k ) = A m ( q − ) y p ( k ) ⇔ B ( q − )[ L ( q − ) u ( k ) + P ( q − ) y p ( k )] = A m ( q − ) y p ( k ) , (63)where we invoked (57) to get the second equation. The known parameter version of the direct pole-placement controller may be written in the LRE form u ( k ) + η ⊤ ψ ( k ) = r ( k )where we have used the fact that L ( q − ) is monic and defined ψ ( k ) := col( y p ( k ) , . . . , y p ( k − v + 1) , u ( k − , . . . , u ( k − v + 1)) ∈ R v − , with η , as defined in (62), contains the unknown coefficients of the polynomials L ( q − ) and P ( q − ). 
A direct adaptive implementation of this controller then takes the form

u(k) + \hat η^\top(k)ψ(k) = r(k),

where \hat η(k) denotes the estimate of η. The difficulty of designing an estimator for the controller parameters η stems from the fact that, in terms of η, (63) defines a parameterization of the form

B(q^{-1})[u(k) + η^\top ψ(k)] = A_m(q^{-1})y_p(k) =: \bar y_p(k), (64)

which is bilinear because the polynomial B(q^{-1}) is unknown. In the next two subsubsections we show that, using the results reported in the paper, it is possible to overcome the two obstacles mentioned above. To simplify the presentation we illustrate this fact with simple representative examples that can be easily extended to the general case.

Consider the LTI DT system

y_p(k+1) + θy_p(k) = u(k) + θ^3 u(k-1), (65)

where, to ensure the coprimeness assumption, θ ≠ ±
1. Fixing a dead-beat objective, e.g., A_m(q^{-1}) = 1, and selecting L(q^{-1}) = l_0 + l_1 q^{-1} and P(q^{-1}) = p_0 + p_1 q^{-1}, the Bezout equation (60) takes the form

(1 + θq^{-1})(l_0 + l_1 q^{-1}) + q^{-1}(1 + θ^3 q^{-1})(p_0 + p_1 q^{-1}) = 1. (66)

The latter can be rewritten as

\begin{bmatrix} 1 & 0 & 0 & 0 \\ θ & 1 & 1 & 0 \\ 0 & θ & θ^3 & 1 \\ 0 & 0 & 0 & θ^3 \end{bmatrix}\begin{bmatrix} l_0 \\ l_1 \\ p_0 \\ p_1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, (67)

whose solution is l_0 = 1, p_1 = 0 and

\begin{bmatrix} 1 & 1 \\ θ & θ^3 \end{bmatrix}\begin{bmatrix} l_1 \\ p_0 \end{bmatrix} = \begin{bmatrix} -θ \\ 0 \end{bmatrix}, (68)

which corresponds to

\begin{bmatrix} l_1 \\ p_0 \end{bmatrix} = \frac{1}{θ^2 - 1}\begin{bmatrix} -θ^3 \\ θ \end{bmatrix}.

The resulting control law is

u(k) = -\frac{1}{θ^2 - 1}\left[θy_p(k) - θ^3 u(k-1)\right] + r(k) (69)

and yields the desired closed-loop system y_p(k) = q^{-1}(1 + θ^3 q^{-1})r(k). Obviously, the system admits an NPRE of the form (1) with

\bar y_p(k) := y_p(k) - u(k-1), \quad Ω(k) := \begin{bmatrix} -y_p(k-1) & u(k-2) \end{bmatrix}, \quad S(θ) := \begin{bmatrix} θ \\ θ^3 \end{bmatrix}. (70)

If we overparametrize the NPRE and estimate the vector S ∈ R^2, the controller parameters are computed from

\begin{bmatrix} 1 & 1 \\ \hat S_1(k) & \hat S_2(k) \end{bmatrix}\begin{bmatrix} \hat l_1(k) \\ \hat p_0(k) \end{bmatrix} = \begin{bmatrix} -\hat S_1(k) \\ 0 \end{bmatrix}, (71)

which yields the adaptive controller

u(k) = -\frac{1}{\hat S_2(k) - \hat S_1(k)}\left[\hat S_1^2(k)y_p(k) - \hat S_1(k)\hat S_2(k)u(k-1)\right] + r(k). (72)

Clearly, the controller computation has a singularity on the line \hat S_1(k) = \hat S_2(k). On the other hand, if we estimate θ, the adaptive version of (69) has a singularity only at the points \hat θ(k) = ±1.

The simulations were carried out with a changing parameter θ that switches from one constant value to another of opposite sign at a given time instant, the external signal r(k) being a sinusoidal function. The initial conditions of the estimators were taken as \hat θ(0) = \hat S_1(0) = 0.… and \hat S_2(0) = 0.…. Before the switching instant S_1(θ) < S_2(θ), and after it S_1(θ) > S_2(θ). Therefore, if the estimates \hat S(k) converge they have to cross the singularity line. On the other hand, the DREM-based scheme should not leave the singularity-free region θ ∈ (−
1, 1) because of the monotonicity property.

The simulation results for the DREM-based estimation of θ with γ = 1 and κ = 2 are presented in Fig. 6. As seen from the figure, the controller parameter error converges to zero and the estimated parameter \hat θ(k) does not leave the singularity-free region θ ∈ (−1, 1). The tracking error e(k) := y_p(k) − B(q^{-1})r(k) also converges to zero in the closed-loop system. Figure 6:
Transient behaviour of the system's switched parameter θ, its estimate \hat θ(k), the estimation error \tilde θ(k) and the tracking error e(k) in the indirect APPC task using the DREM-based estimator.

For performance comparison a simulation was completed with the overparameterized model (70) using the standard gradient estimator

\hat S(k) = \hat S(k-
1) + \frac{Ω^\top(k)}{γ + Ω(k)Ω^\top(k)}\left[y(k) - Ω(k)\hat S(k-1)\right],

with γ = 1 and the adaptive controller (71). The simulation results are presented in Figs. 7 and 8. As seen from Fig. 7, the estimated parameters cross the singularity line S_1 = S_2. However, due to the DT nature of the equations, they "jump" through it without inducing an unacceptable transient behavior in the control calculation, a coincidence that, of course, cannot be theoretically predicted. As seen from Fig. 8, parameter and tracking error convergence is twice as slow as that of the DREM estimator. Figure 7:
Transient behaviour of the estimated parameters \hat S_1(k) and \hat S_2(k) and the singularity line S_1 = S_2 in the plane S_1–S_2. Figure 8:
Transient behaviour of the switching parameters of the system S_i(θ), their estimates \hat S_i(k), the estimation errors \tilde S_i(k) and the tracking error e(k) in the overparameterized indirect APPC.

6.2.3 DREM-based direct APPC

In this subsubsection we illustrate with a simple example how the DREM-based direct APPC avoids the bilinearity problem mentioned in Subsection 6.2. Towards this end, consider the DT system (57) with

A(q^{-1}) = 1 + a_1 q^{-1}, \quad B(q^{-1}) = b_1 q^{-1} + b_2 q^{-2},

and choose a deadbeat control objective, that is, A_m(q^{-1}) = 1. Since v = 2 the known parameter control law (58) takes the form

(1 + l_1 q^{-1})u(k) + (p_0 + p_1 q^{-1})y_p(k) = r(k).

Hence (63) becomes

(b_1 q^{-1} + b_2 q^{-2})[(1 + l_1 q^{-1})u(k) + (p_0 + p_1 q^{-1})y_p(k)] = y_p(k).

By solving equation (60) it is easy to see that p_1 = 0, reducing the equation above to the form

(b_1 q^{-1} + b_2 q^{-2})[(1 + l_1 q^{-1})u(k) + p_0 y_p(k)] = y_p(k).

Some simple calculations show that the latter may be written in the 5-dimensional LRE form

y_p(k) = Ω^\top(k)S(θ), (73)

where

θ := col(b_1, b_2, p_0, l_1),
Ω(k) := col(y_p(k-1), y_p(k-2), u(k-1), u(k-2), u(k-3)),
S(θ) := col(θ_1 θ_3, θ_2 θ_3, θ_1, θ_1 θ_4 + θ_2, θ_2 θ_4). (74)

The bijective mapping D : R^4 → R^4,

η = D(θ) = col(θ_1, θ_1 θ_3, θ_2 θ_3, θ_2 θ_4),

with right inverse D^I : R^4 → R^4,

θ = D^I(η) = col\Big(η_1, \frac{η_1 η_3}{η_2}, \frac{η_2}{η_1}, \frac{η_2 η_4}{η_1 η_3}\Big), (75)

verifies Assumption 1 with a suitable permutation matrix T and P = I_4. Indeed, computing the mapping

W(η) := S(D^I(η)) = col\Big(η_2, η_3, η_1, \frac{η_2^2 η_4 + η_1 η_3^2}{η_2 η_3}, η_4\Big),

and the matrix C := [I_4 \mid 0_{4×1}]T, we get the "good" mapping

G(η) = col(W_3(η), W_1(η), W_2(η), W_5(η)) = η,

whose Jacobian is ∇G(η) = I_4, which clearly satisfies (8) and (17) with the constants ν = 1 and ρ = 2, respectively.

Conclusions
It has been shown that the DREM procedure can be used to estimate the parameters of a CT or DT NPRE of the form (1), provided the "monotonizability" Assumption 1 holds and some weak excitation conditions, encrypted in the scalar signal Δ, are satisfied. The applicability of the method has been illustrated with several classical examples. We are currently pursuing the following research avenues.

R1 As indicated in Remark 12, the highly attractive parameterization of EL systems proposed in [33] seems to yield a non-identifiable NPRE. A rigorous proof of this claim is yet to be established.

R2 Although the DREM estimator has a few tuning gains, e.g., the filter constants (λ for CT, and α for DT) and the adaptation gain γ, their impact on the transient behavior is hard to predict (see Subsubsection 5.2.5). A more thorough analysis of the sensitivity of the design vis-à-vis these coefficients is yet to be derived.

R3 Although avoiding overparameterization to handle NPRE seems, in principle, a sensible objective, it is not clear under which conditions this approach is really more convenient, particularly considering that it is, until now, only applicable to "monotonizable" NPRE.

R4 The verification of the conditions of Proposition 1 is carried out in our examples via direct inspection. A deeper understanding of the underlying structural features of the mapping S(θ) under which this is possible would be highly desirable. It seems that such a study should appeal to principles of differential algebra.

Acknowledgment
This paper is partly supported by the Ministry of Education and Science of the Russian Federation (14.Z50.31.0031, goszadanie no. 8.8885.2017/8.9), by NSFC (61473183, U1509211) and by the Mexican CONACyT Basic Scientific Research grant CB-282807.
References

[1] A. Annaswamy, F. P. Skantze and A. P. Loh, Adaptive control of continuous-time systems with convex/concave parametrizations, Automatica, vol. 34, pp. 33-49, 1998.
[2] S. Aranovskiy, A. Bobtsov, R. Ortega and A. Pyrkin, Performance enhancement of parameter estimators via dynamic regressor extension and mixing, IEEE Trans. Automatic Control, vol. 62, pp. 3546-3550, 2017. (See also arXiv:1509.02763 for an extended version.)
[3] A. Astolfi, D. Karagiannis and R. Ortega, Nonlinear and Adaptive Control Design with Applications, Springer-Verlag, London, 2007.
[4] G. Bastin and D. Dochain, On-line Estimation and Adaptive Control of Bioreactors, Elsevier, Amsterdam, 1990.
[5] A. Berman and R. Plemmons, Nonnegative Matrices in the Mathematical Sciences, SIAM, 1979.
[6] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, New York, 2004.
[7] S. Dasgupta and B. D. O. Anderson, Physically based parameterizations for designing adaptive algorithms, Automatica, vol. 23, no. 4, pp. 469-477, 1987.
[8] B. P. Demidovich, Dissipativity of nonlinear systems of differential equations,
Vestnik Moscow State University, Ser. Mat. Mekh., Part I-6, (1961) pp. 19-27; Part II-1, (1962), pp. 3-8 (in Russian).
[9] V. Fomin, A. Fradkov and V. Yakubovich, Adaptive Control of Dynamical Systems, Nauka, Moscow, 1981 (in Russian).
[10] A. Fradkov, R. Ortega and G. Bastin, Semi-adaptive control of convexly parametrized systems with application to temperature regulation of chemical reactors, Int. J. of Adaptive Control and Signal Processing, vol. 15, pp. 415-426, 2001.
[11] D. N. Gerasimov, M. E. Belyaev and V. O. Nikiforov, Performance improvement of discrete MRAC by dynamic and memory regressor extension, European Control Conference (ECC'19), Naples, Italy, June 25-28, 2019.
[12] G. Goodwin and K. S. Sin, Adaptive Filtering Prediction and Control, Prentice Hall, Englewood Cliffs, N.J., 1984.
[13] J.-B. Hiriart-Urruty and C. Lemaréchal,
Fundamentals of Convex Analysis, Springer, London, 2001.
[14] P. A. Ioannou and J. Sun, Robust Adaptive Control, Prentice Hall, 1996.
[15] E. Izhikevich, Dynamical Systems in Neuroscience: the Geometry of Excitability and Bursting, MIT Press, USA, 2007.
[16] H. K. Khalil, Nonlinear Systems, Third Edition, Prentice Hall, 2002.
[17] P. Khosla and T. Kanade, Parameter identification of robot dynamics, in Proc. IEEE Conference on Decision and Control, Ft. Lauderdale, FL, USA, December 1985.
[18] M. Korotina, S. Aranovskiy, R. Ushirobina and A. Vedyakov, On parameter tuning and convergence properties of the DREM procedure, in Proc. European Control Conference, Saint-Petersburg, Russia, 2020 (submitted).
[19] G. Kreisselmeier, Adaptive observers with exponential rate of convergence,
IEEE Trans. Automatic Control, vol. 22, no. 1, pp. 2-8, 1977.
[20] X. Liu, R. Ortega, H. Su and J. Chu, Immersion and invariance adaptive control of nonlinearly parameterized nonlinear systems, IEEE Trans. Automatic Control, vol. 55, no. 9, pp. 2209-2214, 2010.
[21] X. Liu, R. Ortega, H. Su and J. Chu, On adaptive control of nonlinearly parameterized nonlinear systems: Towards a constructive procedure, Systems and Control Letters, vol. 10, pp. 36-43, 2011.
[22] L. Ljung, System Identification: Theory for the User, Prentice Hall, New Jersey, 1987.
[23] O. Nelles, Nonlinear System Identification, Springer-Verlag, Berlin, 2001.
[24] M. Netto, A. Annaswamy, R. Ortega and P. Moya, Adaptive control of a class of nonlinearly parametrized systems using convexification, Int. J. of Control, vol. 73, no. 14, pp. 1312-1321, 2000.
[25] M. Netto, A. Annaswamy, S. Mammar and N. Minoiu, A new adaptive control algorithm for systems with multilinear parametrization, in Taming Heterogeneity and Complexity of Embedded Control, Eds. F. Lamnabhi et al., ISTE Ltd, London, pp. 505-522, 2006.
[26] R. Ortega, Some remarks on adaptive neuro-fuzzy systems,
Intern. J. Adaptive Control and Signal Processing, vol. 10, pp. 79-83, 1996.
[27] R. Ortega, A. Loria, P. J. Nicklasson and H. Sira-Ramirez, Passivity-Based Control of Euler-Lagrange Systems, Springer-Verlag, Berlin, Communications and Control Engineering, 1998.
[28] A. Pavlov, A. Pogromsky, N. van de Wouw and H. Nijmeijer, Convergence dynamics, a tribute to Boris Pavlovich Demidovich, Systems & Control Letters, vol. 52, pp. 257-261, 2004.
[29] A. Pyrkin, R. Ortega, V. Gromov, A. Bobtsov and A. Vedyakov, A globally convergent direct adaptive pole-placement controller for nonminimum phase systems with relaxed excitation assumptions, Int. J. on Adaptive Control and Signal Processing, vol. 33, no. 10, pp. 1457-1600, 2019.
[30] W. Rudin,
Principles of Mathematical Analysis, 3rd Ed., McGraw-Hill, Inc., NY, 1976.
[31] S. Sastry and M. Bodson, Adaptive Control: Stability, Convergence and Robustness, Prentice Hall, Englewood Cliffs, N.J., 1989.
[32] J. J. E. Slotine and W. Li, Adaptive manipulator control: a case study, IEEE Trans. Automatic Control, vol. 33, no. 11, pp. 995-1003, 1988.
[33] J. J. E. Slotine and W. Li, Composite adaptive control of robot manipulators, Automatica, vol. 25, no. 4, pp. 509-519, 1989.
[34] M. Spong, R. Ortega and R. Kelly, Comments on adaptive manipulator control: a case study, IEEE Trans. Aut. Cont. (Correspondence), vol. 35, no. 6, pp. 761-762, 1990.
[35] I. Y. Tyukin, D. V. Prokhorov and V. A. Terekhov, Adaptive control with nonconvex parameterization, IEEE Trans. Automatic Control, vol. 48, no. 4, pp. 554-567, 2003.
[36] I. Y. Tyukin, D. V. Prokhorov and C. V. Leeuwen, Adaptation and parameter estimation in systems with unstable target dynamics and nonlinear parameterization, IEEE Trans. Automatic Control, vol. 52, no. 9, pp. 1543-1559, 2007.
[37] A. van der Schaft, L2-Gain and Passivity Techniques in Nonlinear Control, Springer-Verlag, London.