Schur functions for approximation problems
Nadezda Sukhorukova and Julien Ugon

October 5, 2018
Abstract
In this paper we propose a new approach to least squares approximation problems. The approach is based on partitions and Schur functions. Its nature is combinatorial, while most existing approaches are based on algebra and algebraic geometry. The problem has several practical applications. One of them is curve clustering, and we use this application to illustrate the results.
1 Introduction

In this paper we formulate a specific least squares approximation problem and present a signal processing application in which this problem arises. The main technical difficulty for this problem is the need to solve linear systems with the same system matrix and different right-hand sides. One simple approach is to invert the system matrix once and multiply each updated right-hand side by this inverse. In general, it is not very efficient to solve linear systems through computing matrix inverses, but in this particular application it is very beneficial. One technical difficulty is knowing in advance whether the system matrix is invertible. Similar problems appear in Chebyshev (uniform) approximation problems as well.

In this paper we suggest a new approach for dealing with this kind of system. The approach is based on Schur functions, a well-established tool used to describe partitions [3]. The very nature of these functions is combinatorial. Based on our previous experience [1], the characterisation of the necessary and sufficient optimality conditions for multivariate Chebyshev approximation is also combinatorial, and therefore Schur functions are a very natural tool for working with these problems.

This paper is organised as follows. In section 2 we introduce a signal processing application that relies on approximation and optimisation. In section 3 we provide a mathematical formulation of the signal processing problem and discuss how it can be simplified. In section 4 we introduce an innovative approach, based on Schur functions, for solving the problem. Finally, in section 5 we provide future research directions.
2 Signal clustering

In signal processing, there is a need for constructing signal prototypes. Signal prototypes are summary curves that may replace a whole group of signal segments when the signals in the group are believed to be similar to each other. Signal prototypes may be used for characterising the structure of the signal segments and also for reducing the amount of information to be stored. Any signal group prototype should be an accurate approximation of each member of the group. On top of this, it is desirable that the process of recomputing group prototypes, when new group members are available, is not computationally expensive.

In this paper we suggest a model based on k-means and least squares approximation. Similar models are proposed in [4]. This is a convex optimisation problem. The model has several advantages. First of all, it provides an accurate approximation to the group of signals. Second, the prototype can be obtained as the solution to a linear system and can therefore be computed efficiently. Finally, the proposed approach allows one to compute prototype updates without recomputing from scratch.

3 Mathematical formulation

Assume that there is a group of $l$ signals $S_1(t),\dots,S_l(t)$, whose values are measured at discrete time moments $t_1,\dots,t_N$, $t_i \in [a,b]$, $i = 1,\dots,N$. We suggest constructing the prototype as a polynomial
$$P_n(X,t) = \sum_{i=0}^{n} x_i t^i$$
of degree $n$, whose least squares deviation from each member of the group on $[a,b]$ is minimal. That is, one has to solve the following optimisation problem:
$$\text{minimise } F(X) = \sum_{i=1}^{N} \sum_{j=1}^{l} \left(S_j(t_i) - P_n(X,t_i)\right)^2, \qquad (3.1)$$
where $X = (x_0,\dots,x_n)^T \in \mathbb{R}^{n+1}$; the coefficients $x_k$, $k = 0,\dots,n$, are the polynomial parameters and also the decision variables. Each signal is a column vector $S_j = (S_j(t_1),\dots,S_j(t_N))^T$, $j = 1,\dots,l$.

Problem (3.1) can be formulated in the following matrix form:
$$\text{minimise } F(X) = \|Y - \bar{B}X\|^2, \qquad (3.2)$$
where $X = (x_0,\dots,x_n)^T \in \mathbb{R}^{n+1}$ are the decision variables (same as in (3.1)); the vector
$$Y = \begin{pmatrix} S_1 \\ S_2 \\ \vdots \\ S_l \end{pmatrix} \in \mathbb{R}^{Nl};$$
and the matrix $\bar{B}$ contains repeated blocks, namely,
$$\bar{B} = \begin{pmatrix} B \\ B \\ \vdots \\ B \end{pmatrix}, \quad \text{where } B = \begin{pmatrix} 1 & t_1 & t_1^2 & \dots & t_1^n \\ 1 & t_2 & t_2^2 & \dots & t_2^n \\ \vdots & \vdots & \ddots & \vdots \\ 1 & t_N & t_N^2 & \dots & t_N^n \end{pmatrix}.$$

This least squares problem can be solved using the system of normal equations:
$$\bar{B}^T \bar{B} X = \bar{B}^T Y. \qquad (3.3)$$
Taking into account the structure of the system matrix in (3.3), the problem can be significantly simplified:
$$l\, B^T B X = B^T \sum_{k=1}^{l} S_k. \qquad (3.4)$$
Therefore, instead of solving (3.3), one can solve
$$B^T B X = B^T \frac{\sum_{k=1}^{l} S_k}{l} = B^T \bar{S}, \qquad (3.5)$$
where $\bar{S}$ is the average of all $l$ signals of the group (the centroid).

Suppose that a signal group prototype has been constructed. Assume now that we need to update our group of signals: some new signals have to be included, while some others are to be excluded. To update the prototype, one needs to update the centroid and solve (3.5) with the updated right-hand side, while the system matrix $B^T B$ remains the same. If only a few signals are moving in and out of the group, the updated centroid can be calculated without recomputing from scratch. Assume that $l_a$ signals are moving into the group (signals $S^a_1(t),\dots,S^a_{l_a}(t)$), while $l_r$ are moving out (signals $S^r_1(t),\dots,S^r_{l_r}(t)$). Then the centroid can be recalculated as follows:
$$\bar{S}_{\text{new}}(t) = \frac{l\,\bar{S}_{\text{old}}(t) + \sum_{k=1}^{l_a} S^a_k(t) - \sum_{k=1}^{l_r} S^r_k(t)}{l - l_r + l_a}.$$
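To make the computation concrete, the following sketch implements the normal equations (3.5) and the incremental centroid update, assuming NumPy is available; the helper names (design_matrix, fit_prototype, update_centroid) and the synthetic data are ours, not from the paper.

```python
import numpy as np

def design_matrix(t, n):
    """N x (n+1) matrix B with rows (1, t_i, t_i^2, ..., t_i^n)."""
    return np.vander(t, n + 1, increasing=True)

def fit_prototype(B, S_bar):
    """Solve the normal equations (3.5): B^T B X = B^T S_bar."""
    return np.linalg.solve(B.T @ B, B.T @ S_bar)

def update_centroid(S_old, l, added, removed):
    """Incremental centroid update; added/removed are lists of signal vectors."""
    l_new = l + len(added) - len(removed)
    S_new = (l * S_old + sum(added) - sum(removed)) / l_new
    return S_new, l_new

# Toy example with synthetic signals
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)                 # time moments t_1, ..., t_N
signals = [np.sin(2 * t) + 0.05 * rng.standard_normal(t.size) for _ in range(6)]

B = design_matrix(t, n=3)
S_bar = np.mean(signals, axis=0)              # centroid of the group
X = fit_prototype(B, S_bar)                   # prototype coefficients

# One signal leaves the cluster, one joins: update without recomputing from scratch
new_signal = np.cos(3 * t) + 0.05 * rng.standard_normal(t.size)
S_bar, l = update_centroid(S_bar, len(signals), added=[new_signal], removed=[signals[0]])
X = fit_prototype(B, S_bar)                   # same system matrix, new right-hand side
```

Here np.linalg.solve performs a single solve; the strategy of inverting (or factorising) $B^T B$ once for repeated updates is discussed next.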
Since the same system has to be solved repeatedly with different right-hand sides, one approach is to invert the matrix $B^T B$, which is an $(n+1) \times (n+1)$ matrix. In most cases, $n$ is much smaller than $N$ or $l$, and therefore this approach is quite attractive, provided we can guarantee that the matrix $B^T B$ is invertible. In the next section we discuss the verification of this property.

4 Schur functions

Consider the matrix $B^T B$. In general, the matrix $B$ can be defined as follows:
$$B = \begin{pmatrix} g_1(t_1) & g_2(t_1) & \dots & g_{n+1}(t_1) \\ g_1(t_2) & g_2(t_2) & \dots & g_{n+1}(t_2) \\ \vdots & \vdots & \ddots & \vdots \\ g_1(t_N) & g_2(t_N) & \dots & g_{n+1}(t_N) \end{pmatrix},$$
where $g_i$, $i = 1,\dots,n+1$, are basis functions. In section 3 we discussed polynomial approximation, so the entries of $B$ are monomials evaluated at the time moments $t_1,\dots,t_N$. Recall that $n + 1 \ll N$. The matrix $B^T B$ is invertible if and only if $B$ has exactly $n+1$ linearly independent rows. This is always the case when the functions $g_i$, $i = 1,\dots,n+1$, form a Chebyshev system (for example, the monomials $g_i = t^{i-1}$, or certain systems of trigonometric functions). It is not always the case when, for example, some of the monomials are "missing" from the system.
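A direct numerical way to test this property is to form $B$ for the chosen basis and check its rank. The sketch below is illustrative only, assuming NumPy; the helper basis_matrix is ours. It uses the basis $\{1, t^2\}$ with the monomial $t$ omitted, mirroring the example that follows.

```python
import numpy as np

def basis_matrix(t, basis):
    """Build B with columns g_1(t), ..., g_{n+1}(t) for given basis functions."""
    return np.column_stack([g(t) for g in basis])

# Basis with a "missing" monomial: {1, t^2} (the monomial t is omitted)
basis = [lambda t: np.ones_like(t), lambda t: t**2]

t_sym = np.array([-1.0, 1.0])     # symmetric points: t_2 = -t_1
t_gen = np.array([0.5, 1.0])      # generic points on [-1, 1]

for t in (t_sym, t_gen):
    B = basis_matrix(t, basis)
    rank = np.linalg.matrix_rank(B)
    print(t, "rank:", rank, "B^T B invertible:", rank == B.shape[1])
# The symmetric pair gives rank 1 (singular), the generic pair gives rank 2.
```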
This situation is illustrated in the following example.

Example 4.1

Consider the system of two monomials on the segment $[-1, 1]$: $g_1(t) = 1$, $g_2(t) = t^2$; the monomial $t$ is "missing". Take distinct time moments $t_1, t_2 \in [-1, 1]$. The determinant
$$\begin{vmatrix} 1 & t_1^2 \\ 1 & t_2^2 \end{vmatrix} = 0 \iff t_2 = -t_1.$$
Therefore, these functions do not form a Chebyshev system, since the corresponding determinant is zero when, for example, $t_1 = -t_2 = 1$, and there is then only one linearly independent row.

Recall that in the case of classical polynomial approximation (all monomials are included in the set of basis functions), the corresponding determinant is non-zero for distinct time moments, as it is the determinant of a Vandermonde matrix. We now need to introduce so-called generalised Vandermonde matrices.
Definition 4.1
Generalised Vandermonde matrices have the following structure:
$$G = \begin{pmatrix} t_1^{m_0} & t_2^{m_0} & \dots & t_{n+1}^{m_0} \\ t_1^{m_1} & t_2^{m_1} & \dots & t_{n+1}^{m_1} \\ \vdots & \vdots & \ddots & \vdots \\ t_1^{m_n} & t_2^{m_n} & \dots & t_{n+1}^{m_n} \end{pmatrix}.$$

Denote
$$m_0 = \lambda_0 + n, \quad m_1 = \lambda_1 + n - 1, \quad \dots, \quad m_n = \lambda_n + n - n = \lambda_n. \qquad (4.1)$$
Define the following function:
$$s_\lambda(t_1,\dots,t_{n+1}) = \frac{\det(G)}{\det(V)}, \qquad (4.2)$$
where $G$ is the matrix with one or more missing monomials and $V$ is the Vandermonde matrix; Vandermonde matrices correspond to $\lambda_0 = \lambda_1 = \dots = \lambda_n = 0$.

The function $s_\lambda(t_1,\dots,t_{n+1})$ is called a Schur function, named after Issai Schur. Schur polynomials are certain symmetric polynomials in $n+1$ variables; they are used in representation theory and the theory of partitions. A good introduction to Schur polynomials can be found in [3]. By (4.2),
$$\det(G) = s_\lambda(t_1,\dots,t_{n+1})\,\det(V),$$
and hence one needs to study the behaviour of Schur functions.
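As a sanity check of definition (4.2), the ratio $\det(G)/\det(V)$ can be evaluated symbolically. The following sketch, assuming SymPy and with a helper name of our own, takes $n = 2$ and $\lambda = (1, 0, 0)$, i.e. exponents $(3, 1, 0)$ (the monomial $t^2$ is missing), and recovers the Schur polynomial $t_1 + t_2 + t_3$.

```python
import sympy as sp

def schur_via_determinants(lam, ts):
    """Schur polynomial s_lambda as det(G)/det(V), the ratio (4.2)."""
    n1 = len(ts)                                        # n + 1 variables
    exps = [lam[i] + (n1 - 1 - i) for i in range(n1)]   # m_i = lambda_i + n - i
    G = sp.Matrix([[t**m for t in ts] for m in exps])   # generalised Vandermonde
    V = sp.Matrix([[t**m for t in ts] for m in range(n1 - 1, -1, -1)])  # Vandermonde
    return sp.simplify(sp.cancel(G.det() / V.det()))

t1, t2, t3 = sp.symbols("t1 t2 t3", positive=True)
s = schur_via_determinants([1, 0, 0], (t1, t2, t3))
print(s)                                                # t1 + t2 + t3
# All coefficients are non-negative, so s > 0 whenever all t_i > 0.
```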
Therefore, the following theorem holds.

Theorem 4.1

The matrix $B^T B$ is non-singular if and only if the corresponding Schur function (4.2) is non-zero.

In particular, if $t_i > 0$, $i = 1,\dots,n+1$, then the system is Chebyshev: Schur polynomials have non-negative coefficients, and hence they are strictly positive whenever all the $t_i$ are positive. Note that this statement can also be proven using a logarithmic transformation [2]. We believe, however, that our approach is applicable to more general settings.

There are many studies on Schur polynomials and many efficient ways of computing them. This approach can be used, for example, if one needs to know whether $B^T B$ is invertible. If the matrix is invertible, one can develop a very fast and efficient algorithm for curve cluster prototype updates. If the matrix is singular, one can use the singular value decomposition to construct the prototype updates. This decomposition can be computed once, since $B^T B$ remains unchanged when the cluster membership is updated.

5 Future research directions

There are many studies on how to compute Schur functions. We are particularly interested in extending this approach to Chebyshev (uniform) approximation and multivariate approximation. This is a very promising direction since, as our previous studies suggest [1], the corresponding optimality conditions are combinatorial in nature, and therefore Schur functions are a very natural tool for studying this kind of system. We are also planning to conduct a thorough numerical study of the signal processing application discussed in this paper.
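To close, here is a minimal sketch of the singular-case fallback described in section 4: compute the SVD of $B^T B$ once and reuse its pseudoinverse for every prototype update. NumPy is assumed; the helper name svd_prototype_solver and the toy data are ours.

```python
import numpy as np

def svd_prototype_solver(B, rcond=1e-12):
    """One-off SVD of B^T B; returns a solver reusable for every new centroid."""
    A = B.T @ B
    U, sigma, Vt = np.linalg.svd(A)            # A = U @ diag(sigma) @ Vt
    inv_sigma = np.zeros_like(sigma)
    keep = sigma > rcond * sigma.max()         # treat tiny singular values as zero
    inv_sigma[keep] = 1.0 / sigma[keep]
    A_pinv = Vt.T @ np.diag(inv_sigma) @ U.T   # Moore-Penrose pseudoinverse of A
    return lambda S_bar: A_pinv @ (B.T @ S_bar)  # minimum-norm solution of (3.5)

# Toy singular case: basis {1, t^2, t^4} at symmetric points gives rank-2 B.
t = np.array([-1.0, -0.5, 0.5, 1.0])
B = np.column_stack([np.ones_like(t), t**2, t**4])
solve = svd_prototype_solver(B)                # factorise once
X = solve(np.cos(t))                           # reuse for each updated centroid
```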
Acknowledgement

This paper was inspired by the discussions during a recent MATRIX program, "Approximation Optimisation and Algebraic Geometry", which took place in February 2018. We are thankful to the MATRIX organisers, support team and participants for a terrific research atmosphere and productive discussions.
References

[1] N. Sukhorukova, J. Ugon and D. Yost, Chebyshev multivariate polynomial approximation and exchange procedure, Editor-in-chief: David Wood; editors: Jan de Gier, Cheryl Praeger and Terence Tao, Springer International Publishing, 2018.

[2] S. Karlin and W. Studden, Tchebycheff systems, with applications in analysis and statistics, Interscience Publishers, New York, 1966.

[3] I. G. Macdonald, Symmetric functions and Hall polynomials, Clarendon Press / Oxford University Press, Oxford and New York, 1995.

[4] H. Späth, Cluster analysis algorithms for data reduction and classification of objects, Ellis Horwood Limited, Chichester, 1980.