High-dimensional nonlinear approximation by parametric manifolds in Hölder-Nikol'skii spaces of mixed smoothness
Dinh Dũng$^a$ and Van Kien Nguyen$^b$

$^a$Vietnam National University, Hanoi, Information Technology Institute, 144 Xuan Thuy, Cau Giay, Hanoi, Vietnam. Email: [email protected]

$^b$Faculty of Basic Sciences, University of Transport and Communications, No. 3 Cau Giay Street, Lang Thuong Ward, Dong Da District, Hanoi, Vietnam. Email: [email protected]

February 9, 2021
Abstract
We study high-dimensional nonlinear approximation of functions in Hölder-Nikol'skii spaces $H^\alpha_\infty(I^d)$ of mixed smoothness on the unit cube $I^d := [0,1]^d$ by parametric manifolds. The approximation error is measured in the $L_\infty$-norm. In this context, we explicitly construct methods of nonlinear approximation and give dimension-dependent estimates of the approximation error, explicit in the dimension $d$ and in the number $N$ measuring the computation complexity of the parametric manifold of approximants. For $d = 2$, we derive a novel right asymptotic order of the noncontinuous manifold $N$-widths of the unit ball of $H^\alpha_\infty(I^2)$ in the space $L_\infty(I^2)$. In the construction of the approximation methods, the decomposition of functions by the tensor product Faber series and special representations of its truncations on sparse grids play a central role.

Keywords and Phrases:
High-dimensional problem; Nonlinear approximation; Parametric manifold; Mixed smoothness; Sparse grids
Mathematics Subject Classifications (2020)
1 Introduction

Some problems in approximation theory and numerical analysis driven by numerous applications in Information Technology, Mathematical Finance, Chemistry, Quantum Mechanics, Meteorology, and, in particular, in Uncertainty Quantification and Deep Machine Learning, are formulated in high dimensions, when the number of involved variables is very large. Numerical methods for such problems may require a computational cost increasing exponentially in the dimension, which makes the computation intractable when the dimension of the input data is very large. Hyperbolic crosses and sparse grids promise to get rid of this "curse of dimensionality" in some problems when the high-dimensional data belong to certain classes of functions having mixed smoothness. Function spaces of mixed smoothness appear naturally in many models of real-world problems in mathematical physics, finance and other fields, for instance, in the regularity properties of eigenfunctions of the electronic Schrödinger operator [33], or in the existence of solutions of the Navier-Stokes equations when the initial data belong to spaces with mixed smoothness [29, Chapter 6]. Approximation methods and sampling algorithms for functions having mixed smoothness constructed on hyperbolic crosses and sparse grids give a surprising effect, since hyperbolic crosses and sparse grids have a number of elements much smaller than those of standard domains and grids, but give the same approximation error. This essentially reduces the computational cost, and therefore makes the problem tractable. Sparse grids for approximate sampling recovery and integration were first considered by Smolyak [24]. In computational mathematics, the sparse grid approach was initiated by Zenger [34]. There is by now a very large number of papers on hyperbolic cross and sparse-grid approximation and numerical applications, too many to count all of them. We refer the reader to [1, 11] for surveys and for recent further developments and results.
We also refer to the monographs [20, 21] for concepts and results on high-dimensional problems and computation complexity.

Let us mention some recent results on different aspects of the problem of dimension-dependent error estimation in high-dimensional approximation which are directly related to the present paper. The papers [2, 3, 4, 6, 7, 10, 16, 18] treat this problem for hyperbolic cross approximation of functions with mixed smoothness in terms of various $n$-widths and $\varepsilon$-dimensions. The authors of [6, 10], in particular, extended these problems to infinite-dimensional approximation with applications to stochastic and parametric PDEs. Preasymptotic estimation for high-dimensional problems was also treated in [16, 17, 18]. Related high-dimensional problems were studied in [22, 30] based on the ANOVA decomposition. The paper [11] investigated dimension-dependent estimates of the approximation error for linear algorithms of sampling recovery on Smolyak grids of functions from spaces with Hölder-Zygmund mixed smoothness. It proved upper and lower bounds, explicit in the dimension, for the error of optimal sampling recovery on Smolyak grids. All of the above-mentioned papers considered only linear problems of high-dimensional approximation.

While linear methods utilize approximation from finite-dimensional spaces, in nonlinear approximation the approximants do not come from linear spaces but rather from sets of nonlinear structure, such as nonlinear manifolds, sets of finite cardinality, etc. It is well understood that nonlinear methods of approximation, and numerical methods derived from them, often produce superior performance when compared with linear methods. Several notions of linear and nonlinear widths have been introduced to quantify the optimality of approximation methods. Let us recall some of them.

Let $X$ be a normed space, and let $F$ and $G$ be subsets of $X$. We consider the problem of approximation of $f \in F$ by elements $g \in G$.
The approximation error is measured by $\|f - g\|_X$. The worst-case error of the approximation of elements $f \in F$ by elements $g \in G$ is defined as
\[
E(F, G, X) := \sup_{f \in F} E(f, G, X) := \sup_{f \in F} \inf_{g \in G} \|f - g\|_X.
\]
In numerical applications, a (linear or nonlinear) approximation method is usually based on finite information in the form of $N$ values $b_1(f), \ldots, b_N(f)$ of functionals. Such an approximation method can be seen as
\[
Q_N(f) = P_N(a_N(f)) \quad \text{for a pair of mappings} \quad a_N : F \to \mathbb{R}^N \ \text{ and } \ P_N : \mathbb{R}^N \to X. \tag{1.1}
\]
The approximant set $G_N := P_N(\mathbb{R}^N)$ can be seen as a manifold in $X$ parameterized by $\mathbb{R}^N$. The parameter $N$ characterizes the computation complexity of the approximation method. We specify approximation methods having common properties by a certain set $\mathcal{Q}_N$ of pairs $(a_N, P_N)$, and look for an optimal method $Q_N \in \mathcal{Q}_N$ of approximation of $f \in F$ in terms of the quantity
\[
d(F, \mathcal{Q}_N, X) := \inf_{(a_N, P_N) \in \mathcal{Q}_N} \sup_{f \in F} \|f - P_N(a_N(f))\|_X. \tag{1.2}
\]
It is remarkable that the definition (1.2) fits the notion of several important quantities of best linear and nonlinear approximation. Thus, if linear approximation is understood as approximation by elements from a finite-dimensional linear subspace, the well-known Kolmogorov $N$-widths $d_N(F, X)$ and linear $N$-widths $\lambda_N(F, X)$, being different quantities of best linear approximation, can be defined as
\[
d_N(F, X) := d(F, \mathcal{Q}^d_N, X) = \inf_{\substack{\text{linear subspaces } X_N \subset X \\ \dim X_N \le N}} E(F, X_N, X), \tag{1.3}
\]
and $\lambda_N(F, X) := d(F, \mathcal{Q}^\lambda_N, X)$, where $\mathcal{Q}^d_N$ is the set of all pairs of mappings $(a_N, P_N)$ such that $P_N$ maps $\mathbb{R}^N$ to some linear subspace $X_N \subset X$ of dimension at most $N$, and $\mathcal{Q}^\lambda_N$ is the set of all pairs of linear mappings $(a_N, P_N)$. Here the right-hand side of (1.3) is the traditional definition of Kolmogorov $N$-widths (see, e.g., [11] for a traditional definition of linear $N$-widths).

We next discuss the definition (1.2) for some quantities of best nonlinear approximation.
The first notion, given in [14], is the (continuous) manifold $N$-width, defined by requiring $(a_N, P_N)$ to be continuous:
\[
\delta_N(F, X) := d(F, \mathcal{Q}^\delta_N, X) := \inf_{(a_N, P_N) \in \mathcal{Q}^\delta_N} \sup_{f \in F} \|f - P_N(a_N(f))\|_X,
\]
where $\mathcal{Q}^\delta_N$ is the set of all pairs of continuous mappings $(a_N, P_N)$. Here the approximant set $G_N := P_N(\mathbb{R}^N)$ is a continuous manifold in $X$.

The requirement of mere continuity on $a_N, P_N$ is too weak and does not give the stability used in practice. To have stability in the numerical implementation, one can restrict the mappings $a_N$ and $P_N$ to be Lipschitz continuous. Based on this idea, the authors of [5] introduced a notion of stable manifold $N$-widths by the formula $\delta^*_N(F, X) := d(F, \mathcal{Q}^{\delta^*}_N, X)$, where $\mathcal{Q}^{\delta^*}_N$ is the subset of $\mathcal{Q}^\delta_N$ of all pairs of Lipschitz mappings $a_N, P_N$ with some fixed constant $\gamma \ge 1$, that is,
\[
|a_N(f) - a_N(g)| \le \gamma \|f - g\|_X \quad \text{and} \quad \|P_N(x) - P_N(y)\|_X \le \gamma |x - y|, \quad x, y \in \mathbb{R}^N,
\]
with the Euclidean norm $|\cdot|$.

However, in many numerical applications approximation methods do not have continuity properties. A nonlinear $N$-width which is not based on a continuity condition was suggested by Kolmogorov (1955) in the form of an inverse quantity, the $\varepsilon$-entropy. This is the entropy $N$-width
\[
\varepsilon_N(F, X) := d(F, \mathcal{Q}^\varepsilon_N, X) = \inf_{X_N \subset X : \, |X_N| \le 2^N} E(F, X_N, X),
\]
where $\mathcal{Q}^\varepsilon_N$ is the set of all pairs of mappings $(a_N, P_N)$ such that $a_N$ maps $F$ into $\{0, 1\}^N \subset \mathbb{R}^N$, and $|X_N|$ denotes the cardinality of $X_N$.

Another way to avoid the continuity restriction is to require the approximant set $G_N := P_N(\mathbb{R}^N)$, which is in general a noncontinuous manifold parameterized by $\mathbb{R}^N$, to be contained in a finite-dimensional linear subspace. This leads to the notion of the (noncontinuous) nonlinear manifold $(N, M)$-width $d_{N,M}(F, X)$ of a subset $F$ in $X$:
\[
d_{N,M}(F, X) := d(F, \mathcal{Q}^d_{N,M}, X) = \inf_{(a_N, P_N) \in \mathcal{Q}^d_{N,M}} \sup_{f \in F} \|f - P_N(a_N(f))\|_X,
\]
where $\mathcal{Q}^d_{N,M}$ is the set of all pairs of mappings $(a_N, P_N)$ such that $P_N$ maps $\mathbb{R}^N$ to some linear subspace $X_M \subset X$ of dimension at most $M$. The parameter $M$ in some sense only controls the linear dimension of the parametric manifold $P_N(\mathbb{R}^N)$; it is not related to the computation complexity of the approximation method which, as mentioned above, is characterized by the parameter $N$. Notice that for $N \le M$,
\[
d_M(F, X) \le d_{N,M}(F, X) \le d_N(F, X) \quad \text{and} \quad d_{N,N}(F, X) = d_N(F, X). \tag{1.4}
\]
One may assume that $N$ and $M$ are comparable, in particular, take $M = M(N)$ with the restriction $N \le M(N) \le C N (\log N)^\kappa$ for some $\kappa \ge 0$ and $C \ge 1$. With this assumption, $d_{N,M(N)}(F, X)$ depends only on $N$. It is surprising that in some cases $d_{N,M(N)}(F, X)$ may have an asymptotic order smaller than the asymptotic order of any known nonlinear $N$-width. This is confirmed at least by the following example.

Let $\mathring{U}^{\alpha,d}_\infty$ be the unit ball of the Hölder-Nikol'skii space of functions on the unit cube $I^d$ of mixed smoothness $0 < \alpha \le 1$ (see Section 2 for a definition). Then for $d = 1$ we have
\[
d_N(\mathring{U}^{\alpha,1}_\infty, L_\infty(I)) \asymp \delta_N(\mathring{U}^{\alpha,1}_\infty, L_\infty(I)) \asymp N^{-\alpha}, \tag{1.5}
\]
and
\[
d_{N, \lfloor N \log N \rfloor}(\mathring{U}^{\alpha,1}_\infty, L_\infty(I)) \asymp (N \log N)^{-\alpha}. \tag{1.6}
\]
The asymptotic order of $d_N(\mathring{U}^{\alpha,1}_\infty, L_\infty(I))$ in (1.5) is well known; see, e.g., [25] for references. The asymptotic order of $\delta_N(\mathring{U}^{\alpha,1}_\infty, L_\infty(I))$ in (1.5) was proven in [14]. The upper bound in (1.6) follows from recent results obtained in [32] for the case $\alpha = 1$ and in [13] for the case $\alpha \in (0, 1)$, while the lower bound follows from (1.4):
\[
d_{N, \lfloor N \log N \rfloor}(\mathring{U}^{\alpha,1}_\infty, L_\infty(I)) \ge d_{\lfloor N \log N \rfloor}(\mathring{U}^{\alpha,1}_\infty, L_\infty(I)) \asymp (N \log N)^{-\alpha}.
\]
Here we want to emphasize that the right asymptotic order of the Kolmogorov $N$-widths $d_N(\mathring{U}^{\alpha,d}_\infty, L_\infty(I^d))$ and the linear $N$-widths $\lambda_N(\mathring{U}^{\alpha,d}_\infty, L_\infty(I^d))$, as well as of nonlinear $N$-widths, is still unknown for $d \ge 2$, even in the case $d = 2$ (see [11, Chapter 4] for detailed comments).

All the above remarks and comments motivate us to consider high-dimensional nonlinear approximation by parametric manifolds for functions from the unit ball $\mathring{U}^{\alpha,d}_\infty$ of the Hölder-Nikol'skii space of mixed smoothness $\alpha$. The approximation error is measured in the $L_\infty$-norm. In this context, we investigate the explicit construction of approximation methods of the form (1.1) with $(a_N, P_N) \in \mathcal{Q}^d_{N,M(N)}$ for the approximation of $f \in \mathring{U}^{\alpha,d}_\infty$, together with estimates of the approximation error explicit in the dimension $d$ and in $N$. We also treat the problem of the right asymptotic order of $d_{N,M(N)}(\mathring{U}^{\alpha,d}_\infty, L_\infty)$.
(Here and in what follows, we use the abbreviations $L_\infty := L_\infty(I^d)$ and $\|\cdot\|_\infty := \|\cdot\|_{L_\infty}$.)

Let us briefly describe our main contribution. Let $N \in \mathbb{N}$ with $N \ge N(d)$ be given, and put
\[
M(N) := \big\lfloor (12 d)^{d-1} \, 2^{-(d-1)} \, N (\log N) (\log \log N)^{-(d-1)} \big\rfloor,
\]
where $N(d)$ is a certain number depending on $d$ (see (3.21)). Then we can explicitly construct an $M(N)$-dimensional subspace $F_{M(N)}$ of continuous functions on $I^d$, spanned by tensor product Faber basis functions, and maps
\[
\lambda^*_N : \mathring{U}^{\alpha,d}_\infty \to \mathbb{R}^N \quad \text{and} \quad G^*_N : \mathbb{R}^N \to F_{M(N)} \subset C(I^d), \tag{1.7}
\]
such that
\[
\sup_{f \in \mathring{U}^{\alpha,d}_\infty} \|f - G^*_N(\lambda^*_N(f))\|_\infty \le C_\alpha \big( K^{d-1} 2^{-(d-1)} \big)^{\alpha+1} \frac{(\log N)^{(d-1)(\alpha+1)}}{(N \log N)^{\alpha}} \, (\log \log N)^{(d-1)\alpha}, \tag{1.8}
\]
and
\[
\sup_{f \in \mathring{U}^{\alpha,d}_\infty} \|f - G^*_N(\lambda^*_N(f))\|_\infty \ge C_{\alpha,d} \frac{(\log N)^{(d-1)\alpha + 1}}{(N \log N)^{\alpha}} \, (\log \log N)^{(d-1)\alpha}, \tag{1.9}
\]
where $K := \big(4^\alpha/(2^\alpha - 1)\big)^{1/(2\alpha+1)}$, $C_\alpha := 2^{\alpha+2}/(2^\alpha - 1)$, and the constant $C_{\alpha,d}$ depends on $\alpha$ and $d$ only.

In the case $d = 1$, (1.8) follows from results on approximation by deep ReLU networks which were proven in [32] ($\alpha = 1$) and [13] ($\alpha \in (0, 1)$). Notice that the factor $\big( K^{d-1} 2^{-(d-1)} \big)^{\alpha+1}$ on the right-hand side of (1.8) decays super-exponentially as $d \to \infty$. To our knowledge, (1.8) is the first result on dimension-dependent error estimation for nonlinear approximation of functions having mixed smoothness.

From (1.8) and (1.9) we also derive upper and lower estimates for the noncontinuous manifold widths $d_{N,M(N)}(\mathring{U}^{\alpha,d}_\infty, L_\infty)$ (see Corollary 3.5). In particular, for $d = 2$ we obtain the novel right asymptotic order
\[
d_{N, \lfloor N (\log N)(\log \log N)^{-1} \rfloor}(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)) \asymp N^{-\alpha} \log N \, (\log \log N)^\alpha.
\]
(The case $d = 1$ is already given in (1.6).) Let us compare this asymptotic order with the asymptotic orders of other well-known $N$-widths. We have for $\alpha \in (0, 1)$ that
\[
\varepsilon_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)) \asymp \delta^*_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)) \asymp d_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)) \asymp \lambda_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)) \asymp N^{-\alpha} (\log N)^{\alpha+1}. \tag{1.10}
\]
The results in (1.10) were proven in [26] for the entropy $N$-widths and in [27] for the Kolmogorov $N$-widths. For the asymptotic order of $\lambda_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2))$ in (1.10), see [11, p. 67]. The asymptotic order of $\delta^*_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2))$ in (1.10) follows from a Carl-type inequality between $\varepsilon_N$ and $\delta^*_N$ [5] and the inequality $\delta^*_N \le \lambda_N$. To the best of our knowledge, the asymptotic order of $\delta_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2))$ is not known, except for the upper bound via the inequality $\delta_N \le d_N$ and (1.10). Comparing the asymptotic order of $d_{N, \lfloor N (\log N)(\log \log N)^{-1} \rfloor}(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2))$ with the asymptotic order of the "smallest" entropy $N$-width $\varepsilon_N(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2))$ and the other $N$-widths in (1.10), we find that the former is smaller on the logarithmic scale.

In the construction of the approximation methods and the estimation of the approximation error, a representation of functions in $\mathring{U}^{\alpha,d}_\infty$ by tensorized Faber series plays a central role. We first approximate $f \in \mathring{U}^{\alpha,d}_\infty$ by the truncations $R_m(f)$ of the tensorized Faber series on sparse grids, and then approximate the function $f - R_m(f)$ by combining a sparse-grid interpolation approximation with an approximation by sets of finite cardinality.

The outline of this paper is as follows. In Section 2, we present some auxiliary material: a definition of the Hölder-Nikol'skii spaces of mixed smoothness $H^\alpha_\infty(I^d)$ and a representation of functions in $H^\alpha_\infty(I^d)$ based on the tensorized Faber basis. In this section, we also study auxiliary approximations of functions $f \in \mathring{U}^{\alpha,d}_\infty$ by truncations $R_m(f)$ of the tensorized Faber series on sparse grids, and approximations of $f$ by sets of finite cardinality. Section 3 is devoted to the construction of the manifold approximation for functions in Hölder-Nikol'skii spaces and the estimation of the approximation error.

Notation.
As usual, $\mathbb{Z}$ denotes the integers and $\mathbb{R}$ the real numbers, and we set $\mathbb{N} := \{s \in \mathbb{Z} : s \ge 0\}$ and $\mathbb{N}_{-1} := \mathbb{N} \cup \{-1\}$. The letter $d$ is reserved for the underlying dimension of $\mathbb{R}^d$, $\mathbb{N}^d$, etc. We use $x_i$ to denote the $i$th coordinate of $x \in \mathbb{R}^d$, i.e., $x := (x_1, \ldots, x_d)$. For $x, y \in \mathbb{R}^d$, $xy$ denotes the Euclidean inner product of $x$ and $y$, and $2^x := (2^{x_1}, \ldots, 2^{x_d})$. For $k, s \in \mathbb{N}^d$, we denote $2^{-k} s := (2^{-k_1} s_1, \ldots, 2^{-k_d} s_d)$. For $x \in \mathbb{R}^d$, we denote $|x|_1 := |x_1| + \ldots + |x_d|$. We use the abbreviations $L_\infty := L_\infty(I^d)$ and $\|\cdot\|_\infty := \|\cdot\|_{L_\infty}$. Universal constants or constants depending on the parameters $\alpha, d$ are denoted by $C$ or $C_{\alpha,d}$, respectively. The values of the constants $C$ and $C_{\alpha,d}$ are in general not specified, except when they are given precisely, and may differ in various places. For two sequences $a_n$ and $b_n$ we write $a_n \lesssim b_n$ if there exists a constant $C > 0$ such that $a_n \le C b_n$ for all $n$, and $a_n \asymp b_n$ if $a_n \lesssim b_n$ and $b_n \lesssim a_n$. $|A|$ denotes the cardinality of the finite set $A$.

2 Approximation by truncated Faber series
This section presents some preliminaries. We first give a definition of the Hölder-Nikol'skii spaces of mixed smoothness $H^\alpha_\infty(I^d)$ and certain properties of these spaces. As a preparation for the manifold approximation in the next section, we recall a representation of continuous functions on the unit cube by the tensorized Faber series. We then give an estimate for the representation coefficients of functions from Hölder-Nikol'skii spaces and for the error of the approximation of $f \in H^\alpha_\infty(I^d)$ by truncations $R_m(f)$ of the tensorized Faber series. In the last part of this section, a set of finite cardinality is explicitly constructed to approximate functions in $H^\alpha_\infty(I^d)$, and the approximation error is given explicitly in $d$.

2.1 Hölder-Nikol'skii spaces of mixed smoothness

This subsection is devoted to introducing the Hölder-Nikol'skii spaces of mixed smoothness under consideration. For univariate functions $f$ on $I$, the difference operator $\Delta_h$ is defined by
\[
\Delta_h f(x) := f(x + h) - f(x), \quad \text{for all } x \text{ and } h \ge 0 \text{ such that } x, x + h \in I.
\]
If $u$ is a subset of $\{1, \ldots, d\}$, for multivariate functions $f$ on $I^d$ the mixed difference operator $\Delta_{h,u}$ is defined by
\[
\Delta_{h,u} := \prod_{i \in u} \Delta_{h_i}, \qquad \Delta_{h,\varnothing} := \mathrm{Id},
\]
for all $x$ and $h$ such that $x, x + h \in I^d$. Here the univariate operator $\Delta_{h_i}$ is applied to the univariate function $f$ by considering $f$ as a function of the variable $x_i$ with the other variables held fixed. If $0 < \alpha \le 1$, we define the seminorm $|f|_{H^\alpha_\infty(u)}$ for functions $f \in C(I^d)$ by
\[
|f|_{H^\alpha_\infty(u)} := \sup_{h > 0} \prod_{i \in u} h_i^{-\alpha} \, \|\Delta_{h,u}(f)\|_{C(I^d(h,u))}
\]
(in particular, $|f|_{H^\alpha_\infty(\varnothing)} = \|f\|_{C(I^d)}$), where $I^d(h, u) := \{x \in I^d : x_i + h_i \in I, \ i \in u\}$. The Hölder-Nikol'skii space $H^\alpha_\infty(I^d)$ of mixed smoothness $\alpha$ is then defined as the set of functions $f \in C(I^d)$ for which the norm
\[
\|f\|_{H^\alpha_\infty(I^d)} := \max_{u \subset \{1, \ldots, d\}} |f|_{H^\alpha_\infty(u)}
\]
is finite. From the definition we have $H^\alpha_\infty(I^d) \subset C(I^d)$. The space $H^\alpha_\infty(I^d)$ is the $d$-fold tensor product of the space $H^\alpha_\infty(I)$ in the sense of equivalent norms.
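To make the mixed difference operator concrete, here is a minimal Python sketch (the function name `mixed_difference` and the 0-based coordinate convention are ours, not the paper's) evaluating $\Delta_{h,u} f(x)$ via the inclusion-exclusion formula $\Delta_{h,u} f(x) = \sum_{\varepsilon \in \{0,1\}^u} (-1)^{|u| - |\varepsilon|} f(x + \varepsilon h)$, which is what the product of univariate differences expands to:

```python
import itertools

def mixed_difference(f, x, h, u):
    """Evaluate Delta_{h,u} f(x) = prod_{i in u} Delta_{h_i} f(x):
    a signed sum of f over the corners of the sub-box spanned by the
    coordinates in u (0-based indices), the other coordinates held fixed."""
    total = 0.0
    for eps in itertools.product((0, 1), repeat=len(u)):
        y = list(x)
        for e, i in zip(eps, u):
            y[i] += e * h[i]
        # sign (-1)^{|u| - |eps|}: +1 when all coordinates in u are shifted
        total += (-1) ** (len(u) - sum(eps)) * f(tuple(y))
    return total
```

For a product function $f(x_1, x_2) = g(x_1) w(x_2)$, the mixed difference factors as $\Delta_{h,\{1,2\}} f = (\Delta_{h_1} g)(\Delta_{h_2} w)$; this factorization is the identity behind the tensor-product structure of $H^\alpha_\infty(I^d)$ mentioned above.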
For further properties of this space, such as embeddings and characterizations by wavelets and atoms, we refer the reader to [19, 23, 31] and the references therein.

Denote by $\mathring{C}(I^d)$ the set of all functions $f \in C(I^d)$ vanishing on the boundary $\partial I^d$ of $I^d$, i.e., the set of all functions $f \in C(I^d)$ such that $f(x) = 0$ if $x_j = 0$ or $x_j = 1$ for some index $j \in \{1, \ldots, d\}$. Let $\mathring{U}^{\alpha,d}_\infty$ be the set of all functions $f$ in the intersection $H^\alpha_\infty(I^d) \cap \mathring{C}(I^d)$ such that $\|f\|_{H^\alpha_\infty(I^d)} \le 1$.

2.2 Tensorized Faber series

In this subsection we describe a representation of functions in $H^\alpha_\infty(I^d)$ by tensorized Faber series, which plays a central role in the construction of nonlinear methods of noncontinuous manifold approximation of functions from the unit ball $\mathring{U}^{\alpha,d}_\infty$. We give a dimension-dependent estimate of the error of approximation by the truncation $R_m(f)$ of the tensorized Faber series for functions $f \in \mathring{U}^{\alpha,d}_\infty$. The approximant $R_m(f)$ represents an interpolation sampling recovery on sparse Smolyak grids.

We start with the univariate case. Let $\varphi(x) := (1 - |x - 1|)_+$, $x \in \mathbb{R}$, be the hat function (the piecewise linear B-spline with knots at $0, 1, 2$), where $x_+ := \max(x, 0)$ for $x \in \mathbb{R}$. For $k \in \mathbb{N}_{-1}$ we define the Faber functions $\varphi_{k,s}$ by
\[
\varphi_{k,s}(x) := \varphi(2^{k+1} x - 2s), \quad k \ge 0, \ s \in Z(k) := \{0, 1, \ldots, 2^k - 1\},
\]
and
\[
\varphi_{-1,s}(x) := \varphi(x - s + 1), \quad s \in Z(-1) := \{0, 1\}.
\]
For a univariate function $f$ on $I$, $k \in \mathbb{N}_{-1}$, and $s \in Z(k)$, we define
\[
\lambda_{k,s}(f) := -\frac{1}{2} \Delta^2_{2^{-k-1}} f\big(2^{-k} s\big), \quad k \ge 0, \qquad \lambda_{-1,s}(f) := f(s).
\]
Here
\[
\Delta^2_h f(x) := f(x + 2h) - 2 f(x + h) + f(x),
\]
for all $x$ and $h \ge 0$ such that $x, x + 2h \in I$. The functions $\varphi_{k,s}$, $k \in \mathbb{N}_{-1}$, $s \in Z(k)$, constitute a basis for $C(I)$, and every function $f \in C(I)$ can be represented by the Faber series [15]
\[
f = \sum_{k \in \mathbb{N}_{-1}} q_k(f), \qquad q_k(f) := \sum_{s \in Z(k)} \lambda_{k,s}(f) \varphi_{k,s}, \tag{2.1}
\]
converging in the norm of $C(I)$.

For $m \in \mathbb{N}$, we define the truncation $R_m(f)$ of the Faber series by
\[
R_m(f) := \sum_{k=0}^m q_k(f).
\]
The continuous piecewise linear function $R_m(f) \in \mathring{C}(I)$ possesses a certain interpolatory property. Indeed, one can check that for $f \in \mathring{C}(I)$,
\[
R_m(f) = \sum_{s \in Z^*(m)} f(2^{-m-1} s) \, \varphi^*_{m,s}, \tag{2.2}
\]
where for $m \in \mathbb{N}$,
\[
\varphi^*_{m,s}(x) := \varphi(2^{m+1} x - s + 1), \quad s \in Z^*(m) := \{1, \ldots, 2^{m+1} - 1\}.
\]
Hence one can see that $R_m(f)$ interpolates $f$ at the points $2^{-m-1} s$, $s \in Z^*(m)$, that is,
\[
R_m(f)(2^{-m-1} s) = f(2^{-m-1} s), \quad s \in Z^*(m).
\]
We next extend the representation (2.1) to functions in $C(I^d)$ by tensorization of the univariate Faber basis. Putting $Z(k) := \times_{i=1}^d Z(k_i)$ for $k \in \mathbb{N}^d_{-1}$, for $s \in Z(k)$ we introduce the tensor product Faber functions
\[
\varphi_{k,s}(x) := \prod_{i=1}^d \varphi_{k_i, s_i}(x_i), \quad x \in I^d,
\]
and define the linear functionals $\lambda_{k,s}$ for a multivariate function $f$ on $I^d$ by
\[
\lambda_{k,s}(f) := \prod_{i=1}^d \lambda_{k_i, s_i}(f),
\]
where the univariate functional $\lambda_{k_i, s_i}$ is applied to the univariate function $f$ by considering $f$ as a function of the variable $x_i$ with the other variables held fixed.

Lemma 2.1
The tensorized functions $\{\varphi_{k,s} : k \in \mathbb{N}^d_{-1}, \, s \in Z(k)\}$ form a basis of $C(I^d)$. Moreover, every function $f \in C(I^d)$ can be represented by the tensorized Faber series
\[
f = \sum_{k \in \mathbb{N}^d_{-1}} q_k(f), \qquad q_k(f) := \sum_{s \in Z(k)} \lambda_{k,s}(f) \varphi_{k,s}, \tag{2.3}
\]
converging in the norm of $C(I^d)$.

The decomposition (2.3) for $d = 2$, and an extension to function spaces with mixed smoothness, was obtained independently in [28, Theorem 3.10] and in [8, Section 4]. A generalization to the case $d \ge 2$ can be found in [9]. Observe that for $f \in \mathring{U}^{\alpha,d}_\infty$ we have $\lambda_{k,s}(f) = 0$ if $k_j = -1$ for some $j \in \{1, \ldots, d\}$; hence we can write
\[
f = \sum_{k \in \mathbb{N}^d} q_k(f)
\]
with unconditional convergence in $C(I^d)$; see [28, Theorem 3.13]. In this case the following estimate holds:
\[
|\lambda_{k,s}(f)| = 2^{-d} \bigg| \prod_{i=1}^d \Delta^2_{2^{-k_i-1}} f\big(2^{-k} s\big) \bigg| = 2^{-d} \bigg| \prod_{i=1}^d \Big[ \Delta_{2^{-k_i-1}} f\big(2^{-k} s + 2^{-k_i-1} e_i\big) - \Delta_{2^{-k_i-1}} f\big(2^{-k} s\big) \Big] \bigg| \le 2^{-\alpha d} \, 2^{-\alpha |k|_1}, \tag{2.4}
\]
for $k \in \mathbb{N}^d$, $s \in Z(k)$. Here $\{e_i\}_{i=1}^d$ is the standard basis of $\mathbb{R}^d$.

For $f \in \mathring{C}(I^d)$, we define the truncation $R_m(f)$ of the tensorized Faber series by
\[
R_m(f) := \sum_{k \in \mathbb{N}^d, \, |k|_1 \le m} q_k(f) = \sum_{k \in \mathbb{N}^d, \, |k|_1 \le m} \ \sum_{s \in Z(k)} \lambda_{k,s}(f) \varphi_{k,s}. \tag{2.5}
\]
The function $R_m(f)$ belongs to $\mathring{C}(I^d)$ and is completely determined by the sampled values of $f$ at the points of the Smolyak grid
\[
G^d(m) := \big\{ \xi_{k,s} = 2^{-k-1} s : |k|_1 = m, \ s \in Z^*(k) \big\},
\]
where $1 := (1, \ldots, 1) \in \mathbb{N}^d$ and $Z^*(k) := \times_{j=1}^d Z^*(k_j)$. Moreover, $R_m(f)$ interpolates $f$ at the points $\xi \in G^d(m)$:
\[
R_m(f)(\xi) = f(\xi), \quad \xi \in G^d(m).
\]
Thus, the truncation $R_m(f)$ of the tensorized Faber series can be seen as a formula of interpolation sampling recovery on the grids $G^d(m)$ for $f \in \mathring{C}(I^d)$. Notice that the Smolyak grids $G^d(m)$ are very sparse.
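This sparsity can be made concrete by brute-force enumeration. The following Python sketch (function name ours; intended for small $d, m$ only, not as an efficient implementation) lists the points of $G^d(m)$ literally from the definition; points shared by several $k$ are stored once, since dyadic points nest:

```python
from itertools import product

def smolyak_grid(d, m):
    """Points of G^d(m) = {2^{-(k+1)} s : |k|_1 = m, s_j in Z*(k_j)},
    with Z*(k_j) = {1, ..., 2^{k_j+1} - 1}.  A set is used because the
    same dyadic point occurs for several k."""
    pts = set()
    for k in product(range(m + 1), repeat=d):
        if sum(k) != m:
            continue
        for s in product(*(range(1, 2 ** (kj + 1)) for kj in k)):
            pts.add(tuple(sj / 2 ** (kj + 1) for sj, kj in zip(s, k)))
    return pts
```

For instance, for $d = 2$ and $m = 3$ this grid has $49$ points, while the full dyadic grid of the same univariate resolution has $(2^4 - 1)^2 = 225$ interior points; the gap widens rapidly as $m$ and $d$ grow.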
The number of knots in $G^d(m)$ is smaller than $2^d (d-1)!^{-1} \, 2^m m^{d-1}$, which is much smaller than $2^{dm}$, the number of knots in the corresponding standard full grid. Nevertheless, for periodic functions having mixed smoothness, the sparse grids give the same error of sampling recovery as the standard full grids. See [11, Chapter 5] for details.

The following lemma gives a $d$-dependent estimate of the error of approximation of functions having mixed smoothness from $\mathring{U}^{\alpha,d}_\infty$ by the sparse-grid interpolation operators $R_m$.

Lemma 2.2
Let $d \ge 2$, $m \in \mathbb{N}$, and $0 < \alpha \le 1$. Then we have
\[
\sup_{f \in \mathring{U}^{\alpha,d}_\infty} \|f - R_m(f)\|_\infty \le 2^{-\alpha} B^d \, 2^{-\alpha m} \binom{m + d}{d - 1}, \qquad B := (2^\alpha - 1)^{-1}.
\]

Proof. For every $f \in \mathring{U}^{\alpha,d}_\infty$ and $k \in \mathbb{N}^d$, since the functions $\varphi_{k,s}$, $s \in Z(k)$, have disjoint supports, by (2.4) we have
\[
\|f - R_m(f)\|_\infty \le \sum_{k \in \mathbb{N}^d : \, |k|_1 > m} \bigg\| \sum_{s \in Z(k)} \lambda_{k,s}(f) \varphi_{k,s} \bigg\|_\infty \le \sum_{k \in \mathbb{N}^d : \, |k|_1 > m} 2^{-\alpha d} 2^{-\alpha |k|_1} = 2^{-\alpha d} \sum_{\ell = m+1}^\infty \binom{\ell + d - 1}{d - 1} 2^{-\alpha \ell}
\]
\[
= 2^{-\alpha d} \sum_{s = 0}^\infty \binom{m + s + d}{d - 1} 2^{-\alpha (s + m + 1)} = 2^{-\alpha d} \, 2^{-\alpha (m+1)} \sum_{s = 0}^\infty \binom{m + s + d}{d - 1} 2^{-\alpha s}.
\]
Using
\[
\sum_{s = 0}^\infty \binom{m + s}{n} t^s \le (1 - t)^{-n-1} \binom{m}{n}, \quad t \in (0, 1),
\]
see [12, Lemma 2.2], we finally obtain
\[
\|f - R_m(f)\|_\infty \le 2^{-\alpha d} \, 2^{-\alpha (m+1)} (1 - 2^{-\alpha})^{-d} \binom{m + d}{d - 1} = 2^{-\alpha} B^d \, 2^{-\alpha m} \binom{m + d}{d - 1}. \qquad \square
\]

2.3 Approximation by sets of finite cardinality

In this subsection, we explicitly construct a set of finite cardinality for the approximation of $f \in \mathring{U}^{\alpha,d}_\infty$, and give an estimate of the approximation error as well as of the cardinality of this set.

Again, we start with the univariate case. Given $m \in \mathbb{N}$, for $f \in \mathring{U}^{\alpha,1}_\infty$ we explicitly construct the function $S_f \in \mathring{U}^{\alpha,1}_\infty$ by
\[
S_f := \sum_{s \in Z^*(m)} 2^{-\alpha(m+1)} l_s(f) \, \varphi^*_{m,s}, \tag{2.6}
\]
where we put $l_0(f) := 0$ and, going from left to right, choose the integers $l_s(f)$ so that each value $S_f(2^{-m-1} s) = 2^{-\alpha(m+1)} l_s(f)$ is closest to $f(2^{-m-1} s)$, $s = 1, \ldots, 2^{m+1} - 1$. If there are two possible choices for $l_s(f)$, we choose the one that is closest to the already determined $l_{s-1}(f)$. We define
\[
\mathcal{S}^\alpha(m) := \big\{ S_f : f \in \mathring{U}^{\alpha,1}_\infty \big\}.
\]

Lemma 2.3
Let $0 < \alpha \le 1$ and $m \in \mathbb{N}$. Then $|\mathcal{S}^\alpha(m)| \le 3^{2^{m+1}}$, and for every $f \in \mathring{U}^{\alpha,1}_\infty$ we have
\[
\|R_m(f) - S_f\|_{L_\infty(I)} \le 2^{-\alpha(m+1)-1}.
\]

Proof. In this proof we develop a technique used in [13]. For every $f \in \mathring{U}^{\alpha,1}_\infty$, from the construction of $S_f$ we have
\[
S_f = \sum_{s \in Z^*(m)} S_f(2^{-m-1} s) \, \varphi^*_{m,s}
\]
and
\[
|S_f(2^{-m-1} s) - f(2^{-m-1} s)| \le 2^{-\alpha(m+1)-1}, \quad s = 0, \ldots, 2^{m+1}. \tag{2.7}
\]
From this, (2.2) and the inequality $\sum_{s \in Z^*(m)} \varphi^*_{m,s}(x) \le 1$, we deduce that for every $x \in I$,
\[
|R_m(f)(x) - S_f(x)| \le \sum_{s \in Z^*(m)} |f(2^{-m-1} s) - S_f(2^{-m-1} s)| \, \varphi^*_{m,s}(x) \le 2^{-\alpha(m+1)-1} \sum_{s \in Z^*(m)} \varphi^*_{m,s}(x) \le 2^{-\alpha(m+1)-1}.
\]
We have by (2.7) and the inclusion $f \in \mathring{U}^{\alpha,1}_\infty$,
\[
|l_s(f) - l_{s-1}(f)| \, 2^{-\alpha(m+1)} = |S_f(2^{-m-1} s) - S_f(2^{-m-1}(s-1))|
\]
\[
\le |S_f(2^{-m-1} s) - f(2^{-m-1} s)| + |f(2^{-m-1} s) - f(2^{-m-1}(s-1))| + |f(2^{-m-1}(s-1)) - S_f(2^{-m-1}(s-1))| \le 2^{-\alpha(m+1)+1}.
\]
Hence $|l_s(f) - l_{s-1}(f)| \le 2$. But the case $|l_s(f) - l_{s-1}(f)| = 2$ is not possible, since it would imply that $l_s(f)$ is not closest to $l_{s-1}(f)$. This means that $|l_s(f) - l_{s-1}(f)| \le 1$. Taking into account this inequality and $l_0(f) = 0$, we see that $|\mathcal{S}^\alpha(m)| \le 3^{2^{m+1}}$. $\square$

In the following, we use the abbreviations $x^j := (x_1, \ldots, x_j) \in \mathbb{R}^j$ and $\bar{x}^j := (x_{j+1}, \ldots, x_d) \in \mathbb{R}^{d-j}$, with the convention $x^0 := 0$ for $x \in \mathbb{R}^d$ and $j = 0, 1, \ldots, d-1$. When $j = 1$ we write $x_1$ instead of $x^1$, and, correspondingly, $\bar{k} := \bar{k}^1 = (k_2, \ldots, k_d)$, $\bar{s} := (s_2, \ldots, s_d)$ and $\bar{x} := (x_2, \ldots, x_d)$.

We now construct a set of finite cardinality for the approximation of $f \in \mathring{U}^{\alpha,d}_\infty$. Our strategy is to apply the above result to explicitly construct a set of finite cardinality for the approximation of $R_m(f)$, and to show that this set approximates $f$ as well as $R_m(f)$ does. To do this, we need a special representation of $R_m(f)$ in terms of the tensor products of $\varphi_{\bar{k},\bar{s}}(\bar{x})$ and of $R_{m - |\bar{k}|_1}$ applied to a function in $\mathring{U}^{\alpha,1}_\infty$ of the variable $x_1$.

Lemma 2.4 Let $d \ge 2$, $0 < \alpha \le 1$, $m > 0$ and $f \in \mathring{U}^{\alpha,d}_\infty$. Then the representation
\[
R_m(f)(x) = \sum_{|\bar{k}|_1 \le m} \ \sum_{\bar{s} \in Z(\bar{k})} 2^{-\alpha(|\bar{k}|_1 + d - 1)} \, \varphi_{\bar{k},\bar{s}}(\bar{x}) \, R_{m - |\bar{k}|_1}\big( K_{\bar{k},\bar{s}}(f) \big)(x_1) \tag{2.8}
\]
holds, where the univariate function $K_{\bar{k},\bar{s}}(f)$ belongs to $\mathring{U}^{\alpha,1}_\infty$ and is defined by
\[
K_{\bar{k},\bar{s}}(f)(x_1) := \prod_{j=2}^d \Big( -\frac{1}{2} \, 2^{\alpha(k_j+1)} \, \Delta^2_{2^{-k_j-1}} \Big) f\big(x_1, 2^{-\bar{k}} \bar{s}\big). \tag{2.9}
\]

Proof. We have
\[
R_m(f)(x) = \sum_{k_2=0}^{m} \sum_{k_3=0}^{m-k_2} \cdots \sum_{k_d=0}^{m-k_2-\cdots-k_{d-1}} \Bigg[ \sum_{k_1=0}^{m-|\bar{k}|_1} q_{k_1} \bigg( \prod_{j=2}^d q_{k_j}(f) \bigg) \Bigg](x)
= \sum_{k_2=0}^{m} \cdots \sum_{k_d=0}^{m-k_2-\cdots-k_{d-1}} R_{m-|\bar{k}|_1} \bigg( \prod_{j=2}^d q_{k_j}(f) \bigg)(x)
\]
\[
= \sum_{|\bar{k}|_1 \le m} R_{m-|\bar{k}|_1} \Bigg( \sum_{\bar{s} \in Z(\bar{k})} \prod_{j=2}^d \Big( -\frac{1}{2} \Delta^2_{2^{-k_j-1}} \Big) f\big(x_1, 2^{-\bar{k}} \bar{s}\big) \, \varphi_{\bar{k},\bar{s}}(\bar{x}) \Bigg).
\]
Here $R_{m-|\bar{k}|_1}$ is applied to the function of the variable $x_1$. Hence we can write
\[
R_m(f)(x) = \sum_{|\bar{k}|_1 \le m} \ \sum_{\bar{s} \in Z(\bar{k})} \bigg( \prod_{j=2}^d 2^{-\alpha(k_j+1)} \bigg) \varphi_{\bar{k},\bar{s}}(\bar{x}) \, R_{m-|\bar{k}|_1}\big( K_{\bar{k},\bar{s}}(f) \big)(x_1).
\]
Thus, (2.8) is proven, since $\prod_{j=2}^d 2^{-\alpha(k_j+1)} = 2^{-\alpha(|\bar{k}|_1 + d - 1)}$. For $x_1 \in I$ and $x_1 + h \in I$, the estimate
\[
|\Delta_h K_{\bar{k},\bar{s}}(f)(x_1)| \le \bigg( \prod_{j=2}^d \frac{1}{2} \cdot 2^{\alpha(k_j+1)} \cdot 2 \cdot 2^{-\alpha(k_j+1)} \bigg) h^\alpha = h^\alpha
\]
holds, which implies that $K_{\bar{k},\bar{s}}(f) \in \mathring{U}^{\alpha,1}_\infty$. $\square$

From the above special representation of $R_m(f)$ and Lemmata 2.2 and 2.3, we derive the following result.

Lemma 2.5
Let $m > 0$, $d \ge 2$ and $0 < \alpha \le 1$. For $f \in \mathring{U}^{\alpha,d}_\infty$, let the function $S_m(f)$ be defined by
\[
S_m(f)(x) := \sum_{|\bar{k}|_1 \le m} 2^{-\alpha(|\bar{k}|_1 + d - 1)} \sum_{\bar{s} \in Z(\bar{k})} \varphi_{\bar{k},\bar{s}}(\bar{x}) \, S_{K_{\bar{k},\bar{s}}(f)}(x_1), \tag{2.10}
\]
where $S_{K_{\bar{k},\bar{s}}(f)} \in \mathcal{S}^\alpha(m - |\bar{k}|_1)$ is defined as in (2.6) for the function $K_{\bar{k},\bar{s}}(f)$. Then the inequality
\[
\|f - S_m(f)\|_\infty \le B^d \, 2^{-\alpha m} \binom{m + d}{d - 1} \tag{2.11}
\]
holds. Moreover, for the set $\mathcal{S}^{\alpha,d}(m) := \big\{ S_m(f) : f \in \mathring{U}^{\alpha,d}_\infty \big\}$, we have
\[
N_d(m) := |\mathcal{S}^{\alpha,d}(m)| \le 3^{2^{m+1} \binom{m+d-1}{d-1}}.
\]

Proof. We first estimate $N_d(m)$. Since the number of $\bar{k}$ with $|\bar{k}|_1 \le m$ is $\binom{m+d-1}{d-1}$, and the cardinality of $\mathcal{S}^\alpha(m - |\bar{k}|_1)$ is bounded by $3^{2^{m - |\bar{k}|_1 + 1}}$, we have
\[
N_d(m) \le \bigg( \Big( 3^{2^{m - |\bar{k}|_1 + 1}} \Big)^{2^{|\bar{k}|_1}} \bigg)^{\binom{m+d-1}{d-1}} = 3^{2^{m+1} \binom{m+d-1}{d-1}}.
\]
Lemma 2.2 gives
\[
\|f - R_m(f)\|_\infty \le 2^{-\alpha} B^d \, 2^{-\alpha m} \binom{m + d}{d - 1}. \tag{2.12}
\]
Next we show that
\[
\|R_m(f) - S_m(f)\|_\infty \le 2^{-1} 2^{-\alpha d} \, 2^{-\alpha m} \binom{m + d}{d - 1}. \tag{2.13}
\]
Since $K_{\bar{k},\bar{s}}(f) \in \mathring{U}^{\alpha,1}_\infty$ by Lemma 2.4, applying Lemma 2.3 we deduce that there is $S_{K_{\bar{k},\bar{s}}(f)} \in \mathcal{S}^\alpha(m - |\bar{k}|_1)$ such that
\[
\big\| R_{m-|\bar{k}|_1}\big( K_{\bar{k},\bar{s}}(f) \big) - S_{K_{\bar{k},\bar{s}}(f)} \big\|_\infty \le 2^{-1} 2^{-\alpha} \, 2^{-\alpha(m - |\bar{k}|_1)}. \tag{2.14}
\]
Hence, taking into account that the supports of $\varphi_{\bar{k},\bar{s}}$ with $\bar{s} \in Z(\bar{k})$ are disjoint, by the representation (2.8)-(2.9) of $R_m(f)$ and by (2.14), we have for every $x \in I^d$,
\[
\big| R_m(f)(x) - S_m(f)(x) \big| = \bigg| \sum_{|\bar{k}|_1 \le m} \sum_{\bar{s} \in Z(\bar{k})} 2^{-\alpha(|\bar{k}|_1 + d - 1)} \varphi_{\bar{k},\bar{s}}(\bar{x}) \Big( R_{m-|\bar{k}|_1}\big( K_{\bar{k},\bar{s}}(f) \big)(x_1) - S_{K_{\bar{k},\bar{s}}(f)}(x_1) \Big) \bigg|
\]
\[
\le \sum_{|\bar{k}|_1 \le m} 2^{-\alpha(|\bar{k}|_1 + d - 1)} \sup_{\bar{s} \in Z(\bar{k})} \big\| R_{m-|\bar{k}|_1}\big( K_{\bar{k},\bar{s}}(f) \big) - S_{K_{\bar{k},\bar{s}}(f)} \big\|_\infty
\le 2^{-1} 2^{-\alpha} \, 2^{-\alpha(d-1)} \sum_{|\bar{k}|_1 \le m} 2^{-\alpha |\bar{k}|_1} \, 2^{-\alpha(m - |\bar{k}|_1)} = 2^{-1} 2^{-\alpha d} \, 2^{-\alpha m} \binom{m + d - 1}{d - 1}.
\]
This implies (2.13). From (2.12) and (2.13), the triangle inequality and the estimate $2^{-\alpha} B^d + 2^{-1} 2^{-\alpha d} \le B^d$ prove (2.11). $\square$

3 Manifold approximation

This section aims at constructing nonlinear methods of parametric manifold approximation of functions $f \in \mathring{U}^{\alpha,d}_\infty$. More precisely, we construct such a nonlinear method $Q_N(f) = G^*_N(\lambda^*_N(f))$ with mappings $\lambda^*_N$ and $G^*_N$ of the form (1.7), satisfying the upper and lower estimates (1.8)-(1.9) of the approximation error. In order to do this, we use the truncation $R_n(f)$ of the tensorized Faber series as an intermediate approximation. We then represent the difference $f - R_n(f)$ in a special form and approximate the terms in this representation by functions in the sets of finite cardinality constructed in the previous section.

For univariate functions $f \in \mathring{C}(I)$, let the operator $T_k$, $k \in \mathbb{N}$, be defined by
\[
T_k(f) := f - R_{k-1}(f),
\]
with the operator $R_k$ defined as in (2.5) and the convention $R_{-1} := 0$. From this definition, $T_0$ is the identity operator. Notice that for $f \in \mathring{U}^{\alpha,1}_\infty$, it holds that $\|T_k(f)\|_{H^\alpha_\infty(I)} \le 2$.
2. For amultivariate function f ∈ ˚ C ( I d ), the tensor product operator T k , k = ( k , . . . , k d ) ∈ N d , is defined by T k ( f ) := d Y j =1 T k j ( f ) , where the univariate operator T k j is applied to the univariate function f by considering f as a functionof variable x j with the other variables held fixed. It holds that k T k ( f ) k H α ∞ ( I d ) ≤ d .For n ∈ N we have f − R n ( f ) = X k ∈ N d , | k | >n q k ( f ) = X k >nkj ≥ ,j =2 ,...,d q k ( f ) + n X k =0 q k X | ¯ k | >n − k q ¯ k ( f ) ! = T ( n +1) e ( f ) + n X k =0 q k T ( n +1 − k ) e ( f ) + n − k X k =0 q k (cid:18) X | ¯ k | >n −| k | q ¯ k ( f ) (cid:19)! = T ( n +1) e ( f ) + n X k =0 q k T ( n +1 − k ) e ( f ) + X | k | ≤ n q k X | ¯ k | >n −| k | q ¯ k ( f ) ! . Continuing in this way, we arrive at f − R n ( f ) = T ( n +1) e ( f ) + n X k =0 q k T ( n +1 − k ) e ( f ) + . . . + X | k d − | ≤ n q k d − (cid:16) T ( n +1 −| k d − | ) e d ( f ) (cid:17) = T ( n +1) e ( f ) + n X k =0 T ( n +1 − k ) e (cid:0) q k ( f ) (cid:1) + . . . + X | k d − | ≤ n T ( n +1 −| k d − | ) e d (cid:0) q k d − ( f ) (cid:1) . Putting F k j := T ( n +1 −| k j | ) e j +1 (cid:0) q k j ( f ) (cid:1) , j = 0 , , . . . , d − . we can write f − R n ( f ) = d − X j =0 X | k j | ≤ n F k j , (3.1)Let f ∈ ˚ U α,d ∞ be given. We will use this special representation to explicitly construct mappings λ ∗ N and G ∗ N of the form (1.7). To this end, caused by (3.1), we will preliminarly approximate T k ( f ).Put I k , s := × dj =1 I k j ,s j = × dj =1 [2 − k j s j , − k j ( s j + 1)] , k ∈ N d , s ∈ Z ( k ) , and T k , s ( f )( x ) := 2 α | k | − d (cid:0) T k ( f ) χ I k , s (cid:1)(cid:0) − k ( x + s ) (cid:1) . Since supp (cid:0) T k ( f ) χ I k , s (cid:1) ⊂ I k , s and k T k ( f ) χ I k , s k H α ∞ ( I d ) ≤ d , we have thatsupp (cid:0) T k , s ( f ) (cid:1) ⊂ I d , T k , s ( f ) ∈ ˚ U α,d ∞ . (3.2)13his allows us to apply Lemma 2.5 to the functions T k , s ( f ). 
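The representation (3.1) rests on the fact that the index sets behind the terms \(F_{\boldsymbol k^j}\) partition \(\{\boldsymbol k : |\boldsymbol k|_1 > n\}\): each such \(\boldsymbol k\) has a unique \(j\) with \(|\boldsymbol k^j|_1 \le n\) and \(|\boldsymbol k^{j+1}|_1 > n\). A quick numerical sanity check of this partition, with hypothetical toy sizes (the cap `K` only makes the enumeration finite and is not part of the paper):

```python
from itertools import product

# Toy check of the index-set partition behind (3.1): every k in N_0^d with
# |k|_1 > n is counted exactly once by some j in 0..d-1, namely the unique j
# with k_1 + ... + k_j <= n and k_1 + ... + k_{j+1} > n.
d, n, K = 3, 4, 12  # K caps each coordinate so the check is finite

for k in product(range(K + 1), repeat=d):
    if sum(k) <= n:
        continue  # these Faber blocks stay in the truncation R_n(f)
    hits = [j for j in range(d)
            if sum(k[:j]) <= n and sum(k[:j + 1]) > n]
    # exactly one group F_{k^j} receives the block q_k(f)
    assert len(hits) == 1
print("partition of {|k|_1 > n} verified for d =", d)
```

Since the prefix sums \(|\boldsymbol k^0|_1 = 0 \le n < |\boldsymbol k^d|_1\) are nondecreasing, there is exactly one crossing index, which is what the assertion confirms.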
Namely, according to this lemma we explicitly construct the function \(S_m(T_{\boldsymbol k,\boldsymbol s}(f))\) by formula (2.10), so that by (2.11)
\[
\big\|T_{\boldsymbol k,\boldsymbol s}(f) - S_m(T_{\boldsymbol k,\boldsymbol s}(f))\big\|_\infty \le B^d\, 2^{-\alpha m}\binom{m+d}{d-1}. \tag{3.3}
\]
Define
\[
S_{\boldsymbol k,m}(f)(x) := 2^{-\alpha|\boldsymbol k|_1 + d} \sum_{\boldsymbol s\in Z(\boldsymbol k)} S_m\big(T_{\boldsymbol k,\boldsymbol s}(f)\big)\big(2^{\boldsymbol k}x - \boldsymbol s\big). \tag{3.4}
\]
We then get
\[
\big\|T_{\boldsymbol k}(f) - S_{\boldsymbol k,m}(f)\big\|_\infty
= \bigg\| \sum_{\boldsymbol s\in Z(\boldsymbol k)} \Big[T_{\boldsymbol k}(f)\chi_{I_{\boldsymbol k,\boldsymbol s}}(\cdot) - 2^{-\alpha|\boldsymbol k|_1+d}\, S_m\big(T_{\boldsymbol k,\boldsymbol s}(f)\big)\big(2^{\boldsymbol k}\cdot - \boldsymbol s\big)\Big] \bigg\|_\infty
= 2^{-\alpha|\boldsymbol k|_1+d} \bigg\| \sum_{\boldsymbol s\in Z(\boldsymbol k)} \Big[T_{\boldsymbol k,\boldsymbol s}(f) - S_m\big(T_{\boldsymbol k,\boldsymbol s}(f)\big)\Big]\big(2^{\boldsymbol k}\cdot - \boldsymbol s\big) \bigg\|_\infty,
\]
and, consequently, by the equality \(|Z(\boldsymbol k)| = 2^{|\boldsymbol k|_1}\) and (3.3), the estimate of the error of the approximation of \(T_{\boldsymbol k}(f)\) by \(S_{\boldsymbol k,m}(f)\)
\[
\big\|T_{\boldsymbol k}(f) - S_{\boldsymbol k,m}(f)\big\|_\infty \le (2B)^d\, 2^{-\alpha(m+|\boldsymbol k|_1)}\binom{m+d}{d-1}. \tag{3.5}
\]
Let \(\mathcal F_d(m)\) be the finite-dimensional subspace of \(\mathring{C}(I^d)\) of all functions \(g\) of the form
\[
g = \sum_{\boldsymbol k\in\mathbb{N}_0^d,\, |\boldsymbol k|_1\le m}\ \sum_{\boldsymbol s\in Z(\boldsymbol k)} a_{\boldsymbol k,\boldsymbol s}\,\varphi_{\boldsymbol k,\boldsymbol s}, \qquad a_{\boldsymbol k,\boldsymbol s}\in\mathbb R. \tag{3.6}
\]
It is easy to see that \(R_m(f) \in \mathcal F_d(m)\) for \(f \in \mathring{C}(I^d)\) and \(\dim \mathcal F_d(m) = \sum_{\ell=0}^m 2^\ell \binom{\ell+d-1}{d-1}\).

In the following, for any sufficiently large \(N \in \mathbb N\), we will explicitly construct the maps \(\lambda^*_N : \mathring{U}^{\alpha,d}_\infty \to \mathbb R^N\) and \(G^*_N : \mathbb R^N \to \mathcal F_d\big(\lfloor \log N\rfloor + \lfloor \log\log N\rfloor + 1\big)\) and estimate the approximation error \(\sup_{f \in \mathring{U}^{\alpha,d}_\infty} \|f - G^*_N(\lambda^*_N(f))\|_\infty\) in terms of \(N\).

For \(j = 0,1,\dots,d-1\) put
\[
M_{d-j}(m) := N_{d-j}(m)\,\dim\big(\mathcal F_{d-j}(m)\big),
\]
where, recall, \(N_{d-j}(m) := |\mathcal S^{\alpha,d-j}(m)|\), see Lemma 2.5.
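The dimension formula \(\dim \mathcal F_d(m) = \sum_{\ell=0}^m 2^\ell \binom{\ell+d-1}{d-1}\) just counts the basis functions \(\varphi_{\boldsymbol k,\boldsymbol s}\): there are \(\binom{\ell+d-1}{d-1}\) indices \(\boldsymbol k\) with \(|\boldsymbol k|_1 = \ell\), each carrying \(|Z(\boldsymbol k)| = 2^{\ell}\) shifts. A brute-force check of this count, with hypothetical toy parameters:

```python
from itertools import product
from math import comb

# Toy check of dim F_d(m) = sum_{l=0}^{m} 2^l * C(l+d-1, d-1): a basis
# function phi_{k,s} is indexed by k in N_0^d with |k|_1 <= m and by
# s in Z(k), where |Z(k)| = 2^{|k|_1}.
def dim_Fd(d, m):
    count = 0
    for k in product(range(m + 1), repeat=d):
        if sum(k) <= m:
            count += 2 ** sum(k)  # number of shifts s in Z(k)
    return count

d, m = 3, 5
closed_form = sum(2 ** l * comb(l + d - 1, d - 1) for l in range(m + 1))
assert dim_Fd(d, m) == closed_form
print("dim F_d(m) =", closed_form)
```

The same enumeration also makes the crude bound \(\dim\mathcal F_d(m) \le 2^{m+1}\binom{m+d-1}{d-1}\), used in (3.7) below, easy to spot-check.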
We have
\[
M_{d-j}(m) \le 3^{2^{m+1}\binom{m+d-j-1}{d-j-1}} \sum_{\ell=0}^m 2^\ell \binom{\ell+d-j-1}{d-j-1}
\le 3^{2^{m+1}\binom{m+d-j-1}{d-j-1}}\, 2^{m+1}\binom{m+d-j-1}{d-j-1}. \tag{3.7}
\]
Let \(\Gamma_j(n)\) be the set of all triples \((\boldsymbol k^j, \boldsymbol s^j, s_{j+1})\) satisfying the condition
\[
|\boldsymbol k^j|_1 \le n, \qquad \boldsymbol s^j \in Z(\boldsymbol k^j), \qquad s_{j+1} = 0,\dots,2^{n+1-|\boldsymbol k^j|_1}-1;
\]
in particular, \(\Gamma_0(n) = \{s_1 : 0 \le s_1 \le 2^{n+1}-1\}\). We have
\[
|\Gamma_j(n)| = \sum_{|\boldsymbol k^j|_1\le n} 2^{|\boldsymbol k^j|_1}\cdot 2^{n+1-|\boldsymbol k^j|_1} = 2^{n+1}\sum_{k=0}^n \binom{k+j-1}{j-1} = 2^{n+1}\binom{n+j}{j} \tag{3.8}
\]
for all \(j = 0,\dots,d-1\).

For \(\eta \in [N_{d-j}(m)] := \{1,\dots,N_{d-j}(m)\}\) and a sequence \(a^\eta = \big(a^\eta_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j}\big)_{|\bar{\boldsymbol\ell}^j|_1\le m,\ \bar{\boldsymbol t}^j\in Z(\bar{\boldsymbol\ell}^j)} \in \mathbb R^{\dim \mathcal F_{d-j}(m)}\), we put
\[
S_{a^\eta}(\bar x^j) := \sum_{|\bar{\boldsymbol\ell}^j|_1\le m}\ \sum_{\bar{\boldsymbol t}^j\in Z(\bar{\boldsymbol\ell}^j)} a^\eta_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j} \prod_{i=j+1}^d \varphi_{\ell_i,t_i}(x_i).
\]
If
\[
a := (a^\eta)_{\eta=1}^{N_{d-j}(m)} \in \mathbb R^{M_{d-j}(m)}
\quad\text{and}\quad
\theta := \big(\theta_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})}\big)_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_j(n)} \in [N_{d-j}(m)]^{|\Gamma_j(n)|},
\]
that is, the elements of \(\theta\) are numbered by the indices \((\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_j(n)\), we define the maps
\[
G^j_{m,n} : \mathbb R^{M_{d-j}(m)} \times [N_{d-j}(m)]^{|\Gamma_j(n)|} \to \mathcal F_d(m+n+1), \qquad j = 0,\dots,d-1, \tag{3.9}
\]
by putting, for \(\lambda^j := (a,\theta)\),
\[
G^j_{m,n}(\lambda^j) := \sum_{\eta=1}^{N_{d-j}(m)} \sum_{\substack{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1}):\\ \theta_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})}=\eta}} 2^{d-j}\,2^{-\alpha(n+1+j)}\, \varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S_{a^\eta}\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\, \bar x^{j+1}\big)
\]
if \(j = 1,\dots,d-1\), and
\[
G^0_{m,n}(\lambda^0) := \sum_{\eta=1}^{N_d(m)} \sum_{s_1:\ \theta_{s_1}=\eta} 2^{-\alpha(n+1)+d}\, S_{a^\eta}\big(2^{n+1}x_1 - s_1,\, \bar x^1\big).
\]
These maps are well-defined in the sense that the functions \(G^j_{m,n}(\lambda^j)\), \(j=0,\dots,d-1\), belong to \(\mathcal F_d(m+n+1)\). We prove this for the case \(j=1,\dots,d-1\); the case \(j=0\) can be carried out similarly. For the function \(\varphi_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j}(\bar x^j)\) with \(|\bar{\boldsymbol\ell}^j|_1\le m\), \(\bar{\boldsymbol t}^j\in Z(\bar{\boldsymbol\ell}^j)\), we have that
\[
\varphi_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j}\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\, \bar x^{j+1}\big)
= \bigg(\prod_{i=j+2}^d \varphi_{\ell_i,t_i}(x_i)\bigg)\varphi_{\ell_{j+1},t_{j+1}}\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1}\big)
\]
\[
= \bigg(\prod_{i=j+2}^d \varphi_{\ell_i,t_i}(x_i)\bigg)\varphi\big(2^{\ell_{j+1}+n+2-|\boldsymbol k^j|_1}x_{j+1} - 2^{\ell_{j+1}+1}s_{j+1} - 2t_{j+1}\big)
= \bigg(\prod_{i=j+2}^d \varphi_{\ell_i,t_i}(x_i)\bigg)\varphi_{\ell_{j+1}+n+1-|\boldsymbol k^j|_1,\ 2^{\ell_{j+1}}s_{j+1}+t_{j+1}}(x_{j+1})
\in \mathcal F_{d-j}\big(n+m+1-|\boldsymbol k^j|_1\big).
\]
Since \(S_{a^\eta}\) is a linear combination of the \(\varphi_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j}\) with \(|\bar{\boldsymbol\ell}^j|_1\le m\), \(\bar{\boldsymbol t}^j\in Z(\bar{\boldsymbol\ell}^j)\), we conclude that
\[
S_{a^\eta}\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\,\bar x^{j+1}\big) \in \mathcal F_{d-j}\big(n+m+1-|\boldsymbol k^j|_1\big),
\]
which implies
\[
\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S_{a^\eta}\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\,\bar x^{j+1}\big) \in \mathcal F_d(m+n+1).
\]
In the following lemma we explicitly construct a preliminary approximation of \(f - R_n(f)\) and estimate the approximation error.

**Lemma 3.1**
*Let \(\alpha\in(0,1]\), \(j=0,\dots,d-1\), and \(m,n\in\mathbb N\). Then we can explicitly construct maps*
\[
\lambda^j_{m,n} : \mathring{U}^{\alpha,d}_\infty \to \mathbb R^{M_{d-j}(m)} \times [N_{d-j}(m)]^{|\Gamma_j(n)|} \tag{3.10}
\]
*so that for every \(f\in\mathring{U}^{\alpha,d}_\infty\),*
\[
\bigg\| f - R_n(f) - \sum_{j=0}^{d-1} G^j_{m,n}\big(\lambda^j_{m,n}(f)\big) \bigg\|_\infty \le 2^{-\alpha+1}(2B)^d\, 2^{-\alpha(m+n)}\binom{m+n+d}{d-1},
\]
*where \(B\) is given in Lemma 2.2.*

*Proof.* **Step 1.**
We first construct an auxiliary approximation and estimate the approximation error for all \(F_{\boldsymbol k^j}\), \(j=0,1,\dots,d-1\). For \(F_{\boldsymbol k^0} = T_{(n+1)e^1}(f)\) we take \(S_{(n+1)e^1,m}(f)\) given by formula (3.4) and apply (3.5) to obtain the estimate
\[
\big\|F_{\boldsymbol k^0} - S_{(n+1)e^1,m}(f)\big\|_\infty \le (2B)^d\, 2^{-\alpha(m+|(n+1)e^1|_1)}\binom{m+d}{d-1} = 2^{-\alpha}(2B)^d\, 2^{-\alpha(n+m)}\binom{m+d}{d-1}. \tag{3.11}
\]
For \(j=1,\dots,d-1\), we rewrite \(F_{\boldsymbol k^j}(x)\) in the form
\[
F_{\boldsymbol k^j}(x) = T_{(n+1-|\boldsymbol k^j|_1)e^{j+1}}\Bigg( \sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} \bigg( \frac{(-1)^j}{2^j} \prod_{i=1}^{j}\Delta^2_{2^{-k_i-1},i}\, f\big(2^{-\boldsymbol k^j}\boldsymbol s^j,\, \bar x^j\big) \bigg)\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j) \Bigg)
= \sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} \varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, T_{\bar{\boldsymbol k}^*_j}\bigg( \frac{(-1)^j}{2^j} \prod_{i=1}^{j}\Delta^2_{2^{-k_i-1},i}\, f\big(2^{-\boldsymbol k^j}\boldsymbol s^j,\, \bar x^j\big) \bigg),
\]
where \(\bar{\boldsymbol k}^*_j := \big(n+1-|\boldsymbol k^j|_1,\, 0,\dots,0\big) \in \mathbb N_0^{d-j}\). Notice that the functions
\[
f_{\boldsymbol k^j,\boldsymbol s^j}(\bar x^j) := \frac{(-1)^j}{2^j}\, 2^{\alpha(j+|\boldsymbol k^j|_1)} \prod_{i=1}^{j}\Delta^2_{2^{-k_i-1},i}\, f\big(2^{-\boldsymbol k^j}\boldsymbol s^j,\, \bar x^j\big)
\]
are functions of the variable \(\bar x^j\in I^{d-j}\), and their norms in \(H^\alpha_\infty(I^{d-j})\) (with respect to \(\bar x^j\)) satisfy the inequality \(\|f_{\boldsymbol k^j,\boldsymbol s^j}\|_{H^\alpha_\infty(I^{d-j})} \le 1\). Again, for \(T_{\bar{\boldsymbol k}^*_j}(f_{\boldsymbol k^j,\boldsymbol s^j})\) with \(\bar{\boldsymbol k}^*_j\in\mathbb N_0^{d-j}\) we take \(S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})\) given by formula (3.4) and apply (3.5) to get the estimate
\[
\big\|T_{\bar{\boldsymbol k}^*_j}(f_{\boldsymbol k^j,\boldsymbol s^j}) - S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})\big\|_\infty \le (2B)^{d-j}\, 2^{-\alpha(m+|\bar{\boldsymbol k}^*_j|_1)}\binom{m+d-j}{d-j-1}. \tag{3.12}
\]
For the approximation of \(F_{\boldsymbol k^j}\), \(j=0,\dots,d-1\), we take the functions \(S_{F_{\boldsymbol k^j}}\) defined by the explicit formulas
\[
S_{F_{\boldsymbol k^0}}(x) := S_{(n+1)e^1,m}(f)(x); \qquad
S_{F_{\boldsymbol k^j}}(x) := \sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} 2^{-\alpha(j+|\boldsymbol k^j|_1)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})(\bar x^j), \quad j=1,\dots,d-1. \tag{3.13}
\]
We have the estimates, by (3.12), for every \(j=1,\dots,d-1\) and \(x\in I^d\),
\[
\big|F_{\boldsymbol k^j}(x) - S_{F_{\boldsymbol k^j}}(x)\big|
= \bigg| \sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} 2^{-\alpha(j+|\boldsymbol k^j|_1)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\Big(T_{\bar{\boldsymbol k}^*_j}(f_{\boldsymbol k^j,\boldsymbol s^j}) - S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})\Big)(\bar x^j)\bigg|
\]
\[
\le 2^{-\alpha(j+|\boldsymbol k^j|_1)} \sup_{\boldsymbol s^j\in Z(\boldsymbol k^j)} \big\|T_{\bar{\boldsymbol k}^*_j}(f_{\boldsymbol k^j,\boldsymbol s^j}) - S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})\big\|_\infty
\le (2B)^{d-j}\, 2^{-\alpha(j+|\boldsymbol k^j|_1)}\, 2^{-\alpha(m+|\bar{\boldsymbol k}^*_j|_1)}\binom{m+d-j}{d-j-1}
= 2^{-\alpha}(2B)^{d-j}\, 2^{-\alpha j}\, 2^{-\alpha(m+n)}\binom{m+d-j}{d-j-1}.
\]
From the last estimate and (3.11) we deduce that
\[
\big\|F_{\boldsymbol k^j} - S_{F_{\boldsymbol k^j}}\big\|_\infty \le 2^{-\alpha}(2B)^{d-j}\, 2^{-\alpha j}\, 2^{-\alpha(m+n)}\binom{m+d-j}{d-j-1}, \qquad j = 0,1,\dots,d-1.
\]
Then we get
\[
\bigg\| f - R_n(f) - \sum_{j=0}^{d-1}\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}} \bigg\|_\infty
\le \sum_{j=0}^{d-1}\sum_{|\boldsymbol k^j|_1\le n} \big\|F_{\boldsymbol k^j} - S_{F_{\boldsymbol k^j}}\big\|_\infty
\le 2^{-\alpha}\, 2^{-\alpha(m+n)}\sum_{j=0}^{d-1}(2B)^{d-j}\, 2^{-j\alpha}\binom{m+d-j}{d-j-1}\binom{n+j}{j},
\]
where we have used \(\sum_{|\boldsymbol k^j|_1\le n} 1 = \binom{n+j}{j}\). By the inequalities \(\binom{m+d-j}{d-j-1}\binom{n+j}{j} \le \binom{m+n+d}{d-1}\) for \(j=0,\dots,d-1\) and \(B \ge 1\), the error of the approximation of \(f-R_n(f)\) by the function
\[
\sum_{j=0}^{d-1}\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}}
\]
can be estimated as
\[
\bigg\| f - R_n(f) - \sum_{j=0}^{d-1}\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}} \bigg\|_\infty
\le 2^{-\alpha}(2B)^d\, 2^{-\alpha(m+n)}\binom{m+n+d}{d-1}\sum_{j=0}^\infty (2B)^{-j}\, 2^{-j\alpha}
\le 2^{-\alpha+1}(2B)^d\, 2^{-\alpha(m+n)}\binom{m+n+d}{d-1}. \tag{3.14}
\]
**Step 2.**
Due to (3.14), to complete the proof it suffices to explicitly construct maps of the form (3.10) such that
\[
\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}} = G^j_{m,n}\big(\lambda^j_{m,n}(f)\big), \qquad j = 0,\dots,d-1. \tag{3.15}
\]
We deal with the cases \(j\in\{1,\dots,d-1\}\); the case \(j=0\) is carried out similarly with slight modification. For \(j\in\{1,\dots,d-1\}\), with (3.4) and (2.10), from (3.13) we can write
\[
\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}}(x)
= \sum_{|\boldsymbol k^j|_1\le n}\sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} 2^{-\alpha(j+|\boldsymbol k^j|_1)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S_{\bar{\boldsymbol k}^*_j,m}(f_{\boldsymbol k^j,\boldsymbol s^j})(\bar x^j)
\]
\[
= \sum_{|\boldsymbol k^j|_1\le n}\sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} 2^{d-j}\, 2^{-\alpha(j+|\boldsymbol k^j|_1+|\bar{\boldsymbol k}^*_j|_1)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j) \sum_{\bar{\boldsymbol s}^*_j\in Z(\bar{\boldsymbol k}^*_j)} S_m\big(T_{\bar{\boldsymbol k}^*_j,\bar{\boldsymbol s}^*_j}(f_{\boldsymbol k^j,\boldsymbol s^j})\big)\big(2^{\bar{\boldsymbol k}^*_j}\bar x^j - \bar{\boldsymbol s}^*_j\big)
\]
\[
= \sum_{|\boldsymbol k^j|_1\le n}\sum_{\boldsymbol s^j\in Z(\boldsymbol k^j)} 2^{d-j}\, 2^{-\alpha(n+1+j)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j) \sum_{s_{j+1}=0}^{2^{n+1-|\boldsymbol k^j|_1}-1} S_m\big(T_{(n+1-|\boldsymbol k^j|_1,0,\dots,0),\,(s_{j+1},0,\dots,0)}(f_{\boldsymbol k^j,\boldsymbol s^j})\big)\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\, \bar x^{j+1}\big).
\]
Notice that by (3.2) the functions \(T_{(n+1-|\boldsymbol k^j|_1,0,\dots,0),\,(s_{j+1},0,\dots,0)}(f_{\boldsymbol k^j,\boldsymbol s^j})\) belong to \(\mathring{U}^{\alpha,d-j}_\infty\) and, therefore, by Lemma 2.5 the functions \(S_m\big(T_{(n+1-|\boldsymbol k^j|_1,0,\dots,0),\,(s_{j+1},0,\dots,0)}(f_{\boldsymbol k^j,\boldsymbol s^j})\big)\) belong to \(\mathcal S^{\alpha,d-j}(m)\). Numbering the elements of \(\mathcal S^{\alpha,d-j}(m)\) as \(\mathcal S^{\alpha,d-j}(m) = \big\{S^{d-j}_\eta\big\}_{\eta=1}^{N_{d-j}(m)}\), we obtain
\[
\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}}(x) = \sum_{\eta=1}^{N_{d-j}(m)} \sum_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_\eta} 2^{d-j}\, 2^{-\alpha(n+1+j)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S^{d-j}_\eta\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\,\bar x^{j+1}\big),
\]
where
\[
\Gamma_\eta := \Big\{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_j(n) : S_m\big(T_{(n+1-|\boldsymbol k^j|_1,0,\dots,0),\,(s_{j+1},0,\dots,0)}(f_{\boldsymbol k^j,\boldsymbol s^j})\big) = S^{d-j}_\eta \Big\}.
\]
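The regrouping above runs over the index set \(\Gamma_j(n)\), whose cardinality was computed in (3.8): for each admissible \(\boldsymbol k^j\) there are \(2^{|\boldsymbol k^j|_1}\) shifts \(\boldsymbol s^j\) and \(2^{n+1-|\boldsymbol k^j|_1}\) values of \(s_{j+1}\), so every \(\boldsymbol k^j\) contributes \(2^{n+1}\) triples. A quick enumeration check of that count, with hypothetical toy sizes:

```python
from itertools import product
from math import comb

# Toy check of |Gamma_j(n)| = 2^{n+1} * C(n+j, j) from (3.8): triples
# (k^j, s^j, s_{j+1}) with |k^j|_1 <= n, s^j in Z(k^j) (2^{|k^j|_1} choices)
# and s_{j+1} in {0, ..., 2^{n+1-|k^j|_1} - 1}.
def gamma_size(j, n):
    total = 0
    for k in product(range(n + 1), repeat=j):
        if sum(k) <= n:
            total += 2 ** sum(k) * 2 ** (n + 1 - sum(k))  # = 2^{n+1} per k^j
    return total

for j, n in [(1, 3), (2, 4), (3, 5)]:
    assert gamma_size(j, n) == 2 ** (n + 1) * comb(n + j, j)
print("|Gamma_j(n)| identity verified")
```

For \(j=0\) the tuple \(\boldsymbol k^0\) is empty and the count degenerates to \(|\Gamma_0(n)| = 2^{n+1}\), matching the definition of \(\Gamma_0(n)\) above.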
We define the map \(\lambda^j_{m,n} : \mathring{U}^{\alpha,d}_\infty \to \mathbb R^{M_{d-j}(m)}\times[N_{d-j}(m)]^{|\Gamma_j(n)|}\) by
\[
\lambda^j_{m,n}(f) := (a^\eta)_{\eta=1}^{N_{d-j}(m)} \times \big(\theta_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})}(f)\big)_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_j(n)},
\]
where \(a^\eta = \big(a^\eta_{\bar{\boldsymbol\ell}^j,\bar{\boldsymbol t}^j}\big)_{|\bar{\boldsymbol\ell}^j|_1\le m,\ \bar{\boldsymbol t}^j\in Z(\bar{\boldsymbol\ell}^j)}\) are the coefficients of \(S^{d-j}_\eta\) in its Faber series representation, and \(\theta_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})}(f) = \eta\) if \((\boldsymbol k^j,\boldsymbol s^j,s_{j+1})\in\Gamma_\eta\). Hence we can write
\[
\sum_{|\boldsymbol k^j|_1\le n} S_{F_{\boldsymbol k^j}}(x) = \sum_{\eta=1}^{N_{d-j}(m)} \sum_{\substack{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1}):\\ \theta_{(\boldsymbol k^j,\boldsymbol s^j,s_{j+1})}(f)=\eta}} 2^{d-j}\, 2^{-\alpha(n+1+j)}\,\varphi_{\boldsymbol k^j,\boldsymbol s^j}(\boldsymbol x^j)\, S^{d-j}_\eta\big(2^{n+1-|\boldsymbol k^j|_1}x_{j+1}-s_{j+1},\,\bar x^{j+1}\big) = G^j_{m,n}\big(\lambda^j_{m,n}(f)\big).
\]
This proves (3.15). \(\square\)

Let the map \(G^R_n : \mathbb R^{\dim\mathcal F_d(n)} \to \mathcal F_d(n)\) be defined, for \(\lambda^R = (\lambda_{\boldsymbol k,\boldsymbol s})_{|\boldsymbol k|_1\le n,\ \boldsymbol s\in Z(\boldsymbol k)}\), by
\[
G^R_n(\lambda^R) := \sum_{|\boldsymbol k|_1\le n}\sum_{\boldsymbol s\in Z(\boldsymbol k)} \lambda_{\boldsymbol k,\boldsymbol s}\,\varphi_{\boldsymbol k,\boldsymbol s}.
\]
We extend the map \(G^j_{m,n}\) defined in (3.9) to a map \(G^j_{m,n} : \mathbb R^{M_{d-j}(m)}\times\mathbb R^{|\Gamma_j(n)|} \to \mathcal F_d(m+n+1)\) (the extension again denoted by \(G^j_{m,n}\)) by assigning \(G^j_{m,n}(\lambda^j) = 0\) if \(\lambda^j \notin \mathbb R^{M_{d-j}(m)}\times[N_{d-j}(m)]^{|\Gamma_j(n)|}\). Denote
\[
N_{m,n} := \dim\mathcal F_d(n) + \sum_{j=0}^{d-1}\big(M_{d-j}(m) + |\Gamma_j(n)|\big) \tag{3.16}
\]
and \(\lambda := \big(\lambda^R, \lambda^0,\dots,\lambda^{d-1}\big) \in \mathbb R^{N_{m,n}}\), where \(\lambda^R\in\mathbb R^{\dim\mathcal F_d(n)}\) and \(\lambda^j\in\mathbb R^{M_{d-j}(m)}\times\mathbb R^{|\Gamma_j(n)|}\). We define the map
\[
G_{m,n} : \mathbb R^{N_{m,n}} \to \mathcal F_d(m+n+1) \tag{3.17}
\]
by
\[
G_{m,n}(\lambda) := G^R_n(\lambda^R) + \sum_{j=0}^{d-1} G^j_{m,n}(\lambda^j), \tag{3.18}
\]
and put \(K_{m,n} := \dim\big(\mathcal F_d(m+n+1)\big)\).

**Corollary 3.2**
*Let \(\alpha\in(0,1]\) and \(d,m,n\in\mathbb N\). Then we can explicitly construct a map \(\lambda_{m,n} : \mathring{U}^{\alpha,d}_\infty \to \mathbb R^{N_{m,n}}\) so that*
\[
\sup_{f\in\mathring{U}^{\alpha,d}_\infty} \big\|f - G_{m,n}\big(\lambda_{m,n}(f)\big)\big\|_\infty \le 2^{-\alpha+1}(2B)^d\, 2^{-\alpha(m+n)}\binom{m+n+d}{d-1}, \tag{3.19}
\]
*and hence*
\[
d_{K_{m,n}}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big) \le 2^{-\alpha+1}(2B)^d\, 2^{-\alpha(m+n)}\binom{m+n+d}{d-1}.
\]

*Proof.* We define the operator \(\lambda_{m,n}\) by \(\lambda_{m,n} := \big(\lambda^R_n, \lambda^0_{m,n},\dots,\lambda^{d-1}_{m,n}\big)\), where the operators \(\lambda^j_{m,n}\) are as in Lemma 3.1 and the operator \(\lambda^R_n\) is defined by \(\lambda^R_n(f) := (\lambda_{\boldsymbol k,\boldsymbol s}(f))_{|\boldsymbol k|_1\le n,\ \boldsymbol s\in Z(\boldsymbol k)}\). Then the upper bound (3.19) is already proved in Lemma 3.1. For the second bound, since \(G_{m,n}\) maps into the linear space \(\mathcal F_d(m+n+1)\) of dimension \(K_{m,n}\), by the definition of the Kolmogorov width we derive that
\[
d_{K_{m,n}}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big) \le \sup_{f\in\mathring{U}^{\alpha,d}_\infty}\big\|f - G_{m,n}\big(\lambda_{m,n}(f)\big)\big\|_\infty. \tag{3.20}
\]
\(\square\)

**Lemma 3.3** *For \(n,m,d\in\mathbb N\) it holds the inequality*
\[
N_{m,n} \le 3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\binom{m+d}{d-1} + 2^{n+2}\binom{n+d}{d-1}.
\]

*Proof.* From (3.16), (3.7), and (3.8) we have that
\[
N_{m,n} \le \dim\mathcal F_d(n) + \sum_{j=0}^{d-1}\bigg( 3^{2^{m+1}\binom{m+d-j-1}{d-j-1}}\, 2^{m+1}\binom{m+d-j-1}{d-j-1} + 2^{n+1}\binom{n+j}{j} \bigg)
\]
\[
\le \sum_{\ell=0}^n 2^\ell\binom{\ell+d-1}{d-1} + 3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\sum_{j=0}^{d-1}\binom{m+d-j-1}{d-j-1} + 2^{n+1}\sum_{j=0}^{d-1}\binom{n+j}{j}
\]
\[
\le 2^{n+1}\binom{n+d-1}{d-1} + 3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\binom{m+d}{d-1} + 2^{n+1}\binom{n+d}{d-1}
\le 3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\binom{m+d}{d-1} + 2^{n+2}\binom{n+d}{d-1},
\]
where in the third estimate we have used \(\sum_{j=k}^{\ell}\binom{j}{k} = \binom{\ell+1}{k+1}\). \(\square\)

We are now able to explicitly construct a method \(Q_N(f) = G^*_N(\lambda^*_N(f))\) with mappings \(\lambda^*_N\) and \(G^*_N\) of the form (1.7), satisfying the upper and lower estimates (1.8)–(1.9) of the approximation error.
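The proof of Lemma 3.3 rests on two instances of the hockey-stick identity for binomial coefficients; both are easy to confirm numerically over a small range (the ranges below are arbitrary toy choices):

```python
from math import comb

# The proof of Lemma 3.3 uses the hockey-stick identity
#   sum_{j=k}^{l} C(j, k) = C(l+1, k+1);
# a direct check over a small range:
for k in range(6):
    for l in range(k, 12):
        assert sum(comb(j, k) for j in range(k, l + 1)) == comb(l + 1, k + 1)

# It also uses sum_{j=0}^{d-1} C(n+j, j) = C(n+d, d-1), which is the same
# identity after reindexing (C(n+j, j) = C(n+j, n)):
for d in range(1, 7):
    for n in range(8):
        assert sum(comb(n + j, j) for j in range(d)) == comb(n + d, d - 1)
print("hockey-stick identities verified")
```

The second identity is exactly the step that collapses the sum over \(j\) of \(\binom{n+j}{j}\) into the single term \(\binom{n+d}{d-1}\) in the displayed estimate.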
Recall that \(\mathcal F_d(m)\) is the finite-dimensional subspace of \(\mathring{C}(I^d)\) of functions of the form (3.6), that \(R_m(f)\in\mathcal F_d(m)\) for \(f\in\mathring{C}(I^d)\), and that \(\dim\mathcal F_d(m) = \sum_{\ell=0}^m 2^\ell\binom{\ell+d-1}{d-1}\).

**Theorem 3.4**
*Let \(\alpha\in(0,1]\) and \(d\in\mathbb N\). Then for every \(N\in\mathbb N\) with*
\[
N \ge N(d) := 3^{2^{d+2}\binom{2d}{d-1}}\, 2^{d+3}\binom{2d+1}{d-1}, \tag{3.21}
\]
*we can explicitly determine a number \(m^*(N)\in\mathbb N\) with \(m^*(N) \le \lfloor\log N\rfloor + \lfloor\log\log N\rfloor + 1\), and explicitly construct maps \(\lambda^*_N : \mathring{U}^{\alpha,d}_\infty\to\mathbb R^N\) and \(G^*_N : \mathbb R^N\to\mathcal F_d\big(m^*(N)\big)\) so that*
\[
N \le M(N) := \dim\mathcal F_d\big(m^*(N)\big), \qquad
\frac{((d-1)!)^2}{(4d)^{d-1}}\,\frac{N\log N}{(\log\log N)^{d-1}} \le M(N) \le (12d)^{d-1}(d-1)!\,\frac{N\log N}{(\log\log N)^{d-1}}, \tag{3.22}
\]
*and*
\[
\sup_{f\in\mathring{U}^{\alpha,d}_\infty}\big\|f - G^*_N(\lambda^*_N(f))\big\|_\infty \le C_\alpha\bigg(\frac{K^{d-1}}{(d-1)!}\bigg)^{2\alpha+1}(\log N)^{(d-1)(\alpha+1)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}, \tag{3.23}
\]
*where \(K := \big(4^\alpha/(2^\alpha-1)\big)^{1/(2\alpha+1)}\) and \(C_\alpha := 2^{\alpha+2}/(2^\alpha-1)\). Moreover, if \(\alpha < 1/2\),*
\[
\sup_{f\in\mathring{U}^{\alpha,d}_\infty}\big\|f - G^*_N(\lambda^*_N(f))\big\|_\infty \ge d_{M(N)}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big) \ge C_{d,\alpha}\,(\log N)^{(d-1)(\alpha+\frac12)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}. \tag{3.24}
\]

*Proof.* We prove the case \(d\ge 2\); the case \(d=1\) is carried out similarly. Fix a number \(N\in\mathbb N\) satisfying the condition (3.21). We define \(m^*(N) := n(N) + m(N) + 1\), where \(n = n(N)\) and \(m = m(N)\) are chosen so that
\[
2^{n+2}\binom{n+d}{d-1} \le \frac N2 < 2^{n+3}\binom{n+d+1}{d-1} \tag{3.25}
\]
and
\[
3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\binom{m+d}{d-1} \le \frac N2 < 3^{2^{m+2}\binom{m+d}{d-1}}\, 2^{m+2}\binom{m+d+1}{d-1}.
\]
From this choice of \(m,n\) and Lemma 3.3 we can see that \(N_{m,n} \le N\). This allows us to define \(\lambda^*_N := \lambda_{m,n}\) and \(G^*_N\) as an extension of \(G_{m,n}\) from \(\mathbb R^{N_{m,n}}\) to \(\mathbb R^N\) with the chosen \(m,n\), where \(\lambda_{m,n} : \mathring{U}^{\alpha,d}_\infty\to\mathbb R^{N_{m,n}}\) is as in Corollary 3.2 and \(G_{m,n} : \mathbb R^{N_{m,n}}\to\mathcal F_d(m+n+1)\) as in (3.17)–(3.18).

Let us first prove the dimension-dependent upper estimate (3.23). The choice of \(m,n\) and the assumption (3.21) imply \(n \ge m \ge d+1\). With \(n \ge d+1\) we have the estimates
\[
2^3 N^{-1}\frac{n^{d-1}}{(d-1)!} \le 2^{-n} < 2^4 N^{-1}\binom{n+d+1}{d-1} < N^{-1}\frac{(2n)^{d-1}}{(d-1)!}, \qquad n \le \log N \le dn. \tag{3.26}
\]
With \(m \ge d+1\) we deduce
\[
3^{2^{m+1}\binom{m+d-1}{d-1}}\, 2^{m+1}\binom{m+d}{d-1} \le \frac N2 < 3^{2^{m+2}\binom{m+d}{d-1}}\, 2^{m+2}\binom{m+d+1}{d-1} \le 4^{2^{m+2}\binom{m+d+1}{d-1}},
\]
where we have used \(3^t\, t \le 4^t\) for \(t \ge 16\). Hence
\[
2^{m+1}\binom{m+d-1}{d-1}\log 3 \le \log N < 2^{m+4}\binom{m+d+1}{d-1},
\]
which implies
\[
2\log 3\,(\log N)^{-1}\frac{m^{d-1}}{(d-1)!} \le 2^{-m} \le (\log N)^{-1}\frac{(2m)^{d-1}}{(d-1)!}, \qquad m \le \log\log N \le dm. \tag{3.27}
\]
Consequently, we obtain
\[
2^{-\alpha(m+n)}\binom{m+n+d}{d-1} \le 2^{-\alpha(m+n)}\frac{(3n)^{d-1}}{(d-1)!}
\le \bigg(N^{-1}\frac{(2\log N)^{d-1}}{(d-1)!}\bigg)^{\alpha}\bigg((\log N)^{-1}\frac{(2\log\log N)^{d-1}}{(d-1)!}\bigg)^{\alpha}\frac{(3\log N)^{d-1}}{(d-1)!}
\]
\[
\le \frac{(3\cdot 4^{\alpha})^{d-1}}{((d-1)!)^{2\alpha+1}}\,(\log N)^{(d-1)(\alpha+1)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}.
\]
This together with the upper bound (3.19) proves the upper bound (3.23).

Next, we prove (3.22). With \(M(N) = \dim\mathcal F_d\big(m^*(N)\big)\), from the choice of \(n\) as in (3.25) we obtain
\[
N < 2^{n+4}\binom{n+d+1}{d-1} \le \sum_{\ell=0}^{m+n+1} 2^\ell\binom{\ell+d-1}{d-1} = M(N).
\]
Moreover, taking account of \(n \ge m \ge d+1\) we derive that
\[
2^{m+n+1}\Big(\frac{n}{d-1}\Big)^{d-1} \le 2^{m+n+1}\binom{m+n+d}{d-1} \le M(N) \le 2^{m+n+2}\binom{m+n+d}{d-1} \le \frac{(3n)^{d-1}}{(d-1)!}\, 2^{n+m+2},
\]
which by (3.26) and (3.27) implies
\[
\frac{((d-1)!)^2}{(4d)^{d-1}}\,\frac{N\log N}{m^{d-1}} \le M(N) \le 3^{d-1}(d-1)!\,\frac{N\log N}{m^{d-1}},
\]
and therefore (3.22).

We finally verify the lower bound (3.24). From the known inequality
\[
d_M\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big) \gtrsim M^{-\alpha}(\log M)^{(d-1)(\alpha+\frac12)}
\]
(see, e.g., [11, Theorem 4.3.11]) we obtain that
\[
d_{M(N)}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big) \ge C_{\alpha,d}\,\big(M(N)\big)^{-\alpha}\big(\log M(N)\big)^{(d-1)(\alpha+\frac12)}
\ge C_{\alpha,d}\,\bigg(\frac{N\log N}{(\log\log N)^{d-1}}\bigg)^{-\alpha}(\log N)^{(d-1)(\alpha+\frac12)}
\ge C_{\alpha,d}\,(\log N)^{(d-1)(\alpha+\frac12)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}.
\]
Now, since \(M(N) = K_{m(N),n(N)}\), the lower bound (3.24) follows from (3.20). \(\square\)

From the left inequality in (1.4) and Theorem 3.4 we deduce the following upper and lower bounds for \(d_{N,M(N)}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big)\).

**Corollary 3.5**
*Let \(\alpha\in(0,\frac12)\), \(d\in\mathbb N\), and let \(N\in\mathbb N\) be sufficiently large. With \(M(N) = \big\lfloor N(\log N)(\log\log N)^{-(d-1)}\big\rfloor\) we have*
\[
(\log N)^{(d-1)(\alpha+\frac12)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}
\lesssim d_{N,M(N)}\big(\mathring{U}^{\alpha,d}_\infty, L_\infty\big)
\lesssim (\log N)^{(d-1)(\alpha+1)}(N\log N)^{-\alpha}(\log\log N)^{(d-1)\alpha}.
\]

In the case \(d = 2\) we obtain the right asymptotic order of \(d_{N,M(N)}\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big)\), as stated in the following theorem.

**Theorem 3.6**
*Let \(\alpha\in(0,\frac12)\). With \(M(N) := \big\lfloor N(\log N)(\log\log N)^{-1}\big\rfloor\) we have*
\[
d_{N,M(N)}\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big) \asymp \sup_{f\in\mathring{U}^{\alpha,2}_\infty}\big\|f - G^*_N(\lambda^*_N(f))\big\|_\infty \asymp N^{-\alpha}\log N\,(\log\log N)^{\alpha}. \tag{3.28}
\]

*Proof.* From (3.23) we immediately get the upper bound in (3.28):
\[
d_{N,M(N)}\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big) \lesssim \sup_{f\in\mathring{U}^{\alpha,2}_\infty}\big\|f - G^*_N(\lambda^*_N(f))\big\|_\infty \lesssim N^{-\alpha}\log N\,(\log\log N)^{\alpha}.
\]
To prove the lower bound we use the known asymptotic order of the Kolmogorov \(M\)-widths for \(\alpha\in(0,\frac12)\):
\[
d_M\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big) \asymp M^{-\alpha}(\log M)^{\alpha+1},
\]
see, e.g., [11, Theorem 4.3.14] and the references there. Hence, with \(n = n(N)\) and \(m = m(N)\) defined as in the proof of Theorem 3.4 for \(d = 2\), we derive the lower bound:
\[
\sup_{f\in\mathring{U}^{\alpha,2}_\infty}\big\|f - G^*_N(\lambda^*_N(f))\big\|_\infty
\ge d_{N,M(N)}\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big)
\ge d_{M(N)}\big(\mathring{U}^{\alpha,2}_\infty, L_\infty(I^2)\big)
\ge C_\alpha\, M(N)^{-\alpha}\big(\log M(N)\big)^{\alpha+1}
\ge C_\alpha\, N^{-\alpha}\log N\,(\log\log N)^{\alpha}.
\]
\(\square\)

**Acknowledgments.**
This work was funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under Grant No. 102.01-2020.03. A part of this work was done when the authors were working at the Vietnam Institute for Advanced Study in Mathematics (VIASM). They would like to thank the VIASM for providing a fruitful research environment and excellent working conditions.
**References**

[1] H.-J. Bungartz and M. Griebel. Sparse grids. Acta Numer., 13:147–269, 2004.

[2] A. Chernov and D. Dũng. New explicit-in-dimension estimates for the cardinality of high-dimensional hyperbolic crosses and approximation of functions having mixed smoothness. J. Complexity, 32:92–121, 2016.

[3] F. Cobos, T. Kühn, and W. Sickel. Optimal approximation of Sobolev functions in the sup-norm. J. Funct. Anal., 270:4196–4212, 2016.

[4] F. Cobos, T. Kühn, and W. Sickel. On optimal approximation in periodic Besov spaces. J. Math. Anal. Appl., 474:1441–1462, 2019.

[5] A. Cohen, R. DeVore, G. Petrova, and P. Wojtaszczyk. Optimal stable nonlinear approximation. arXiv:2009.09907, 2020.

[6] D. Dũng and M. Griebel. Hyperbolic cross approximation in infinite dimensions. J. Complexity, 33:33–88, 2016.

[7] D. Dũng and T. Ullrich. N-widths and ε-dimensions for high-dimensional approximations. Found. Comput. Math., 13:965–1003, 2013.

[8] D. Dũng. B-spline quasi-interpolant representations and sampling recovery of functions with mixed smoothness. J. Complexity, 27:541–567, 2011.

[9] D. Dũng. Sampling and cubature on sparse grids based on a B-spline quasi-interpolation. Found. Comput. Math., 16:1193–1240, 2016.

[10] D. Dũng, M. Griebel, V. N. Huy, and C. Rieger. ε-dimension in infinite dimensional hyperbolic cross approximation and application to parametric elliptic PDEs. J. Complexity, 46:66–89, 2018.

[11] D. Dũng, V. N. Temlyakov, and T. Ullrich. Hyperbolic Cross Approximation. Advanced Courses in Mathematics - CRM Barcelona. Birkhäuser/Springer, 2018.

[12] D. Dũng and M. X. Thao. Dimension-dependent error estimates for sampling recovery on Smolyak grids based on B-spline quasi-interpolation. J. Approx. Theory, 250:185–205, 2020.

[13] I. Daubechies, R. DeVore, S. Foucart, B. Hanin, and G. Petrova. Nonlinear approximation and (deep) ReLU networks. arXiv:1905.02199, 2019.

[14] R. A. DeVore, R. Howard, and C. Micchelli. Optimal non-linear approximation. Manuscripta Math., 63:469–478, 1989.

[15] G. Faber. Über stetige Funktionen. Math. Ann., 66:81–94, 1909.

[16] T. Kühn, S. Mayer, and T. Ullrich. Counting via entropy: new preasymptotics for the approximation numbers of Sobolev embeddings. SIAM J. Numer. Anal., 54:3625–3647, 2016.

[17] T. Kühn, W. Sickel, and T. Ullrich. Approximation numbers of Sobolev embeddings – sharp constants and tractability. J. Complexity, 30:95–116, 2014.

[18] T. Kühn, W. Sickel, and T. Ullrich. Approximation of mixed order Sobolev functions on the d-torus – asymptotics, preasymptotics and d-dependence. Constr. Approx., 42:353–398, 2015.

[19] S. M. Nikol'skii. Approximation of Functions of Several Variables and Embedding Theorems. Springer, Berlin, 1975.

[20] E. Novak and H. Woźniakowski. Tractability of Multivariate Problems, Volume I: Linear Information. EMS Tracts in Mathematics, Vol. 6. Eur. Math. Soc. Publ. House, Zürich, 2008.

[21] E. Novak and H. Woźniakowski. Tractability of Multivariate Problems, Volume II: Standard Information for Functionals. EMS Tracts in Mathematics, Vol. 12. Eur. Math. Soc. Publ. House, Zürich, 2010.

[22] D. Potts and M. Schmischke. Learning high-dimensional additive models on the torus. arXiv:1907.11412, 2019.

[23] H. Schmeisser and H. Triebel. Topics in Fourier Analysis and Function Spaces. Wiley, Chichester/New York, 1987.

[24] S. Smolyak. Quadrature and interpolation formulas for tensor products of certain classes of functions. Dokl. Akad. Nauk, 148:1042–1045, 1963.

[25] V. Temlyakov. Multivariate Approximation. Cambridge University Press, 2018.

[26] V. N. Temlyakov. An inequality for trigonometric polynomials and its application for estimating the entropy numbers. J. Complexity, 11:293–307, 1995.

[27] V. N. Temlyakov. An inequality for trigonometric polynomials and its application for estimating the Kolmogorov widths. East J. Approx., 2:253–262, 1996.

[28] H. Triebel. Bases in Function Spaces, Sampling, Discrepancy, Numerical Integration. European Math. Soc. Publishing House, Zürich, 2010.

[29] H. Triebel. Hybrid Function Spaces, Heat and Navier-Stokes Equations. European Mathematical Society, 2015.

[30] H. Tyagi and J. Vybiral. Learning general sparse additive models from point queries in high dimensions. Constr. Approx., 50:403–455, 2019.

[31] J. Vybiral. Function spaces with dominating mixed smoothness. Diss. Math., 436:1–73, 2006.

[32] D. Yarotsky. Quantified advantage of discontinuous weight selection in approximations with deep neural networks. arXiv:1705.01365, 2017.

[33] H. Yserentant. Regularity and Approximability of Electronic Wave Functions. Lecture Notes in Mathematics, Springer, 2010.

[34] C. Zenger. Sparse grids. In Parallel Algorithms for Partial Differential Equations, Notes Numer. Fluid Mech., Vol. 31, pages 241–251. Vieweg, Braunschweig, 1991.