Approximation of functions with small mixed smoothness in the uniform norm
Vladimir N. Temlyakov, Tino Ullrich*

University of South Carolina; Steklov Institute of Mathematics; Lomonosov Moscow State University; Moscow Center for Fundamental and Applied Mathematics; Faculty of Mathematics, 09107 Chemnitz, Germany

December 23, 2020
Abstract
In this paper we present results on asymptotic characteristics of multivariate function classes in the uniform norm. Our main interest is the approximation of functions with mixed smoothness parameter not larger than $1/2$. Our focus will be on the behavior of the best $m$-term trigonometric approximation as well as the decay of Kolmogorov and entropy numbers in the uniform norm. It turns out that these quantities share a few fundamental abstract properties, such as their behavior under real interpolation, so that they can be treated simultaneously. We start by proving estimates on finite rank convolution operators with range in a step hyperbolic cross. These results imply bounds for the corresponding function space embeddings by a well-known decomposition technique. The decay of the Kolmogorov numbers has direct implications for the problem of sampling recovery in $L_2$ in situations where recent results in the literature are not applicable because the corresponding approximation numbers are not square summable.

Keywords and phrases: Best $m$-term trigonometric approximation, Kolmogorov numbers, entropy numbers, small smoothness, uniform norm

Mathematics Subject Classification: 41A10, 41A25, 41A60, 41A63, 42A10, 68Q25, 94A20

1 Introduction

In this paper we provide new upper bounds for the best $m$-term trigonometric approximation ($\sigma_m$), the Kolmogorov numbers ($d_m$), and the entropy numbers ($e_m$) of multivariate function classes in the uniform norm. It is nowadays widely believed that the target space $L_\infty(\mathbb{T}^d)$ comes with additional difficulties and often requires new and involved techniques. Another challenge is the treatment of classes of periodic functions with small mixed smoothness (derivative or difference), where several questions concerning approximation and integration have not yet been settled.
We make progress towards the solution of the Outstanding Open Problems listed in [7]. Below we study the quantities $s_m(T)$, where $s_m(T) \in \{d_m(T), e_m(T), \sigma_m(T)\}$ and $T$ denotes an operator mapping into $L_\infty(\mathbb{T}^d)$. We follow the classical approach and start with new results for finite rank convolution operators $T = S_{Q_n}$, the orthogonal projection onto the trigonometric polynomials with frequencies in the dyadic step hyperbolic cross $Q_n \subset \mathbb{Z}^d$, defined by
\[
\varrho(s) := \big\{k \in \mathbb{Z}^d : [2^{s_j-1}] \le |k_j| < 2^{s_j},\ j = 1, \ldots, d\big\}, \tag{1.1}
\]
\[
Q_n := \bigcup_{\|s\|_1 \le n} \varrho(s). \tag{1.2}
\]
Namely, for $2 \le p < \infty$ it holds
\[
s_m(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}, \quad m \le |Q_n|.
\]
The result is based on the common real interpolation properties of all three asymptotic characteristics in connection with a "corner result" due to Pajor, Tomczak-Jaegermann [18], Belinskii [2] and Dunker, Kühn, Linde, Lifshits [10]. A corresponding corner result for the best $m$-term trigonometric approximation $\sigma_m$ in the univariate case was obtained by Belinskii [1], who used a probabilistic technique, and in the multivariate case by Temlyakov [28], who used the greedy approximation technique.

It is well known that the analysis of approximation problems for function classes with small mixed smoothness involves several technical difficulties, see for instance [36, Rem. 4.10] and [15] for the study of entropy numbers. Similar difficulties have already been observed for the quantities of numerical integration, see [35], where the bounds look similar. Indeed, these quantities serve as lower bounds for Kolmogorov numbers in $L_\infty$, which has been observed by Novak [17]. In this paper we give the asymptotic bounds
\[
s_m(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}
\]
as well as
\[
s_m(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim m^{-r}(\log m)^{d-1+r}
\]
for function classes with small mixed smoothness $1/p < r < 1/2$.

* Corresponding author: [email protected]
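The combinatorics of the step hyperbolic cross (1.1), (1.2) can be checked directly. The following small script is our own illustration (not part of the paper): it enumerates $Q_n$ by brute force for $d = 2$, compares with the disjoint-block count $|Q_n| = \sum_{\|s\|_1 \le n} |\varrho(s)|$, and prints $2^n n^{d-1}$, the order of $|Q_n|$ used throughout.

```python
from itertools import product

def block_index(k: int) -> int:
    # s(k) = 0 for k = 0; otherwise the unique s >= 1 with 2^(s-1) <= |k| < 2^s,
    # so that k lies in the dyadic block rho(s) from (1.1)
    return 0 if k == 0 else abs(k).bit_length()

def step_hyperbolic_cross(n: int, d: int) -> set:
    # brute force: all k in the bounding box with s(k_1) + ... + s(k_d) <= n
    R = 2 ** n
    return {k for k in product(range(-R, R + 1), repeat=d)
            if sum(block_index(kj) for kj in k) <= n}

def rho_cardinality(s) -> int:
    # |rho(s)| factorizes: one choice for s_j = 0 (namely k_j = 0), else 2^{s_j} choices
    out = 1
    for sj in s:
        out *= 1 if sj == 0 else 2 ** sj
    return out

d, n = 2, 5
cross = step_hyperbolic_cross(n, d)
by_blocks = sum(rho_cardinality(s) for s in product(range(n + 1), repeat=d)
                if sum(s) <= n)
print(len(cross), by_blocks, 2 ** n * n ** (d - 1))  # 321 321 160
```

Both counts agree because the blocks $\varrho(s)$ partition $Q_n$, and the last number confirms that $|Q_n|$ and $2^n n^{d-1}$ are of the same order here.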
In the endpoint situation $r = 1/2$ we encounter an additional $(\log\log m)^{3/2}$ factor, see Theorem 6.3 below. It is still open whether these bounds are sharp when $d > 2$. The reader can find a brief discussion of the case $d = 2$ in Remark 6.4 below. Thus, we obtain new results on three asymptotic characteristics – Kolmogorov numbers $d_m$, entropy numbers $e_m$, and best $m$-term approximations $\sigma_m$ – for two kinds of classes, $W^r_p$ and $H^r_p$, in the case of small smoothness $r \le 1/2$, when the error is evaluated in the uniform norm $L_\infty$. There is an extensive history of studying each of the above asymptotic characteristics. They were studied for large smoothness $r > 1/2$, for classes $W^r_p$, $H^r_p$, and for Besov classes $B^r_{p,\theta}$, where the error is evaluated in the $L_q$ norm, $1 \le q \le \infty$. We refer the reader to the two recent books [7] and [30] for a detailed historical discussion. For the $d_m$ see [7, Sect. 4.3] and [30, Sect. 5.3]. For the $e_m$ see [7, Chapt. 6] and [30, Chapt. 7]. Finally, for the $\sigma_m$ see [7, Chapt. 7] and [30, Chapt. 9]. In addition to the above books we mention the recent paper by Romanyuk [23].

We continue the investigation of asymptotic characteristics of classes of multivariate functions with small mixed smoothness started in [33]. There we concentrated on asymptotic characteristics from linear approximation theory – the Kolmogorov widths – and pointed out some applications of new results on the Kolmogorov widths to the sampling recovery problem. In this paper the focus is on asymptotic characteristics from nonlinear approximation theory – sparse approximation with respect to the trigonometric system – and on entropy numbers. These are interpreted as pseudo $s$-numbers sharing a few fundamental properties.
We use a classical decomposition machinery (similar to the one used in [33]), where we rely on finite rank operators ranging in subspaces of trigonometric polynomials with frequencies from hyperbolic crosses as building blocks. However, in contrast to [33], we heavily apply well-known tools from the interpolation theory of operators to analyze the finite rank operators. In [33] an elementary approach is used to estimate widths of function classes, based on the application of a standard cut-off operator to dyadic building blocks. Certainly, deep at the roots both approaches are related, since the cut-off operator is also used for computing $K$-functionals in real interpolation theory. However, the technical realizations of these approaches are different and may be interesting for different communities.

Recent observations regarding the problem of optimal sampling recovery of functions in $L_2$ bring classes with small mixed smoothness into focus again. Since several newly developed techniques only work for Hilbert-Schmidt operators [13], [16] or, more generally, in situations where certain asymptotic characteristics (approximation numbers) are square summable [14], we need new techniques in situations where this is not the case. Especially in the range of small smoothness we are far away from square summability. Nevertheless, multivariate function classes of this type are of interest, since, for instance, a mixed Hölder-Zygmund regularity $r \le 1/2$ falls into this scope. Recently, see [31], the sampling recovery error in $L_2$ was directly related to the Kolmogorov numbers in $L_\infty$. It seems that, especially in the case of small smoothness, this represents the only available tool at the moment apart from sparse grid methods. Surprisingly, as an application of our results on Kolmogorov numbers we show that any sparse grid technique performs asymptotically worse by a $\log$-factor with exponent growing with the dimension $d$.
This motivates further research in finding better constructive sampling algorithms.

The paper is organized as follows. In Sections 2 and 3 we define the asymptotic characteristics of interest in a framework of operators and pseudo $s$-numbers. This notion goes back to Pietsch [19]. We particularly pay attention to the real interpolation properties. Section 4 deals with the relevant function spaces with bounded mixed derivative or difference. Here we also give a new real interpolation formula. Afterwards, in Section 5 we establish first results for the orthogonal projection operators with respect to the (trigonometric) step hyperbolic crosses. These estimates are used to obtain the main results in Section 6 for function space embeddings into $L_\infty(\mathbb{T}^d)$. Finally, in Section 7 we discuss the obtained results and give applications to the problem of sampling recovery.

2 Asymptotic characteristics and pseudo s-numbers

In this section we introduce the asymptotic characteristics of interest, namely the Kolmogorov and entropy numbers as well as the error of best approximation with respect to an approximation scheme.

Definition 2.1 (Kolmogorov numbers). For Banach spaces
$A$, $B$ and a linear operator $T\colon A \to B$, we define the $m$-th Kolmogorov number as
\[
d_m(T\colon A \to B) := \inf_{\substack{L_{m-1} \subset B \\ \dim L_{m-1} < m}} \ \sup_{\|a\|_A \le 1} \ \inf_{b \in L_{m-1}} \|Ta - b\|_B.
\]

Lemma 2.2 (Properties of $d_m$). Let $A, B, C$ be Banach spaces and $S, T \in \mathcal{L}(A,B)$, $R \in \mathcal{L}(B,C)$. We have the following properties.

(K1) $\|T\|_{\mathcal{L}(A,B)} = d_1(T) \ge d_2(T) \ge \cdots \ge 0$,

(K2) For all $m_1, m_2 \in \mathbb{N}$, it holds $d_{m_1+m_2-1}(R \circ S) \le d_{m_1}(R)\, d_{m_2}(S)$.

(K3) For all $m_1, m_2 \in \mathbb{N}$, it holds $d_{m_1+m_2-1}(S + T) \le d_{m_1}(S) + d_{m_2}(T)$.

(K4) $d_m(T) = 0$ whenever $\mathrm{rank}(T) < m$.

Note that, except for (K4), these properties are shared by the dyadic entropy numbers $(e_m)_m$, which we define below. To incorporate also the dyadic entropy numbers into the framework, Pietsch introduced the notion of pseudo $s$-numbers. We use this notion here in a slightly different way.

Definition 2.3 (Entropy numbers). Let $T\colon A \to B$ be a linear operator between two Banach spaces $A, B$. The entropy numbers of $T$ are defined as
\[
e_m(T\colon A \to B) := \inf\Big\{\varepsilon > 0 : \exists\, b_1, \ldots, b_{2^{m-1}} \in B \text{ with } T(U_A) \subset \bigcup_{k=1}^{2^{m-1}} (b_k + \varepsilon \cdot U_B)\Big\}, \quad m \in \mathbb{N}.
\]

Let us finally recall the definition of the asymptotic quantity measuring the best approximation with respect to an approximation scheme. This notion goes back to Pietsch [20] and includes the case of the best $m$-term approximation with respect to a dictionary $\mathcal{D}$. We will use it later for the multivariate trigonometric system. Let $X, Y$ denote arbitrary Banach spaces and let $(Y_n)_{n \in \mathbb{N}_0}$ denote a sequence of subsets of $Y$ satisfying

(Y1) $Y_0 = \{0\}$,

(Y2) $Y_n \subset Y_{n+1}$, $n \in \mathbb{N}_0$,

(Y3) $\lambda Y_n \subset Y_n$ for all $n \in \mathbb{N}_0$ and all scalars $\lambda$, and finally

(Y4) $Y_n + Y_m \subset Y_{m+n}$.

Definition 2.4 (Error of best approximation, [20]). Let $X$ and $Y$ be as above. Let further $T\colon X \to Y$ denote a linear and bounded operator. Then we define the asymptotic characteristic
\[
\sigma_m(T\colon X \to Y; (Y_n)_n) := \sup_{\|x\|_X \le 1} \ \inf_{y \in Y_{m-1}} \|Tx - y\|_Y.
\]

It turns out that counterparts of (K1), (K3) and (K4) hold true. (K2) has to be replaced by a weaker version (S2) which, however, is sufficient for our approach.

Lemma 2.5 (Properties of $\sigma_m$).
Let $Z, X, Y$ be Banach spaces and $S \in \mathcal{L}(Z,X)$, $R, T \in \mathcal{L}(X,Y)$. Let further $(Y_m)_m$ be a sequence of subsets in $Y$ fulfilling (Y1),...,(Y4) above. We have the following properties for $\sigma_m(T\colon X \to Y; (Y_k)_k)$.

(S1) $\|T\|_{\mathcal{L}(X,Y)} = \sigma_1(T) \ge \sigma_2(T) \ge \cdots \ge 0$,

(S2) For all $m \in \mathbb{N}$, it holds $\sigma_m(T \circ S) \le \sigma_m(T)\,\|S\|$.

(S3) For all $m_1, m_2 \in \mathbb{N}$, it holds $\sigma_{m_1+m_2-1}(R + T) \le \sigma_{m_1}(R) + \sigma_{m_2}(T)$.

(S4) If $\mathrm{ran}\, T \subset Y_{m-1}$ then $\sigma_m(T) = 0$.

3 Interpolation properties of s-numbers

We first need the $K$-functional of a Banach couple embedded into one joint Hausdorff space $\mathcal{A}$.

Definition 3.1 ($K$-functional, [3]). For two Banach spaces $A_0$, $A_1$ which are jointly embedded into a common Hausdorff space $\mathcal{A}$, we define for $a \in A_0 + A_1$
\[
K(t, a; A_0, A_1) = \inf_{a = a_0 + a_1} \big(\|a_0\|_{A_0} + t\|a_1\|_{A_1}\big).
\]

The following interpolation results are well known, see [19, Sect. 11.6.8, 12.1.11]. The below condition on the intermediate space $A_\theta$ is called $K$-type $\theta$ with respect to the couple $(A_0, A_1)$.

Theorem 3.2 (Interpolation of entropy and Kolmogorov numbers, [19]). Let $A_0$, $A_1$ and $A_\theta$ be embedded into the same Hausdorff space $\mathcal{A}$. The intermediate space $A_\theta$ is supposed to satisfy
\[
\sup_{t > 0} t^{-\theta} K(t, a) \le C \|a\|_{A_\theta}. \tag{$\Theta$}
\]
Then, we have for any linear operator with $T\colon A_0 \to B$ and $T\colon A_1 \to B$ that
\[
e_{n+m-1}(T\colon A_\theta \to B) \le C \cdot e_n(T\colon A_0 \to B)^{1-\theta}\, e_m(T\colon A_1 \to B)^{\theta}
\]
and
\[
d_{n+m-1}(T\colon A_\theta \to B) \le C \cdot d_n(T\colon A_0 \to B)^{1-\theta}\, d_m(T\colon A_1 \to B)^{\theta}.
\]

The counterpart for the numbers $(\sigma_m(T))_m$ is straightforward. Since we did not find such a result in the literature, we decided to state it here explicitly and give a proof.

Theorem 3.3 (Best approximation and interpolation). Let $X_0$, $X_1$ and $X_\theta$ be embedded into the same Hausdorff space $\mathcal{A}$. The intermediate space $X_\theta$ is supposed to satisfy ($\Theta$) with respect to the couple $(X_0, X_1)$.
Then, we have for any linear operator with $T\colon X_0 \to Y$ and $T\colon X_1 \to Y$, with $Y$ and $(Y_k)_k$ as in Definition 2.4,
\[
\sigma_{n+m-1}(T\colon X_\theta \to Y; (Y_k)_k) \le C \cdot \sigma_n(T\colon X_0 \to Y; (Y_k)_k)^{1-\theta}\, \sigma_m(T\colon X_1 \to Y; (Y_k)_k)^{\theta}.
\]

Proof. Let us abbreviate $\sigma_n^0 := \sigma_n(T\colon X_0 \to Y; (Y_k)_k)$ and $\sigma_m^1 := \sigma_m(T\colon X_1 \to Y; (Y_k)_k)$. For any $\varepsilon > 0$, $x_0 \in X_0$ and $x_1 \in X_1$ we clearly find elements $y_0 \in Y_{n-1}$, $y_1 \in Y_{m-1}$ such that
\[
\|Tx_0 - y_0\|_Y \le (1+\varepsilon)\,\sigma_n^0\,\|x_0\|_{X_0}, \qquad \|Tx_1 - y_1\|_Y \le (1+\varepsilon)\,\sigma_m^1\,\|x_1\|_{X_1}. \tag{3.1}
\]
Let now $x \in X_\theta$ and $t > 0$. Then, for any $\delta > 0$ there exist $x_0, x_1$ such that $x = x_0 + x_1$ and
\[
\|x_0\|_{X_0} + t\|x_1\|_{X_1} \le C t^{\theta}\, \|x\|_{X_\theta}\,(1+\delta).
\]
Put $t := \sigma_m^1/\sigma_n^0$ in the sequel (assuming $\sigma_n^0 > 0$, otherwise there is nothing to prove). Hence, due to (3.1), there are $y_0, y_1$ such that
\[
\big\|Tx - (y_0 + y_1)\big\|_Y \le \|Tx_0 - y_0\|_Y + \|Tx_1 - y_1\|_Y \le (1+\varepsilon)\big(\sigma_n^0 \|x_0\|_{X_0} + \sigma_m^1 \|x_1\|_{X_1}\big)
\]
\[
= (1+\varepsilon)\,\sigma_n^0\Big(\|x_0\|_{X_0} + \frac{\sigma_m^1}{\sigma_n^0}\|x_1\|_{X_1}\Big) \le C(1+\varepsilon)(1+\delta)\,\sigma_n^0\Big(\frac{\sigma_m^1}{\sigma_n^0}\Big)^{\theta}\|x\|_{X_\theta}.
\]
Put $y = y_0 + y_1$ and observe by property (Y4) that $y \in Y_{m+n-2} \subset Y_{m+n-1}$. Since $\varepsilon, \delta$ can be chosen arbitrarily small, we have
\[
\sigma_{n+m-1}(T\colon X_\theta \to Y) \le C\,\sigma_n(T\colon X_0 \to Y)^{1-\theta}\,\sigma_m(T\colon X_1 \to Y)^{\theta}. \qquad \square
\]

4 Function spaces of dominating mixed smoothness

Let us introduce the function spaces of interest. Define for $x \in \mathbb{T}$ the univariate Bernoulli kernel
\[
F_{r,\alpha}(x) := 1 + 2\sum_{k=1}^{\infty} k^{-r}\cos(kx - \alpha\pi/2)
\]
and define the multivariate Bernoulli kernels as the corresponding tensor products
\[
F_{r,\boldsymbol\alpha}(\boldsymbol x) := \prod_{j=1}^{d} F_{r,\alpha_j}(x_j), \quad \boldsymbol x = (x_1, \ldots, x_d) \in \mathbb{T}^d,\ \boldsymbol\alpha = (\alpha_1, \ldots, \alpha_d). \tag{4.1}
\]

Definition 4.1. Let $r > 0$, $\boldsymbol\alpha \in \mathbb{R}^d$ and $1 \le p \le \infty$. Then $W^r_{p,\boldsymbol\alpha}$ is defined as the normed space of all $f \in L_p(\mathbb{T}^d)$ such that
\[
f = F_{r,\boldsymbol\alpha} * \varphi := (2\pi)^{-d}\int_{\mathbb{T}^d} F_{r,\boldsymbol\alpha}(\boldsymbol x - \boldsymbol y)\,\varphi(\boldsymbol y)\,d\boldsymbol y
\]
for some $\varphi \in L_p(\mathbb{T}^d)$, equipped with the norm $\|f\|_{W^r_{p,\boldsymbol\alpha}} := \|\varphi\|_p$.

For the Littlewood-Paley characterization we need the building blocks $\delta_s(f, \boldsymbol x)$, defined with (1.1) by
\[
\delta_s(f, \boldsymbol x) := \sum_{k \in \varrho(s)} \hat f(k)\, e^{i k\cdot \boldsymbol x}. \tag{4.2}
\]

Lemma 4.2.
If $1 < p < \infty$ and $r > 0$ then the norms $\|f\|_{W^r_{p,\boldsymbol\alpha}}$ with different $\boldsymbol\alpha$ are all equivalent to the Littlewood-Paley type norm
\[
\|f\|_{W^r_p(\mathbb{T}^d)} \asymp \Big\|\Big(\sum_{s \in \mathbb{N}_0^d} 2^{2r\|s\|_1}\,\big|\delta_s(f, \boldsymbol x)\big|^2\Big)^{1/2}\Big\|_p.
\]

We now proceed with spaces with bounded mixed difference. Let $e$ be any subset of $\{1, \ldots, d\}$. For multivariate functions $f\colon \mathbb{T}^d \to \mathbb{C}$ and $\boldsymbol h \in [0, 2\pi]^d$ the mixed first order difference operator $\Delta^e_{\boldsymbol h}$ is defined by
\[
\Delta^e_{\boldsymbol h} := \prod_{i \in e} \Delta_{h_i, i} \quad \text{and} \quad \Delta^{\emptyset}_{\boldsymbol h} = \mathrm{I},
\]
where $\mathrm{I} f = f$ and $\Delta_{h_i, i}$ is the univariate first order difference operator $\Delta_h g := g(\cdot + h) - g(\cdot)$ applied to the $i$-th variable of $f$ with the other variables kept fixed. We first introduce the spaces/classes $H^r_p$ of functions with bounded mixed difference. We restrict ourselves to first order difference operators since in this paper we are only interested in small smoothness.

Definition 4.3. Let $0 < r < 1$ and $1 \le p \le \infty$. We define the space $H^r_p$ as the set of all $f \in L_p(\mathbb{T}^d)$ such that for any $e \subset \{1, \ldots, d\}$
\[
\big\|\Delta^e_{\boldsymbol h}(f, \cdot)\big\|_p \le C \prod_{i \in e} |h_i|^{r}
\]
for some positive constant $C$, and introduce the norm in this space
\[
\|f\|_{H^r_p} := \sum_{e \subset \{1, \ldots, d\}} |f|_{H^r_p(e)}, \qquad |f|_{H^r_p(e)} := \sup_{0 < |h_i| \le \pi,\ i \in e} \Big(\prod_{i \in e} |h_i|^{-r}\Big)\big\|\Delta^e_{\boldsymbol h}(f, \cdot)\big\|_p.
\]

For the purpose of the paper a characterization in terms of Fourier analytic building blocks is necessary. Since we also need to deal with $p = 1$ and $p = \infty$, the blocks $\delta_s(f, \boldsymbol x)$ will not be sufficient. We need the counterparts based on the classical de la Vallée Poussin means, see [7, Chapt. 2]. Denote by $V_m(t)$ the univariate de la Vallée Poussin kernel
\[
V_m(t) = \frac{1}{m}\sum_{k=m}^{2m-1} D_k(t) = \frac{\sin(mt/2)\sin(3mt/2)}{m\sin^2(t/2)}, \quad m \in \mathbb{N}.
\]
We further denote for $s \in \mathbb{N}_0$
\[
A_s(t) := \begin{cases} V_{2^s}(t) - V_{2^{s-1}}(t) &: s \ge 1,\\ V_1(t) &: s = 0. \end{cases}
\]
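To make the mixed difference operator of Definition 4.3 concrete, here is a small sketch of our own (the function names are ours, not from the paper): it composes univariate first order differences coordinate by coordinate. For the bilinear function $f(x_1, x_2) = x_1 x_2$ the full mixed difference $\Delta^{\{1,2\}}_{\boldsymbol h} f = h_1 h_2$ is constant in $\boldsymbol x$.

```python
def univariate_difference(g, h, i):
    # (Delta_{h,i} g)(x) = g(x + h e_i) - g(x): first order difference in variable i
    def shifted(x):
        y = list(x)
        y[i] += h
        return g(tuple(y)) - g(x)
    return shifted

def mixed_difference(f, e, h):
    # Delta^e_h: compose the univariate differences over the coordinates i in e
    g = f
    for i in e:
        g = univariate_difference(g, h[i], i)
    return g

# bilinear example: the mixed difference is exactly h1 * h2, independent of x
f = lambda x: x[0] * x[1]
delta = mixed_difference(f, e=(0, 1), h=(0.25, 0.5))
print(delta((0.3, 0.7)))
```

The univariate factors commute, so the order of the coordinates in `e` does not matter; this mirrors the product structure $\Delta^e_{\boldsymbol h} = \prod_{i \in e} \Delta_{h_i,i}$.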
In the multivariate case we use the tensorized version and define for $s \in \mathbb{N}_0^d$
\[
A_s(\boldsymbol x) := \prod_{i=1}^{d} A_{s_i}(x_i), \quad \boldsymbol x = (x_1, \ldots, x_d).
\]
Finally, the convolution operator $A_s(f, \cdot)$ is given by
\[
A_s(f, \cdot) := f * A_s. \tag{4.3}
\]
It holds $f = \sum_{s \in \mathbb{N}_0^d} A_s(f, \cdot)$ and, for $1 \le p \le \infty$,
\[
\|A_s\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp 1, \quad s \in \mathbb{N}_0^d. \tag{4.4}
\]

Lemma 4.4. Let $0 < r < 1$. We have the following equivalent characterizations for $f \in L_p(\mathbb{T}^d)$.

(i) If $1 \le p \le \infty$ we have
\[
\|f\|_{H^r_p(\mathbb{T}^d)} \asymp \sup_{s \in \mathbb{N}_0^d} \big\|A_s(f, \cdot)\big\|_p\, 2^{r\|s\|_1}. \tag{4.5}
\]

(ii) If $1 < p < \infty$ we have with (4.2)
\[
\|f\|_{H^r_p(\mathbb{T}^d)} \asymp \sup_{s \in \mathbb{N}_0^d} \big\|\delta_s(f, \cdot)\big\|_p\, 2^{r\|s\|_1}. \tag{4.6}
\]

Remark 4.5. For technical reasons we will also need the refined spaces $B^r_{p,q}(\mathbb{T}^d)$, $1 \le q \le \infty$, normed by
\[
\|f\|_{B^r_{p,q}(\mathbb{T}^d)} := \Big(\sum_{s \in \mathbb{N}_0^d} 2^{r\|s\|_1 q}\,\big\|A_s(f, \cdot)\big\|_p^q\Big)^{1/q}. \tag{4.7}
\]
In this notation we have $H^r_p(\mathbb{T}^d) = B^r_{p,\infty}(\mathbb{T}^d)$ in the sense of equivalent norms. Note also that in case $1 < p < \infty$ we may replace $A_s(f, \cdot)$ by $\delta_s(f, \cdot)$ in (4.7). This together with Lemma 4.2 yields the identity $B^r_{2,2}(\mathbb{T}^d) = W^r_2(\mathbb{T}^d)$ in the sense of equivalent norms.

Let us finally state a result on real interpolation of classes with bounded mixed difference which may be of interest on its own. Since the focus of the paper is on small smoothness, we restrict ourselves here to smoothness parameters less than one. The below theorem also works for higher smoothness (using an isomorphism different from the Faber-Schauder system in the proof).

Theorem 4.6. Let $2 < p < \infty$, $0 < r_0 < 1/2$ and $r_1 = r_0 + 1/2$. Then
\[
\big(B^{r_0}_{\infty,\infty}(\mathbb{T}^d),\, W^{r_1}_2(\mathbb{T}^d)\big)_{\theta,p} = B^{r}_{p,p}(\mathbb{T}^d)
\]
(in the sense of equivalent norms) if $\theta = 2/p$ and $r = r_0 + 1/p$.

Proof. First note that $W^{r_1}_2(\mathbb{T}^d) = B^{r_1}_{2,2}(\mathbb{T}^d)$.
The interpolation formula is an easy consequence of the classical interpolation formula
\[
(\ell_\infty, \ell_2)_{\theta,p} = \ell_p \quad \text{with } 2 < p < \infty,\ \theta = 2/p,
\]
and the fact that we have a common isomorphism $J_r$ (depending on $r$) mapping all three occurring function spaces to either $\ell_\infty$, $\ell_2$ or $\ell_p$. As an isomorphism we may use the periodic Faber-Schauder representation given in [12, (3.5), (3.6)] together with [12, Prop. 3.4 and 3.5]. $\square$

5 Convolution operators onto the step hyperbolic cross

Let us refer to the definitions (4.2) and (4.3) of the dyadic block operators $\delta_s(f, \boldsymbol x)$ and $A_s(f, \boldsymbol x)$ based on the tensorized dyadic Dirichlet kernel and the tensorized de la Vallée Poussin kernel, respectively. We further define for $n \in \mathbb{N}_0$ the hyperbolic cross operators
\[
S_{Q_n} f := \sum_{\|s\|_1 \le n} \delta_s(f, \cdot) \quad \text{and} \quad A_{Q_n} f := \sum_{\|s\|_1 \le n} A_s(f, \cdot).
\]
The image of $S_{Q_n}$ represents the space of trigonometric polynomials with frequencies supported on the step hyperbolic cross $Q_n$. We denote this space by $\mathcal{T}(Q_n) := \mathrm{range}(S_{Q_n})$. We also need the operators
\[
S^{\Delta}_{Q_n} := S_{Q_n} - S_{Q_{n-1}} \quad \text{and} \quad A^{\Delta}_{Q_n} := A_{Q_n} - A_{Q_{n-1}}
\]
for $n \in \mathbb{N}$ (we set $S^{\Delta}_{Q_0} := S_{Q_0}$ and $A^{\Delta}_{Q_0} := A_{Q_0}$). Let us first observe that
\[
S_{Q_n} = A_{Q_n} \circ S_{Q_n} = S_{Q_n} \circ A_{Q_n} \tag{5.1}
\]
and, for some integer $b \in \mathbb{N}$,
\[
A_{Q_n} = S_{Q_{n+b}} \circ A_{Q_n} = A_{Q_n} \circ S_{Q_{n+b}}. \tag{5.2}
\]
It is well known that in case $1 < p < \infty$
\[
\|S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp \|A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp 1, \quad n \in \mathbb{N}_0. \tag{5.3}
\]
Moreover, in case $p = \infty$ we have
\[
\|A_{Q_n}\colon L_\infty(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\| \lesssim n^{d-1}, \quad n \in \mathbb{N}, \tag{5.4}
\]
see [30, Lem. 4.2.3]. We are interested in the Kolmogorov and entropy numbers of such finite rank operators. More generally, we define for a finite set of frequencies $E \subset \mathbb{Z}^d$ the corresponding projection operator $S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)$ given by
\[
S_E f(\boldsymbol x) := \sum_{k \in E} \hat f(k)\, e^{i k\cdot \boldsymbol x}.
\]
In this context it is natural to study the best approximation error with respect to the multivariate trigonometric system with free spectrum, i.e., the best $m$-term trigonometric approximation of an operator $T$ defined by
\[
\sigma_m(T) := \sigma_m(T\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d); (Y_n)_n) \tag{5.5}
\]
with
\[
Y_n := \Big\{t(\boldsymbol x) = \sum_{k \in \Lambda} c_k e^{i k\cdot \boldsymbol x} : |\Lambda| \le n,\ c_k \in \mathbb{C}\Big\}. \tag{5.6}
\]
If not stated otherwise, the quantity $\sigma_m$ will always be used in the context of best $m$-term trigonometric approximation in the sequel, see (5.5), (5.6). We may then drop the $(Y_n)_n$ in the notation.

In order to make use of the above interpolation results in Theorems 3.2 and 3.3, we need an appropriate "corner result". The below bounds for entropy and Kolmogorov numbers are due to Pajor and Tomczak-Jaegermann [18] and Belinskii [2], see also 11.2.2 and 11.3.1 in [34], as well as Dunker, Kühn, Linde, Lifshits [10]. The version for $(\sigma_m)_m$ is due to Temlyakov, see Theorem 2.6 in [28]. For a univariate version of this result we refer to Belinskii [1].

Theorem 5.1 ([18], [2], [10], [28]). Let $E \subset \mathbb{Z}^d$ be a finite set such that $E \subset B_\infty(R) = \{\boldsymbol x \in \mathbb{R}^d : \|\boldsymbol x\|_\infty \le R\}$ for some $R \ge 2$. Then we have

(i) for $s_m \in \{d_m, \sigma_m\}$
\[
s_m(S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{|E|\log R}{m}\Big)^{1/2}, \quad m \le |E|.
\]

(ii) For the entropy numbers it holds
\[
e_m(S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \begin{cases} \Big(\dfrac{|E|\log R}{m}\Big)^{1/2} &: m \le |E|,\\[1ex] 2^{-m/|E|}\sqrt{\log R} &: m > |E|. \end{cases}
\]

The following bounds are direct consequences of Theorem 5.1 in connection with Theorem 3.2. Note that (i) below for $s_m = e_m$ is already known. It was obtained in [32] to prove Marcinkiewicz type discretization theorems for hyperbolic cross polynomials. The proof there is based on a different technique.

Theorem 5.2. Let $2 \le p < \infty$. Then, it holds for $1 \le m \le |Q_n|$ and $s_m \in \{d_m, e_m, \sigma_m\}$

(i)
\[
s_m(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}.
\]
(ii) If $r \ge 0$ then
\[
s_m(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}},
\]

(iii) and if $r > 1/p$
\[
s_m(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)+\frac{1}{p}}.
\]

Proof. Let us start by proving (i). It is well known for the real interpolation method $(\cdot,\cdot)_{\theta,q}$ that
\[
\big(L_\infty(\mathbb{T}^d), L_2(\mathbb{T}^d)\big)_{\theta,p} = L_p(\mathbb{T}^d) \quad \text{whenever} \quad \frac{1}{p} = \frac{1-\theta}{\infty} + \frac{\theta}{2},
\]
which means $\theta = 2/p$. Hence, the condition ($\Theta$) is fulfilled with $A_\theta = L_p(\mathbb{T}^d)$, and we may interpolate the numbers $s_m$ according to Theorems 3.2 and 3.3. This gives for the operator $A_{Q_n}$
\[
s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A_{Q_n}\colon L_\infty(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \cdot s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}
\]
\[
\lesssim n^{(d-1)(1-\theta)}\, s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta},
\]
where we used (5.4). By (5.2) we may factorize $A_{Q_n} = S_{Q_{n+b}} \circ A_{Q_n}$ through $S_{Q_{n+b}}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)$ and $A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_2(\mathbb{T}^d)$. Then (K2) and (5.3) yield
\[
s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta} \lesssim s_m\big(S_{Q_{n+b}}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}.
\]
Applying Theorem 5.1 to the right-hand side (with $|Q_{n+b}| \asymp 2^n n^{d-1}$ and $\log R \asymp n$) together with $\theta = 2/p$ yields for $m \le |Q_n|$
\[
s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{5.7}
\]
Using the first identity in (5.1) together with the properties (K2), (S2) and (5.3) gives
\[
s_m\big(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big),
\]
where the right-hand side can be bounded by (5.7). This proves (i).

For (ii) observe that by Lemma 4.2
\[
s_m\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \asymp 2^{-rn}\, s_m\big(S^{\Delta}_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big),
\]
which can be bounded using (i).

As for (iii), we have by the real interpolation formula in Theorem 4.6 with $r_0 = r - 1/p$, $r_1 = r - 1/p + 1/2$ and $\theta = 2/p$
\[
\big(B^{r_0}_{\infty,\infty}(\mathbb{T}^d),\, B^{r_1}_{2,2}(\mathbb{T}^d)\big)_{\theta,p} = B^{r}_{p,p}(\mathbb{T}^d), \quad 2 < p < \infty.
\]
As a consequence, we obtain the condition ($\Theta$) for $A_\theta = B^r_{p,p}(\mathbb{T}^d)$ with respect to the above couple. Interpolating according to Theorems 3.2 and 3.3 gives
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A^{\Delta}_{Q_n}\colon B^{r_0}_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \cdot s_m\big(A^{\Delta}_{Q_n}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}. \tag{5.8}
\]
By Lemma 4.4 together with (4.4) and $\theta = 2/p$ we find
\[
\big\|A^{\Delta}_{Q_n}\colon B^{r_0}_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \lesssim n^{(d-1)\left(1-\frac{2}{p}\right)}\, 2^{-r_0 n \left(1-\frac{2}{p}\right)}. \tag{5.9}
\]
Since $B^{r_1}_{2,2}(\mathbb{T}^d) = W^{r_1}_2(\mathbb{T}^d)$ in the sense of equivalent norms, we may use (ii) with $p = 2$ and plug the result together with (5.9) into (5.8). This yields
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{(d-1)\left(1-\frac{2}{p}\right)}\, 2^{-r_0 n\left(1-\frac{2}{p}\right)}\, 2^{-\frac{2 r_1 n}{p}} \Big[\frac{2^n \cdot n^{d-1} \cdot n}{m}\Big]^{\frac{1}{p}} \asymp 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{5.10}
\]
Finally, by Lemma 4.4, (ii), Remark 4.5 and (4.4), we see that
\[
\big\|S_{Q_{n+b}} - S_{Q_{n-b}}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^r_{p,p}(\mathbb{T}^d)\big\| \lesssim n^{\frac{d-1}{p}}.
\]
Using $A^{\Delta}_{Q_n} = A^{\Delta}_{Q_n} \circ (S_{Q_{n+b}} - S_{Q_{n-b}})$ together with (K2), (S2) gives
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{\frac{d-1}{p}}\, s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)+\frac{1}{p}},
\]
where we used (5.10) in the last step. $\square$

6 Embeddings into $L_\infty(\mathbb{T}^d)$

Let us present here our main results for embeddings of Sobolev and Hölder-Nikolskii spaces with small mixed smoothness into $L_\infty(\mathbb{T}^d)$.

Theorem 6.1. Let $2 < p < \infty$ and $1/p < r < 1/2$. Then, for $s_m \in \{d_m, e_m, \sigma_m\}$ we have
\[
s_m\big(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]

Proof. We decompose the identity operator
\[
\mathrm{I} = \sum_{n=0}^{\infty} S^{\Delta}_{Q_n},
\]
where the $S^{\Delta}_{Q_n}$ are the operators defined above. Using (K3), (S3) we have that
\[
s_m(\mathrm{I}) \le \sum_{n=0}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big) \quad \text{with} \quad m = \sum_{n=0}^{\infty} m_n.
\]
We decompose the sum into three parts,
\[
s_m(\mathrm{I}) \le \sum_{n=0}^{n_1} s_{m_n}\big(S^{\Delta}_{Q_n}\big) + \sum_{n=n_1}^{n_2} s_{m_n}\big(S^{\Delta}_{Q_n}\big) + \sum_{n=n_2}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big), \tag{6.1}
\]
for suitable numbers $n_1 \le n_2$, both of order $\log m$, specified below. Let us consider the first sum in (6.1). The following argument only works for $s_m \in \{d_m, \sigma_m\}$, since a counterpart of (K4) or (S4) is not available for entropy numbers. We will indicate the necessary modification for $s_m = e_m$ below. Let $n_1$ be the largest number such that
\[
\sum_{n=0}^{n_1} \mathrm{rank}\big(S^{\Delta}_{Q_n}\big) \le m
\]
and put $m_n := \mathrm{rank}(S^{\Delta}_{Q_n}) + 1$ for $n \le n_1$. Due to the properties (K4) and (S4) in Lemmas 2.2 and 2.5, the first sum disappears. As for the second sum, we choose
\[
m_n := \big\lfloor 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1^{d-1} \big\rfloor, \quad n_1 \le n \le n_2,
\]
and note that $n_1$ satisfies
\[
2^{n_1} n_1^{d-1} \asymp m. \tag{6.2}
\]
Clearly,
\[
\sum_{n=n_1}^{n_2} 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1^{d-1} \asymp 2^{n_1} n_1^{d-1} \asymp m.
\]
Here, $\kappa$ is chosen such that $2r < \kappa < 1$. Let us decompose $S^{\Delta}_{Q_n} = S^{\Delta}_{Q_n} \circ \mathrm{I}$ with $\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to W^r_2(\mathbb{T}^d)$ and get, by Theorem 5.2, (ii) with $p = 2$,
\[
s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to W^r_2(\mathbb{T}^d)\big\| \cdot s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)
\]
\[
\lesssim 2^{-rn}\big(2^{(n-n_1)(1+\kappa)}\, n_1^{-(d-1)}\big)^{1/2}\, n^{\frac{d-1}{2}+\frac{1}{2}}, \quad n_1 \le n \le n_2.
\]
Summing over $n$ gives
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn_1}\, n_1^{(d-1)(1-2r)+r}. \tag{6.3}
\]
Using the fact that $2^{n_1} n_1^{d-1} \asymp m$, we have
\[
(6.3) \lesssim m^{-r}\, n_1^{(d-1)r + (d-1)(1-2r) + r} \asymp m^{-r}\, n_1^{(d-1)(1-r)+r}. \tag{6.4}
\]
Now we care for the third sum and choose $m_n := \lfloor m \cdot 2^{-\zeta(n-n_2)} \rfloor$, where $\zeta$ is chosen such that $\zeta/p < r - 1/p$. Clearly, $\sum_{n=n_2}^{\infty} m_n \asymp m$. By the results from Theorem 5.2 we obtain
\[
s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \Big[\frac{2^n}{m \cdot 2^{-\zeta(n-n_2)}}\Big]^{1/p}\, 2^{-rn}\, n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}.
\]
Summing over $n$ in the range $n = n_2, n_2+1, \ldots$ gives
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big) \lesssim \Big(\frac{2^{n_2}}{m}\Big)^{1/p}\, 2^{-rn_2}\, n_2^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{6.5}
\]
Because of (6.2) and the choice of $n_2$, we have
\[
(6.5) \lesssim 2^{-rn_2}\, n_2^{(d-1)(1-\frac{2}{p})+\frac{1}{p}} \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]
This, combined with (6.4), gives the result of the theorem for $s_m \in \{d_m, \sigma_m\}$.

We finally comment on the estimate of the first sum in (6.1) in the case of entropy numbers. We modify the argument as follows. Instead of choosing $m_n = \mathrm{rank}(S^{\Delta}_{Q_n}) + 1$ we choose
\[
m_n := \mathrm{rank}(S^{\Delta}_{Q_n})\, 2^{(n_1-n)\varepsilon}, \quad n = 0, \ldots, n_1, \tag{6.6}
\]
with $0 < \varepsilon < 1$. This gives
\[
\sum_{n=0}^{n_1} m_n \asymp \mathrm{rank}(S^{\Delta}_{Q_{n_1}}) \asymp 2^{n_1} n_1^{d-1},
\]
where we choose $n_1$ such that $2^{n_1} n_1^{d-1} \asymp m$. By (K2) and Theorem 5.1 we obtain (note that $m_n > \mathrm{rank}(S^{\Delta}_{Q_n})$)
\[
e_{m_n}(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\, 2^{-2^{(n_1-n)\varepsilon}}\, n^{1/2}.
\]
Summing over $n = 0, \ldots, n_1$ yields
\[
\sum_{n=0}^{n_1} e_{m_n}(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn_1}\, n_1^{1/2} \asymp m^{-r}(\log m)^{r(d-1)+\frac{1}{2}} \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]
This finishes the proof. $\square$

Theorem 6.2. Let $2 < p \le \infty$ and $1/p < r < 1/2$. Then, for $s_m \in \{d_m, e_m, \sigma_m\}$,
\[
s_m\big(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}.
\]

Proof. This time we decompose the identity using the operators $A^{\Delta}_{Q_n}$. Let us first deal with the case $p < \infty$. Applying again properties (K3) and (S3) we find
\[
s_m\big(\mathrm{I}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \sum_{n=0}^{n_1} s_{m_n}\big(A^{\Delta}_{Q_n}\big) + \sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) + \sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big). \tag{6.7}
\]
We argue analogously to the proof of Theorem 6.1 for the first sum. For the second sum we choose
\[
m_n := \big\lfloor 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1 \big\rfloor \tag{6.8}
\]
with $2r < \kappa < 1$ and $n_1$ such that $2^{n_1} n_1 \asymp m$. Hence,
\[
\sum_{n=n_1}^{n_2} m_n \asymp 2^{n_1} n_1 \asymp m. \tag{6.9}
\]
Then we decompose
\[
A^{\Delta}_{Q_n} = S_{Q_{n+b}} \circ A^{\Delta}_{Q_n}. \tag{6.10}
\]
This gives
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\big\| \cdot s_{m_n}\big(S_{Q_{n+b}}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)
\]
\[
\lesssim 2^{-rn}\Big(\frac{2^n}{m_n}\Big)^{1/2}\, n^{d-\frac{1}{2}},
\]
where we used Theorem 5.2, (ii) for estimating $s_{m_n}$.
To estimate $\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\|$ we used Lemma 4.4 together with (4.4). Inserting (6.8) yields
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\big[2^{(n-n_1)(1+\kappa)}\, n_1^{-1}\big]^{1/2}\, n^{d-\frac{1}{2}}.
\]
Summation over $n = n_1, \ldots, n_2$ leads to
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim 2^{-rn_1}\, n_1^{d-1}.
\]
Since $m \asymp 2^{n_1} n_1$, due to (6.9), we get
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}. \tag{6.11}
\]
We finally deal with the last sum in (6.7). Indeed, choosing $m_n := \lfloor m \cdot 2^{-\zeta(n-n_2)} \rfloor$ with $r - \frac{1}{p} - \frac{\zeta}{p} > 0$, we have by Theorem 5.2
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\Big(\frac{2^n}{m_n}\Big)^{1/p}\, n^{(d-1)+\frac{1}{p}} \asymp \Big(\frac{2^n\, 2^{\zeta(n-n_2)}}{m}\Big)^{1/p}\, 2^{-rn}\, n^{(d-1)+\frac{1}{p}}.
\]
Summing up over $n = n_2, n_2+1, \ldots$ yields
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim 2^{-rn_2}\Big(\frac{2^{n_2}}{m}\Big)^{1/p}\, n_2^{(d-1)+\frac{1}{p}}.
\]
By (6.9) we get $2^{n_2}/m \asymp n_2^{-1}$. Hence, we obtain
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}.
\]
Together with (6.11), this proves the theorem in case $p < \infty$. For the case $p = \infty$ we use the bounded embedding $\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to B^r_{p^*,\infty}(\mathbb{T}^d)$, where $p^* < \infty$ is chosen such that $r > 1/p^*$. This gives
\[
s_m\big(\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \le \big\|\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to B^r_{p^*,\infty}(\mathbb{T}^d)\big\| \cdot s_m\big(\mathrm{I}\colon B^r_{p^*,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r},
\]
where we used the result for $p^* < \infty$.

Again, we comment on the necessary modifications in case $s_m = e_m$. Let us consider the first sum in (6.7) again and use (6.10). We choose $m_n$ and $n_1$ as after (6.6). By the counterpart of (K2) for entropy numbers we find
\[
e_{m_n}(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \le \big\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\big\|\, e_{m_n}\big(S_{Q_{n+b}}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{\frac{d-1}{2}}\, 2^{-rn}\, 2^{-2^{(n_1-n)\varepsilon}}\, n^{1/2},
\]
where we applied Theorem 5.1, (ii). Summing over $n = 0, \ldots, n_1$ yields
\[
\sum_{n=0}^{n_1} e_{m_n}(A^{\Delta}_{Q_n}) \lesssim 2^{-rn_1}\, n_1^{\frac{d-1}{2}+\frac{1}{2}} \asymp m^{-r}(\log m)^{\frac{d-1}{2}+r+\frac{1}{2}} \lesssim m^{-r}(\log m)^{d-1+r}.
\]
This concludes the proof. $\square$

For the endpoint situation $r = 1/2$ we obtain an additional $(\log\log m)^{r+1}$ factor in the upper bounds.

Theorem 6.3 (Endpoint cases). Let $s_m \in \{d_m, e_m, \sigma_m\}$.

(i) If $2 < p < \infty$ and $r = 1/2$ then
\[
s_m\big(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}(\log\log m)^{r+1}.
\]

(ii) If $2 < p \le \infty$ and $r = 1/2$ then
\[
s_m\big(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}(\log\log m)^{r+1}.
\]

Proof. We use the same decomposition of the identity operator as above. The first and third sums are treated analogously. In the second sum it is not possible to choose a $\kappa$ with $2r < \kappa < 1$. We choose $\kappa = 1$ but pay an extra logarithm in both summations. Rephrasing the final bound in terms of $m$ yields the additional $(\log\log m)^{r+1}$ factor. $\square$

Remark 6.4 ($d = 2$). (i) We would like to emphasize that in Theorem 6.2, when $d = 2$, we actually do not need the middle sum ranging over $[n_1, n_2]$ in (6.7). Hence, the restriction $r \le 1/2$ does not play a role here. This results in
\[
s_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \lesssim m^{-r}\big(\log m\big)^{r+1} \tag{6.12}
\]
for all $r > 1/p$ and $2 \le p \le \infty$. Compared to Theorem 6.3, (ii), we do not have a $\log\log$ term here for $r = 1/2$. In addition, together with Theorem 7.8.4 from [30] (see also [7, Thm. 6.3.4] and the references therein) and Carl's inequality [5], we get the correct order in case $d = 2$ for Kolmogorov and entropy numbers. Namely, for $2 \le p \le \infty$ and $r > 1/p$ it holds
\[
e_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \asymp d_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \asymp m^{-r}\big(\log m\big)^{r+1}.
\]
The result for entropy numbers is true for $1 \le p \le \infty$, $r > 1/p$, see [30, Thm. 7.8.4]. Note that for the $W^r_p$ classes the correct order of decay of the Kolmogorov and entropy numbers for $d = 2$ is only known in the case of large smoothness $r > 1/2$, see [34, Chapt. 11] and also [10].
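The various logarithmic exponents appearing in this section are easy to misread, so the following small script (our own sanity check, with function names of our choosing, not from the paper) verifies the elementary exponent arithmetic with exact rational numbers: the interpolation bookkeeping behind Theorem 5.2, (i), the domination of the entropy estimate for the first sum by the main term of Theorem 6.1, and the collapse of the exponents at $d = 2$.

```python
from fractions import Fraction as F

def w_exponent(d, r):
    # log-power in Theorem 6.1: (d-1)(1-r) + r
    return (d - 1) * (1 - r) + r

def h_exponent(d, r):
    # log-power in Theorem 6.2: d - 1 + r
    return d - 1 + r

for d in range(2, 8):
    # with theta = 2/p: (d-1)(1 - 2/p) + d/p = (d-1)(1 - 1/p) + 1/p
    for p in (F(5, 2), F(3), F(4), F(10)):
        assert (d - 1) * (1 - 2 / p) + d / p == (d - 1) * (1 - 1 / p) + 1 / p
    for r in (F(1, 10), F(2, 10), F(3, 10), F(4, 10)):  # small smoothness r < 1/2
        # the entropy bound for the first sum, (log m)^{r(d-1)+1/2},
        # is dominated by the main term (log m)^{(d-1)(1-r)+r}
        assert r * (d - 1) + F(1, 2) <= w_exponent(d, r)

# for d = 2 the exponents collapse: (d-1)(1-r)+r = 1 and d-1+r = 1+r
assert w_exponent(2, F(1, 4)) == 1 and h_exponent(2, F(1, 4)) == F(5, 4)
print("exponent identities verified")
```

Exact rationals via `fractions.Fraction` avoid any floating-point ambiguity in these checks.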
(ii) The upper bound in (6.12) also includes the error of best $m$-term trigonometric approximation $\sigma_m$. Together with the lower bounds from [28, Thm. 3.3] we have in case $d = 2$, $1 \le p \le \infty$ and $r > 1/p$
\[
m^{-r}\log m \lesssim \sigma_m\big(I\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)\big) \lesssim m^{-r}\big(\log m\big)^{r+1/2}.
\]

In this section we comment on applications of the above results and add a discussion of possible future research and open problems motivated by our considerations. Let us say right away that we have already made progress on the Outstanding Open Problems in [7], especially Problems 1.3, 1.6 and 1.7. In addition, we discuss consequences for sampling recovery in $L_2$. Furthermore, we comment on the use of finite-dimensional subspaces generated by hyperbolic wavelets as building blocks, and on wavelet type dictionaries for best $m$-term approximation.

Entropy and Kolmogorov numbers. Entropy numbers for mixed smoothness embeddings have been investigated by several authors in the literature, see [7, Chapt. 6]. Among many others, Vybíral [36] investigated the behavior of entropy numbers in $B^s_{p,q}$-spaces, see Remark 4.5, using wavelet building blocks. In addition, the authors in [10] managed to prove a counterpart of the corner result in Theorem 5.2, (i), for $p = 2$, $s_m = d_m$ and $S_{Q_n}$ replaced by the corresponding hyperbolic Haar wavelet projection.

Let us comment on this technique here and how it can be applied for the uniform norm estimates. Technically, instead of trigonometric polynomials one may also use a univariate wavelet system $\{\psi_I = \psi((\cdot - x_I)/|I|) : I \in \mathcal{I},\ |I| \le 1\}$, where $\mathcal{I}$ is the set of dyadic intervals $I$ with midpoints $x_I = 2^{-j}k$, $k, j \in \mathbb{Z}$. We further consider the corresponding multivariate (tensorized) system
\[
\mathcal{D} = \Big\{\psi_I(x) = \prod_{j=1}^{d} \psi_{I_j}(x_j) : I = I_1 \times \dots \times I_d,\ |I_j| \le 1,\ j = 1, \dots, d\Big\} \qquad (7.1)
\]
and define the orthogonal projection onto the hyperbolic layers
\[
\tilde{S}_n f := \sum_{I \subset [0,1]^d,\ |I| = 2^{-n}} \langle f, \psi_I \rangle\, \psi_I.
\]
This operator replaces the above $S_{\Delta Q_n}$.
Then we have the decomposition of the identity operator $I = \sum_{n=0}^{\infty} \tilde{S}_n$. We assume that the wavelet system is sufficiently smooth, compactly supported, and has good decay properties. We also need the finite-dimensional block result
\[
s_m\big(\mathrm{id}\colon \ell^N_p \to \ell^N_\infty\big) \asymp \Big[\frac{\log(eN/m)}{m}\Big]^{1/p}, \qquad 1 \le m \le N, \qquad (7.2)
\]
with $s_m \in \{d_m, e_m\}$. The result for entropy numbers in case $1 < p < \infty$ is folklore, see [7, Thm. 6.1.3] and the references given there. For the corresponding result for Kolmogorov numbers in case $1 \le p < \infty$ see [11, Thm. 1.1], where the sharp dual version in terms of Gelfand numbers is proved. This result in combination with the proof in [36, Thm. 3.19] gives for all $1 \le p \le \infty$, $r > 1/p$
\[
s_m\big(I\colon B^r_{p,\infty} \to L_\infty\big) \le s_m\big(I\colon B^r_{p,\infty} \to B^0_{\infty,1}\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(r+1)}.
\]
It turns out that in case $d = 2$ we recover the small smoothness result in Theorem 6.2 as well as Belinskii's "large smoothness" result in [34], 11.3.5. In case $d = 2$ the above result is sharp, see the discussion in Remark 6.4. In addition, the result gives an indication that the $\log\log m$ term in Theorem 6.3 is probably not needed. In case of small smoothness and $d > 2$ the result is worse than our results in Theorem 6.2 and Theorem 6.3.

Additionally, in [36] the author pointed out some gaps between upper and lower bounds in a certain range of small smoothness. This was the starting point of the recent paper by Mayer and Ullrich [15, Cor. 23, (iii)], where the sharp behavior
\[
e_m\big(I\colon B^r_{p,q} \to L_\infty\big) \asymp m^{-r} \qquad (7.3)
\]
is shown in case $2 < p \le \infty$, $0 < q \le 1/2$, and $1/p < r \le 1/2$ for all dimensions $d$. The proof relies on a refinement of (7.2) for mixed $\ell^n_q(\ell^N_p)$-norms, see [15, Thm. 13]. Roughly speaking, refining the spaces $H^r_p$ by decreasing the third parameter $q$, see Remark 4.5, allows us to get rid of the logarithmic term. A combination of the technique in [15] with the technique used in this paper may allow one to extend the range of parameters for the result (7.3). A corresponding result for Kolmogorov numbers is not known.
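The comparison with Theorem 6.2 claimed above reduces to elementary arithmetic on the log-exponents. As a plain check (with the rate of Theorem 6.2 read as $m^{-r}(\log m)^{d-1+r}$, and not part of the original argument):

```latex
\[
  (d-1)(r+1) - (d-1+r)
  \;=\; (d-1)r + (d-1) - (d-1) - r
  \;=\; (d-2)\, r \;\ge\; 0 ,
\]
```

with equality precisely for $d = 2$: the wavelet-based bound coincides with Theorem 6.2 in two dimensions and loses a factor $(\log m)^{(d-2)r}$ for $d > 2$, which is the sense in which it is "worse" in the small smoothness regime.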
However, a similar phenomenon occurs for the $\sigma_m$ numbers associated to a wavelet type dictionary (see below). Note that the space $B^r_{p,q}$ is a quasi-Banach space.

Wavelet type dictionaries. In the context of function spaces with mixed smoothness, not only best $m$-term trigonometric approximation has been considered. Hyperbolic wavelet type dictionaries $\mathcal{D}$, as defined in (7.1), have also gained substantial interest, see for instance [27] or [7, Sect. 7.2] and the references therein. It turned out that the order of decay of the corresponding error quantities (modify the definition of $(Y_n)_n$ in (5.6) accordingly) is often substantially better than for the trigonometric system, see [7, Sect. 7] and the references therein. In fact, the gain is not only in the logarithmic term but sometimes also in the main rate. This is certainly not the case in our setting. A reasonable question would be: does a wavelet system perform comparably well with respect to the decay of the associated $\sigma_m$ in the $L_\infty$ norm? Note that a fundamental difference between wavelets and the trigonometric system is the lack of a universal $L_\infty$-bound for the $L_2$-normalized wavelet system. As a consequence, some of the techniques used by Belinskii, see for instance [34, 11.2.5], cannot be directly adapted to wavelets. A strong indication that wavelet dictionaries may not perform worse than the trigonometric system is the following observation. In [4, Th. 6.15] it is proved that for $2 < p \le \infty$, $0 < q \le 1/2$, and $1/p < r \le 1/2$ we have
\[
\sigma_m\big(I\colon B^r_{p,q}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d); \mathcal{D}\big) \lesssim m^{-r}.
\]
This result is sharp for all dimensions $d$ if we use the tensorized Faber-Schauder system as dictionary $\mathcal{D}$.

Sampling recovery. We introduce the notion of sampling numbers of an operator $T\colon F \to G$ between two Banach spaces $F$ and $G$ of functions on $D$. We assume that point evaluations are linear functionals on $F$. This is the case, for instance, if $F$ is continuously embedded into $C(D)$, the space of continuous functions on $D$.
Let us define the $m$-th sampling numbers of an operator $T \in \mathcal{L}(F, G)$ as follows:
\[
\varrho_m(T\colon F \to G) := \inf_{x^1, \dots, x^m \in D}\ \inf_{\varphi\colon \mathbb{C}^m \to G\ \text{linear}}\ \sup_{\|f\|_F \le 1} \big\|Tf - \varphi\big(f(x^1), \dots, f(x^m)\big)\big\|_G.
\]
In many cases the embedding $F \hookrightarrow G$ and the corresponding embedding operator $I\colon F \to G$ is considered. A particular situation is the case $G = L_2(D)$. In this situation it has been proven in [31] that there are two positive absolute constants $b, B > 0$ such that
\[
\varrho_{bm}\big(I\colon F \to L_2(D)\big) \le B\, d_m\big(I\colon F \to L_\infty(D)\big), \qquad m \in \mathbb{N}. \qquad (7.4)
\]
In case $F$ represents a reproducing kernel Hilbert space $H(K)$ embedded into $L_2(D)$ we even know that, see [16],
\[
\varrho_m\big(I\colon H(K) \to L_2\big)^2 \le c_1 \frac{\log(m)}{m} \sum_{k = \lfloor c_2 m \rfloor}^{\infty} d_k\big(I\colon H(K) \to L_2(D)\big)^2, \qquad m \in \mathbb{N},
\]
with (precisely given) absolute constants $c_1, c_2 > 0$. Similar results have recently been established for non-Hilbert function spaces, see [14]. However, in both these settings the square summability of the corresponding Kolmogorov numbers $d_k(I\colon F \to L_2)$ is crucial. We do not have this property in the "small smoothness setting" studied in this paper. Hence, results on Kolmogorov numbers in the uniform norm together with (7.4) serve as a powerful tool to investigate the sampling recovery problem in $L_2$ in the case of small smoothness. From (7.4) together with Theorem 6.2 we obtain in case $1/p < r < 1/2$
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}. \qquad (7.5)
\]
In addition, the endpoint results from Theorem 6.3 have direct counterparts for sampling numbers. Let us point out that the so far best-known upper bounds in the above situation were obtained by the use of sparse grid (Smolyak) recovery algorithms, see [26], [25], [6, 9], resulting in
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(r+1)} \qquad (7.6)
\]
for all $r > 1/p$. It is obvious that (7.5) improves on (7.6) in case $d > 2$ if $1/p < r < 1/2$. Note also that [8, Thm. 5.1] shows that, when restricting to sparse grid methods, the bound in (7.6) cannot be improved.
Hence, in the case of small smoothness, sparse grid methods cannot be optimal in the above situation when $d > 2$.

As for $I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)$, we obtain by (7.4) and Theorem 6.1 the bound
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(1-r)+r} \qquad (7.7)
\]
if $1/p < r < 1/2$. Clearly, the bound in Theorem 6.3 on the endpoint case also carries over to the sampling numbers. By the results in [25] together with complex interpolation we find the bound
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-\varepsilon} \qquad (7.8)
\]
for any $\varepsilon > 0$ in the case of small smoothness by using a sparse grid method. Clearly, (7.7) improves on (7.8). However, here it is not clear whether the analysis for the sparse grid method can be improved or not. A valid lower bound comes from the embedding $B^r_{p,1} \hookrightarrow W^r_p$ since $p > 1$. Hence, [8, Thm. 5.1] shows that any sparse grid method is asymptotically worse than $m^{-r}(\log m)^{(d-1)(r+1/2)}$. This yields in case $1/4 < r < 1/2$ and large enough $d$ that sparse grid methods cannot be optimal, since the bound in (7.7) is better. However, the sampling method behind the bounds in this paper is highly non-constructive, whereas the sparse grid methods are constructive and can be implemented. From this point of view, our results show that there might exist further constructive methods which improve on the sparse grid methods regarding the asymptotic error decay.

Let us finally mention that the correct order of the quantities $\varrho_m$ in the above situation is still unknown; we improved on the upper bounds. A good source for lower bounds are the error quantities with respect to numerical integration, as observed by Novak [17]. This, in connection with the lower bounds in [30] and [8], gives
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \gtrsim m^{-r}\big(\log m\big)^{(d-1)/2} \qquad (7.9)
\]
and
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \gtrsim m^{-r}\big(\log m\big)^{d-1}. \qquad (7.10)
\]
Comparing (7.10) to (7.5) we observe a difference in the log-exponent of only $r$, which therefore does not grow with $d$.
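The claim above that sparse grid methods cannot be optimal for $1/4 < r < 1/2$ and large $d$ follows from comparing exponents. As a plain check (with (7.7) read as $m^{-r}(\log m)^{(d-1)(1-r)+r}$ and the sparse grid lower bound as $m^{-r}(\log m)^{(d-1)(r+1/2)}$, as in the text):

```latex
\[
  (d-1)\Big(r+\tfrac12\Big) - \Big[(d-1)(1-r)+r\Big]
  \;=\; (d-1)\Big(2r-\tfrac12\Big) - r ,
\]
```

which is positive for all sufficiently large $d$ exactly when $r > 1/4$. In that regime the upper bound (7.7), achieved by a non-constructive method, eventually falls below the lower bound valid for every sparse grid method.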
Comparing (7.9) to (7.7), the difference is growing in $d$ in case $r < 1/2$. If $r = 1/2$ we are close to the lower bound coming from numerical integration.

Acknowledgment. The first author was supported by the Russian Federation Government Grant No. 14.W03.31.0031.

References

[1] E. Belinskii. Decomposition theorems and approximation by a "floating" system of exponentials. Transactions of the American Mathematical Society, 350:43–53, 1998.

[2] E. S. Belinsky. Estimates of entropy numbers and Gaussian measures for classes of functions with bounded mixed derivative. J. Approx. Theory, 93(1):114–127, 1998.

[3] J. Bergh and J. Löfström. Interpolation Spaces. An Introduction. Springer-Verlag, Berlin-New York, 1976. Grundlehren der Mathematischen Wissenschaften, No. 223.

[4] G. Byrenheid. Sparse representation of multivariate functions based on discrete point evaluations. Dissertation, Institut für Numerische Simulation, Universität Bonn, 2018.

[5] B. Carl. Entropy numbers, s-numbers, and eigenvalue problems. J. Funct. Analysis, 41:290–306, 1981.

[6] D. Dũng. B-spline quasi-interpolant representations and sampling recovery of functions with mixed smoothness. J. Complexity, 27(6):541–567, 2011.

[7] D. Dũng, V. N. Temlyakov, and T. Ullrich. Hyperbolic Cross Approximation. Advanced Courses in Mathematics, CRM Barcelona. Birkhäuser/Springer, 2019.

[8] D. Dũng and T. Ullrich. Lower bounds for the integration error for multivariate functions with mixed smoothness and optimal Fibonacci cubature for functions on the square. Math. Nachr., 288(7):743–762, 2015.

[9] D. Dũng. Sampling and cubature on sparse grids based on a B-spline quasi-interpolation. Found. Comput. Math., 16(5):1193–1240, 2016.

[10] T. Dunker, T. Kühn, M. Lifshits, and W. Linde. Metric entropy of integration operators and small ball probabilities for the Brownian sheet. Journ. Approx. Theory, 101:63–77, 1999.

[11] S. Foucart, A. Pajor, H. Rauhut, and T. Ullrich. The Gelfand widths of $\ell_p$-balls for $0 < p \le 1$. J.
Complexity, 26(6):629–640, 2010.

[12] A. Hinrichs, L. Markhasin, J. Oettershagen, and T. Ullrich. Optimal quasi-Monte Carlo rules on order 2 digital nets for the numerical integration of multivariate periodic functions. Numer. Math., 134(1):163–196, 2016.

[13] D. Krieg and M. Ullrich. Function values are enough for $L_2$-approximation. arXiv:1905.02516v3, 2019.

[14] D. Krieg and M. Ullrich. Function values are enough for $L_2$-approximation: Part II. arXiv:2011.01779, 2020.

[15] S. Mayer and T. Ullrich. Entropy numbers of finite dimensional mixed-norm balls and function space embeddings with small mixed smoothness. Constr. Approx., to appear.

[16] N. Nagel, M. Schäfer, and T. Ullrich. A new upper bound for sampling numbers. arXiv:2010.00327, 2020.

[17] E. Novak. Quadrature and widths. J. Approx. Theory, 47:195–202, 1986.

[18] A. Pajor and N. Tomczak-Jaegermann. Subspaces of small codimension of finite-dimensional Banach spaces. Proc. Amer. Math. Soc., 97(4):637–642, 1986.

[19] A. Pietsch. Operator Ideals. North-Holland, 1980.

[20] A. Pietsch. Approximation spaces. J. Approx. Theory, 32(2):115–134, 1981.

[21] A. Pietsch. Eigenvalues and s-Numbers, volume 13 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1987.

[22] A. S. Romanyuk. Best m-term trigonometric approximations of Besov classes of periodic functions of several variables. Izvestia RAN, Ser. Mat., 67:61–100, 2003.

[23] A. S. Romanyuk. Entropy numbers and widths for the Nikol'skij-Besov classes of functions of many variables in the space $L_\infty$. Analysis Math., 45(1):133–151, 2019.

[24] A. Seeger and W. Trebels. Low regularity classes and entropy numbers. Archiv der Mathematik, 92:147–157.

[25] W. Sickel and T. Ullrich. The Smolyak algorithm, sampling on sparse grids and function spaces of dominating mixed smoothness. East J. Approx., 13(4):387–425, 2007.

[26] V. N. Temlyakov. Approximation of Periodic Functions. Computational Mathematics and Analysis Series.
Nova Science Publishers Inc., Commack, NY, 1993.

[27] V. N. Temlyakov. Greedy algorithms with regard to multivariate systems with special structure. Constr. Approx., 16:399–425, 2000.

[28] V. N. Temlyakov. Constructive sparse trigonometric approximation and other problems for functions with mixed smoothness. Matem. Sb., 206:131–160, 2015.

[29] V. N. Temlyakov. Constructive sparse trigonometric approximation for functions with small mixed smoothness. Constr. Approx., 45:467–495, 2017.

[30] V. N. Temlyakov. Multivariate Approximation, volume 32 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2018.

[31] V. N. Temlyakov. On optimal recovery in $L_2$. arXiv:2010.03103, 2020.

[32] V. N. Temlyakov. Sampling discretization of integral norms of the hyperbolic cross polynomials. arXiv:2005.05967v1, 2020.

[33] V. N. Temlyakov and T. Ullrich. Bounds on Kolmogorov widths of classes with small mixed smoothness. arXiv:2012.09925v1, 2020.

[34] R. M. Trigub and E. S. Bellinsky. Fourier Analysis and Approximation of Functions. Kluwer Academic Publishers, Dordrecht, 2004. [Belinsky on front and back cover].

[35] M. Ullrich and T. Ullrich. The role of Frolov's cubature formula for functions with bounded mixed derivative. SIAM J. Numer. Anal., 54(2):969–993, 2016.

[36] J. Vybíral. Function spaces with dominating mixed smoothness. Dissertationes Math., 436:1–73, 2006.