Approximation of functions with small mixed smoothness in the uniform norm
Vladimir N. Temlyakov, Tino Ullrich*

University of South Carolina; Steklov Institute of Mathematics; Lomonosov Moscow State University; Moscow Center for Fundamental and Applied Mathematics; Faculty of Mathematics, 09107 Chemnitz, Germany

December 23, 2020
Abstract
In this paper we present results on asymptotic characteristics of multivariate function classes in the uniform norm. Our main interest is the approximation of functions with mixed smoothness parameter not larger than $1/2$. Our focus will be on the behavior of the best $m$-term trigonometric approximation as well as the decay of Kolmogorov and entropy numbers in the uniform norm. It turns out that these quantities share a few fundamental abstract properties, such as their behavior under real interpolation, so that they can be treated simultaneously. We start by proving estimates on finite rank convolution operators with range in a step hyperbolic cross. These results imply bounds for the corresponding function space embeddings by a well-known decomposition technique. The decay of the Kolmogorov numbers has direct implications for the problem of sampling recovery in $L_2$ in situations where recent results in the literature are not applicable because the corresponding approximation numbers are not square summable.

Keywords and phrases: Best $m$-term trigonometric approximation, Kolmogorov numbers, entropy numbers, small smoothness, uniform norm

Mathematics Subject Classification: 41A10, 41A25, 41A60, 41A63, 42A10, 68Q25, 94A20

1 Introduction

In this paper we provide new upper bounds for the best $m$-term trigonometric approximation ($\sigma_m$), the Kolmogorov numbers ($d_m$), and the entropy numbers ($e_m$) of multivariate function classes in the uniform norm. It is nowadays widely believed that the target space $L_\infty(\mathbb{T}^d)$ comes with additional difficulties and often requires new and involved techniques. Another challenge is the treatment of classes of periodic functions with small mixed smoothness (derivative or difference), where several questions concerning approximation and integration have not yet been settled.
We make progress towards the solution of the Outstanding Open Problems listed in [7]. Below we study the quantities $s_m(T)$, where $s_m(T) \in \{d_m(T), e_m(T), \sigma_m(T)\}$ and $T$ denotes an operator mapping into $L_\infty(\mathbb{T}^d)$. We follow the classical approach and start with new results for finite rank convolution operators $T = S_{Q_n}$, the orthogonal projection onto the trigonometric polynomials with frequencies in the dyadic step hyperbolic cross $Q_n \subset \mathbb{Z}^d$, defined by
\[
\varrho(s) := \big\{k \in \mathbb{Z}^d : [2^{s_j-1}] \le |k_j| < 2^{s_j},\ j = 1, \ldots, d\big\}, \tag{1.1}
\]
\[
Q_n := \bigcup_{\|s\|_1 \le n} \varrho(s). \tag{1.2}
\]
Namely, for $2 \le p < \infty$ it holds
\[
s_m(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}, \quad m \le |Q_n|.
\]
The result is based on the common real interpolation properties of all three asymptotic characteristics in connection with a "corner result" due to Pajor, Tomczak-Jaegermann [18], Belinskii [2] and Dunker, Kühn, Linde, Lifshits [10]. A corresponding corner result for the best $m$-term trigonometric approximation $\sigma_m$ in the univariate case was obtained by Belinskii [1], who used a probabilistic technique, and in the multivariate case by Temlyakov [28], who used the greedy approximation technique.

It is well known that the analysis of approximation problems for function classes with small mixed smoothness involves several technical difficulties, see for instance [36, Rem. 4.10] and [15] for the study of entropy numbers. Similar difficulties have already been observed for the quantities of numerical integration, see [35], where the bounds look similar. Indeed, these quantities serve as lower bounds for Kolmogorov numbers in $L_\infty$, which has been observed by Novak [17]. In this paper we give the asymptotic bounds
\[
s_m(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}
\]
as well as
\[
s_m(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim m^{-r}(\log m)^{d-1+r}
\]
for function classes with small mixed smoothness $1/p < r < 1/2$.

* Corresponding author: [email protected]
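The combinatorics of the step hyperbolic cross (1.1), (1.2) can be checked directly. The following small script is our own illustration (not part of the paper): it enumerates $Q_n$ by brute force for $d = 2$, compares with the disjoint-block count $|Q_n| = \sum_{\|s\|_1 \le n} |\varrho(s)|$, and prints $2^n n^{d-1}$, the order of $|Q_n|$ used throughout.

```python
from itertools import product

def block_index(k: int) -> int:
    # s(k) = 0 for k = 0; otherwise the unique s >= 1 with 2^(s-1) <= |k| < 2^s,
    # so that k lies in the dyadic block rho(s) from (1.1)
    return 0 if k == 0 else abs(k).bit_length()

def step_hyperbolic_cross(n: int, d: int) -> set:
    # brute force: all k in the bounding box with s(k_1) + ... + s(k_d) <= n
    R = 2 ** n
    return {k for k in product(range(-R, R + 1), repeat=d)
            if sum(block_index(kj) for kj in k) <= n}

def rho_cardinality(s) -> int:
    # |rho(s)| factorizes: one choice for s_j = 0 (namely k_j = 0), else 2^{s_j} choices
    out = 1
    for sj in s:
        out *= 1 if sj == 0 else 2 ** sj
    return out

d, n = 2, 5
cross = step_hyperbolic_cross(n, d)
by_blocks = sum(rho_cardinality(s) for s in product(range(n + 1), repeat=d)
                if sum(s) <= n)
print(len(cross), by_blocks, 2 ** n * n ** (d - 1))  # 321 321 160
```

Both counts agree because the blocks $\varrho(s)$ partition $Q_n$, and the last number confirms that $|Q_n|$ and $2^n n^{d-1}$ are of the same order here.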
In the endpoint situation $r = 1/2$ we encounter an additional $(\log\log m)^{3/2}$ factor, see Theorem 6.3 below. It is still open whether these bounds are sharp when $d > 2$. The reader can find a brief discussion of the case $d = 2$ in Remark 6.4 below. Thus, we obtain new results on three asymptotic characteristics – Kolmogorov numbers $d_m$, entropy numbers $e_m$, and best $m$-term approximations $\sigma_m$ – for two kinds of classes, $W^r_p$ and $H^r_p$, in the case of small smoothness $r \le 1/2$, when the error is evaluated in the uniform norm $L_\infty$. There is an extensive history of studying each of the above asymptotic characteristics. They were studied for large smoothness $r > 1/2$, for classes $W^r_p$, $H^r_p$, and for Besov classes $B^r_{p,\theta}$, where the error is evaluated in the $L_q$ norm, $1 \le q \le \infty$. We refer the reader to the two recent books [7] and [30] for a detailed historical discussion. For the $d_m$ see [7, Sect. 4.3] and [30, Sect. 5.3]. For the $e_m$ see [7, Chapt. 6] and [30, Chapt. 7]. Finally, for the $\sigma_m$ see [7, Chapt. 7] and [30, Chapt. 9]. In addition to the above books we mention the recent paper by Romanyuk [23].

We continue the investigation of asymptotic characteristics of classes of multivariate functions with small mixed smoothness started in [33]. There we concentrated on asymptotic characteristics from linear approximation theory – the Kolmogorov widths – and pointed out some applications of new results on the Kolmogorov widths to the sampling recovery problem. In this paper the focus is on asymptotic characteristics from nonlinear approximation theory – sparse approximation with respect to the trigonometric system – and on entropy numbers. These are interpreted as pseudo $s$-numbers sharing a few fundamental properties.
We use a classical decomposition machinery (similar to the one used in [33]), where we rely on finite rank operators ranging in subspaces of trigonometric polynomials with frequencies from hyperbolic crosses as building blocks. However, in contrast to [33], we heavily apply well-known tools from the interpolation theory of operators to analyze the finite rank operators. In [33] an elementary approach is used to estimate widths of function classes, based on the application of a standard cut-off operator to dyadic building blocks. Certainly, deep at the roots both approaches are related, since the cut-off operator is also used for computing $K$-functionals in real interpolation theory. However, the technical realizations of these approaches are different and may be interesting for different communities.

Recent observations regarding the problem of optimal sampling recovery of functions in $L_2$ bring classes with small mixed smoothness into focus again. Since several newly developed techniques only work for Hilbert-Schmidt operators [13], [16] or, more generally, in situations where certain asymptotic characteristics (approximation numbers) are square summable [14], we need new techniques in situations where this is not the case. Especially in the range of small smoothness we are far away from square summability. Nevertheless, multivariate function classes of this type are of interest, since, for instance, a mixed Hölder-Zygmund regularity $r \le 1/2$ falls into this scope. Recently, see [31], the sampling recovery error in $L_2$ was directly related to the Kolmogorov numbers in $L_\infty$. It seems that, especially in the case of small smoothness, this represents the only available tool at the moment apart from sparse grid methods. Surprisingly, as an application of our results on Kolmogorov numbers we show that any sparse grid technique performs asymptotically worse by a $\log$-factor with exponent growing with the dimension $d$.
This motivates further research in finding better constructive sampling algorithms.

The paper is organized as follows. In Sections 2 and 3 we define the asymptotic characteristics of interest in a framework of operators and pseudo $s$-numbers. This notion goes back to Pietsch [19]. We particularly pay attention to the real interpolation properties. Section 4 deals with the relevant function spaces with bounded mixed derivative or difference. Here we also give a new real interpolation formula. Afterwards, in Section 5 we establish first results for the orthogonal projection operators with respect to the (trigonometric) step hyperbolic crosses. These estimates are used to obtain the main results in Section 6 for function space embeddings into $L_\infty(\mathbb{T}^d)$. Finally, in Section 7 we discuss the obtained results and give applications to the problem of sampling recovery.

2 Asymptotic characteristics and pseudo s-numbers

In this section we introduce the asymptotic characteristics of interest, namely the Kolmogorov and entropy numbers as well as the error of best approximation with respect to an approximation scheme.

Definition 2.1 (Kolmogorov numbers). For Banach spaces
$A$, $B$ and a linear operator $T\colon A \to B$, we define the $m$-th Kolmogorov number as
\[
d_m(T\colon A \to B) := \inf_{\substack{L_{m-1} \subset B \\ \dim L_{m-1} < m}} \ \sup_{\|a\|_A \le 1} \ \inf_{b \in L_{m-1}} \|Ta - b\|_B.
\]

Lemma 2.2 (Properties of $d_m$). Let $A, B, C$ be Banach spaces and $S, T \in \mathcal{L}(A,B)$, $R \in \mathcal{L}(B,C)$. We have the following properties.

(K1) $\|T\|_{\mathcal{L}(A,B)} = d_1(T) \ge d_2(T) \ge \cdots \ge 0$,

(K2) For all $m_1, m_2 \in \mathbb{N}$, it holds $d_{m_1+m_2-1}(R \circ S) \le d_{m_1}(R)\, d_{m_2}(S)$.

(K3) For all $m_1, m_2 \in \mathbb{N}$, it holds $d_{m_1+m_2-1}(S + T) \le d_{m_1}(S) + d_{m_2}(T)$.

(K4) $d_m(T) = 0$ whenever $\mathrm{rank}(T) < m$.

Note that, except for (K4), these properties are shared by the dyadic entropy numbers $(e_m)_m$, which we define below. To incorporate also the dyadic entropy numbers into the framework, Pietsch introduced the notion of pseudo $s$-numbers. We use this notion here in a slightly different way.

Definition 2.3 (Entropy numbers). Let $T\colon A \to B$ be a linear operator between two Banach spaces $A, B$. The entropy numbers of $T$ are defined as
\[
e_m(T\colon A \to B) := \inf\Big\{\varepsilon > 0 : \exists\, b_1, \ldots, b_{2^{m-1}} \in B \text{ with } T(U_A) \subset \bigcup_{k=1}^{2^{m-1}} (b_k + \varepsilon \cdot U_B)\Big\}, \quad m \in \mathbb{N}.
\]

Let us finally recall the definition of the asymptotic quantity measuring the best approximation with respect to an approximation scheme. This notion goes back to Pietsch [20] and includes the case of the best $m$-term approximation with respect to a dictionary $\mathcal{D}$. We will use it later for the multivariate trigonometric system. Let $X, Y$ denote arbitrary Banach spaces and let $(Y_n)_{n \in \mathbb{N}_0}$ denote a sequence of subsets of $Y$ satisfying

(Y1) $Y_0 = \{0\}$,

(Y2) $Y_n \subset Y_{n+1}$, $n \in \mathbb{N}_0$,

(Y3) $\lambda Y_n \subset Y_n$ for all $n \in \mathbb{N}_0$ and all scalars $\lambda$, and finally

(Y4) $Y_n + Y_m \subset Y_{m+n}$.

Definition 2.4 (Error of best approximation, [20]). Let $X$ and $Y$ be as above. Let further $T\colon X \to Y$ denote a linear and bounded operator. Then we define the asymptotic characteristic
\[
\sigma_m(T\colon X \to Y; (Y_n)_n) := \sup_{\|x\|_X \le 1} \ \inf_{y \in Y_{m-1}} \|Tx - y\|_Y.
\]

It turns out that counterparts of (K1), (K3) and (K4) hold true. (K2) has to be replaced by a weaker version (S2) which, however, is sufficient for our approach.

Lemma 2.5 (Properties of $\sigma_m$).
Let $Z, X, Y$ be Banach spaces and $S \in \mathcal{L}(Z,X)$, $R, T \in \mathcal{L}(X,Y)$. Let further $(Y_m)_m$ be a sequence of subsets in $Y$ fulfilling (Y1),...,(Y4) above. We have the following properties for $\sigma_m(T\colon X \to Y; (Y_k)_k)$.

(S1) $\|T\|_{\mathcal{L}(X,Y)} = \sigma_1(T) \ge \sigma_2(T) \ge \cdots \ge 0$,

(S2) For all $m \in \mathbb{N}$, it holds $\sigma_m(T \circ S) \le \sigma_m(T)\,\|S\|$.

(S3) For all $m_1, m_2 \in \mathbb{N}$, it holds $\sigma_{m_1+m_2-1}(R + T) \le \sigma_{m_1}(R) + \sigma_{m_2}(T)$.

(S4) If $\mathrm{ran}\, T \subset Y_{m-1}$ then $\sigma_m(T) = 0$.

3 Interpolation properties of s-numbers

We first need the $K$-functional of a Banach couple embedded into one joint Hausdorff space $\mathcal{A}$.

Definition 3.1 ($K$-functional, [3]). For two Banach spaces $A_0$, $A_1$ which are jointly embedded into a common Hausdorff space $\mathcal{A}$, we define for $a \in A_0 + A_1$
\[
K(t, a; A_0, A_1) = \inf_{a = a_0 + a_1} \big(\|a_0\|_{A_0} + t\|a_1\|_{A_1}\big).
\]

The following interpolation results are well known, see [19, Sect. 11.6.8, 12.1.11]. The below condition on the intermediate space $A_\theta$ is called $K$-type $\theta$ with respect to the couple $(A_0, A_1)$.

Theorem 3.2 (Interpolation of entropy and Kolmogorov numbers, [19]). Let $A_0$, $A_1$ and $A_\theta$ be embedded into the same Hausdorff space $\mathcal{A}$. The intermediate space $A_\theta$ is supposed to satisfy
\[
\sup_{t > 0} t^{-\theta} K(t, a) \le C \|a\|_{A_\theta}. \tag{$\Theta$}
\]
Then, we have for any linear operator with $T\colon A_0 \to B$ and $T\colon A_1 \to B$ that
\[
e_{n+m-1}(T\colon A_\theta \to B) \le C \cdot e_n(T\colon A_0 \to B)^{1-\theta}\, e_m(T\colon A_1 \to B)^{\theta}
\]
and
\[
d_{n+m-1}(T\colon A_\theta \to B) \le C \cdot d_n(T\colon A_0 \to B)^{1-\theta}\, d_m(T\colon A_1 \to B)^{\theta}.
\]

The counterpart for the numbers $(\sigma_m(T))_m$ is straightforward. Since we did not find such a result in the literature, we decided to state it here explicitly and give a proof.

Theorem 3.3 (Best approximation and interpolation). Let $X_0$, $X_1$ and $X_\theta$ be embedded into the same Hausdorff space $\mathcal{A}$. The intermediate space $X_\theta$ is supposed to satisfy ($\Theta$) with respect to the couple $(X_0, X_1)$.
Then, we have for any linear operator with $T\colon X_0 \to Y$ and $T\colon X_1 \to Y$, with $Y$ and $(Y_k)_k$ as in Definition 2.4,
\[
\sigma_{n+m-1}(T\colon X_\theta \to Y; (Y_k)_k) \le C \cdot \sigma_n(T\colon X_0 \to Y; (Y_k)_k)^{1-\theta}\, \sigma_m(T\colon X_1 \to Y; (Y_k)_k)^{\theta}.
\]

Proof. Let us abbreviate $\sigma_n^0 := \sigma_n(T\colon X_0 \to Y; (Y_k)_k)$ and $\sigma_m^1 := \sigma_m(T\colon X_1 \to Y; (Y_k)_k)$. For any $\varepsilon > 0$, $x_0 \in X_0$ and $x_1 \in X_1$ we clearly find elements $y_0 \in Y_{n-1}$, $y_1 \in Y_{m-1}$ such that
\[
\|Tx_0 - y_0\|_Y \le (1+\varepsilon)\,\sigma_n^0\,\|x_0\|_{X_0}, \qquad \|Tx_1 - y_1\|_Y \le (1+\varepsilon)\,\sigma_m^1\,\|x_1\|_{X_1}. \tag{3.1}
\]
Let now $x \in X_\theta$ and $t > 0$. Then, for any $\delta > 0$ there exist $x_0, x_1$ such that $x = x_0 + x_1$ and
\[
\|x_0\|_{X_0} + t\|x_1\|_{X_1} \le C t^{\theta}\, \|x\|_{X_\theta}\,(1+\delta).
\]
Put $t := \sigma_m^1/\sigma_n^0$ in the sequel (assuming $\sigma_n^0 > 0$, otherwise there is nothing to prove). Hence, due to (3.1), there are $y_0, y_1$ such that
\[
\big\|Tx - (y_0 + y_1)\big\|_Y \le \|Tx_0 - y_0\|_Y + \|Tx_1 - y_1\|_Y \le (1+\varepsilon)\big(\sigma_n^0 \|x_0\|_{X_0} + \sigma_m^1 \|x_1\|_{X_1}\big)
\]
\[
= (1+\varepsilon)\,\sigma_n^0\Big(\|x_0\|_{X_0} + \frac{\sigma_m^1}{\sigma_n^0}\|x_1\|_{X_1}\Big) \le C(1+\varepsilon)(1+\delta)\,\sigma_n^0\Big(\frac{\sigma_m^1}{\sigma_n^0}\Big)^{\theta}\|x\|_{X_\theta}.
\]
Put $y = y_0 + y_1$ and observe by property (Y4) that $y \in Y_{m+n-2} \subset Y_{m+n-1}$. Since $\varepsilon, \delta$ can be chosen arbitrarily small, we have
\[
\sigma_{n+m-1}(T\colon X_\theta \to Y) \le C\,\sigma_n(T\colon X_0 \to Y)^{1-\theta}\,\sigma_m(T\colon X_1 \to Y)^{\theta}. \qquad \square
\]

4 Function spaces of dominating mixed smoothness

Let us introduce the function spaces of interest. Define for $x \in \mathbb{T}$ the univariate Bernoulli kernel
\[
F_{r,\alpha}(x) := 1 + 2\sum_{k=1}^{\infty} k^{-r}\cos(kx - \alpha\pi/2)
\]
and define the multivariate Bernoulli kernels as the corresponding tensor products
\[
F_{r,\boldsymbol\alpha}(\boldsymbol x) := \prod_{j=1}^{d} F_{r,\alpha_j}(x_j), \quad \boldsymbol x = (x_1, \ldots, x_d) \in \mathbb{T}^d,\ \boldsymbol\alpha = (\alpha_1, \ldots, \alpha_d). \tag{4.1}
\]

Definition 4.1. Let $r > 0$, $\boldsymbol\alpha \in \mathbb{R}^d$ and $1 \le p \le \infty$. Then $W^r_{p,\boldsymbol\alpha}$ is defined as the normed space of all $f \in L_p(\mathbb{T}^d)$ such that
\[
f = F_{r,\boldsymbol\alpha} * \varphi := (2\pi)^{-d}\int_{\mathbb{T}^d} F_{r,\boldsymbol\alpha}(\boldsymbol x - \boldsymbol y)\,\varphi(\boldsymbol y)\,d\boldsymbol y
\]
for some $\varphi \in L_p(\mathbb{T}^d)$, equipped with the norm $\|f\|_{W^r_{p,\boldsymbol\alpha}} := \|\varphi\|_p$.

For the Littlewood-Paley characterization we need the building blocks $\delta_s(f, \boldsymbol x)$, defined with (1.1) by
\[
\delta_s(f, \boldsymbol x) := \sum_{k \in \varrho(s)} \hat f(k)\, e^{i k\cdot \boldsymbol x}. \tag{4.2}
\]

Lemma 4.2.
If $1 < p < \infty$ and $r > 0$ then the norms $\|f\|_{W^r_{p,\boldsymbol\alpha}}$ with different $\boldsymbol\alpha$ are all equivalent to the Littlewood-Paley type norm
\[
\|f\|_{W^r_p(\mathbb{T}^d)} \asymp \Big\|\Big(\sum_{s \in \mathbb{N}_0^d} 2^{2r\|s\|_1}\,\big|\delta_s(f, \boldsymbol x)\big|^2\Big)^{1/2}\Big\|_p.
\]

We now proceed with spaces with bounded mixed difference. Let $e$ be any subset of $\{1, \ldots, d\}$. For multivariate functions $f\colon \mathbb{T}^d \to \mathbb{C}$ and $\boldsymbol h \in [0, 2\pi]^d$ the mixed first order difference operator $\Delta^e_{\boldsymbol h}$ is defined by
\[
\Delta^e_{\boldsymbol h} := \prod_{i \in e} \Delta_{h_i, i} \quad \text{and} \quad \Delta^{\emptyset}_{\boldsymbol h} = \mathrm{I},
\]
where $\mathrm{I} f = f$ and $\Delta_{h_i, i}$ is the univariate first order difference operator $\Delta_h g := g(\cdot + h) - g(\cdot)$ applied to the $i$-th variable of $f$ with the other variables kept fixed. We first introduce the spaces/classes $H^r_p$ of functions with bounded mixed difference. We restrict ourselves to first order difference operators since in this paper we are only interested in small smoothness.

Definition 4.3. Let $0 < r < 1$ and $1 \le p \le \infty$. We define the space $H^r_p$ as the set of all $f \in L_p(\mathbb{T}^d)$ such that for any $e \subset \{1, \ldots, d\}$
\[
\big\|\Delta^e_{\boldsymbol h}(f, \cdot)\big\|_p \le C \prod_{i \in e} |h_i|^{r}
\]
for some positive constant $C$, and introduce the norm in this space
\[
\|f\|_{H^r_p} := \sum_{e \subset \{1, \ldots, d\}} |f|_{H^r_p(e)}, \qquad |f|_{H^r_p(e)} := \sup_{0 < |h_i| \le \pi,\ i \in e} \Big(\prod_{i \in e} |h_i|^{-r}\Big)\big\|\Delta^e_{\boldsymbol h}(f, \cdot)\big\|_p.
\]

For the purpose of the paper a characterization in terms of Fourier analytic building blocks is necessary. Since we also need to deal with $p = 1$ and $p = \infty$, the blocks $\delta_s(f, \boldsymbol x)$ will not be sufficient. We need the counterparts based on the classical de la Vallée Poussin means, see [7, Chapt. 2]. Denote by $V_m(t)$ the univariate de la Vallée Poussin kernel
\[
V_m(t) = \frac{1}{m}\sum_{k=m}^{2m-1} D_k(t) = \frac{\sin(mt/2)\sin(3mt/2)}{m\sin^2(t/2)}, \quad m \in \mathbb{N}.
\]
We further denote for $s \in \mathbb{N}_0$
\[
A_s(t) := \begin{cases} V_{2^s}(t) - V_{2^{s-1}}(t) &: s \ge 1,\\ V_1(t) &: s = 0. \end{cases}
\]
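To make the mixed difference operator of Definition 4.3 concrete, here is a small sketch of our own (the function names are ours, not from the paper): it composes univariate first order differences coordinate by coordinate. For the bilinear function $f(x_1, x_2) = x_1 x_2$ the full mixed difference $\Delta^{\{1,2\}}_{\boldsymbol h} f = h_1 h_2$ is constant in $\boldsymbol x$.

```python
def univariate_difference(g, h, i):
    # (Delta_{h,i} g)(x) = g(x + h e_i) - g(x): first order difference in variable i
    def shifted(x):
        y = list(x)
        y[i] += h
        return g(tuple(y)) - g(x)
    return shifted

def mixed_difference(f, e, h):
    # Delta^e_h: compose the univariate differences over the coordinates i in e
    g = f
    for i in e:
        g = univariate_difference(g, h[i], i)
    return g

# bilinear example: the mixed difference is exactly h1 * h2, independent of x
f = lambda x: x[0] * x[1]
delta = mixed_difference(f, e=(0, 1), h=(0.25, 0.5))
print(delta((0.3, 0.7)))
```

The univariate factors commute, so the order of the coordinates in `e` does not matter; this mirrors the product structure $\Delta^e_{\boldsymbol h} = \prod_{i \in e} \Delta_{h_i,i}$.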
In the multivariate case we use the tensorized version and define for $s \in \mathbb{N}_0^d$
\[
A_s(\boldsymbol x) := \prod_{i=1}^{d} A_{s_i}(x_i), \quad \boldsymbol x = (x_1, \ldots, x_d).
\]
Finally, the convolution operator $A_s(f, \cdot)$ is given by
\[
A_s(f, \cdot) := f * A_s. \tag{4.3}
\]
It holds $f = \sum_{s \in \mathbb{N}_0^d} A_s(f, \cdot)$ and, for $1 \le p \le \infty$,
\[
\|A_s\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp 1, \quad s \in \mathbb{N}_0^d. \tag{4.4}
\]

Lemma 4.4. Let $0 < r < 1$. We have the following equivalent characterizations for $f \in L_p(\mathbb{T}^d)$.

(i) If $1 \le p \le \infty$ we have
\[
\|f\|_{H^r_p(\mathbb{T}^d)} \asymp \sup_{s \in \mathbb{N}_0^d} \big\|A_s(f, \cdot)\big\|_p\, 2^{r\|s\|_1}. \tag{4.5}
\]

(ii) If $1 < p < \infty$ we have with (4.2)
\[
\|f\|_{H^r_p(\mathbb{T}^d)} \asymp \sup_{s \in \mathbb{N}_0^d} \big\|\delta_s(f, \cdot)\big\|_p\, 2^{r\|s\|_1}. \tag{4.6}
\]

Remark 4.5. For technical reasons we will also need the refined spaces $B^r_{p,q}(\mathbb{T}^d)$, $1 \le q \le \infty$, normed by
\[
\|f\|_{B^r_{p,q}(\mathbb{T}^d)} := \Big(\sum_{s \in \mathbb{N}_0^d} 2^{r\|s\|_1 q}\,\big\|A_s(f, \cdot)\big\|_p^q\Big)^{1/q}. \tag{4.7}
\]
In this notation we have $H^r_p(\mathbb{T}^d) = B^r_{p,\infty}(\mathbb{T}^d)$ in the sense of equivalent norms. Note also that in case $1 < p < \infty$ we may replace $A_s(f, \cdot)$ by $\delta_s(f, \cdot)$ in (4.7). This together with Lemma 4.2 yields the identity $B^r_{2,2}(\mathbb{T}^d) = W^r_2(\mathbb{T}^d)$ in the sense of equivalent norms.

Let us finally state a result on real interpolation of classes with bounded mixed difference which may be of interest on its own. Since the focus of the paper is on small smoothness, we restrict ourselves here to smoothness parameters less than one. The below theorem also works for higher smoothness (using an isomorphism different from the Faber-Schauder system in the proof).

Theorem 4.6. Let $2 < p < \infty$, $0 < r_0 < 1/2$ and $r_1 = r_0 + 1/2$. Then
\[
\big(B^{r_0}_{\infty,\infty}(\mathbb{T}^d),\, W^{r_1}_2(\mathbb{T}^d)\big)_{\theta,p} = B^{r}_{p,p}(\mathbb{T}^d)
\]
(in the sense of equivalent norms) if $\theta = 2/p$ and $r = r_0 + 1/p$.

Proof. First note that $W^{r_1}_2(\mathbb{T}^d) = B^{r_1}_{2,2}(\mathbb{T}^d)$.
The interpolation formula is an easy consequence of the classical interpolation formula
\[
(\ell_\infty, \ell_2)_{\theta,p} = \ell_p \quad \text{with } 2 < p < \infty,\ \theta = 2/p,
\]
and the fact that we have a common isomorphism $J_r$ (depending on $r$) mapping all three occurring function spaces to either $\ell_\infty$, $\ell_2$ or $\ell_p$. As an isomorphism we may use the periodic Faber-Schauder representation given in [12, (3.5), (3.6)] together with [12, Prop. 3.4 and 3.5]. $\square$

5 Convolution operators onto the step hyperbolic cross

Let us refer to the definitions (4.2) and (4.3) of the dyadic block operators $\delta_s(f, \boldsymbol x)$ and $A_s(f, \boldsymbol x)$ based on the tensorized dyadic Dirichlet kernel and the tensorized de la Vallée Poussin kernel, respectively. We further define for $n \in \mathbb{N}_0$ the hyperbolic cross operators
\[
S_{Q_n} f := \sum_{\|s\|_1 \le n} \delta_s(f, \cdot) \quad \text{and} \quad A_{Q_n} f := \sum_{\|s\|_1 \le n} A_s(f, \cdot).
\]
The image of $S_{Q_n}$ represents the space of trigonometric polynomials with frequencies supported on the step hyperbolic cross $Q_n$. We denote this space by $\mathcal{T}(Q_n) := \mathrm{range}(S_{Q_n})$. We also need the operators
\[
S^{\Delta}_{Q_n} := S_{Q_n} - S_{Q_{n-1}} \quad \text{and} \quad A^{\Delta}_{Q_n} := A_{Q_n} - A_{Q_{n-1}}
\]
for $n \in \mathbb{N}$ (we set $S^{\Delta}_{Q_0} := S_{Q_0}$ and $A^{\Delta}_{Q_0} := A_{Q_0}$). Let us first observe that
\[
S_{Q_n} = A_{Q_n} \circ S_{Q_n} = S_{Q_n} \circ A_{Q_n} \tag{5.1}
\]
and, for some integer $b \in \mathbb{N}$,
\[
A_{Q_n} = S_{Q_{n+b}} \circ A_{Q_n} = A_{Q_n} \circ S_{Q_{n+b}}. \tag{5.2}
\]
It is well known that in case $1 < p < \infty$
\[
\|S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp \|A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_p(\mathbb{T}^d)\| \asymp 1, \quad n \in \mathbb{N}_0. \tag{5.3}
\]
Moreover, in case $p = \infty$ we have
\[
\|A_{Q_n}\colon L_\infty(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\| \lesssim n^{d-1}, \quad n \in \mathbb{N}, \tag{5.4}
\]
see [30, Lem. 4.2.3]. We are interested in the Kolmogorov and entropy numbers of such finite rank operators. More generally, we define for a finite set of frequencies $E \subset \mathbb{Z}^d$ the corresponding projection operator $S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)$ given by
\[
S_E f(\boldsymbol x) := \sum_{k \in E} \hat f(k)\, e^{i k\cdot \boldsymbol x}.
\]
In this context it is natural to study the best approximation error with respect to the multivariate trigonometric system with free spectrum, i.e., the best $m$-term trigonometric approximation of an operator $T$ defined by
\[
\sigma_m(T) := \sigma_m(T\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d); (Y_n)_n) \tag{5.5}
\]
with
\[
Y_n := \Big\{t(\boldsymbol x) = \sum_{k \in \Lambda} c_k e^{i k\cdot \boldsymbol x} : |\Lambda| \le n,\ c_k \in \mathbb{C}\Big\}. \tag{5.6}
\]
If not stated otherwise, the quantity $\sigma_m$ will always be used in the context of best $m$-term trigonometric approximation in the sequel, see (5.5), (5.6). We may then drop the $(Y_n)_n$ in the notation.

In order to make use of the above interpolation results in Theorems 3.2 and 3.3, we need an appropriate "corner result". The below bounds for entropy and Kolmogorov numbers are due to Pajor and Tomczak-Jaegermann [18] and Belinskii [2], see also 11.2.2 and 11.3.1 in [34], as well as Dunker, Kühn, Linde, Lifshits [10]. The version for $(\sigma_m)_m$ is due to Temlyakov, see Theorem 2.6 in [28]. For a univariate version of this result we refer to Belinskii [1].

Theorem 5.1 ([18], [2], [10], [28]). Let $E \subset \mathbb{Z}^d$ be a finite set such that $E \subset B_\infty(R) = \{\boldsymbol x \in \mathbb{R}^d : \|\boldsymbol x\|_\infty \le R\}$ for some $R \ge 2$. Then we have

(i) for $s_m \in \{d_m, \sigma_m\}$
\[
s_m(S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{|E|\log R}{m}\Big)^{1/2}, \quad m \le |E|.
\]

(ii) For the entropy numbers it holds
\[
e_m(S_E\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \begin{cases} \Big(\dfrac{|E|\log R}{m}\Big)^{1/2} &: m \le |E|,\\[1ex] 2^{-m/|E|}\sqrt{\log R} &: m > |E|. \end{cases}
\]

The following bounds are direct consequences of Theorem 5.1 in connection with Theorem 3.2. Note that (i) below for $s_m = e_m$ is already known. It was obtained in [32] to prove Marcinkiewicz type discretization theorems for hyperbolic cross polynomials. The proof there is based on a different technique.

Theorem 5.2. Let $2 \le p < \infty$. Then, it holds for $1 \le m \le |Q_n|$ and $s_m \in \{d_m, e_m, \sigma_m\}$

(i)
\[
s_m(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}.
\]
(ii) If $r \ge 0$ then
\[
s_m(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}},
\]

(iii) and if $r > 1/p$
\[
s_m(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)+\frac{1}{p}}.
\]

Proof. Let us start by proving (i). It is well known for the real interpolation method $(\cdot,\cdot)_{\theta,q}$ that
\[
\big(L_\infty(\mathbb{T}^d), L_2(\mathbb{T}^d)\big)_{\theta,p} = L_p(\mathbb{T}^d) \quad \text{whenever} \quad \frac{1}{p} = \frac{1-\theta}{\infty} + \frac{\theta}{2},
\]
which means $\theta = 2/p$. Hence, the condition ($\Theta$) is fulfilled with $A_\theta = L_p(\mathbb{T}^d)$, and we may interpolate the numbers $s_m$ according to Theorems 3.2 and 3.3. This gives for the operator $A_{Q_n}$
\[
s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A_{Q_n}\colon L_\infty(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \cdot s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}
\]
\[
\lesssim n^{(d-1)(1-\theta)}\, s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta},
\]
where we used (5.4). By (5.2) we may factorize $A_{Q_n} = S_{Q_{n+b}} \circ A_{Q_n}$ through $S_{Q_{n+b}}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)$ and $A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_2(\mathbb{T}^d)$. Then (K2) and (5.3) yield
\[
s_m\big(A_{Q_n}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta} \lesssim s_m\big(S_{Q_{n+b}}\colon L_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}.
\]
Applying Theorem 5.1 to the right-hand side (with $|Q_{n+b}| \asymp 2^n n^{d-1}$ and $\log R \asymp n$) together with $\theta = 2/p$ yields for $m \le |Q_n|$
\[
s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{5.7}
\]
Using the first identity in (5.1) together with the properties (K2), (S2) and (5.3) gives
\[
s_m\big(S_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim s_m\big(A_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big),
\]
where the right-hand side can be bounded by (5.7). This proves (i).

For (ii) observe that by Lemma 4.2
\[
s_m\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \asymp 2^{-rn}\, s_m\big(S^{\Delta}_{Q_n}\colon L_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big),
\]
which can be bounded using (i).

As for (iii), we have by the real interpolation formula in Theorem 4.6 with $r_0 = r - 1/p$, $r_1 = r - 1/p + 1/2$ and $\theta = 2/p$
\[
\big(B^{r_0}_{\infty,\infty}(\mathbb{T}^d),\, B^{r_1}_{2,2}(\mathbb{T}^d)\big)_{\theta,p} = B^{r}_{p,p}(\mathbb{T}^d), \quad 2 < p < \infty.
\]
As a consequence, we obtain the condition ($\Theta$) for $A_\theta = B^r_{p,p}(\mathbb{T}^d)$ with respect to the above couple. Interpolating according to Theorems 3.2 and 3.3 gives
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A^{\Delta}_{Q_n}\colon B^{r_0}_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \cdot s_m\big(A^{\Delta}_{Q_n}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)^{\theta}. \tag{5.8}
\]
By Lemma 4.4 together with (4.4) and $\theta = 2/p$ we find
\[
\big\|A^{\Delta}_{Q_n}\colon B^{r_0}_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big\|^{1-\theta} \lesssim n^{(d-1)\left(1-\frac{2}{p}\right)}\, 2^{-r_0 n \left(1-\frac{2}{p}\right)}. \tag{5.9}
\]
Since $B^{r_1}_{2,2}(\mathbb{T}^d) = W^{r_1}_2(\mathbb{T}^d)$ in the sense of equivalent norms, we may use (ii) with $p = 2$ and plug the result together with (5.9) into (5.8). This yields
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{(d-1)\left(1-\frac{2}{p}\right)}\, 2^{-r_0 n\left(1-\frac{2}{p}\right)}\, 2^{-\frac{2 r_1 n}{p}} \Big[\frac{2^n \cdot n^{d-1} \cdot n}{m}\Big]^{\frac{1}{p}} \asymp 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{5.10}
\]
Finally, by Lemma 4.4, (ii), Remark 4.5 and (4.4), we see that
\[
\big\|S_{Q_{n+b}} - S_{Q_{n-b}}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^r_{p,p}(\mathbb{T}^d)\big\| \lesssim n^{\frac{d-1}{p}}.
\]
Using $A^{\Delta}_{Q_n} = A^{\Delta}_{Q_n} \circ (S_{Q_{n+b}} - S_{Q_{n-b}})$ together with (K2), (S2) gives
\[
s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{\frac{d-1}{p}}\, s_m\big(A^{\Delta}_{Q_n}\colon B^r_{p,p}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\Big(\frac{2^n}{m}\Big)^{1/p} n^{(d-1)+\frac{1}{p}},
\]
where we used (5.10) in the last step. $\square$

6 Embeddings into $L_\infty(\mathbb{T}^d)$

Let us present here our main results for embeddings of Sobolev and Hölder-Nikolskii spaces with small mixed smoothness into $L_\infty(\mathbb{T}^d)$.

Theorem 6.1. Let $2 < p < \infty$ and $1/p < r < 1/2$. Then, for $s_m \in \{d_m, e_m, \sigma_m\}$ we have
\[
s_m\big(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]

Proof. We decompose the identity operator
\[
\mathrm{I} = \sum_{n=0}^{\infty} S^{\Delta}_{Q_n},
\]
where the $S^{\Delta}_{Q_n}$ are the operators defined above. Using (K3), (S3) we have that
\[
s_m(\mathrm{I}) \le \sum_{n=0}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big) \quad \text{with} \quad m = \sum_{n=0}^{\infty} m_n.
\]
We decompose the sum into three parts,
\[
s_m(\mathrm{I}) \le \sum_{n=0}^{n_1} s_{m_n}\big(S^{\Delta}_{Q_n}\big) + \sum_{n=n_1}^{n_2} s_{m_n}\big(S^{\Delta}_{Q_n}\big) + \sum_{n=n_2}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big), \tag{6.1}
\]
for suitable numbers $n_1 \le n_2$, both of order $\log m$, specified below. Let us consider the first sum in (6.1). The following argument only works for $s_m \in \{d_m, \sigma_m\}$, since a counterpart of (K4) or (S4) is not available for entropy numbers. We will indicate the necessary modification for $s_m = e_m$ below. Let $n_1$ be the largest number such that
\[
\sum_{n=0}^{n_1} \mathrm{rank}\big(S^{\Delta}_{Q_n}\big) \le m
\]
and put $m_n := \mathrm{rank}(S^{\Delta}_{Q_n}) + 1$ for $n \le n_1$. Due to the properties (K4) and (S4) in Lemmas 2.2 and 2.5, the first sum disappears. As for the second sum, we choose
\[
m_n := \big\lfloor 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1^{d-1} \big\rfloor, \quad n_1 \le n \le n_2,
\]
and note that $n_1$ satisfies
\[
2^{n_1} n_1^{d-1} \asymp m. \tag{6.2}
\]
Clearly,
\[
\sum_{n=n_1}^{n_2} 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1^{d-1} \asymp 2^{n_1} n_1^{d-1} \asymp m.
\]
Here, $\kappa$ is chosen such that $2r < \kappa < 1$. Let us decompose $S^{\Delta}_{Q_n} = S^{\Delta}_{Q_n} \circ \mathrm{I}$ with $\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to W^r_2(\mathbb{T}^d)$ and get, by Theorem 5.2, (ii) with $p = 2$,
\[
s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to W^r_2(\mathbb{T}^d)\big\| \cdot s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_2(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)
\]
\[
\lesssim 2^{-rn}\big(2^{(n-n_1)(1+\kappa)}\, n_1^{-(d-1)}\big)^{1/2}\, n^{\frac{d-1}{2}+\frac{1}{2}}, \quad n_1 \le n \le n_2.
\]
Summing over $n$ gives
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn_1}\, n_1^{(d-1)(1-2r)+r}. \tag{6.3}
\]
Using the fact that $2^{n_1} n_1^{d-1} \asymp m$, we have
\[
(6.3) \lesssim m^{-r}\, n_1^{(d-1)r + (d-1)(1-2r) + r} \asymp m^{-r}\, n_1^{(d-1)(1-r)+r}. \tag{6.4}
\]
Now we care for the third sum and choose $m_n := \lfloor m \cdot 2^{-\zeta(n-n_2)} \rfloor$, where $\zeta$ is chosen such that $\zeta/p < r - 1/p$. Clearly, $\sum_{n=n_2}^{\infty} m_n \asymp m$. By the results from Theorem 5.2 we obtain
\[
s_{m_n}\big(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \Big[\frac{2^n}{m \cdot 2^{-\zeta(n-n_2)}}\Big]^{1/p}\, 2^{-rn}\, n^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}.
\]
Summing over $n$ in the range $n = n_2, n_2+1, \ldots$ gives
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(S^{\Delta}_{Q_n}\big) \lesssim \Big(\frac{2^{n_2}}{m}\Big)^{1/p}\, 2^{-rn_2}\, n_2^{(d-1)(1-\frac{1}{p})+\frac{1}{p}}. \tag{6.5}
\]
Because of (6.2) and the choice of $n_2$, we have
\[
(6.5) \lesssim 2^{-rn_2}\, n_2^{(d-1)(1-\frac{2}{p})+\frac{1}{p}} \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]
This, combined with (6.4), gives the result of the theorem for $s_m \in \{d_m, \sigma_m\}$.

We finally comment on the estimate of the first sum in (6.1) in the case of entropy numbers. We modify the argument as follows. Instead of choosing $m_n = \mathrm{rank}(S^{\Delta}_{Q_n}) + 1$ we choose
\[
m_n := \mathrm{rank}(S^{\Delta}_{Q_n})\, 2^{(n_1-n)\varepsilon}, \quad n = 0, \ldots, n_1, \tag{6.6}
\]
with $0 < \varepsilon < 1$. This gives
\[
\sum_{n=0}^{n_1} m_n \asymp \mathrm{rank}(S^{\Delta}_{Q_{n_1}}) \asymp 2^{n_1} n_1^{d-1},
\]
where we choose $n_1$ such that $2^{n_1} n_1^{d-1} \asymp m$. By (K2) and Theorem 5.1 we obtain (note that $m_n > \mathrm{rank}(S^{\Delta}_{Q_n})$)
\[
e_{m_n}(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn}\, 2^{-2^{(n_1-n)\varepsilon}}\, n^{1/2}.
\]
Summing over $n = 0, \ldots, n_1$ yields
\[
\sum_{n=0}^{n_1} e_{m_n}(S^{\Delta}_{Q_n}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \lesssim 2^{-rn_1}\, n_1^{1/2} \asymp m^{-r}(\log m)^{r(d-1)+\frac{1}{2}} \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}.
\]
This finishes the proof. $\square$

Theorem 6.2. Let $2 < p \le \infty$ and $1/p < r < 1/2$. Then, for $s_m \in \{d_m, e_m, \sigma_m\}$,
\[
s_m\big(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}.
\]

Proof. This time we decompose the identity using the operators $A^{\Delta}_{Q_n}$. Let us first deal with the case $p < \infty$. Applying again properties (K3) and (S3) we find
\[
s_m\big(\mathrm{I}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \sum_{n=0}^{n_1} s_{m_n}\big(A^{\Delta}_{Q_n}\big) + \sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) + \sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big). \tag{6.7}
\]
We argue analogously to the proof of Theorem 6.1 for the first sum. For the second sum we choose
\[
m_n := \big\lfloor 2^{n_1}\, 2^{-(n-n_1)\kappa}\, n_1 \big\rfloor \tag{6.8}
\]
with $2r < \kappa < 1$ and $n_1$ such that $2^{n_1} n_1 \asymp m$. Hence,
\[
\sum_{n=n_1}^{n_2} m_n \asymp 2^{n_1} n_1 \asymp m. \tag{6.9}
\]
Then we decompose
\[
A^{\Delta}_{Q_n} = S_{Q_{n+b}} \circ A^{\Delta}_{Q_n}. \tag{6.10}
\]
This gives
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim \big\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\big\| \cdot s_{m_n}\big(S_{Q_{n+b}}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big)
\]
\[
\lesssim 2^{-rn}\Big(\frac{2^n}{m_n}\Big)^{1/2}\, n^{d-\frac{1}{2}},
\]
where we used Theorem 5.2, (ii) for estimating $s_{m_n}$.
To estimate $\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\|$ we used Lemma 4.4 together with (4.4). Inserting (6.8) yields
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\big[2^{(n-n_1)(1+\kappa)}\, n_1^{-1}\big]^{1/2}\, n^{d-\frac{1}{2}}.
\]
Summation over $n = n_1, \ldots, n_2$ leads to
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim 2^{-rn_1}\, n_1^{d-1}.
\]
Since $m \asymp 2^{n_1} n_1$, due to (6.9), we get
\[
\sum_{n=n_1}^{n_2} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}. \tag{6.11}
\]
We finally deal with the last sum in (6.7). Indeed, choosing $m_n := \lfloor m \cdot 2^{-\zeta(n-n_2)} \rfloor$ with $r - \frac{1}{p} - \frac{\zeta}{p} > 0$, we have by Theorem 5.2
\[
s_{m_n}\big(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim 2^{-rn}\Big(\frac{2^n}{m_n}\Big)^{1/p}\, n^{(d-1)+\frac{1}{p}} \asymp \Big(\frac{2^n\, 2^{\zeta(n-n_2)}}{m}\Big)^{1/p}\, 2^{-rn}\, n^{(d-1)+\frac{1}{p}}.
\]
Summing up over $n = n_2, n_2+1, \ldots$ yields
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim 2^{-rn_2}\Big(\frac{2^{n_2}}{m}\Big)^{1/p}\, n_2^{(d-1)+\frac{1}{p}}.
\]
By (6.9) we get $2^{n_2}/m \asymp n_2^{-1}$. Hence, we obtain
\[
\sum_{n=n_2}^{\infty} s_{m_n}\big(A^{\Delta}_{Q_n}\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}.
\]
Together with (6.11), this proves the theorem in case $p < \infty$. For the case $p = \infty$ we use the bounded embedding $\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to B^r_{p^*,\infty}(\mathbb{T}^d)$, where $p^* < \infty$ is chosen such that $r > 1/p^*$. This gives
\[
s_m\big(\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \le \big\|\mathrm{I}\colon B^r_{\infty,\infty}(\mathbb{T}^d) \to B^r_{p^*,\infty}(\mathbb{T}^d)\big\| \cdot s_m\big(\mathrm{I}\colon B^r_{p^*,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r},
\]
where we used the result for $p^* < \infty$.

Again, we comment on the necessary modifications in case $s_m = e_m$. Let us consider the first sum in (6.7) again and use (6.10). We choose $m_n$ and $n_1$ as after (6.6). By the counterpart of (K2) for entropy numbers we find
\[
e_{m_n}(A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)) \le \big\|A^{\Delta}_{Q_n}\colon B^r_{p,\infty}(\mathbb{T}^d) \to B^{r_1}_{2,2}(\mathbb{T}^d)\big\|\, e_{m_n}\big(S_{Q_{n+b}}\colon B^{r_1}_{2,2}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim n^{\frac{d-1}{2}}\, 2^{-rn}\, 2^{-2^{(n_1-n)\varepsilon}}\, n^{1/2},
\]
where we applied Theorem 5.1, (ii). Summing over $n = 0, \ldots, n_1$ yields
\[
\sum_{n=0}^{n_1} e_{m_n}(A^{\Delta}_{Q_n}) \lesssim 2^{-rn_1}\, n_1^{\frac{d-1}{2}+\frac{1}{2}} \asymp m^{-r}(\log m)^{\frac{d-1}{2}+r+\frac{1}{2}} \lesssim m^{-r}(\log m)^{d-1+r}.
\]
This concludes the proof. $\square$

For the endpoint situation $r = 1/2$ we obtain an additional $(\log\log m)^{r+1}$ factor in the upper bounds.

Theorem 6.3 (Endpoint cases). Let $s_m \in \{d_m, e_m, \sigma_m\}$.

(i) If $2 < p < \infty$ and $r = 1/2$ then
\[
s_m\big(\mathrm{I}\colon W^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}(\log m)^{(d-1)(1-r)+r}(\log\log m)^{r+1}.
\]

(ii) If $2 < p \le \infty$ and $r = 1/2$ then
\[
s_m\big(\mathrm{I}\colon H^r_p(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}(\log\log m)^{r+1}.
\]

Proof. We use the same decomposition of the identity operator as above. The first and third sums are treated analogously. In the second sum it is not possible to choose a $\kappa$ with $2r < \kappa < 1$. We choose $\kappa = 1$ but pay an extra logarithm in both summations. Rephrasing the final bound in terms of $m$ yields the additional $(\log\log m)^{r+1}$ factor. $\square$

Remark 6.4 ($d = 2$). (i) We would like to emphasize that in Theorem 6.2, when $d = 2$, we actually do not need the middle sum ranging over $[n_1, n_2]$ in (6.7). Hence, the restriction $r \le 1/2$ does not play a role here. This results in
\[
s_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \lesssim m^{-r}\big(\log m\big)^{r+1} \tag{6.12}
\]
for all $r > 1/p$ and $2 \le p \le \infty$. Compared to Theorem 6.3, (ii), we do not have a $\log\log$ term here for $r = 1/2$. In addition, together with Theorem 7.8.4 from [30] (see also [7, Thm. 6.3.4] and the references therein) and Carl's inequality [5], we get the correct order in case $d = 2$ for Kolmogorov and entropy numbers. Namely, for $2 \le p \le \infty$ and $r > 1/p$ it holds
\[
e_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \asymp d_m(\mathrm{I}\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)) \asymp m^{-r}\big(\log m\big)^{r+1}.
\]
The result for entropy numbers is true for $1 \le p \le \infty$, $r > 1/p$, see [30, Thm. 7.8.4]. Note that for the $W^r_p$ classes the correct order of decay of the Kolmogorov and entropy numbers for $d = 2$ is only known in the case of large smoothness $r > 1/2$, see [34, Chapt. 11] and also [10].
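The various logarithmic exponents appearing in this section are easy to misread, so the following small script (our own sanity check, with function names of our choosing, not from the paper) verifies the elementary exponent arithmetic with exact rational numbers: the interpolation bookkeeping behind Theorem 5.2, (i), the domination of the entropy estimate for the first sum by the main term of Theorem 6.1, and the collapse of the exponents at $d = 2$.

```python
from fractions import Fraction as F

def w_exponent(d, r):
    # log-power in Theorem 6.1: (d-1)(1-r) + r
    return (d - 1) * (1 - r) + r

def h_exponent(d, r):
    # log-power in Theorem 6.2: d - 1 + r
    return d - 1 + r

for d in range(2, 8):
    # with theta = 2/p: (d-1)(1 - 2/p) + d/p = (d-1)(1 - 1/p) + 1/p
    for p in (F(5, 2), F(3), F(4), F(10)):
        assert (d - 1) * (1 - 2 / p) + d / p == (d - 1) * (1 - 1 / p) + 1 / p
    for r in (F(1, 10), F(2, 10), F(3, 10), F(4, 10)):  # small smoothness r < 1/2
        # the entropy bound for the first sum, (log m)^{r(d-1)+1/2},
        # is dominated by the main term (log m)^{(d-1)(1-r)+r}
        assert r * (d - 1) + F(1, 2) <= w_exponent(d, r)

# for d = 2 the exponents collapse: (d-1)(1-r)+r = 1 and d-1+r = 1+r
assert w_exponent(2, F(1, 4)) == 1 and h_exponent(2, F(1, 4)) == F(5, 4)
print("exponent identities verified")
```

Exact rationals via `fractions.Fraction` avoid any floating-point ambiguity in these checks.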
(ii) The upper bound in (6.12) also includes the error of best $m$-term trigonometric approximation $\sigma_m$. Together with the lower bounds from [28, Thm. 3.3] we have in case $d = 2$, $1 \le p \le \infty$ and $r > 1/p$
\[
m^{-r}\log m \lesssim \sigma_m\big(I\colon H^r_p(\mathbb{T}^2) \to L_\infty(\mathbb{T}^2)\big) \lesssim m^{-r}\big(\log m\big)^{r+1/2}.
\]

In this section we comment on applications of the above results and add a discussion of possible future research and open problems motivated by our considerations. Let us say right away that we have already made progress on the Outstanding Open Problems in [7], especially Problems 1.3, 1.6 and 1.7. In addition, we discuss consequences for sampling recovery in $L_2$. Furthermore, we comment on the use of finite-dimensional subspaces generated by hyperbolic wavelets as building blocks, and on wavelet type dictionaries for best $m$-term approximation.

Entropy and Kolmogorov numbers. Entropy numbers for mixed smoothness embeddings have been investigated by several authors in the literature, see [7, Chapt. 6]. Among many others, Vybíral [36] investigated the behavior of entropy numbers in $B^s_{p,q}$-spaces, see Remark 4.5, using wavelet building blocks. In addition, the authors in [10] managed to prove a counterpart of the corner result in Theorem 5.2, (i), for $p = 2$, $s_m = d_m$ and $S_{Q_n}$ replaced by the corresponding hyperbolic Haar wavelet projection.

Let us comment on this technique here and how it can be applied for the uniform norm estimates. Technically, instead of trigonometric polynomials one may also use a univariate wavelet system $\{\psi_I = \psi((\cdot - x_I)/|I|) : I \in \mathcal{I},\ |I| \le 1\}$, where $\mathcal{I}$ is the set of dyadic intervals $I$ with midpoints $x_I = 2^{-j}k$, $k, j \in \mathbb{Z}$. We further consider the corresponding multivariate (tensorized) system
\[
\mathcal{D} = \Big\{\psi_I(x) = \prod_{j=1}^{d} \psi_{I_j}(x_j) : I = I_1 \times \dots \times I_d,\ |I_j| \le 1,\ j = 1, \dots, d\Big\} \qquad (7.1)
\]
and define the orthogonal projection onto the hyperbolic layers
\[
\tilde{S}_n f := \sum_{I \subset [0,1]^d,\ |I| = 2^{-n}} \langle f, \psi_I \rangle\, \psi_I.
\]
This operator replaces the above $S_{\Delta Q_n}$.
Then we have the decomposition of the identity operator $I = \sum_{n=0}^{\infty} \tilde{S}_n$. We assume that the wavelet system is sufficiently smooth, compactly supported, and has good decay properties. We also need the finite-dimensional block result
\[
s_m\big(\mathrm{id}\colon \ell^N_p \to \ell^N_\infty\big) \asymp \Big[\frac{\log(eN/m)}{m}\Big]^{1/p}, \qquad 1 \le m \le N, \qquad (7.2)
\]
with $s_m \in \{d_m, e_m\}$. The result for entropy numbers in case $1 < p < \infty$ is folklore, see [7, Thm. 6.1.3] and the references given there. For the corresponding result for Kolmogorov numbers in case $1 \le p < \infty$ see [11, Thm. 1.1], where the sharp dual version in terms of Gelfand numbers is proved. This result in combination with the proof in [36, Thm. 3.19] gives for all $1 \le p \le \infty$, $r > 1/p$
\[
s_m\big(I\colon B^r_{p,\infty} \to L_\infty\big) \le s_m\big(I\colon B^r_{p,\infty} \to B^0_{\infty,1}\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(r+1)}.
\]
It turns out that in case $d = 2$ we recover the small smoothness result in Theorem 6.2 as well as Belinskii's "large smoothness" result in [34], 11.3.5. In case $d = 2$ the above result is sharp, see the discussion in Remark 6.4. In addition, the result gives an indication that the $\log\log m$ term in Theorem 6.3 is probably not needed. In case of small smoothness and $d > 2$ the result is worse than our results in Theorem 6.2 and Theorem 6.3.

Additionally, in [36] the author pointed out some gaps between upper and lower bounds in a certain range of small smoothness. This was the starting point of the recent paper by Mayer and Ullrich [15, Cor. 23, (iii)], where the sharp behavior
\[
e_m\big(I\colon B^r_{p,q} \to L_\infty\big) \asymp m^{-r} \qquad (7.3)
\]
is shown in case $2 < p \le \infty$, $0 < q \le 1/2$, and $1/p < r \le 1/2$ for all dimensions $d$. The proof relies on a refinement of (7.2) for mixed $\ell^n_q(\ell^N_p)$-norms, see [15, Thm. 13]. Roughly speaking, refining the spaces $H^r_p$ by decreasing the third parameter $q$, see Remark 4.5, allows us to get rid of the logarithmic term. A combination of the technique in [15] with the technique used in this paper may allow one to extend the range of parameters for the result (7.3). A corresponding result for Kolmogorov numbers is not known.
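The comparison with Theorem 6.2 claimed above reduces to elementary arithmetic on the log-exponents. As a plain check (with the rate of Theorem 6.2 read as $m^{-r}(\log m)^{d-1+r}$, and not part of the original argument):

```latex
\[
  (d-1)(r+1) - (d-1+r)
  \;=\; (d-1)r + (d-1) - (d-1) - r
  \;=\; (d-2)\, r \;\ge\; 0 ,
\]
```

with equality precisely for $d = 2$: the wavelet-based bound coincides with Theorem 6.2 in two dimensions and loses a factor $(\log m)^{(d-2)r}$ for $d > 2$, which is the sense in which it is "worse" in the small smoothness regime.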
However, a similar phenomenon occurs for the $\sigma_m$ numbers associated to a wavelet type dictionary (see below). Note that the space $B^r_{p,q}$ is a quasi-Banach space.

Wavelet type dictionaries. In the context of function spaces with mixed smoothness, not only best $m$-term trigonometric approximation has been considered. Hyperbolic wavelet type dictionaries $\mathcal{D}$, as defined in (7.1), have also gained substantial interest, see for instance [27] or [7, Sect. 7.2] and the references therein. It turned out that the order of decay of the corresponding error quantities (modify the definition of $(Y_n)_n$ in (5.6) accordingly) is often substantially better than for the trigonometric system, see [7, Sect. 7] and the references therein. In fact, the gain is not only in the logarithmic term but sometimes also in the main rate. This is certainly not the case in our setting. A reasonable question would be: does a wavelet system perform comparably well with respect to the decay of the associated $\sigma_m$ in the $L_\infty$ norm? Note that a fundamental difference between wavelets and the trigonometric system is the lack of a universal $L_\infty$-bound for the $L_2$-normalized wavelet system. As a consequence, some of the techniques used by Belinskii, see for instance [34, 11.2.5], cannot be directly adapted to wavelets. A strong indication that wavelet dictionaries may not perform worse than the trigonometric system is the following observation. In [4, Th. 6.15] it is proved that for $2 < p \le \infty$, $0 < q \le 1/2$, and $1/p < r \le 1/2$ we have
\[
\sigma_m\big(I\colon B^r_{p,q}(\mathbb{T}^d) \to L_\infty(\mathbb{T}^d); \mathcal{D}\big) \lesssim m^{-r}.
\]
This result is sharp for all dimensions $d$ if we use the tensorized Faber-Schauder system as dictionary $\mathcal{D}$.

Sampling recovery. We introduce the notion of sampling numbers of an operator $T\colon F \to G$ between two Banach spaces $F$ and $G$ of functions on $D$. We assume that point evaluations are linear functionals on $F$. This is the case, for instance, if $F$ is continuously embedded into $C(D)$, the space of continuous functions on $D$.
Let us define the $m$-th sampling numbers of an operator $T \in \mathcal{L}(F, G)$ as follows:
\[
\varrho_m(T\colon F \to G) := \inf_{x^1, \dots, x^m \in D}\ \inf_{\varphi\colon \mathbb{C}^m \to G\ \text{linear}}\ \sup_{\|f\|_F \le 1} \big\|Tf - \varphi\big(f(x^1), \dots, f(x^m)\big)\big\|_G.
\]
In many cases the embedding $F \hookrightarrow G$ and the corresponding embedding operator $I\colon F \to G$ is considered. A particular situation is the case $G = L_2(D)$. In this situation it has been proven in [31] that there are two positive absolute constants $b, B > 0$ such that
\[
\varrho_{bm}\big(I\colon F \to L_2(D)\big) \le B\, d_m\big(I\colon F \to L_\infty(D)\big), \qquad m \in \mathbb{N}. \qquad (7.4)
\]
In case $F$ represents a reproducing kernel Hilbert space $H(K)$ embedded into $L_2(D)$ we even know that, see [16],
\[
\varrho_m\big(I\colon H(K) \to L_2\big)^2 \le c_1 \frac{\log(m)}{m} \sum_{k = \lfloor c_2 m \rfloor}^{\infty} d_k\big(I\colon H(K) \to L_2(D)\big)^2, \qquad m \in \mathbb{N},
\]
with (precisely given) absolute constants $c_1, c_2 > 0$. Similar results have recently been established for non-Hilbert function spaces, see [14]. However, in both these settings the square summability of the corresponding Kolmogorov numbers $d_k(I\colon F \to L_2)$ is crucial. We do not have this property in the "small smoothness setting" studied in this paper. Hence, results on Kolmogorov numbers in the uniform norm together with (7.4) serve as a powerful tool to investigate the sampling recovery problem in $L_2$ in the case of small smoothness. From (7.4) together with Theorem 6.2 we obtain in case $1/p < r < 1/2$
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-1+r}. \qquad (7.5)
\]
In addition, the endpoint results from Theorem 6.3 have direct counterparts for sampling numbers. Let us point out that the so far best-known upper bounds in the above situation were obtained by the use of sparse grid (Smolyak) recovery algorithms, see [26], [25], [6, 9], resulting in
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(r+1)} \qquad (7.6)
\]
for all $r > 1/p$. It is obvious that (7.5) improves on (7.6) in case $d > 2$ if $1/p < r < 1/2$. Note also that [8, Thm. 5.1] shows that, when restricting to sparse grid methods, the bound in (7.6) cannot be improved.
Hence, in the case of small smoothness, sparse grid methods cannot be optimal in the above situation when $d > 2$.

As for $I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)$, we obtain by (7.4) and Theorem 6.1 the bound
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{(d-1)(1-r)+r} \qquad (7.7)
\]
if $1/p < r < 1/2$. Clearly, the bound in Theorem 6.3 on the endpoint case also carries over to the sampling numbers. By the results in [25] together with complex interpolation we find the bound
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \lesssim m^{-r}\big(\log m\big)^{d-\varepsilon} \qquad (7.8)
\]
for any $\varepsilon > 0$ in the case of small smoothness by using a sparse grid method. Clearly, (7.7) improves on (7.8). However, here it is not clear whether the analysis for the sparse grid method can be improved or not. A valid lower bound comes from the embedding $B^r_{p,1} \hookrightarrow W^r_p$ since $p > 1$. Hence, [8, Thm. 5.1] shows that any sparse grid method is asymptotically worse than $m^{-r}(\log m)^{(d-1)(r+1/2)}$. This yields in case $1/4 < r < 1/2$ and large enough $d$ that sparse grid methods cannot be optimal, since the bound in (7.7) is better. However, the sampling method behind the bounds in this paper is highly non-constructive, whereas the sparse grid methods are constructive and can be implemented. From this point of view, our results show that there might exist further constructive methods which improve on the sparse grid methods regarding the asymptotic error decay.

Let us finally mention that the correct order of the quantities $\varrho_m$ in the above situation is still unknown; we improved on the upper bounds. A good source for lower bounds are the error quantities with respect to numerical integration, as observed by Novak [17]. This, in connection with the lower bounds in [30] and [8], gives
\[
\varrho_m\big(I\colon W^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \gtrsim m^{-r}\big(\log m\big)^{(d-1)/2} \qquad (7.9)
\]
and
\[
\varrho_m\big(I\colon H^r_p(\mathbb{T}^d) \to L_2(\mathbb{T}^d)\big) \gtrsim m^{-r}\big(\log m\big)^{d-1}. \qquad (7.10)
\]
Comparing (7.10) to (7.5) we observe a difference in the log-exponent of only $r$, which therefore does not grow with $d$.
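The claim above that sparse grid methods cannot be optimal for $1/4 < r < 1/2$ and large $d$ follows from comparing exponents. As a plain check (with (7.7) read as $m^{-r}(\log m)^{(d-1)(1-r)+r}$ and the sparse grid lower bound as $m^{-r}(\log m)^{(d-1)(r+1/2)}$, as in the text):

```latex
\[
  (d-1)\Big(r+\tfrac12\Big) - \Big[(d-1)(1-r)+r\Big]
  \;=\; (d-1)\Big(2r-\tfrac12\Big) - r ,
\]
```

which is positive for all sufficiently large $d$ exactly when $r > 1/4$. In that regime the upper bound (7.7), achieved by a non-constructive method, eventually falls below the lower bound valid for every sparse grid method.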
Comparing (7.9) to (7.7), the difference is growing in $d$ in case $r < 1/2$. If $r = 1/2$ we are close to the lower bound coming from numerical integration.

Acknowledgment. The first author was supported by the Russian Federation Government Grant No. 14.W03.31.0031.

References

[1] E. Belinskii. Decomposition theorems and approximation by a "floating" system of exponentials. Transactions of the American Mathematical Society, 350:43–53, 1998.

[2] E. S. Belinsky. Estimates of entropy numbers and Gaussian measures for classes of functions with bounded mixed derivative. J. Approx. Theory, 93(1):114–127, 1998.

[3] J. Bergh and J. Löfström. Interpolation Spaces. An Introduction. Springer-Verlag, Berlin-New York, 1976. Grundlehren der Mathematischen Wissenschaften, No. 223.

[4] G. Byrenheid. Sparse representation of multivariate functions based on discrete point evaluations. Dissertation, Institut für Numerische Simulation, Universität Bonn, 2018.

[5] B. Carl. Entropy numbers, s-numbers, and eigenvalue problems. J. Funct. Analysis, 41:290–306, 1981.

[6] D. Dũng. B-spline quasi-interpolant representations and sampling recovery of functions with mixed smoothness. J. Complexity, 27(6):541–567, 2011.

[7] D. Dũng, V. N. Temlyakov, and T. Ullrich. Hyperbolic Cross Approximation. Advanced Courses in Mathematics, CRM Barcelona. Birkhäuser/Springer, 2019.

[8] D. Dũng and T. Ullrich. Lower bounds for the integration error for multivariate functions with mixed smoothness and optimal Fibonacci cubature for functions on the square. Math. Nachr., 288(7):743–762, 2015.

[9] D. Dũng. Sampling and cubature on sparse grids based on a B-spline quasi-interpolation. Found. Comput. Math., 16(5):1193–1240, 2016.

[10] T. Dunker, T. Kühn, M. Lifshits, and W. Linde. Metric entropy of integration operators and small ball probabilities for the Brownian sheet. Journ. Approx. Theory, 101:63–77, 1999.

[11] S. Foucart, A. Pajor, H. Rauhut, and T. Ullrich. The Gelfand widths of $\ell_p$-balls for $0 < p \le 1$. J.
Complexity, 26(6):629–640, 2010.

[12] A. Hinrichs, L. Markhasin, J. Oettershagen, and T. Ullrich. Optimal quasi-Monte Carlo rules on order 2 digital nets for the numerical integration of multivariate periodic functions. Numer. Math., 134(1):163–196, 2016.

[13] D. Krieg and M. Ullrich. Function values are enough for $L_2$-approximation. arXiv:1905.02516v3, 2019.

[14] D. Krieg and M. Ullrich. Function values are enough for $L_2$-approximation: Part II. arXiv:2011.01779, 2020.

[15] S. Mayer and T. Ullrich. Entropy numbers of finite dimensional mixed-norm balls and function space embeddings with small mixed smoothness. Constr. Approx., to appear.

[16] N. Nagel, M. Schäfer, and T. Ullrich. A new upper bound for sampling numbers. arXiv:2010.00327, 2020.

[17] E. Novak. Quadrature and widths. J. Approx. Theory, 47:195–202, 1986.

[18] A. Pajor and N. Tomczak-Jaegermann. Subspaces of small codimension of finite-dimensional Banach spaces. Proc. Amer. Math. Soc., 97(4):637–642, 1986.

[19] A. Pietsch. Operator Ideals. North-Holland, 1980.

[20] A. Pietsch. Approximation spaces. J. Approx. Theory, 32(2):115–134, 1981.

[21] A. Pietsch. Eigenvalues and s-Numbers, volume 13 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1987.

[22] A. S. Romanyuk. Best m-term trigonometric approximations of Besov classes of periodic functions of several variables. Izvestia RAN, Ser. Mat., 67:61–100, 2003.

[23] A. S. Romanyuk. Entropy numbers and widths for the Nikol'skij-Besov classes of functions of many variables in the space $L_\infty$. Analysis Math., 45(1):133–151, 2019.

[24] A. Seeger and W. Trebels. Low regularity classes and entropy numbers. Archiv der Mathematik, 92:147–157.

[25] W. Sickel and T. Ullrich. The Smolyak algorithm, sampling on sparse grids and function spaces of dominating mixed smoothness. East J. Approx., 13(4):387–425, 2007.

[26] V. N. Temlyakov. Approximation of Periodic Functions. Computational Mathematics and Analysis Series.
Nova Science Publishers Inc., Commack, NY, 1993.

[27] V. N. Temlyakov. Greedy algorithms with regard to multivariate systems with special structure. Constr. Approx., 16:399–425, 2000.

[28] V. N. Temlyakov. Constructive sparse trigonometric approximation and other problems for functions with mixed smoothness. Matem. Sb., 206:131–160, 2015.

[29] V. N. Temlyakov. Constructive sparse trigonometric approximation for functions with small mixed smoothness. Constr. Approx., 45:467–495, 2017.

[30] V. N. Temlyakov. Multivariate Approximation, volume 32 of Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2018.

[31] V. N. Temlyakov. On optimal recovery in $L_2$. arXiv:2010.03103, 2020.

[32] V. N. Temlyakov. Sampling discretization of integral norms of the hyperbolic cross polynomials. arXiv:2005.05967v1, 2020.

[33] V. N. Temlyakov and T. Ullrich. Bounds on Kolmogorov widths of classes with small mixed smoothness. arXiv:2012.09925v1, 2020.

[34] R. M. Trigub and E. S. Bellinsky. Fourier Analysis and Approximation of Functions. Kluwer Academic Publishers, Dordrecht, 2004. [Belinsky on front and back cover].

[35] M. Ullrich and T. Ullrich. The role of Frolov's cubature formula for functions with bounded mixed derivative. SIAM J. Numer. Anal., 54(2):969–993, 2016.

[36] J. Vybíral. Function spaces with dominating mixed smoothness. Dissertationes Math., 436:1–73, 2006.