[PDF] Sharp moment estimates for martingales with uniformly bounded square functions

Abstract

We provide sharp bounds for the exponential moments and p-moments, 1\leqslant p \leqslant 2, of the terminate distribution of a martingale whose square function is uniformly bounded by one. We introduce a Bellman function for the corresponding extremal problem and reduce it to the already known Bellman function on \mathrm{BMO}([0,1]). In the case of tail estimates, a similar reduction does not work exactly, so we come up with a fine supersolution that leads to sharp tail estimates.

Full PDF

SSharp moment estimates for martingaleswith uniformly bounded square functions ∗Dmitriy Stolyarov Vasily Vasyunin Pavel Zatitskiy Ilya ZlotnikovFebruary 24, 2021

Abstract

We provide sharp bounds for the exponential moments and p -moments, (cid:54) p (cid:54) , of the terminatedistribution of a martingale whose square function is uniformly bounded by one. We introduce aBellman function for the corresponding extremal problem and reduce it to the already known Bellmanfunction on BMO([0 , . In the case of tail estimates, a similar reduction does not work exactly, sowe come up with a ﬁne supersolution that leads to sharp tail estimates. Let ( X, Σ , P ) be an atomless complete probability space equipped with a discrete time ﬁltration F = {F n } n (cid:62) . Let F = { ∅ , X } and let F generate Σ . Assume for simplicity that each σ -algebra F n is ﬁnite,i. e., consists of a ﬁnite number of sets. Consider a real-valued martingale ϕ = { ϕ n } n adapted to F anddeﬁne its square function Sϕ by the formula Sϕ = (cid:16) ∞ (cid:88) n =0 ( ϕ n +1 − ϕ n ) (cid:17) . (1.1)In what follows we will always talk about real martingales adapted to ﬁltrations as above unless otherwisespeciﬁed. We call a martingale ϕ simple if ϕ n +1 = ϕ n for n suﬃciently large. In this paper, we make anattempt to describe the distribution of ϕ ∞ (which is the limit value of the martingale, ϕ ∞ = lim n →∞ ϕ n )under the assumption that Sϕ is uniformly bounded. From general theory (see (1.5) below), ϕ is a BMO -martingale provided Sϕ ∈ L ∞ . Thus, by the John–Nirenberg inequality, ϕ ∞ is a subexponential randomvariable. Namely, there exist positive constants c and c such that P ( ϕ ∞ − ϕ (cid:62) t ) (cid:54) c e − c t (cid:107) Sϕ (cid:107) L ∞ , t > , (1.2)for any martingale ϕ . We focus on sharp estimates of this kind. In particular, we aim to compute thebest possible values of c and c (see Corollary 1.14 below). According to the knowledge of the authors,such sharp estimates are not known. ∗ Supported by the Russian Science Foundation Grant 19-71-10023. a r X i v : . [ m a t h . P R ] F e b n the case where F is a dyadic ﬁltration (by that we mean that any atom in F n is split into two atomsof equal mass in F n +1 ), a much better estimate exists. The famous Chang–Wilson–Wolﬀ inequality (seeTheorem 3.1 in [2]) says that the distribution of ϕ is subgaussian: P ( ϕ ∞ − ϕ (cid:62) t ) (cid:54) e − t (cid:107) Sϕ (cid:107) L ∞ , t > . (1.3)In a recent paper [8], Ivanisvili and Treil generalized this result to the case where the ﬁltration F hasbounded distortion α , which means that each atom in F n has at least α mass of its parental atom. Inthis case, P ( ϕ ∞ − ϕ (cid:62) t ) (cid:54) e − αt (cid:107) Sϕ (cid:107) L ∞ , t > . (1.4)This result hints us that the distribution of ϕ ∞ may no longer be subgaussian if we do not make as-sumptions about regularity of the ﬁltration. As we will see later, this is indeed the case (for example, theinequality (1.2) is sharp for certain choice of c and c , see Corollary 1.14 below).Since we focus on sharp estimates, it is natural to consider not only tail estimates, but also inequalitiesfor the exponential moments and p -moments. In particular, one may wonder what are the largest possiblevalues of the quantities E e λϕ ∞ , or E | ϕ ∞ | p under the assumption (cid:107) Sϕ (cid:107) L ∞ (cid:54) . We will partially answerthis question, see Corollaries 1.11 and 1.12 below. One may go further, pick an arbitrary function f , andask about the largest possible value of E f ( ϕ ∞ ) under the same assumption. We will study this problemfor the cases when f (cid:48)(cid:48)(cid:48) either does not change sign or changes sign from + to − once.Some of the results of the present paper were announced in the short report [20]. We also providedsome proofs there. The present paper contains the remaining proofs. In a sense, [20] contains thereasoning that do not depend on the geometry of speciﬁc Bellman functions. They are much shorter thanthe treatment of Bellman functions we present here.For the reader who is not interested in the Bellman function technique, Corollaries 1.11, 1.12, and 1.14may be considered as the main results of the paper. Lemma 1.8 and Theorems 1.10 and 1.13 are moreimportant from the Bellman function point of view. BMO functions

The space

BMO is pivotal for our considerations. There are several equivalent norms in this space. Sincewe are dealing with sharp estimates, the choice of a speciﬁc norm is crucial. The space

BMO m called thespace of martingales of bounded mean oscillation is deﬁned as follows (see, e. g., Chapter II in [9]) (cid:107) ϕ (cid:107) m = sup (cid:110)(cid:13)(cid:13) E (cid:0) ( ϕ ∞ − ϕ τ ) | F τ (cid:1)(cid:13)(cid:13) L ∞ (cid:12)(cid:12)(cid:12) τ is a stopping time (cid:111) . A simple orthogonality argument E (cid:0) ( ϕ ∞ − ϕ τ ) | F τ (cid:1) = E (cid:16)(cid:16) (cid:88) n (cid:62) τ ( ϕ n +1 − ϕ n ) (cid:17) (cid:12)(cid:12)(cid:12) F τ (cid:17) = E (cid:16) (cid:88) n (cid:62) τ ( ϕ n +1 − ϕ n ) (cid:12)(cid:12)(cid:12) F τ (cid:17) (cid:54) E (cid:0) ( Sϕ ) | F τ (cid:1) , (1.5)leads to the inequality (cid:107) ϕ (cid:107) BMO m (cid:54) (cid:107) Sϕ (cid:107) L ∞ .The space BMO m has its real analysis counterpart (see, e. g., Chapter IV in [19] for more information).The BMO space on the unit interval is deﬁned with the help of the seminorm (cid:107) ψ (cid:107) , = sup (cid:110) | J | (cid:90) J (cid:16) ψ ( x ) − | J | (cid:90) J ψ (cid:17) dx (cid:12)(cid:12)(cid:12) J is a subinterval of [0 , (cid:111) . (1.6)2e note that this deﬁnition is not the most common in real analysis. A version based on the L norminstead of L is more widespread (the two BMO seminorms are equivalent). The L based version isclosely related to the martingale BMO m space. We denote the non-increasing rearrangement (the inversefunction to the distribution function of ξ ) of a random variable ξ by ξ ∗ : ξ ∗ ( t ) = inf { α | P ( ξ > α ) (cid:54) t } , t ∈ [0 , . Theorem 1.1.

The inequality (cid:107) ϕ ∗∞ (cid:107) BMO([0 , (cid:54) (cid:107) Sϕ (cid:107) L ∞ holds for any martingale ϕ and is sharp. Here and in what follows the notation ϕ ∗∞ means the monotonic rearrangement of ϕ ∞ . This theoremwas proved in [20]. Though Theorem 1.1 says there is a certain relationship between martingales ϕ whosesquare function is bounded and functions ψ on the unit interval that belong to the BMO space, we warnthe reader against identiﬁcation of these classes of objects, which have diﬀerent nature and origin.

Remark 1.2.

The estimate (cid:107) ϕ ∗∞ (cid:107) BMO([0 , (cid:54) (cid:107) ϕ (cid:107) BMO m is not true in general for discrete time ﬁltrations.To see that , consider the case where F is dyadic. The inequality (cid:107) ϕ ∗∞ (cid:107) BMO([0 , (cid:54) √ (cid:107) ϕ (cid:107) BMO m , F is dyadic , is sharp ( and true ), see Corollary in [21] . What is more , for any C > there exist a discrete ﬁltration F and a martingale ϕ adapted to it such that the inequality (cid:107) ϕ ∗∞ (cid:107) BMO([0 , (cid:54) C (cid:107) ϕ (cid:107) BMO m fails. Theorem 1.1 leads to many nice inequalities. In particular, it says that sup (cid:110) E f ( ϕ ∞ ) (cid:12)(cid:12)(cid:12) ϕ = x, (cid:107) Sϕ (cid:107) L ∞ (cid:54) (cid:111) (cid:54) sup (cid:110) (cid:90) f ( ψ ) (cid:12)(cid:12)(cid:12) (cid:90) ψ = x, (cid:107) ψ (cid:107) BMO([0 , (cid:54) (cid:111) (1.7)for any non-negative function f : R → R . It is reasonable to ﬁx the expectation of our martingale since ϕ does not aﬀect the square function, but has strong inﬂuence on the quantity E f ( ϕ ∞ ) . There are twosurprising facts about formula (1.7). The ﬁrst one is that the inequality turns into equality quite often(in particular, for the important cases f ( t ) = e λt and f ( t ) = | t | p , (cid:54) p (cid:54) ). The second fact is that thesupremum on the right hand side may be computed exactly for arbitrary f , which satisﬁes some mildregularity assumptions. We brieﬂy describe these results.We ﬁx the second moment as well and write the deﬁnition of the Bellman function b ε : ω ε → R , b ε ( x, y ) = sup (cid:26) (cid:90) f ( ψ ) (cid:12)(cid:12)(cid:12)(cid:12) (cid:90) ψ = x, (cid:90) ψ = y, (cid:107) ψ (cid:107) BMO([0 , (cid:54) ε (cid:27) ,ω ε = { ( x, y ) ∈ R | x (cid:54) y (cid:54) x + ε } . (1.8)This Bellman function satisﬁes the boundary condition b ε ( x, x ) = f ( x ) . It appears that one may computethe function b ε for arbitrary f . The answer (algorithm) is quite complicated. We refer the reader to thepaper [7] for treatment of the general case. The paper [6] considers less general case (the authors makeadditional assumptions on the structure of f ), however, provides a much shorter presentation. The shortreport [5] outlines the results. In fact, the particular cases that are important for applications werecomputed in earlier papers [17], [18], [23], and [25].The main reason why b ε is a tractable object is that it can be described geometrically, namely, interms of locally concave functions. By a locally concave function on a domain we mean a function whoserestriction to any segment lying in the domain entirely, is concave.3 heorem 1.3 (Main theorem and Corollary 5.4 in [22]) . Let f be bounded from below. The function b ε can be described as the pointwise minimal function among all locally concave functions G : ω ε → R thatsatisfy the boundary condition G ( x, x ) = f ( x ) . The fact behind Theorem 1.3 is that the minimal locally concave function has a good probabilisticrepresentation, see Theorem 2.21 in [22]. We cite a deﬁnition introduced in [22] (in fact, [22] deals witha more general situation; in the case of

BMO and the parabolic strip ω ε the continuous time version ofthe deﬁnition below had appeared in the literature before [22], see, e. g., [9] and [16]; as the present papershows, the discrete time deﬁnition is more convenient in some contexts). Deﬁnition 1.4. An R -valued martingale M = { M n } n adapted to {F n } n is called an ω ε -martingale ifit satisﬁes the conditions listed below. F = { ∅ , X } . There exists a random variable M ∞ with values in { ( t, t ) | t ∈ R } such that E | M ∞ | < ∞ and M n = E ( M ∞ | F n ) . For every n ∈ Z + and every atom σ in F n conv { M n +1 ( z ) } z ∈ σ ⊂ ω ε . The third requirement should be understood properly: we deﬁne M n = E ( M ∞ | F n ) everywhere andthus, consider the convex hull of a ﬁnite number of points. By conv A we denote the convex hull of aset A .The following lemma plays a crucial role in the proof of Theorem 1.1. Lemma 1.5 (Theorem 3.4 in [22]) . Let M be an ω ε martingale. The random variable M ∞ ( the ﬁrstcoordinate of the R -valued random variable M ∞ ) satisﬁes the inequality (cid:107) ( M ∞ ) ∗ (cid:107) BMO([0 , (cid:54) ε. (1.9) The function f : R → R will be subject to some requirements. We will always assume f is measurableand non-negative. Of course, one may use a slightly weaker assumption that f is uniformly boundedfrom below (or replacing f with − f , that f is bounded from above). Sometimes we will need a regularityassumption. Deﬁnition 1.6.

We say that f satisﬁes the standard requirements if it is a non-negative twicecontinuously diﬀerentiable function , its third distributional derivative is a signed measure , which changessign only ﬁnite number of times , and (cid:90) R e − | t | ε | df (cid:48)(cid:48) ( t ) | < ∞ for some ε > . (1.10)These requirements for f are slightly stronger than in [7] (the authors of that paper did not requirethe positivity of f ). Note that the choices f ( t ) = | t | p , p ∈ [1 , , and f ( t ) = χ [0 , ∞ ) ( t ) do not satisfy thestandard requirements (the ﬁrst one is quite close, while the second function is very far from being C -smooth). 4e introduce the main character: B ( x, y, z ) = sup (cid:110) E H f ( ϕ ∞ , ( Sϕ ) + z ) (cid:12)(cid:12)(cid:12) E ϕ ∞ = x, E ϕ ∞ = y (cid:111) , z (cid:62) , (1.11)where H f ( s, t ) = (cid:40) −∞ , t / ∈ [0 , f ( s ) , t ∈ [0 , , and the supremum is taken over all simple martingales ϕ adapted to a discrete time ﬁltration. We consideronly simple martingales here to avoid technicalities. Note that it suﬃces to work with simple martingalesto obtain sharp constants in the inequalities (1.15), (1.16), and (1.22) below.This Bellman function will help us to ﬁnd sharp constants in several inequalities. The reader familiarwith the Burkholder method (see the original papers [1] and [10] or the books [12], [24]) may say thatthe y -coordinate is redundant. However, we prefer to keep it, because it “tracks” the Hilbert spaceidentities that link the square function to the martingale itself. Remark 1.7.

For any x, y ﬁxed , the function z (cid:55)→ B ( x, y, z ) is non-increasing. This follows fromformula (1.11), more speciﬁcally , from an equivalent formula B ( x, y, z ) = sup (cid:110) E f ( ϕ ∞ ) (cid:12)(cid:12)(cid:12) E ϕ ∞ = x, E ϕ ∞ = y, Sϕ (cid:54) (cid:112) − z a. s. (cid:111) . (1.12)As we will prove a little bit later (see Lemma 2.1 below), the natural domain for B is Ω = (cid:110) ( x, y, z ) ∈ R (cid:12)(cid:12)(cid:12) x (cid:54) y (cid:54) − z + x , z ∈ [0 , (cid:111) . (1.13)We start with the Bellman function counterpart of Theorem 1.1. Recall the function b ε deﬁnedin (1.8). Lemma 1.8.

Let f be a non-negative function. The inequality B ( x, y, z ) (cid:54) b √ − z ( x, y ) is true for anytriple ( x, y, z ) ∈ Ω . This lemma implies (1.7) (plug z = 0 and optimize with respect to y ). It has already appeared in [20].We present its proof in Section 2 for completeness (in fact, the arguments are quite elementary here). Corollary 1.9.

Let a measurable function h : R → R + satisfy (cid:88) k ∈ Z e −| k | sup x ∈ [ k − ,k +2] h ( x ) < ∞ . (1.14) Then the quantity E h ( ϕ ∞ ) is ﬁnite for any martingale ϕ such that Sϕ (cid:54) almost surely. The bound isuniform with respect to ϕ as long as ϕ is ﬁxed. This corollary will be proved in Section 2. It is sharp in certain sense. For example, one may constructa C -smooth function h such that h (cid:48)(cid:48)(cid:48) (cid:62) and h ( x ) = e x /x when x is suﬃciently large. Theorem 1.10below then says B ( x, y, z ) = b √ − z ( x, y ) if both these functions are constructed for f := h . However, onemay see that with this function h the Bellman function b is inﬁnite since the integral (cid:82) h ( ψ ) divergesif one plugs ψ ( t ) = − log t into (1.8) (the function log t has BMO -norm equal to one).As we have said, the inequality in Lemma 1.8 often turns into equality.

Theorem 1.10.

Assume f satisﬁes the standard requirements and either f (cid:48)(cid:48) is monotone or f (cid:48)(cid:48) increasesup to some point and then decreases. Then , B ( x, y, z ) = b √ − z ( x, y ) for all ( x, y, z ) ∈ Ω . D εj Theorem 1.10 was also stated in [20], but was not proved. Its proof is presented in Subsection 3.3.Note that particular choices f ( t ) = e λt and f ( t ) = | t | p , (cid:54) p (cid:54) (this function does not satisfy thestandard requirements, however, we will be able to cope with this diﬃculty), ﬁt the assumptions ofTheorem 1.10. The corresponding functions b ε were computed in [17] and [18] respectively. These resultswill lead us to the corollaries below. Corollary 1.11.

The optimal constant c p in the inequality (cid:107) ϕ ∞ − ϕ (cid:107) L p (cid:54) c p (cid:107) Sϕ (cid:107) L ∞ (1.15) equals when (cid:54) p (cid:54) . Corollary 1.12.

The optimal constant C ( ε ) in the inequality E e ϕ − ϕ (cid:54) C ( ε ) , Sϕ (cid:54) ε. (1.16) equals e − ε − ε when ε < . Sometimes the inequality in Lemma 1.8 is strict on a subdomain of Ω . We present the followingexample corresponding to the choice f ( t ) = χ [0 , ∞ ) ( t ) . Note that this function does not fulﬁll the standardrequirements (however, this is not the reason for failure of the equality between the Bellman functions;we consider this example since it leads to sharp constants in the inequality (1.2)). In this case, thefunction b ε was computed in [23]. The domain ω ε is split into four parts (see Figure 1) D ε = { ( x, y ) ∈ ω ε | y (cid:62) εx, x (cid:62) ε } ∪ { ( x, y ) ∈ ω ε | y (cid:54) εx } ; D ε = { ( x, y ) ∈ ω ε | | x | (cid:54) ε, y (cid:62) ε | x |} ; D ε = { ( x, y ) ∈ ω ε | y (cid:54) − εx } ; D ε = { ( x, y ) ∈ ω ε | x (cid:54) − ε, y (cid:62) − εx } , (1.17)and the function is deﬁned by the formula: b ε ( x, y ) =  , ( x, y ) ∈ D ε − y − εx ε , ( x, y ) ∈ D ε − x y , ( x, y ) ∈ D ε e (cid:18) − (cid:113) − y − x ε (cid:19) e xε + (cid:113) − y − x ε , ( x, y ) ∈ D ε . (1.18)6n Section 4, the function B will be computed on the upper boundary of Ω , namely, we will identifythe restriction of B to Ω R = Ω ∩ { z = (cid:112) − y + x } . (1.19)The set Ω R naturally splits into four parts, each of which is projected onto the corresponding domainin (1.17). Theorem 1.13.

Let f ( t ) = χ [0 , ∞ ) ( t ) , t ∈ R . The equality B ( x, y, (cid:112) − y + x ) = b √ y − x ( x, y ) (1.20) holds true whenever ( x, y ) ∈ D √ y − x j and j = 1 , , . If ( x, y ) ∈ D √ y − x , we have B ( x, y, (cid:112) − y + x ) = 1 − (cid:112) − ρ − ρ √ e − arcsin ρ − π , where ρ = ρ ( x, y ) = x (cid:112) y − x ) . (1.21) Corollary 1.14.

The best possible constant c in (1.2) equals . The optimal constant c in the inequality P ( ϕ ∞ − ϕ (cid:62) λ ) (cid:54) ce − λ (cid:107) Sϕ (cid:107) L ∞ , λ > , (1.22) equals e . Note that the sharp constant c in the weak type form of the John–Nirenberg inequality (cid:12)(cid:12)(cid:12)(cid:8) t ∈ [0 , (cid:12)(cid:12) ψ ( t ) − (cid:90) ψ > λ (cid:9)(cid:12)(cid:12)(cid:12) (cid:54) ce − λ (cid:107) ψ (cid:107) BMO[0 , , λ > , (1.23)also equals e , as it was shown in [23]. Even though for this choice of f the inequality in Lemma 1.8 isstrict at some points of Ω , the sharp constants in the tail estimates (1.22) and (1.23) for the consideredproblems coincide.Though the square function is a very common martingale operator, there are less sharp inequalitiesknown about it than about the martingale transform or the maximal function. Even the expression forits L p → L p norm is known only in the range p ∈ (1 , (and in fact, is due to Burkholder in [1], seeSection . in [12]). The sharp constant in the weak type (1 , inequality was found by Cox in [3] (seealso [15] for another approach and [11] and [4] for related results) while other weak type constants areunknown. Sharp inequalities for various special classes of martingales (conditionally symmetric martin-gales, continuous path martingales, etc.) may be found in [14] and [26]. We also mention the article [13],where questions similar to those considered in the present paper are studied in the dyadic setting (namely,that paper studies the distribution of Sϕ under conditions ϕ ∈ L ∞ and ϕ ∈ BMO in the dyadic setting).The reader may ﬁnd many interesting sharp inequalities involving the square function in the 8th chapterof [12].In Section 2 we study simple properties of the function B and prove Lemma 1.8, Corollary 1.9, andTheorem 1.1. Section 3 contains the proofs of Theorem 1.10, Corollary 1.11, and Corollary 1.12. Section 4is devoted to the proofs Theorem 1.13 and Corollary 1.14. The lemma we present below is a standard part of the Bellman function method. One may ﬁnd a similarstatement in [12], see Chapter 8, Theorem 8.1. We provide a proof for two reasons: completeness andslight diﬀerence between the traditional notation and ours. The subscript R in the formula below designates the “roof” of the domain Ω . emma 2.1. Let f (cid:62) . (i) The function B is non-negative on the domain Ω deﬁned by (1.13) and equals −∞ outside it. (ii) The function B satisﬁes the boundary condition B ( x, x , z ) = f ( x ) when z ∈ [0 , . (iii) The function B satisﬁes the main inequality B ( x, y, z ) (cid:62) N (cid:88) j =1 α j B ( x j , y j , z j ) , whenever N (cid:88) j =1 α j = 1 , α j ∈ [0 , N (cid:88) j =1 α j x j = x ; N (cid:88) j =1 α j y j = y ; ∀ j z j = z + ( x j − x ) ; ( x j , y j , z j ) ∈ Ω , ( x, y, z ) ∈ Ω . (2.1)(iv) Let G : Ω → R be a function that satisﬁes the same boundary condition as B and also the maininequality , that is G ( x, y, z ) (cid:62) N (cid:88) j =1 α j G ( x j , y j , z j ) (2.2) whenever the points satisfy the splitting rules (2.1) . Then , B (cid:54) G pointwise.Proof of (i) . Due to the assumption f (cid:62) and (1.12), the assertion that B ( x, y, z ) is non-negative meansthat there exists at least one martingale ϕ such that ϕ = x, E ϕ ∞ = y, and ( Sϕ ) + z (cid:54) almost surely . (2.3)We ﬁrst prove that the existence of such a martingale ϕ implies ( x, y, z ) ∈ Ω . The necessity of x (cid:54) y follows from the Cauchy–Schwarz inequality. The necessity of y (cid:54) − z + x is a consequence of the L orthogonality: y − x = E ϕ ∞ − ( E ϕ ∞ ) = E ( Sϕ ) (cid:54) − z . (2.4)Second, for any ( x, y, z ) ∈ Ω , we may construct a single step martingale ϕ by the formula ϕ = x, ϕ = (cid:40) x − (cid:112) y − x , with probability ; x + (cid:112) y − x , with probability , ϕ n = ϕ , n (cid:62) . Then E ϕ ∞ = ϕ = y and Sϕ = (cid:112) y − x (cid:54) √ − z almost surely. Proof of (ii) . If y = x , then any martingale ϕ that satisﬁes (2.3) is a constant. Thus, the set ofmartingales over which we optimize in (1.11) consists of a single martingale that equals x identically. Forsuch a martingale, E f ( ϕ ∞ ) = f ( x ) . Therefore, B ( x, x , z ) = f ( x ) , whenever z ∈ [0 , . Proof of (iii) . Let η > be a small parameter to be chosen later. Pick some α j , x j , y j , and z j , j ∈{ , . . . , N } , that satisfy (2.1). By formula (1.11), for every j ∈ { , . . . , N } , there exists a simple martin-gale ϕ j such that ϕ j = x j , E ( ϕ j ∞ ) = y j , ( Sϕ j ) + z j (cid:54) almost surely , (2.5)and B ( x j , y j , z j ) (cid:54) E H f (cid:16) ϕ j ∞ , ( Sϕ j ) + z j (cid:17) + η. (2.6)8e split the probability space X into N parts X j such that P ( X j ) = α j (recall that our probabilityspace does not have atoms). We treat each ( X j , Σ | X j , α j P | X j ) as an individual probability space andmodel the martingale ϕ j on it (this equips these “small” probability spaces with some ﬁltrations). Weconstruct the simple martingale ϕ as a concatenation of these martingales: ϕ = x, ∀ n ∈ N ϕ n = N (cid:88) j =1 ϕ jn − χ Xj . The constructed process ϕ is a martingale because ϕ = E ϕ due to (2.1) and (2.5). Then, E ϕ ∞ = y and ( Sϕ ) + z = ( Sϕ j ) + z j (2.7)on X j for any j by (2.1). Therefore, B ( x, y, z ) (cid:62) E H f (cid:0) ϕ ∞ , ( Sϕ ) + z (cid:1) = N (cid:88) j =1 α j E H f (cid:0) ϕ j ∞ , ( Sϕ j ) + z j (cid:1) (cid:62) N (cid:88) j =1 α j B ( x j , y j , z j ) − η by (2.6). We complete the proof by making η arbitrarily small. Proof of (iv) . If we deﬁne Sϕ n = (cid:16) (cid:88) m

Note that (iv) says that if there exists some function G satisfying the requirements of thispart , then E f ( ϕ ∞ ) (cid:54) G ( ϕ , E ϕ ∞ , for any simple ϕ with Sϕ (cid:54) . The boundary y − x = 1 − z is somehow special for our considerations. If the inequality (2.4) turnsinto equality, then Sϕ = √ − z almost surely. Thus, B ( x, x + s , (cid:112) − s ) = sup (cid:110) E f ( ϕ ) (cid:12)(cid:12)(cid:12) ϕ = x, Sϕ = s almost surely (cid:111) . (2.10)The extremal problem on the right hand side is interesting in itself.We present a simple geometric observation that Lemma 1.8 is based upon. Recall the deﬁnition (1.8)of the domains ω ε . Lemma 2.3.

Let the point ( x, y, z ) ∈ Ω be split into the points ( x j , y j , z j ) lying inside Ω according to therules (2.1) . Then , the convex hull of the points ( x j , y j ) lies in the parabolic strip ω √ − z . roof. We will prove that the points ( x j , y j ) lie below the tangent at ( x, x +1 − z ) to the upper boundaryof ω √ − z . Note that the statement and the rules (2.1) are invariant with respect to the parabolic shift ( x j , y j , z j ) (cid:55)→ ( x j − τ, y j + τ − τ x j , z j ) , (2.11)for any τ ∈ R . So, in what follows we may assume x = 0 (otherwise x can be shifted to using the shiftwith τ = x ). For any j , y j (cid:54) − z j + x j , simply because ( x j , y j , z j ) ∈ Ω . Therefore, by the last rule in (2.1) and the assumption x = 0 , y j (cid:54) − z j + x j = 1 − z , which means exactly that ( x j , y j ) lies below the tangent to the parabola y = x + 1 − z at the point (0 , − z ) . Proof of Lemma . We have the following chain of inequalities: b √ − z ( x, y ) (cid:62) n (cid:88) j =1 α j b √ − z ( x j , y j ) (cid:62) n (cid:88) j =1 α j b √ − z j ( x j , y j ) . (2.12)The ﬁrst inequality follows from the local concavity of b √ − z and the fact that the convex hull of ( x j , y j ) lies in ω √ − z by Lemma 2.3. The second inequality is a consequence of the fact that b ε is an increasingfunction of ε (we maximize over a larger set in (1.8) when we increase ε ).So, the function ( x, y, z ) (cid:55)→ b √ − z ( x, y ) satisﬁes the ﬁrst three requirements of Lemma 2.1. Thus, bythe fourth statement in Lemma 2.1 we have B ( x, y, z ) (cid:54) b √ − z ( x, y ) . Proof of Corollary . By Theorem . . in [7], the function b (with h in the role of f ) is ﬁnite. Com-bining this information with Lemma 1.8, we obtain the ﬁniteness of B , which, in the light of Remark 2.2,means exactly the assertion of the Corollary. Proof of Theorem . Assume that (cid:107) Sϕ (cid:107) L ∞ = 1 . Let us show that in this case the R -valued martingale M n = ( ϕ n , E ( ϕ | F n )) is an ω -martingale. We verify three conditions in Deﬁnition 1.4. The secondcondition is justiﬁed by the martingale convergence theorem since ϕ ∈ L . To verify the third property,we consider an R -valued process µ n = ( ϕ n , E ( ϕ | F n ) , Sϕ n ) , (2.13)where Sϕ n is deﬁned in (2.8). Let a ∈ F n be an atom. Then, the points ( x, y, z ) = µ n ( a ) and ( x j , y j , z j ) = µ n +1 ( a j ) , where the a j are all the children of a , satisfy (2.1). Thus, by Lemma 2.3, the convex hull ofthe points M n +1 ( a j ) lies inside ω . Therefore, M is an ω -martingale.Recall M ∞ is the ﬁrst coordinate of M ∞ . By Lemma 1.5, (cid:107) ( M ∞ ) ∗ (cid:107) BMO([0 , (cid:54) since M is an ω -martingale. We notice that M ∞ coincides with ϕ ∞ and ﬁnally obtain the inequality (cid:107) ϕ ∗∞ (cid:107) BMO([0 , (cid:54) (cid:107) Sϕ (cid:107) L ∞ . The sharpness of this inequality is obtained by considering the martingale ϕ such that ϕ = 0 and ϕ is ± with equal probability.The lemma below suggests a simpler way to verify property (iii) of Lemma 2.1.10 emma 2.4. Let G : Ω → R be a function. Assume that for every point ( x, y, z ) ∈ Ω \ { y = x } thereexist numbers (cid:96) ( x, y, z ) and (cid:96) ( x, y, z ) such that the estimate G (¯ x, ¯ y, ¯ z ) (cid:54) G ( x, y, z ) + (cid:96) ( x, y, z )(¯ x − x ) + (cid:96) ( x, y, z )(¯ y − y ) (2.14) holds true for every point (¯ x, ¯ y, ¯ z ) ∈ Ω such that ¯ z = z + (¯ x − x ) . Then , G fulﬁlls the main inequality G ( x, y, z ) (cid:62) N (cid:88) j =1 α j G ( x j , y j , z j ) , (2.15) where the parameters involved satisfy the splitting rules (2.1) . Remark 2.5. If G is diﬀerentiable at ( x, y, z ) , then the natural choice for (cid:96) ( x, y, z ) and (cid:96) ( x, y, z ) wouldbe the pair of partial derivatives ∂∂x G ( x, y, z ) and ∂∂y G ( x, y, z ) . In fact , one may show a reverse statement.If G satisﬁes the main inequality as above and is C -smooth on Ω , then (2.14) is true with (cid:96) and (cid:96) being the corresponding partial derivatives of G at ( x, y, z ) .Proof of Lemma . Pick some collection of parameters that satisfy the splitting rules (2.1). Withoutloss of generality, we may assume y > x (in this case the main inequality is trivial since in this case y j = y , x j = x ). Setting (¯ x, ¯ y ) = ( x j , y j ) , we obtain G ( x j , y j , z j ) (cid:54) G ( x, y, z ) + (cid:96) ( x, y, z )( x j − x ) + (cid:96) ( x, y, z )( y j − y ) , (2.16)for every j ∈ { , . . . , N } . Multiplying (2.16) by α j and summing these products, we obtain the desiredinequality N (cid:88) j =1 α j G ( x j , y j , z j ) (cid:54) G ( x, y, z ) +  N (cid:88) j =1 α j x j − x  (cid:96) ( x, y, z ) +  N (cid:88) j =1 α j y j − y  (cid:96) ( x, y, z ) = G ( x, y, z ) . (2.17) We describe the function b ε deﬁned in (1.8) in the cases needed for the proof of Theorem 1.10. We referthe reader to [7] for details; as it has been said, some of these results were obtained in earlier papers.Consider the case f (cid:48)(cid:48) is non-increasing on the entire line (recall f is twice diﬀerentiable). For any u ∈ R , we draw the segment (cid:104) ( u, u ) , (cid:0) ( u − ε ) , ( u − ε ) + ε (cid:1)(cid:105) (3.1)that touches the upper boundary of ω ε . Note that when u runs through R these segments foliate theentire domain ω ε . We call such segments right tangents (since they lie on the right of the tangency point).For any ( x, y ) ∈ ω ε there is a unique right tangent that passes through it. We denote the correspondingpoint u by u R ( x, y ) . In other words, ( x, y ) ∈ (cid:104) ( u R , u R ) , (cid:0) ( u R − ε ) , ( u R − ε ) + ε (cid:1)(cid:105) . (3.2)11 heorem 3.1. Let f satisfy the standard requirements ( Deﬁnition 1.6 ) and let f (cid:48)(cid:48) be non-increasing.The function b ε is linear along right tangents in the sense that there exists a function m R : R → R suchthat b ε ( x, y ) = f ( u R ) + m R ( u R )( x − u R ) , u R = u R ( x, y ) , ( x, y ) ∈ ω ε . (3.3) The value of m R may be computed by the formula m R ( u ) = ε − f ( u ) − ε − (cid:90) −∞ e t/ε f ( u + t ) dt. (3.4)The case when f (cid:48)(cid:48) is non-decreasing is completely similar. In this case, we consider left tangents (cid:104) ( u, u ) , (cid:0) ( u + ε ) , ( u + ε ) + ε (cid:1)(cid:105) (3.5)and the corresponding function u L : ω ε → R such that ( x, y ) ∈ (cid:104) ( u L , u L ) , (cid:0) ( u L + ε ) , ( u L + ε ) + ε (cid:1)(cid:105) . (3.6) Theorem 3.2.

Let f satisfy the standard requirements ( Deﬁnition 1.6 ) and let f (cid:48)(cid:48) be non-decreasing.The function b ε is linear along left tangents in the sense that there exists a function m L : R → R suchthat b ε ( x, y ) = f ( u L ) + m L ( u L )( x − u L ) , u L = u L ( x, y ) , ( x, y ) ∈ ω ε . (3.7) The value of m L may be computed by the formula m L ( u ) = − ε − f ( u ) + ε − ∞ (cid:90) e − t/ε f ( u + t ) dt. (3.8)Now consider the case where there exists a point c ∈ R such that f (cid:48)(cid:48) is non-decreasing on the left of c and is non-increasing on the right. In this case, there exist unique continuous functions a, b : [0 , ε ] → R such that a is decreasing, b is increasing, and a (0) = b (0) = c ; (3.9) b ( l ) − a ( l ) = l, l ∈ [0 , ε ]; (3.10) f (cid:48) ( b ) + f (cid:48) ( a )2 = f ( b ) − f ( a ) b − a , a = a ( l ) , b = b ( l ) , l ∈ (0 , ε ] . (3.11)We split ω ε into three domains ϑ ( ε ) = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) u L ( x, y ) (cid:54) a (2 ε ) (cid:111) ; (3.12) ϑ ( ε ) = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) u L ( x, y ) (cid:62) a (2 ε ) , u R ( x, y ) (cid:54) b (2 ε ) (cid:111) ; (3.13) ϑ ( ε ) = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) u R ( x, y ) (cid:62) b (2 ε ) (cid:111) , (3.14)the ﬁrst and third of them called tangent domains, the second called a cup. The identity (3.11) is calledthe cup equation. 12 heorem 3.3. Let f satisfy the standard requirements (Deﬁnition . Assume f (cid:48)(cid:48) be non-decreasingon the left of c and non-increasing on the right. The function b ε is linear along the chords (cid:104)(cid:0) a ( l ) , a ( l ) (cid:1) , (cid:0) b ( l ) , b ( l ) (cid:1)(cid:105) , l ∈ (0 , ε ] , (3.15) in the sense that b ε ( x, y ) = αf ( a ( l )) + βf ( b ( l )) , whenever x = αa ( l ) + βb ( l ) , y = αa ( l ) + βb ( l ) , α + β = 1 , α, β (cid:62) . (3.16) This deﬁnes the function b ε in the cup (3.13) foliated by the chords. On the tangent domains (3.12) and (3.14), the function b ε is deﬁned by formulas (3.7) and (3.3) respectively. The corresponding func-tions m L and m R are given by the formulas m L ( u ) = f ( b (2 ε ))+ f ( a (2 ε ))2 ε exp (cid:16) u − a (2 ε ) ε (cid:17) − f ( u ) ε + 1 ε a (2 ε ) − u (cid:90) e − t/ε f ( t + u ) dt, u ∈ ( −∞ , a (2 ε )); (3.17) m R ( u ) = f ( b (2 ε ))+ f ( a (2 ε ))2 ε exp (cid:16) b (2 ε ) − uε (cid:17) + f ( u ) ε − ε (cid:90) b (2 ε ) − u e t/ε f ( t + u ) dt, u ∈ ( b (2 ε ) , + ∞ ) . (3.18) Lemma 3.4.

Let ( x , y , z ) ∈ Ω , y − x = y − x > , and x > x . Let G : Ω → R be a function thatsatisﬁes the ﬁrst three properties in Lemma with a function f continuous on [ x − ε, x − ε ] , where ε = (cid:112) y − x . Then , we have G ( x , y , z ) (cid:62) ε − x (cid:90) x e − τ − x ε f ( τ − ε ) dτ + e − x − x ε lim inf δ → G (cid:16) x, y − δ, (cid:113) z + δ (cid:17) . (3.19) Proof.

Let N be a large number, let t = x − x N . We construct the points ( x n , y n , z n ) , n ∈ { , . . . , N } ,consecutively, starting from ( x , y , z ) : x n +1 = x n + t ; (3.20) y n +1 = x n +1 + y n − x n − t ; (3.21) z n +1 = z n + t . (3.22)We note that y n +1 − x n +1 = y n − x n − t (cid:54) − z n − t = 1 − z n +1 and y n − x n (cid:62) y N − x N = y − x − N t = y − x − ( x − x ) N > for N large enough. Thus, all the points ( x n , y n , z n ) belong to Ω . It is also convenient to introduce asequence of parameters ε n , where ε n = y n − x n . Then, ε n = ε − nt . (3.23)13he point ( x n , y n , z n ) splits into ( x n +1 , y n +1 , z n +1 ) and ( x n − ε n , ( x n − ε n ) , (cid:112) z n + ε n ) according to therules (2.1), which allows to write G ( x n , y n , z n ) (cid:62) ε n t + ε n G ( x n +1 , y n +1 , z n +1 ) + tt + ε n f ( x n − ε n ) . (3.24)If we combine these inequalities, we arrive at G ( x , y , z ) (cid:62) (cid:16) N − (cid:89) n =0 ε n ε n + t (cid:17) G (cid:16) x, y − N t , (cid:113) z + N t (cid:17) + N − (cid:88) n =0 tt + ε n (cid:16) n − (cid:89) j =0 ε j ε j + t (cid:17) f ( x + tn − ε n ) . (3.25)It remains to prove that the sum on the right hand side converges as N → ∞ to the right hand sideof (3.19). This is, in fact, a fairly lengthy calculus exercise. We comment on its proof without goingdeeply into details. The main “engine” of this eﬀect is that we have ε j = ε + O ( t ) , z j = z + O ( t ) uniformlyin j ∈ { , . . . , N } when N is large. This allows to write n − (cid:89) j =0 ε j ε j + t = e − ntε + O ( t ) (3.26)uniformly in n ∈ { , . . . , N } . Recalling N t = x − x , we get lim inf N →∞ (cid:16) N − (cid:89) n =0 ε n ε n + t (cid:17) G (cid:16) x, y − N t , (cid:113) z + N t (cid:17) (cid:62) e − x − x ε lim inf δ → G (cid:16) x, y − δ, (cid:113) z + δ (cid:17) . (3.27)The second term in (3.25) equals N − (cid:88) n =0 tt + ε n (cid:16) n − (cid:89) j =0 ε j ε j + t (cid:17) f ( x + tn − ε n ) = tε N − (cid:88) n =0 e − ntε f ( x + tn − ε n ) + O ( t )= ε − x (cid:90) x e − τ − x ε f ( τ − ε ) dτ + o (1) . (3.28) Remark 3.5.

In the case x > x, the estimate (3.19) should be replaced with G ( x , y , z ) (cid:62) ε − x (cid:90) x e τ − x ε f ( τ + ε ) dτ + e x − x ε lim inf δ → G (cid:0) x, y − δ, (cid:113) z + δ (cid:1) . (3.29) Remark 3.6.

Inequalities (3.25) and (3.27) lead to the following assertion. Let ( x , y , z ) ∈ Ω , y − x = y − x > , and x > x . Let G : Ω → R be a function that satisﬁes the ﬁrst three properties in Lemma with a function f non-negative on [ x − ε, x − ε ] , where ε = (cid:112) y − x . Then we have G ( x , y , z ) (cid:62) e − x − x ε lim inf δ → G (cid:16) x, y − δ, (cid:113) z + δ (cid:17) . (3.30) Here we require no continuity assumption on f . emma 3.7. Let G : Ω → R be a function that satisﬁes the ﬁrst three properties in Lemma . Fixsome z ∈ (0 , . Let ( x, y ) ∈ R be a point such that (cid:54) y − x < − z ; (3.31) y (cid:62) x . (3.32) Let also ( x , y ) = α ( x, y ) , where α ∈ (0 , . Then , G ( x , y , z ) (cid:62) α lim inf z → z + G ( x, y, z ) + (1 − α ) f (0) . (3.33) Proof.

Let λ = α − , A = ( x, y, z ) , and A = ( x , y , z ) . In particular, ( x, y ) = λ ( x , y ) . Let N bea large number to be speciﬁed later. Consider the points A n = ( x n , y n , z n ) , n ∈ { , . . . , N } , deﬁnedconsecutively ( x n , y n ) = λ nN ( x , y ); z n = z n − + ( x n − x n − ) . (3.34)In other words, the point A n − splits into A n and (0 , , t n ) , where t n = z n − + x n − , (3.35)according to the rules (2.1) (provided we assume A n ∈ Ω ; we will approve this assumption slightly later).We may provide an explicit formula for z n : z n = z + n − (cid:88) k =0 λ kN ( λ N − x = z + ( λ N − λ nN − λ N + 1 x . (3.36)In particular, z N → z + when N → ∞ . Therefore, A N → A . Since we have assumed strict inequalityin (3.31), we have A N ∈ Ω provided N is suﬃciently large.Since the constructed points satisfy the splitting rules (2.1) and ( x n , y n ) = λ N ( x n − , y n − ) , (3.37)we may write the inequalities G ( A n − ) (cid:62) α N G ( A n ) + (1 − α N ) f (0) , α N = λ − N , (3.38)provided we verify that the points A n and (0 , , t n ) belong to Ω for any n . We multiply (3.38) by α n − N ,sum over all n , and obtain G ( A ) (cid:62) αG ( A N ) + (1 − α ) f (0) , (3.39)which implies (3.33) since z N → z + when N → ∞ .It remains to verify the inequalities y n − x n (cid:54) − z n and t n − (cid:54) for any n ∈ { , . . . , N } . Notethat the quantity − z n is a non-increasing function of n (by (3.34)), whereas y n − x n is non-decreasing(by (3.32) and (3.34)). Thus, the inequality y n − x n (cid:54) − z n for smaller n is a consequence of the sameinequality with n = N ; the latter inequality follows from A N ∈ Ω .The same principle allows to establish the second inequality since t n deﬁned in (3.35) is an increasingfunction of n . Thus, it suﬃces to verify t N − (cid:54) , which is a consequence of z N → z + and z + x < .The latter inequality follows from (3.31) and (3.32). Remark 3.8.

We may replace the point (0 , with an arbitrary point ( t, t ) with the help of a parabolicshift (2.11) in the lemma above. Here the resulting statement is , with the same function G . Let ( x, y ) ∈ R be a point such that (cid:54) y − x < − z ; (3.40) y − t (cid:62) x − t ) . (3.41)15 et also ( x , y ) = α ( x, y ) + (1 − α )( t, t ) , where α ∈ (0 , . Then , G ( x , y , z ) (cid:62) α lim inf z → z + G ( x, y, z ) + (1 − α ) f ( t ) . (3.42) Remark 3.9.

The proof may be modiﬁed to obtain a priori stronger inequality G ( x , y , z ) (cid:62) α lim sup z → z + G ( x, y, z ) + (1 − α ) f ( t ) . (3.43) Theorem 3.10.

Let f be continuous at a, b ∈ R . Assume that the function b √ − z is linear along thesegment (cid:96) = (cid:2) ( a, a ) , ( b, b ) (cid:3) ⊂ ω √ − z . Then B ( x , y , z ) = b √ − z ( x , y ) whenever ( x , y ) ∈ (cid:96) .Proof. Let q be the midpoint of (cid:96) . Then, B ( q, z ) (cid:62) f ( a ) + f ( b )2 = b √ − z ( q ) , (3.44)since ( q, z ) might be split into the points (cid:18) a, a , (cid:113) z + ( b − a ) (cid:19) and (cid:18) b, b , (cid:113) z + ( b − a ) (cid:19) (3.45)according to the rules (2.1) (note that the said points lie in Ω ). Thus, by Lemma 1.8, we have B ( q, z ) = b √ − z ( q ) . (3.46)Let now ( x , y ) lie on (cid:96) on the left of q . Remark 3.8 implies B ( x , y , z ) (cid:62) α lim inf z → z + B ( x, y, z ) + (1 − α ) f ( a ) , α = x − ax − a , (3.47)for any point ( x, y ) ∈ (cid:96) lying arbitrarily close to q . Similar to the reasoning for the point ( q, z ) above, B ( x, y, z ) (cid:62) f ( x − (cid:112) y − x ) + f ( x + (cid:112) y − x )2 , (3.48)which implies (with the same notation α = x − ax − a ) B ( x , y , z ) (cid:62) lim inf ( x,y ) → q (cid:16) α f ( x − (cid:112) y − x ) + f ( x + (cid:112) y − x )2 + (1 − α ) f ( a ) (cid:17) = b √ − z ( x , y ) . (3.49) Theorem 3.11.

Suppose that f is continuous and non-negative , b ε ( x, y ) is continuous as a functionof ( x, y, ε ) on { x (cid:54) y (cid:54) x + ε , < ε (cid:54) } . Assume that (3.7) and (3.8) hold true for any ε ∈ (0 , .Then , B ( x , y , z ) = b √ − z ( x , y ) for all ( x , y , z ) ∈ Ω .Proof. Let us ﬁrst consider the case where y − x = 1 − z . We apply Lemma 3.4, drop the secondsummand (using the positivity of B ), and set x = ∞ : B ( x , y , z ) (cid:62) ε − ∞ (cid:90) x e − τ − x ε f ( τ − ε ) dτ, ε = y − x = 1 − z . (3.50)16y our assumptions, the right hand side of (3.50) coincides with b ε ( x , y ) , therefore B ( x , y , z ) (cid:62) b ε ( x , y ) , ε = y − x = 1 − z . (3.51)By Lemma 1.8, this inequality is, in fact, an equality.Consider now the case y − x < − z . We split ( x , y ) into the convex combination of ( u L , u L ) and P = (cid:18) u L + (cid:113) − z , (cid:16) u L + (cid:113) − z (cid:17) + 1 − z (cid:19) , (3.52)along the left tangent (cid:96) to the parabola y = x + 1 − z at P ; here u L = u L ( x , y )= x − (cid:113) − z + (cid:113) − z + x − y (3.53)is deﬁned in (3.6). Remark 3.8 implies B ( x , y , z ) (cid:62) α lim inf z → z + B ( x, y, z ) + (1 − α ) f ( u L ) , α = x − u L x − u L , (3.54)for any point ( x, y ) ∈ (cid:96) lying arbitrarily close to P . Since the function B ( x, y, · ) is non-increasing (seeRemark 1.7), we have B ( x, y, z ) (cid:62) B ( x, y, (cid:112) − y + x ) = b √ y − x ( x, y ) , (3.55)the equality holds by the already considered case. We plug this back into (3.54): B ( x , y , z ) (cid:62) αb √ y − x ( x, y ) + (1 − α ) f ( u L ) . (3.56)It remains to note that when ( x, y ) → P , the right hand side tends to b √ − z ( x , y ) since b is continuousand b √ − z is linear along (cid:96) by our assumptions. Theorem 3.12.

Suppose that f is continuous and non-negative , b ε ( x, y ) is continuous as a functionof ( x, y, ε ) on { x (cid:54) y (cid:54) x + ε , < ε (cid:54) } . Assume that (3.3) and (3.4) hold true for any ε ∈ (0 , .Then , B ( x, y, z ) = b √ − z ( x, y ) for all ( x, y, z ) ∈ Ω . Theorem 3.13.

Suppose that f is continuous and non-negative , b ε ( x, y ) is continuous as a functionof ( x, y, ε ) on { x (cid:54) y (cid:54) x + ε , < ε (cid:54) } . Assume that for any ε ∈ (0 , the function b ε has thefollowing structure : there exist some functions a and b that satisfy the properties listed in Theorem 3.3 , and on the domain (3.13) formula (3.16) holds true ; formulas (3.7) and (3.3) ( with the coeﬃcients givenin (3.18) and (3.17)) deﬁne b ε on the domains ϑ ( ε ) and ϑ ( ε ) given in (3.12) and (3.14) respectively.Then , B ( x , y , z ) = b √ − z ( x , y ) , ( x , y , z ) ∈ Ω . (3.57) Proof.

Fix z , ε = (cid:112) − z , and consider the function b ε on ω ε . By Lemma 1.8 we only need to prove B ( x , y , z ) (cid:62) b ε ( x , y ) . (3.58)If ( x , y ) ∈ ϑ ( ε ) (see (3.13)), then (3.58) follows from Theorem 3.10. In particular, for Q = (cid:16) a (2 ε ) + b (2 ε )2 , a (2 ε ) + b (2 ε )2 (cid:17) (3.59)we have B ( Q, z ) (cid:62) f ( a (2 ε )) + f ( b (2 ε ))2 = b √ − z ( Q ) . (3.60)17ow we deal with other points that satisfy y − x = ε . Let ( x , y ) with y − x = ε lie on the leftof Q (the other case is completely similar). We apply Lemma 3.4 with ( x, y ) = Q : B ( x , y , z ) (cid:62) ε − x (cid:90) x e − τ − x ε f ( τ − ε ) dτ + e − x − x ε lim inf δ → B (cid:0) x, y − δ, (cid:113) z + δ (cid:1) (3.48) (cid:62) ε − x (cid:90) x e − τ − x ε f ( τ − ε ) dτ + e − x − x ε lim inf δ → f ( x − √ ε − δ ) + f ( x + √ ε − δ )2 = ε − x (cid:90) x e − τ − x ε f ( τ − ε ) dτ + e − x − x ε f ( x − ε ) + f ( x + ε )2 . (3.61)A direct computation shows that the right hand side coincides with b √ − z ( x , y ) described by Theo-rem 3.3. Thus, we have proved (3.58) for the points satisfying y − x = ε .If ( x , y ) lies inside ϑ ( ε ) or ϑ ( ε ) (see (3.12) and (3.14)), then (3.58) is proved by the same methodas we used to prove Theorem 3.11. Proof of Theorem . If f (cid:48)(cid:48) is non-decreasing, the theorem follows from Theorems 3.2 and 3.11. If f (cid:48)(cid:48) isnon-increasing, it follows from Theorems 3.1 and 3.12. In the last case, when f (cid:48)(cid:48) changes its monotonicity,we rely upon Theorems 3.3 and 3.13. Proof of Corollary . By the very deﬁnition, c p = sup (cid:54) y (cid:54) − z B p (0 , y, z ) , (3.62)where the function B is constructed from the boundary condition f ( t ) = | t | p . Despite the fact that f does not fulﬁll the standard requirements, the corresponding Bellman functions b ε are described by thesame formulas as in Theorem 3.3 (see [18]). Therefore, by Theorem 3.13 and Remark 1.7, the supremumin (3.62) coincides with sup (cid:54) y (cid:54) B p (0 , y,

0) = sup (cid:54) y (cid:54) b p (0 , y ) . (3.63)The latter supremum equals since b (0 , y ) = y p in this case. Proof of Corollary . We consider the function B constructed for f ( t ) = e εt and observe that C ( ε ) = sup (cid:54) y (cid:54) B (0 , y, . (3.64)This case falls under the scope of Theorem 1.10. Similar to the previous proof, C ( ε ) = sup (cid:54) y (cid:54) b (0 , y ) = e − ε − ε , (3.65)as it may be derived from the exact formula for the latter function (see either Theorem 3.2, or the originalpaper [17]).Finally, we present a local version of Theorem 3.13, which may be obtained by the same proof.18 heorem 3.14. Let < ε < ε (cid:54) . Suppose that there are continuous functions a, b : [2 ε , ε ] → R such that a is decreasing , b is increasing , and b ( l ) − a ( l ) = l for l ∈ [2 ε , ε ] . Let t L , t R ∈ R satisfyinequalities t L < a (2 ε ) , b (2 ε ) < t R . Suppose that f is continuous on [ t L , a (2 ε )] ∪ [ b (2 ε ) , t R ] . Assumethat for any ε ∈ [ ε , ε ] the function b ε satisﬁes the following properties :• formula (3.7) with the coeﬃcients given in (3.17) holds on the domain ϑ ( ε ; t L ) = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) t L (cid:54) u L ( x, y ) (cid:54) a (2 ε ) (cid:111) ; • formula (3.16) holds on any chord (cid:2) ( a ( l ) , a ( l )) , ( b ( l ) , b ( l )) (cid:3) , l ∈ [2 ε , ε ] , these chords foliate adomain we denote by ϑ ( ε ; ε ); • formula (3.3) with the coeﬃcients given in (3.18) holds on the domain ϑ ( ε ; t R ) = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) b (2 ε ) (cid:54) u R ( x, y ) (cid:54) t R (cid:111) . Assume that b ε ( x, y ) is continuous as a function of ( x, y, ε ) on the domain ϑ = (cid:110) ( x, y ) ∈ ω ε (cid:12)(cid:12)(cid:12) ( x, y ) ∈ ϑ ( ε ; t L ) ∪ ϑ ( ε ; ε ) ∪ ϑ ( ε ; t R ) , ε (cid:54) ε (cid:54) ε (cid:111) . Then , B ( x, y, z ) = b √ − z ( x, y ) , (cid:0) x, y, (cid:112) − z (cid:1) ∈ ϑ. (3.66) f ( t ) = χ [0 , + ∞ ) ( t ) and sharp tail estimates In this section, we will present the proofs of Theorem 1.13 and Corollary 1.14. In other words, we willdescribe the trace of the Bellman function (1.11) with f ( t ) = χ [0 , + ∞ ) ( t ) on Ω R deﬁned in (1.19).The exposition is organized as follows. We start with solving an auxiliary optimization problem,which we call the model problem, in Subsection 4.1. Subsection 4.2 contains the proof of Theorem 1.13,the solution of the model problem from the previous subsection plays the crucial role there. Finally, weestablish Corollary 1.14 in Subsection 4.3. Consider the domain ω smile = { ( x, y ) ∈ R (cid:12)(cid:12) x (cid:54) y (cid:54) x + 1 , x ∈ [ − , } . (4.1)We say that a function R : ω smile → R satisﬁes the main inequality of the model problem provided R ( x, y ) (cid:62) α + R ( x + , y + ) + α − R ( x − , y − ) , where x = α + x + + α − x − , y = α + y + + α − y − , y + − y − x + − x − = 2 x,α + + α − = 1 , α , α ∈ (0 , , and ( x, y ) , ( x + , y + ) , ( x − , y − ) ∈ ω smile , (4.2)for any choice of the parameters. Geometrically, the main inequality of the model problem is the usualconvexity condition when the point ( x, y ) splits into ( x + , y + ) and ( x − , y − ) along the tangent to theparabola y = x + c passing through ( x, y ) . 19e posit the model problem: ﬁnd the pointwise minimal function R among all function R : ω smile → R that satisfy the main inequality of the model problem and the boundary conditions R ( x, x ) = h ( x ) , x ∈ [ − , , where h ( x ) = (cid:40) , x (cid:62) , x < . (4.3) Remark 4.1.

One may consider a similar homogeneous extremal problem on a larger domain { y (cid:62) x } ( with the same boundary value h ) . It is easy to see that the restriction to ω smile of the solution of thisnew problem coincides with R . Thus , R ( λx, λ y ) = R ( x, y ) , ( x, y ) ∈ ω smile , < λ (cid:54) . (4.4) The domain ω smile can be split into the parabolic arcs P c = (cid:8) ( x, x + c ) (cid:12)(cid:12) | x | (cid:54) c (cid:9) , (cid:54) c (cid:54) . (4.5)By the homogeneity relation (4.4), it suﬃces to focus on the case c = 1 and determine the values of R on P .Consider a parametrization ( v ( t ) , w ( t )) of the arc P . More speciﬁcally, we consider two functions v and w = v + 1 deﬁned on [1 , + ∞ ] such that v increases, v (1) = − , and v (+ ∞ ) = 1 . We splitevery point ( v ( t ) , w ( t )) ∈ P into ( x + ( t ) , y + ( t )) lying on the boundary { y = 2 x , x (cid:62) } and an in-ﬁnitesimally close point ( x − ( t ) , y − ( t )) , according to the rules (4.2) (see Figure 2). We will search for thefunction M : ω smile → R satisfying homogeneity relation (4.4) (with M instead of R ) for which the maininequality “turns into equality” along the said splitting. It will appear that the function M constructedin such a way satisﬁes the equation M ( v, w ) + (cid:68) ∇ M (cid:0) v, w (cid:1) , ( x + − v, y + − w ) (cid:69) = 1 . (4.6)Note that the parametrization has not been speciﬁed yet. It will be speciﬁed in Subsubsection 4.1.3below.The trace of M on P will be denoted by Ψ : Ψ( x ) = M ( x, x + 1) , x ∈ [ − , M ( x, y ) = Ψ (cid:16) x (cid:112) y − x (cid:17) , ( x, y ) ∈ ω smile . (4.7)Recall the boundary values (4.3). We will search for the functions v, w , and Ψ in the form v ( t ) = 1 t t (cid:90) ϕ ( s ) ds, w ( t ) = 1 t t (cid:90) ϕ ( s ) ds, Ψ( v ( t )) = 1 t t (cid:90) h ( ϕ ( s )) ds, t (cid:62) , (4.8)where ϕ satisﬁes the conditions ϕ : [0 , ∞ ) → R , ϕ ( t ) = − for t ∈ [0 , ϕ (1) = 0; ϕ ( t ) (cid:62) for t (cid:62) . (4.9)We also require ϕ to be a non-decreasing function. One may check that x + ( t ) = ϕ ( t ) , y + ( t ) = 2 ϕ ( t ) inthe sense that the tangent vector to the curve ( v ( t ) , w ( t )) points to ( ϕ ( t ) , ϕ ( t )) , and v (cid:48) ( t ) = − v ( t ) t + ϕ ( t ) t , w (cid:48) ( t ) = − w ( t ) t + 2 ϕ ( t ) t , (4.10) ddt M ( v ( t ) , w ( t )) = − M ( v ( t ) , w ( t )) t + h ( ϕ ( t )) t . (4.11)20igure 2: Illustration to the model problemWe rewrite the left hand side of (4.11): ddt M ( v ( t ) , w ( t )) = (cid:104)∇ M ( v ( t ) , w ( t )) , ( v (cid:48) ( t ) , w (cid:48) ( t )) (cid:105) = − t (cid:104)∇ M ( v ( t ) , w ( t )) , ( v ( t ) − ϕ ( t ) , w ( t ) − ϕ ( t ) ) (cid:105) , (4.12)then plug (4.12) into (4.11) taking into account that h ( ϕ ( t )) = 1 for t (cid:62) , and obtain (4.6). Using (4.8) and the relation w = v + 1 , we can write down the following chain of equalities: t ( v ( t ) + 1) = tw ( t ) = 2 (cid:90) t ϕ ( s ) ds (4.10) = 2 (cid:90) t (cid:0) v ( s ) + sv (cid:48) ( s ) (cid:1) ds. (4.13)We diﬀerentiate this relation and obtain v ( t ) + 1 + 2 tv ( t ) v (cid:48) ( t ) = 2( v ( t ) + tv (cid:48) ( t )) , or ( v ( t ) + 2 tv (cid:48) ( t )) = 2 − v ( t ) . We are looking for increasing functions ϕ and v . Thereby, we have to solve the following Cauchy problem tv (cid:48) ( t ) = − v ( t ) + (cid:112) − v ( t ) , v (1) = − , t (cid:62) , or dtt = 2 dv √ − v − v . Hence log t = (cid:90) v − √ − z − z dz = (cid:90) arcsin( v √ ) − π/ θ − sin θ cos θ dθ = (cid:16) θ − log(cos θ − sin θ ) (cid:17)(cid:12)(cid:12)(cid:12) arcsin( v √ ) − π/ = arcsin (cid:18) v √ (cid:19) + π − log (cid:32)(cid:114) − v − v √ (cid:33) . t = 2 √ − v − v e arcsin (cid:16) v √ (cid:17) + π . (4.14)Note that t runs from to + ∞ as v runs from − to .Now, we are able to compute Ψ( v ( t )) . Recall that ϕ ( t ) = − , when t ∈ [0 , , and ϕ ( t ) > for t > .Therefore, for t ∈ [0 , we have Ψ( v ( t )) = M ( − ,

2) = 12 . For t > we use the last formula in (4.8) and the deﬁnition of the function h in (4.3) and deduce Ψ( v ( t )) = 1 t (cid:90) ds + 1 t t (cid:90) ds = 1 − t . (4.15)If we plug here the solution found in (4.14), we get Ψ( v ) = 1 − √ − v − v e − arcsin (cid:16) v √ (cid:17) − π , v ∈ [ − , . (4.16)By the homogeneity relation (4.7) for an arbitrary point ( x, y ) ∈ ω smile we have M ( x, y ) = 1 − (cid:112) − ρ − ρ √ e − arcsin ρ − π , where ρ = ρ ( x, y ) = x (cid:112) y − x ) . (4.17)We have ﬁnished the construction of the function M and now will prove that it solves the model problem. We would like to prove that the function M deﬁned in (4.17) satisﬁes the main inequality (4.2) of themodel problem. We will not do this directly, but rather rely upon a principle similar to Lemma 2.4. Weomit the proof of the following lemma because it is completely similar to the proof of Lemma 2.4. Lemma 4.2.

Assume R : ω smile → R is diﬀerentiable on ω smile and satisﬁes the inequality R (¯ x, ¯ y ) (cid:54) R ( x , y ) + ∂R∂x ( x , y ) · (¯ x − x ) + ∂R∂y ( x , y ) · (¯ y − y ) (4.18) for every points ( x , y ) ∈ ω smile and (¯ x, ¯ y ) ∈ ω smile such that ¯ y − y = 2 x (¯ x − x ) . Then , R satisﬁes themain inequality of the model problem (4.2) . Lemma 4.3.

The function M as in (4.17) satisﬁes (4.18) in the role of R , i. e. , the inequality M (¯ x, ¯ y ) (cid:54) M ( x , y ) + ∂ M ∂x ( x , y ) · (¯ x − x ) + ∂ M ∂y ( x , y ) · (¯ y − y ) (4.19) holds true for any ( x , y ) ∈ ω smile and (¯ x, ¯ y ) ∈ ω smile such that ¯ y − y = 2 x (¯ x − x ) .Proof. Case ¯ x > x . Let c ∈ (0 , . For any point ( x, y ) such that ( x, y ) ∈ ω smile and max( − cx, x ) (cid:54) y (cid:54) x + c we ﬁndtwo numbers u R and v R such that v R (cid:54) x (cid:54) u R and u R − v R − c u R − v R = y − v R − c x − v R = 2 v R , u R and v R .see Figure 3.We deduce that v R = v R ( x, y, c ) = x − (cid:112) x + c − y, u R = u R ( x, y, c ) = v R + (cid:112) c − v R . (4.20)We introduce the function G R deﬁned on the domain (cid:110) ( x, y, c ) (cid:12)(cid:12)(cid:12) c ∈ (0 , , max( − cx, x ) (cid:54) y (cid:54) x + c (cid:111) by the formula: G R ( x, y, c ) = u R − xu R − v R Ψ (cid:16) v R c (cid:17) + x − v R u R − v R = u R − xu R − v R (cid:16) Ψ (cid:16) v R c (cid:17) − (cid:17) + 1 , (4.21)where u R = u R ( x, y, c ) , v R = v R ( x, y, c ) , and Ψ was deﬁned in (4.7) and got its explicit form in (4.16). Forany c ﬁxed the function G R ( · , · , c ) is linear along each segment connecting the points ( v R , v R + c ) and ( u R , u R ) , and coincides with M at their endpoints. Also from the construction of M (see formula (4.6))we deduce that the function G R has the following property: M ( x , y ) + ∂ M ∂x ( x , y ) · (¯ x − x ) + ∂ M ∂y ( x , y ) · x (¯ x − x ) = G R (¯ x, ¯ y, c ) (4.22)with c = (cid:112) y − x .On the other hand, for ¯ c = (cid:112) ¯ y − ¯ x we have v R (¯ x, ¯ y, ¯ c ) = ¯ x , and therefore G R (¯ x, ¯ y, ¯ c ) = M (¯ x, ¯ y ) . Itmay be seen that ¯ c < c : ¯ c = ¯ y − ¯ x = y − x − (¯ x − x ) < y − x = c . Thus, to prove (4.19) it suﬃces to show that the function G R (¯ x, ¯ y, c ) does not decrease in c . In otherwords, we wish to verify the inequality ∂G R ( x, y, c ) ∂c (cid:62) . (4.23)It follows from (4.20) that ∂v R ∂c = − c (cid:112) x + c − y = cv R − x , (4.24)23 u R ∂c = 12 (cid:32) ∂v R ∂c + 4 c − v R ∂v R ∂c (cid:112) c − v R (cid:33) = 12 (cid:32) cv R − x + 2 c − cv R v R − x u R − v R (cid:33) = c ( u R − x )( v R − x )(2 u R − v R ) . (4.25)We diﬀerentiate (4.21) and obtain ∂G R ( x, y, c ) ∂c = ∂u R ∂c ( u R − v R ) − ( ∂u R ∂c − ∂v R ∂c )( u R − x )( u R − v R ) (cid:16) Ψ (cid:16) v R c (cid:17) − (cid:17) + u R − xu R − v R · ∂v R ∂c c − v R c Ψ (cid:48) (cid:16) v R c (cid:17) . Formula (4.16) for the function Ψ implies Ψ (cid:48) ( v ) = 12 e − π/ − arcsin v √ . (4.26)Using (4.16) and (4.26), we continue the evaluation of ∂G R ( x,y,c ) ∂c and obtain that it equals to (cid:32) ∂u R ∂c ( u R − v R − u R + x ) + ∂v R ∂c ( u R − x )4( u R − v R ) (cid:32) v R c − (cid:114) − v R c (cid:33) + u R − xu R − v R · ∂v R ∂c c − v R c (cid:33) e − arcsin (cid:0) v R √ c (cid:1) − π . We may omit the exponent multiplier since we are interested in the sign of the expression ∂G R ( x,y,c ) ∂c only.Note that relation (4.20) yields v R c − (cid:114) − v R c = 2 v R − u R c . Applying (4.24) and (4.25), we continue the computation u R − v R ) (cid:18) ( x − v R ) c ( u R − x )( v R − x )(2 u R − v R ) + c ( u R − x ) v R − x (cid:19) v R − u R )4 c + u R − x c ( u R − v R ) (cid:18) c v R − x − v R (cid:19) = u R − x u R − v R ) (cid:18) v R − u R + 1 v R − x (cid:19) ( v R − u R ) − u R − x v R − u R ) (cid:18) v R − x − v R c (cid:19) = u R − x v R − u R ) (cid:18) v R − u R + v R c (cid:19) = u R − x c ( u R − v R )(2 u R − v R ) (cid:0) ( v R − u R ) + c − u R (cid:1) , which is non-negative because (cid:54) u R (cid:54) c and v R (cid:54) x (cid:54) u R . This ﬁnishes the proof of (4.23). Case ¯ x < x . We will construct another auxiliary function G L in the following way. Let c > . For any point ( x, y ) such that ( x, y ) ∈ ω smile and max(2 cx, x ) (cid:54) y (cid:54) x + c , we ﬁnd two numbers u L and v L such that x (cid:54) v L (cid:54) u L and u L − v L − c u L − v L = y − v L − c x − v L = 2 v L , see Figure 4. After some calculations, we get v L = v L ( x, y, c ) = x + (cid:112) x + c − y, u L = u L ( x, y, c ) = v L + (cid:112) c − v L . (4.27)We introduce the function G L deﬁned on the domain (cid:110) ( x, y, c ) (cid:12)(cid:12)(cid:12) c ∈ (0 , , max(2 cx, x ) (cid:54) y (cid:54) x + c (cid:111) u L and v L .by the formula: G L ( x, y, c ) = u L − xu L − v L Ψ (cid:16) v L c (cid:17) + x − v L u L − v L = u L − xu L − v L (cid:16) Ψ (cid:16) v L c (cid:17) − (cid:17) + 1 . (4.28)For any c ﬁxed the function G L ( · , · , c ) is linear on the extension of the segment connecting the points ( v L , v L + c ) and ( u L , u L ) beyond the point ( v L , v L + c ) , and coincides with M at these two points. Alsofrom the construction of M (see (4.6)) we deduce that the function G L satisﬁes the following property: M ( x , y ) + ∂ M ∂x ( x , y ) · (¯ x − x ) + ∂ M ∂y ( x , y ) · x (¯ x − x ) = G L (¯ x, ¯ y, c ) , (4.29)for c = (cid:112) y − x .On the other hand, for ¯ c = (cid:112) ¯ y − ¯ x we have G L (¯ x, ¯ y, ¯ c ) = M (¯ x, ¯ y ) . Again, we have ¯ c < c . Thus, itsuﬃces to show that the function G L (¯ x, ¯ y, c ) increases in c , i. e., the inequality ∂G L ( x, y, c ) ∂c (cid:62) . (4.30)From equations (4.27) we obtain ∂v L ∂c = c (cid:112) x + c − y = cv L − x . (4.31)We note that the right hand side of (4.31) coincides with the right hand side of (4.24) ( v R is simplyreplaced with v L ), and u L is deﬁned by v L exactly in the same way as u R was deﬁned by v R . The samecalculations as we have already done in the proof of (4.23) lead us to the fact that (4.30) is equivalent to u L − xc ( u L − v L )(2 u L − v L ) (cid:0) ( v L − u L ) + c − u L (cid:1) (cid:62) , (4.32)which holds true because x (cid:54) v L (cid:54) u L and (cid:54) u L (cid:54) c . Lemmas 4.2 and 4.3 imply M ( x, y ) (cid:62) R ( x, y ) for any ( x, y ) ∈ ω smile . Now we wish to prove the reverseinequality. 25 emma 4.4. Let R be the solution of the model problem and let M be the function deﬁned in (4.17) .Then , for any ( x, y ) ∈ ω smile , we have R ( x, y ) (cid:62) M ( x, y ) .Proof. Due to the homogeneity relation (4.4), it suﬃces to consider the case ( x, y ) ∈ P . Let r ( t ) = R ( v ( t ) , w ( t )) , here we use the parametrization ( v ( t ) , w ( t )) of P introduced in Subsubsection 4.1.2. Wewill show that r is continuous and for every t (cid:62) the inequality d − dt [ t r ( t )] (cid:62) (4.33)holds true. By d − dt we mean the lower derivative, that is d − rdt ( t ) = lim inf t → t r ( t ) − r ( t ) t − t . Once (4.33) is proved, we may use formula (4.15) that implies d − dt (cid:104) t (cid:16) r ( v ( t )) − Ψ( v ( t )) (cid:17)(cid:105) (cid:62) . (4.34)This yields the desired estimate R ( v ( t ) , w ( t )) − M ( v ( t ) , w ( t )) (cid:62) , because the two functions in questionare continuous and are equal at t = 1 .The proof of (4.33) and the continuity of r will take some time. Fix a point ( v , w ) , where w = v +1 and v = v ( t ) for some t ∈ (1 , ∞ ) . Draw the tangent line through ( v , w ) to the upper boundaryof ω smile : y = 2 v x + 1 − v . (4.35)Take two more points on this line ( x ± , y ± ) , where one of them is the right point of intersection with thelower boundary of ω smile , i. e., x + = ϕ ( t ) = v + (cid:112) − v , y + = 2 ϕ ( t ) = 1 + v (cid:113) − v , and the second is deﬁned as follows: x − = vv + (cid:112) v − v v v , y − = (cid:0) vv + (cid:112) v − v (cid:1) v , (4.36)where v = v ( t ) for some t ∈ [1 , ∞ ) , t (cid:54) = t . At these points we have R ( x + , y + ) = 1 and R ( x − , y − ) = r ( t ) .The latter identity holds true by (4.4), because the points ( x − , y − ) and ( v, v + 1) lie on the parabola y = 1 + v v x . We write down the concavity property (4.2): r ( t ) (cid:62) x + − v x + − x − · r ( t ) + v − x − x + − x − · , if t < t ; (4.37) r ( t ) (cid:62) x + − x − x + − v · r ( t ) + x − − v x + − v · , if t > t . (4.38)We may rewrite (4.37) and (4.38) as follows ( x + − v )( r ( t ) − r ( t )) (cid:54) ( x − − v )(1 − r ( t )) , if t < t ;( x + − v )( r ( t ) − r ( t )) (cid:62) ( x − − v )(1 − r ( t )) , if t > t . r ( t ) − r ( t ) t − t (cid:62) x − − v t − t · − r ( t ) x + − v (4.39)after dividing by ( t − t )( x + − v ) .Let us calculate the right hand side of this inequality. Using (4.10) we get x + − v = ϕ ( t ) − v ( t ) = t v (cid:48) ( t ) . From the deﬁnition (4.36) of x − we deduce x − − v = v − v v + v (cid:112) v − v . Therefore, (4.39) may be rewritten as r ( t ) − r ( t ) t − t (cid:62) v − v t − t · v + v v + v (cid:112) v − v · − r ( t ) t v (cid:48) ( t ) . (4.40)We see that the right hand side has a limit as t → t : lim t → t v − v t − t · v + v v + v (cid:112) v − v · − r ( t ) t v (cid:48) ( t ) = 1 − r ( t ) t , whence lim inf t → t r ( t ) − r ( t ) t − t (cid:62) − r ( t ) t , what is exactly the desired estimate (4.33).It remains to check continuity of r . First we note that (4.40) implies that r is an increasing functionbecause v is. We will write down the same property (4.2) with ( x + , y + ) being not the right but the leftpoint of intersection with the lower boundary of ω smile , i. e., x + = v − (cid:112) − v . The point ( x − , y − ) is deﬁned as before by (4.36), where v = v ( t ) and t > t . Thus, now we have x + < v < x − , R ( x + , y + ) = , R ( x − , y − ) = r ( t ) , and the concavity property (4.2) takes the form r ( t ) (cid:62) x − − v x − − x + · + v − x + x − − x + · r ( t ) . Therefore, ( x − − v )( r ( t ) − ) (cid:62) ( v − x + )( r ( t ) − r ( t )) , (4.41)whence (cid:54) r ( t ) − r ( t ) (cid:54) x − − v v − x + ( r ( t ) − ) (cid:54) x − − v v − x + · = v − v v + v (cid:112) v − v · v + (cid:112) − v (cid:54) v − v v + (cid:112) − v . v is a continuous function, this inequality proves that r is continuous at any point t , t > (i. e., v > − ). It remains to verify continuity of the function r at the point t = 1 from the right. We knowthat r is an increasing function, therefore, there exists a limit lim t → r ( t ) def = r . (4.42)Due to (4.41) we have r ( t ) (cid:62) for every t > , whence r (cid:62) . At the same time we have alreadyproved that r ( t ) (cid:54) Ψ( v ( t )) = 1 − t , i. e., r (cid:54) . So, r = and we have proved continuity of r on [1 , ∞ ) .Summarizing all preceding consideration, we conclude that the solution of the model problem is givenby the following formula: R ( x, y ) = 1 − (cid:112) − ρ − ρ √ e − arcsin ρ − π , where ρ = ρ ( x, y ) = x (cid:112) y − x ) . (4.43) The set Ω R deﬁned in (1.19) has special relationship with the splitting rules (2.1). It follows fromLemma 2.3 that if ( x, y, z ) ∈ Ω R is split into some points ( x j , y j , z j ) according to the rules (2.1), then,ﬁrst, all the points ( x j , y j ) lie on the tangent line to the parabola y − x = 1 − z , and second, all thepoints ( x j , y j , z j ) belong to Ω R . One may say that Ω R has separate dynamics.Thus, if we denote B ( x, y, (cid:112) − y + x ) by (cid:98) B ( x, y ) , then (cid:98) B may be described as the minimal amongfunctions G : ω → R that satisfy the boundary conditions G ( x, x ) = χ [0 , ∞ ) ( x ) and the main inequality G ( x, y ) (cid:62) N (cid:88) j =1 α j G ( x j , y j ) , where x = N (cid:88) j =1 α j x j , y = N (cid:88) j =1 α j y j , y j − yx j − x = 2 x, N (cid:88) j =1 α j = 1 , α j (cid:62) , and ( x, y ) , ( x j , y j ) ∈ ω . (4.44)Note that the main inequality (or the splitting rules) almost coincides with the main inequality (4.2) ofthe model problem. The only diﬀerence is that the two extremal problems are set on diﬀerent domains(the splitting into N points with arbitrary N may be reduced to many splittings into points; formally,we will not use this principle). Lemma 4.5.

The function (cid:98) B satisﬁes the following equality : (cid:98) B ( x, y ) = b √ y − x ( x, y ) , x (cid:54) y (cid:54) min(2 x , x + 1) . Proof.

Lemma 1.8 implies (cid:98) B ( x, y ) (cid:54) b √ y − x ( x, y ) , x (cid:54) y (cid:54) min(2 x , x + 1) , (4.45)therefore, it suﬃces to prove the reverse inequality. Note that here we cannot use theorems from Section 3directly due to the discontinuity of f .First, let x (cid:54) y (cid:54) min(2 x , x + 1) and x > . Then, by (3.48) we have (cid:98) B ( x, y ) (cid:62) .28econd, let ( x , y ) ∈ ω with x (cid:54) y (cid:54) x and x < . Let ε = (cid:112) y − x . Take any small θ > and apply Remark 3.6 with ( x, y ) = ( − ε + θ, ( − ε + θ ) + ε ) : (cid:98) B ( x , y ) (cid:62) e − x − x ε lim inf δ → (cid:98) B ( x, y − δ ) (3.48) (cid:62) e − x − x ε lim inf δ → f ( x − (cid:112) y − δ − x ) + f ( x + (cid:112) y − δ − x )2 = 12 e − x − x ε . Considering arbitrarily small θ > , we obtain (cid:98) B ( x , y ) (cid:62) e x ε +1 (1.18) = b ε ( x , y ) . (4.46)Lemma 4.5 implies that (cid:98) B ( x, y ) = b √ y − x ( x, y ) on ω \ ω smile . Moreover, (cid:98) B | ω smile satisﬁes the boundaryconditions (4.3) and the main inequality (4.2) of the model problem. Thus, (cid:98) B ( x, y ) (cid:62) R ( x, y ) , ( x, y ) ∈ ω smile . (4.47)To prove Theorem 1.13, it suﬃces to show that the function G deﬁned as G ( x, y ) = (cid:40) b √ y − x ( x, y ) , ( x, y ) ∈ ω \ ω smile , R ( x, y ) , ( x, y ) ∈ ω smile , (4.48)satisﬁes the main inequality (4.44). This is our target for the remaining part of the subsection. It isconvenient to introduce the domains ω R = { ( x, y ) ∈ ω | y (cid:54) x , x (cid:62) } ; ω L = { ( x, y ) ∈ ω | y (cid:54) x , x (cid:54) } . (4.49)The function G is homogeneous: G ( λ x , λ y ) = G ( x , y ) for λ ∈ (0 , , therefore, without loss of generalitywe may assume that y = x + 1 . If ( x, y ) / ∈ ω smile then G ( x, y ) = b √ y − x ( x, y ) (2.12) (cid:62) N (cid:88) j =1 α j b √ yj − x j ( x j , y j ) (cid:62) N (cid:88) j =1 α j G ( x j , y j ) . In what follows we consider only ( x, y ) ∈ ω smile such that y (cid:54) = 2 x . Instead of verifying (4.44) we willprove the inequality G (¯ x, ¯ y ) (cid:54) G ( x, y ) + ∂G∂x ( x, y ) · (¯ x − x ) + ∂G∂y ( x, y ) · (¯ y − y ) (4.50)for (¯ x, ¯ y ) ∈ ω ∩ (cid:96) , where (cid:96) = { ( x , y ) | y − y = 2 x ( x − x ) } . Indeed, one may argue as in the proof ofLemma 2.4 to show that (4.50) yields (4.44).The right hand side of (4.50) is linear with respect to ¯ x when (¯ x, ¯ y ) ∈ (cid:96) and is equal to L (¯ x ) = M ( x, y ) + ∂ M ∂x ( x, y ) · (¯ x − x ) + ∂ M ∂y ( x, y ) · x (¯ x − x ) since G = M on ω smile . Lemma 4.3 implies that (4.50) holds true for (¯ x, ¯ y ) ∈ ω smile ∩ (cid:96) because G (¯ x, ¯ y ) = M (¯ x, ¯ y ) . The point ( x + √ − x , x √ − x ) is the intersection of (cid:96) with the common boundary of ω R ω smile , therefore, L ( x + √ − x ) = 1 by the construction of M (see (4.6)). Also, we know that L ( x ) = M ( x, y ) < , therefore L (¯ x ) (cid:62) for (¯ x, ¯ y ) ∈ ω R ∩ (cid:96) .Thus, it remains to prove that (4.50) holds for (¯ x, ¯ y ) ∈ ω L ∩ (cid:96) : G (¯ x, ¯ y ) (cid:54) L (¯ x ) = 1 + 2¯ x − ( x + √ − x ) x − √ − x (cid:0) G ( x, x + 1) − (cid:1) . (4.51)Recall that G (¯ x, ¯ y ) = b √ ¯ y − ¯ x (¯ x, ¯ y ) is given by (1.18) for (¯ x, ¯ y ) ∈ ω L : G (¯ x, ¯ y ) = 12 e ¯ x √ ¯ y − ¯ x = 12 e ¯ x √ − ( x − ¯ x )2 , (4.52)here we have used that ¯ y = 2 x (¯ x − x ) + x + 1 . The value G ( x, x + 1) equals to Ψ( x ) deﬁned in (4.16): G ( x, x + 1) = Ψ( x ) = 1 − √ − x − x e − arcsin (cid:16) x √ (cid:17) − π . (4.53)We rewrite (4.51) using (4.52) and (4.53): e ¯ x √ − ( x − ¯ x )2 (cid:54) x − ( x + √ − x )4 e − arcsin (cid:16) x √ (cid:17) − π . (4.54)We introduce the variables α = π x √ , γ = arcsin( x − ¯ x ) . (4.55)Then, x = √ (cid:16) α − π (cid:17) = sin α − cos α, ¯ x = x − sin γ = sin α − cos α − sin γ,x + (cid:112) − x = √ (cid:16) α − π (cid:17) + √ (cid:16) α − π (cid:17) = 2 sin α. Note that α, γ ∈ [0 , π ] . The condition (¯ x, ¯ y ) ∈ ω L implies that ¯ x (cid:54) x −√ − x , i. e., x − ¯ x (cid:62) x + √ − x .From this we obtain (cid:54) α (cid:54) γ (cid:54) π . Rewrite (4.54) in the variables α, γ : e sin α − cos α − sin γ cos γ (cid:54) − cos α + sin γ e − α . (4.56)It suﬃces to show that for the ﬁxed parameter γ , (cid:54) γ (cid:54) π , the function F ( α ) = e sin α − cos α − sin γ cos γ − αe − α + sin γe − α attains only non-positive values for (cid:54) α (cid:54) γ. Our next step is to prove the convexity of F . Its ﬁrst andsecond derivatives are written below F (cid:48) ( α ) = e − α ( − sin α − cos α − sin γ ) + e sin α − cos α − sin γ cos γ (cid:18) cos α + sin α cos γ (cid:19) ; F (cid:48)(cid:48) ( α ) = e − α (2 sin α + sin γ ) + e sin α − cos α − sin γ cos γ (cid:18) cos α + sin α cos γ (cid:19) + e sin α − cos α − sin γ cos γ (cid:18) cos α − sin α cos γ (cid:19) . F is also always non-negative. Indeed, it followsfrom the estimate (cid:62) sin α cos γ and that all other expressions involved are positive. We have shownthat the function F is convex. To estimate its values from above on [0 , γ ] , it suﬃces to show F (0) (cid:54) and F ( γ ) (cid:54) . We start with the case α = 0 : F (0) = (1 + sin γ ) + e − γ cos γ − . The statement we need to prove is equivalent to the fact that the function Φ( γ ) = 1 − γ cos γ − log(1 − sin γ ) takes only non-positive values when γ ∈ [0 , π/ . It should be noted that Φ(0) = 0 , while Φ (cid:48) ( γ ) = − cos γ + sin γ (1 + sin γ )cos γ + cos γ − sin γ = (cos γ − γ )cos γ (cid:54) . We have obtained that Φ( γ ) (cid:54) , so the estimate F (0) (cid:54) follows. Now, we verify the inequality for theright endpoint of the segment, i. e., for α = γ : F ( γ ) = e − γ (sin γ + cos γ ) − (cid:54) . One may easily see that for γ = 0 the inequality turns into equality. Taking the derivative of the functionon the left side of this inequality, we get − γe − γ . The last term is negative for γ ∈ (0 , π ] and theestimate F ( γ ) (cid:54) follows. In this section, we present the proof of Corollary 1.14. Recall that our goal is to ﬁnd the optimal constantin (1.22). Since the square function is homogeneous and vanishes on constants, it suﬃces to ﬁnd the bestpossible constant c opt in the estimate P ( ϕ ∞ (cid:62) (cid:54) ce − λ , where ϕ = − λ and (cid:107) Sϕ (cid:107) L ∞ = 1 . (4.57)Recall that the Bellman function B ( x, y, z ) was deﬁned by formula (1.11), and in Lemma 1.8 we haveshown that the inequality B ( x, y, z ) (cid:54) b √ − z ( x, y ) is true. Thus, the optimal constant c opt may beestimated as follows c opt = sup (cid:8) e λ B ( − λ, y, | λ ∈ R , λ (cid:54) y (cid:54) λ + 1 (cid:9) (cid:54) sup (cid:8) e − x b ( x, y ) | ( x, y ) ∈ ω (cid:9) . (4.58)Recall that we have split the domain ω into the subdomains D , D , D , D , and the function b ( x, y ) was deﬁned on them by (1.18). We will continue the argument by the estimation of s ( x, y ) := e − x b ( x, y ) in each subdomain.Clearly, s ( x, y ) (cid:54) , when ( x, y ) ∈ D .For ( x, y ) ∈ D , we have s ( x, y ) = e − x (cid:18) − y − x (cid:19) (cid:54) e − x (cid:18) − | x | − x (cid:19) , since y (cid:62) | x | . The function on the right hand side decreases on [ − , and takes value e at − , therefore s ( x, y ) (cid:54) e on D . 31ext, consider ( x, y ) ∈ D . The relations y (cid:54) − x and − (cid:54) x (cid:54) imply s ( x, y ) = (cid:18) − x y (cid:19) e − x (cid:54) (cid:16) x (cid:17) e − x (cid:54) e . Finally, we take ( x, y ) ∈ D and set t = (cid:112) − y + x to get s ( x, y ) = e − t ) e t (cid:54) e , (4.59)since t ∈ [0 , . Thus, we have proved c opt (cid:54) e .Now we notice that for x = − and y = 2 we have B ( − , ,

0) = b ( − ,

2) = , and therefore, (4.58)implies that c opt (cid:62) e , which means c opt = e . References [1] D. L. Burkholder,

Boundary value problems and sharp inequalities for martingale transforms , Ann.Prob. (1984), no. 3, 647–702.[2] S.-Y. A. Chang, J. M. Wilson, and T. H. Wolﬀ, Some weighted norm inequalities concerning theSchr¨odinger operators , Comment. Math. Helv. (1985), no. 2, 217–246.[3] D. C. Cox, The best constant in Burkholder’s weak- L inequality for the martingale square function ,Proc. Amer. Math. Soc. (1982), no. 3, 427–433.[4] I. Holmes, P. Ivanisvili, and A. Volberg, The sharp constant in the weak (1,1) inequality for thesquare function: a new proof , Rev. Mat. Iberoam. (2020), no. 3, 741–770.[5] P. Ivanishvili, N. N. Osipov, D. M. Stolyarov, V. I. Vasyunin, and P. B. Zatitskiy, On Bellmanfunction for extremal problems in

BMO , C. R. Math. Acad. Sci. Paris (2012), no. 11, 561–564.[6] P. Ivanishvili, N. N. Osipov, D. M. Stolyarov, V. I. Vasyunin, and P. B. Zatitskiy,

Bellman functionfor extremal problems in

BMO , Trans. Amer. Math. Soc. (2016), 3415–3468.[7] P. Ivanishvili, D. M. Stolyarov, V. I. Vasyunin, and P. B. Zatitskiy,

Bellman function for extremalproblems on

BMO

II: evolution , Mem. Amer. Math. Soc. (2018), no. 1220.[8] P. Ivanisvili and S. Treil,

Superexponential estimates and weighted lower bounds for the square func-tion , Trans. Amer. Math. Soc. (2019), 1139–1157.[9] N. Kazamaki,

Continuous exponential martingales and BMO , Springer-Verlag, 1994.[10] F. L. Nazarov and S. R. Treil,

The hunt for a Bellman function: applications to estimates for singularintegral operators and to other classical problems of harmonic analysis , Algebra i Analiz (1996),no. 5, 32–162, translation in St. Petersburg Math. J. 8 (1997), no. 5, 721–824.[11] A. Os¸ekowski, On the best constant in the weak type inequality for the square function of a condi-tionally symmetric martingale , Statist. Probab. Lett. (2009), no. 13, 1536–1538.[12] A. Os¸ekowski, Sharp martingale and semimartingale inequalities , Monograﬁe Matematyczne IM-PAN , Springer Basel, 2012.[13] A. Os¸ekowski, Sharp inequalities for the dyadic square function in the BMO setting , Acta Math.Hungar. (2013), 85–105. 3214] ,

Functional equations and sharp weak-type inequalities for the martingale square function ,Math. Ineq. Appl. (2014), 1499–1513.[15] , A weak-type inequality for the martingale square function , Statist. Probab. Lett. (2014),139–143.[16] , Sharp maximal estimates for BMO martingales , Osaka J. Math. (2015), 1125–1143.[17] L. Slavin and V. Vasyunin, Sharp results in the integral form John–Nirenberg inequality , Trans.Amer. Math. Soc. (2011), no. 8, 4135–4169.[18] ,

Sharp L p estimates on BMO , Indiana Univ. Math. J. (2012), no. 3, 1051–1110.[19] E. M. Stein, Harmonic analysis: real-variable methods, orthogonality, and oscillatory integrals ,Princeton University Press, 1993.[20] D. Stolyarov, V. Vasyunin, P. Zatitskiy, and I. Zlotnikov,

Distribution of martingales with boundedsquare functions , C. R. Math. Acad. Sci. Paris (2019), no. 8, 671–675.[21] D. M. Stolyarov, V. I. Vasyunin, and P. B. Zatitskiy,

Monotonic rearrangement of functions withsmall mean oscillation , Studia Math. (2015), no. 3, 257–268.[22] D. M. Stolyarov and P. B. Zatitskiy,

Theory of locally concave functions and its applications to sharpestimates of integral functionals , Adv. Math. (2016), 228–273.[23] V. Vasyunin and A. Volberg,

Sharp constants in the classical weak form of the John–Nirenberginequality , Proc. Lond. Math. Soc. (2014), no. 6, 1417–1434.[24] ,

The Bellman function technique in Harmonic Analysis , Cambridge University Press, 2020.[25] V. I. Vasyunin,

The sharp constant in the John–Nirenberg inequality

Sharp square-function inequalities for conditionally symmetric martingales , Trans. Amer.Math. Soc.382