Time fractional gradient flows: Theory and numerics
WENBO LI AND ABNER J. SALGADO
Abstract.
We develop the theory of fractional gradient flows: an evolution aimed at the minimization of a convex, l.s.c. energy, with memory effects. This memory is characterized by the fact that the negative of the (sub)gradient of the energy equals the so-called Caputo derivative of the state. We introduce the notion of energy solutions, for which we provide existence, uniqueness and certain regularizing effects. We also consider Lipschitz perturbations of this energy. For these problems we provide an a posteriori error estimate and show its reliability. This estimate depends only on the problem data, and imposes no constraints between consecutive time-steps. On the basis of this estimate we provide an a priori error analysis that makes no assumptions on the smoothness of the solution.

1. Introduction
In recent times problems involving fractional derivatives have garnered considerable attention, as it is claimed that they better describe certain fundamental relations between the processes of interest; see, for instance, [29, 15, 46]. In this, and many other references, the models considered are linear. However, it is well known that real world phenomena are not linear, not even smooth. It is only natural then to consider nonlinear/nonsmooth models with fractional derivatives.

The purpose of this work is to develop the theory and numerical analysis of so-called time-fractional gradient flows: an evolution equation aimed at the minimization of a convex and lower semicontinuous (l.s.c.) energy, but where the evolution has memory effects. This memory is characterized by the fact that the negative of the (sub)gradient of the energy equals the so-called Caputo derivative of the state.

The Caputo derivative, introduced in [11], is one of the existing models of fractional derivatives. It is defined, for $\alpha \in (0,1)$, by
\[ D_c^\alpha w(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{\dot{w}(r)}{(t-r)^\alpha} \, dr, \tag{1.1} \]
where $\Gamma$ denotes the Gamma function. This definition, from the onset, seems unnatural. To define a derivative of a fractional order, it seems necessary for the function to be at least differentiable. Below we briefly describe several attempts at circumventing this issue. We focus, in particular, on the results developed in a series of papers by Li and Liu, see [25, 28, 26, 27], where they developed a distributional theory for this derivative; see also [16]. The authors of these works also constructed, in [26], so-called deconvolution schemes that aim at discretizing this derivative. With the help of this definition and the schemes that they develop the authors were able to study several classes of equations, in particular time fractional gradient flows.

Let us be precise in what we mean by this term. Let $T > 0$ and $\mathcal{H}$ be a separable Hilbert space, $\Phi : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ be a convex and l.s.c. functional, which we will call energy.
Given $u_0 \in \mathcal{H}$ and $f : (0,T] \to \mathcal{H}$, we seek a function $u : [0,T] \to \mathcal{H}$ that satisfies
\[ \begin{cases} D_c^\alpha u(t) + \partial\Phi(u(t)) \ni f(t), & t \in (0,T], \\ u(0) = u_0, \end{cases} \tag{1.2} \]
where by $\partial\Phi$ we denote the subdifferential of $\Phi$. Our objectives in this work can be stated as follows: We will introduce the notion of "energy solutions" of (1.2), and we will refine the results regarding existence, uniqueness, and regularizing effects provided in [28]. This will be done by generalizing, to non-uniform time steps, the "deconvolution" schemes of [26, 28], and developing a sort of "fractional minimizing movements" scheme. We will also provide an a priori error estimate that seems optimal in light of the regularizing effects proved above. We also develop an a posteriori error estimate, in the spirit of [30], and show its reliability.

Date: Draft version of January 5, 2021.
2020 Mathematics Subject Classification.
Key words and phrases. Caputo derivative, gradient flows, a posteriori error estimate, variable time stepping.

We comment, in passing, that nonlinear evolution problems with fractional time derivative have been considered in other works. From a modeling point of view, their advantages have been observed in [15, 12]. Some other types of nonlinear problems have been studied in [8, 40, 2, 24, 23, 39, 45] and [31, 38] where, for a particular type of nonlinear problem, other "energy dissipation inequalities" than those we obtain are derived. Regularity properties for nonlinear problems with fractional time derivatives have been obtained in [22, 14, 21, 1, 44, 43, 42, 41]. Of particular interest to us are [28], which we described above, and [3], which also considers time fractional gradient flows. The assumptions on the data, however, are slightly different than ours. As such, some of the results in [3] are stronger, and some weaker than ours; in particular, we conduct a numerical analysis of this problem. Nevertheless, we refer to this reference for a nice historical account and particular applications to PDEs.

Our presentation will be organized as follows. We will establish notation and the framework we will adopt in Section 2.
Here, in particular, we will study several properties of a particular space, which we denote by $L^p_\alpha(0,T;\mathcal{H})$, and that will be used to characterize the requirements on the right hand side $f$ of (1.2). In addition, we also review the various proposed generalizations of the classical definition of Caputo derivatives, with particular attention to that of [25, 28, 27], since this is the one we shall adopt. In Section 3 we generalize the deconvolution schemes of [26, 28] and their properties to the case of nonuniform time stepping. Many of the simple properties of these schemes are lost in this case, but we retain enough of them for our purposes. Section 4 introduces the notion of energy solutions for (1.2) and shows existence and uniqueness of these. This is accomplished by introducing, on the basis of our generalized deconvolution formulas, a fractional minimizing movements scheme, and showing that the discrete solutions have enough compactness to pass to the limit in the size of the partition. In Section 5 we provide an error analysis of the fractional minimizing movements scheme. First, we show how an error estimate follows as a side result from the existence proof. Then, in the spirit of [30], we provide an a posteriori error estimator for our scheme and show its reliability. This estimator is then used to independently show rates of convergence. This section is concluded with some particular instances in which the rate of convergence can be improved. Section 6 is dedicated to the case in which we allow a Lipschitz perturbation of the subdifferential. We extend the existence, uniqueness, a priori, and a posteriori approximation results of the fractional gradient flow. Finally, Section 7 presents some simple numerical experiments that illustrate, explore, and expand our theory.

2. Notation and preliminaries
Let us begin by presenting the main notation and assumptions we shall operate under. We will denote by $T \in (0,\infty)$ our final (positive) time. By $\mathcal{H}$ we will always denote a separable Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ and norm $\| \cdot \|$. As is by now customary, by $C$ we will denote a nonessential constant whose value may change at each occurrence.

2.1. Convex energies.
The energy will be a convex, l.s.c. functional $\Phi : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$ with nonempty effective domain of definition, that is,
\[ D(\Phi) = \{ w \in \mathcal{H} : \Phi(w) < +\infty \} \neq \emptyset. \]
We will always assume that our energy is bounded from below, that is,
\[ \Phi_{\inf} = \inf_{u \in \mathcal{H}} \Phi(u) > -\infty. \]
As we are not assuming smoothness in our energy beyond convexity, a useful substitute for its derivative is the subdifferential, that is,
\[ \partial\Phi(w) = \{ \xi \in \mathcal{H} : \langle \xi, v - w \rangle \le \Phi(v) - \Phi(w) \ \ \forall v \in \mathcal{H} \}. \]
The effective domain of the subdifferential is $D(\partial\Phi) = \{ w \in \mathcal{H} : \partial\Phi(w) \neq \emptyset \}$. Recall that, in our setting, we always have that $\overline{D(\partial\Phi)} = \overline{D(\Phi)}$. We refer the reader to [13, 33] for basic facts on convex analysis.

In applications, it is sometimes useful to obtain error estimates on (semi)norms stronger than those of the ambient space, and that are dictated by the structure of the energy. For this reason, we introduce the following coercivity modulus of $\Phi$, see [30, Definition 2.3].

Definition 2.1 (coercivity modulus). For every $w_1 \in D(\Phi)$ and $w_2 \in D(\partial\Phi)$, let $\sigma(w_2; w_1) \ge 0$ be
\[ \sigma(w_2; w_1) = \Phi(w_1) - \Phi(w_2) - \sup_{\xi \in \partial\Phi(w_2)} \langle \xi, w_1 - w_2 \rangle. \]
Then for every $w_1, w_2 \in D(\partial\Phi)$ we define
\[ \rho(w_1, w_2) = \sigma(w_1; w_2) + \sigma(w_2; w_1) = \inf_{\xi_1 \in \partial\Phi(w_1),\ \xi_2 \in \partial\Phi(w_2)} \langle \xi_1 - \xi_2, w_1 - w_2 \rangle. \]

We comment that, by the definition, $\rho(\cdot,\cdot)$ is symmetric, whereas $\sigma(\cdot;\cdot)$ might not be. Furthermore, the separability of $\mathcal{H}$ guarantees that $\sigma$ and $\rho$ are both Borel measurable [30, Remark 2.4]. One may also refer to [30, Section 2.3] for discussions and properties of $\sigma$ and $\rho$ for certain choices of $\Phi$. Definition 2.1 enables us to write
\[ \xi \in \partial\Phi(w) \iff \langle \xi, v - w \rangle + \sigma(w; v) \le \Phi(v) - \Phi(w), \quad \forall v \in \mathcal{H}. \tag{2.1} \]

2.2. Vector valued time dependent functions.
We will follow standard notation regarding Bochner spaces of vector valued functions, see [32, Section 1.5]. For any $w \in L^1(0,T;\mathcal{H})$ and $E \subset [0,T]$ that is measurable, we define the average by
\[ \fint_E w(t) \, dt = \frac{1}{|E|} \int_E w(t) \, dt, \]
where $|E|$ denotes the Lebesgue measure of $E$.

Since eventually we will have to deal with time discretization, we also introduce notation for time-discrete vector valued functions. Let $\mathcal{P}$ be a partition of the time interval $[0,T]$,
\[ \mathcal{P} = \{ 0 = t_0 < t_1 < \ldots < t_{N-1} < t_N = T \}, \tag{2.2} \]
with variable steps $\tau_n = t_n - t_{n-1}$ and $\tau = \max\{ \tau_n : n \in \{1, \ldots, N\} \}$. We will always denote by $N$ the size of a partition. For $t \in [0,T]$ we define
\[ \lfloor t \rfloor_{\mathcal{P}} = \max\{ r \in \mathcal{P} : r < t \}, \qquad \lceil t \rceil_{\mathcal{P}} = \min\{ r \in \mathcal{P} : t \le r \}, \]
and $n(t)$ to be the index of $\lceil t \rceil_{\mathcal{P}}$, so that $t \in (\lfloor t \rfloor_{\mathcal{P}}, \lceil t \rceil_{\mathcal{P}}] = (t_{n(t)-1}, t_{n(t)}]$. Given a partition $\mathcal{P}$, for $\mathbf{W} = \{W_i\}_{i=1}^N \subset \mathcal{H}^N$ we define its piecewise constant interpolant with respect to $\mathcal{P}$ to be the function $\mathbf{W}_{\mathcal{P}} \in L^\infty(0,T;\mathcal{H})$ defined by
\[ \mathbf{W}_{\mathcal{P}}(t) = W_{n(t)}. \tag{2.3} \]

The space $L^p_\alpha(0,T;\mathcal{H})$. To quantify the assumptions we need on the right hand side $f$ of (1.2) we introduce the following space.

Definition 2.2 (space $L^p_\alpha(0,T;\mathcal{H})$). Let $p \in [1,\infty)$ and $\alpha \in (0,1)$. We say that the function $w : [0,T] \to \mathcal{H}$ belongs to the space $L^p_\alpha(0,T;\mathcal{H})$ iff
\[ \| w \|_{L^p_\alpha(0,T;\mathcal{H})} = \sup_{t \in [0,T]} \left( \int_0^t (t-s)^{\alpha-1} \| w(s) \|^p \, ds \right)^{1/p} < \infty. \tag{2.4} \]

Let us show some basic embedding results about this space.
Proposition 2.3 (embedding). Let $p \in [1,\infty)$, $\alpha \in (0,1)$, and $q > p/\alpha$. Then we have that
\[ L^q(0,T;\mathcal{H}) \hookrightarrow L^p_\alpha(0,T;\mathcal{H}) \hookrightarrow L^p(0,T;\mathcal{H}). \]

Proof. The second embedding is immediate. For any $t \in (0,T]$,
\[ \int_0^t \| w(s) \|^p \, ds \le \sup_{s \in [0,t]} (t-s)^{1-\alpha} \int_0^t (t-s)^{\alpha-1} \| w(s) \|^p \, ds \le T^{1-\alpha} \| w \|^p_{L^p_\alpha(0,T;\mathcal{H})}, \]
where we used that $1 - \alpha > 0$. For the first embedding, Hölder's inequality with exponents $q/p$ and $q/(q-p)$ gives
\[ \left( \int_0^t (t-s)^{\alpha-1} \| w(s) \|^p \, ds \right)^{1/p} \le \left( \frac{q-p}{q\alpha - p} \right)^{(q-p)/(pq)} t^{\alpha/p - 1/q} \| w \|_{L^q(0,t;\mathcal{H})}, \]
and hence
\[ \| w \|_{L^p_\alpha(0,T;\mathcal{H})} \le \left( \frac{q-p}{q\alpha - p} \right)^{(q-p)/(pq)} T^{\alpha/p - 1/q} \| w \|_{L^q(0,T;\mathcal{H})}, \tag{2.5} \]
as we intended to show. □

When dealing with discretization we will approximate the right hand side $f$ of (1.2) by its local averages over a partition $\mathcal{P}$. Thus, we must provide a bound on this operation that is independent of the partition.

Lemma 2.4 (continuity of averaging). Let $p \in [1,\infty)$, $\alpha \in (0,1)$, $f \in L^p_\alpha(0,T;\mathcal{H})$, and $\mathcal{P}$ be a partition of $[0,T]$ as in (2.2). Define $\mathbf{F} = \{ \fint_{t_{n-1}}^{t_n} f(t) \, dt \}_{n=1}^N \subset \mathcal{H}^N$ and let $\mathbf{F}_{\mathcal{P}}$ be defined as in (2.3). Then, there exists a constant $C > 0$ only depending on $p$ and $\alpha$ such that
\[ \| \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})} \le C \| f \|_{L^p_\alpha(0,T;\mathcal{H})}. \]

Proof.
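Let $p \in (1,\infty)$. We first, for $n \in \{1, \ldots, N\}$, bound the integral
\[ \int_0^{t_n} (t_n - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds. \]
To achieve this, we decompose this integral as
\[ \int_0^{t_n} (t_n - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds = \sum_{k=1}^n \int_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds = \sum_{k=1}^n \| F_k \|^p \int_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \, ds. \tag{2.6} \]
We use Hölder's inequality in the definition of $F_k$ to obtain that
\[ \| F_k \|^p = \left\| \fint_{t_{k-1}}^{t_k} f(s) \, ds \right\|^p \le \fint_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \| f(s) \|^p \, ds \left( \fint_{t_{k-1}}^{t_k} (t_n - s)^{-\frac{\alpha-1}{p-1}} \, ds \right)^{p-1}. \tag{2.7} \]
Since, for every $p \in (1,\infty)$, the function $s \mapsto s^{\alpha-1}$ belongs to the Muckenhoupt class $A_p(\mathbb{R}_+)$, see [20, Example 7.1.7], there exists a constant $C_{p,\alpha}$ that only depends on $p$ and $\alpha$ such that
\[ \fint_a^b s^{\alpha-1} \, ds \left( \fint_a^b s^{-\frac{\alpha-1}{p-1}} \, ds \right)^{p-1} \le C_{p,\alpha}, \quad \forall\, 0 \le a < b. \]
Therefore, for any $k$, we have
\[ \fint_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \, ds \left[ \fint_{t_{k-1}}^{t_k} (t_n - s)^{-\frac{\alpha-1}{p-1}} \, ds \right]^{p-1} = \fint_{t_n - t_k}^{t_n - t_{k-1}} s^{\alpha-1} \, ds \left[ \fint_{t_n - t_k}^{t_n - t_{k-1}} s^{-\frac{\alpha-1}{p-1}} \, ds \right]^{p-1} \le C_{p,\alpha}. \tag{2.8} \]
Substituting (2.7) and (2.8) into (2.6) we get
\[ \int_0^{t_n} (t_n - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \le \sum_{k=1}^n C_{p,\alpha} \int_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \| f(s) \|^p \, ds = C_{p,\alpha} \int_0^{t_n} (t_n - s)^{\alpha-1} \| f(s) \|^p \, ds \le C_{p,\alpha} \| f \|^p_{L^p_\alpha(0,T;\mathcal{H})}. \]

Now consider $t \in [0,T]$. Taking advantage of the estimate we obtained above we write
\begin{align*}
\int_0^t (t-s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds
&= \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds + \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \\
&= \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds + \| \mathbf{F}_{\mathcal{P}}(\lceil t \rceil_{\mathcal{P}}) \|^p \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \, ds \\
&\le \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (\lfloor t \rfloor_{\mathcal{P}} - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds + \| \mathbf{F}_{\mathcal{P}}(\lceil t \rceil_{\mathcal{P}}) \|^p \int_{\lfloor t \rfloor_{\mathcal{P}}}^{\lceil t \rceil_{\mathcal{P}}} (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} \, ds \\
&\le C_{p,\alpha} \| f \|^p_{L^p_\alpha(0,T;\mathcal{H})} + \int_0^{\lceil t \rceil_{\mathcal{P}}} (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} \| \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \le 2 C_{p,\alpha} \| f \|^p_{L^p_\alpha(0,T;\mathcal{H})}. \tag{2.9}
\end{align*}
Therefore, by taking the supremum over $t \in [0,T]$ and setting $C = (2 C_{p,\alpha})^{1/p}$, we finish the proof of this lemma.

For $p = 1$, the proof proceeds almost the same way as before. The only difference worth noting is that, instead of (2.7), we have
\[ \| F_k \| = \left\| \fint_{t_{k-1}}^{t_k} f(s) \, ds \right\| \le \fint_{t_{k-1}}^{t_k} (t_n - s)^{\alpha-1} \| f(s) \| \, ds \ \sup_{s \in [t_{k-1}, t_k]} \frac{1}{(t_n - s)^{\alpha-1}}. \]
Next, we observe that, since $\alpha - 1 \in (-1, 0)$, the function $s \mapsto s^{\alpha-1}$ belongs to the Muckenhoupt class $A_1(\mathbb{R}_+)$. Thus,
\[ \sup_{s \in [a,b]} \frac{1}{s^{\alpha-1}} \fint_a^b s^{\alpha-1} \, ds \le C_\alpha, \quad \forall\, 0 \le a < b. \]
With this information, the proof proceeds without change. □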
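The Muckenhoupt-type bound behind (2.8) can be explored numerically, since both averages of the power weight have closed forms. The following is only a minimal sketch under our own naming (the functions `ap_product`, `ap_product_from_origin` and `sample_products` are illustrative and not part of the paper); it evaluates the left-hand side of the bound on a few intervals and compares it with its value on intervals of the form $[0,b]$, which does not depend on $b$ by scaling.

```python
def ap_product(a, b, alpha, p):
    """Left-hand side of the A_p-type bound for the weight s**(alpha - 1) on [a, b].

    Computes avg(s**(alpha-1)) * avg(s**(-(alpha-1)/(p-1)))**(p-1) in closed form,
    for 0 <= a < b, 0 < alpha < 1 and p > 1.
    """
    dual = (1.0 - alpha) / (p - 1.0)  # exponent of the dual weight s**dual
    avg_weight = (b**alpha - a**alpha) / (alpha * (b - a))
    avg_dual = (b**(dual + 1.0) - a**(dual + 1.0)) / ((dual + 1.0) * (b - a))
    return avg_weight * avg_dual ** (p - 1.0)


def ap_product_from_origin(alpha, p):
    """Value of ap_product on any interval [0, b]; independent of b by scaling."""
    dual = (1.0 - alpha) / (p - 1.0)
    return 1.0 / (alpha * (dual + 1.0) ** (p - 1.0))


def sample_products(alpha, p):
    """Evaluate the product on a small, arbitrary family of intervals."""
    values = []
    for a in (0.0, 1e-3, 1e-1, 1.0, 10.0):
        for width in (1e-2, 1.0, 50.0):
            values.append(ap_product(a, a + width, alpha, p))
    return values


if __name__ == "__main__":
    for alpha, p in ((0.3, 2.0), (0.5, 2.0), (0.8, 4.0)):
        largest = max(sample_products(alpha, p))
        print(alpha, p, largest, ap_product_from_origin(alpha, p))
```

By Hölder's inequality the product is never smaller than one; the sampled values only illustrate, and do not prove, that it stays bounded uniformly in the interval.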
It turns out that averaging is not only continuous, but possesses suitable approximation properties in this space. Namely, we have a control on the difference between fractional integrals of $f \in L^p_\alpha(0,T;\mathcal{H})$ and its averages.

Lemma 2.5 (approximation). Let $p \in [1,\infty)$, $\alpha \in (0,1)$, $f \in L^p_\alpha(0,T;\mathcal{H})$, and $\mathcal{P}$ be a partition of $[0,T]$ as in (2.2). Let $p'$ be the Hölder conjugate of $p$, $\mathbf{F} = \{ \fint_{t_{n-1}}^{t_n} f(t) \, dt \}_{n=1}^N \subset \mathcal{H}^N$, and let $\mathbf{F}_{\mathcal{P}}$ be defined as in (2.3). Then we have
\[ \sup_{t \in [0,T]} \left\| \int_0^t (t-s)^{\alpha-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds \right\| \le C \tau^{\alpha/p'} \| f - \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})} \le C' \tau^{\alpha/p'} \| f \|_{L^p_\alpha(0,T;\mathcal{H})}, \tag{2.10} \]
where the constants $C, C'$ depend only on $p$ and $\alpha$. In addition, for any $\beta \in (0,1)$ we also have
\[ \sup_{r \in [0,T]} \int_0^r (r-t)^{\alpha-1} \left\| \int_0^t (t-s)^{\beta-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds \right\|^p dt \le C \tau^{p\beta} \| f - \mathbf{F}_{\mathcal{P}} \|^p_{L^p_\alpha(0,T;\mathcal{H})} \le C' \tau^{p\beta} \| f \|^p_{L^p_\alpha(0,T;\mathcal{H})}, \tag{2.11} \]
where the constants $C, C'$ now depend on $p$, $\alpha$, and $\beta$. As usual, when $p = 1$, we have $p' = \infty$ and $1/p'$ is treated as $0$.
Proof. We first notice that the second inequalities in both (2.10) and (2.11) follow directly from Lemma 2.4 and the triangle inequality.

To show the first inequality in (2.10), given $\mathcal{P}$ we consider $t \in [0,T]$. Using that $f - \mathbf{F}_{\mathcal{P}}$ has zero mean on each subinterval of the partition, we can write
\begin{align*}
\int_0^t (t-s)^{\alpha-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds
&= \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds + \sum_{k=1}^{n(t)-1} \int_{t_{k-1}}^{t_k} (t-s)^{\alpha-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds \\
&= \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds + \sum_{k=1}^{n(t)-1} \int_{t_{k-1}}^{t_k} \big( (t-s)^{\alpha-1} - (t-t_{k-1})^{\alpha-1} \big) \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds \\
&= I_1(t) + I_2(t). \tag{2.12}
\end{align*}
For the first term, denoted $I_1(t)$, we have
\begin{align*}
\| I_1(t) \| &\le \left( \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \right)^{1/p} \left( \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\alpha-1} \, ds \right)^{1/p'} \\
&\le \| f - \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})} \left( \frac{1}{\alpha} (t - \lfloor t \rfloor_{\mathcal{P}})^\alpha \right)^{1/p'} \le C \tau^{\alpha/p'} \| f - \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})},
\end{align*}
where $C$ only depends on $p$ and $\alpha$. For the second term, noticing that $t - s \le t - t_{k-1} \le t - s + \tau$ for $s \in (t_{k-1}, t_k)$, we have
\begin{align*}
\| I_2(t) \| &\le \int_0^{\lfloor t \rfloor_{\mathcal{P}}} \big( (t-s)^{\alpha-1} - (t-s+\tau)^{\alpha-1} \big) \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \| \, ds \\
&\le \left[ \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \right]^{1/p} \left[ \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} \left( 1 - \left[ \frac{t-s+\tau}{t-s} \right]^{\alpha-1} \right)^{p'} ds \right]^{1/p'} \\
&\le \| f - \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})} \left( \int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} - (t-s+\tau)^{\alpha-1} \, ds \right)^{1/p'}.
\end{align*}
Since
\begin{align*}
\int_0^{\lfloor t \rfloor_{\mathcal{P}}} (t-s)^{\alpha-1} - (t-s+\tau)^{\alpha-1} \, ds &= \frac{1}{\alpha} \big( t^\alpha - (t - \lfloor t \rfloor_{\mathcal{P}})^\alpha - (t+\tau)^\alpha + (t - \lfloor t \rfloor_{\mathcal{P}} + \tau)^\alpha \big) \\
&\le \frac{1}{\alpha} \big( (t - \lfloor t \rfloor_{\mathcal{P}} + \tau)^\alpha - (t - \lfloor t \rfloor_{\mathcal{P}})^\alpha \big) \le \frac{\tau^\alpha}{\alpha},
\end{align*}
we obtain
\[ \| I_2(t) \| \le C \tau^{\alpha/p'} \| f - \mathbf{F}_{\mathcal{P}} \|_{L^p_\alpha(0,T;\mathcal{H})}, \]
and (2.10) follows after combining the bounds for $I_1(t)$ and $I_2(t)$ that we have obtained.

To prove (2.11) we apply the Hölder inequality to (2.12), with $\alpha$ replaced by $\beta$, to get
\[ \left\| \int_0^t (t-s)^{\beta-1} \big( f(s) - \mathbf{F}_{\mathcal{P}}(s) \big) \, ds \right\|^p \le \mathrm{II}_1(t)^{p-1} \cdot \big( \mathrm{II}_2(t) + \mathrm{II}_3(t) \big), \]
where
\begin{align*}
\mathrm{II}_1(t) &= \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\beta-1} \, ds + \sum_{k=1}^{n(t)-1} \int_{t_{k-1}}^{t_k} \big[ (t-s)^{\beta-1} - (t-t_{k-1})^{\beta-1} \big] \, ds, \\
\mathrm{II}_2(t) &= \int_{\lfloor t \rfloor_{\mathcal{P}}}^t (t-s)^{\beta-1} \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds, \\
\mathrm{II}_3(t) &= \sum_{k=1}^{n(t)-1} \int_{t_{k-1}}^{t_k} \big( (t-s)^{\beta-1} - (t-t_{k-1})^{\beta-1} \big) \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds.
\end{align*}
Arguing as in the bound for $I_2(t)$,
\[ \mathrm{II}_1(t) \le \frac{1}{\beta} (t - \lfloor t \rfloor_{\mathcal{P}})^\beta + \int_0^{\lfloor t \rfloor_{\mathcal{P}}} \big[ (t-s)^{\beta-1} - (t-s+\tau)^{\beta-1} \big] \, ds \le \frac{2}{\beta} \tau^\beta. \]
Thus, to obtain (2.11) it suffices to show that, for every $r \in [0,T]$,
\[ \int_0^r (r-t)^{\alpha-1} \big( \mathrm{II}_2(t) + \mathrm{II}_3(t) \big) \, dt \le C \tau^\beta \| f - \mathbf{F}_{\mathcal{P}} \|^p_{L^p_\alpha(0,T;\mathcal{H})} \]
with some constant $C$ only depending on $p$, $\alpha$, and $\beta$. To estimate the fractional integral of $\mathrm{II}_2$, by Fubini's theorem we have
\[ \int_0^r (r-t)^{\alpha-1} \mathrm{II}_2(t) \, dt = \int_0^r \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \int_s^{\lceil s \rceil_{\mathcal{P}} \wedge r} (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt \, ds, \tag{2.13} \]
where we set $a \wedge b = \min\{a, b\}$. We claim that there exists a constant $C$ depending on $\alpha$ and $\beta$ such that
\[ \int_s^{\lceil s \rceil_{\mathcal{P}} \wedge r} (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt \le C (r-s)^{\alpha-1} \tau^\beta. \tag{2.14} \]
On the one hand, for $r - s \le 2\tau$, we simply have
\[ \int_s^{\lceil s \rceil_{\mathcal{P}} \wedge r} (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt \le \int_s^r (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} (r-s)^{\alpha+\beta-1} \le \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} (r-s)^{\alpha-1} (2\tau)^\beta. \]
On the other hand, if $r - s > 2\tau$, then
\[ \int_s^{\lceil s \rceil_{\mathcal{P}} \wedge r} (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt \le \int_s^{s+\tau} (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt \le \int_s^{s+\tau} \left( \frac{r-s}{2} \right)^{\alpha-1} (t-s)^{\beta-1} \, dt = \frac{2^{1-\alpha}}{\beta} (r-s)^{\alpha-1} \tau^\beta. \]
Therefore (2.14) is proved, and together with (2.13) it implies that
\[ \int_0^r (r-t)^{\alpha-1} \mathrm{II}_2(t) \, dt \le C \tau^\beta \int_0^r (r-s)^{\alpha-1} \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \le C \tau^\beta \| f - \mathbf{F}_{\mathcal{P}} \|^p_{L^p_\alpha(0,T;\mathcal{H})}. \]

For $\mathrm{II}_3(t)$, we again apply Fubini's theorem, together with the bound $(t - t_{k-1})^{\beta-1} \ge (t-s+\tau)^{\beta-1}$ for $s \in (t_{k-1}, t_k)$, to obtain
\[ \int_0^r (r-t)^{\alpha-1} \mathrm{II}_3(t) \, dt \le \int_0^r \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \int_s^r (r-t)^{\alpha-1} \big( (t-s)^{\beta-1} - (t-s+\tau)^{\beta-1} \big) \, dt \, ds. \]
To conclude, we claim that
\[ A = \int_s^r (r-t)^{\alpha-1} \big( (t-s)^{\beta-1} - (t-s+\tau)^{\beta-1} \big) \, dt \le C \tau^\beta (r-s)^{\alpha-1}, \tag{2.15} \]
for a constant $C$ depending on $\alpha$ and $\beta$. Indeed, if this is the case, we have
\[ \int_0^r (r-t)^{\alpha-1} \mathrm{II}_3(t) \, dt \le C \tau^\beta \int_0^r (r-s)^{\alpha-1} \| f(s) - \mathbf{F}_{\mathcal{P}}(s) \|^p \, ds \le C \tau^\beta \| f - \mathbf{F}_{\mathcal{P}} \|^p_{L^p_\alpha(0,T;\mathcal{H})}, \]
and we combine the estimates for $\mathrm{II}_2(t)$ and $\mathrm{II}_3(t)$ to conclude the proof of (2.11).

Let us now turn to the proof of (2.15). First, if $r - s \le \tau$ then it suffices to observe that
\[ A \le \int_s^r (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} (r-s)^{\alpha+\beta-1} \le \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} \tau^\beta (r-s)^{\alpha-1}. \]
Now, if $r - s > \tau$, we estimate as
\begin{align*}
A &= \int_s^r (r-t)^{\alpha-1} (t-s)^{\beta-1} \, dt - \int_s^r (r-t)^{\alpha-1} (t-s+\tau)^{\beta-1} \, dt \\
&= \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} (r-s)^{\alpha+\beta-1} - \int_{-\tau}^{r-s} (t+\tau)^{\beta-1} (r-t-s)^{\alpha-1} \, dt + \int_0^\tau (r-s-t+\tau)^{\alpha-1} t^{\beta-1} \, dt \\
&= \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)} \big( (r-s)^{\alpha+\beta-1} - (r-s+\tau)^{\alpha+\beta-1} \big) + \int_0^\tau (r-s-t+\tau)^{\alpha-1} t^{\beta-1} \, dt.
\end{align*}
The first term can be bounded, using that $r - s > \tau$, as follows:
\[ (r-s)^{\alpha+\beta-1} - (r-s+\tau)^{\alpha+\beta-1} \le \max\{ 1 - \alpha - \beta, 0 \} \, \tau (r-s)^{\alpha+\beta-2} \le \tau^\beta (r-s)^{\alpha-1}. \]
On the other hand, since for $t \in (0,\tau)$ we have that $r - s + \tau - t \ge r - s$, the second term can be estimated as
\[ \int_0^\tau (r-s-t+\tau)^{\alpha-1} t^{\beta-1} \, dt \le (r-s)^{\alpha-1} \int_0^\tau t^{\beta-1} \, dt = \frac{1}{\beta} (r-s)^{\alpha-1} \tau^\beta. \]
This concludes the proof. □
We refer the reader to [27, Section 4] for further results concerning the space $L^p_\alpha(0,T;\mathcal{H})$.

2.3. The Caputo derivative.
As we mentioned in the Introduction, the definition of the Caputo derivative, given in (1.1), seems unnatural. Smoothness of higher order is needed to define a fractional derivative. Several attempts at resolving this discrepancy have been proposed in the literature and we here quickly describe a few of them.

First, one of the main reasons that motivate practitioners to use, among the many possible definitions, the Caputo derivative (1.1) is that it is compatible with classical initial conditions. In this spirit, one may extend the function by its initial value, that is, set $w(t) = w(0)$ for $t \le 0$. Therefore,
\begin{align*}
D_c^\alpha w(t) &= \frac{1}{\Gamma(1-\alpha)} \int_{-\infty}^t \frac{\dot{w}(r)}{(t-r)^\alpha} \, dr = \frac{1}{\Gamma(1-\alpha)} \int_{-\infty}^t \frac{\frac{d}{dr}\big( w(r) - w(t) \big)}{(t-r)^\alpha} \, dr \\
&= \frac{1}{\Gamma(-\alpha)} \int_{-\infty}^t \frac{w(r) - w(t)}{(t-r)^{\alpha+1}} \, dr = D_m^\alpha w(t), \tag{2.16}
\end{align*}
where, in the last step, we integrated by parts. The expression $D_m^\alpha w(t)$ is known as the Marchaud derivative of order $\alpha$ of the function $w$. This is the way that the Caputo derivative has been understood, for instance, in [6, 5, 7, 4]. We comment, in passing, that owing to [9] this fractional derivative satisfies an extension problem similar to the (by now) classical Caffarelli-Silvestre extension [10, 34] for the fractional Laplacian.
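For smooth functions, definition (1.1) can be evaluated directly by quadrature and compared with the classical power rule $D_c^\alpha t^\beta = \frac{\Gamma(\beta+1)}{\Gamma(\beta+1-\alpha)} t^{\beta-\alpha}$, $\beta > 0$. The sketch below is only an illustration under our own choices: the function name `caputo_derivative`, the change of variables, and the midpoint rule are assumptions and do not come from the paper.

```python
import math


def caputo_derivative(dw, t, alpha, n=2000):
    """Approximate D_c^alpha w(t) from (1.1), given the classical derivative dw of w.

    The substitution u = (t - r)**(1 - alpha) removes the weak singularity at r = t:
        int_0^t dw(r) (t - r)**(-alpha) dr
            = 1/(1 - alpha) * int_0^{t**(1 - alpha)} dw(t - u**(1/(1 - alpha))) du,
    and the right-hand side is approximated with a midpoint rule using n cells.
    """
    if t == 0.0:
        return 0.0
    upper = t ** (1.0 - alpha)
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h                       # midpoint of the k-th cell
        r = t - u ** (1.0 / (1.0 - alpha))      # back to the original variable
        total += dw(r)
    integral = total * h / (1.0 - alpha)
    return integral / math.gamma(1.0 - alpha)


if __name__ == "__main__":
    alpha, t = 0.5, 1.0
    approx = caputo_derivative(lambda r: 2.0 * r, t, alpha)       # w(t) = t**2
    exact = math.gamma(3.0) / math.gamma(3.0 - alpha) * t ** (2.0 - alpha)
    print(approx, exact)
```

The routine needs the classical derivative of $w$, which is exactly the requirement that the generalized definitions discussed in this section aim to remove.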
Another approach, and the one we shall adopt here, is to notice that (1.1) can be converted, for sufficiently smooth functions, into a Volterra type equation
\[ w(t) = w(0) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} D_c^\alpha w(s) \, ds, \quad \forall t \in [0,T]. \tag{2.17} \]
This identity is the beginning of the theory developed in [25] to extend the notion of Caputo derivative. To be more specific, [25] considers the set of distributions
\[ \mathscr{E}_T = \{ w \in \mathscr{D}'(\mathbb{R};\mathcal{H}) : \exists M_w \in (-\infty, T),\ \operatorname{supp}(w) \subset [-M_w, T) \} \]
for a fixed time $T > 0$. Then the modified Riemann-Liouville derivative for any distribution $w \in \mathscr{E}_T$ is defined, following classical references like [18, Section 1.5.5], as
\[ D_{rl}^\alpha w = w * g_{-\alpha} \in \mathscr{E}_T, \]
where $g_{-\alpha}(t) = \frac{1}{\Gamma(1-\alpha)} D\big( \theta(t) t^{-\alpha} \big)$, with $\theta$ being the Heaviside function, is a distribution supported in $[0,\infty)$ and the convolution is understood as the generalized definition between distributions. Here $D$ denotes the distributional derivative. Reference [25] then uses this to define the generalized Caputo derivative of $w \in L^1_{loc}([0,T);\mathcal{H})$ associated with $w_0$ by
\[ D_c^\alpha w = D_{rl}^\alpha (w - w_0). \]
If there exists $w(0) \in \mathcal{H}$ such that $\lim_{t \downarrow 0} \fint_0^t \| w(s) - w(0) \| \, ds = 0$, then we always impose $w_0 = w(0)$ in this definition. It is shown in [25, Theorem 3.7] that for such a function $w$, (2.17) holds for Lebesgue a.e. $t \in (0,T)$ provided that the generalized Caputo derivative $D_c^\alpha w \in L^1_{loc}([0,T);\mathcal{H})$.

We also comment that [25, Proposition 3.11(ii)] implies that for every function $w \in L^2(0,T;\mathcal{H})$ with $D_c^\alpha w \in L^2(0,T;\mathcal{H})$ we have
\[ \frac{1}{2} D_c^\alpha \| w \|^2 (t) \le \langle D_c^\alpha w(t), w(t) \rangle. \tag{2.18} \]

Finally, we recall that the Mittag-Leffler function of order $\alpha \in (0,1)$ is defined via
\[ E_\alpha(z) = \sum_{k=0}^\infty \frac{z^k}{\Gamma(\alpha k + 1)}. \]
We refer the reader to [19] for an extensive treatise on this function. Here we just mention that this function satisfies, for any $\lambda \in \mathbb{R}$, the identity
\[ D_c^\alpha E_\alpha(\lambda t^\alpha) = \lambda E_\alpha(\lambda t^\alpha), \qquad E_\alpha(0) = 1. \tag{2.19} \]

An auxiliary estimate.
Having defined the Caputo derivative of a function, we present an auxiliary result. Namely, an estimate on functions that have piecewise constant, over some partition $\mathcal{P}$, Caputo derivative.

Lemma 2.6 (continuity). Let $p \in [1,\infty)$; $\mathcal{P}$ be a partition, as in (2.2), of $[0,T]$; and $w \in L^1(0,T;\mathcal{H})$ be such that its generalized Caputo derivative $D_c^\alpha w \in L^p_\alpha(0,T;\mathcal{H})$, and it is piecewise constant over $\mathcal{P}$. Then we have
\[ \sup_{r \in [0,T]} \int_0^r (r-t)^{\alpha-1} \| w(\lceil t \rceil_{\mathcal{P}}) - w(t) \|^p \, dt \le C \tau^{p\alpha} \| D_c^\alpha w \|^p_{L^p_\alpha(0,T;\mathcal{H})}, \tag{2.20} \]
where the constant $C$ depends only on $\alpha$.

Proof. The representation (2.17) allows us to write
\[ w(\lceil t \rceil_{\mathcal{P}}) - w(t) = \frac{1}{\Gamma(\alpha)} \left[ \int_0^t D_c^\alpha w(s) \big( (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big) \, ds + \int_t^{\lceil t \rceil_{\mathcal{P}}} D_c^\alpha w(s) (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} \, ds \right]. \]
Therefore, by the Hölder inequality, we have
\begin{align*}
\| w(\lceil t \rceil_{\mathcal{P}}) - w(t) \|^p
&\le \frac{1}{\Gamma^p(\alpha)} \left( \int_0^t \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, ds + \int_t^{\lceil t \rceil_{\mathcal{P}}} (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} \, ds \right)^{p-1} \\
&\quad \times \left( \int_0^t \| D_c^\alpha w(s) \|^p \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, ds + \int_t^{\lceil t \rceil_{\mathcal{P}}} \| D_c^\alpha w(s) \|^p (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} \, ds \right) \\
&\le C \tau^{\alpha(p-1)} \left( \int_0^t \| D_c^\alpha w(s) \|^p \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, ds + \frac{(\lceil t \rceil_{\mathcal{P}} - t)^\alpha}{\alpha} \| D_c^\alpha w(t) \|^p \right) \\
&\le C_1 \tau^{\alpha(p-1)} \int_0^t \| D_c^\alpha w(s) \|^p \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, ds + C_2 \tau^{p\alpha} \| D_c^\alpha w(t) \|^p = I_1(t) + I_2(t),
\end{align*}
where the constants $C$, $C_1$, and $C_2$ depend only on $p$ and $\alpha$.

For $I_2(t)$, we simply have
\[ \int_0^r (r-t)^{\alpha-1} I_2(t) \, dt \le C \tau^{p\alpha} \| D_c^\alpha w \|^p_{L^p_\alpha(0,T;\mathcal{H})}. \]
Now, to bound the integral for $I_1(t)$, we use Fubini's theorem to get
\[ \int_0^r (r-t)^{\alpha-1} I_1(t) \, dt = C_1 \tau^{(p-1)\alpha} \int_0^r \| D_c^\alpha w(s) \|^p \int_s^r (r-t)^{\alpha-1} \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, dt \, ds. \]
We claim that
\[ \int_s^r (r-t)^{\alpha-1} \big| (\lceil t \rceil_{\mathcal{P}} - s)^{\alpha-1} - (t-s)^{\alpha-1} \big| \, dt \le C (r-s)^{\alpha-1} \tau^\alpha, \tag{2.21} \]
where $C$ only depends on $\alpha$. If this is true, then we have
\[ \int_0^r (r-t)^{\alpha-1} I_1(t) \, dt \le C \tau^{p\alpha} \int_0^r \| D_c^\alpha w(s) \|^p (r-s)^{\alpha-1} \, ds \le C \tau^{p\alpha} \| D_c^\alpha w \|^p_{L^p_\alpha(0,T;\mathcal{H})}. \]
The proof of (2.21) proceeds as the one for (2.15). For brevity we skip the details. □
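The Mittag-Leffler function recalled before this auxiliary estimate is easy to evaluate for moderate arguments by truncating its defining series. The following minimal sketch (function names, truncation rule, and the quadrature are our own assumptions) also checks numerically the Volterra form of (2.19) obtained from (2.17), namely $E_\alpha(\lambda t^\alpha) = 1 + \frac{\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} E_\alpha(\lambda s^\alpha) \, ds$.

```python
import math


def mittag_leffler(alpha, z, tol=1e-15):
    """Partial sum of E_alpha(z) = sum_k z**k / Gamma(alpha*k + 1).

    Only meant for moderate |z|: for large negative z the series suffers from
    cancellation, and math.gamma overflows for arguments above roughly 171.
    """
    total = 0.0
    k = 0
    while alpha * k + 1.0 < 170.0:
        term = z ** k / math.gamma(alpha * k + 1.0)
        total += term
        if k > 0 and abs(term) < tol * abs(total):
            break
        k += 1
    return total


def volterra_residual(alpha, lam, t, n=1000):
    """|E(lam t^alpha) - 1 - lam/Gamma(alpha) int_0^t (t-s)^(alpha-1) E(lam s^alpha) ds|.

    The substitution u = (t - s)**alpha turns the weakly singular integral into
    (1/alpha) int_0^{t**alpha} E(lam (t - u**(1/alpha))**alpha) du (midpoint rule).
    """
    upper = t ** alpha
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        s = t - u ** (1.0 / alpha)
        total += mittag_leffler(alpha, lam * s ** alpha)
    integral = total * h / alpha
    lhs = mittag_leffler(alpha, lam * t ** alpha)
    return abs(lhs - 1.0 - lam / math.gamma(alpha) * integral)


if __name__ == "__main__":
    print(mittag_leffler(1.0, 1.0), math.e)          # E_1 is the exponential
    print(volterra_residual(0.5, 1.0, 1.0))          # small, up to quadrature error
```

For $\alpha = 1/2$ the classical closed form $E_{1/2}(z) = e^{z^2} \operatorname{erfc}(-z)$ gives an independent check of the series.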
Some comparison estimates.
As a final preparatory step we present some auxiliary results that shall be repeatedly used and are related to differential inequalities involving the Caputo derivative, and a Grönwall-like lemma.

First, we present a comparison principle which is similar to [17, Proposition 4.2]. The proof can be done easily by contradiction, and therefore it is omitted here.
Lemma 2.7 (comparison). Let $g_1, g_2 : [0,T] \times \mathbb{R} \to \mathbb{R}$ be both nondecreasing in their second argument and $g_2$ be measurable. Assume that $v, w \in C([0,T];\mathbb{R})$ satisfy $v(0) < w(0)$, and there is some $\alpha \in (0,1)$ for which
\begin{align*}
v(t) &\le g_1(t, v(t)) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} g_2(s, v(s)) \, ds, \\
w(t) &> g_1(t, w(t)) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} g_2(s, w(s)) \, ds,
\end{align*}
for every $t \in [0,T]$. Then we have $v < w$ on $[0,T]$.

We now present a result that can be interpreted as an extension of [30, Lemma 3.7] to the fractional case. However, unlike the classical case, here we have the restriction that $\lambda \ge 0$.

Lemma 2.8 (fractional Grönwall). Let $a \in C([0,T];\mathbb{R})$ with $D_c^\alpha a^2 \in L^1_{loc}([0,T);\mathbb{R})$, $b, c, d : [0,T] \to [0,+\infty]$ be measurable functions, and $\lambda \ge 0$. If the following differential inequality is satisfied
\[ D_c^\alpha a^2(t) + b(t) \le 2\lambda a^2(t) + c(t) + 2 d(t) a(t), \quad \text{a.e. } t \in (0,T), \tag{2.22} \]
then we have
\[ \left( \sup_{t \in [0,T]} a^2(t) + \frac{1}{\Gamma(\alpha)} \| b \|_{L^1_\alpha(0,T;\mathbb{R})} \right)^{1/2} \le 2 \widetilde{D}(T) E_\alpha(2\lambda T^\alpha) + \sqrt{a^2(0) + \widetilde{C}(T)} \, \sqrt{E_\alpha(2\lambda T^\alpha)}, \]
where
\[ \widetilde{C}(t) = \frac{1}{\Gamma(\alpha)} \| c \|_{L^1_\alpha(0,t;\mathbb{R})}, \qquad \widetilde{D}(t) = \frac{1}{\Gamma(\alpha)} \| d \|_{L^1_\alpha(0,t;\mathbb{R})}. \tag{2.23} \]

Proof.
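From (2.22) we obtain that
\begin{align*}
a^2(t) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} b(s) \, ds
&\le a^2(0) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \big[ c(s) + 2 d(s) a(s) + 2\lambda a^2(s) \big] \, ds \\
&\le a^2(0) + \widetilde{C}(t) + 2 \widetilde{a}(t) \widetilde{D}(t) + \frac{2\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \widetilde{a}^2(s) \, ds, \tag{2.24}
\end{align*}
where $\widetilde{a}(t) = \max_{0 \le s \le t} a(s)$ and the functions $\widetilde{C}, \widetilde{D}$ are defined in (2.23). This immediately implies that
\[ \widetilde{a}^2(t) \le a^2(0) + \widetilde{C}(t) + 2 \widetilde{a}(t) \widetilde{D}(t) + \frac{2\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \widetilde{a}^2(s) \, ds. \]
In order to bound $\widetilde{a}$, we construct a barrier function $e(t) = K \sqrt{E_\alpha(2\lambda t^\alpha)}$, where the constant $K$ is chosen so that
\[ e^2(t) > a^2(0) + \widetilde{C}(t) + 2 e(t) \widetilde{D}(t) + \frac{2\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} e^2(s) \, ds, \quad \forall t \in (0,T). \]
Indeed, owing to (2.19) we see that
\[ \frac{2\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} E_\alpha(2\lambda s^\alpha) \, ds = E_\alpha(2\lambda t^\alpha) - E_\alpha(0) = E_\alpha(2\lambda t^\alpha) - 1, \]
and therefore
\begin{align*}
a^2(0) + \widetilde{C}(t) + 2 e(t) \widetilde{D}(t) + \frac{2\lambda}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} e^2(s) \, ds
&= a^2(0) + \widetilde{C}(t) + 2 K \sqrt{E_\alpha(2\lambda t^\alpha)} \, \widetilde{D}(t) + K^2 \big( E_\alpha(2\lambda t^\alpha) - 1 \big) \\
&< K^2 E_\alpha(2\lambda t^\alpha) = e^2(t),
\end{align*}
for every $t \in (0,T)$, provided that
\[ K > \widetilde{D}(T) \sqrt{E_\alpha(2\lambda T^\alpha)} + \sqrt{a^2(0) + \widetilde{C}(T) + \widetilde{D}^2(T) E_\alpha(2\lambda T^\alpha)}. \tag{2.25} \]
Applying Lemma 2.7 we obtain that
\[ \widetilde{a}(t) \le e(t) = K \sqrt{E_\alpha(2\lambda t^\alpha)}. \]
Plugging this back into (2.24) and noticing that this holds for any $K$ satisfying (2.25) we obtain that
\begin{align*}
\sup_{t \in [0,T]} a^2(t) + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} b(s) \, ds
&\le \left( \widetilde{D}(T) \sqrt{E_\alpha(2\lambda T^\alpha)} + \sqrt{a^2(0) + \widetilde{C}(T) + \widetilde{D}^2(T) E_\alpha(2\lambda T^\alpha)} \right)^2 E_\alpha(2\lambda T^\alpha) \\
&\le \left( 2 \widetilde{D}(T) E_\alpha(2\lambda T^\alpha) + \sqrt{a^2(0) + \widetilde{C}(T)} \, \sqrt{E_\alpha(2\lambda T^\alpha)} \right)^2,
\end{align*}
which is the desired result. □

3. Deconvolutional discretization of the Caputo derivative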
To discretize the Caputo fractional derivative, references [26, 28] consider a so-called deconvolutional scheme on uniform time grids and prove some properties of this discretization. In this section, we generalize this deconvolutional scheme to the variable time step setting, and prove properties that will be useful in deriving a posteriori error estimates later, in Section 5.2.

3.1. The discrete Caputo derivative.
Let $\mathcal{P}$ be a partition as in (2.2). To motivate this discretization, let us assume that $w : [0,T] \to \mathcal{H}$ is such that $D_c^\alpha w(t)$ is piecewise constant on the partition $\mathcal{P}$, with $D_c^\alpha w(t) = V_{n(t)}$. Then formally by (2.17), we have
\[ w(t_n) = w(0) + \frac{1}{\Gamma(\alpha)} \int_0^{t_n} (t_n - s)^{\alpha-1} D_c^\alpha w(s) \, ds = w(0) + \frac{1}{\Gamma(\alpha+1)} \sum_{i=1}^n \big( (t_n - t_{i-1})^\alpha - (t_n - t_i)^\alpha \big) V_i, \quad n \in \{1, \ldots, N\}. \tag{3.1} \]
Let $\mathbf{K}_{\mathcal{P}} \in \mathbb{R}^{N \times N}$ be the matrix induced by the partition $\mathcal{P}$, which is defined as
\[ \mathbf{K}_{\mathcal{P},ni} = \begin{cases} \dfrac{1}{\Gamma(\alpha+1)} \big( (t_n - t_{i-1})^\alpha - (t_n - t_i)^\alpha \big), & 1 \le i \le n \le N, \\ 0, & 1 \le n < i \le N. \end{cases} \tag{3.2} \]
Then we can rewrite (3.1) in matrix form as
\[ \mathbf{W} = \mathbf{W}_0 + \mathbf{K}_{\mathcal{P}} \mathbf{V}, \]
where $\mathbf{V}, \mathbf{W}, \mathbf{W}_0 \in \mathcal{H}^N$ with $\mathbf{V}_n = V_n$, $\mathbf{W}_n = w(t_n)$, and $(\mathbf{W}_0)_n = w(0)$. Notice that $\mathbf{K}_{\mathcal{P}}$ is lower triangular and all the elements on and below the main diagonal are positive. Therefore $\mathbf{K}_{\mathcal{P}}$ is invertible and its inverse is also lower triangular. Thus, the previous identity is equivalent to
\[ \mathbf{V} = \mathbf{K}_{\mathcal{P}}^{-1} (\mathbf{W} - \mathbf{W}_0), \]
in other words
\[ V_n = \sum_{i=1}^n \mathbf{K}^{-1}_{\mathcal{P},ni} (W_i - W_0) = \mathbf{K}^{-1}_{\mathcal{P},n0} W_0 + \sum_{i=1}^n \mathbf{K}^{-1}_{\mathcal{P},ni} W_i, \]
where we set $\mathbf{K}^{-1}_{\mathcal{P},n0} = -\sum_{j=1}^n \mathbf{K}^{-1}_{\mathcal{P},nj}$. This motivates the following approximation of the Caputo derivative provided $\mathbf{W} \in \mathcal{H}^N$ and $W_0 \in \mathcal{H}$ are given. For $n \in \{1, \ldots, N\}$ we set
\[ (D_{\mathcal{P}}^\alpha \mathbf{W})_n = \sum_{i=1}^n \mathbf{K}^{-1}_{\mathcal{P},ni} (W_i - W_0) = \sum_{i=0}^n \mathbf{K}^{-1}_{\mathcal{P},ni} W_i = \sum_{i=0}^{n-1} \mathbf{K}^{-1}_{\mathcal{P},ni} (W_i - W_n). \tag{3.3} \]

3.2. Properties of $\mathbf{K}_{\mathcal{P}}^{-1}$.
We note that, when the partition is uniform, both $\mathbf{K}_{\mathcal{P}}$ and its inverse will be Toeplitz matrices, and hence the product $\mathbf{K}_{\mathcal{P}} \mathbf{V}$ can be interpreted as the convolution of sequences. Consequently, multiplication by $\mathbf{K}_{\mathcal{P}}^{-1}$ is equivalent to taking a sequence deconvolution. This motivates the name of this scheme and enables [28] to apply techniques for the deconvolution of a completely monotone sequence and prove properties of $\mathbf{K}_{\mathcal{P}}^{-1}$.

We were not successful in extending, to a general partition $\mathcal{P}$, all the properties of $\mathbf{K}_{\mathcal{P}}^{-1}$ presented in [28] for the case when the partition is uniform.
This is mainly because their techniques are basedon ideas that rely on completely monotone sequences, which do not easily extend to a general P .Nevertheless we have obtained sufficient, for our purposes, properties. The following result is thecounterpart to [28, Proposition 3.2(1)]. IME FRACTIONAL GRADIENT FLOW 13
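As a purely illustrative aside, the construction in (3.2) and (3.3) can be sketched in a few lines of Python for the scalar case $\mathcal{H} = \mathbb{R}$. The function names `caputo_matrix` and `discrete_caputo` are hypothetical and not part of the paper, and the inverse is never formed: the lower triangular system $K_{\mathcal{P}} \mathbf{V} = \mathbf{W} - \mathbf{W}_0$ is solved by forward substitution. The sample function $w(t) = 1 + t^\alpha / \Gamma(\alpha+1)$ has Caputo derivative identically equal to one, so it is a convenient sanity check on a nonuniform grid.

```python
import math

def caputo_matrix(t, alpha):
    """Matrix K_P of (3.2) for a grid t[0] = 0 < t[1] < ... < t[N].

    Row n-1, column i-1 of the returned nested list holds K_{P,ni}.
    """
    N = len(t) - 1
    c = 1.0 / math.gamma(alpha + 1.0)
    K = [[0.0] * N for _ in range(N)]
    for n in range(1, N + 1):
        for i in range(1, n + 1):
            K[n - 1][i - 1] = c * ((t[n] - t[i - 1]) ** alpha - (t[n] - t[i]) ** alpha)
    return K

def discrete_caputo(t, W, alpha):
    """Discrete Caputo derivative (3.3): solve K_P V = W - W_0 by forward substitution.

    W has length N+1 (W[0] is the initial value); the result has length N.
    """
    K = caputo_matrix(t, alpha)
    V = []
    for n in range(len(K)):
        known = sum(K[n][i] * V[i] for i in range(n))
        V.append((W[n + 1] - W[0] - known) / K[n][n])
    return V

# Illustrative data (an assumption, not taken from the paper): a nonuniform grid
# and nodal values of w(t) = 1 + t^alpha / Gamma(alpha + 1), whose Caputo
# derivative of order alpha is the constant 1.
alpha = 0.6
t = [0.0, 0.1, 0.15, 0.4, 0.7, 1.0]
W = [1.0 + ti ** alpha / math.gamma(alpha + 1.0) for ti in t]
V = discrete_caputo(t, W, alpha)
print(V)
```

Since the scheme is built to be exact whenever the Caputo derivative is piecewise constant on $\mathcal{P}$, the computed values should agree with one up to rounding, regardless of how the grid points are placed.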
Proposition 3.1 (properties of $K_{\mathcal{P}}^{-1}$). Let $\mathcal{P}$ be a partition as in (2.2), and $K_{\mathcal{P}}$ be defined in (3.2). The matrix $K_{\mathcal{P}}$ is invertible, and its inverse satisfies:
\[
K^{-1}_{\mathcal{P},n0} = - \sum_{j=1}^{n} K^{-1}_{\mathcal{P},nj} < 0, \quad n \in \{1, \ldots, N\}, \tag{3.4}
\]
\[
K^{-1}_{\mathcal{P},ii} > 0, \quad i \in \{1, \ldots, N\}, \qquad K^{-1}_{\mathcal{P},ni} < 0, \quad 1 \le i < n \le N. \tag{3.5}
\]

Proof. We already showed that $K_{\mathcal{P}}$ is nonsingular. We prove (3.4) and (3.5) separately.

First, we prove that $K^{-1}_{\mathcal{P},n0} < 0$. For this, it suffices to show that for the vector $\mathbf{W} \in \mathbb{R}^N$ such that $W_i = 1$ for any $i \ge 1$, the vector $\mathbf{F} = K_{\mathcal{P}}^{-1} \mathbf{W}$ satisfies
\[
F_n > 0 \quad \forall n \ge 1.
\]
We prove this by induction on $n$. For $n = 1$, clearly
\[
F_1 = \frac{W_1}{K_{\mathcal{P},11}} = \frac{1}{K_{\mathcal{P},11}} > 0.
\]
Suppose that $F_j > 0$ for $1 \le j \le k$; now we want to show that $F_{k+1} > 0$. Since
\[
1 = W_k = \sum_{j=1}^{k} K_{\mathcal{P},kj} F_j, \qquad 1 = W_{k+1} = \sum_{j=1}^{k+1} K_{\mathcal{P},k+1,j} F_j,
\]
taking the difference we have
\[
0 = \sum_{j=1}^{k+1} K_{\mathcal{P},k+1,j} F_j - \sum_{j=1}^{k} K_{\mathcal{P},kj} F_j = K_{\mathcal{P},k+1,k+1} F_{k+1} + \sum_{j=1}^{k} ( K_{\mathcal{P},k+1,j} - K_{\mathcal{P},kj} ) F_j. \tag{3.6}
\]
We claim that $K_{\mathcal{P},k+1,j} - K_{\mathcal{P},kj} < 0$ for all $j \le k$. In fact, this can be seen through the definition of the entries of $K_{\mathcal{P}}$:
\[
K_{\mathcal{P},k+1,j} - K_{\mathcal{P},kj} < 0
\iff (t_{k+1} - t_{j-1})^\alpha - (t_{k+1} - t_j)^\alpha < (t_k - t_{j-1})^\alpha - (t_k - t_j)^\alpha
\iff \int_0^{t_j - t_{j-1}} (t_{k+1} - t_j + s)^{\alpha-1}\, \mathrm{d}s < \int_0^{t_j - t_{j-1}} (t_k - t_j + s)^{\alpha-1}\, \mathrm{d}s,
\]
and the last inequality holds because $\alpha - 1 < 0$. Using $K_{\mathcal{P},k+1,j} - K_{\mathcal{P},kj} < 0$ and $F_j > 0$ for $j \in \{1, \ldots, k\}$ in (3.6), we see that $K_{\mathcal{P},k+1,k+1} F_{k+1} > 0$, that is, $F_{k+1} > 0$. Therefore by induction we proved that $K^{-1}_{\mathcal{P},n0} < 0$ for $n \ge 1$.

Next, we prove that $K^{-1}_{\mathcal{P},ii} > 0$ and $K^{-1}_{\mathcal{P},ni} < 0$. Consider a vector $\mathbf{W} \in \mathbb{R}^N$ that is such that $W_i = 1$ and $W_j = 0$ for $j \ne i$. It suffices to prove that, for $\mathbf{F} = K_{\mathcal{P}}^{-1} \mathbf{W}$, we have $F_i > 0$ and, for $n > i$,
\[
F_n < 0. \tag{3.7}
\]
Since $K_{\mathcal{P}}^{-1}$ is lower triangular, we know $F_j = 0$ for $j \in \{1, \ldots, i-1\}$. From $K_{\mathcal{P}} \mathbf{F} = \mathbf{W}$, we see that
\[
1 = W_i = ( K_{\mathcal{P}} \mathbf{F} )_i = \sum_{j=1}^{i} K_{\mathcal{P},ij} F_j = K_{\mathcal{P},ii} F_i
\]
and thus $F_i = 1 / K_{\mathcal{P},ii} > 0$. Now we prove by induction that (3.7) holds. First, when $n = i+1$, we have
\[
0 = W_{i+1} = ( K_{\mathcal{P}} \mathbf{F} )_{i+1} = K_{\mathcal{P},i+1,i} F_i + K_{\mathcal{P},i+1,i+1} F_{i+1}
\]
and hence
\[
F_{i+1} = - \frac{K_{\mathcal{P},i+1,i} F_i}{K_{\mathcal{P},i+1,i+1}} < 0.
\]
This shows that (3.7) is true for $n = i+1$. Now suppose that we have already shown that $F_n < 0$ for all $n \in \{i+1, \ldots, k\}$; we want to prove $F_{k+1} < 0$. To this aim, notice that
\[
0 = W_{k+1} = ( K_{\mathcal{P}} \mathbf{F} )_{k+1} = \sum_{j=i}^{k} K_{\mathcal{P},k+1,j} F_j + K_{\mathcal{P},k+1,k+1} F_{k+1},
\]
therefore we only need to show $\sum_{j=i}^{k} K_{\mathcal{P},k+1,j} F_j > 0$. Recall that
\[
0 = W_k = ( K_{\mathcal{P}} \mathbf{F} )_k = \sum_{j=i}^{k} K_{\mathcal{P},kj} F_j,
\]
and thus, since $K_{\mathcal{P},ki} > 0$, we can get
\[
\sum_{j=i}^{k} K_{\mathcal{P},k+1,j} F_j = \sum_{j=i}^{k} K_{\mathcal{P},k+1,j} F_j - \frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}} \sum_{j=i}^{k} K_{\mathcal{P},kj} F_j
= \sum_{j=i+1}^{k} \left( K_{\mathcal{P},k+1,j} - \frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}} K_{\mathcal{P},kj} \right) F_j.
\]
Since by the induction hypothesis $F_j < 0$ for $j \in \{i+1, \ldots, k\}$, it only remains to show that
\[
K_{\mathcal{P},k+1,j} - \frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}} K_{\mathcal{P},kj} < 0 \iff \frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}} > \frac{K_{\mathcal{P},k+1,j}}{K_{\mathcal{P},kj}}.
\]
Applying Cauchy's mean value theorem, there exists $\eta \in ( t_k - t_i, t_k - t_{i-1} )$ such that
\[
\frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}} = \frac{(t_{k+1} - t_{i-1})^\alpha - (t_{k+1} - t_i)^\alpha}{(t_k - t_{i-1})^\alpha - (t_k - t_i)^\alpha} = \frac{\alpha (\eta + \tau_{k+1})^{\alpha-1}}{\alpha \eta^{\alpha-1}} = \left( \frac{\eta + \tau_{k+1}}{\eta} \right)^{\alpha-1}.
\]
Similarly there exists $\xi \in ( t_k - t_j, t_k - t_{j-1} )$ such that
\[
\frac{K_{\mathcal{P},k+1,j}}{K_{\mathcal{P},kj}} = \left( \frac{\xi + \tau_{k+1}}{\xi} \right)^{\alpha-1}.
\]
Due to $j > i$, we have $\xi < \eta$ and hence
\[
\frac{K_{\mathcal{P},k+1,j}}{K_{\mathcal{P},kj}} = \left( \frac{\xi + \tau_{k+1}}{\xi} \right)^{\alpha-1} < \left( \frac{\eta + \tau_{k+1}}{\eta} \right)^{\alpha-1} = \frac{K_{\mathcal{P},k+1,i}}{K_{\mathcal{P},ki}}.
\]
Therefore from the arguments above we see that $F_{k+1} < 0$, and by induction $K^{-1}_{\mathcal{P},ni} < 0$ for $n > i$. $\square$

Remark 3.2 (generalization). The discretization of the Caputo derivative, described in (3.3), and its properties presented in Proposition 3.1 can be extended to more general kernels. Indeed, for a general convolutional kernel $g \in L^1(0,T;\mathbb{R})$ the entries of the matrix $K_{\mathcal{P}}$ will be
\[
K_{\mathcal{P},ni} = \int_{t_n - t_i}^{t_n - t_{i-1}} g(t)\, \mathrm{d}t.
\]
The proof of (3.4) follows verbatim provided $g'(t) < 0$, as the reader can readily verify. The proof of (3.5) only requires that the function $G(t) = \ln( g(t) )$ satisfies $G''(t) > 0$.

For a uniform time grid $\mathcal{P}$, [26, Theorem 2.3] proves that, for every $i$, the sequence $\{ - K^{-1}_{\mathcal{P},n+i,i} \}_{n \ge 1}$ is completely monotone. The following result holds for a general partition $\mathcal{P}$, and is a direct consequence of [26, Theorem 2.3] for uniform time stepping.

Proposition 3.3 (monotonicity). Let $\mathcal{P}$ be a partition of $[0,T]$ as in (2.2), and $K_{\mathcal{P}}$ be defined as in (3.2). Then, its inverse satisfies:
1. For $n \in \{1, \ldots, N-1\}$,
\[
- \sum_{j=1}^{n} K^{-1}_{\mathcal{P},nj} = K^{-1}_{\mathcal{P},n0} < K^{-1}_{\mathcal{P},n+1,0} = - \sum_{j=1}^{n+1} K^{-1}_{\mathcal{P},n+1,j}. \tag{3.8}
\]
2. For $1 \le i < n < N$,
\[
K^{-1}_{\mathcal{P},ni} < K^{-1}_{\mathcal{P},n+1,i}. \tag{3.9}
\]

Proof.
To prove (3.8) it suffices to show that for the vector $\mathbf{W} \in \mathbb{R}^N$ such that $W_i = 1$ for any $i \ge 1$, the vector $\mathbf{F} = K_{\mathcal{P}}^{-1} \mathbf{W}$ satisfies
\[
F_n > F_{n+1} \quad \forall n \ge 1.
\]
We prove this by induction on $n$. For $n = 1$,
\[
1 = W_1 = ( K_{\mathcal{P}} \mathbf{F} )_1 = K_{\mathcal{P},11} F_1, \qquad
1 = W_2 = ( K_{\mathcal{P}} \mathbf{F} )_2 = K_{\mathcal{P},21} F_1 + K_{\mathcal{P},22} F_2 = ( K_{\mathcal{P},21} + K_{\mathcal{P},22} ) F_1 + K_{\mathcal{P},22} ( F_2 - F_1 ).
\]
Clearly, $F_1 > 0$ and
\[
K_{\mathcal{P},11} = \frac{(t_1 - t_0)^\alpha}{\Gamma(\alpha+1)} < \frac{(t_2 - t_0)^\alpha}{\Gamma(\alpha+1)} = K_{\mathcal{P},21} + K_{\mathcal{P},22}.
\]
Hence we have
\[
K_{\mathcal{P},22} ( F_2 - F_1 ) = 1 - ( K_{\mathcal{P},21} + K_{\mathcal{P},22} ) F_1 < 1 - K_{\mathcal{P},11} F_1 = 0,
\]
which, since $K_{\mathcal{P},22} > 0$, implies that $F_2 - F_1 < 0$, i.e., $F_1 > F_2$. So the claim holds for $n = 1$.

Suppose $F_{j+1} < F_j$ for all $1 \le j < k$; now we want to show that $F_{k+1} < F_k$ as well. In what follows we drop the common factor $1/\Gamma(\alpha+1)$ in the entries of $K_{\mathcal{P}}$; this amounts to a positive rescaling of $\mathbf{F}$ and does not affect any of the signs involved. Notice that
\[
1 = W_k = \sum_{i=1}^{k} K_{\mathcal{P},ki} F_i = \sum_{i=0}^{k-1} \sum_{j=i+1}^{k} K_{\mathcal{P},kj} ( F_{i+1} - F_i ) = \sum_{i=0}^{k-1} (t_k - t_i)^\alpha ( F_{i+1} - F_i ),
\]
\[
1 = W_{k+1} = \sum_{i=1}^{k+1} K_{\mathcal{P},k+1,i} F_i = \sum_{i=0}^{k} (t_{k+1} - t_i)^\alpha ( F_{i+1} - F_i ),
\]
where we set $F_0 = 0$ in the equations above. Therefore to show $F_{k+1} < F_k$, we only need to prove that
\[
0 < \sum_{i=0}^{k-1} (t_{k+1} - t_i)^\alpha ( F_{i+1} - F_i ) - 1
= \sum_{i=0}^{k-1} (t_{k+1} - t_i)^\alpha ( F_{i+1} - F_i ) - \sum_{i=0}^{k-1} (t_k - t_i)^\alpha ( F_{i+1} - F_i )
= \sum_{i=0}^{k-1} \big( (t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha \big) ( F_{i+1} - F_i ). \tag{3.10}
\]
Since we also have
\[
1 = W_{k-1} = \sum_{i=1}^{k-1} K_{\mathcal{P},k-1,i} F_i = \sum_{i=0}^{k-2} (t_{k-1} - t_i)^\alpha ( F_{i+1} - F_i ) = \sum_{i=0}^{k-1} (t_{k-1} - t_i)^\alpha ( F_{i+1} - F_i ),
\]
taking the difference between the equation above and the one for $W_k$, we obtain that
\[
0 = W_k - W_{k-1} = \sum_{i=0}^{k-1} (t_k - t_i)^\alpha ( F_{i+1} - F_i ) - \sum_{i=0}^{k-1} (t_{k-1} - t_i)^\alpha ( F_{i+1} - F_i )
= \sum_{i=0}^{k-1} \big( (t_k - t_i)^\alpha - (t_{k-1} - t_i)^\alpha \big) ( F_{i+1} - F_i ).
\]
In light of this identity, we claim that to obtain (3.10) it suffices to show that
\[
\frac{t_{k+1}^\alpha - t_k^\alpha}{t_k^\alpha - t_{k-1}^\alpha} = \frac{(t_{k+1} - t_0)^\alpha - (t_k - t_0)^\alpha}{(t_k - t_0)^\alpha - (t_{k-1} - t_0)^\alpha} > \frac{(t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha}{(t_k - t_i)^\alpha - (t_{k-1} - t_i)^\alpha}, \quad i \in \{1, \ldots, k-1\}. \tag{3.11}
\]
If this is true, letting $c = ( t_{k+1}^\alpha - t_k^\alpha ) / ( t_k^\alpha - t_{k-1}^\alpha )$ we have:
\[
\sum_{i=0}^{k-1} \big( (t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha \big) ( F_{i+1} - F_i )
= \sum_{i=0}^{k-1} \Big( \big( (t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha \big) - c \big( (t_k - t_i)^\alpha - (t_{k-1} - t_i)^\alpha \big) \Big) ( F_{i+1} - F_i )
\]
\[
= \sum_{i=1}^{k-1} \Big( \big( (t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha \big) - c \big( (t_k - t_i)^\alpha - (t_{k-1} - t_i)^\alpha \big) \Big) ( F_{i+1} - F_i )
= \sum_{i=1}^{k-1} d_i ( F_{i+1} - F_i ),
\]
where $d_i = \big( (t_{k+1} - t_i)^\alpha - (t_k - t_i)^\alpha \big) - c \big( (t_k - t_i)^\alpha - (t_{k-1} - t_i)^\alpha \big) < 0$ by (3.11). By the induction hypothesis, $F_{i+1} - F_i < 0$ for $1 \le i \le k-1$, so the equation above implies (3.10), and hence $F_{k+1} < F_k$ is proved.

To finish the proof, we focus on (3.11), fix $i$ and define $c_1 = t_{k-1} - t_i$, $c_2 = t_k - t_i$, $c_3 = t_{k+1} - t_i$ and the function
\[
h(x) = \frac{(x + c_3)^\alpha - (x + c_2)^\alpha}{(x + c_2)^\alpha - (x + c_1)^\alpha}.
\]
Then (3.11) is equivalent to $h(t_i - t_0) > h(0)$, and it remains to show that $h(x)$ is strictly increasing for $x > 0$. We observe that
\[
\frac{\mathrm{d}}{\mathrm{d}x} \big( \ln( h(x) ) \big) = \alpha \left[ \frac{(x + c_3)^{\alpha-1} - (x + c_2)^{\alpha-1}}{(x + c_3)^\alpha - (x + c_2)^\alpha} - \frac{(x + c_2)^{\alpha-1} - (x + c_1)^{\alpha-1}}{(x + c_2)^\alpha - (x + c_1)^\alpha} \right].
\]
Applying Cauchy's mean-value theorem to the two fractions above, we know there exist $\eta \in ( x + c_2, x + c_3 )$ and $\xi \in ( x + c_1, x + c_2 )$ such that
\[
\frac{\mathrm{d}}{\mathrm{d}x} \big( \ln( h(x) ) \big) = \alpha \left[ \frac{(\alpha - 1) \eta^{\alpha-2}}{\alpha \eta^{\alpha-1}} - \frac{(\alpha - 1) \xi^{\alpha-2}}{\alpha \xi^{\alpha-1}} \right] = (\alpha - 1) \big( \eta^{-1} - \xi^{-1} \big) > 0,
\]
where the last inequality holds because $\alpha < 1$ and $\xi < x + c_2 < \eta$. This shows the monotonicity of the function $h$ and confirms (3.11). This concludes the inductive step and proves (3.8).

The proof of (3.9) is obtained similarly. For convenience we only write the proof for $i = 1$, but the extension to general $i$ is straightforward. Consider a vector $\mathbf{W} \in \mathbb{R}^N$ such that $W_j = 1$ if $j = 1$ and $W_j = 0$ if $j \ne 1$; then it suffices to prove that the vector $\mathbf{F} = K_{\mathcal{P}}^{-1} \mathbf{W}$ satisfies
\[
F_n < F_{n+1} \quad \text{for } n \in \{2, \ldots, N-1\}. \tag{3.12}
\]
We prove (3.12) by induction on $n$. For $n = 2$, observe that, as in the proof of (3.8) and with $F_0 = 0$,
\[
W_k = \sum_{j=0}^{k} (t_k - t_j)^\alpha ( F_{j+1} - F_j ) = \sum_{j=0}^{k-1} (t_k - t_j)^\alpha ( F_{j+1} - F_j ),
\]
so that we have
\[
\begin{aligned}
1 = W_1 &= (t_1 - t_0)^\alpha ( F_1 - F_0 ), \\
0 = W_2 &= (t_2 - t_0)^\alpha ( F_1 - F_0 ) + (t_2 - t_1)^\alpha ( F_2 - F_1 ), \\
0 = W_3 &= (t_3 - t_0)^\alpha ( F_1 - F_0 ) + (t_3 - t_1)^\alpha ( F_2 - F_1 ) + (t_3 - t_2)^\alpha ( F_3 - F_2 ).
\end{aligned}
\]
From the first and second equations above, we see that $F_1 > 0$ and $F_2 - F_1 < 0$. Combining the second and the third equations we deduce that
\[
0 = W_3 - \frac{t_3^\alpha}{t_2^\alpha} W_2 = \left[ (t_3 - t_1)^\alpha - (t_2 - t_1)^\alpha \frac{t_3^\alpha}{t_2^\alpha} \right] ( F_2 - F_1 ) + (t_3 - t_2)^\alpha ( F_3 - F_2 ).
\]
Since $(t_3 - t_1)^\alpha - (t_2 - t_1)^\alpha ( t_3 / t_2 )^\alpha = (t_3 - t_1)^\alpha - ( t_3 - ( t_1 t_3 / t_2 ) )^\alpha > 0$, we obtain that $F_3 - F_2 > 0$, i.e., (3.12) holds for $n = 2$.

It remains to prove that when (3.12) holds for $n \in \{2, \ldots, k-1\}$, then it also holds for $n = k$, i.e., $F_k < F_{k+1}$, provided that $k < N$. To this aim, we first see that
\[
0 = W_{k+1} - \frac{t_{k+1}^\alpha}{t_k^\alpha} W_k = \sum_{j=1}^{k} \left( (t_{k+1} - t_j)^\alpha - (t_k - t_j)^\alpha \frac{t_{k+1}^\alpha}{t_k^\alpha} \right) ( F_{j+1} - F_j ). \tag{3.13}
\]
Therefore in order to prove $F_k < F_{k+1}$, we only need to show that
\[
\sum_{j=1}^{k-1} \left( (t_{k+1} - t_j)^\alpha - (t_k - t_j)^\alpha \frac{t_{k+1}^\alpha}{t_k^\alpha} \right) ( F_{j+1} - F_j ) < 0. \tag{3.14}
\]
Similar to (3.13) we also have
\[
0 = W_k - \frac{t_k^\alpha}{t_{k-1}^\alpha} W_{k-1} = \sum_{j=1}^{k-1} \left( (t_k - t_j)^\alpha - (t_{k-1} - t_j)^\alpha \frac{t_k^\alpha}{t_{k-1}^\alpha} \right) ( F_{j+1} - F_j ).
\]
Thanks to the inductive hypothesis, we know that $F_{j+1} - F_j < 0$ for $j = 1$ and $F_{j+1} - F_j > 0$ for $j \in \{2, \ldots, k-1\}$. Therefore, using an argument similar to the one used in the proof of (3.8), to prove (3.14) we only need to show
\[
\frac{(t_{k+1} - t_1)^\alpha - (t_k - t_1)^\alpha ( t_{k+1} / t_k )^\alpha}{(t_k - t_1)^\alpha - (t_{k-1} - t_1)^\alpha ( t_k / t_{k-1} )^\alpha} > \frac{(t_{k+1} - t_j)^\alpha - (t_k - t_j)^\alpha ( t_{k+1} / t_k )^\alpha}{(t_k - t_j)^\alpha - (t_{k-1} - t_j)^\alpha ( t_k / t_{k-1} )^\alpha}, \quad j \in \{2, \ldots, k-1\}, \tag{3.15}
\]
which is similar to (3.11). We rewrite the inequality above as
\[
\frac{(1 - t_1 / t_{k+1})^\alpha - (1 - t_1 / t_k)^\alpha}{(1 - t_1 / t_k)^\alpha - (1 - t_1 / t_{k-1})^\alpha} > \frac{(1 - t_j / t_{k+1})^\alpha - (1 - t_j / t_k)^\alpha}{(1 - t_j / t_k)^\alpha - (1 - t_j / t_{k-1})^\alpha}, \quad j \in \{2, \ldots, k-1\},
\]
and define the function
\[
h(x) = \frac{(1 - x / t_{k+1})^\alpha - (1 - x / t_k)^\alpha}{(1 - x / t_k)^\alpha - (1 - x / t_{k-1})^\alpha};
\]
then it suffices to show that $h'(x) < 0$ for $0 < x < t_{k-1}$. Observe that
\[
\frac{\mathrm{d}}{\mathrm{d}x} \ln( h(x) ) = - \frac{\alpha}{x} \left[ \frac{( x / t_{k+1} ) (1 - x / t_{k+1})^{\alpha-1} - ( x / t_k ) (1 - x / t_k)^{\alpha-1}}{(1 - x / t_{k+1})^\alpha - (1 - x / t_k)^\alpha} - \frac{( x / t_k ) (1 - x / t_k)^{\alpha-1} - ( x / t_{k-1} ) (1 - x / t_{k-1})^{\alpha-1}}{(1 - x / t_k)^\alpha - (1 - x / t_{k-1})^\alpha} \right].
\]
Letting $h_1(x) = (1 - x) x^{\alpha-1}$, $h_2(x) = x^\alpha$, by Cauchy's mean-value theorem, there exist $\eta \in ( 1 - x / t_k, 1 - x / t_{k+1} )$ and $\xi \in ( 1 - x / t_{k-1}, 1 - x / t_k )$ such that
\[
\frac{\mathrm{d}}{\mathrm{d}x} \big( \ln( h(x) ) \big) = - \frac{\alpha}{x} \left( \frac{h_1'(\eta)}{h_2'(\eta)} - \frac{h_1'(\xi)}{h_2'(\xi)} \right) = - \frac{\alpha}{x} \left( \left( \frac{\alpha - 1}{\alpha \eta} - 1 \right) - \left( \frac{\alpha - 1}{\alpha \xi} - 1 \right) \right) < 0
\]
because $0 < \xi < \eta$. This implies that $h'(x) < 0$ for $0 < x < t_{k-1}$ and finishes the inductive step. Hence (3.9) is proved. $\square$

Remark 3.4 (generalization). Notice that, for a general kernel $g$, property (3.8) remains valid provided $G(t) = \ln( g(t) )$ satisfies $G''(t) > 0$.

Figure 1. Given a partition $\mathcal{P}$, the figure shows the nonlocal basis functions $\{ \varphi_{\mathcal{P},i} \}_{i=0}^{N}$ for different values of $\alpha$ (one panel per value). Every function whose Caputo derivative is piecewise constant can be written as a linear combination of these functions. Notice that, for any partition point, $\varphi_{\mathcal{P},i}(t_j) = \delta_{ij}$. In addition, Proposition 3.5 shows that these functions form a partition of unity.

3.3. A continuous interpolant.
Given a partition $\mathcal{P}$, a sequence $\mathbf{W} \in \mathcal{H}^N$, and $W_0 \in \mathcal{H}$, we defined the discrete Caputo derivative $( D^\alpha_{\mathcal{P}} \mathbf{W} )_n$ via (3.3). Motivated by the Volterra type equation (2.17) between a continuous function $w$ and its Caputo derivative $D_c^\alpha w$, it is possible, following [28], to define, over $\mathcal{P}$, a natural continuous interpolant of $W_n$ by
\[
\widehat{W}_{\mathcal{P}}(t) = W_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} V_{\mathcal{P}}(s)\, \mathrm{d}s, \tag{3.16}
\]
where $V_{\mathcal{P}}$ is defined by
\[
V_{\mathcal{P}}(t) = ( D^\alpha_{\mathcal{P}} \mathbf{W} )_{n(t)}. \tag{3.17}
\]
By definition, we have that $\widehat{W}_{\mathcal{P}}(t_n) = W_n$. Moreover, writing $n = n(t)$,
\[
\widehat{W}_{\mathcal{P}}(t) = W_0 + \frac{1}{\Gamma(\alpha+1)} \left[ \sum_{j=1}^{n-1} \big( (t - t_{j-1})^\alpha - (t - t_j)^\alpha \big) ( D^\alpha_{\mathcal{P}} \mathbf{W} )_j + (t - t_{n-1})^\alpha ( D^\alpha_{\mathcal{P}} \mathbf{W} )_n \right] = \sum_{i=0}^{n(t)} W_i \varphi_{\mathcal{P},i}(t), \tag{3.18}
\]
where we defined
\[
\begin{aligned}
\varphi_{\mathcal{P},0}(t) &= 1 + \frac{1}{\Gamma(\alpha+1)} \left[ \sum_{j=1}^{n(t)-1} \big( (t - t_{j-1})^\alpha - (t - t_j)^\alpha \big) K^{-1}_{\mathcal{P},j0} + (t - t_{n-1})^\alpha K^{-1}_{\mathcal{P},n0} \right], \\
\varphi_{\mathcal{P},i}(t) &= \frac{1}{\Gamma(\alpha+1)} \left[ \sum_{j=i}^{n(t)-1} \big( (t - t_{j-1})^\alpha - (t - t_j)^\alpha \big) K^{-1}_{\mathcal{P},ji} + (t - t_{n-1})^\alpha K^{-1}_{\mathcal{P},ni} \right], \quad i \in \{1, \ldots, N\}.
\end{aligned} \tag{3.19}
\]
The functions $\{ \varphi_{\mathcal{P},i} \}_{i=0}^{N}$ play the role, in this context, of the standard "hat" basis functions used for piecewise linear interpolation over a partition $\mathcal{P}$. Indeed, they are such that any function with piecewise constant (Caputo) derivative can be written as a linear combination of them. Figure 1 illustrates the behavior of these functions. As expected, and in contrast to the hat basis functions, these functions are nonlocal, in the sense that they have global support. Something worth noticing is also that the figure seems to indicate that, as $\alpha \downarrow 0$, the functions resemble piecewise constants and, in contrast, when $\alpha \uparrow 1$ they resemble the hat basis functions.

For every $t \in [0,T]$ we have $\sum_{i=0}^{n(t)} \varphi_{\mathcal{P},i}(t) = 1$. The following result shows that $\varphi_{\mathcal{P},i}(t) \ge 0$. Thus, for any $t \in [0,T]$, $\widehat{W}_{\mathcal{P}}(t)$ is a convex combination of its nodal values $\{ W_j \}_{j=0}^{N}$. This observation will be crucial to derive an a posteriori error estimate in Section 5.2.

Proposition 3.5 (positivity). Let $\mathcal{P}$ be a partition defined as in (2.2). Let the functions $\{ \varphi_{\mathcal{P},i} \}_{i=0}^{N}$ be defined as in (3.19). Then, for any $i \in \{0, \ldots, N\}$ and $t \in [0,T]$, we have $\varphi_{\mathcal{P},i}(t) \ge 0$. In addition, for $t \notin \mathcal{P}$ and $i \in \{0, \ldots, n(t)\}$ we have $\varphi_{\mathcal{P},i}(t) > 0$.

Proof. By definition, for $t = t_n$, we have $\varphi_{\mathcal{P},n}(t_n) = 1$ and $\varphi_{\mathcal{P},i}(t_n) = 0$ for any $i \ne n$. Also, for $i > n(t)$, we see that $\varphi_{\mathcal{P},i}(t) = 0$, and hence it only remains to show that $\varphi_{\mathcal{P},i}(t) > 0$ for $t \notin \mathcal{P}$ and $i \le n(t)$. To show this, consider $W_i = 1$ and $W_j = 0$ for $j \ne i$, a piecewise constant $V_{\mathcal{P}}$ and its interpolation $\widehat{W}_{\mathcal{P}}$ defined in (3.16) and (3.17). Then our goal is to show that $\widehat{W}_{\mathcal{P}}(t) > 0$.

If $i = n(t) > 0$, then it is easy to check by definition that $( D^\alpha_{\mathcal{P}} \mathbf{W} )_n > 0$ and $( D^\alpha_{\mathcal{P}} \mathbf{W} )_j = 0$ for $j \in \{1, \ldots, i-1\}$. Therefore we obtain
\[
\widehat{W}_{\mathcal{P}}(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} V_{\mathcal{P}}(s)\, \mathrm{d}s = \frac{(t - t_{n-1})^\alpha}{\Gamma(\alpha+1)} ( D^\alpha_{\mathcal{P}} \mathbf{W} )_n > 0.
\]
If $i < n(t)$, the proof is not that straightforward. The trick is to insert the time $t$, which is not on the partition $\mathcal{P}$, to get a new partition $\mathcal{P}' = \mathcal{P} \cup \{ t \}$ and then apply Propositions 3.1 and 3.3 in an appropriate way. Let us now work out the details. Let $\mathcal{P}' = \{ t'_k \}_{k=0}^{N+1}$ and notice that
\[
t'_{n(t)} = t, \qquad t'_{n(t)+1} = t_{n(t)}.
\]
On the basis of this partition we define the vector $\mathbf{W}' \in \mathcal{H}^{N+1}$ via $W'_j = \widehat{W}_{\mathcal{P}}(t'_j)$. Since $V_{\mathcal{P}}$ is constant on $( t'_{n(t)-1}, t'_{n(t)+1} ] = ( t_{n(t)-1}, t_{n(t)} ]$, we have
\[
( D^\alpha_{\mathcal{P}'} \mathbf{W}' )_{n(t)} = ( D^\alpha_{\mathcal{P}'} \mathbf{W}' )_{n(t)+1}.
\]
Since the only possible nonzero components of $\mathbf{W}'$ are $W'_i = W_i = 1$ and $W'_{n(t)} = \widehat{W}_{\mathcal{P}}(t)$, we deduce from the equality above that
\[
K^{-1}_{\mathcal{P}',n(t),i} W'_i + K^{-1}_{\mathcal{P}',n(t),n(t)} W'_{n(t)} = ( D^\alpha_{\mathcal{P}'} \mathbf{W}' )_{n(t)} = ( D^\alpha_{\mathcal{P}'} \mathbf{W}' )_{n(t)+1} = K^{-1}_{\mathcal{P}',n(t)+1,i} W'_i + K^{-1}_{\mathcal{P}',n(t)+1,n(t)} W'_{n(t)},
\]
which can be rearranged as
\[
K^{-1}_{\mathcal{P}',n(t)+1,i} - K^{-1}_{\mathcal{P}',n(t),i} = \widehat{W}_{\mathcal{P}}(t) \left( K^{-1}_{\mathcal{P}',n(t),n(t)} - K^{-1}_{\mathcal{P}',n(t)+1,n(t)} \right).
\]
From Proposition 3.3 we see that $K^{-1}_{\mathcal{P}',n(t)+1,i} - K^{-1}_{\mathcal{P}',n(t),i} > 0$. From Proposition 3.1 we have $K^{-1}_{\mathcal{P}',n(t),n(t)} - K^{-1}_{\mathcal{P}',n(t)+1,n(t)} > 0$, because $K^{-1}_{\mathcal{P}',n(t),n(t)} > 0$ and $K^{-1}_{\mathcal{P}',n(t)+1,n(t)} < 0$. This leads to the fact that $\widehat{W}_{\mathcal{P}}(t) > 0$. $\square$

4. Time fractional gradient flow: Theory
We have now set the stage for the study of time fractional gradient flows, which were formally described in (1.2). Throughout the remainder of our discussion we shall assume that the initial condition satisfies $u_0 \in D(\Phi)$ and that $f \in L^2_\alpha(0,T;\mathcal{H})$. We begin by commenting that the case $f = 0$ was already studied in [28, Section 5], where the authors studied so-called strong solutions; see [28, Definition 5.4]. Here we trivially extend their definition to the case $f \ne 0$.

Definition 4.1 (strong solution). A function $u \in L^1_{loc}([0,T);\mathcal{H})$ is a strong solution to (1.2) if
(i) (Initial condition) $\lim_{t \downarrow 0} ⨍_0^t \| u(s) - u_0 \|\, \mathrm{d}s = 0$.
(ii) (Regularity) $D_c^\alpha u \in L^1_{loc}([0,T);\mathcal{H})$.
(iii) (Evolution) For almost every $t \in [0,T)$, we have $f(t) - D_c^\alpha u(t) \in \partial\Phi(u(t))$.

4.1. Energy solutions.
Since $\mathcal{H}$ is a Hilbert space, we will mimic the theory for classical gradient flows and introduce the notion of energy solutions for (1.2). To motivate it, suppose that at some $t \in (0,T)$
\[
f(t) - D_c^\alpha u(t) \in \partial\Phi(u(t));
\]
then, by definition of the subdifferential, this is equivalent to the evolution variational inequality (EVI)
\[
\langle D_c^\alpha u(t), u(t) - w \rangle + \Phi(u(t)) - \Phi(w) \le \langle f(t), u(t) - w \rangle, \quad \forall w \in \mathcal{H}. \tag{4.1}
\]

Definition 4.2 (energy solution). The function $u \in L^2(0,T;\mathcal{H})$ is an energy solution to (1.2) if
(i) (Initial condition) $\lim_{t \downarrow 0} ⨍_0^t \| u(s) - u_0 \|^2\, \mathrm{d}s = 0$.
(ii) (Regularity) $D_c^\alpha u \in L^2(0,T;\mathcal{H})$.
(iii) (EVI) For any $w \in L^2(0,T;\mathcal{H})$
\[
\int_0^T \big[ \langle D_c^\alpha u(t), u(t) - w(t) \rangle + \Phi(u(t)) - \Phi(w(t)) \big]\, \mathrm{d}t \le \int_0^T \langle f(t), u(t) - w(t) \rangle\, \mathrm{d}t. \tag{4.2}
\]

Notice that, provided $u_0 \in D(\Phi)$, we can set $w(t) = u_0$ in (4.2) and obtain that $\int_0^T \Phi(u(t))\, \mathrm{d}t < \infty$, which motivates the name for this notion of solution. In addition, as the following result shows, any energy solution is a strong solution.

Proposition 4.3 (energy vs. strong). An energy solution of (1.2) is also a strong solution.

Proof.
Evidently, it suffices to prove that $f(t) - D_c^\alpha u(t) \in \partial\Phi(u(t))$ for almost every $t \in (0,T)$. Let $w \in \mathcal{H}$, $t_0 \in (0,T)$, and choose $h > 0$ sufficiently small so that $(t_0 - h, t_0 + h) \subset (0,T)$. Define
\[
w(t) = u(t) - \chi_{(t_0 - h, t_0 + h)}(t) ( u(t) - w ) \in L^2(0,T;\mathcal{H}),
\]
where by $\chi_S$ we denote the characteristic function of the set $S$. This choice of test function in (4.2) gives
\[
⨍_{t_0 - h}^{t_0 + h} \langle D_c^\alpha u(t) - f(t), u(t) - w \rangle\, \mathrm{d}t + ⨍_{t_0 - h}^{t_0 + h} \big( \Phi(u(t)) - \Phi(w) \big)\, \mathrm{d}t \le 0.
\]
The assumptions of an energy solution guarantee that all terms inside the integrals belong to $L^1(0,T;\mathbb{R})$, so that for almost every $t_0$ we have, as $h \downarrow 0$, that
\[
\langle D_c^\alpha u(t_0) - f(t_0), u(t_0) - w \rangle + \Phi(u(t_0)) - \Phi(w) \le 0,
\]
which is (4.1) and, as we intended to show, is equivalent to the claim. $\square$

Remark 4.4 (coercivity). By introducing the coercivity modulus of Definition 2.1 one realizes that an energy solution $u$ satisfies, instead of (4.1) and (4.2), the stronger inequalities
\[
\langle D_c^\alpha u(t), u(t) - w \rangle + \Phi(u(t)) - \Phi(w) + \sigma(u(t); w) \le \langle f(t), u(t) - w \rangle, \quad \forall w \in \mathcal{H}, \tag{4.3}
\]
and, for any $w \in L^2(0,T;\mathcal{H})$,
\[
\int_0^T \big[ \langle D_c^\alpha u(t), u(t) - w(t) \rangle + \Phi(u(t)) - \Phi(w(t)) + \sigma(u(t); w(t)) \big]\, \mathrm{d}t \le \int_0^T \langle f(t), u(t) - w(t) \rangle\, \mathrm{d}t. \tag{4.4}
\]
4.2. Existence and uniqueness.
In this section, we prove the existence and uniqueness of energy solutions to (1.2) in the sense of Definition 4.2. The main result reads as follows.
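Theorem 4.5 (well posedness). Assume that the energy $\Phi$ is convex, l.s.c., and with nonempty effective domain. Let $u_0 \in D(\Phi)$ and $f \in L^2_\alpha(0,T;\mathcal{H})$. In this setting, the fractional gradient flow problem (1.2) has a unique energy solution $u$, in the sense of Definition 4.2. For almost every $t \in (0,T)$, the solution $u$ satisfies $f(t) - D_c^\alpha u(t) \in \partial\Phi(u(t))$ and for any $t \in [0,T]$ we have
\[
u(t) = u_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} D_c^\alpha u(s)\, \mathrm{d}s. \tag{4.5}
\]
In addition, $u \in C^{0,\alpha/2}([0,T];\mathcal{H})$ with modulus of continuity
\[
\| u(t_2) - u(t_1) \| \le C | t_2 - t_1 |^{\alpha/2} \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(u_0) - \Phi_{\inf} \right)^{1/2}, \quad \forall t_1, t_2 \in [0,T], \tag{4.6}
\]
where the constant $C$ depends only on $\alpha$.

We point out that our assumptions are weaker than those in [28, Theorem 5.10]. First, we allow for a nonzero right hand side. In addition, we do not require [28, Assumption 5.9], which is a sort of weak-strong continuity of subdifferentials.

The remainder of this section will be dedicated to the proof of Theorem 4.5. To accomplish this, we follow a similar approach to [28, Section 5]. To show existence of solutions, we consider a sort of fractional minimizing movements scheme. We introduce a partition $\mathcal{P}$ with maximal time step $\tau$ and compute the sequence $\mathbf{U} = \{ U_n \}_{n=0}^{N} \subset \mathcal{H}$ as follows. Assume $U_0 \in D(\Phi)$ is given; the $n$-th iterate, for $n \in \{1, \ldots, N\}$, is defined recursively via
\[
F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n \in \partial\Phi(U_n), \tag{4.7}
\]
where
\[
F_n = ⨍_{t_{n-1}}^{t_n} f(t)\, \mathrm{d}t. \tag{4.8}
\]
We will usually choose $U_0 = u_0$, but other choices of $U_0 \in D(\Phi)$ are also allowed.

From the approximation scheme (4.7) and the expression of the discrete Caputo derivative $( D^\alpha_{\mathcal{P}} \mathbf{U} )_n$ given in (3.3), it is clear that
\[
U_n = \operatorname*{arg\,min}_{w \in \mathcal{H}} \left( \Phi(w) - \langle F_n, w \rangle - \frac{1}{2} \sum_{i=0}^{n-1} K^{-1}_{\mathcal{P},ni} \| w - U_i \|^2 \right). \tag{4.9}
\]
Thanks to Proposition 3.1, for $i = 0, \ldots, n-1$, we have that $K^{-1}_{\mathcal{P},ni} < 0$ and so $U_n$ is well-defined.

Now, in order to define a continuous in time function from $\mathbf{U}$, we use the interpolation introduced in (3.16). Let $V_{\mathcal{P}}(t) = ( D^\alpha_{\mathcal{P}} \mathbf{U} )_{n(t)}$. Then we have
\[
\widehat{U}_{\mathcal{P}}(t) = U_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} V_{\mathcal{P}}(s)\, \mathrm{d}s. \tag{4.10}
\]
Recall that $\overline{F}_{\mathcal{P}}$ can be defined from $\{ F_n \}_{n=1}^{N}$ using (2.3) and that Lemma 2.4 showed that $\overline{F}_{\mathcal{P}} \in L^2_\alpha(0,T;\mathcal{H})$ with a norm bounded independently of $\mathcal{P}$. We now obtain some suitable bounds for $\widehat{U}_{\mathcal{P}}$ and $V_{\mathcal{P}}$.

Lemma 4.6 (a priori bounds). Let $\mathcal{P}$ be any partition. The functions $\widehat{U}_{\mathcal{P}}$ and $V_{\mathcal{P}}$ satisfy
\[
\begin{aligned}
\sup_{t \in [0,T]} \Phi( \widehat{U}_{\mathcal{P}}(t) ) &\le \Phi(U_0) + \frac{1}{4 \Gamma(\alpha)} \| \overline{F}_{\mathcal{P}} \|^2_{L^2_\alpha(0,T;\mathcal{H})} \le \Phi(U_0) + C \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})}, \\
\| V_{\mathcal{P}} \|^2_{L^2_\alpha(0,T;\mathcal{H})} &= \sup_{t \in [0,T]} \int_0^t (t - s)^{\alpha-1} \| V_{\mathcal{P}}(s) \|^2\, \mathrm{d}s \le C \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right),
\end{aligned} \tag{4.11}
\]
where the constant $C$ only depends on $\alpha$.

Proof. Since $F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n \in \partial\Phi(U_n)$, one has
\[
\Phi(U_n) - \Phi(U_i) \le \langle F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n, U_n - U_i \rangle.
\]
Therefore, noticing that $K^{-1}_{\mathcal{P},ni} < 0$ for $i \in \{0, \ldots, n-1\}$, we get
\[
( D^\alpha_{\mathcal{P}} \Phi(\mathbf{U}) )_n = - \sum_{i=0}^{n-1} K^{-1}_{\mathcal{P},ni} ( \Phi(U_n) - \Phi(U_i) ) \le - \sum_{i=0}^{n-1} K^{-1}_{\mathcal{P},ni} \langle F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n, U_n - U_i \rangle = \langle F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n, ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n \rangle, \tag{4.12}
\]
where we denoted $\Phi(\mathbf{U}) = \{ \Phi(U_n) \}_{n=0}^{N}$.

We can now proceed to obtain the claimed estimates. To prove the first one, we use that
\[
( D^\alpha_{\mathcal{P}} \Phi(\mathbf{U}) )_n \le \langle F_n - ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n, ( D^\alpha_{\mathcal{P}} \mathbf{U} )_n \rangle \le \frac{1}{4} \| F_n \|^2
\]
to obtain that for any $n$,
\[
\Phi(U_n) = \Phi(U_0) + \sum_{i=1}^{n} K_{\mathcal{P},ni} ( D^\alpha_{\mathcal{P}} \Phi(\mathbf{U}) )_i \le \Phi(U_0) + \frac{1}{4} \sum_{i=1}^{n} K_{\mathcal{P},ni} \| F_i \|^2
= \Phi(U_0) + \frac{1}{4 \Gamma(\alpha)} \int_0^{t_n} (t_n - s)^{\alpha-1} \| \overline{F}_{\mathcal{P}}(s) \|^2\, \mathrm{d}s \le \Phi(U_0) + C \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})},
\]
where the constant $C$ depends only on $\alpha$. Now, since Proposition 3.5 has shown that $\widehat{U}_{\mathcal{P}}$ is a convex combination of the values $U_n$, we have
\[
\Phi( \widehat{U}_{\mathcal{P}}(t) ) = \Phi\left( \sum_{i=0}^{N} \varphi_{\mathcal{P},i}(t) U_i \right) \le \sum_{i=0}^{N} \varphi_{\mathcal{P},i}(t) \Phi(U_i) \le \max_n \Phi(U_n) \le \Phi(U_0) + C \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})},
\]
which finishes the proof of the first claim.

We now proceed to prove the second claim. Using (4.12) we get
\[
\begin{aligned}
\Phi_{\inf} \le \Phi( \widehat{U}_{\mathcal{P}}(t) ) &\le \Phi(U_0) + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \langle \overline{F}_{\mathcal{P}}(s) - V_{\mathcal{P}}(s), V_{\mathcal{P}}(s) \rangle\, \mathrm{d}s \\
&\le \Phi(U_0) + \frac{1}{\Gamma(\alpha)} \left( \int_0^t (t - s)^{\alpha-1} \| \overline{F}_{\mathcal{P}}(s) \|^2\, \mathrm{d}s \right)^{1/2} \left( \int_0^t (t - s)^{\alpha-1} \| V_{\mathcal{P}}(s) \|^2\, \mathrm{d}s \right)^{1/2} - \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \| V_{\mathcal{P}}(s) \|^2\, \mathrm{d}s,
\end{aligned}
\]
for any $t \in [0,T]$. This implies that
\[
\int_0^t (t - s)^{\alpha-1} \| V_{\mathcal{P}}(s) \|^2\, \mathrm{d}s \le \| \overline{F}_{\mathcal{P}} \|^2_{L^2_\alpha(0,T;\mathcal{H})} + 2 \Gamma(\alpha) ( \Phi(U_0) - \Phi_{\inf} ),
\]
which, using Lemma 2.4, implies the result. $\square$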
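To make the scheme (4.7) and the first bound in (4.11) concrete, the following minimal Python sketch runs the fractional minimizing movements iteration in the simplest possible setting. Everything specific here is an assumption made for illustration only: $\mathcal{H} = \mathbb{R}$, the quadratic energy $\Phi(w) = \lambda w^2 / 2$ (so that $\partial\Phi(w) = \{ \lambda w \}$ and (4.7) becomes a scalar linear equation), $f = 0$, and the hypothetical function names. The inverse $K_{\mathcal{P}}^{-1}$ is formed explicitly only so that the sign pattern of Proposition 3.1 can be inspected.

```python
import math

def caputo_matrix(t, alpha):
    """Lower triangular K_P of (3.2); row n-1, column i-1 holds K_{P,ni}."""
    N = len(t) - 1
    c = 1.0 / math.gamma(alpha + 1.0)
    return [[c * ((t[n] - t[i - 1]) ** alpha - (t[n] - t[i]) ** alpha) if i <= n else 0.0
             for i in range(1, N + 1)]
            for n in range(1, N + 1)]

def lower_inverse(K):
    """Inverse of a lower triangular matrix, one column at a time (forward substitution)."""
    N = len(K)
    Kinv = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for n in range(i, N):
            rhs = 1.0 if n == i else 0.0
            known = sum(K[n][j] * Kinv[j][i] for j in range(i, n))
            Kinv[n][i] = (rhs - known) / K[n][n]
    return Kinv

def minimizing_movements(t, u0, lam, F, alpha):
    """Scheme (4.7) for the assumed energy Phi(w) = lam * w**2 / 2 on H = R.

    With (D U)_n = K^{-1}_{n0} U_0 + sum_{i=1}^n K^{-1}_{ni} U_i, the inclusion
    F_n - (D U)_n = lam * U_n is solved for U_n at each step.
    """
    Kinv = lower_inverse(caputo_matrix(t, alpha))
    U = [u0]
    for n in range(1, len(t)):
        row = Kinv[n - 1]
        k0 = -sum(row[:n])  # K^{-1}_{P,n0} as defined before (3.3)
        history = k0 * U[0] + sum(row[i - 1] * U[i] for i in range(1, n))
        U.append((F[n - 1] - history) / (row[n - 1] + lam))
    return U, Kinv

# Illustrative data: nonuniform grid, zero forcing, unit initial state.
alpha, lam = 0.5, 2.0
t = [0.0, 0.05, 0.2, 0.3, 0.6, 1.0]
U, Kinv = minimizing_movements(t, 1.0, lam, [0.0] * 5, alpha)
energy = [0.5 * lam * u * u for u in U]
print(U)
print(energy)
```

In this setting with $f = 0$, the first estimate in (4.11) reduces to $\Phi(U_n) \le \Phi(U_0)$, which the printed energies can be compared against; the off-diagonal entries of `Kinv` can likewise be compared against (3.5).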
Remark 4.7 (the function $\widehat{\Phi}$). Notice that, during the course of the proof of the first estimate in (4.11) we also showed that, if we define $\widehat{\Phi}_{\mathcal{P}}(t) = \sum_{i=0}^{N} \varphi_{\mathcal{P},i}(t) \Phi(U_i)$, then $\widehat{\Phi}_{\mathcal{P}}(t)$ is the interpolation of $\Phi(\mathbf{U})$ with piecewise constant Caputo derivative. Moreover,
\[
D_c^\alpha \widehat{\Phi}_{\mathcal{P}}(t) \le \frac{1}{4} \left\| \overline{F}_{\mathcal{P}}(t) \right\|^2.
\]

These estimates immediately yield a modulus of continuity estimate on the interpolant $\widehat{U}_{\mathcal{P}}$ which is independent of the partition $\mathcal{P}$.

Lemma 4.8 (Hölder continuity). Let $\mathcal{P}$ be any partition and $\mathbf{U} \in \mathcal{H}^N$ be the solution to (4.7) associated to this partition. For $t_1, t_2 \in [0,T]$ the interpolant $\widehat{U}_{\mathcal{P}}$, defined in (3.16), satisfies
\[
\| \widehat{U}_{\mathcal{P}}(t_2) - \widehat{U}_{\mathcal{P}}(t_1) \| \le C | t_2 - t_1 |^{\alpha/2} \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2},
\]
where the constant $C$ depends only on $\alpha$.

Proof. As proved in [28, Lemma 5.8], $D_c^\alpha w \in L^2_\alpha(0,T;\mathcal{H})$ guarantees $w \in C^{0,\alpha/2}([0,T];\mathcal{H})$. Therefore, using $D_c^\alpha \widehat{U}_{\mathcal{P}} = V_{\mathcal{P}} \in L^2_\alpha(0,T;\mathcal{H})$ and the estimate from Lemma 4.6, we obtain the result. $\square$

Next we control the difference between discrete solutions corresponding to different partitions.
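Lemma 4.9 (equicontinuity). Let, for $i = 1, 2$, $\mathcal{P}_i$ be partitions of $[0,T]$ with maximal step size $\tau_i$, respectively, and denote by $\mathbf{U}^{(i)}$ the associated solutions to (4.7). Let $\widehat{U}_i$ be their interpolations, defined by (4.10), and $\overline{U}_i$ be their piecewise constant interpolations as in (2.3). Assuming that $U^{(i)}_0 = U_0$ we have
\[
\left\| \widehat{U}_1 - \widehat{U}_2 \right\|_{L^\infty(0,T;\mathcal{H})} \le C \left( \tau_1^{\alpha/2} + \tau_2^{\alpha/2} \right) \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2}, \tag{4.13}
\]
\[
\sup_{t \in [0,T]} \int_0^t (t - s)^{\alpha-1} \rho( \overline{U}_1(s), \overline{U}_2(s) )\, \mathrm{d}s \le C ( \tau_1^\alpha + \tau_2^\alpha ) \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right), \tag{4.14}
\]
where the constant $C$ only depends on $\alpha$.

Proof. For almost every $t \in [0,T]$, we have that
\[
\left\langle D_c^\alpha ( \widehat{U}_1 - \widehat{U}_2 ), \widehat{U}_1 - \widehat{U}_2 \right\rangle = \mathrm{I} + \mathrm{II} + \mathrm{III}, \tag{4.15}
\]
where
\[
\begin{aligned}
\mathrm{I} &= \left\langle ( \overline{F}_2 - D_c^\alpha \widehat{U}_2 ) - ( \overline{F}_1 - D_c^\alpha \widehat{U}_1 ), \overline{U}_1 - \overline{U}_2 \right\rangle \le - \rho( \overline{U}_1, \overline{U}_2 ), \\
\mathrm{II} &= \left\langle ( \overline{F}_2 - D_c^\alpha \widehat{U}_2 ) - ( \overline{F}_1 - D_c^\alpha \widehat{U}_1 ), ( \widehat{U}_1 - \overline{U}_1 ) - ( \widehat{U}_2 - \overline{U}_2 ) \right\rangle, \\
\mathrm{III} &= \left\langle \overline{F}_1 - \overline{F}_2, \widehat{U}_1 - \widehat{U}_2 \right\rangle,
\end{aligned}
\]
where to bound $\mathrm{I}$ we used that $\overline{F}_i(t) - D_c^\alpha \widehat{U}_i(t) \in \partial\Phi( \overline{U}_i(t) )$ and Definition 2.1. Define now
\[
G(t) = \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \big( \overline{F}_1(s) - \overline{F}_2(s) \big)\, \mathrm{d}s
= \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \big( \overline{F}_1(s) - f(s) \big)\, \mathrm{d}s - \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \big( \overline{F}_2(s) - f(s) \big)\, \mathrm{d}s,
\]
so that $D_c^\alpha G(t) = \overline{F}_1(t) - \overline{F}_2(t)$ and by (2.10) of Lemma 2.5 one further has
\[
\| G \|_{L^\infty(0,T;\mathcal{H})} \le C \left( \tau_1^{\alpha/2} + \tau_2^{\alpha/2} \right) \| f \|_{L^2_\alpha(0,T;\mathcal{H})}, \tag{4.16}
\]
where $C$ is a constant that depends only on $\alpha$. Using these estimates, from (4.15) we deduce that
\[
\left\langle D_c^\alpha ( \widehat{U}_1 - \widehat{U}_2 - G ), \widehat{U}_1 - \widehat{U}_2 - G \right\rangle + \rho( \overline{U}_1, \overline{U}_2 ) \le \mathrm{II} - \left\langle D_c^\alpha ( \widehat{U}_1 - \widehat{U}_2 - G ), G \right\rangle. \tag{4.17}
\]
Set $w = \widehat{U}_1 - \widehat{U}_2 - G$.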
By (2.18) we have that
\[
\frac{1}{2} D_c^\alpha \| w(t) \|^2 + \rho( \overline{U}_1, \overline{U}_2 ) \le \mathrm{II} - \langle D_c^\alpha w, G \rangle,
\]
and, using (2.17) and (4.16), we then conclude
\[
\frac{1}{2} \| \widehat{U}_1(t) - \widehat{U}_2(t) \|^2 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \rho( \overline{U}_1(s), \overline{U}_2(s) )\, \mathrm{d}s
\le \frac{2}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} \big( \mathrm{II}(s) - \langle D_c^\alpha w(s), G(s) \rangle \big)\, \mathrm{d}s + C \left( \tau_1^{\alpha/2} + \tau_2^{\alpha/2} \right)^2 \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})}.
\]
It remains then to estimate the fractional integral on the right hand side. We estimate each term separately.

First, owing to Lemma 2.4 and Lemma 4.6 we have, for $i = 1, 2$, that
\[
\left\| \overline{F}_i - D_c^\alpha \widehat{U}_i \right\|_{L^2_\alpha(0,T;\mathcal{H})} \le C \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2}.
\]
Therefore, using the Cauchy-Schwarz inequality, for any $t \in [0,T]$, we have
\[
\int_0^t (t - s)^{\alpha-1} | \mathrm{II}(s) |\, \mathrm{d}s \le C \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2} \sum_{i=1}^{2} \left\| \widehat{U}_i - \overline{U}_i \right\|_{L^2_\alpha(0,T;\mathcal{H})}.
\]
Recalling that $\overline{U}_i(t) = \widehat{U}_i( \lceil t \rceil_i )$ we can invoke Lemma 2.6 and, again, Lemma 4.6 to arrive at
\[
\int_0^t (t - s)^{\alpha-1} | \mathrm{II}(s) |\, \mathrm{d}s \le C \left( \tau_1^{\alpha/2} + \tau_2^{\alpha/2} \right)^2 \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right).
\]
Finally, for the remaining term, we use the Cauchy-Schwarz inequality and get
\[
\int_0^t (t - s)^{\alpha-1} | \langle D_c^\alpha w, G \rangle (s) |\, \mathrm{d}s \le \left( \int_0^t (t - s)^{\alpha-1} \| D_c^\alpha w(s) \|^2\, \mathrm{d}s \right)^{1/2} \left( \int_0^t (t - s)^{\alpha-1} \| G(s) \|^2\, \mathrm{d}s \right)^{1/2} \le \| D_c^\alpha w \|_{L^2_\alpha(0,T;\mathcal{H})} \| G \|_{L^2_\alpha(0,T;\mathcal{H})}.
\]
To estimate the norm of $G$ we apply (2.11) from Lemma 2.5 with $\beta = \alpha$ to obtain
\[
\| G \|_{L^2_\alpha(0,T;\mathcal{H})} \le C ( \tau_1^\alpha + \tau_2^\alpha ) \| f \|_{L^2_\alpha(0,T;\mathcal{H})}.
\]
Furthermore, Lemma 2.4 and Lemma 4.6 guarantee that
\[
\| D_c^\alpha w \|_{L^2_\alpha(0,T;\mathcal{H})} \le C \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2}.
\]
Combining all estimates proves the desired result. $\square$
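The comparison of two discrete solutions can be explored numerically with the following hedged Python sketch. It is an illustration under assumptions that are not in the paper: $\mathcal{H} = \mathbb{R}$, the quadratic energy $\Phi(w) = \lambda w^2 / 2$, $f = 0$, and hypothetical function names. The scheme (4.7) is written in its equivalent Volterra form $\mathbf{U} - \mathbf{U}_0 = K_{\mathcal{P}} \mathbf{V}$ with $V_n = -\lambda U_n$, and the interpolant (4.10) is evaluated in closed form on each subinterval.

```python
import math

def solve_flow(t, u0, lam, alpha):
    """Scheme (4.7) for the assumed energy Phi(w) = lam * w**2 / 2 with f = 0.

    Uses the equivalent form U_n - U_0 = sum_i K_{P,ni} V_i with V_i = -lam * U_i.
    Returns nodal values U (length N+1) and discrete Caputo derivatives V (length N).
    """
    c = 1.0 / math.gamma(alpha + 1.0)
    U = [u0]
    for n in range(1, len(t)):
        k = [c * ((t[n] - t[i - 1]) ** alpha - (t[n] - t[i]) ** alpha) for i in range(1, n + 1)]
        history = sum(k[i - 1] * U[i] for i in range(1, n))
        U.append((u0 - lam * history) / (1.0 + lam * k[n - 1]))
    V = [-lam * u for u in U[1:]]
    return U, V

def interpolant(t, u0, V, alpha, s):
    """Evaluate (4.10): u0 + (1/Gamma(alpha)) * int_0^s (s-r)^(alpha-1) V_P(r) dr."""
    c = 1.0 / math.gamma(alpha + 1.0)
    value = u0
    for j in range(1, len(t)):
        if t[j - 1] >= s:
            break
        upper = min(t[j], s)  # V_P equals V[j-1] on (t[j-1], t[j]]
        value += c * ((s - t[j - 1]) ** alpha - (s - upper) ** alpha) * V[j - 1]
    return value

alpha, lam, u0 = 0.5, 1.0, 1.0
t1 = [n / 20 for n in range(21)]           # uniform partition, tau_1 = 0.05
t2 = [(n / 40) ** 2 for n in range(41)]    # graded partition, refined near t = 0
U1, V1 = solve_flow(t1, u0, lam, alpha)
U2, V2 = solve_flow(t2, u0, lam, alpha)

samples = [m / 200 for m in range(201)]
hat1 = [interpolant(t1, u0, V1, alpha, s) for s in samples]
hat2 = [interpolant(t2, u0, V2, alpha, s) for s in samples]
gap = max(abs(a - b) for a, b in zip(hat1, hat2))
print("sup-norm gap between the two interpolants on the sample:", gap)
```

The interpolant reproduces the nodal values at the partition points and, by Proposition 3.5, stays within the convex hull of the nodal values in between. The printed gap is only a sampled proxy for the left hand side of (4.13); it says nothing about the constant $C$.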
We are finally able to prove Theorem 4.5. We will follow the same approach as in [28, Theorem 5.10]; we will pass to the limit $\tau_i \downarrow 0$ in $\widehat{U}_i$.

Proof of Theorem 4.5.
Let us first prove uniqueness of energy solutions. Suppose that we have two energy solutions $u_1, u_2$ to (1.2). Let $t \in (0,T)$ be arbitrary and $h > 0$ be sufficiently small so that $(t - h, t + h) \subset [0,T]$. Setting as test function, in the EVI that characterizes $u_1$, the function $w = u_1 - \chi_{(t-h,t+h)} ( u_1 - u_2 )$ and vice versa, and adding the ensuing inequalities we obtain
\[
\int_{t-h}^{t+h} \langle D_c^\alpha u_1(s) - D_c^\alpha u_2(s), u_1(s) - u_2(s) \rangle\, \mathrm{d}s \le 0,
\]
meaning that $\langle D_c^\alpha u_1(t) - D_c^\alpha u_2(t), u_1(t) - u_2(t) \rangle \le 0$ for almost every $t \in [0,T]$.

Define $d(t) = \frac{1}{2} \| u_1(t) - u_2(t) \|^2$. Since $u_1, u_2 \in L^2(0,T;\mathcal{H})$ we clearly have $d \in L^1(0,T;\mathbb{R})$. Furthermore,
\[
⨍_0^t | d(s) |\, \mathrm{d}s \le ⨍_0^t \big( \| u_1(s) - u_0 \|^2 + \| u_2(s) - u_0 \|^2 \big)\, \mathrm{d}s \to 0,
\]
as $t \downarrow 0$, from Definition 4.2. Using (2.18) we then have
\[
D_c^\alpha d(t) \le \langle D_c^\alpha u_1(t) - D_c^\alpha u_2(t), u_1(t) - u_2(t) \rangle \le 0.
\]
Since, in addition, $d \ge 0$ and $⨍_0^t | d(s) |\, \mathrm{d}s \to 0$ as $t \downarrow 0$, we conclude that $d(t) = 0$ for almost every $t$. This proves the uniqueness.

We now turn our attention to existence. Let $\{ \mathcal{P}_k \}_{k=1}^{\infty}$ be a sequence of partitions such that $\tau_k \downarrow 0$ as $k \to \infty$. We denote by $\mathbf{U}^{(k)}$ the discrete solution, on partition $\mathcal{P}_k$, given by (4.7) with $U^{(k)}_0 = u_0$. The symbols $\widehat{U}_k$, $V_k$ and $\overline{F}_k$ carry analogous meaning. Owing to Lemma 4.9 there exists $u \in C([0,T];\mathcal{H})$ such that $\widehat{U}_k$ converges to $u$ in $C([0,T];\mathcal{H})$.

The embedding of Proposition 2.3 and an application of Lemma 4.6 show that there is a subsequence for which $V_{k_j} \rightharpoonup v$ in $L^2(0,T;\mathcal{H})$ as $j \to \infty$. Moreover, we can again appeal to Lemma 4.6 to see that, for every $t \in [0,T]$, the sequence
\[
(t - \cdot)^{\frac{\alpha-1}{2}} V_{k_j}(\cdot)
\]
is uniformly bounded in $L^2(0,t;\mathcal{H})$ so that by passing to a further, not retagged, subsequence
\[
(t - \cdot)^{\frac{\alpha-1}{2}} V_{k_j}(\cdot) \rightharpoonup (t - \cdot)^{\frac{\alpha-1}{2}} v(\cdot) \quad \text{in } L^2(0,t;\mathcal{H}) \tag{4.18}
\]
for any $t \in [0,T]$. This, in addition, shows that $v \in L^2_\alpha(0,T;\mathcal{H})$ so that if we define
\[
\widetilde{u}(t) = u_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} v(s)\, \mathrm{d}s \tag{4.19}
\]
then $D_c^\alpha \widetilde{u} = v$.

Recall that for any $j \in \mathbb{N}$ and any $t \in [0,T]$ we have that
\[
\widehat{U}_{k_j}(t) = u_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} V_{k_j}(s)\, \mathrm{d}s.
\]
Since, for an arbitrary $w \in \mathcal{H}$ we have that $(t - \cdot)^{\frac{\alpha-1}{2}} w$ is in $L^2(0,t;\mathcal{H})$, we can use (4.18) to obtain that
\[
\lim_{j \to \infty} \langle \widehat{U}_{k_j}(t), w \rangle = \lim_{j \to \infty} \left\langle u_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} V_{k_j}(s)\, \mathrm{d}s, w \right\rangle
= \left\langle u_0 + \frac{1}{\Gamma(\alpha)} \int_0^t (t - s)^{\alpha-1} v(s)\, \mathrm{d}s, w \right\rangle = \langle \widetilde{u}(t), w \rangle.
\]
The statement above holds for any $w \in \mathcal{H}$ and all $t \in [0,T]$. Thus,
\[
\widehat{U}_{k_j}(t) \rightharpoonup \widetilde{u}(t) \quad \text{in } \mathcal{H}. \tag{4.20}
\]
However, this implies that $\widetilde{u} = u$, as $\widehat{U}_{k_j}$ converges to $u$ in $C([0,T];\mathcal{H})$. Therefore $D_c^\alpha u = v \in L^2_\alpha(0,T;\mathcal{H})$ and, by Lemma 4.6, we have the estimate
\[
\| v \|_{L^2_\alpha(0,T;\mathcal{H})} \le C \left( \| f \|^2_{L^2_\alpha(0,T;\mathcal{H})} + \Phi(U_0) - \Phi_{\inf} \right)^{1/2},
\]
for some constant $C$ depending on $\alpha$.
As in the proof of Lemma 4.8 this implies that (4.6) holds. From this, we also see that the initial condition is attained in the required sense.

It remains to show that the EVI (4.2) holds for $u$. From the construction of discrete solutions, one derives that for any $w \in L^2(0,T;\mathcal{H})$
\[
\int_0^T \left( \Phi( \widehat{U}_{k_j}(t) ) - \Phi(w(t)) \right) \mathrm{d}t \le \int_0^T \langle \overline{F}_{k_j}(t) - V_{k_j}(t), \widehat{U}_{k_j}(t) - w(t) \rangle\, \mathrm{d}t. \tag{4.21}
\]
We will pass to the limit in this inequality. For the right hand side, it suffices to observe that $\widehat{U}_{k_j} \to u$ in $C([0,T];\mathcal{H})$, $V_{k_j} \rightharpoonup v$ in $L^2(0,T;\mathcal{H})$ and $\overline{F}_{k_j} \to f$ in $L^2(0,T;\mathcal{H})$. Thus,
\[
\int_0^T \langle \overline{F}_{k_j}(t) - V_{k_j}(t), \widehat{U}_{k_j}(t) - w(t) \rangle\, \mathrm{d}t \to \int_0^T \langle f(t) - v(t), u(t) - w(t) \rangle\, \mathrm{d}t.
\]
For the left hand side, the uniform convergence of $\widehat{U}_{k_j}$ and the lower semicontinuity of $\Phi$ give
\[
\Phi(u(t)) \le \liminf_{j \to \infty} \Phi\left( \widehat{U}_{k_j}(t) \right),
\]
and hence
\[
\int_0^T \big( \Phi(u(t)) - \Phi(w(t)) \big)\, \mathrm{d}t \le \int_0^T \langle f(t) - v(t), u(t) - w(t) \rangle\, \mathrm{d}t.
\]
It remains to recall that $D_c^\alpha u = v \in L^2(0,T;\mathcal{H})$ to conclude that, according to Definition 4.2, $u$ is an energy solution. $\square$

Remark 4.10 (other notion of solution). The choice of $u \in L^2(0,T;\mathcal{H})$ and $D_c^\alpha u \in L^2(0,T;\mathcal{H})$ in Definition 4.2 is to guarantee that (4.2) makes sense. It is also necessary in the proof of uniqueness. However, other choices of spaces are also possible. For example, one could consider the following definition instead of Definition 4.2: $u \in L^\infty(0,T;\mathcal{H})$ is a solution to (1.2) if:
(i) $\lim_{t \downarrow 0} ⨍_0^t \| u(s) - u_0 \|\, \mathrm{d}s = 0$;
(ii) $D_c^\alpha u \in L^1(0,T;\mathcal{H})$; and
(iii) for any $w \in L^\infty(0,T;\mathcal{H})$,
\[
\int_0^T \big[ \langle D_c^\alpha u(t), u(t) - w(t) \rangle + \Phi(u(t)) - \Phi(w(t)) \big]\, \mathrm{d}t \le \int_0^T \langle f(t), u(t) - w(t) \rangle\, \mathrm{d}t. \tag{4.22}
\]
Theorem 4.5 also holds for this new definition. However, at least with our techniques, the requirements on the data $u_0 \in D(\Phi)$ and $f \in L^2_\alpha(0,T;\mathcal{H})$ do not change.

5. Fractional gradient flows: Numerics
Since the existence of an energy solution was proved by a rather constructive approach, namely a fractional minimizing movements scheme, it makes sense to provide error analyses for this scheme. We will provide an a priori error estimate which, in light of the smoothness $u \in C^{0,\alpha/2}([0,T];\mathcal{H})$ proved in Theorem 4.5, is optimal. In addition, in the spirit of [30] we will provide an a posteriori error analysis.

5.1. A priori error analysis. The a priori error estimate reads as follows. We comment that this result gives us a better rate compared to [28, Theorem 5.10].
Theorem 5.1 (a priori I). Let $u$ be the energy solution of (1.2). Given a partition $\mathcal{P}$, of maximal step size $\tau$, let $\mathbf{U} \in \mathcal{H}^N$ be the discrete solution defined by (4.7) starting from $U_0 \in D(\Phi)$. Let $\hat{U}_{\mathcal{P}}$ and $\bar{U}_{\mathcal{P}}$ be defined as in (4.10) and (2.3), respectively. Then we have
\[
\left\| u - \hat{U}_{\mathcal{P}} \right\|_{L^\infty(0,T;\mathcal{H})} \le \| u_0 - U_0 \| + C \tau^{\alpha/2} \left( \| f \|_{L^2_\alpha(0,T;\mathcal{H})}^2 + \Phi_0 - \Phi_{\inf} \right)^{1/2}, \tag{5.1}
\]
\[
\sup_{t \in [0,T]} \int_0^t (t-s)^{\alpha-1} \rho\big(u(s), \bar{U}_{\mathcal{P}}(s)\big) \, \mathrm{d}s \le \| u_0 - U_0 \|^2 + C \tau^{\alpha} \left( \| f \|_{L^2_\alpha(0,T;\mathcal{H})}^2 + \Phi_0 - \Phi_{\inf} \right), \tag{5.2}
\]
where $\Phi_0 = \max\{ \Phi(U_0), \Phi(u_0) \}$, and the constant $C$ depends only on $\alpha$.

Proof. The proof can be obtained by following the same procedure employed in the proof of Lemma 4.9. In the current situation, however, instead of comparing two discrete solutions we compare the exact and discrete ones. The only difference is that we allow $U_0 \neq u_0$ here, but this presents no essential difficulty. For brevity, we skip the details. □
5.2. A posteriori error analysis. Let us now provide an a posteriori error estimate between the discretization in (4.7) and the solution of (1.2). We will also show how, from this a posteriori error estimator, an a priori error estimate can be derived. Let us first introduce the a posteriori error estimator.
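Definition 5.2 (error estimator). Let $\mathcal{P}$ be a partition of $[0,T]$ as in (2.2), and $\mathbf{U} \in \mathcal{H}^N$ denote the discrete solution given by (4.7). We define the error estimator function as
\[
\mathcal{E}_{\mathcal{P}}(t) = \mathcal{E}_{\mathcal{P},1}(t) + \mathcal{E}_{\mathcal{P},2}(t), \tag{5.3}
\]
where
\[
\mathcal{E}_{\mathcal{P},1}(t) = \left\langle D_c^\alpha \hat{U}_{\mathcal{P}}(t) - \bar{F}_{\mathcal{P}}(t), \hat{U}_{\mathcal{P}}(t) - \bar{U}_{\mathcal{P}}(t) \right\rangle, \qquad
\mathcal{E}_{\mathcal{P},2}(t) = \Phi(\hat{U}_{\mathcal{P}}(t)) - \Phi(\bar{U}_{\mathcal{P}}(t)).
\]

Notice that the quantity $\mathcal{E}_{\mathcal{P}}(t)$ is nonnegative because $\bar{F}_{\mathcal{P}}(t) - D_c^\alpha \hat{U}_{\mathcal{P}}(t) = F_{n(t)} - (D^\alpha_{\mathcal{P}} \mathbf{U})_{n(t)} \in \partial\Phi(U_{n(t)}) = \partial\Phi(\bar{U}_{\mathcal{P}}(t))$. It is also, in principle, computable since it only depends on data and the discrete solution $\mathbf{U}$. It is then a suitable candidate for an a posteriori error estimator.

The derivation of an a posteriori error estimate begins with the observation that, for any $w \in \mathcal{H}$, we have
\[
\begin{aligned}
&\left\langle D_c^\alpha \hat{U}_{\mathcal{P}}(t) - f(t), \hat{U}_{\mathcal{P}}(t) - w \right\rangle + \Phi(\hat{U}_{\mathcal{P}}(t)) - \Phi(w) \\
&\quad = \mathcal{E}_{\mathcal{P}}(t) + \left\langle \bar{F}_{\mathcal{P}}(t) - D_c^\alpha \hat{U}_{\mathcal{P}}(t), w - \bar{U}_{\mathcal{P}}(t) \right\rangle + \Phi(\bar{U}_{\mathcal{P}}(t)) - \Phi(w) + \left\langle f(t) - \bar{F}_{\mathcal{P}}(t), w - \hat{U}_{\mathcal{P}}(t) \right\rangle \\
&\quad \le \mathcal{E}_{\mathcal{P}}(t) + \left\langle f(t) - \bar{F}_{\mathcal{P}}(t), w - \hat{U}_{\mathcal{P}}(t) \right\rangle - \sigma(\bar{U}_{\mathcal{P}}(t); w).
\end{aligned} \tag{5.4}
\]
In other words, the function $\hat{U}_{\mathcal{P}}$ solves an EVI similar to (4.3) but with additional terms on the right hand side. We can then compare the EVIs by a now standard approach, that is, set $w = u(t)$ in (5.4) and $w = \hat{U}_{\mathcal{P}}(t)$ in (4.3), respectively, to see that
\[
\left\langle D_c^\alpha \big( \hat{U}_{\mathcal{P}} - u \big)(t), \hat{U}_{\mathcal{P}}(t) - u(t) \right\rangle + \sigma(\bar{U}_{\mathcal{P}}(t); u(t)) + \sigma(u(t); \hat{U}_{\mathcal{P}}(t))
\le \mathcal{E}_{\mathcal{P}}(t) + \left\langle f(t) - \bar{F}_{\mathcal{P}}(t), u(t) - \hat{U}_{\mathcal{P}}(t) \right\rangle \tag{5.5}
\]
for almost every $t \in [0,T]$. Consider the following notions of error:
\[
E = \left( \sup_{t \in [0,T]} \left\{ E_{\mathcal{H}}^2(t) + E_\sigma^2(t) \right\} \right)^{1/2}, \qquad
E_{\mathcal{H}}(t) = \| u(t) - \hat{U}_{\mathcal{P}}(t) \|, \tag{5.6}
\]
\[
E_\sigma(t) = \left( \frac{2}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \left[ \sigma(u(s); \hat{U}_{\mathcal{P}}(s)) + \sigma(\bar{U}_{\mathcal{P}}(s); u(s)) \right] \mathrm{d}s \right)^{1/2}.
\]
We have the following error estimate for $E$.

Theorem 5.3 (a posteriori). Let $u$ be the energy solution of (1.2). Let $\mathcal{P}$ be a partition of $[0,T]$ defined as in (2.2) and let $\mathbf{U} \in \mathcal{H}^N$ be the discrete solution given by (4.7) starting from $U_0 \in D(\Phi)$. Let $E$ and $\mathcal{E}_{\mathcal{P}}$ be defined in (5.6) and (5.3), respectively. The following a posteriori error estimate holds
\[
E \le \left( \| u_0 - U_0 \|^2 + \frac{2}{\Gamma(\alpha)} \| \mathcal{E}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})} \right)^{1/2} + \frac{2}{\Gamma(\alpha)} \| f - \bar{F}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})}. \tag{5.7}
\]

Proof. From (2.18) we infer
\[
\begin{aligned}
\frac12 D_c^\alpha \| \hat{U}_{\mathcal{P}} - u \|^2 (t)
&\le \left\langle D_c^\alpha \big( \hat{U}_{\mathcal{P}} - u \big)(t), \hat{U}_{\mathcal{P}}(t) - u(t) \right\rangle \\
&\le \mathcal{E}_{\mathcal{P}}(t) + \left\langle f(t) - \bar{F}_{\mathcal{P}}(t), u(t) - \hat{U}_{\mathcal{P}}(t) \right\rangle - \sigma(\bar{U}_{\mathcal{P}}(t); u(t)) - \sigma(u(t); \hat{U}_{\mathcal{P}}(t)).
\end{aligned}
\]
The claimed a posteriori error estimate (5.7) follows from Lemma 2.8 by setting
\[
\lambda = 0, \quad a(t) = \| (\hat{U}_{\mathcal{P}} - u)(t) \|, \quad b(t) = 2 \left( \sigma(\bar{U}_{\mathcal{P}}(t); u(t)) + \sigma(u(t); \hat{U}_{\mathcal{P}}(t)) \right), \quad c(t) = 2 \mathcal{E}_{\mathcal{P}}(t), \quad d(t) = \| (f - \bar{F}_{\mathcal{P}})(t) \|. \qquad \Box
\]

5.3. Rate of convergence.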
Although we have already established an optimal a priori rate of convergence for our scheme in Theorem 5.1, in this section we study the sharpness of the a posteriori error estimator $\mathcal{E}_{\mathcal{P}}$ by obtaining the same convergence rates through it. We comment that neither in Theorem 5.1 nor in our discussion here do we require any relation between time steps. We will also consider some cases when the rate of convergence can be improved.

5.3.1. Rate of convergence for energy solutions.
Let us now use the estimator E P to derive a conver-gence rate or order O ( τ α/ ) for the error E , defined in (5.6), when f ∈ L α (0 , T ; H ). Notice that suchregularity a priori does not give any order of convergence for k f − F P k L α (0 ,T ; H ) in (5.7). Observealso that the rate that we obtain is consistent with classical gradient flow theories, where an order O ( τ / ) is proved provided that u ∈ D (Φ) and f ∈ L (0 , T ; H ); see [30, Sec 3.2].We first bound kE P k L α (0 ,T ; H ) . Theorem 5.4 (bound on kE P k L α (0 ,T ; H ) ) . Under the assumption that U ∈ D (Φ) , the estimator E P ,defined in (5.3) , satisfies (5.8) kE P k L α (0 ,T ; H ) ≤ Cτ α (cid:16) k f k L α (0 ,T ; H ) + Φ( U ) − Φ inf (cid:17) , where the constant C depends only on α .Proof. We bound the contributions E P , and E P , separately. The bound of E P , follows withoutchange that of the term II of (4.15) in Lemma 4.9. Thus,(5.9) kE P , k L α (0 ,T ; H ) ≤ Cτ α (cid:16) k f k L α (0 ,T ; H ) + Φ( U ) − Φ inf (cid:17) . To bound E P , , we recall the function b Φ P , defined in Remark 4.7, and its properties. Define alsoΦ P ( t ) = Φ( U P ( t )). We have E P , ( t ) = Φ (cid:16) b U P ( t ) (cid:17) − Φ( U P ( t )) ≤ b Φ P ( t ) − Φ P ( t )= 1Γ( α ) ˆ t ( t − s ) α − D αc b Φ P ( s )d s − ˆ ⌈ t ⌉ P ( ⌈ t ⌉ P − s ) α − D αc b Φ P ( s )d s ! = 1Γ( α ) ˆ t [( t − s ) α − − ( ⌈ t ⌉ P − s ) α − ] D αc b Φ P ( s )d s − ˆ ⌈ t ⌉ P t ( ⌈ t ⌉ P − s ) α − D αc b Φ P ( s )d s ! ≤ α ) ˆ t [( t − s ) α − − ( ⌈ t ⌉ P − s ) α − ] (cid:13)(cid:13) F P ( s ) (cid:13)(cid:13) d s − α ) ˆ ⌈ t ⌉ P t ( ⌈ t ⌉ P − s ) α − D αc b Φ P ( s )d s = 14Γ( α ) ˆ t [( t − s ) α − − ( ⌈ t ⌉ P − s ) α − ] (cid:13)(cid:13) F P ( s ) (cid:13)(cid:13) d s − α + 1) ( ⌈ t ⌉ P − t ) α D αc b Φ P ( t )= I ( t ) − I ( t ) . On the one hand, proceeding as in the proof of Lemma 2.6 we obtainsup r ∈ [0 ,T ] ˆ r ( r − t ) α − I ( t )d t ≤ C τ α k F P k L α (0 ,T ; H ) . 
On the other hand, using − I ( t ) ≤ − α + 1) ( ⌈ t ⌉ P − t ) α (cid:18) D αc b Φ P ( t ) − (cid:13)(cid:13) F P ( t ) (cid:13)(cid:13) (cid:19) ≤ τ α Γ( α + 1) (cid:18) (cid:13)(cid:13) F P ( t ) (cid:13)(cid:13) − D αc b Φ P ( t ) (cid:19) IME FRACTIONAL GRADIENT FLOW 29 we have for any r ∈ [0 , T ] that − ˆ r ( r − t ) α − I ( t )d t ≤ τ α Γ( α + 1) ˆ r ( r − t ) α − (cid:18) (cid:13)(cid:13) F ( t ) (cid:13)(cid:13) − D αc b Φ P ( t ) (cid:19) d t = τ α α + 1) ˆ r ( r − t ) α − (cid:13)(cid:13) F P ( t ) (cid:13)(cid:13) d t − τ α α (cid:16)b Φ P ( r ) − Φ( U ) (cid:17) ≤ τ α α + 1) k F P k L α (0 ,T ; H ) + τ α α (Φ( U ) − Φ inf ) . Therefore combining the estimates for I and I we have proved thatsup r ∈ [0 ,T ] ˆ r ( r − t ) α − E P , ( t )d t ≤ C τ α (cid:16) k f k L α (0 ,T ; H ) + Φ( U ) − Φ inf (cid:17) , which together with (5.9) proves (5.8) because E P is nonnegative. (cid:3) We next take advantage of Lemma 2.5, and derive a rate for E without additional smoothnessassumptions on the right hand side f . Theorem 5.5 (a priori II) . Let u be the energy solution of (1.2) . Let P be a partition of [0 , T ] defined as in (2.2) and U ∈ H N be the discrete solution given by (4.7) starting from U ∈ D (Φ) .Let E be defined in (5.6) . Then we have E ≤ k u − U k + Cτ α/ (cid:16) k f k L α (0 ,T ; H ) + Φ( U ) − Φ inf (cid:17) / , where the constant C depends only on α .Proof. We follow closely the approach and notation in Lemma 4.9. Define G ( t ) = 1Γ( α ) ˆ t ( t − s ) α − (cid:0) f ( s ) − F P ( s ) (cid:1) d s and note that, by Lemma 2.5, G satisfies(5.10) τ α/ k G k L ∞ (0 ,T ; H ) + k G k L α (0 ,T ; H ) ≤ C τ α k f k L α (0 ,T ; H ) , where the constant depends only on α . Set e = u − b U P and note that (5.5) can be rewritten as h D αc ( e − G ) ( t ) , ( e − G ) ( t ) i + σ ( U P ( t ); u ( t )) + σ ( u ( t ); b U P ( t )) ≤ E P ( t ) − h D αc ( e − G ) ( t ) , G ( t ) i . Notice the resemblance with (4.17). 
We can thus proceed as in Lemma 4.9, and use Theorem 5.4,to deduce that, for some constant C , depending only on α k u − b U P − G k ( t ) + E σ ( t ) ≤ k u − U k + C τ α (cid:16) k f k L α (0 ,T ; H ) + Φ( U ) − Φ inf (cid:17) . Estimate (5.10) then implies the result. (cid:3)
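The last step of the proof above can be spelled out as follows. This is only a sketch: it assumes that the estimate preceding it is a bound for squared quantities, and the symbol $M$ is notation introduced here for brevity, not taken from the text.

```latex
% Sketch under the stated assumptions. Here
%   M := \|f\|_{L^2_\alpha(0,T;\mathcal H)}^2 + \Phi(U_0) - \Phi_{\inf}
% is shorthand introduced for this illustration only.
% Assumed input:  \|(u-\hat U_{\mathcal P}-G)(t)\|^2 + E_\sigma(t)^2
%                 \le \|u_0-U_0\|^2 + C\tau^\alpha M,
% and, from (5.10),  \|G\|_{L^\infty(0,T;\mathcal H)}
%                 \le C\tau^{\alpha/2}\|f\|_{L^2_\alpha(0,T;\mathcal H)}.
\begin{align*}
\bigl( E_{\mathcal H}(t)^2 + E_\sigma(t)^2 \bigr)^{1/2}
  &\le \bigl( \|(u-\hat U_{\mathcal P}-G)(t)\|^2 + E_\sigma(t)^2 \bigr)^{1/2} + \|G(t)\|
     && \text{(triangle inequality in } \mathbb R^2\text{)} \\
  &\le \bigl( \|u_0-U_0\|^2 + C\tau^\alpha M \bigr)^{1/2} + \|G\|_{L^\infty(0,T;\mathcal H)} \\
  &\le \|u_0-U_0\| + C^{1/2}\tau^{\alpha/2} M^{1/2}
     + C\tau^{\alpha/2}\|f\|_{L^2_\alpha(0,T;\mathcal H)} .
\end{align*}
% Since \Phi(U_0) - \Phi_{\inf} \ge 0, we have \|f\|_{L^2_\alpha} \le M^{1/2};
% taking the supremum over t gives E \le \|u_0-U_0\| + C\tau^{\alpha/2} M^{1/2}
% after renaming the constant.
```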
5.3.2. Rate of convergence for smooth energies. Let us show that, at least for smoother energies, it is possible to obtain a better rate of convergence. We will, essentially, assume that the energy is locally $C^{1,\beta}$ for $\beta \in (0,1]$. To be precise, we assume that there is $\beta \in (0,1]$ such that for every $R > 0$, there is a constant $C_{\beta,R} > 0$ for which
\[
\Phi(w_1) - \Phi(w_2) - \langle \xi_2, w_1 - w_2 \rangle \le C_{\beta,R} \| w_1 - w_2 \|^{1+\beta}, \quad \forall w_1, w_2 \in B_R, \ \xi_2 \in \partial\Phi(w_2), \tag{5.11}
\]
where $B_R$ denotes the ball of radius $R$ in $\mathcal{H}$. Notice that, by Lemma 4.8, all the discrete solutions $\hat{U}_{\mathcal{P}}$ are uniformly bounded in $C([0,T];\mathcal{H})$. Thus, we can fix $\bar{R} > 0$ such that, for all $\mathcal{P}$ and all $t \in [0,T]$, $\hat{U}_{\mathcal{P}}(t) \in B_{\bar{R}}$. Therefore, (5.11) implies that
\[
\Phi(w_1) - \Phi(w_2) - \langle \xi_2, w_1 - w_2 \rangle \le C_{\beta} \| w_1 - w_2 \|^{1+\beta}, \quad \forall w_1, w_2 \in \hat{U}_{\mathcal{P}}([0,T]), \ \xi_2 \in \partial\Phi(w_2), \tag{5.12}
\]
for some constant $C_\beta = C_{\beta,\bar{R}}$.

A particular example to which this situation applies is the following. Let $\mathcal{H} = \mathbb{R}^d$ and $\Phi(w) = \frac1p |w|^p$ with $p > 1$. In this case, (5.12) holds with $\beta = 1$ for $p \ge 2$ and $\beta = p - 1$ for $p \in (1,2)$. For $p < 2$, to reach $\beta = 1$, we must assume that $u$ and $\hat{U}_{\mathcal{P}}$ stay uniformly away from zero. This example can, of course, be generalized.

In this setting, we have the following improved estimate for $\| \mathcal{E}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})}$.

Theorem 5.6 (improved bound). Assume that the energy $\Phi$ satisfies (5.12). Let $u$ be the energy solution to (1.2), and denote by $\mathcal{P}$ a partition of $[0,T]$ defined as in (2.2). Denote by $\hat{U}_{\mathcal{P}}$ the solution of (4.7) starting from $U_0 \in D(\Phi)$. In this setting, the estimator $\mathcal{E}_{\mathcal{P}}$ defined in (5.3) satisfies
\[
\| \mathcal{E}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})} \le C T^{\alpha(1-\beta)/2} \tau^{\alpha(\beta+1)} \left( \| f \|_{L^2_\alpha(0,T;\mathcal{H})}^2 + \Phi(U_0) - \Phi_{\inf} \right)^{(\beta+1)/2}, \tag{5.13}
\]
for some constant $C$ that depends on $\alpha$, $\beta$, and the problem data.

Proof. Owing to (5.12), the estimator $\mathcal{E}_{\mathcal{P}}$ can be bounded from above by
\[
\mathcal{E}_{\mathcal{P}}(t) = \left\langle D_c^\alpha \hat{U}_{\mathcal{P}}(t) - \bar{F}_{\mathcal{P}}(t), \hat{U}_{\mathcal{P}}(t) - \bar{U}_{\mathcal{P}}(t) \right\rangle + \Phi(\hat{U}_{\mathcal{P}}(t)) - \Phi(\bar{U}_{\mathcal{P}}(t)) \le C_\beta \| \hat{U}_{\mathcal{P}}(t) - \bar{U}_{\mathcal{P}}(t) \|^{1+\beta}.
\]
Applying Lemma 2.6 with $p = 1 + \beta$ we have
\[
\| \mathcal{E}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})} \le \sup_{r \in [0,T]} C_\beta \int_0^r (r-t)^{\alpha-1} \| \hat{U}_{\mathcal{P}}(t) - \bar{U}_{\mathcal{P}}(t) \|^{1+\beta} \, \mathrm{d}t \le C \tau^{\alpha(1+\beta)} \left\| D_c^\alpha \hat{U}_{\mathcal{P}} \right\|_{L^{1+\beta}_\alpha(0,T;\mathcal{H})}^{1+\beta},
\]
for some constant $C$ that depends on $\alpha$, $\beta$ and the problem data. Since $1 + \beta \in (1,2]$, the bound of Lemma 4.6 and the inequality
\[
\| w \|_{L^{1+\beta}_\alpha(0,T;\mathcal{H})} \le \| w \|_{L^2_\alpha(0,T;\mathcal{H})} \left( \frac{T^\alpha}{\alpha} \right)^{(1-\beta)/(2(1+\beta))}
\]
imply that
\[
\left\| D_c^\alpha \hat{U}_{\mathcal{P}} \right\|_{L^{1+\beta}_\alpha(0,T;\mathcal{H})}^{1+\beta} \le C T^{\alpha(1-\beta)/2} \left( \| f \|_{L^2_\alpha(0,T;\mathcal{H})}^2 + \Phi(U_0) - \Phi_{\inf} \right)^{(1+\beta)/2},
\]
and this implies the claim. □

Now, in order to obtain a convergence rate using (5.7), we still need to control $\| f - \bar{F}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})}$. To do so, we invoke inequality (2.5) and see that
\[
\| f - \bar{F}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})} \le \left( \frac{q-1}{q\alpha-1} \right)^{(q-1)/q} T^{\alpha - 1/q} \| f - \bar{F}_{\mathcal{P}} \|_{L^q(0,T;\mathcal{H})}
\]
for $q > 1/\alpha$. Thus, if $f \in W^{\alpha(1+\beta)/2,q}(0,T;\mathcal{H})$, then we have
\[
\| f - \bar{F}_{\mathcal{P}} \|_{L^q(0,T;\mathcal{H})} \le C \tau^{\alpha(1+\beta)/2} | f |_{W^{\alpha(1+\beta)/2,q}(0,T;\mathcal{H})}
\]
and hence
\[
\| f - \bar{F}_{\mathcal{P}} \|_{L^1_\alpha(0,T;\mathcal{H})} \le C T^{\alpha - 1/q} \tau^{\alpha(1+\beta)/2} | f |_{W^{\alpha(1+\beta)/2,q}(0,T;\mathcal{H})} \tag{5.14}
\]
for some constant $C$ that depends on $\alpha$ and $q$. Combining this with Theorem 5.6, the following convergence rate is a direct consequence of Theorem 5.3.

Theorem 5.7 (improved rate: smooth energies). Assume that the energy $\Phi$ satisfies (5.12). Let $u$ be the energy solution to (1.2), and denote by $\mathcal{P}$ a partition of $[0,T]$ defined as in (2.2). Denote by $\hat{U}_{\mathcal{P}}$ the solution of (4.7) starting from $U_0 \in D(\Phi)$. In this setting, if there is $q > 1/\alpha$ for which $f \in W^{\alpha(\beta+1)/2,q}(0,T;\mathcal{H})$ then the error $E$, defined in (5.6), satisfies
\[
E \le \| u_0 - U_0 \| + C \tau^{\alpha(\beta+1)/2} \left[ \left( \| f \|_{L^2_\alpha(0,T;\mathcal{H})}^2 + \Phi(U_0) - \Phi_{\inf} \right)^{(\beta+1)/4} + | f |_{W^{\alpha(\beta+1)/2,q}(0,T;\mathcal{H})} \right],
\]
where the constant $C$ depends on $\alpha$, $\beta$, $q$, $T$, and the problem data.
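To make the example $\Phi(w) = \frac1p |w|^p$ concrete, the following sketch checks inequality (5.12) numerically in the scalar case $d = 1$. The explicit constants used here, $C = \frac12 (p-1) R^{p-2}$ for $p \ge 2$ and $C = 2^{2-p}/p$ for $p \in (1,2)$, are assumptions of this illustration (they follow from a Taylor expansion and from the Hölder continuity of $w \mapsto |w|^{p-2} w$, respectively) and are not taken from the text; the function names are likewise hypothetical.

```python
import math
import random


def phi(w: float, p: float) -> float:
    """Energy Phi(w) = |w|^p / p on H = R (scalar case d = 1)."""
    return abs(w) ** p / p


def dphi(w: float, p: float) -> float:
    """Derivative |w|^(p-2) w; for p > 1 this is the only subgradient."""
    if w == 0.0:
        return 0.0
    return math.copysign(abs(w) ** (p - 1.0), w)


def bregman_gap(w1: float, w2: float, p: float) -> float:
    """Left-hand side of (5.12) with xi_2 = Phi'(w2)."""
    return phi(w1, p) - phi(w2, p) - dphi(w2, p) * (w1 - w2)


def smoothness_parameters(p: float, radius: float) -> tuple[float, float]:
    """Return (beta, C) for which (5.12) is expected to hold on [-radius, radius].

    The constants are assumptions of this sketch, not values from the paper.
    """
    if p >= 2.0:
        # Taylor: the second derivative (p-1)|w|^(p-2) is bounded on the ball.
        return 1.0, 0.5 * (p - 1.0) * radius ** (p - 2.0)
    # For 1 < p < 2 the derivative is (p-1)-Hoelder with constant 2^(2-p).
    return p - 1.0, 2.0 ** (2.0 - p) / p


def worst_ratio(p: float, radius: float, samples: int = 20000, seed: int = 0) -> float:
    """Largest sampled value of gap / (C |w1 - w2|^(1+beta)); should not exceed 1."""
    rng = random.Random(seed)
    beta, const = smoothness_parameters(p, radius)
    worst = 0.0
    for _ in range(samples):
        w1 = rng.uniform(-radius, radius)
        w2 = rng.uniform(-radius, radius)
        if abs(w1 - w2) < 1e-3:
            continue  # skip nearly equal points to avoid cancellation noise
        ratio = bregman_gap(w1, w2, p) / (const * abs(w1 - w2) ** (1.0 + beta))
        worst = max(worst, ratio)
    return worst


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        print(p, smoothness_parameters(p, 2.0), worst_ratio(p, 2.0))
```

For $p \ge 2$ the constant grows with the radius, which reflects why the assumption is only local; for $p \in (1,2)$ the exponent drops to $1 + \beta = p$ and the constant no longer depends on the radius.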
5.3.3. Rate of convergence for linear problems.
Let us now show how for certain classes of linearproblems an improved rate of convergence can be obtained. We first assume that we have a Gelfandtriple, V ֒ → H ֒ → V ′ and that(5.15) Φ( w ) = a ( w, w ) , w ∈ V , + ∞ , w / ∈ V . where a : V × V → R is a nonnegative, symmetric, bounded, and semicoercive bilinear form. In thissetting, (4.1) becomes h D αc u, w i + a ( u, w ) = h f, w i , ∀ w ∈ V . Notice that the bilinear form induces an operator A : V → V ′ given by h A v, w i V , V ′ = a ( v, w ) , ∀ v, w ∈ V , which implies that, for almost every t ∈ (0 , T ), we have a problem in V ′ which reads D αc u ( t ) + A u ( t ) = f ( t ) . So that, u ∈ D ( ∂ Φ) is equivalent to A u ∈ H . The bilinear form a also induces a semi-norm on V [ w ] V = a ( w, w ) / . We further assume that f ∈ L α (0 , T ; [ · ] V ). More essentially we also require u ∈ D ( ∂ Φ).The motivation for an improved rate of convergence is then the following, at this stage formal,calculation. From (2.18) we have12 D αc k A u ( t ) k ≤ h D αc A u ( t ) , A u ( t ) i = h A u ( t ) , A D αc u ( t ) i = h f ( t ) − D αc u ( t ) , A D αc u ( t ) i = a ( f ( t ) , D αc u ( t )) − [ D αc u ( t )] V ≤ [ f ( t )] V [ D αc u ( t )] V − [ D αc u ( t )] V . Which then shows via (2.17) thatΓ( α )2 k A u ( t ) k + ˆ t ( t − s ) α − [ D αc u ( s )] V d s ≤ Γ( α )2 k A u k + (cid:18) ˆ t ( t − s ) α − [ f ( s )] V d s (cid:19) / (cid:18) ˆ t ( t − s ) α − [ D αc u ( s )] V d s (cid:19) / . 
This implies that[ D αc u ] L α (0 ,T ;[ · ] V ) = sup t ∈ [0 ,T ] ˆ t ( t − s ) α − [ D αc u ( s )] V d s ≤ Γ( α ) k A u k + k f k L α (0 ,T ;[ · ] V ) , which says that D αc u is uniformly bounded in L α (0 , T ; [ · ] V ).To make these considerations rigorous, we consider the discrete problem (4.7), which in this casereduces to ( D α P U ) n + A U n = F n , Then the computations can be followed verbatim to obtain thatΓ( α )2 k A b U P ( t ) k + ˆ t ( t − s ) α − h D αc b U P ( s ) i d s ≤ Γ( α )2 k A U k + (cid:18) ˆ t ( t − s ) α − h D αc b U P ( s ) i d s (cid:19) / (cid:18) ˆ t ( t − s ) α − (cid:2) F P ( s ) (cid:3) d s (cid:19) / and(5.16) h D αc b U P i L α (0 ,T ;[ · ] V ) = sup t ∈ [0 ,T ] ˆ t ( t − s ) α − h D αc b U P ( s ) i V d s ≤ Γ( α ) k A U k + k F P k L α (0 ,T ;[ · ] V ) . Similar to Lemma 2.4, we know that k F P k L α (0 ,T ;[ · ] V ) ≤ C k f k L α (0 ,T ;[ · ] V ) and hence D αc b U P is uniformly bounded L α (0 , T ; [ · ] V ).With this additional regularity, we can obtain an improved rate of convergence. To see this, wewill use that Φ is, essentially, quadratic to observe that in this case the error estimator, defined in(5.3) reduces to(5.17) E P = 12 a ( b U P − U P , b U P − U P ) = 12 h b U P − U P i V . These ingredients together give us the following improved estimate.
Theorem 5.8 (improved rate: linear problems) . Assume that the energy Φ is given by (5.15) , thatthe initial data satisfies A u ∈ H , and that f ∈ L α (0 , T ; [ · ] V ) . Let u be the energy solution to (1.2) ,and denote by P a partition of [0 , T ] defined as in (2.2) . Denote by b U P the solution to (4.7) startingfrom U ∈ H , such that A U ∈ H . In this setting, we have that (5.18) kE P k L α (0 ,T ; H ) ≤ Cτ α (cid:16) k A U k + k f k L α (0 ,T ;[ · ] V ) (cid:17) , where the constant C depends only on α . This, immediately, implies that E ≤ k u − U k + Cτ α (cid:0) k A U k + k f k L α (0 ,T ;[ · ] V ) (cid:1) + k f − F P k L α , H ) , so that if, in addition, we further have f ∈ W α,q (0 , T ; H ) for some q > /α , then (5.19) E ≤ k u − U k + Cτ α (cid:0) k A U k + k f k L α (0 ,T ;[ · ] V ) + | f | W α,q (0 ,T ; H ) (cid:1) where the constant C depends only on α, q and T .Proof. Owing to Theorem 5.3 and equation (5.14), the convergence rate (5.19) follows directly from(5.18) in the same way as Theorem 5.7. We only need to prove (5.18) and bound kE P k L α (0 ,T ; H ) .Using (5.17), for every r ∈ (0 , T ] we have2 ˆ r ( r − t ) α − E P ( t )d t = ˆ r ( r − t ) α − h b U P − U P i V ( t )d t. Now, we invoke Lemma 2.6 with p = 2 and the semi-norm [ · ] V to obtain that ˆ r ( r − t ) α − h b U P − U P i V ( t )d t ≤ Cτ α h D αc b U P i L α (0 ,T ;[ · ] V ) . By (5.16), we have that D αc b U P ∈ L α (0 , T ; [ · ] V ) uniformly in P and thus arrive at ˆ r ( r − t ) α − h b U P − U P i V ( t )d t ≤ Cτ α (cid:16) k A U k + k f k L α (0 ,T ;[ · ] V ) (cid:17) . This implies the desired bound kE P k L α (0 ,T ; H ) ≤ Cτ α (cid:16) k A U k + k f k L α (0 ,T ;[ · ] V ) (cid:17) for kE P k L α (0 ,T ; H ) and finishes the proof. (cid:3) Lipschitz perturbations
In this section, inspired by the results of [3], we consider the analysis and approximation of a fractional gradient flow with a Lipschitz perturbation. Namely, we consider the following problem
\[
\begin{cases}
D_c^\alpha u(t) + \partial\Phi(u(t)) + \Psi(t, u(t)) \ni f(t), & t \in (0,T], \\
u(0) = u_0.
\end{cases} \tag{6.1}
\]
We assume that the perturbation function $\Psi : (0,T] \times \mathcal{H} \to \mathcal{H}$ satisfies:

1. (Carathéodory) For every $w \in \mathcal{H}$ the mapping $t \mapsto \Psi(t,w)$ is strongly measurable on $(0,T)$ with values in $\mathcal{H}$. Moreover, there exists $L > 0$ such that, for almost every $t \in (0,T)$ and every $w_1, w_2 \in \mathcal{H}$, we have
\[
\| \Psi(t, w_1) - \Psi(t, w_2) \| \le L \| w_1 - w_2 \|.
\]

2. (Integrability) There is $w_0 \in L^2_\alpha(0,T;\mathcal{H})$ for which
\[
t \mapsto \Psi(t, w_0(t)) \in L^2_\alpha(0,T;\mathcal{H}).
\]

We immediately comment that our assumptions can fit the case where $\Phi$ is merely $\lambda$-convex. Moreover, these assumptions also guarantee the existence of $\psi \in L^2_\alpha(0,T;\mathbb{R})$ for which
\[
\| \Psi(t, w) \| \le \psi(t) + L \| w \|, \quad \forall w \in \mathcal{H}.
\]
Consequently $w \mapsto \Psi(\cdot, w(\cdot))$ is Lipschitz continuous in $L^2_\alpha(0,T;\mathcal{H})$.

We introduce the notion of energy solution of (6.1).

Definition 6.1 (energy solution). A function $u \in L^2(0,T;\mathcal{H})$ is an energy solution to (6.1) if

(i) (Initial condition) $\lim_{t \downarrow 0} \frac{1}{t} \int_0^t \| u(s) - u_0 \|^2 \, \mathrm{d}s = 0$.

(ii) (Regularity) $D_c^\alpha u \in L^2(0,T;\mathcal{H})$.

(iii) (Evolution) For almost every $t \in (0,T)$ we have
\[
D_c^\alpha u(t) + \partial\Phi(u(t)) + \Psi(t, u(t)) \ni f(t).
\]

Evidently, an energy solution to (6.1) satisfies, for almost every $t \in (0,T)$ and all $w \in \mathcal{H}$, the EVI
\[
\langle D_c^\alpha u(t), u(t) - w \rangle + \langle \Psi(t, u(t)), u(t) - w \rangle + \Phi(u(t)) - \Phi(w) \le \langle f(t), u(t) - w \rangle. \tag{6.2}
\]

6.1. Existence, uniqueness, and stability.
Our main result in this direction is the following.
Theorem 6.2 (well posedness) . Assume that the energy Φ is convex, l.s.c., and with nonemptyeffective domain. Assume the the mapping Ψ satisfies conditions 1 and 2 stated above. Let u ∈ D (Φ) and f ∈ L α (0 , T ; H ) , then there is a unique energy solution to (6.1) in the sense of Definition 6.1.Moreover, we have that this solution satisfies k D αc u k L α (0 ,T ; H ) ≤ C, where the constant depends only on the problem data α , T , u , f , Φ , and Ψ .Proof. We begin by proving existence. We essentially follow the idea used for the classical ODEs.A similar argument was also used in the proof of [25, Theorem 4.4].For w ∈ L α (0 , T ; H ) we denote by S ( w ) ∈ L α (0 , T ; H ) the energy solution to D αc u ( t ) + ∂ Φ( u ( t )) ∋ f ( t ) − Ψ( t, w ( t )) , a.e. t ∈ (0 , T ] , u (0) = u . Our assumptions and the results of Theorem 4.5 guarantee that this mapping is well defined, andmoreover, S ( w ) ∈ L ∞ (0 , T ; H ). We want to show that there exists a fixed point w such that S ( w ) = w . If u i = S ( w i ) for i = 1 ,
2, then for almost every t we have12 D αc k u ( t ) − u ( t ) k ≤ −h Ψ( t, w ( t )) − Ψ( t, w ( t )) , u ( t ) − u ( t ) i . This readily implies that k u ( t ) − u ( t ) k ≤ L Γ( α ) ˆ t ( t − s ) α − k w ( s ) − w ( s ) kk u ( s ) − u ( s ) k d s ≤ L k u − u k L ∞ (0 ,t ; H ) Γ( α ) ˆ t ( t − s ) α − k w ( s ) − w ( s ) k d s which as a consequence yields that, for every t ∈ [0 , T ], k u − u k L ∞ (0 ,t ; H ) ≤ L Γ( α ) k w − w k L α (0 ,t ; H ) . We claim that by induction, we can further obtain the following stability result(6.3) k S n ( w ) − S n ( w ) k L ∞ (0 ,t ; H ) ≤ L n t αn Γ( αn + 1) k w − w k L ∞ (0 ,t ; H ) for any t ∈ [0 , T ] and positive integer n . In fact, for n = 1, we simply have k u − u k L ∞ (0 ,t ; H ) ≤ L Γ( α ) k w − w k L α (0 ,t ; H ) ≤ L t α Γ( α + 1) k w − w k L ∞ (0 ,t ; H ) . Furthermore, if (6.3) holds for n = k , then for n = k + 1 k S k +1 ( w ) − S k +1 ( w ) k L ∞ (0 ,t ; H ) ≤ L Γ( α ) k S k ( w ) − S k ( w ) k L α (0 ,t ; H ) ≤ L Γ( α ) sup ≤ r ≤ t ˆ r ( r − s ) α − L k s αk Γ( αk + 1) k w − w k L ∞ (0 ,t ; H ) d s = L k +1 t α ( k +1) Γ( α ( k + 1) + 1) k w − w k L ∞ (0 ,t ; H ) , which proves (6.3). Now consider w ∈ L α (0 , T ; H ) and the sequence of functions defined via w n = S n ( w ). It is easy to see that, for n ≥
1, we have w n ∈ L ∞ (0 , T ; H ), and P ∞ n =1 k w n − w n +1 k L ∞ (0 ,T ; H ) converges because ∞ X n =0 L n t αn Γ( αn + 1) = E α ( L t α ) . This shows that w n → u in L ∞ (0 , T ; H ) for some u . Since w n +1 = S ( w n ), it follows immediatelythat u = S ( u ). This proves the existence of solutions.As for uniqueness, assume that we have two solutions u and u , for almost every t , we have12 D αc k u ( t ) − u ( t ) k ≤ −h Ψ( t, u ( t )) − Ψ( t, u ( t )) , u ( t ) − u ( t ) i ≤ L k u ( t ) − u ( t ) k . Combining with the fact that u (0) = u (0) = u , one obtains that k u ( t ) − u ( t ) k = 0 for almostevery t , which proves uniqueness.Finally, the estimate on the Caputo derivative trivially follows from the iteration scheme. Weskip the details. (cid:3) For diversity in our arguments, we present an alternative proof. The arguments here are inspiredby those of [3, Theorem 5.1].
Alternative proof of Theorem 6.2.
Let us, for µ > L /α , define k w k µ = sup t ∈ [0 ,T ] e − µt ˆ t ( t − s ) α − k w ( s ) k d s, which by the obvious inequalities e − µT ≤ e − µt ≤
1, defines an equivalent norm in L α (0 , T ; H ).Let S : L α (0 , T ; H ) → L α (0 , T ; H ) be as before. As shown, if u i = S ( w i ) for i = 1 ,
2, then forevery t we have k u ( t ) − u ( t ) k ≤ L Γ( α ) ˆ t ( t − s ) α − k w ( s ) − w ( s ) kk u ( s ) − u ( s ) k d s, which as a consequence yields that, for every r ∈ [0 , T ], e − µr ˆ r ( r − t ) α − k u ( r ) − u ( r ) k d r ≤ L e − µr Γ( α ) I( r ) , IME FRACTIONAL GRADIENT FLOW 35 where I( r ) = ˆ r ( r − t ) α − ˆ t ( t − s ) α − k w ( s ) − w ( s ) kk u ( s ) − u ( s ) k d s d t. Obvious manipulations then yieldI( r ) ≤ k u − u k µ k w − w k µ ˆ r ( r − t ) α − e µt d t, which implies e − µr ˆ r ( r − t ) α − k u ( r ) − u ( r ) k d r ≤ L Γ( α ) ˆ r ( r − t ) α − e − µ ( r − t ) d t ≤ L µ α < , so that S is a contraction with respect to the norm k · k µ . We conclude then by invoking thecontraction mapping principle. This unique fixed point, evidently, is a energy solution in the senseof Definition 6.1.Uniqueness and stability follow as before. (cid:3) Discretization.
Let us now present the numerical scheme for problem (6.1). We follow the previous notations and conventions regarding discretization so that, for any partition $\mathcal{P}$ of $[0,T]$ defined as in (2.2), we can also consider the discrete solution defined recursively via
\[
F_n - (D^\alpha_{\mathcal{P}} \mathbf{U})_n - \Psi_n(U_n) \in \partial\Phi(U_n), \tag{6.4}
\]
where $F_n$ is defined in (4.8) and $\Psi_n : \mathcal{H} \to \mathcal{H}$ is defined by
\[
\Psi_n(w) = \frac{1}{\tau_n} \int_{t_{n-1}}^{t_n} \Psi(t, w) \, \mathrm{d}t.
\]
Clearly, for every $n$, $\Psi_n$ is Lipschitz continuous with Lipschitz constant $L$. Using the definition of $D^\alpha_{\mathcal{P}}$ in (3.3) and $K^{-1}_{\mathcal{P},nn} = (K_{\mathcal{P},nn})^{-1} = \Gamma(\alpha+1) \tau_n^{-\alpha}$, we can rewrite (6.4) as
\[
\Gamma(\alpha+1) \tau_n^{-\alpha} U_n + \Psi_n(U_n) + \partial\Phi(U_n) \ni F_n - \sum_{i=0}^{n-1} K^{-1}_{\mathcal{P},ni} U_i.
\]
Hence the discrete scheme can be recursively well-defined provided $L \tau^\alpha < \Gamma(\alpha+1)$. For this reason, moving forward, we will implicitly operate under this assumption.

It is possible to show that the discrete solutions in (6.4) satisfy
\[
\| D_c^\alpha \hat{U}_{\mathcal{P}} \|_{L^2_\alpha(0,T;\mathcal{H})} \le C, \tag{6.5}
\]
with a constant that depends on problem data but is independent of the partition $\mathcal{P}$. To see this, we follow the arguments of either proof of Theorem 6.2, and realize that while the operator $S$ may depend on $\mathcal{P}$, the estimates that we obtain do not.

6.3. Error estimates.
Let us now show how to derive error estimates for the problem with Lipschitzperturbation (6.1). We recall that the energy solution u to this problem satisfies (6.2). In addition,for simplicity, we will operate under the assumption that the perturbation does not depend explicitlyon time, i.e., Ψ( t, w ) = Ψ( w ) for all w ∈ H . The general case only lengthens the discussion butbrings nothing substantive to it, as the additional terms that appear can be controlled via argumentsused to control terms of the form f ( t ) − F P ( t ) . Similar to the discussion before, we define the error estimator(6.6) E P , L ( t ) = E P ( t ) + h Ψ( U P ( t )) , b U P ( t ) − U P ( t ) i , which, as before, is nonnegative. In addition, for any w ∈ H we have h D αc b U P ( t ) + Ψ( b U P ( t )) − f ( t ) , b U P ( t ) − w i + Φ( b U P ( t )) − Φ( w )= E P , L ( t ) + h F P ( t ) − Ψ( U P ( t )) − D αc b U P ( t ) , w − U P ( t ) i + Φ( U P ( t )) − Φ( w )+ h Ψ( U P ( t )) − Ψ( b U P ( t )) + f ( t ) − F P ( t ) , w − b U P ( t ) i≤ E P , L ( t ) + h Ψ( U P ( t )) − Ψ( b U P ( t )) + f ( t ) − F P ( t ) , w − b U P ( t ) i − σ ( U P ( t ); w ) . Setting w = u ( t ) in the inequality above and setting w = b U ( t ) in (6.2) leads to(6.7) D D αc (cid:16) b U P − u (cid:17) ( t ) , b U P ( t ) − u ( t ) E + σ ( U P ( t ); u ( t )) + σ ( u ( t ); b U P ( t )) ≤E P , L ( t ) + h Ψ( U P ( t )) − Ψ( b U P ( t )) + f ( t ) − F P ( t ) , u ( t ) − b U P ( t ) i + h Ψ( b U P ( t )) − Ψ( u ( t )) , u ( t ) − b U P ( t ) i for almost every t ∈ (0 , T ). This implies the following error estimates. Theorem 6.3 (a posteriori: Lipschitz perturbations) . Let u be the unique energy solution of (6.1) .Let P be a partition of [0 , T ] defined as in (2.2) and let U ∈ H N be the discrete solution given by (6.4) starting from U ∈ D (Φ) . 
Let E and E P , L be defined in (5.6) and (6.6) , respectively, Thefollowing a posteriori error estimate holds (6.8) E ≤ (cid:18) k u − U k + 2Γ( α ) kE P , L k L α (0 ,T ; H ) (cid:19) / ( E α (2 L T α )) / + 2Γ( α ) (cid:16) k f − F P k L α (0 ,T ; H ) + L k U P − b U P k L α (0 ,T ; H ) (cid:17) E α (2 L T α ) . Proof.
We argue as in the proof of (5.3). To make formulas shorter we omit the coercivity terms.From (2.18) and (6.7) we infer(6.9) 12 D αc k b U P − u k ( t ) ≤ D D αc (cid:16) b U P − u (cid:17) ( t ) , b U P ( t ) − u ( t ) E ≤ E P , L ( t ) + h Ψ( t, U P ( t )) − Ψ( t, b U P ( t )) + f ( t ) − F P ( t ) , u ( t ) − b U P ( t ) i + L k b U P ( t ) − u ( t ) k ( t ) ≤ E P , L ( t ) + (cid:16) L k U P ( t ) − b U P ( t ) k + k f ( t ) − F P ( t ) k (cid:17) k b U P ( t ) − u ( t ) k + L k b U P ( t ) − u ( t ) k . Then the error estimate (6.8) follows from Lemma 2.8 with λ = L , a ( t ) = k ( b U P − u )( t ) k , b = 0 , c = 2 E P , L ( t ) , d ( t ) = L k U P ( t ) − b U P ( t ) k + k ( f − F P )( t ) k . (cid:3) We also comment here that by Lemma 2.6 k U P − b U P k L α (0 ,T ; H ) ≤ Cτ α k D αc b U P k L α (0 ,T ; H ) ≤ CT α/ α / τ α k D αc b U P k L α (0 ,T ; H ) , where the constant C only depends on α . In addition, the norm on the right hand side is boundedindependently of the partition P ; see (6.5). Hence the convergence rates proved in Theorems 5.5and 5.7 also hold for problems with a Lipschitz perturbation. Since the proofs are almost identical,we only state the theorems below without proofs. Theorem 6.4 (convergence rate: Lipschitz perturbations) . Let u be the energy solution of (6.1) .Let P be a partition of [0 , T ] defined as in (2.2) and U ∈ H N be the discrete solution given by (6.4) starting from U ∈ D (Φ) . Let E be defined in (5.6) . Then we have E ≤ k u − U k ( E α (2 L T α )) / + Cτ α/ (cid:16) k f k L α (0 ,T ; H ) + k D αc b U P k L α (0 ,T ; H ) (cid:17) , where the constant C depends only on α, L and T , but not on P . IME FRACTIONAL GRADIENT FLOW 37
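The stability factor $E_\alpha(2 L T^\alpha)$ that appears in (6.8) and in the estimate above is a value of the Mittag-Leffler function, which the existence proof of Theorem 6.2 writes as the series $\sum_{n \ge 0} z^n / \Gamma(\alpha n + 1)$. As a rough illustration of how quickly this factor grows with $L$ and $T$, the sketch below sums that series directly. The function names, the stopping rule and the term limit are choices made for this example only; plain summation is reasonable for moderate $|z|$ and is not meant as a robust evaluator for large arguments.

```python
import math


def mittag_leffler(alpha: float, z: float, rel_tol: float = 1e-15,
                   max_terms: int = 10_000) -> float:
    """Sum E_alpha(z) = sum_{n>=0} z^n / Gamma(alpha*n + 1) term by term.

    Illustrative only: suitable for moderate |z|; for large negative z the
    alternating sum suffers from cancellation.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if z == 0.0:
        return 1.0
    log_abs_z = math.log(abs(z))
    total = 0.0
    previous = 0.0
    for n in range(max_terms):
        # Work with logarithms so that z^n and Gamma do not overflow separately.
        magnitude = math.exp(n * log_abs_z - math.lgamma(alpha * n + 1.0))
        sign = -1.0 if (z < 0.0 and n % 2 == 1) else 1.0
        total += sign * magnitude
        # The term magnitudes rise to a single peak and then decay, so we
        # only stop once they are decreasing and negligible.
        if magnitude < previous and magnitude <= rel_tol * abs(total):
            return total
        previous = magnitude
    raise ArithmeticError("series did not converge within max_terms")


def stability_factor(alpha: float, lipschitz: float, horizon: float) -> float:
    """Value E_alpha(2 L T^alpha) for Lipschitz constant L and final time T."""
    return mittag_leffler(alpha, 2.0 * lipschitz * horizon ** alpha)


if __name__ == "__main__":
    for alpha in (0.3, 0.5, 0.7, 1.0):
        print(alpha, stability_factor(alpha, lipschitz=1.0, horizon=1.0))
```

For $\alpha = 1$ the factor reduces to $e^{2LT}$, the constant familiar from the classical Gronwall inequality, which gives a quick sanity check of the routine.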
Theorem 6.5 (improved rate: smooth energies and Lipschitz perturbations) . Assume that theenergy Φ satisfies (5.12) . Let u be the energy solution to (6.1) , and denote by P a partition of [0 , T ] defined as in (2.2) . Denote by b U P the solution of (6.4) starting from U ∈ D (Φ) . In this setting, ifthere is q > /α for which f ∈ W α ( β +1) / ,q (0 , T ; H ) then the error E , defined in (5.6) , satisfies (6.10) E ≤ k u − U k ( E α (2 L T α )) / + C τ α k D αc b U P k L α (0 ,T ; H ) + C τ α ( β +1) / (cid:20)(cid:16) k f k L α (0 ,T ; H ) + k D αc b U P k L α (0 ,T ; H ) (cid:17) ( β +1) / + | f | W α ( β +1) / ,q (0 ,T ; H ) (cid:21) , where the constants C and C depend only on α, β, q, L , T , and the problem data, but are independentof P . Finally we consider the setting of Section 5.3.3 with a Lipschitz perturbation. Similar to (6.5),we can show that k D αc b U P k L α (0 ,T ;[ · ] V ) is bounded uniformly with respect to the partition P . For thisreason, an improved error estimate analogous to Theorem 5.8 can be proved in this case. Theorem 6.6 (improved rate: quadratic energies and Lipschitz perturbations) . Assume that theenergy Φ is given by (5.15) , that the initial data satisfies A u ∈ H , and that f ∈ L α (0 , T ; [ · ] V ) . Let u be the energy solution to (6.1) , and denote by P a partition of [0 , T ] defined as in (2.2) . Denoteby b U P the solution to (6.4) starting from U ∈ H , such that A U ∈ H . In this setting, we have that (6.11) E ≤ k u − U k ( E α (2 L T α )) / + C k f − F k L α (0 ,T ; H ) + Cτ α (cid:16) k A U k + k f k L α (0 ,T ;[ · ] V ) + k D αc b U P k L α (0 ,T ;[ · ] V ) + k D αc b U P k L α (0 ,T ; H ) (cid:17) where the constant C depends only on α, L and T . Numerical illustrations
In this section we present some simple numerical examples aimed at illustrating, and extending, our theory. All the computations were done with an in-house code that was written in MATLAB©.

7.1. Practical a posteriori estimators.
We begin by commenting that, unlike the a posteriori estimators for the classical gradient flow proposed in [30], our a posteriori estimator $\mathcal{E}_{\mathcal{P}}$ is not constant on each subinterval of our partition $\mathcal{P}$; see (5.3). Here we mention more computationally friendly alternatives, and their properties.

First, we define an estimator that is piecewise constant in time via
\[
\mathcal{D}_{\mathcal{P}}(t) = \max_{s \in [\lfloor t \rfloor_{\mathcal{P}}, \lceil t \rceil_{\mathcal{P}}]} \left\{ \left\langle D_c^\alpha \hat{U}_{\mathcal{P}}(s) - \bar{F}_{\mathcal{P}}(s), \hat{U}_{\mathcal{P}}(s) - \bar{U}_{\mathcal{P}}(s) \right\rangle + \Phi(\hat{U}_{\mathcal{P}}(s)) - \Phi(\bar{U}_{\mathcal{P}}(s)) \right\}.
\]
This is clearly an upper bound for $\mathcal{E}_{\mathcal{P}}(t)$.

One may also consider the simpler indicator
\[
\tilde{\mathcal{E}}_{\mathcal{P},n} = \left\langle (D^\alpha_{\mathcal{P}} \mathbf{U})_n - F_n, U_{n-1} - U_n \right\rangle + \Phi(U_{n-1}) - \Phi(U_n), \quad n = 1, \ldots, N. \tag{7.1}
\]
Although it is not always true that $\mathcal{E}_{\mathcal{P}}(t) \le \tilde{\mathcal{E}}_{\mathcal{P},n(t)}$, this indicator is convenient to use in practice and gives reasonable results. In fact, this is the one that we implemented in the numerical examples of Section 7.3 below.

7.2. A linear one-dimensional example.
As a first simple example we consider the one dimensional fractional ODE
$$ D_c^\alpha u + \lambda u = 0, \qquad u(0) = 1, \tag{7.2} $$
with $\lambda > 0$. From (2.19) we have $u(t) = E_\alpha(-\lambda t^\alpha)$. This, obviously, fits our framework with $H = \mathbb{R}$, and $\Phi(w) = \frac{\lambda}{2}|w|^2$. Notice also that all the assumptions of Section 5.3.2 are satisfied with $\beta = 1$. Thus, we expect a rate of order $O(\tau^\alpha)$ when using (4.7) to approximate the solution over a uniform partition with time step $\tau$.

             α = 0.                    α = 0.                    α = 0.
τ            |u(1) − U_N|  rate        |u(1) − U_N|  rate        |u(1) − U_N|  rate
5.000e-02    4.563e-04     --          2.829e-04     --          1.235e-04     --
2.500e-02    3.702e-04     0.301417    1.996e-04     0.503051    7.571e-05     0.705417
1.250e-02    3.005e-04     0.300979    1.409e-04     0.502309    4.646e-05     0.704620
6.250e-03    2.440e-04     0.300664    9.954e-05     0.501710    2.852e-05     0.703871
3.125e-03    1.981e-04     0.300445    7.032e-05     0.501248    1.752e-05     0.703207
1.563e-03    1.609e-04     0.300297    4.969e-05     0.500902    1.076e-05     0.702638
7.813e-04    1.307e-04     0.300199    3.512e-05     0.500648    6.616e-06     0.702160
3.906e-04    1.061e-04     0.300133    2.483e-05     0.500463    4.068e-06     0.701764
1.953e-04    8.619e-05     0.300090    1.755e-05     0.500330    2.502e-06     0.701437
9.766e-05    7.001e-05     0.300062    1.241e-05     0.500235    1.539e-06     0.701170
4.883e-05    5.686e-05     0.300043    8.773e-06     0.500166    9.465e-07     0.700952
2.441e-05    4.619e-05     0.300030    6.203e-06     0.500118    5.823e-07     0.700774

Table 1. Convergence rate for the approximation of (7.2) using scheme (4.7) over a uniform partition of size $\tau$. As predicted by Section 5.3.2, the rate is $O(\tau^\alpha)$.

Figure 2. Adaptive time stepping for problem (7.2) with T = 1, λ = 1, α = is used to achieve a tolerance of ε = 10 − . The adaptive solver uses 8,747 time intervals with minimum time step 6. × − and max time step 5. × − .

Table 1 shows, for $\lambda = 0.001$ and different values of $\alpha$, the difference $|u(1) - U_N|$ which we use as a proxy for the error $E_H$ of (5.6). The rate of convergence is verified.
7.3. Adaptive time stepping.
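Since the exact solution of (7.2) is $u(t) = E_\alpha(-\lambda t^\alpha)$, reference values for the error computations on this linear problem, which is used in Sections 7.2 and 7.3, can be generated from the Mittag-Leffler function. The following Python sketch assumes that a truncated power series $E_\alpha(z) = \sum_k z^k / \Gamma(\alpha k + 1)$ is accurate enough; this is reasonable for moderate values of $\lambda t^\alpha$ but not for large arguments, where cancellation degrades the result. The function names and the truncation parameters are hypothetical choices, not those of the original code.

```python
import math


def mittag_leffler(alpha, z, terms=200):
    """Truncated series for E_alpha(z); intended for moderate |z| only."""
    total = 0.0
    for k in range(terms):
        arg = alpha * k + 1.0
        if arg > 170.0:  # math.gamma overflows for arguments slightly above 171
            break
        total += z ** k / math.gamma(arg)
    return total


def exact_solution(t, alpha, lam):
    """Exact solution u(t) = E_alpha(-lam * t**alpha) of problem (7.2)."""
    return mittag_leffler(alpha, -lam * t ** alpha)
```

For $\alpha = 1$ the series reduces to the exponential, and for $\alpha = 1/2$ one has $E_{1/2}(-x) = e^{x^2}\operatorname{erfc}(x)$; both identities give quick sanity checks of the truncation.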
We now illustrate the use of the a posteriori error estimator $E_{\mathcal{P}}$ given in (5.3) to drive the selection of the size of the time step. For a given tolerance $\varepsilon$ we, at every step, choose the local time step $\tau_n$ to guarantee that
$$ \frac{2 T^\alpha}{\Gamma(\alpha + 1)} \widetilde{E}_{\mathcal{P},n} \le \varepsilon, $$
where $\widetilde{E}_{\mathcal{P},n}$ is given in (7.1). Then, by Theorem 5.3, we expect that
$$ \| u - \widehat{U}_{\mathcal{P}} \|_{L^\infty(0,T;H)} \le \varepsilon, $$
provided the approximation error $\| f - F_{\mathcal{P}} \|_{L_\alpha(0,T;H)}$ is negligible. Notice that to drive the process we are using the simpler estimator $\widetilde{E}_{\mathcal{P}}$; see the discussion in Section 7.1.
We consider the linear problem (7.2) with λ = 1 and α = and set ε = 10 − . Figure 2 shows the local time step $\tau(t)$ for $t \in [0, T]$. As expected, due to the weak singularity of $u$ at $t = 0$ the time step must be rather small for small times. For larger times, however, the solution is smoother and larger local time steps can be taken. With this process we obtain that
$\| u - \widehat{U}_{\mathcal{P}} \|_{L^\infty(0,T;H)} \approx$ . × − ,
and this requires N = 8,747 time subintervals. For comparison, choosing a uniform time step of τ = 6. × − we require N = 163,840 time intervals. This achieves an error of ε = 4. × − , which is slightly higher than that obtained with our adaptive procedure. This clearly shows the advantages and possibilities for this strategy.

τ            |U(N_k) − U(N_{k−1})|   rate
7.813e-04    --                      --
3.906e-04    1.256e-06               --
1.953e-04    6.276e-07               1.001307
9.766e-05    3.135e-07               1.001298
4.883e-05    1.568e-07               0.999272
2.441e-05    7.827e-08               1.002774
1.221e-05    3.924e-08               0.996178

Table 2. Convergence rate for α = 0. p = 1.5, and λ = 1 in Example 1 of Section 7.4. The rate seems to be of order $O(\tau)$, which is better than what the theory predicts.

7.4. Some nonlinear one dimensional examples.
We now, while staying in one dimension, depart from the linear theory and illustrate the performance of our method in a series of nonlinear examples of increasing difficulty. In all the examples we set $H = \mathbb{R}$ and $f = 0$. Thus, we will only specify the energy and initial condition in each case.
In all the examples, since the exact solution is not known, we compare the solutions at different time levels. Specifically, we let $\tau_k = 2^{-k}$ and upon denoting by $U(N_k)$ the approximate solution at $T = 1$ computed with step size $\tau_k$, we compute
$$ \mathrm{rate}_k = \log_2\left( |U(N_{k-1}) - U(N_{k-2})| \right) - \log_2\left( |U(N_k) - U(N_{k-1})| \right). $$
7.4.1. Example 1.
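The observed rates reported in the tables of this section follow directly from the formula for $\mathrm{rate}_k$. A minimal Python sketch, assuming that the input list holds the approximations at the final time computed with successively halved time steps (the function name is hypothetical):

```python
import math


def observed_rates(values):
    """Observed rates from approximations at the final time.

    values[k] is the approximation computed with step size tau_k, where
    each tau_k is half of the previous one. Entry j of the result is
    log2(|values[j+1] - values[j]|) - log2(|values[j+2] - values[j+1]|).
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return [math.log2(d0) - math.log2(d1) for d0, d1 in zip(diffs, diffs[1:])]
```

At least three approximations are needed to produce one rate, which is why the first two rows of the rate column in the tables are empty.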
We let $p \in (1, 2)$ and set
$$ \Phi(w) = \frac{\lambda}{p} |w|^p, \qquad u_0 = \frac{1}{10}. $$
Notice that this example fits the framework of Section 5.3.2 with $\beta = p - 1$. However, as mentioned there, it is not expected that the solution reaches zero in finite time, so we do not expect a reduced rate.
To compute the discrete solution, at every time step, we need to solve a nonlinear equation of the form
$$ U_n + c\, U_n |U_n|^{p-2} - W_n = 0, \qquad c = \frac{\lambda \tau^\alpha}{\Gamma(\alpha + 1)}, $$
where $W_n$ is known. We found the solution to this problem using Newton's method, which works for small values of $\tau$.
Table 2 shows the results for α = 0. p = 1.5, and λ = 1. These clearly indicate a rate of $O(\tau)$.
7.4.2. Example 2.
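Both Example 1 and Example 2 require, at every time step, the solution of a scalar equation of the form $U_n + c\, g(U_n) - W_n = 0$, with $g(U) = U|U|^{p-2}$ in Example 1 and $g(U) = \ln U$ in Example 2. The following Python sketch shows one way such a Newton iteration could look. It is not the authors' MATLAB implementation: the function name, the stopping rule, the positivity safeguard, and the sample values of $W$ and $c$ are all assumptions made for illustration.

```python
import math


def solve_step(W, c, g, dg, U_init, tol=1e-13, max_iter=200):
    """Newton iteration for U + c*g(U) - W = 0, looking for a root U > 0.

    g and dg are the nonlinearity and its derivative; U_init must be positive.
    """
    U = U_init
    for _ in range(max_iter):
        residual = U + c * g(U) - W
        if abs(residual) <= tol:
            break
        U_new = U - residual / (1.0 + c * dg(U))
        # Assumed safeguard: halve the iterate instead of leaving U > 0.
        U = U_new if U_new > 0.0 else 0.5 * U
    return U


# Example 1 type nonlinearity (p = 1.5); sample values W = 0.1, c = 0.01.
p = 1.5
U_pow = solve_step(0.1, 0.01,
                   lambda U: U * abs(U) ** (p - 2),
                   lambda U: (p - 1) * abs(U) ** (p - 2),
                   U_init=0.1)

# Example 2 type nonlinearity; the iteration starts from a very small
# positive number because ln(U) is undefined at U = 0.
U_log = solve_step(0.0, 0.01, math.log, lambda U: 1.0 / U, U_init=1e-12)
```

For the logarithmic case the residual is increasing and concave on $(0, \infty)$, so Newton iterates started to the left of the root increase monotonically towards it. This is consistent with starting the iteration from a very small positive number.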
We set
$$ \Phi(w) = \lambda (w \ln w - w), \qquad u_0 = 0, $$
with $\lambda > 0$, so that $D(\Phi) = [0, \infty)$. Notice that $u_0 \in D(\Phi) \setminus D(\partial\Phi)$.
At each time step one needs to solve a problem of the form
$$ U_n + c \ln(U_n) - W_n = 0, \qquad c = \frac{\lambda \tau^\alpha}{\Gamma(\alpha + 1)}, $$
and $W_n$ is known. This is solved with a Newton scheme, which runs into difficulties at the initial time step. We go around this issue by using as initial value for the iteration a very small positive number.
Table 3 presents the results for α = 0. λ = 10 − . These indicate that the convergence rate is $O(\tau^\alpha)$. Similar results for other choices of $\alpha$ and $\lambda$ were obtained.

τ            |U(N_k) − U(N_{k−1})|   rate
5.000e-02    --                      --
2.500e-02    6.761e-07               --
1.250e-02    5.330e-07               0.342991
6.250e-03    4.088e-07               0.382916
3.125e-03    3.077e-07               0.409583
1.563e-03    2.290e-07               0.426328
7.813e-04    1.689e-07               0.439295
3.906e-04    1.238e-07               0.447818
1.953e-04    9.039e-08               0.453936
9.766e-05    6.579e-08               0.458399
4.883e-05    4.777e-08               0.461709
2.441e-05    3.463e-08               0.464209
1.221e-05    2.507e-08               0.466136
6.104e-06    1.813e-08               0.467656
3.052e-06    1.310e-08               0.468883

Table 3. Convergence rate for α = 0. λ = 10 − in Example 2 of Section 7.4. The rate seems to be of order $O(\tau^\alpha)$, which is better than what the theory predicts.

τ            |U(N_k) − U(N_{k−1})|   rate
5.000e-02    --                      --
2.500e-02    3.370e-07               --
1.250e-02    1.881e-07               0.840996
6.250e-03    1.033e-07               0.864944
3.125e-03    5.607e-08               0.881286
1.563e-03    3.019e-08               0.893168
7.813e-04    1.615e-08               0.902281
3.906e-04    8.599e-09               0.909574
1.953e-04    4.559e-09               0.915606
9.766e-05    2.408e-09               0.920723
4.883e-05    1.268e-09               0.925149
2.441e-05    6.660e-10               0.929039
1.221e-05    3.490e-10               0.932497
6.104e-06    1.825e-10               0.935603

Table 4. Convergence rate for α = 0. λ = 10 − in Example 3 of Section 7.4. The rate seems to be of order $O(\tau)$, which is better than what the theory predicts.

7.4.3. Example 3.
As a final example we consider Φ( w ) = − λ q − (1 − u ) , u = 0 . Notice that $D(\Phi) = [0, \infty)$ and, once again, $u_0 \in D(\Phi) \setminus D(\partial\Phi)$. Table 4 presents the results for α = 0. λ = 10 − . We, again, seem to get a rate that is better than what the theory predicts.
Acknowledgement
AJS is partially supported by NSF grant DMS-1720213.
References
1. E. Affili and E. Valdinoci, Decay estimates for evolution equations with classical and fractional time-derivatives, J. Differential Equations (2019), no. 7, 4027–4060. MR 3912710
2. R.P. Agarwal and B. Ahmad, Existence theory for anti-periodic boundary value problems of fractional differential equations and inclusions, Comput. Math. Appl. (2011), no. 3, 1200–1214. MR 2824708
3. G. Akagi, Fractional flows driven by subdifferentials in Hilbert spaces, Israel J. Math. (2019), no. 2, 809–862. MR 4040846
4. M. Allen, Hölder regularity for nondivergence nonlocal parabolic equations, Calc. Var. Partial Differential Equations (2018), no. 4, Paper No. 110, 29. MR 3826717
5. M. Allen, A nondivergence parabolic problem with a fractional time derivative, Differential Integral Equations (2018), no. 3-4, 215–230. MR 3738196
6. M. Allen, L. Caffarelli, and A. Vasseur, A parabolic problem with a fractional time derivative, Arch. Ration. Mech. Anal. (2016), no. 2, 603–630. MR 3488533
7. M. Allen, L. Caffarelli, and A. Vasseur, Porous medium flow with both a fractional potential pressure and fractional time derivative, Chin. Ann. Math. Ser. B (2017), no. 1, 45–82. MR 3592156
8. I. Benedetti, V. Obukhovskii, and V. Taddei, On noncompact fractional order differential inclusions with generalized boundary condition and impulses in a Banach space, J. Funct. Spaces (2015), Art. ID 651359, 10. MR 3335453
9. A. Bernardis, F.J. Martín-Reyes, P.R. Stinga, and J.L. Torrea, Maximum principles, extension problem and inversion for nonlocal one-sided equations, J. Differential Equations (2016), no. 7, 6333–6362. MR 3456835
10. L. Caffarelli and L. Silvestre, An extension problem related to the fractional Laplacian, Comm. Partial Differential Equations (2007), no. 7-9, 1245–1260. MR 2354493
11. M. Caputo, Linear models of dissipation whose Q is almost frequency independent-II, Geophysical Journal of the Royal Astronomical Society (1967), no. 5, 529–539.
12. A. Cernea, On a fractional differential inclusion arising from real estate asset securitization and HIV models, Ann. Univ. Buchar. Math. Ser. (2013), no. 2, 447–453. MR 3164777
13. F.H. Clarke, Optimization and nonsmooth analysis, second ed., Classics in Applied Mathematics, vol. 5, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1990. MR 1058436
14. B. de Andrade and T.S. Cruz, Regularity theory for a nonlinear fractional reaction-diffusion equation, Nonlinear Anal. (2020), 111705, 14. MR 4080675
15. D. del Castillo-Negrete, Fractional diffusion models of nonlocal transport, Phys. Plasmas (2006), no. 8, 082308, 16. MR 2249732
16. X. Feng and M. Sutton, A new theory of fractional differential calculus, arXiv:2007.10244, 2020.
17. Y. Feng, L. Li, J.-G. Liu, and X. Xu, Continuous and discrete one dimensional autonomous fractional ODEs, Discrete Contin. Dyn. Syst. Ser. B (2018), no. 8, 3109–3135. MR 3848192
18. I.M. Gel'fand and G.E. Šilov, Obobshchennye funksii i deĭstviya iad nimi, Obobščennye funkcii, Vypusk 1., Gosudarstv. Izdat. Fiz.-Mat. Lit., Moscow, 1958, (In Russian). MR 0097715
19. R. Gorenflo, A.A. Kilbas, F. Mainardi, and S.V. Rogosin, Mittag-Leffler functions, related topics and applications, Springer Monographs in Mathematics, Springer, Heidelberg, 2014. MR 3244285
20. L. Grafakos, Classical Fourier analysis, third ed., Graduate Texts in Mathematics, vol. 249, Springer, New York, 2014. MR 3243734
21. T.D. Ke, N.N. Thang, and L. Tran P. Thuy, Regularity and stability analysis for a class of semilinear nonlocal differential equations in Hilbert spaces, J. Math. Anal. Appl. (2020), no. 2, 123655, 23. MR 4037586
22. J. Kemppainen, J. Siljander, V. Vergara, and R. Zacher, Decay estimates for time-fractional and other non-local in time subdiffusion equations in $\mathbb{R}^d$, Math. Ann. (2016), no. 3-4, 941–979. MR 3563229
23. M. Krasnoschok, V. Pata, S.V. Siryk, and N. Vasylyeva, A subdiffusive Navier-Stokes-Voigt system, Phys. D (2020), 132503, 13. MR 4087352
24. M. Krasnoschok, V. Pata, and N. Vasylyeva, Semilinear subdiffusion with memory in multidimensional domains, Math. Nachr. (2019), no. 7, 1490–1513. MR 3982325
25. L. Li and J.-G. Liu, A generalized definition of Caputo derivatives and its application to fractional ODEs, SIAM J. Math. Anal. (2018), no. 3, 2867–2900. MR 3809535
26. L. Li and J.-G. Liu, A note on deconvolution with completely monotone sequences and discrete fractional calculus, Quart. Appl. Math. (2018), no. 1, 189–198. MR 3733099
27. L. Li and J.-G. Liu, Some compactness criteria for weak solutions of time fractional PDEs, SIAM J. Math. Anal. (2018), no. 4, 3963–3995. MR 3828856
28. L. Li and J.-G. Liu, A discretization of Caputo derivatives with application to time fractional SDEs and gradient flows, SIAM J. Numer. Anal. (2019), no. 5, 2095–2120. MR 4000219
29. Y. Lin, X. Li, and C. Xu, Finite difference/spectral approximations for the fractional cable equation, Math. Comp. (2011), no. 275, 1369–1396. MR 2785462
30. R.H. Nochetto, G. Savaré, and C. Verdi, A posteriori error estimates for variable time-step discretizations of nonlinear evolution equations, Comm. Pure Appl. Math. (2000), no. 5, 525–589. MR 1737503
31. C. Quan, T. Tang, and J. Yang, How to define dissipation-preserving energy for time-fractional phase-field equations, arXiv:2007.14855, 2020.
32. T. Roubíček, Nonlinear partial differential equations with applications, second ed., International Series of Numerical Mathematics, vol. 153, Birkhäuser/Springer Basel AG, Basel, 2013. MR 3014456
33. W. Schirotzek, Nonsmooth analysis, Universitext, Springer, Berlin, 2007. MR 2330778
34. P.R. Stinga and J.L. Torrea, Extension problem and Harnack's inequality for some fractional operators, Comm. Partial Differential Equations (2010), no. 11, 2092–2122. MR 2754080
35. M. Stynes, Too much regularity may force too much uniqueness, Fract. Calc. Appl. Anal. (2016), no. 6, 1554–1562. MR 3589365
36. M. Stynes, Fractional-order derivatives defined by continuous kernels are too restrictive, Appl. Math. Lett. (2018), 22–26. MR 3820275
37. M. Stynes, Singularities, Handbook of fractional calculus with applications. Vol. 3, De Gruyter, Berlin, 2019, pp. 287–305. MR 3966570
38. T. Tang, H. Yu, and T. Zhou, On energy dissipation theory and numerical stability for time-fractional phase-field equations, SIAM J. Sci. Comput. (2019), no. 6, A3757–A3778. MR 4036095
39. V. Vergara and R. Zacher, Lyapunov functions and convergence to steady state for differential equations of fractional order, Math. Z. (2008), no. 2, 287–309. MR 2390082
40. A.N. Vityuk, Existence of solutions of a differential inclusion of fractional order with an upper-semicontinuous right-hand side, Ukraïn. Mat. Zh. (1999), no. 11, 1562–1565. MR 1744336
41. R. Zacher, A weak Harnack inequality for fractional differential equations, J. Integral Equations Appl. (2007), no. 2, 209–232. MR 2355009
42. R. Zacher, Global strong solvability of a quasilinear subdiffusion problem, J. Evol. Equ. (2012), no. 4, 813–831. MR 3000457
43. R. Zacher, A De Giorgi–Nash type theorem for time fractional diffusion equations, Math. Ann. (2013), no. 1, 99–146. MR 3038123
44. R. Zacher, A weak Harnack inequality for fractional evolution equations with discontinuous coefficients, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) (2013), no. 4, 903–940. MR 3184573
45. R. Zacher, Time fractional diffusion equations: solution concepts, regularity, and long-time behavior, Handbook of fractional calculus with applications. Vol. 2, De Gruyter, Berlin, 2019, pp. 159–179. MR 3965393
46. Y. Zhang, Numerical treatment of the modified time fractional Fokker-Planck equation, Abstr. Appl. Anal. (2014), Art. ID 282190, 10. MR 3191030
Email address , W. Li: [email protected]
Email address , A.J. Salgado: [email protected]@utk.edu