Concentration Compactness for the Critical Maxwell-Klein-Gordon Equation

Abstract

We prove global regularity, scattering and a priori bounds for the energy critical Maxwell-Klein-Gordon equation relative to the Coulomb gauge on (1+4)-dimensional Minkowski space. The proof is based upon a modified Bahouri-Gerard profile decomposition [1] and a concentration compactness/rigidity argument by Kenig-Merle [10], following the method developed by the first author and Schlag [20] in the context of critical wave maps.


arXiv preprint [math.AP]

Joachim Krieger and Jonas Lührmann

Contents

1. Introduction
2. Function spaces and technical preliminaries
3. Microlocalized magnetic wave equation
4. Breakdown criterion
5. A concept of weak evolution
6. How to arrive at the minimal energy blowup solution
7. Concentration compactness step
8. Rigidity argument
References

1. Introduction

The Maxwell-Klein-Gordon system on Minkowski space-time $\mathbb{R}^{1+n}$, $n \geq 1$, is a classical field theory for a complex scalar field $\phi : \mathbb{R}^{1+n} \to \mathbb{C}$ and a connection 1-form $A_\alpha : \mathbb{R}^{1+n} \to \mathbb{R}$ for $\alpha = 0, 1, \ldots, n$. Defining the covariant derivative $D_\alpha = \partial_\alpha + i A_\alpha$ and the curvature 2-form $F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha$, the formal Lagrangian action functional of the Maxwell-Klein-Gordon system is given by

$$\int_{\mathbb{R}^{1+n}} \Big( \frac{1}{4} F_{\alpha\beta} F^{\alpha\beta} + \frac{1}{2} D_\alpha \phi \, \overline{D^\alpha \phi} \Big) \, dx \, dt,$$

where the Einstein summation convention is in force and Minkowski space $\mathbb{R}^{1+n}$ is endowed with the standard metric $\mathrm{diag}(-1, +1, \ldots, +1)$. The associated Euler-Lagrange equations are

$$\partial^\beta F_{\alpha\beta} = \mathrm{Im}\big( \phi \, \overline{D_\alpha \phi} \big), \qquad \Box_A \phi = 0, \tag{1.1}$$

where $\Box_A = D^\alpha D_\alpha$ is the covariant d'Alembertian. The system has two important features. First, it enjoys the gauge invariance

$$A_\alpha \mapsto A_\alpha - \partial_\alpha \gamma, \qquad \phi \mapsto e^{i\gamma} \phi$$

for any suitably regular function $\gamma : \mathbb{R}^{1+n} \to \mathbb{R}$. Second, it is Lorentz invariant. Moreover, the system admits a conserved energy

$$E(A, \phi) := \int_{\mathbb{R}^n} \Big( \frac{1}{4} \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \frac{1}{2} \sum_\alpha \big| D_\alpha \phi \big|^2 \Big) \, dx. \tag{1.2}$$

Given that the system of equations (1.1) is invariant under the scaling transformation

$$A_\alpha(t,x) \to \lambda A_\alpha(\lambda t, \lambda x), \qquad \phi(t,x) \to \lambda \phi(\lambda t, \lambda x) \quad \text{for } \lambda > 0,$$

one distinguishes between the energy sub-critical case corresponding to $n \leq 3$, the energy critical case for $n = 4$, and the energy super-critical case for $n \geq 5$. To the best of the authors' knowledge, at this point no methods are available to prove global regularity for large data for super-critical nonlinear dispersive equations. The most advanced results for large data can be achieved for critical equations.

The first author was supported in part by the Swiss National Science Foundation under Consolidator Grant BSCGI0 157694. The second author was supported in part by the Swiss National Science Foundation under grant SNF 200020-159925.

Imposing the Coulomb condition $\sum_{j=1}^n \partial_j A_j = 0$ for the spatial components of the connection form $A$, the Maxwell-Klein-Gordon system decouples into a system of wave equations for the dynamical variables $(A_j, \phi)$, $j = 1, \ldots, n$, coupled to an elliptic equation for the temporal component $A_0$,

$$\Box A_j = -\mathcal{P}_j \, \mathrm{Im}\big( \phi \, \overline{D_x \phi} \big), \qquad \Box_A \phi = 0, \qquad \Delta A_0 = -\mathrm{Im}\big( \phi \, \overline{D_0 \phi} \big), \tag{MKG-CG}$$

where $\mathcal{P}$ is the standard projection onto divergence free vector fields.

We observe that in the formulation (MKG-CG), the components $(A_j, \phi)$, $j = 1, \ldots, n$, implicitly completely describe $(A_\alpha, \phi)$, since the missing component $A_0$ is uniquely determined by the elliptic equation

$$\Delta A_0 = -\mathrm{Im}\big( \phi \, \overline{\partial_t \phi} \big) + |\phi|^2 A_0. \tag{1.3}$$

For this reason, we will mostly work in terms of the dynamical variables $(A_x, \phi)$, it being understood that required bounds on $A_0$ can be directly inferred from (1.3). In particular, to describe initial data for (MKG-CG), we will use the notation $A_j[0] := (A_j, \partial_t A_j)(0, \cdot)$ and $\phi[0] := (\phi, \partial_t \phi)(0, \cdot)$. Often, we will simply denote these by $(A_x, \phi)[0]$.

The present work will give a complete analysis of the energy critical case $n = 4$. More precisely, we implement an analysis closely analogous to the one by the first author and Schlag [20] in the context of critical wave maps in order to prove existence, scattering and a priori bounds for large global solutions to (MKG-CG). Moreover, we establish a concentration compactness phenomenon, which describes a kind of “atomic decomposition” of sequences of solutions of bounded energy.
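The classification into sub-critical, critical and super-critical dimensions stated earlier in this introduction can be checked by a direct scaling computation. The following is a short sketch; the shorthand $A^\lambda_\alpha$, $\phi^\lambda$, $F^\lambda_{\alpha\beta}$, $D^\lambda_\alpha$ for the rescaled objects is introduced here for illustration only and is not notation taken from the paper.

```latex
% Rescaled fields (illustrative shorthand):
A^\lambda_\alpha(t,x) := \lambda A_\alpha(\lambda t, \lambda x), \qquad
\phi^\lambda(t,x) := \lambda \phi(\lambda t, \lambda x), \qquad \lambda > 0.

% Each derivative and each factor of A contributes one power of \lambda:
F^\lambda_{\alpha\beta}(t,x) = \lambda^2 F_{\alpha\beta}(\lambda t, \lambda x), \qquad
\big(D^\lambda_\alpha \phi^\lambda\big)(t,x) = \lambda^2 \big(D_\alpha \phi\big)(\lambda t, \lambda x).

% The energy density is quadratic in these quantities (factor \lambda^4),
% and the change of variables y = \lambda x gives dx = \lambda^{-n} dy:
E\big(A^\lambda, \phi^\lambda\big) = \lambda^{4-n} \, E(A, \phi).
```

The exponent $4 - n$ is positive for $n \leq 3$, zero for $n = 4$ and negative for $n \geq 5$, so the energy is scale invariant exactly in dimension $n = 4$.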


To formulate our main result, we introduce the following notion of admissible data for the evolution problem (MKG-CG) on $\mathbb{R}^{1+4}$.

Definition 1.1. We call $C^\infty$-smooth data $(A_x, \phi)[0]$ admissible, provided $A_x[0]$ satisfy the Coulomb condition and $\phi[0]$ as well as all spatial curvature components $F_{jk}[0]$ are Schwartz class. Moreover, we require that for $j = 1, \ldots, 4$,

$$\big| A_j[0](x) \big| \lesssim \langle x \rangle^{-3} \quad \text{as } |x| \to \infty.$$

In particular, admissible data are of class $H^s_x(\mathbb{R}^4) \times H^{s-1}_x(\mathbb{R}^4)$ for any $s \geq 1$.

Theorem 1.2. Consider the evolution problem (MKG-CG) on $\mathbb{R}^{1+4}$. There exists a function $K : (0, \infty) \to (0, \infty)$ with the following property. Let $(A_x, \phi)[0]$ be an admissible Coulomb class data set such that the corresponding full set of components $(A_\alpha, \phi)$ has energy $E$. Then there exists a unique global in time admissible solution $(A, \phi)$ to (MKG-CG) with initial data $(A_x, \phi)[0]$ that satisfies for any $\frac{1}{q} + \frac{3}{2r} \leq \frac{3}{4}$ with $2 \leq q \leq \infty$, $2 \leq r < \infty$, $\gamma = 2 - \frac{1}{q} - \frac{4}{r}$ the following a priori bound

$$\Big\| \big( (-\Delta)^{-\frac{\gamma}{2}} \nabla_{t,x} A_x, \; (-\Delta)^{-\frac{\gamma}{2}} \nabla_{t,x} \phi \big) \Big\|_{L^q_t L^r_x(\mathbb{R} \times \mathbb{R}^4)} \leq C_r K(E). \tag{1.4}$$

The solution scatters in the sense that there exist finite energy free waves $f_j$ and $g$, $\Box f_j = \Box g = 0$, such that for $j = 1, \ldots, 4$,

$$\lim_{t \to +\infty} \big\| \nabla_{t,x} A_j(t, \cdot) - \nabla_{t,x} f_j(t, \cdot) \big\|_{L^2_x(\mathbb{R}^4)} = 0, \qquad \lim_{t \to +\infty} \big\| \nabla_{t,x} \phi(t, \cdot) - \nabla_{t,x} g(t, \cdot) \big\|_{L^2_x(\mathbb{R}^4)} = 0,$$

and analogously with different free waves for $t \to -\infty$.

In fact, we will prove the significantly stronger a priori bound

$$\big\| (A_x, \phi) \big\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \leq K(E),$$

where the precise definition of the $S^1$ norm will be introduced in Section 2. The purpose of this norm is to control the regularity of the solutions.

Recently, a proof of the global regularity and scattering affirmations in the preceding theorem was obtained by Oh-Tataru [32–34], following the method developed by Sterbenz-Tataru [40, 41] in the context of critical wave maps. Our conclusions were reached before the appearance of their work and our methods are completely independent.

1.1. A history of the problem.

In this subsection we first consider this work in the broader context of the study of the local and global in time behavior of nonlinear wave equations and highlight some of the important developments over the last decades that crucially enter the proof of our main theorem. Afterwards we give an overview of previous results on the Maxwell-Klein-Gordon equation.

Null structure.

In many geometric wave equations like the wave map equation, the Maxwell-Klein-Gordon equation, and the Yang-Mills equation, the nonlinearities exhibit so-called null structures. Heuristically speaking, such null structures are amenable to better estimates, because they damp the interactions of parallel waves. The key role that these special nonlinear structures play in the global regularity theory of nonlinear waves was first highlighted by Klainerman [11]. At that point the theory of nonlinear wave equations relied for the most part on vector field methods and parametrices in physical space. However, in more recent times the key role that null structures play also within the more technical harmonic analysis approach cannot be overstated. In fact, in [12] a whole program with precise conjectures pertaining to the sharp well-posedness of a number of nonlinear wave equations with null structure was outlined. The present work may be seen as a further vindication of the program outlined by Klainerman. Null structures also play a pivotal role in the much more complex system of Einstein equations, as evidenced for example in the recent deep sequence of works by Klainerman-Rodnianski-Szeftel [17] and Szeftel [44–48] on the bounded $L^2$ curvature conjecture.

Function spaces.

The development of $X^{s,b}$ spaces by Klainerman-Machedon in the seminal works [13–16] in the low regularity study of nonlinear wave equations provided a powerful tool to take advantage of the null structures in geometric wave equations. The fact that the Maxwell-Klein-Gordon and Yang-Mills equations actually display a null structure in the Coulomb gauge was a key observation in these works. Moreover, the observation by Klainerman-Machedon that these null structures are beautifully compatible with the $X^{s,b}$ functional framework has been highly influential ever since. Different variants of the $X^{s,b}$ spaces were independently introduced by Bourgain [2] in the context of the nonlinear Schrödinger equation and the Korteweg-de Vries equation.

In the quest to prove global regularity for critical wave maps for small initial data it turned out that not even the strongest versions of the critical $X^{s,b}$ type spaces yielded good algebra type estimates. This problem was resolved through the development of the null frame spaces in the breakthrough work of Tataru [52]. We will introduce these spaces later on, see also [51], [53], and [20] for more discussion.

Renormalization.

The key difficulty for the (MKG-CG) equation is the equation $\Box_A \phi = 0$, which in expanded form is given by

$$\Box \phi = -2i A_\alpha \partial^\alpha \phi + i (\partial_t A_0) \phi + A^\alpha A_\alpha \phi.$$

The contribution of the low-high frequency interactions in the magnetic interaction term

$$-2i A_j \partial^j \phi \tag{1.5}$$

turns out to be non-perturbative in the case when the spatial components of the connection form are just free waves. This problem already occurs for small data and is not only a large data issue. One encounters a similar situation in the wave map equation. In the breakthrough works [50, 51] Tao exploited the intrinsic gauge freedom for the wave maps problem to recast the nonlinearity into a perturbative form. However, for the (MKG-CG) equation the gauge invariance is already spent. Rodnianski and Tao [35] found a way out of this impasse by incorporating the non-perturbative term into the linear operator and by deriving Strichartz estimates for the resulting wave operator via a parametrix construction. This enabled them to prove global regularity for (MKG-CG) for small critical Sobolev data in $n \geq 6$ space dimensions. The small energy global well-posedness result in $n = 4$ space dimensions [22] combines this parametrix approach with a functional framework of $X^{s,b}$ type and null frame spaces. The functional calculus from [22] for the paradifferential magnetic wave operator

$$\Box^p_A = \Box + 2i \sum_{k \in \mathbb{Z}} P_{<k-C} A^{free}_j P_k \partial^j,$$

where $P_k$ denotes the standard Littlewood-Paley projections and $A^{free}_j$, $j = 1, \ldots, 4$, are free waves, plays an important role in this work and has to be adapted to the large data setting.
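The expanded form of $\Box_A \phi = 0$ used in this paragraph follows from the definition $\Box_A = D^\alpha D_\alpha$ together with the Coulomb condition. A brief derivation, assuming the metric signature $(-, +, \ldots, +)$ from the introduction, so that $\Box = \partial^\alpha \partial_\alpha = -\partial_t^2 + \Delta$ and $\partial^0 = -\partial_t$:

```latex
\begin{aligned}
\Box_A \phi
  &= (\partial^\alpha + i A^\alpha)(\partial_\alpha \phi + i A_\alpha \phi) \\
  &= \Box \phi + 2i A_\alpha \partial^\alpha \phi
     + i (\partial^\alpha A_\alpha) \phi - A^\alpha A_\alpha \phi .
\end{aligned}

% Coulomb gauge: the spatial divergence vanishes, hence
\partial^\alpha A_\alpha = -\partial_t A_0 + \sum_{j=1}^{n} \partial_j A_j = -\partial_t A_0 .

% Therefore \Box_A \phi = 0 is equivalent to
\Box \phi = -2i A_\alpha \partial^\alpha \phi + i (\partial_t A_0) \phi + A^\alpha A_\alpha \phi .
```

The spatial part $-2i A_j \partial^j \phi$ of the first-order term is the magnetic interaction term (1.5); the remaining terms involve either the elliptic component $A_0$ or are cubic.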

Bahouri-Gérard concentration compactness decomposition.

Bahouri and Gérard [1] proved the following description of sequences of solutions to the free wave equation with uniformly bounded energy.

Let $\{(\varphi_n, \psi_n)\}_{n \geq 1} \subset \dot{H}^1_x(\mathbb{R}^3) \times L^2_x(\mathbb{R}^3)$ be a bounded sequence and let $v_n$ be the solution to the free wave equation $\Box v_n = 0$ on $\mathbb{R} \times \mathbb{R}^3$ with initial data $(v_n, \partial_t v_n)|_{t=0} = (\varphi_n, \psi_n)$. Then there exists a subsequence $\{v'_n\}_{n \geq 1}$ of $\{v_n\}_{n \geq 1}$ and finite energy free waves $V^{(j)}$ as well as sequences $\{(\lambda^{(j)}_n, t^{(j)}_n, x^{(j)}_n)\}_{n \geq 1} \subset \mathbb{R}_+ \times \mathbb{R} \times \mathbb{R}^3$, $j \geq 1$, such that for every $l \geq 1$,

$$v'_n(t,x) = \sum_{j=1}^{l} \frac{1}{\sqrt{\lambda^{(j)}_n}} V^{(j)}\Big( \frac{t - t^{(j)}_n}{\lambda^{(j)}_n}, \frac{x - x^{(j)}_n}{\lambda^{(j)}_n} \Big) + w^{(l)}_n(t,x) \tag{1.6}$$

and

$$\lim_{l \to \infty} \limsup_{n \to \infty} \big\| w^{(l)}_n \big\|_{L^5_t L^{10}_x(\mathbb{R} \times \mathbb{R}^3)} = 0.$$

Moreover, there is asymptotic decoupling of the free energy

$$\big\| \nabla_{t,x} v'_n \big\|^2_{L^2_x} = \sum_{j=1}^{l} \big\| \nabla_{t,x} V^{(j)} \big\|^2_{L^2_x} + \big\| \nabla_{t,x} w^{(l)}_n \big\|^2_{L^2_x} + o(1) \quad \text{as } n \to \infty,$$

and for each $j \neq k$, we have the asymptotic orthogonality property

$$\lim_{n \to \infty} \frac{\lambda^{(j)}_n}{\lambda^{(k)}_n} + \frac{\lambda^{(k)}_n}{\lambda^{(j)}_n} + \frac{|t^{(j)}_n - t^{(k)}_n|}{\lambda^{(j)}_n} + \frac{|x^{(j)}_n - x^{(k)}_n|}{\lambda^{(j)}_n} = \infty. \tag{1.7}$$

The free waves $V^{(j)}$ are referred to as concentration profiles and the importance of the linear profile decomposition (1.6) is that it captures the failure of compactness of the sequence of bounded solutions $\{v_n\}_{n \geq 1}$ to the free wave equation in terms of the non-compact symmetries of the equation and the superposition of profiles. Simultaneously to Bahouri and Gérard, Merle and Vega [28] obtained similar concentration compactness decompositions in the context of the mass-critical nonlinear Schrödinger equation. This is closely analogous to the concentration compactness method originally developed in the context of elliptic equations, see e.g. [23–26] and [43] for a discussion of the original works.

Bahouri and Gérard also established an analogous nonlinear profile decomposition for Shatah-Struwe solutions $\{u_n\}_{n \geq 1}$ to the energy critical defocusing nonlinear wave equation $\Box u_n = u_n^5$ on $\mathbb{R} \times \mathbb{R}^3$, see [38], with the same initial data $(u_n, \partial_t u_n)|_{t=0} = (\varphi_n, \psi_n)$. Their main application of this nonlinear profile decomposition was to prove the existence of a function $A : (0, \infty) \to (0, \infty)$ with the property that for any Shatah-Struwe solution $u$ to $\Box u = u^5$ it holds that

$$\| u \|_{L^5_t L^{10}_x(\mathbb{R} \times \mathbb{R}^3)} \leq A\big( E(u) \big),$$

where $E(u)$ denotes the energy functional associated with the quintic nonlinear wave equation.

The Bahouri-Gérard profile decomposition is of fundamental importance for the Kenig-Merle method that we will describe in the next paragraph. In the proof of our main theorem we will have to study sequences of solutions to the (MKG-CG) equation with uniformly bounded energy. A key step will be to obtain an analogous Bahouri-Gérard profile decomposition for such sequences. This poses significant problems, which can be heuristically understood as follows. Very roughly speaking, the reason why the Bahouri-Gérard profile decomposition “works” for the energy critical nonlinear wave equation $\Box u = u^5$ is that in the quintic nonlinearity the interaction of two nonlinear concentration profiles living at asymptotically orthogonal frequency scales vanishes. This reduces matters to considering diagonal frequency interactions of the concentration profiles in the nonlinearity, but then these profiles must essentially be supported in different regions of space-time due to the asymptotic orthogonality property (1.7) and therefore do not interact strongly. In contrast, for the (MKG-CG) equation frequency diagonalization appears to fail in the difficult magnetic interaction term (1.5) for low-high interactions.
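The normalization of the profiles in (1.6) is the one that leaves both the free energy and the $L^5_t L^{10}_x$ Strichartz norm on $\mathbb{R} \times \mathbb{R}^3$ unchanged, which is why scalings and translations are the non-compact symmetries that the decomposition has to track. As a quick check (the shorthand $V_\lambda$ and the fixed parameters $\lambda, t_0, x_0$ are introduced here for illustration only):

```latex
% Rescaled and translated free wave (illustrative shorthand):
V_\lambda(t,x) := \lambda^{-1/2} \, V\Big( \frac{t - t_0}{\lambda}, \frac{x - x_0}{\lambda} \Big),
\qquad \lambda > 0, \quad (t_0, x_0) \in \mathbb{R} \times \mathbb{R}^3 .

% Free energy: squared prefactor \lambda^{-1}, squared gradient \lambda^{-2},
% volume element \lambda^{3}; the last step uses conservation of the free energy.
\big\| \nabla_{t,x} V_\lambda(t) \big\|_{L^2_x(\mathbb{R}^3)}^2
  = \lambda^{-1-2+3} \,
    \Big\| \nabla_{t,x} V\Big( \frac{t - t_0}{\lambda} \Big) \Big\|_{L^2_x(\mathbb{R}^3)}^2
  = \big\| \nabla_{t,x} V(0) \big\|_{L^2_x(\mathbb{R}^3)}^2 .

% Strichartz norm: prefactor \lambda^{-1/2}, space integral \lambda^{3/10},
% time integral \lambda^{1/5}.
\| V_\lambda \|_{L^5_t L^{10}_x(\mathbb{R} \times \mathbb{R}^3)}
  = \lambda^{-\frac{1}{2} + \frac{3}{10} + \frac{1}{5}} \,
    \| V \|_{L^5_t L^{10}_x(\mathbb{R} \times \mathbb{R}^3)}
  = \| V \|_{L^5_t L^{10}_x(\mathbb{R} \times \mathbb{R}^3)} .
```

Since neither norm can detect the parameters $(\lambda, t_0, x_0)$, a bounded sequence of free waves can lose compactness by concentrating, spreading out or travelling off to infinity, and (1.7) expresses that distinct profiles do so in asymptotically separated ways.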
A similar failure of frequency diagonalization for low-high interactions occurs for critical wave maps. In the latter context the first author and Schlag [20] devised a novel profile decomposition to take into account the corresponding low-high frequency interactions. Our approach is strongly influenced by [20], but we will have to use a slightly different “covariant” wave operator to extract the concentration profiles, see the discussion in the next subsection.

The Kenig-Merle method.

Kenig and Merle [9, 10] introduced a very general method to prove global well-posedness and scattering for critical nonlinear dispersive and wave equations in both defocusing and focusing cases, in the latter case only for energies strictly less than the ground state energy. Their approach has found a vast number of applications in recent years. We illustrate the method in the context of the energy critical defocusing nonlinear wave equation $\Box u = u^5$ on $\mathbb{R} \times \mathbb{R}^3$. One can use the $L^8_t L^8_x(\mathbb{R} \times \mathbb{R}^3)$ norm to control the regularity of solutions to this equation and easily prove local well-posedness and small data global well-posedness based on this norm.

In the first step of the Kenig-Merle method one assumes that global well-posedness and scattering fail for some finite energy level. Then let $E_{crit}$ be the critical energy below which all solutions exist globally in time with finite $L^8_t L^8_x(\mathbb{R} \times \mathbb{R}^3)$ bounds; in particular it must hold that $E_{crit} > 0$. Thus, we find a sequence of solutions $\{u_n\}_{n \geq 1}$ such that $E(u_n) \to E_{crit}$ and $\| u_n \|_{L^8_t L^8_x} \to \infty$ as $n \to \infty$. Applying the Bahouri-Gérard profile decomposition to $\{u_n(0)\}_{n \geq 1}$, we may conclude by the minimality of $E_{crit}$ that there exists exactly one profile in the decomposition. This enables us to extract a minimal blowup solution $u_C$ of lifespan $I$ with $E(u_C) = E_{crit}$ and $\| u_C \|_{L^8_t L^8_x(I \times \mathbb{R}^3)} = \infty$. Moreover, we can infer a crucial compactness property of $u_C$, namely that there exist continuous functions $x : I \to \mathbb{R}^3$ and $\lambda : I \to \mathbb{R}_+$ such that the family of functions

$$\Big\{ \Big( \frac{1}{\lambda(t)^{1/2}} u_C\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big), \; \frac{1}{\lambda(t)^{3/2}} \partial_t u_C\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big) \Big) : t \in I \Big\}$$

is pre-compact in $\dot{H}^1_x(\mathbb{R}^3) \times L^2_x(\mathbb{R}^3)$. The second step of the Kenig-Merle method is a rigidity argument to rule out the existence of such a minimal blowup solution $u_C$ by combining the compactness property with conservation laws and other identities of virial or Morawetz type for the energy critical nonlinear wave equation. We will adapt the Kenig-Merle method to the Maxwell-Klein-Gordon equation.

We now review previous results on the Maxwell-Klein-Gordon equation. The existence of global smooth solutions to the Maxwell-Klein-Gordon equation in $n =$ [...] also Selberg-Tesfahun [37] for a finite energy global well-posedness result for the Maxwell-Klein-Gordon equation in $n =$ [...] $n \geq$ [...] difficult magnetic interaction term in (MKG-CG) into the linear wave operator and to derive Strichartz estimates for the resulting wave operator via a parametrix construction.

Small energy global well-posedness of the energy critical Maxwell-Klein-Gordon equation in $n = 4$ [...] different proof was later obtained by Oh [30, 31], using the Yang-Mills heat flow. Global regularity for the Yang-Mills system for small critical Sobolev data for $n \geq$ [...] $\Box u = u^5$ on $\mathbb{R}^{1+3}$ and for radial critical wave maps on $\mathbb{R}^{1+2}$.
At this point, with the exception of some special problems, it appears that a general large data result cannot be inferred by using the small data result as a black box, but instead requires a more or less complete re-working of the small data theory. See, for instance, the works on critical large data wave maps [40, 41], [20], [49]. Our approach here is to implement a similar strategy as the one by the first author and Schlag [20] for critical wave maps, which consists of essentially two steps. First, a novel “covariant” Bahouri-Gérard procedure to take into account the non-negligible influence of low on high frequencies in the magnetic interaction term. Second, an implementation of a variant of a concentration compactness/rigidity argument by Kenig-Merle, following more or less the sequence of steps in [10]. As the latter was introduced in the context of a scalar wave equation, and we are considering a complex nonlinearly linked system, we believe that the implementation of this step for the energy critical Maxwell-Klein-Gordon equation is also of interest in its own right.

We expect that our methods extend to prove global regularity, scattering and a priori bounds for the energy critical Yang-Mills equations in $n = 4$.

1.2. Overview of the proof.

In this subsection we give a detailed overview of the proof of Theorem 1.2. In fact, the purpose of this paper is to prove a significantly stronger version of Theorem 1.2, namely the existence of a function $K : (0, \infty) \to (0, \infty)$ with

$$\big\| (A_x, \phi) \big\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \leq K(E), \qquad E = E(A, \phi)$$

for any admissible solution $(A, \phi)$ to (MKG-CG). Once this a priori bound is known, one also obtains the scattering assertion in Theorem 1.2. The fact that the dynamical variables of a global admissible solution scatter to finite energy free waves, and not to solutions to a suitable linear magnetic wave equation, crucially relies on our strong spatial decay assumptions about the data, see the proof of scattering at the end of Section 8. The precise definition of the $S^1$ space and its time localized version will be given in Section 2 and Definition 4.1.

Beginning the argument at this point, we assume that the existence of such a function $K$ fails for some finite energy level. Thus, the set of energies

$$\mathcal{E} := \Big\{ E > 0 : \sup_{(A, \phi) \text{ admissible}, \; E(A, \phi) \leq E} \big\| (A_x, \phi) \big\|_{S^1} = +\infty \Big\}$$

is non-empty. In view of the small energy global well-posedness result [22], it therefore has a positive infimum, which we denote by $E_{crit}$,

$$E_{crit} := \inf \mathcal{E}.$$

By definition we can then find a sequence of admissible solutions $\{(A^n, \phi^n)\}_{n \geq 1}$ to (MKG-CG) such that

$$E(A^n, \phi^n) \to E_{crit}, \qquad \lim_{n \to \infty} \big\| (A^n_x, \phi^n) \big\|_{S^1} = +\infty.$$

As in [20], we call such a sequence an essentially singular sequence. The goal of this paper is to rule out the existence of such an object. This will be accomplished in broad strokes by the following two steps.

(1) Extracting an energy class minimal blowup solution $(A^\infty, \Phi^\infty)$ to (MKG-CG) with the compactness property via a modified Bahouri-Gérard procedure, which consists of an inductive sequence of low-frequency approximations and a profile extraction process taking into account the effect of the magnetic potential interaction. Here we closely follow the procedure introduced by the first author and Schlag [20], but we have to subtly diverge from the profile extraction process there to correctly capture the asymptotic evolution of the atomic components. We note that the heart of the modified Bahouri-Gérard procedure resides in Section 7.

(2) Ruling out the existence of the minimal blowup solution $(A^\infty, \Phi^\infty)$ by essentially following the method of Kenig and Merle [10]. This step is carried out in Section 8.

We now describe these steps in more detail.

A concept of weak evolution for energy class data.

In order to extract a minimal blowup solution at the end of the modified Bahouri-Gérard procedure, we first need to introduce the notion of a solution to (MKG-CG) that is merely of energy class. A natural idea here is to approximate a given Coulomb energy class datum by a sequence of admissible data and to define the energy class solution to (MKG-CG) as a suitable limit of the admissible solutions. One then needs a good perturbation theory to show that this limit is well-defined and independent of the approximating sequence. Unfortunately, the perturbation theory for (MKG-CG) is not as strong as, for instance, the one for critical wave maps in [20], due to a low frequency divergence. However, the problem with evolving irregular data is really a “high frequency issue” and in Proposition 5.1 we show that there is a good perturbation theory for perturbing frequency localized data by adding high frequency perturbations. We can then define the evolution of a Coulomb energy class datum as a suitable limit of the evolutions of low frequency approximations of the energy class datum, provided these low frequency approximations exist on some joint time interval and satisfy uniform $S^1$ norm bounds there. This is achieved in Proposition 5.2 via a suitable method of localizing the data and exploiting a version of Huygens' principle together with the gauge invariance of the Maxwell-Klein-Gordon system. This step is additionally complicated by the fact that the (MKG-CG) equation does not have the finite speed of propagation property due to non-local terms in the equation for the spatial components of the connection form $A$.

Bahouri-Gérard I: Filtering out frequency atoms and evolving the “non-atomic” lowest frequency approximation.

The extraction of the energy class minimal blowup solution $(A^\infty, \Phi^\infty)$ consists of a two step Bahouri-Gérard type procedure. This is carried out in Section 7 and forms the core of our argument. In the first step we consider the initial data $(A^n_x, \phi^n)[0]$ at time $t = 0$ of the essentially singular sequence $(A^n, \phi^n)$ and use a procedure due to Métivier-Schochet [29] to extract frequency scales. In what follows we will slightly abuse notation and write $(A^n, \phi^n)[0]$ instead of $(A^n_x, \phi^n)[0]$. This yields the decompositions

$$A^n[0] = \sum_{a=1}^{\Lambda} A^{na}[0] + A^n_\Lambda[0], \qquad \phi^n[0] = \sum_{a=1}^{\Lambda} \phi^{na}[0] + \phi^n_\Lambda[0],$$

where the “frequency atoms” $(A^{na}, \phi^{na})[0]$ are essentially frequency localized to scales $(\lambda^{na})^{-1}$ that tend apart as $n \to \infty$, more precisely

$$\lim_{n \to \infty} \frac{\lambda^{na}}{\lambda^{na'}} + \frac{\lambda^{na'}}{\lambda^{na}} = \infty \quad \text{for } a \neq a',$$

while the error $(A^n_\Lambda, \phi^n_\Lambda)[0]$ satisfies

$$\limsup_{n \to \infty} \big\| A^n_\Lambda[0] \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} + \big\| \phi^n_\Lambda[0] \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} < \delta$$

for given $\delta > 0$ and $\Lambda = \Lambda(\delta)$ sufficiently large. Moreover, we prepare these frequency atoms such that their frequency supports are sharply separated as $n \to \infty$ and so that the errors $\{(A^n_\Lambda, \phi^n_\Lambda)[0]\}_{n \geq 1}$ are supported away from the frequency scales $(\lambda^{na})^{-1}$ in frequency space. Then we select a number of “large” frequency atoms $(A^{na}, \phi^{na})[0]$, $a = 1, \ldots, \Lambda$, whose energy $E(A^{na}, \phi^{na})$ is above a certain small threshold $\varepsilon$ depending only on $E_{crit}$. We order these frequency atoms by the scale around which they are essentially supported, starting with the lowest one.

Eventually, we want to conclude that the essentially singular sequence of data $\{(A^n, \phi^n)[0]\}_{n \geq 1}$ consists of exactly one non-trivial frequency atom that is composed of exactly one non-trivial physical concentration profile of asymptotic energy $E_{crit}$. We argue by contradiction and assume that this is not the case. Then the idea is to approximate the essentially singular sequence of initial data $(A^n, \phi^n)[0]$ by low frequency truncations, obtained by removing all or some of the atoms $(A^{na}, \phi^{na})[0]$, $a = 1, \ldots, \Lambda$, and to inductively derive bounds on the $S^1$ norms of the (MKG-CG) evolutions of the truncated data. As this induction stops after $\Lambda$ many steps, we obtain an a priori bound on the evolutions

$$\liminf_{n \to \infty} \| (A^n, \phi^n) \|_{S^1} < \infty, \tag{1.8}$$

contradicting the assumption that $\{(A^n, \phi^n)\}_{n \geq 1}$ is an essentially singular sequence of solutions to (MKG-CG). We observe that by construction the “non-atomic” errors $(A^n_\Lambda, \phi^n_\Lambda)[0]$ are split into $\Lambda + 1$ pieces by the frequency supports of the atoms $(A^{na}, \phi^{na})[0]$, i.e. we can write

$$A^n_\Lambda[0] = \sum_{j=1}^{\Lambda + 1} A^{nj}_\Lambda[0], \qquad \phi^n_\Lambda[0] = \sum_{j=1}^{\Lambda + 1} \phi^{nj}_\Lambda[0],$$

where the first pieces $(A^{n1}_\Lambda, \phi^{n1}_\Lambda)[0]$ have Fourier support in the region closest to the origin. In Subsection 7.3 we then derive a priori $S^1$ norm bounds on the evolutions of the lowest frequency approximations $(A^{n1}_\Lambda, \phi^{n1}_\Lambda)[0]$. The problem here is that the pieces $(A^{n1}_\Lambda, \phi^{n1}_\Lambda)[0]$ might still have large energy, which forces us to use a finite number of further delicately chosen low frequency approximations $(P_{J_L} A^{n1}_\Lambda, P_{J_L} \phi^{n1}_\Lambda)[0]$ of these pieces. Importantly, this number only depends on the size of $E_{crit}$. We then inductively obtain bounds on the $S^1$ norms of the (MKG-CG) evolutions of the low frequency approximations $(P_{J_L} A^{n1}_\Lambda, P_{J_L} \phi^{n1}_\Lambda)[0]$ by bootstrap. This step is tied together in Proposition 7.4. In particular, Step 3 of the proof of Proposition 7.4 is the core perturbative result of this paper and is used in variations at other instances later on.

Bahouri-Gérard II: Selecting concentration profiles and adding the first large frequency atom.

Having established control over the evolution of the lowest frequency “non-atomic” part $(A^{n1}_\Lambda, \phi^{n1}_\Lambda)[0]$, we then add the first frequency atom $(A^{n1}, \phi^{n1})[0]$ and consider the evolution of the data

$$\big( A^{n1}_\Lambda + A^{n1}, \; \phi^{n1}_\Lambda + \phi^{n1} \big)[0].$$

Here we first have to understand the lack of compactness of the functions $(A^{n1}, \phi^{n1})[0]$. It is at this point that we deviate most significantly from the standard Bahouri-Gérard profile extraction procedure [1] and also the modified profile extraction procedure developed by the first author and Schlag [20] in the context of critical wave maps. We still extract the concentration profiles for the data $A^{n1}[0]$ using the standard Bahouri-Gérard extraction procedure. However, we evolve the data $\phi^{n1}[0]$ with respect to the following “covariant” wave operator

$$\widetilde{\Box}_{A^n} := \Box + 2i \big( A^{n1}_{\Lambda, \nu} + A^{n1, free}_\nu \big) \partial^\nu$$

and extract the profiles as weak limits of these evolutions to take into account the strong low-high interactions for (MKG-CG). Here, $A^{n1}_\Lambda(t,x)$ is the (MKG-CG) evolution of the low frequency data $(A^{n1}_\Lambda, \phi^{n1}_\Lambda)[0]$, while $A^{n1, free}_j$ is the free wave evolution of the data $A^{n1}_j[0]$ for $j = 1, \ldots, 4$, and we simply put $A^{n1, free}_0 = 0$. In comparison with [1] and [20], a key difficulty in this step is that solutions to the covariant linear wave equation $\widetilde{\Box}_{A^n} u = 0$ [...] $(t^{ab}_n, x^{ab}_n)$ for extracting the concentration profiles both for $A^{n1}[0]$ and for $\phi^{n1}[0]$. Once the profiles have been picked, we use them to construct approximate, but highly accurate, nonlinear profiles in Theorem 7.14. To this end we solve the (MKG-CG) system in very large but finite space-time boxes centered around $(t^{ab}_n, x^{ab}_n)$, using the concentration profiles as data, while outside of these boxes, we use the free wave propagation for $A$ and the “full” covariant wave operator (involving the influence of all other profiles) for $\phi$. This is the same strategy as the one pursued for wave maps in [20]. Provided that all concentration profiles have energy strictly less than $E_{crit}$ with respect to the Maxwell-Klein-Gordon energy functional, we can then use our perturbation theory to construct the global (MKG-CG) evolution of the data $(A^{n1}_\Lambda + A^{n1}, \phi^{n1}_\Lambda + \phi^{n1})[0]$ and to obtain a priori $S^1$ norm bounds.

Conclusion of the induction on frequency process.

We may then repeat the preceding steps and “add in” all remaining frequency atoms to conclude a priori global $S^1$ norm bounds on the evolution of the full data $(A^n, \phi^n)[0]$. The conclusion of this induction on frequency process is that we arrive at a contradiction, unless the essentially singular sequence of data $(A^n, \phi^n)[0]$ consists of exactly one frequency atom that is composed of precisely one concentration profile of asymptotic energy $E_{crit}$. Due to our relatively poor perturbation theory for (MKG-CG), it then still requires a fair amount of work to extract an energy class minimal blowup solution from this essentially singular sequence $(A^n, \phi^n)$, see Section 6 and Subsection 7.6. Finally, in Theorem 7.23 we obtain an energy class minimal blowup solution $(A^\infty, \Phi^\infty)$ to (MKG-CG) with lifespan $I$ and with the crucial compactness property that there exist continuous functions $x : I \to \mathbb{R}^4$ and $\lambda : I \to (0, \infty)$ so that each of the families of functions

$$\Big\{ \Big( \frac{1}{\lambda(t)} A^\infty_j\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big), \; \frac{1}{\lambda(t)^2} \partial_t A^\infty_j\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big) \Big) : t \in I \Big\}$$

for $j = 1, \ldots, 4$ and

$$\Big\{ \Big( \frac{1}{\lambda(t)} \Phi^\infty\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big), \; \frac{1}{\lambda(t)^2} \partial_t \Phi^\infty\Big( t, \frac{\cdot - x(t)}{\lambda(t)} \Big) \Big) : t \in I \Big\}$$

is pre-compact in $\dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$.

The Kenig-Merle rigidity argument.

In the final Section 8, we rule out the existence of such a minimal blowup solution $(A^\infty, \Phi^\infty)$ with the compactness property by following the scheme of the Kenig-Merle rigidity argument [10]. The idea is to infer from the compactness property and the minimal energy property of $(A^\infty, \Phi^\infty)$ the existence of either a static solution to (MKG-CG) or else the existence of a self-similar blowup solution to (MKG-CG) and to then exclude the existence of both of these objects.

A crucial step in the Kenig-Merle rigidity argument is to conclude that the momentum of $(A^\infty, \Phi^\infty)$ must vanish. The proof of this hinges on the relativistic invariance of the Maxwell-Klein-Gordon equation and the transformation behavior of the Maxwell-Klein-Gordon energy functional under Lorentz transformations. This step is technically difficult for the Maxwell-Klein-Gordon equation, because the $S^1$ norm is much more complicated than the Strichartz norms used in [10].

We then distinguish between the lifespan $I$ of $(A^\infty, \Phi^\infty)$ being finite in at least one time direction or not. If $I$ is infinite, we face the possibility of a static solution, which we rule out using virial type identities for the Maxwell-Klein-Gordon equation, the vanishing momentum condition for $(A^\infty, \Phi^\infty)$ and a Vitali covering argument from [20]. If instead $I$ is finite at one end, we reduce to a self-similar blowup scenario. We then uncover a Lyapunov functional for solutions to the Maxwell-Klein-Gordon equation in self-similar variables. This is the key ingredient, which allows us to also rule out this scenario. The derivation of this Lyapunov functional appears significantly more complicated than in [10] or [20] and we use the trick of working in a Cronström-type gauge to simplify the computations.

1.3. Overview of the paper.

We now give an overview of the structure of this paper. The two main steps of the proof of Theorem 1.2 are the modified Bahouri-Gérard procedure in Section 7 and the rigidity argument in Section 8. The necessary technical preparations are carried out in the sections leading up to Section 7.

• In Section 2 we lay out the functional framework following [22].

• In Section 3 we prove key estimates for the linear magnetic wave equation $\Box^p_A u = f$.

• In Section 4 we state the property of the $S^1$ norm as a regularity controlling device.

• In Section 5 we show how to unambiguously locally evolve Coulomb energy class data $(A_x, \phi)[0]$ via approximation by smoothed data and truncation in physical space to reduce to the admissible setup. Here one needs to pay close attention to the fact that solutions to (MKG-CG) do not obey as good a perturbation theory with respect to the $S^1$ spaces as, say, critical wave maps in a suitable gauge, due to a low frequency divergence. Hence, one needs to be very careful about the correct choice of smoothing, using low frequency truncations of the data. Moreover, to ensure the existence of an energy class local evolution of Coulomb energy class data $(A_x, \phi)[0]$ on a non-trivial time slice around $t = 0$, we need to prove uniform $S^1$ norm bounds for the approximations, which we accomplish similarly to the procedure in [20] via localization in physical space, see Proposition 5.2. We also introduce the concept of the “lifespan” of such an energy class solution and the definition of its $S^1$ norm.

• In Section 6 we then state that energy class data $(A_x, \phi)[0]$ obtained as the limit of the data of an essentially singular sequence, which will be the outcome of the modified Bahouri-Gérard procedure, lead to a singular solution $(A, \phi)$ in the sense that
\[
\sup_{J \subset I} \bigl\| (A_x, \phi) \bigr\|_{S^1(J \times \mathbb{R}^4)} = +\infty,
\]
where $I$ denotes its lifespan. The proof of this as well as a number of further technical assertions will be relegated to Subsection 7.4.

• In Section 7 we carry out the modified Bahouri-Gérard procedure. In Subsection 7.1 and Subsection 7.2 we extract the “frequency atoms”, mimicking closely the procedure in [20]. Then we show in Subsection 7.3 how the lowest frequency “non-atomic” part of the low frequency approximation induction can be globally evolved with good $S^1$ norm bounds. In Subsection 7.4 we prove several technical assertions that all use the core perturbative result from Step 3 of the proof of Proposition 7.4. In Subsection 7.5, we add the first “large” frequency atom by extracting concentration profiles and invoking the induction on energy hypothesis that all profiles have energy strictly less than $E_{crit}$. The end result of the modified Bahouri-Gérard procedure is obtained in Subsection 7.6, see in particular Theorem 7.23. We then have a minimal blowup solution $(A^\infty, \Phi^\infty)$ with the required compactness property.

• In Section 8 we rule out the existence of a minimal blowup solution $(A^\infty, \Phi^\infty)$ with the compactness property. To this end we largely follow the scheme of the rigidity argument by Kenig-Merle [10]. In Subsection 8.1 we derive several energy and virial identities for energy class solutions to (MKG-CG). Then we prove some preliminary properties of the minimal blowup solution $(A^\infty, \Phi^\infty)$, in particular that its momentum must vanish. Denoting by $I$ the lifespan of $(A^\infty, \Phi^\infty)$, we distinguish between $I^+ := I \cap [0, \infty)$ being a finite or an infinite time interval. In the next Subsection 8.2, we exclude the existence of a minimal blowup solution $(A^\infty, \Phi^\infty)$ with infinite time interval $I^+$ using the virial identities, the fact that the momentum of $(A^\infty, \Phi^\infty)$ must vanish and an additional Vitali covering argument introduced in [20]. Moreover, we reduce the case of finite lifespan $I^+$ to a self-similar blowup scenario. In the last Subsection 8.3, we then derive a suitable Lyapunov functional for the Maxwell-Klein-Gordon system in self-similar variables, which will enable us to also rule out the self-similar case and thus finish the rigidity argument. Finally, we address the proof of the scattering assertion in Theorem 1.2.

We remark that we will often abuse notation and denote the spatial components $A_x$ of the connection form simply by $A$.

Acknowledgments: The authors are grateful to the referee for valuable corrections and suggestions.

2. Function spaces and technical preliminaries

We will be working with the same function spaces that were used for the small data energy critical global well-posedness result for the MKG-CG system [22] together with their time-localized versions. In this section we briefly recall their definitions. For a more detailed discussion of these spaces we refer to Section 3 in [22] and [51], [40], [20].

In this work we only rely on the precise fine structure of these spaces in that we frequently use the multilinear estimates from [22] to reduce to “sufficiently generic” situations where a divisibility argument works, i.e. when all inputs are approximately at the same frequency and have angular separation between their frequency supports.

In order to introduce various Littlewood-Paley projection operators, we pick a non-negative even bump function $\varphi_0 \in C^\infty_0(\mathbb{R})$ satisfying $\varphi_0(y) = 1$ for $|y| \le 1$ and $\varphi_0(y) = 0$ for $|y| > 2$ and set $\varphi(y) = \varphi_0(y) - \varphi_0(2y)$. Then we define the standard Littlewood-Paley projection operators for $k \in \mathbb{Z}$ by
\[
\widehat{P_k f}(\xi) = \varphi\bigl(2^{-k}|\xi|\bigr) \hat{f}(\xi).
\]
We use the concept of modulation to measure proximity of the space-time Fourier support to the light cone and define for $j \in \mathbb{Z}$ the projection operators
\[
\mathcal{F}\bigl(Q_j f\bigr)(\tau, \xi) = \varphi\bigl(2^{-j}\bigl||\tau| - |\xi|\bigr|\bigr) \mathcal{F}(f)(\tau, \xi), \qquad
\mathcal{F}\bigl(Q^\pm_j f\bigr)(\tau, \xi) = \varphi\bigl(2^{-j}\bigl||\tau| - |\xi|\bigr|\bigr) \chi_{\{\pm\tau > 0\}} \mathcal{F}(f)(\tau, \xi),
\]
where $\mathcal{F}$ denotes the space-time Fourier transform. Occasionally, we also need multipliers $S_l$ to restrict the space-time frequency and correspondingly set for $l \in \mathbb{Z}$,
\[
\mathcal{F}\bigl(S_l f\bigr)(\tau, \xi) = \varphi\bigl(2^{-l}|(\tau, \xi)|\bigr) \mathcal{F}(f)(\tau, \xi).
\]
We also use projection operators $P^\omega_l$ to localize the homogeneous variable $\frac{\xi}{|\xi|}$ to caps $\omega \subset \mathbb{S}^3$ of diameter $\sim 2^l$ for integers $l < 0$ via cutoffs. We assume that for each such $l < 0$ these cutoffs form a smooth partition of unity subordinate to a uniformly finitely overlapping covering of $\mathbb{S}^3$ by caps $\omega$ of diameter $\sim 2^l$.

With these projection operators in hand we introduce the convention that for any norm $\|\cdot\|_S$ and any $p \in [1, \infty)$,
\[
\|F\|_{\ell^p S} = \Bigl( \sum_{k \in \mathbb{Z}} \|P_k F\|_S^p \Bigr)^{\frac{1}{p}}.
\]
Next we define the $X^{s,b}$ type norms applied to functions at spatial frequency $\sim 2^k$,
\[
\|F\|_{X^{s,b}_p} = 2^{sk} \Bigl( \sum_{j \in \mathbb{Z}} \bigl( 2^{bj} \|Q_j P_k F\|_{L^2_t L^2_x} \bigr)^p \Bigr)^{\frac{1}{p}}
\]
for $s, b \in \mathbb{R}$ and $p \in [1, \infty)$ with the obvious analogue for $p = \infty$,
\[
\|F\|_{X^{s,b}_\infty} = 2^{sk} \sup_{j \in \mathbb{Z}} 2^{bj} \|Q_j P_k F\|_{L^2_t L^2_x}.
\]
We will mainly use three function spaces $N$, $N^*$, and $S$. Their dyadic subspaces $N_k$, $N^*_k$ and $S_k$ satisfy
\[
N_k = L^1_t L^2_x + X^{0,-\frac{1}{2}}_1, \qquad N^*_k = L^\infty_t L^2_x \cap X^{0,\frac{1}{2}}_\infty, \qquad X^{0,\frac{1}{2}}_1 \subseteq S_k \subseteq N^*_k.
\]
Then we have
\[
\|F\|_N^2 = \sum_{k \in \mathbb{Z}} \|P_k F\|_{N_k}^2, \qquad \|F\|_{N^*}^2 = \sum_{k \in \mathbb{Z}} \|P_k F\|_{N^*_k}^2.
\]
The space $S_k$ is defined by
\[
\|\phi\|_{S_k}^2 = \|\phi\|_{S^{Str}_k}^2 + \|\phi\|_{S^{ang}_k}^2 + \|\phi\|_{X^{0,\frac{1}{2}}_\infty}^2,
\]
where
\[
S^{Str}_k = \bigcap_{\frac{1}{q} + \frac{3/2}{r} \le \frac{3}{4}} 2^{(\frac{1}{q} + \frac{4}{r} - 2)k} L^q_t L^r_x, \qquad
\|\phi\|_{S^{ang}_k}^2 = \sup_{l < 0} \sum_\omega \|P^\omega_l Q_{<k+2l} \phi\|_{S^\omega_k(l)}^2
\]
and the angular sector norms $S^\omega_k(l)$ are defined below.

To introduce the angular sector norms $S^\omega_k(l)$ we first define the plane wave space
\[
\|\phi\|_{PW^\pm_\omega(l)} = \inf_{\phi = \int \phi^{\omega'}} \int_{|\omega - \omega'| \le 2^l} \|\phi^{\omega'}\|_{L^2_{\pm\omega'} L^\infty_{(\pm\omega')^\perp}} \, d\omega'
\]
and the null energy space
\[
\|\phi\|_{NE} = \sup_\omega \|\slashed{\nabla}_\omega \phi\|_{L^\infty_\omega L^2_{\omega^\perp}},
\]
where the norms are with respect to $\ell^\pm_\omega = t \pm \omega \cdot x$ and the transverse variable, while $\slashed{\nabla}_\omega$ denotes spatial differentiation in the $(\ell^+_\omega)^\perp$ plane. We now set
\begin{align*}
\|\phi\|_{S^\omega_k(l)}^2 &= \|\phi\|_{S^{Str}_k}^2 + 2^{-2k} \|\phi\|_{NE}^2 + 2^{-3k} \sum_\pm \|Q^\pm \phi\|_{PW^\mp_\omega(l)}^2 \\
&\quad + \sup_{\substack{k' \le k, \, l' \le 0, \\ k + 2l \le k' + l' \le k + l}} \sum_{C_{k'}(l')} \Bigl( \|P_{C_{k'}(l')} \phi\|_{S^{Str}_k}^2 + 2^{-2k} \|P_{C_{k'}(l')} \phi\|_{NE}^2 \\
&\qquad\qquad + 2^{-2k' - k} \|P_{C_{k'}(l')} \phi\|_{L^2_t L^\infty_x}^2 + 2^{-3(k' + l')} \sum_\pm \|Q^\pm P_{C_{k'}(l')} \phi\|_{PW^\mp_\omega(l)}^2 \Bigr),
\end{align*}
where $P_{C_{k'}(l')}$ is a projection operator to a radially directed block $C_{k'}(l')$ of dimensions $2^{k'} \times (2^{k'+l'})^3$.

Then we define
\[
\|\phi\|_{S^1}^2 = \sum_{k \in \mathbb{Z}} \|\nabla_{t,x} P_k \phi\|_{S_k}^2 + \|\Box \phi\|_{\ell^1 L^2_t \dot{H}^{-\frac{1}{2}}_x}^2
\]
and the higher derivative norms
\[
\|\phi\|_{S^N} := \|\nabla^{N-1}_{t,x} \phi\|_{S^1}, \qquad N \ge 2.
\]
Moreover, we introduce
\[
\|u\|_{S^\sharp_k} = \|\nabla_{t,x} u\|_{L^\infty_t L^2_x} + \|\Box u\|_{N_k}.
\]
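As a sanity check on the Littlewood-Paley cutoffs introduced at the start of this section, the following short computation verifies that the dyadic pieces form a partition of unity away from the frequency origin. It is only a sketch, under the assumption that the bump function is normalized in the standard way, namely $\varphi_0 = 1$ on $\{|y| \le 1\}$, $\varphi_0 = 0$ on $\{|y| > 2\}$ and $\varphi(y) = \varphi_0(y) - \varphi_0(2y)$; the truncation parameter $M$ is introduced purely for this illustration.

```latex
% Assumption: \varphi_0 = 1 on |y| <= 1, \varphi_0 = 0 on |y| > 2,
% and \varphi(y) = \varphi_0(y) - \varphi_0(2y).
% M is a hypothetical truncation parameter used only in this check.
\begin{align*}
\sum_{k=-M}^{M} \varphi(2^{-k} y)
  &= \sum_{k=-M}^{M} \Bigl( \varphi_0(2^{-k} y) - \varphi_0(2^{-(k-1)} y) \Bigr) \\
  &= \varphi_0(2^{-M} y) - \varphi_0(2^{M+1} y).
\end{align*}
% For fixed y \neq 0 and M large we have 2^{-M}|y| <= 1 and 2^{M+1}|y| > 2,
% so the right-hand side equals 1 - 0 = 1. Hence
\[
\sum_{k \in \mathbb{Z}} \varphi\bigl(2^{-k}|\xi|\bigr) = 1 \qquad \text{for all } \xi \neq 0 .
\]
% Support: if |y| <= 1/2 then \varphi_0(y) = \varphi_0(2y) = 1, and if |y| > 2
% then both vanish, so \varphi is supported in { 1/2 <= |y| <= 2 } and P_k
% localizes to spatial frequencies 2^{k-1} <= |\xi| <= 2^{k+1}.
```

The same telescoping applies to the modulation cutoffs $Q_j$ and the space-time multipliers $S_l$, since they are built from the same function $\varphi$.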

On occasion we need to separate the two characteristic cones $\{\tau = \pm|\xi|\}$. To this end we define
\begin{align*}
&N_{k,\pm}, & N_k &= N_{k,+} \cap N_{k,-}, \\
&S^\sharp_{k,\pm}, & S^\sharp_k &= S^\sharp_{k,+} + S^\sharp_{k,-}, \\
&N^*_{k,\pm}, & N^*_k &= N^*_{k,+} + N^*_{k,-}.
\end{align*}
We will also use an auxiliary space of $L^1_t L^\infty_x$ type,
\[
\|\phi\|_Z = \sum_{k \in \mathbb{Z}} \|P_k \phi\|_{Z_k}, \qquad \|\phi\|_{Z_k}^2 = \sup_{l < C} \sum_\omega 2^l \|P^\omega_l Q_{k+2l} \phi\|_{L^1_t L^\infty_x}^2.
\]
Finally, to control the component $A_0$, we define
\[
\|A_0\|_{Y^1} = \|\nabla_{t,x} A_0\|_{L^\infty_t L^2_x} + \|A_0\|_{L^2_t \dot{H}^{3/2}_x} + \|\partial_t A_0\|_{L^2_t \dot{H}^{1/2}_x}
\]
and the higher derivative norms
\[
\|A_0\|_{Y^N} = \|\nabla^{N-1}_{t,x} A_0\|_{Y^1}, \qquad N \ge 2.
\]
The link between the $S$ and $N$ spaces is given by the following energy estimate from [22],
\[
\|\nabla_{t,x} \phi\|_S \lesssim \|\nabla_{t,x} \phi(0)\|_{L^2_x} + \|\Box \phi\|_N.
\]
We will need to work with time-localized versions of the $S_k$ and $N_k$ spaces. For any compact interval $I \subset \mathbb{R}$ and $k \in \mathbb{Z}$, we define
\[
\|\psi\|_{S_k(I \times \mathbb{R}^4)} := \inf_{\tilde{\psi}|_I = \psi|_I} \|P_k \tilde{\psi}\|_{S_k(\mathbb{R} \times \mathbb{R}^4)}
\]
with $\psi$ and $\tilde{\psi}$ Schwartz functions. Analogously, we define $N_k(I \times \mathbb{R}^4)$.

The following lemma shows that the $S_k$ and $Z_k$ spaces are compatible with time cutoffs. We will frequently use this fact without further mention.

Lemma 2.1.

Let $\chi_I$ be a smooth cutoff to a time interval $I \subset \mathbb{R}$. Then it holds for all $k \in \mathbb{Z}$ that
\[
\bigl\| P_k(\chi_I \phi) \bigr\|_{S_k(\mathbb{R} \times \mathbb{R}^4)} \lesssim \bigl\| P_k \phi \bigr\|_{S_k(\mathbb{R} \times \mathbb{R}^4)}
\]
and
\[
\bigl\| P_k(\chi_I \phi) \bigr\|_{Z_k(\mathbb{R} \times \mathbb{R}^4)} \lesssim \bigl\| P_k \phi \bigr\|_{Z_k(\mathbb{R} \times \mathbb{R}^4)}.
\]

Proof.

This is obvious for the Strichartz type norms. It remains to show it for the $X^{0,\frac{1}{2}}_\infty$ and $S^{ang}_k$ components. We start with the former. For fixed $j \in \mathbb{Z}$, we have
\[
Q_j\bigl(\chi_I \phi\bigr) = Q_j\bigl(Q_{j+O(1)}(\chi_I) Q_{\le j-C} \phi\bigr) + Q_j\bigl(\chi_I Q_{>j-C}(\phi)\bigr).
\]
Using the bound
\[
\bigl\| Q_{j+O(1)}(\chi_I) \bigr\|_{L^2_t} \lesssim 2^{-\frac{j}{2}},
\]
we obtain
\[
2^{\frac{j}{2}} \bigl\| Q_j\bigl(Q_{j+O(1)}(\chi_I) P_k Q_{\le j-C} \phi\bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{j}{2}} \bigl\| Q_{j+O(1)}(\chi_I) \bigr\|_{L^2_t} \bigl\| P_k Q_{\le j-C} \phi \bigr\|_{L^\infty_t L^2_x} \lesssim \bigl\| P_k \phi \bigr\|_{L^\infty_t L^2_x}.
\]
Moreover, we find
\[
2^{\frac{j}{2}} \bigl\| Q_j\bigl(\chi_I P_k Q_{>j-C}(\phi)\bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{j}{2}} \bigl\| \chi_I \bigr\|_{L^\infty_t L^\infty_x} \bigl\| P_k Q_{>j-C}(\phi) \bigr\|_{L^2_{t,x}} \lesssim \bigl\| P_k \phi \bigr\|_{X^{0,\frac{1}{2}}_\infty}.
\]
Thus, we have
\[
\bigl\| P_k(\chi_I \phi) \bigr\|_{X^{0,\frac{1}{2}}_\infty} \lesssim \bigl\| P_k \phi \bigr\|_{S_k}.
\]
Next, we consider the $S^{ang}_k$ component, which is given by
\[
\|\phi\|_{S^{ang}_k}^2 = \sup_{l < 0} \sum_\omega \bigl\| P^\omega_l Q_{<k+2l} \phi \bigr\|_{S^\omega_k(l)}^2.
\]
We write
\[
P^\omega_l Q_{<k+2l}(\chi_I \phi) = P^\omega_l Q_{<k+2l}(\chi_I Q_{<k+2l+C} \phi) + P^\omega_l Q_{<k+2l}(\chi_I Q_{\ge k+2l+C} \phi).
\]
Then the first term on the right hand side is bounded by
\[
\bigl\| P^\omega_l Q_{<k+2l}(\chi_I Q_{<k+2l+C} \phi) \bigr\|_{S^\omega_k(l)} \lesssim \bigl\| P^\omega_l Q_{<k+2l+C} \phi \bigr\|_{S^\omega_k(l)},
\]
where we have used the fact that the operator $P^\omega_l Q_{<k+2l}$ is disposable. For the second term above, we use that
\[
\sum_\omega \bigl\| P^\omega_l Q_{<k+2l}(\chi_I Q_{\ge k+2l+C} \phi) \bigr\|_{S^\omega_k(l)}^2 \lesssim \bigl\| P_k Q_{<k+2l}(\chi_I Q_{\ge k+2l+C} \phi) \bigr\|_{X^{0,\frac{1}{2}}_1}^2 \lesssim \bigl\| \phi \bigr\|_{X^{0,\frac{1}{2}}_\infty}^2.
\]
For the $Z_k$ space, fix a scale $l < 0$ and consider
\[
\sum_\omega 2^l \bigl\| P^\omega_l Q_{k+2l}\bigl(\chi_I \phi\bigr) \bigr\|_{L^1_t L^\infty_x}^2.
\]
Write
\[
P^\omega_l Q_{k+2l}\bigl(\chi_I \phi\bigr) = P^\omega_l Q_{k+2l}\bigl(Q_{<k+2l-C}(\chi_I) \phi\bigr) + P^\omega_l Q_{k+2l}\bigl(Q_{\ge k+2l-C}(\chi_I) \phi\bigr).
\]
For the first term on the right hand side, we have
\[
\bigl\| P^\omega_l Q_{k+2l}\bigl(Q_{<k+2l-C}(\chi_I) \phi\bigr) \bigr\|_{L^1_t L^\infty_x} \lesssim \bigl\| P^\omega_l Q_{k+2l+O(1)} \phi \bigr\|_{L^1_t L^\infty_x},
\]
which leads to an acceptable contribution. For the second term on the right hand side, we use
\[
\bigl\| P^\omega_l Q_{k+2l}\bigl(Q_{\ge k+2l-C}(\chi_I) \phi\bigr) \bigr\|_{L^1_t L^\infty_x} \lesssim \bigl\| Q_{\ge k+2l-C}(\chi_I) \bigr\|_{L^2_t} \, 2^{\frac{l}{2} + \frac{k}{2}} \Bigl( 2^{-\frac{l}{2} - \frac{k}{2}} \bigl\| P^\omega_l \phi_k \bigr\|_{L^2_t L^\infty_x} \Bigr).
\]
It follows that
\[
2^{\frac{l}{2}} \bigl\| P^\omega_l Q_{k+2l}\bigl(Q_{\ge k+2l-C}(\chi_I) \phi\bigr) \bigr\|_{L^1_t L^\infty_x} \lesssim \Bigl( 2^{-\frac{l}{2} - \frac{k}{2}} \bigl\| P^\omega_l \phi_k \bigr\|_{L^2_t L^\infty_x} \Bigr),
\]
which can be square-summed over $\omega$, see (9) in [22]. $\Box$
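The proof above relies on an $L^2_t$ bound for the high-modulation part of the time cutoff. A short derivation of that bound is sketched here. It assumes that $\chi_I$ has uniformly bounded total variation, i.e. $\|\chi_I'\|_{L^1_t} \lesssim 1$ independently of the length of $I$ (for a sharp cutoff $\chi_I'$ is read as a sum of two point masses), and it uses that $\chi_I$ does not depend on $x$, so that on its Fourier support the modulation $||\tau| - |\xi||$ reduces to $|\tau|$.

```latex
% Assumption: \chi_I depends on t only and \|\chi_I'\|_{L^1_t} \lesssim 1
% uniformly in |I| (e.g. one monotone rise from 0 to 1 and one fall back).
\[
\widehat{\chi_I}(\tau) = \frac{1}{i\tau}\, \widehat{\chi_I'}(\tau)
\quad\Longrightarrow\quad
\bigl| \widehat{\chi_I}(\tau) \bigr| \le \frac{\|\chi_I'\|_{L^1_t}}{|\tau|}
\qquad (\tau \neq 0).
\]
% Q_{j+O(1)} restricts the time frequency of \chi_I to |\tau| \sim 2^j,
% so Plancherel in t gives
\begin{align*}
\bigl\| Q_{j+O(1)}(\chi_I) \bigr\|_{L^2_t}^2
  &\lesssim \int_{|\tau| \sim 2^j} \bigl| \widehat{\chi_I}(\tau) \bigr|^2 \, d\tau
   \lesssim \|\chi_I'\|_{L^1_t}^2 \int_{|\tau| \sim 2^j} \frac{d\tau}{\tau^2}
   \sim 2^{-j}\, \|\chi_I'\|_{L^1_t}^2 ,
\end{align*}
% and taking square roots yields the bound used in the proof:
\[
\bigl\| Q_{j+O(1)}(\chi_I) \bigr\|_{L^2_t} \lesssim 2^{-\frac{j}{2}} .
\]
```

Note that the bound is independent of the length of $I$, which is what allows the lemma to be applied uniformly on all intervals of a partition of the time axis.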

3. Microlocalized magnetic wave equation

In this section we assume that the spatial components of the connection form $A$ are solutions to the linear wave equation $\Box A_j = 0$ on $\mathbb{R}_t \times \mathbb{R}^4_x$ for $j = 1, \dots, 4$ and that $A$ is in Coulomb gauge. We define the magnetic wave operator
\[
\Box^p_A = \Box + 2i \sum_{k \in \mathbb{Z}} P_{\le k-C} A^j P_k \partial_j. \tag{3.1}
\]
The goal of this section is to derive the following linear estimate for the magnetic wave operator $\Box^p_A$.

Theorem 3.1.

Suppose that $\Box A_j = 0$ on $\mathbb{R}_t \times \mathbb{R}^4_x$ for $j = 1, \dots, 4$ and that $A$ is in Coulomb gauge. For all $f \in N(\mathbb{R} \times \mathbb{R}^4)$ and $(g, h) \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$, the solution to the magnetic wave equation
\[
\Box^p_A \phi = f \ \text{ on } \mathbb{R} \times \mathbb{R}^4, \qquad (\phi, \phi_t)|_{t=0} = (g, h) \tag{3.2}
\]
exists globally and satisfies
\[
\|\phi\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \le C \Bigl( \|g\|_{\dot{H}^1_x} + \|h\|_{L^2_x} + \|f\|_{(N \cap \ell^1 L^2_t \dot{H}^{-\frac{1}{2}}_x)(\mathbb{R} \times \mathbb{R}^4)} \Bigr), \tag{3.3}
\]
where the constant $C > 0$ depends only on $\|\nabla_{t,x} A\|_{L^2_x}$ and grows at most polynomially in $\|\nabla_{t,x} A\|_{L^2_x}$.

Proof. By time reversibility it suffices to prove the existence of the solution $\phi$ on the time interval $[0, \infty)$. Let $\varepsilon > 0$ be a sufficiently small constant to be fixed later. We may cover the time interval $[0, \infty)$ by finitely many consecutive closed intervals $I_1, \dots, I_J$ with the following properties. The number of intervals $J$ depends only on $\|\nabla_{t,x} A\|_{L^2_x}$ and $\varepsilon$, the intervals $I_j$ overlap at most two at a time, consecutive intervals have intersection with non-empty interior and $[0, \infty) = \bigcup_{j=1}^{J} I_j$. Most importantly, the intervals $I_j$ are chosen such that a finite number of suitable space-time norms of the magnetic potential $A$ that will be specified later are less than $\varepsilon$ uniformly on all intervals $I_j$.

We first construct suitable local solutions to the magnetic wave equation (3.2) on the intervals $I_j$. The precise statement is summarized in the following theorem. Its proof will be given further below and is based on a parametrix construction. The accuracy of the parametrix crucially relies on the above mentioned smallness of suitable space-time norms of the magnetic potential $A$ on the intervals $I_j$. We use the notation $I_j = [T^{(l)}_j, T^{(r)}_j]$ for the left and right endpoints of $I_j$.

Theorem 3.2.

Let $f \in N(\mathbb{R} \times \mathbb{R}^4)$ and $(\tilde{g}, \tilde{h}) \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$. For $j = 1, \dots, J$ there exists a solution $\phi^{(j)} \in S^1(\mathbb{R} \times \mathbb{R}^4)$ to
\[
\Box^p_A \phi^{(j)} = f \ \text{ on } I_j \times \mathbb{R}^4, \qquad (\phi^{(j)}, \phi^{(j)}_t)|_{t = T^{(l)}_j} = (\tilde{g}, \tilde{h}) \tag{3.4}
\]
in the sense that $\|\chi_{I_j}(\Box^p_A \phi^{(j)} - f)\|_{N(\mathbb{R} \times \mathbb{R}^4)} = 0$ for a sharp cutoff $\chi_{I_j}$ to the time interval $I_j$. Moreover, it holds that
\[
\|\phi^{(j)}\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \le C \Bigl( \|\tilde{g}\|_{\dot{H}^1_x} + \|\tilde{h}\|_{L^2_x} + \|f\|_{(N \cap \ell^1 L^2_t \dot{H}^{-\frac{1}{2}}_x)(\mathbb{R} \times \mathbb{R}^4)} \Bigr), \tag{3.5}
\]
where the constant $C > 0$ depends only on $\|\nabla_{t,x} A\|_{L^2_x}$.

Finally, we obtain the solution $\phi$ to the magnetic wave equation (3.2) on $[0, \infty) \times \mathbb{R}^4$ by patching together suitable local solutions on the intervals $I_j$. Given $(g, h) \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$ and $f \in N(\mathbb{R} \times \mathbb{R}^4)$, Theorem 3.2 yields a solution $\phi^{(1)} \in S^1(\mathbb{R} \times \mathbb{R}^4)$ on $I_1 = [0, T^{(r)}_1]$ to
\[
\Box^p_A \phi^{(1)} = f \ \text{ on } I_1 \times \mathbb{R}^4, \qquad (\phi^{(1)}, \phi^{(1)}_t)|_{t=0} = (g, h).
\]
Next, we obtain a solution $\phi^{(2)} \in S^1(\mathbb{R} \times \mathbb{R}^4)$ on $I_2 = [T^{(l)}_2, T^{(r)}_2]$ to
\[
\Box^p_A \phi^{(2)} = f \ \text{ on } I_2 \times \mathbb{R}^4, \qquad (\phi^{(2)}, \phi^{(2)}_t)|_{t = T^{(l)}_2} = \bigl( \phi^{(1)}(T^{(l)}_2), \phi^{(1)}_t(T^{(l)}_2) \bigr),
\]
where we recall that $I_1 \cap I_2 \neq \emptyset$ with $T^{(l)}_2 < T^{(r)}_1$. We proceed analogously for the remaining intervals $I_3, \dots, I_J$. By uniqueness, we must have
\[
\phi^{(j)}|_{I_j \cap I_{j+1}} = \phi^{(j+1)}|_{I_j \cap I_{j+1}}
\]
for $j = 1, \dots, J-1$. We choose a smooth partition of unity $\{\chi_j\}$ subordinate to the cover $\{I_j\}$ such that $\mathrm{supp}(\chi_j) \subset I_j$ and $\mathrm{supp}(\chi_j') \subset\subset (I_{j-1} \cap I_j) \cup (I_j \cap I_{j+1})$. We then define
\[
\phi = \sum_{j=1}^{J} \chi_j \phi^{(j)}.
\]
Since we have $\chi_j' + \chi_{j+1}' = 0$ on $I_j \cap I_{j+1}$ for $j = 1, \dots, J-1$, it follows that
\[
\sum_{j=1}^{J} \chi_j' \phi^{(j)} = 0 \ \text{ on } \mathbb{R}_t \times \mathbb{R}^4_x
\]
and hence,
\[
\nabla_{t,x} \sum_{j=1}^{J} \chi_j \phi^{(j)} = \sum_{j=1}^{J} \chi_j \nabla_{t,x} \phi^{(j)} \ \text{ on } \mathbb{R}_t \times \mathbb{R}^4_x.
\]
Similarly, we find that
\[
\Box \sum_{j=1}^{J} \chi_j \phi^{(j)} = \sum_{j=1}^{J} \chi_j \Box \phi^{(j)}.
\]
Using Lemma 2.1 and estimate (3.5), we thus conclude that
\begin{align*}
\|\phi\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} &= \Bigl\| \sum_{j=1}^{J} \chi_j \phi^{(j)} \Bigr\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \lesssim \sum_{j=1}^{J} \|\phi^{(j)}\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \\
&\lesssim C(\|\nabla_{t,x} A\|_{L^2_x}) \Bigl( \sum_{j=1}^{J} \|\phi^{(j)}(T^{(l)}_j)\|_{\dot{H}^1_x} + \|\partial_t \phi^{(j)}(T^{(l)}_j)\|_{L^2_x} + \|f\|_{N(\mathbb{R} \times \mathbb{R}^4)} \Bigr) \\
&\lesssim C(J) \, C(\|\nabla_{t,x} A\|_{L^2_x}) \bigl( \|g\|_{\dot{H}^1_x} + \|h\|_{L^2_x} + \|f\|_{N(\mathbb{R} \times \mathbb{R}^4)} \bigr).
\end{align*}
Since $J$ depends only on the size of $\|\nabla_{t,x} A\|_{L^2_x}$ and $\varepsilon$, we obtain the desired estimate (3.3). $\Box$

We proceed with the proof of Theorem 3.2.
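The patching step in the proof of Theorem 3.1 above can be spelled out in a few lines. The sketch below uses only the properties stated there (the intervals overlap at most two at a time, the cutoffs $\chi_j$ depend on $t$ alone and sum to one, and consecutive local solutions agree on overlaps); the observation that $\Box^p_A - \Box$ acts only in the spatial variables for fixed $t$ is an additional remark read off from the definition (3.1), not a statement taken from the source.

```latex
% Setting: \phi = \sum_j \chi_j \phi^{(j)}, with \chi_j = \chi_j(t),
% \sum_j \chi_j = 1 on [0,\infty), and \phi^{(j)} = \phi^{(j+1)} on I_j \cap I_{j+1}.
\begin{align*}
\partial_t \phi
  &= \sum_{j} \chi_j' \, \phi^{(j)} + \sum_{j} \chi_j \, \partial_t \phi^{(j)} .
\end{align*}
% On an overlap I_j \cap I_{j+1} only \chi_j and \chi_{j+1} are non-zero, so
% \chi_j + \chi_{j+1} = 1 there and \chi_j' + \chi_{j+1}' = 0. Therefore
\[
\chi_j' \phi^{(j)} + \chi_{j+1}' \phi^{(j+1)}
  = \bigl( \chi_j' + \chi_{j+1}' \bigr) \phi^{(j)} = 0
  \qquad \text{on } I_j \cap I_{j+1},
\]
% and \chi_j' vanishes away from the overlaps, so the first sum is zero.
% The same argument applied to \partial_t \phi^{(j)} (which also agree on
% the overlaps) removes the cross term in the second time derivative:
\[
\partial_t^2 \phi = \sum_{j} \chi_j \, \partial_t^2 \phi^{(j)},
\qquad
\Box \phi = \sum_{j} \chi_j \, \Box \phi^{(j)} .
\]
% Remark (assumption read off from (3.1)): P_k, \partial_j and multiplication
% by P_{\le k-C} A^j all commute with multiplication by \chi_j(t), hence
\[
\Box^p_A \phi = \sum_{j} \chi_j \, \Box^p_A \phi^{(j)} = \sum_{j} \chi_j \, f = f
\qquad \text{on } [0,\infty) \times \mathbb{R}^4 .
\]
```

In particular no derivative ever falls on a cutoff in the final identity, which is why the patched function inherits the bounds of the local solutions through Lemma 2.1 alone.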

Proof of Theorem 3.2.

We begin by considering for every $k \in \mathbb{Z}$ the frequency localized problem
\[
\Box^p_{A_{<k}} \phi^{(j)}_k = f_k \ \text{ on } I_j \times \mathbb{R}^4, \qquad (\phi^{(j)}_k, \partial_t \phi^{(j)}_k)|_{t = T^{(l)}_j} = (\tilde{g}_k, \tilde{h}_k), \tag{3.6}
\]
where $\Box^p_{A_{<k}} = \Box + 2i P_{\le k-C} A^j P_k \partial_j$. Let $\chi_{I_j}$ denote a sharp cutoff to the time interval $I_j$. We first want to construct an approximate solution $\phi^{(j)}_{app,k}$ to (3.6) that satisfies
\[
\|\phi^{(j)}_{app,k}\|_{S^1_k(\mathbb{R} \times \mathbb{R}^4)} \le C \Bigl( \|\tilde{g}_k\|_{\dot{H}^1_x} + \|\tilde{h}_k\|_{L^2_x} + \|f_k\|_{N_k(\mathbb{R} \times \mathbb{R}^4)} \Bigr) \tag{3.7}
\]
and
\begin{align*}
&\|\phi^{(j)}_{app,k}(T^{(l)}_j) - \tilde{g}_k\|_{\dot{H}^1_x} + \|\partial_t \phi^{(j)}_{app,k}(T^{(l)}_j) - \tilde{h}_k\|_{L^2_x} + \|\chi_{I_j}(\Box^p_{A_{<k}} \phi^{(j)}_{app,k} - f_k)\|_{(N_k \cap L^2_t \dot{H}^{-\frac{1}{2}}_x)(\mathbb{R} \times \mathbb{R}^4)} \\
&\qquad \lesssim \varepsilon \Bigl( \|\tilde{g}_k\|_{\dot{H}^1_x} + \|\tilde{h}_k\|_{L^2_x} + \|f_k\|_{(N_k \cap L^2_t \dot{H}^{-\frac{1}{2}}_x)(\mathbb{R} \times \mathbb{R}^4)} \Bigr). \tag{3.8}
\end{align*}

To this end we split f k = f hypk + f ellk , where f hypk is supported in the region || τ | − | ξ || . k . We note that it holds that(3.9) k (cid:3) − f ellk k S k ( R × R ) . k f ellk k N k ( R × R ) . Theorem 3.3 below then yields an approximate solution ˜ φ ( j ) app , k to(3.10)  (cid:3) ˜ φ ( j ) app , k = f hypk on I j × R , ( ˜ φ ( j ) app , k , ∂ t ˜ φ ( j ) app , k ) | t = T ( l ) j = (˜ g k , ˜ h k ) − (( (cid:3) − f ellk )( T ( l ) j ) , ( ∂ t (cid:3) − f ellk )( T ( l ) j ))that satisﬁes(3.11) (cid:13)(cid:13)(cid:13) ˜ φ ( j ) app , k (cid:13)(cid:13)(cid:13) S k ( R × R ) . k ˜ g k k ˙ H x + k ˜ h k k L x + k f k k N k ( R × R ) and (cid:13)(cid:13)(cid:13) ˜ φ ( j ) app , k ( T ( l ) j ) − (cid:0) ˜ g k − ( (cid:3) − f ellk )( T ( l ) j ) (cid:1)(cid:13)(cid:13)(cid:13) ˙ H x + (cid:13)(cid:13)(cid:13) ∂ t ˜ φ ( j ) app , k ( T ( l ) j ) − (cid:0) ˜ h k − ( ∂ t (cid:3) − f ellk )( T ( l ) j ) (cid:1)(cid:13)(cid:13)(cid:13) L x + (cid:13)(cid:13)(cid:13) χ I j (cid:0) (cid:3) A p < k ˜ φ ( j ) app , k − f hypk (cid:1)(cid:13)(cid:13)(cid:13) N k ( R × R ) . ε (cid:0) k ˜ g k k ˙ H x + k ˜ h k k L x + k f k k N k ( R × R ) (cid:1) . (3.12)We remark that because of scaling invariance Theorem 3.3 below is only formulated for the case k =

0. Next we set φ ( j ) app , k = ˜ φ ( j ) app , k + ( (cid:3) − f ellk )and ﬁnd that (cid:13)(cid:13)(cid:13) χ I j (cid:0) (cid:3) pA < k φ ( j ) app , k − f k (cid:1)(cid:13)(cid:13)(cid:13) ( N k ∩ L t ˙ H − x )( R × R ) . (cid:13)(cid:13)(cid:13) χ I j (cid:0) (cid:3) pA < k ˜ φ ( j ) app , k − f hypk (cid:1)(cid:13)(cid:13)(cid:13) ( N k ∩ L t ˙ H − x )( R × R ) + (cid:13)(cid:13)(cid:13) χ I j A j < k P k ∂ j ( (cid:3) − f ellk ) (cid:13)(cid:13)(cid:13) ( N k ∩ L t ˙ H − x )( R × R ) . ε k f k k ( N k ∩ L t ˙ H − )( R × R ) . (3.13)Here we used that the intervals I j can be chosen such that uniformly for all j = , . . . , J , k A k L t L x ( I j × R ) ≤ ε and thus, (cid:13)(cid:13)(cid:13) χ I j A j < k P k ∂ j ( (cid:3) − f ellk ) (cid:13)(cid:13)(cid:13) ( N k ∩ L t ˙ H − x )( R × R ) ≤ (cid:13)(cid:13)(cid:13) χ I j A j < k P k ∂ j ( (cid:3) − f ellk ) (cid:13)(cid:13)(cid:13) ( L t L x ∩ L t ˙ H − x )( R × R ) . (cid:13)(cid:13)(cid:13) χ I j A < k (cid:13)(cid:13)(cid:13) ( L t L ∞ x )( R × R ) (cid:13)(cid:13)(cid:13) P k ∇ x ( (cid:3) − f ellk ) (cid:13)(cid:13)(cid:13) ( L t L x ∩ L ∞ t ˙ H − x )( R × R ) . k A k L t L x ( I j × R ) k (cid:13)(cid:13)(cid:13) P k ∇ x ( (cid:3) − f ellk ) (cid:13)(cid:13)(cid:13) ( L t L x ∩ L ∞ t ˙ H − x )( R × R ) . ε (cid:13)(cid:13)(cid:13) (cid:3) − f ellk (cid:13)(cid:13)(cid:13) S k ( R × R ) . ε k f k k N k ( R × R ) . From (3.9), (3.11), (3.12), and (3.13) it now follows immediately that φ ( j ) app , k is an approximatesolution to (3.6) that satisﬁes the estimates (3.7) and (3.8). Finally, we reassemble the approximate solutions φ ( j ) app , k to the frequency localized problems(3.6) to a full approximate solution φ ( j ) app = P k ∈ Z φ ( j ) app , k to (3.4) satisfying k φ ( j ) app ( T ( l ) j ) − ˜ g k ˙ H x + k ∂ t φ ( j ) app ( T ( l ) j ) − ˜ h k L x + k χ I j ( (cid:3) pA φ ( j ) app − f ) k ( N ∩ ℓ L t ˙ H − x )( R × R ) . 
ε (cid:0) k ˜ g k ˙ H x + k ˜ h k L x + k f k ( N ∩ ℓ L t ˙ H − x )( R × R ) (cid:1) and (cid:13)(cid:13)(cid:13) φ ( j ) app (cid:13)(cid:13)(cid:13) S ( R × R ) . k ˜ g k ˙ H x + k ˜ h k L x + k f k ( N ∩ ℓ L t ˙ H − x )( R × R ) . Applying this procedure iteratively to the successive errors, we obtain an exact solution φ ( j ) to (3.4)satisfying (3.5). (cid:3) We now turn to the heart of the matter, namely the construction of the approximate solutions tothe frequency localized magnetic wave equations.
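The iteration that closes the proof of Theorem 3.2 ("applying this procedure iteratively to the successive errors") is a standard geometric series argument, which the following sketch makes explicit. The shorthand $\mathcal{D}$ for the triple of data, the norm $\|\mathcal{D}\|$, the constant $C_0$ and the index $m$ are hypothetical notation introduced only here; $C_0$ stands for the implicit constant in the two estimates for the approximate solution, and $\varepsilon$ is assumed small enough that $C_0 \varepsilon \le \frac{1}{2}$.

```latex
% Hypothetical shorthand: \mathcal{D} = (\tilde g, \tilde h, f) with
% \|\mathcal{D}\| := \|\tilde g\|_{\dot H^1_x} + \|\tilde h\|_{L^2_x}
%                    + \|f\|_{N \cap \ell^1 L^2_t \dot H^{-1/2}_x}.
% Step m: let \phi_m be the approximate solution for the data
% \mathcal{D}_m = (\tilde g_m, \tilde h_m, f_m), \mathcal{D}_0 = \mathcal{D},
% and feed the errors back in as new data:
\[
\mathcal{D}_{m+1} = \Bigl( \tilde g_m - \phi_m(T^{(l)}_j), \
  \tilde h_m - \partial_t \phi_m(T^{(l)}_j), \
  \chi_{I_j} \bigl( f_m - \Box^p_A \phi_m \bigr) \Bigr).
\]
% The two estimates for the approximate solution read
\[
\|\phi_m\|_{S^1} \le C_0 \|\mathcal{D}_m\|,
\qquad
\|\mathcal{D}_{m+1}\| \le C_0 \varepsilon \, \|\mathcal{D}_m\| ,
\]
% so that, by induction and for C_0 \varepsilon \le 1/2,
\begin{align*}
\|\mathcal{D}_m\| \le (C_0 \varepsilon)^m \|\mathcal{D}\|,
\qquad
\sum_{m \ge 0} \|\phi_m\|_{S^1}
  \le C_0 \sum_{m \ge 0} (C_0 \varepsilon)^m \|\mathcal{D}\|
  = \frac{C_0}{1 - C_0 \varepsilon} \|\mathcal{D}\|
  \le 2 C_0 \|\mathcal{D}\| .
\end{align*}
% The series \phi^{(j)} := \sum_m \phi_m therefore converges in S^1, and the
% partial sums telescope: after M steps the remaining error in the data and
% in the equation is exactly \mathcal{D}_{M+1}, which tends to zero.
```

The smallness of $\varepsilon$ needed here is the reason the time axis is first split into the intervals $I_j$ on which suitable norms of $A$ are small.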

Theorem 3.3.

Let (˜ g , ˜ h ) ∈ ˙ H x ( R ) × L x ( R ) and ˜ f ∈ N ( R × R ) . Assume that ˜ f , ˜ g , ˜ h are frequencylocalized at | ξ | ∼ and that ˜ f is localized at modulation || τ | − | ξ || . . For j = , . . . , J there existsan approximate solution ˜ φ ( j ) app to (3.14)  (cid:3) pA < φ = ˜ f on I j × R , ( φ, φ t ) | t = T ( l ) j = (˜ g , ˜ h ) in the sense that (3.15) (cid:13)(cid:13)(cid:13) ˜ φ ( j ) app (cid:13)(cid:13)(cid:13) S ( R × R ) . k ˜ g k L x + k ˜ h k L x + k ˜ f k N ( R × R ) and (cid:13)(cid:13)(cid:13) ˜ φ ( j ) app ( T ( l ) j ) − ˜ g (cid:13)(cid:13)(cid:13) L x + (cid:13)(cid:13)(cid:13) ∂ t ˜ φ ( j ) app ( T ( l ) j ) − ˜ h (cid:13)(cid:13)(cid:13) L x + (cid:13)(cid:13)(cid:13) χ I j (cid:0) (cid:3) pA < ˜ φ ( j ) app − ˜ f (cid:1)(cid:13)(cid:13)(cid:13) N ( R × R ) . ε (cid:0) k ˜ g k L x + k ˜ h k L x + k ˜ f k N ( R × R ) (cid:1) , (3.16) where χ I j denotes a sharp cuto ﬀ to the time interval I j .Proof. In order to prove estimates and construct a parametrix for the frequency localized magneticwave equation (3.14) we adapt the scheme in Section 6 of [22] to our time-localized setting. We willuse frequency localized renormalization operators e − i ψ ± < ( t , x , D ) and e + i ψ ± < ( D , y , s ), where P ( x , D )denotes the left quantization and P ( D , y ) the right quantization of a pseudodi ﬀ erential operator P and where the subscript < ≪

1. For the deﬁnition of the phase correction ψ ± in the renormalization operator e + i ψ ± < ( D , y , s ) we need to introduce some notation.For any ξ ∈ R \{ } we set ω = ξ | ξ | , L ω ± : = ± ∂ t + ω · ∇ x , ∆ ω ⊥ : = ∆ − ( ω · ∇ x ) . Moreover, for any ω ∈ S and any angle 0 < θ .

1, we deﬁne the sector projection Π ω>θ in frequencyspace by the formula [ Π ω>θ f ( ζ ) : = (cid:16) − η (cid:16) ∠ ( ζ, ω ) θ (cid:17)(cid:17)(cid:16) − η (cid:16) ∠ ( − ζ, ω ) θ (cid:17)(cid:17) b f ( ζ ) , ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 21 where η ( y ) is a bump function on R which equals 1 when | y | < and vanishes for | y | >

1, and ∠ ( ζ, ω )is the angle between ζ and ω . Thus, Π ω>θ restricts f smoothly (except at the frequency origin) to thesector of frequencies ζ whose angle with both ω and − ω is & θ . Similarly, we deﬁne the Fouriermultipliers Π ωθ , Π ω ≤ θ and Π ωθ > · >θ .Let C , C > ﬃ ciently large later on depending on the size of k∇ t , x A k L x and let σ > ﬃ ciently small. We then deﬁne the phasecorrection ψ ± by(3.17) ψ ± ( t , x , ξ ) = X − C ≤ k < L ω ± ∆ − ω ⊥ Π ω − C > · > σ k A k · ω + X k < − C L ω ± ∆ − ω ⊥ Π ω> σ k A k · ω. Note that the ﬁrst sum e ﬀ ectively only starts at k . − C σ . See Section 6 in [22] for a motivation forsuch a choice of phase correction. We emphasize that this phase slightly di ﬀ ers from the one usedin [22], because for intermediate frequencies − C ≤ k < ﬀ .We deﬁne the approximate solution ˜ φ ( j ) app to (3.14) by˜ φ ( j ) app = χ I j ( t ) 12 X ± (cid:26) e − i ψ ± < ( t , x , D ) 1 | D | e ± i ( t − T ( l ) j ) | D | e i ψ ± < ( D , y , T ( l ) j )( | D | ˜ g ± ( − i )˜ h ) ± e − i ψ ± < ( t , x , D ) 1 | D | K ± j e i ψ ± < ( D , y , s )( − i ) ˜ f (cid:27) , where K ± j ˜ f ( t ) = Z tT ( l ) j e ± i ( t − s ) | D | ˜ f ( s ) ds . In order to prove the estimates (3.15) and (3.16) we establish the following crucial time-localizedmapping properties of the renormalization operator e ± i ψ ± < ( t , x , D ). Theorem 3.4.

For $j = 1, \dots, J$, the frequency localized renormalization operators have the following mapping properties with $Z \in \{ N_0(\mathbb{R} \times \mathbb{R}^4), L^2_x(\mathbb{R}^4), N^*_0(\mathbb{R} \times \mathbb{R}^4) \}$,
\begin{align*}
\chi_{I_j} e^{\pm i \psi_\pm}_{<0} &: Z \to Z, \tag{3.18} \\
\chi_{I_j} \partial_t e^{\pm i \psi_\pm}_{<0} &: Z \to \varepsilon Z, \tag{3.19} \\
\chi_{I_j} \bigl( e^{-i \psi_\pm}_{<0}(t, x, D) \, e^{+i \psi_\pm}_{<0}(D, y, t) - 1 \bigr) &: Z \to \varepsilon Z, \tag{3.20} \\
\chi_{I_j} \bigl( e^{-i \psi_\pm}_{<0}(t, x, D) \, \Box - \Box^p_{A_{<0}} \, e^{-i \psi_\pm}_{<0}(t, x, D) \bigr) &: N^*_{0,\pm}(\mathbb{R} \times \mathbb{R}^4) \to \varepsilon N_{0,\pm}(\mathbb{R} \times \mathbb{R}^4), \tag{3.21} \\
\chi_{I_j} e^{-i \psi_\pm}_{<0}(t, x, D) &: S^\sharp_0(\mathbb{R} \times \mathbb{R}^4) \to S_0(\mathbb{R} \times \mathbb{R}^4), \tag{3.22}
\end{align*}
where $\chi_{I_j}$ denotes a sharp cutoff to the time interval $I_j$. In the estimates (3.18) and (3.19), the operator $e^{\pm i \psi_\pm}_{<0}$, respectively $\partial_t e^{\pm i \psi_\pm}_{<0}$, stands for both left and right quantization.

The estimates (3.15) and (3.16) then follow by adapting the manipulations in the proof of Theorem 4 in [22] to our time-localized setting. $\Box$

The remainder of this section is devoted to the proof of Theorem 3.4. To this end we will adapt the general scheme of Sections 7 – 11 in [22] to our large data setting. The accuracy of the approximate solution $\tilde{\phi}^{(j)}_{app}$ relies on the error estimates (3.19), (3.20) and (3.21). While in [22] the small energy assumption can be used to achieve smallness in the corresponding error estimates, we have to argue more carefully here, using the high angle cut-off in the definition of the phase correction and smallness of suitable space-time norms of $A$ on sufficiently small time intervals, namely the intervals $I_j$.

3.1. Decomposable function spaces.

We begin by reviewing the notion of decomposable functionspaces and estimates from [35], [21], and [22].Let c ( t , x , D ) be a pseudodi ﬀ erential operator whose symbol c ( t , x , ξ ) is homogeneous of degree0 in ξ . Assume that c has a representation c = X θ ∈ − N c ( θ ) . Let 1 ≤ q , r ≤ ∞ . For every θ ∈ − N , we deﬁne k c ( θ ) k D θ ( L qt L rx )( R × R ) = (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18) X l = X Γ νθ sup ω ∈ Γ νθ (cid:13)(cid:13)(cid:13) b νθ ( ω )( θ ∇ ξ ) l c ( θ ) (cid:13)(cid:13)(cid:13) L rx (cid:19) / (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt ( R ) , where { Γ νθ } ν ∈ S is a uniformly ﬁnitely overlapping covering of S by caps of diameter ∼ θ and { b νθ } ν ∈ S is a smooth partition of unity subordinate to the covering { Γ νθ } ν ∈ S . Then we deﬁne thedecomposable norm k c k D ( L qt L rx )( R × R ) = inf c = P θ c ( θ ) X θ ∈ − N k c ( θ ) k D θ ( L qt L rx )( R × R ) . We will repeatedly use the following decomposable estimates.

Lemma 3.5 ([22, Lemma 7.1]). Let $P(t, x, D)$ be a pseudodifferential operator with symbol $p(t, x, \xi)$. Suppose that $P$ satisfies the fixed-time estimate
\[
\sup_{t \in \mathbb{R}} \|P(t, x, D)\|_{L^2_x \to L^2_x} \lesssim 1.
\]
Let $1 \le q, q_1, q_2, r, r_1 \le \infty$ such that $\frac{1}{q} = \frac{1}{q_1} + \frac{1}{q_2}$ and $\frac{1}{r} = \frac{1}{r_1} + \frac{1}{2}$. For any symbol $c(t, x, \xi) \in D(L^{q_1}_t L^{r_1}_x)(\mathbb{R} \times \mathbb{R}^4)$ that is zero homogeneous in $\xi$, we have
\[
\|(cp)(t, x, D) \phi\|_{L^q_t L^r_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|c\|_{D(L^{q_1}_t L^{r_1}_x)(\mathbb{R} \times \mathbb{R}^4)} \|\phi\|_{L^{q_2}_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}.
\]

By duality we obtain decomposable estimates for right quantizations.

Lemma 3.6.

Let $P$ be a pseudodifferential operator with symbol $p(t, x, \xi)$. Suppose that $P$ satisfies the fixed-time estimate
\[
\sup_{t \in \mathbb{R}} \|P(t, x, D)\|_{L^2_x \to L^2_x} \lesssim 1.
\]
Let $1 \le q < \infty$ and $1 \le q_1, q_2 \le \infty$ such that $\frac{1}{q} = \frac{1}{q_1} + \frac{1}{q_2}$. For any symbol $c(t, x, \xi) \in D(L^{q_1}_t L^\infty_x)(\mathbb{R} \times \mathbb{R}^4)$ that is zero homogeneous in $\xi$, the right-quantized operator $(cp)(D, y, t)$ has the following mapping property
\[
\bigl\| (cp)(D, y, t) \phi \bigr\|_{L^q_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|c\|_{D(L^{q_1}_t L^\infty_x)(\mathbb{R} \times \mathbb{R}^4)} \|\phi\|_{L^{q_2}_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}.
\]

Proof.

Let $1 < q' \le \infty$ be the conjugate exponent to $q$ and define $\frac{1}{\tilde{q}} = \frac{1}{q_1} + \frac{1}{q'}$. By duality, Hölder's inequality and Lemma 3.5, we have
\begin{align*}
\bigl\| (cp)(D, y, t) \phi \bigr\|_{L^q_t L^2_x} &= \sup_{\|\psi\|_{L^{q'}_t L^2_x} \le 1} \bigl\langle \psi, (cp)(D, y, t) \phi \bigr\rangle \\
&= \sup_{\|\psi\|_{L^{q'}_t L^2_x} \le 1} \bigl\langle (cp)(t, x, D) \psi, \phi \bigr\rangle \\
&\le \sup_{\|\psi\|_{L^{q'}_t L^2_x} \le 1} \bigl\| (cp)(t, x, D) \psi \bigr\|_{L^{\tilde{q}}_t L^2_x} \|\phi\|_{L^{q_2}_t L^2_x} \\
&\lesssim \sup_{\|\psi\|_{L^{q'}_t L^2_x} \le 1} \|c\|_{D(L^{q_1}_t L^\infty_x)} \|\psi\|_{L^{q'}_t L^2_x} \|\phi\|_{L^{q_2}_t L^2_x} \\
&\lesssim \|c\|_{D(L^{q_1}_t L^\infty_x)} \|\phi\|_{L^{q_2}_t L^2_x}. \qquad \Box
\end{align*}

From [21, Lemma 10.2] we have the following Hölder-type estimate for decomposable norms
\[
\Bigl\| \prod_{i=1}^{m} c_i \Bigr\|_{D(L^q_t L^r_x)} \lesssim \prod_{i=1}^{m} \|c_i\|_{D(L^{q_i}_t L^{r_i}_x)}, \tag{3.23}
\]
where $m \in \mathbb{N}$, $1 \le q, r, q_i, r_i \le \infty$ for $i = 1, \dots, m$ and $\bigl( \frac{1}{q}, \frac{1}{r} \bigr) = \sum_{i=1}^{m} \bigl( \frac{1}{q_i}, \frac{1}{r_i} \bigr)$.

3.2. Some symbol bounds for phases.

Recall that the magnetic potential A is assumed to besupported at frequencies .

1. For any integer k < < θ .

1, we usethe notation ψ ( θ ) k ( t , x , ξ ) = L ω ± ∆ − ω ⊥ Π ωθ A k · ω and ψ < k = X l < k P l ψ ± . Lemma 3.7.

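For any $t, s \in \mathbb{R}$, $x, y \in \mathbb{R}^4$, $\xi \in \mathbb{R}^4$ and any integer $k < 0$, it holds that
$$|\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi)| \lesssim \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr) \|\nabla_{t,x}A(0)\|_{L^2_x} \bigl(|t-s| + |x-y|\bigr). \tag{3.24}$$
Moreover, we have for any multi-index $\alpha \in \mathbb{N}_0^4$ with $1 \le |\alpha| \le \sigma^{-1}$ that
$$|\nabla^\alpha_\xi(\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi))| \lesssim \langle |t-s| + |x-y| \rangle^{\sigma(|\alpha| - \frac{1}{2})} \|\nabla_{t,x}A(0)\|_{L^2_x}. \tag{3.25}$$

Proof.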

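For any $t \in \mathbb{R}$, $x \in \mathbb{R}^4$, $\xi \in \mathbb{R}^4$ and any integer $k < 0$, we obtain that
$$\begin{aligned}
|\psi^{(\theta)}_k(t,x,\xi)| &\le \|L^\omega_\pm \Delta^{-1}_{\omega^\perp} \Pi^\omega_\theta P_k A \cdot \omega\|_{L^\infty_x} \\
&\lesssim (\theta^3 2^{4k})^{\frac{1}{2} - \frac{1}{\infty}} \|L^\omega_\pm \Delta^{-1}_{\omega^\perp} \Pi^\omega_\theta P_k A \cdot \omega\|_{L^2_x} \\
&\lesssim \theta^{\frac{3}{2}} 2^{2k} \, \theta \, 2^{-2k} \theta^{-2} \|\Pi^\omega_\theta P_k L^\omega_\pm A\|_{L^2_x} \\
&\lesssim \theta^{\frac{1}{2}} \|\nabla_{t,x}A_k\|_{L^2_x},
\end{aligned}$$
where we used Bernstein's inequality, the Coulomb gauge of $A$ and that $|\widehat{\Delta^{-1}_{\omega^\perp}}(\xi)| \sim 2^{-2k}\theta^{-2}$ on the frequency support of $\Pi^\omega_\theta P_k$. Similarly, we find
$$|\nabla_{t,x}\psi^{(\theta)}_k(t,x,\xi)| \lesssim 2^k \theta^{\frac{1}{2}} \|\nabla_{t,x}A_k\|_{L^2_x}.$$
Thus, we have
$$\begin{aligned}
|\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi)| &\le \sum_{-C_1 \le k < 0} \sum_{2^{\sigma k} < \theta < 2^{-C_2}} |\psi^{(\theta)}_k(t,x,\xi) - \psi^{(\theta)}_k(s,y,\xi)| + \sum_{k < -C_1} \sum_{2^{\sigma k} < \theta} |\psi^{(\theta)}_k(t,x,\xi) - \psi^{(\theta)}_k(s,y,\xi)| \\
&\lesssim \Bigl( \sum_{-C_1 \le k < 0} \sum_{2^{\sigma k} < \theta < 2^{-C_2}} 2^k \theta^{\frac{1}{2}} + \sum_{k < -C_1} \sum_{2^{\sigma k} < \theta} 2^k \theta^{\frac{1}{2}} \Bigr) \|\nabla_{t,x}A\|_{L^2_x} \bigl(|x-y| + |t-s|\bigr) \\
&\lesssim \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr) \|\nabla_{t,x}A\|_{L^2_x} \bigl(|x-y| + |t-s|\bigr).
\end{aligned}$$
We now turn to the proof of (3.25). To this end we note that differentiating with respect to $\xi$ yields $\theta^{-1}$ factors, i.e. for any $\alpha \in \mathbb{N}_0^4$ it holds that
$$|\nabla^\alpha_\xi \psi^{(\theta)}_k(t,x,\xi)| \lesssim \theta^{\frac{1}{2} - |\alpha|} \|\nabla_{t,x}A_k\|_{L^2_x} \quad \text{and} \quad |\nabla_{t,x}\nabla^\alpha_\xi \psi^{(\theta)}_k(t,x,\xi)| \lesssim 2^k \theta^{\frac{1}{2} - |\alpha|} \|\nabla_{t,x}A_k\|_{L^2_x}.$$
For any $1 \le |\alpha| \le \sigma^{-1}$ and $l < 0$ we obtain
$$\begin{aligned}
|\nabla^\alpha_\xi(\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi))| &\lesssim \sum_{k < l} \sum_{2^{\sigma k} < \theta} 2^k \theta^{\frac{1}{2} - |\alpha|} \|\nabla_{t,x}A\|_{L^2_x} \bigl(|x-y| + |t-s|\bigr) + \sum_{k \ge l} \sum_{2^{\sigma k} < \theta} \theta^{\frac{1}{2} - |\alpha|} \|\nabla_{t,x}A\|_{L^2_x} \\
&\lesssim 2^{l(1 - \sigma(|\alpha| - \frac{1}{2}))} \|\nabla_{t,x}A\|_{L^2_x} \bigl(|x-y| + |t-s|\bigr) + 2^{-\sigma l(|\alpha| - \frac{1}{2})} \|\nabla_{t,x}A\|_{L^2_x}.
\end{aligned}$$
Optimizing the choice of $l < 0$ yields
$$|\nabla^\alpha_\xi(\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi))| \lesssim \langle |t-s| + |x-y| \rangle^{\sigma(|\alpha| - \frac{1}{2})} \|\nabla_{t,x}A\|_{L^2_x}.$$
$\Box$

We will frequently use the following bounds on decomposable norms of the phase $\psi_\pm$.

Lemma 3.8 ([22, Lemma 7.3]). Let $2 \le q, r \le \infty$ with $\frac{2}{q} + \frac{3}{r} \le \frac{3}{2}$. For any integer $k < 0$ and any dyadic angle $\theta \in 2^{-\mathbb{N}}$ the component $\psi^{(\theta)}_k = L^\omega_\pm \Delta^{-1}_{\omega^\perp} \Pi^\omega_\theta A_k \cdot \omega$ satisfies
$$\bigl\| \bigl(\psi^{(\theta)}_k, 2^{-k}\nabla_{t,x}\psi^{(\theta)}_k\bigr) \bigr\|_{D_\theta(L^q_t L^r_x)(\mathbb{R} \times \mathbb{R}^4)} \lesssim 2^{-(\frac{1}{q} + \frac{4}{r})k} \, \theta^{\frac{1}{2} - \frac{2}{q} - \frac{3}{r}} \|\nabla_{t,x}A\|_{L^2_x}. \tag{3.26}$$

3.3. Oscillatory integral estimates.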

In order to prove the mapping properties in Theorem 3.4, we need pointwise kernel bounds for operators of the form
$$T_a = e^{-i\psi_\pm}(t,x,D) \, a(D) \, e^{\pm i(t-s)|D|} \, e^{i\psi_\pm}(D,y,s),$$
where $a$ is localized at frequency $|\xi| \sim 1$. The kernel of $T_a$ is given by the oscillatory integral
$$K_a(t,x;s,y) = \int_{\mathbb{R}^4} e^{-i(\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi))} e^{\pm i(t-s)|\xi|} e^{i(x-y)\cdot\xi} a(\xi) \, d\xi,$$
where $a$ is a smooth bump function with support on the annulus $|\xi| \sim 1$.

Lemma 3.9.

For any $t, s \in \mathbb{R}$, $x, y \in \mathbb{R}^4$ and any integer $1 \le N \le \sigma^{-1}$, we have
$$|K_a(t,x;t,y)| \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \langle |x-y| \rangle^{-N(1-\sigma)} \tag{3.27}$$
and
$$|K_a(t,x;t,y) - \check{a}(x-y)| \lesssim \min\bigl\{ \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr), |x-y|^{-N(1-\sigma)} \bigr\} \|\nabla_{t,x}A\|_{L^2_x}. \tag{3.28}$$
Moreover, it holds that
$$|K_a(t,x;s,y)| \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \langle t-s \rangle^{-\frac{3}{2}} \langle |t-s| - |x-y| \rangle^{-N}. \tag{3.29}$$

Proof. If $|x-y| \lesssim 1$, we trivially have $|K_a(t,x;t,y)| \lesssim 1$. If instead $|x-y| \gg 1$, we use (3.25) and integrate by parts repeatedly to obtain for any $1 \le N \le \sigma^{-1}$ that
$$|K_a(t,x;t,y)| \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} |x-y|^{-N(1-\sigma)}.$$
This proves (3.27). In order to show (3.28), we integrate by parts repeatedly for $|x-y| \gg 1$, while for $|x-y| \lesssim 1$, we use (3.24) to estimate
$$\begin{aligned}
|K_a(t,x;t,y) - \check{a}(x-y)| &\le \int_{\mathbb{R}^4} \bigl| e^{-i(\psi_\pm(t,x,\xi) - \psi_\pm(t,y,\xi))} - 1 \bigr| \, |a(\xi)| \, d\xi \\
&\le \int_{\mathbb{R}^4} |\psi_\pm(t,x,\xi) - \psi_\pm(t,y,\xi)| \, |a(\xi)| \, d\xi \\
&\lesssim \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr) \|\nabla_{t,x}A\|_{L^2_x} |x-y|.
\end{aligned}$$
Finally, the proof of (3.29) proceeds along the lines of Proposition 6(a) in [22]. We only have to argue a bit more that away from the cone for sufficiently large $|t-s|$ the phase is still non-degenerate. But this is because away from the cone
$$\Bigl| -i\nabla_\xi\bigl(\psi_\pm(t,x,\xi) - \psi_\pm(s,y,\xi)\bigr) \pm i(t-s)\frac{\xi}{|\xi|} + i(x-y) \Bigr| \ge c \langle |t-s| + |x-y| \rangle - C \|\nabla_{t,x}A\|_{L^2_x} \langle |t-s| + |x-y| \rangle^{\frac{\sigma}{2}}$$
and we choose $0 < \sigma \ll 1$ sufficiently small. $\Box$

To deal with the frequency localized operators $e^{\pm i\psi_\pm}_{<0}(t,x,D)$ and $e^{i\psi_\pm}_{<0}(D,y,s)$, we need to produce similar estimates for the kernel $K_{a,<0}$ of the operator
$$T_{a,<0} = e^{-i\psi_\pm}_{<0}(t,x,D) \, a(D) \, e^{\pm i(t-s)|D|} \, e^{i\psi_\pm}_{<0}(D,y,s).$$
Noting that the frequency localized symbol $e^{\pm i\psi_\pm}_{<0}$ can be represented as
$$e^{\pm i\psi_\pm}_{<0} = \int_{\mathbb{R}^{1+4}} m(z) \, e^{\pm i T_z \psi_\pm} \, dz,$$
where $m(z)$ is an integrable bump function on the unit scale and $T_z$ denotes space-time translation in the direction $z \in \mathbb{R}^{1+4}$, the transition to these frequency localized operators can be made just as in Proposition 7 in [22]. We obtain the following estimates for $K_{a,<0}$.

Lemma 3.10.

For any $t, s \in \mathbb{R}$, $x, y \in \mathbb{R}^4$ and any integer $1 \le N \le \sigma^{-1}$, we have
$$|K_{a,<0}(t,x;t,y)| \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \langle |x-y| \rangle^{-N(1-\sigma)} \tag{3.30}$$
and
$$|K_{a,<0}(t,x;t,y) - \check{a}(x-y)| \lesssim \min\bigl\{ \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr), |x-y|^{-N(1-\sigma)} \bigr\} \|\nabla_{t,x}A\|_{L^2_x}. \tag{3.31}$$
Moreover, it holds that
$$|K_{a,<0}(t,x;s,y)| \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \langle t-s \rangle^{-\frac{3}{2}} \langle |t-s| - |x-y| \rangle^{-N}. \tag{3.32}$$

3.4. Fixed-time $L^2_x$ estimates in Theorem 3.4. In this subsection we prove the fixed-time $L^2_x$ estimates in Theorem 3.4 using the above oscillatory integral estimates. To obtain a small factor $\varepsilon$ in the estimates (3.18) and (3.19), we additionally have to fix the constants $C_1, C_2 > 0$ in the definition of the phase correction $\psi_\pm$ sufficiently large.

Lemma 3.11.

For any $t \in \mathbb{R}$, we have
$$\bigl\| e^{\pm i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{L^2_x} \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \|P_0\phi\|_{L^2_x} \tag{3.33}$$
and
$$\bigl\| e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr\|_{L^2_x} \lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \|P_0\phi\|_{L^2_x}. \tag{3.34}$$

Proof. The claim follows immediately from the kernel bound (3.30) and a $TT^*$-argument. $\Box$

Lemma 3.12.

For any $\varepsilon > 0$ the constants $C_1, C_2 > 0$ in the definition of the phase correction $\psi_\pm$ can be chosen sufficiently large (depending on the size of $\varepsilon^{-1}$ and $\|\nabla_{t,x}A\|_{L^2_x}$) such that we have
$$\bigl\| (\nabla_{t,x} e^{-i\psi_\pm}_{<0})(t,x,D) P_0\phi \bigr\|_{L^2_x} \lesssim \varepsilon \|P_0\phi\|_{L^2_x}. \tag{3.35}$$

Proof. Using Lemma 3.5, Lemma 3.8, and (3.33) we obtain that
$$\begin{aligned}
\bigl\| (\nabla_{t,x} e^{-i\psi_\pm}_{<0})(t,x,D) P_0\phi \bigr\|_{L^2_x} &= \bigl\| (\nabla_{t,x}\psi_\pm) e^{-i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{L^2_x} \\
&\le \sum_{-C_1 \le k < 0} \sum_{2^{\sigma k} < \theta < 2^{-C_2}} \bigl\| (\nabla_{t,x}\psi^{(\theta)}_k) e^{-i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{L^2_x} + \sum_{k < -C_1} \sum_{2^{\sigma k} < \theta} \bigl\| (\nabla_{t,x}\psi^{(\theta)}_k) e^{-i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{L^2_x} \\
&\lesssim_{\|\nabla_{t,x}A\|_{L^2_x}} \sum_{-C_1 \le k < 0} \sum_{2^{\sigma k} < \theta < 2^{-C_2}} \|\nabla_{t,x}\psi^{(\theta)}_k\|_{D_\theta(L^\infty_t L^\infty_x)} \|P_0\phi\|_{L^2_x} + \sum_{k < -C_1} \sum_{2^{\sigma k} < \theta} \|\nabla_{t,x}\psi^{(\theta)}_k\|_{D_\theta(L^\infty_t L^\infty_x)} \|P_0\phi\|_{L^2_x} \\
&\lesssim \Bigl( \sum_{-C_1 \le k < 0} \sum_{2^{\sigma k} < \theta < 2^{-C_2}} 2^k \theta^{\frac{1}{2}} + \sum_{k < -C_1} \sum_{2^{\sigma k} < \theta} 2^k \theta^{\frac{1}{2}} \Bigr) \|\nabla_{t,x}A\|_{L^2_x} \|P_0\phi\|_{L^2_x} \\
&\lesssim \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr) \|\nabla_{t,x}A\|_{L^2_x} \|P_0\phi\|_{L^2_x},
\end{aligned}$$
from which the assertion follows. $\Box$

Lemma 3.13.

For any $\varepsilon > 0$ the constants $C_1, C_2 > 0$ in the definition of the phase correction $\psi_\pm$ can be chosen sufficiently large (depending on the size of $\varepsilon^{-1}$ and $\|\nabla_{t,x}A\|_{L^2_x}$) such that we have
$$\bigl\| \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) - 1 \bigr) P_0\phi \bigr\|_{L^2_x} \lesssim \varepsilon \|P_0\phi\|_{L^2_x}. \tag{3.36}$$

Proof. The integral kernel of $\bigl( e^{-i\psi_\pm}_{<0}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) - 1 \bigr) a(D)$ is given by $K_{a,<0}(t,x;t,y) - \check{a}(x-y)$. Using (3.31) we find that
$$\begin{aligned}
\sup_y \int_{\mathbb{R}^4} |K_{a,<0}(t,x;t,y) - \check{a}(x-y)| \, dx &\lesssim \int_{\mathbb{R}^4} \min\bigl\{ \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr), |x|^{-N(1-\sigma)} \bigr\} \|\nabla_{t,x}A\|_{L^2_x} \, dx \\
&\lesssim \inf_{R > 0} \bigl\{ \bigl(2^{-\frac{1}{2}C_2} + 2^{-C_1}\bigr) R^4 + R^{-(N(1-\sigma) - 4)} \bigr\} \|\nabla_{t,x}A\|_{L^2_x}.
\end{aligned}$$
Choosing $C_1, C_2 > 0$ sufficiently large depending on the size of $\varepsilon^{-1}$ and $\|\nabla_{t,x}A\|_{L^2_x}$, we obtain that
$$\sup_y \int_{\mathbb{R}^4} |K_{a,<0}(t,x;t,y) - \check{a}(x-y)| \, dx \le \varepsilon$$
and similarly for $\sup_x \int_{\mathbb{R}^4} |K_{a,<0}(t,x;t,y) - \check{a}(x-y)| \, dy$. The assertion then follows from Schur's lemma. $\Box$

Remark 3.14.

The fixed-time $L^2_x$ bounds from Lemma 3.11, Lemma 3.12 and Lemma 3.13 in fact hold for the operators $e^{\pm i\psi_{<l}}_{<k}(t,x,D)$, $e^{\pm i\psi_{<l}}_{k}(t,x,D)$, and $e^{\pm i\psi_{<l}}(t,x,D)$ for any $k, l < 0$. The proofs in this and the previous subsection can be easily adapted to obtain this assertion.

3.5. Modulation localized estimates. All implicit constants in this subsection may depend on the size of $\|\nabla_{t,x}A\|_{L^2_x}$.

Proposition 3.15. For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ and all integers $k \le k' \pm O(1) < 0$, it holds that
$$\bigl\| Q_k\bigl( \chi_{I_j} e^{\pm i\psi_\pm}_{k'}(t,x,D) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \, 2^{-\frac{1}{2}k} \, 2^{\delta(k-k')} \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.37}$$
In the proof of Proposition 3.15 we will use the following result whose proof will be given later.

Lemma 3.16. Let $1 \le q \le p \le \infty$. For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ and all integers $k + C \le l \le 0$, the following operator bound holds
$$\bigl\| \chi_{I_j} \bigl( e^{\pm i\psi_{<k}}_l \bigr)(t,x,D) \bigr\|_{L^p_t L^2_x(\mathbb{R} \times \mathbb{R}^4) \to L^q_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \, 2^{5(k-l)} \, 2^{(\frac{1}{p} - \frac{1}{q})k}.$$

Proof of Proposition 3.15.

In the following we denote an interval I j just by I and e ± i ψ ± k ′ stands forthe left quantization e ± i ψ ± k ′ ( t , x , D ).We ﬁrst reduce to the case k = k ′ ± O (1). To this end we will use that Proposition 9 and Lemma10 in [22] hold without the ε smallness gain also for large energies. We split Q k (cid:0) χ I e ± i ψ ± k ′ P φ (cid:1) = Q k (cid:0) Q < k − C ( χ I ) e ± i ψ ± k ′ P φ (cid:1) + Q k (cid:0) Q [ k − C , k + C ] ( χ I ) e ± i ψ ± k ′ P φ (cid:1) + Q k (cid:0) Q > k + C ( χ I ) e ± i ψ ± k ′ P φ (cid:1) . (3.38)For the ﬁrst term we obtain2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) Q < k − C ( χ I ) e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x = k (cid:13)(cid:13)(cid:13) Q k (cid:0) Q < k − C ( χ I ) Q k + O (1) e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . k (cid:13)(cid:13)(cid:13) Q k + O (1) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . δ ( k − k ′ ) k P φ k N ∗ , where we used Proposition 9 from [22]. We estimate the second term from (3.38) by2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) Q [ k − C , k + C ] ( χ I ) e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . k (cid:13)(cid:13)(cid:13) Q [ k − C , k + C ] ( χ I ) (cid:13)(cid:13)(cid:13) L t ( R ) (cid:13)(cid:13)(cid:13) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . k − k (cid:13)(cid:13)(cid:13) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . Using continuous Littlewood-Paley resolutions to decompose the group element we have e ± i ψ ± k ′ = e ± i ψ < k ′− C k ′ ± i Z l > k ′ − C S k ′ (cid:0) ψ l e ± i ψ < l (cid:1) dl . By Lemma 10 in [22] and the decomposable estimates (3.26) we ﬁnd (cid:13)(cid:13)(cid:13) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) e ± i ψ < k ′− C k ′ P φ (cid:13)(cid:13)(cid:13) L t L x + Z l > k ′ − C (cid:13)(cid:13)(cid:13) S k ′ (cid:0) ψ l e ± i ψ < l (cid:1) P φ (cid:13)(cid:13)(cid:13) L t L x dl . 
− k ′ k P φ k L ∞ t L x + Z l > k ′ − C k ψ l k D ( L t L ∞ x ) k e ± i ψ < l P φ k L ∞ t L x dl . − k ′ k P φ k L ∞ t L x + Z l > k ′ − C − l k∇ t , x A k L x k P φ k L ∞ t L x dl . − k ′ k P φ k N ∗ . Thus, the second term from (3.38) is bounded by2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) Q [ k − C , k + C ] ( χ I ) e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . ( k − k ′ ) k P φ k N ∗ . To estimate the third term from (3.38) we ﬁrst use Bernstein in time to obtain2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) Q > k + C ( χ I ) e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x ≤ k X j ≥ k + C (cid:13)(cid:13)(cid:13) Q k (cid:0) Q j ( χ I ) Q j + O (1) ( e ± i ψ ± k ′ P φ ) (cid:1)(cid:13)(cid:13)(cid:13) L t L x . k X j ≥ k + C k Q j ( χ I ) k L t (cid:13)(cid:13)(cid:13) Q j + O (1) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . k k ′ + C X j = k + C − j (cid:13)(cid:13)(cid:13) Q j + O (1) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x + k X j > k ′ + C − j (cid:13)(cid:13)(cid:13) Q j + O (1) e ± i ψ ± k ′ P φ (cid:13)(cid:13)(cid:13) L t L x . For the ﬁrst sum we use Proposition 9 from [22], for the second sum we ﬁrst note that there is nomodulation interference since k ′ < j − C and then use the ﬁxed-time L x → L x estimate for e ± i ψ ± k ′ .Hence, . k k ′ + C X j = k + C − j − j δ ( j − k ′ ) k∇ t , x A k L x k P φ k N ∗ + k X j > k ′ + C − j k P φ k X , ∞ . (2 δ ( j − k ′ ) + k − k ′ ) k P φ k N ∗ . Putting things together we ﬁnd that2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) χ I e ± i ψ ± k ′ P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . δ ( k − k ′ ) k P φ k N ∗ and for su ﬃ ciently large | k | ≫ | k ′ | we therefore trivially gain a smallness factor ε from 2 δ ( k − k ′ ) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 29

We are thus reduced to the case k = k ′ ± O (1) and it remains to show that2 k (cid:13)(cid:13)(cid:13) Q k (cid:0) χ I e ± i ψ ± k P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . ε k P φ k N ∗ . As in the proof of Proposition 9 in [22] we expand the untruncated group element e ± i ψ = e ± i ψ < k − C ± i Z l > k − C ψ l e ± i ψ < k − C dl − " l , l ′ > k − C ψ l ψ l ′ e ± i ψ < k − C dl ′ dl ∓ i $ l , l ′ , l ′′ > k − C ψ l ψ l ′ ψ l ′′ e ± i ψ < l ′′ dl ′′ dl ′ dl = Z + L + Q + C and estimate each of these terms seperately. Zero order term Z : From Lemma 3.16 we immediately obtain that (cid:13)(cid:13)(cid:13) Q k (cid:0) χ I e ± i ψ < k − C k P φ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . ε − k k P φ k N ∗ . Linear term L : We have to show that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) Q k (cid:16) χ I Z l > k − C S k (cid:0) ψ l e ± i ψ < k − C (cid:1) P φ dl (cid:17)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . ε − k k P φ k N ∗ . To this end we decompose ψ l into a small and a large angular part(3.39) ψ l = X σ l <θ< − C ψ ( θ ) l + X − C ≤ θ . ψ ( θ ) l . In order to bound the small angular part we split χ I = Q ≥ k − C ( χ I ) + Q < k − C ( χ I ) . Using Lemma 3.5, we estimate the ﬁrst term by (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) Q k (cid:18) Q ≥ k − C ( χ I ) Z l > k − C S k (cid:16) X σ l <θ< − C ( ψ ( θ ) l ) e ± i ψ < k − C (cid:17) P φ dl (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) Q ≥ k − C ( χ I ) (cid:13)(cid:13)(cid:13) L t Z l > k − C X σ l <θ< − C k ψ ( θ ) l k D θ ( L t L ∞ x ) (cid:13)(cid:13)(cid:13) e ± i ψ < k − C P φ (cid:13)(cid:13)(cid:13) L ∞ t L x dl . − k Z l > k − C X σ l <θ< − C θ − l k∇ t , x A k L x k P φ k L ∞ t L x dl . − C − k k P φ k N ∗ . 
For the second term we have Q k (cid:18) Q < k − C ( χ I ) Z l > k − C S k (cid:16) X σ l <θ< − C ψ ( θ ) l e ± i ψ < k − C (cid:17) P φ dl (cid:19) = Q k (cid:18) Q < k − C ( χ I ) Q k + O (1) Z l > k − C S k (cid:16) X σ l <θ< − C ψ ( θ ) l e ± i ψ < k − C (cid:17) P φ dl (cid:19) . Then since ψ ( θ ) l is a free wave, we can write this as Q k (cid:18) Q < k − C ( χ I ) Q k + O (1) Z l > k − C S k (cid:16) X σ l <θ< − C ψ ( θ ) l e ± i ψ < k − C (cid:17) Q k + O (1) P φ dl (cid:19) and estimate by (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) Q k (cid:18) Q < k − C ( χ I ) Q k + O (1) Z l > k − C S k (cid:16) X σ l <θ< − C ψ ( θ ) l e ± i ψ < k − C (cid:17) Q k + O (1) P φ dl (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . Z l > k − C X σ l <θ< − C (cid:13)(cid:13)(cid:13) ψ ( θ ) l (cid:13)(cid:13)(cid:13) D θ ( L t L ∞ x ) (cid:13)(cid:13)(cid:13) e ± i ψ < k − C Q k + O (1) P φ (cid:13)(cid:13)(cid:13) L t L x dl . Z l > k − C X σ l <θ< − C − l − θ k∇ t , x A k L x (cid:13)(cid:13)(cid:13) Q k + O (1) P φ (cid:13)(cid:13)(cid:13) L t L x dl . − C − k k (cid:13)(cid:13)(cid:13) Q k + O (1) P φ (cid:13)(cid:13)(cid:13) L t L x . − C − k k P φ k N ∗ . Here we used Lemma 3.5, the ﬁxed-time L x → L x estimate for e ± iT z ψ < k − C and then Bernstein intime.The large angular part in (3.39) has to be estimated more carefully. Noting that the symbollocalization S k (cid:0) ψ l e ± i ψ < k − C (cid:1) can be represented as(3.40) S k (cid:0) ψ l e ± i ψ < k − C (cid:1) = Z R + z m k ( z )( T z ψ l ) e ± iT z ψ < k − C dz , where m k is an integrable bump function at scale 2 − k and T z denotes translation in space-timedirection z ∈ R + , we derive the following key estimate for the large angular part (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) Q k (cid:18) χ I Z l > k − C Z R + z m k ( z ) (cid:16) X − C ≤ θ . ( T z ψ ( θ ) l ) (cid:17) e ± iT z ψ < k − C P φ dz dl (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . 
C / − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18)Z l > k − C − ( l − k ) Z R + z | m k ( z ) | X − C ≤ θ . X Γ νθ sup ω ∈ Γ νθ (cid:16) − l (cid:13)(cid:13)(cid:13) b νθ ( ω ) Π ωθ ∇ t , x T z A l (cid:13)(cid:13)(cid:13) L x (cid:17) dz dl (cid:19) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t ( I ) ×× (cid:18)Z l > k − C − ( l − k ) Z R + z | m k ( z ) | (cid:13)(cid:13)(cid:13) e ± iT z ψ < k − C P φ (cid:13)(cid:13)(cid:13) L ∞ t L x dz dl (cid:19) . C / − k ×× (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18)Z l > k − C − ( l − k ) Z R + z | m k ( z ) | X − C ≤ θ . X Γ νθ sup ω ∈ Γ νθ (cid:16) − l (cid:13)(cid:13)(cid:13) b νθ ( ω ) Π ωθ ∇ t , x T z A l (cid:13)(cid:13)(cid:13) L x (cid:17) dz dl (cid:19) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t ( I ) k P φ k N ∗ . This estimate can be proven by carefully opening up the proof of the decomposable estimates inLemma 3.5. We emphasize that uniformly for all integers k <

0, the quantity (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18)Z l > k − C − ( l − k ) Z R + z | m k ( z ) | X − C ≤ θ . X Γ νθ sup ω ∈ Γ νθ (cid:16) − l (cid:13)(cid:13)(cid:13) b νθ ( ω ) Π ωθ ∇ t , x T z A l (cid:13)(cid:13)(cid:13) L x (cid:17) dz dl (cid:19) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t ( R ) ≤ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18)Z R X k < l + C − ( l − k ) Z R + z | m k ( z ) | X − C ≤ θ . X Γ νθ sup ω ∈ Γ νθ (cid:16) − l (cid:13)(cid:13)(cid:13) b νθ ( ω ) Π ωθ ∇ t , x T z A l (cid:13)(cid:13)(cid:13) L x (cid:17) dz dl (cid:19) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t ( R ) is bounded by k∇ t , x A k L x by Strichartz estimates.By ﬁrst ﬁxing C > ﬃ ciently large and then suitably choosing the intervals I j , the estimateof the linear term L follows. ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 31

Quadratic and cubic terms $Q$ and $C$: Using the above ideas these can be estimated similarly. We omit the details. $\Box$
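The expansion of the untruncated group element into zero order, linear, quadratic and cubic terms can be derived by iterating the fundamental theorem of calculus in the frequency parameter. The sketch below assumes the continuous Littlewood-Paley convention $\psi_{<l} = \int_{l' < l} \psi_{l'} \, dl'$ with $\psi = \psi_{<0}$, and uses the shorthand $h = k - C$ (introduced here only for readability); the ordering of the integration variables is part of this assumption.

```latex
% Differentiating in the frequency parameter l and integrating back:
\begin{align*}
  \frac{d}{dl} e^{\pm i \psi_{<l}} &= \pm i \, \psi_l \, e^{\pm i \psi_{<l}}, \\
  e^{\pm i \psi_{<l}} &= e^{\pm i \psi_{<h}}
      \pm i \int_{h < l' < l} \psi_{l'} \, e^{\pm i \psi_{<l'}} \, dl'
      \qquad (h < l \le 0).
\end{align*}
% Taking l = 0 and substituting the second identity into itself twice,
% with (\pm i)^2 = -1 and (\pm i)^3 = \mp i:
\begin{align*}
  e^{\pm i \psi}
  &= \underbrace{e^{\pm i \psi_{<h}}}_{Z}
   \; \underbrace{\pm \, i \int_{h < l < 0} \psi_l \, e^{\pm i \psi_{<h}} \, dl}_{L}
   \; \underbrace{- \iint_{h < l' < l < 0} \psi_l \psi_{l'} \,
        e^{\pm i \psi_{<h}} \, dl' \, dl}_{Q} \\
  &\quad \underbrace{\mp \, i \iiint_{h < l'' < l' < l < 0}
        \psi_l \psi_{l'} \psi_{l''} \, e^{\pm i \psi_{<l''}} \,
        dl'' \, dl' \, dl}_{C} .
\end{align*}
```

Only the cubic remainder retains a phase truncated at the running frequency $l''$, while the first three terms all carry the fixed low-frequency phase $e^{\pm i\psi_{<k-C}}$.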

Proof of Lemma 3.16.

As in [22, Lemma 10] we write the symbol as S l e ± i ψ < k = ( ± i ) − l Y r = [ S ( r ) l ∂ t ψ < k ] e ± i ψ < k , where the product denotes a nested multiplication by S l ∂ t ψ < k for a series of frequency cuto ﬀ s S ( r + l S ( r ) l = S ( r ) l ≈ S l with expanding widths. Then we have S ( r ) l ∂ t ψ < k = Z R + zr m ( r ) l ( z r )( T z r ∂ t ψ < k ) dz r , where m ( r ) l is an integrable bump function at scale 2 − l and T z r denotes translation in space-timedirection z r ∈ R + . The claim now reduces to proving that the intervals I j can be chosen such thatuniformly for j = , . . . , J and all integers k ≤ l − C , it holds that(3.41) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) χ I j (cid:18) Y r = Z R + zr m ( r ) l ( z r )( T z r ∂ t ψ < k ) (cid:19) e ± i ψ < k ( t , x , D ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L pt L x → L qt L x . ε k l ( p − q ) k . To this end we show that the intervals I j can be chosen such that uniformly for j = , . . . , J , allintegers k ≤ l − C and all integers k , . . . , k < k , we have the operator bound (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) χ I j (cid:18) Y r = Z R + zr m ( r ) l ( z r )( T z r ∂ t ψ k r ) (cid:19) e ± i ψ < k ( t , x , D ) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L pt L x → L qt L x . ε l − q k (1 − q ) k · · · (1 − q ) k , (3.42)where q = q − p . By summing over dyadic frequencies, the estimate (3.41) then follows.In order to prove (3.42), we split ∂ t ψ k into a small and a large angular part ∂ t ψ k = X σ k <θ < − C ∂ t ψ ( θ ) k + X − C ≤ θ . ∂ t ψ ( θ ) k for some constant C > ﬃ ciently large later in the proof. 
We estimate the smallangular part using H ¨older-type estimates for decomposable function spaces (3.23) and the bounds(3.26) for the phase, (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) χ I j (cid:18) X σ k <θ < − C Z R + z m (1) l ( z )( T z ∂ t ψ ( θ ) k ) dz (cid:19)(cid:18) Y r = Z R + zr m ( r ) l ( z r )( T z r ∂ t ψ k r ) dz r (cid:19) e ± i ψ < k ( t , x , D ) φ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L x . (cid:18) X σ k <θ < − C Z R + z | m (1) l ( z ) | (cid:13)(cid:13)(cid:13) T z ∂ t ψ ( θ ) k (cid:13)(cid:13)(cid:13) D θ ( L qt L ∞ x ) dz (cid:19)(cid:18) Y r = Z R + zr | m ( r ) l ( z r ) | (cid:13)(cid:13)(cid:13) T z r ∂ t ψ k r (cid:13)(cid:13)(cid:13) D ( L qt L ∞ x ) dz r (cid:19) k φ k L pt L x . (cid:18) X σ k <θ < − C (1 − q ) k θ − q k∇ t , x A k L x (cid:19)(cid:18) Y r = (1 − q ) k r k∇ t , x A k L x (cid:19) k φ k L pt L x . − C ( − q ) l − q k (1 − q ) k · · · (1 − q ) k k∇ t , x A k L x k φ k L pt L x . Here we dropped the time cuto ﬀ χ I j and used the space-time translation invariance of the decom-posable function spaces. For the large angular part we establish the crucial estimate (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) χ I j (cid:18) X − C ≤ θ . Z R + z m (1) l ( z )( T z ∂ t ψ ( θ ) k ) dz (cid:19)(cid:18) Y r = Z R + zr m ( r ) l ( z r )( T z r ∂ t ψ k r ) dz r (cid:19) e ± i ψ < k ( t , x , D ) φ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L x . l − q k (1 − q ) k · · · (1 − q ) k k∇ t , x A k L x C / ×× (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18) k − l Z R + z | m (1) l ( z ) | X − C ≤ θ . X Γ ν θ sup ω ∈ Γ ν θ (cid:16) ( q + r − k k b ν θ ( ω ) Π ωθ ∇ t , x T z A k k L r x (cid:17) dz (cid:19) / (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt ( I j ) k φ k L pt L x , where r ≥ q , r ) is sharp wave admissible. 
This estimate can beproven by carefully opening up the proof of Lemma 3.5 and of H ¨older-type estimates for decom-posable function spaces (3.23).Noting that uniformly for all integers k ≤ l − C , the quantity (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18) k − l Z R + z | m (1) l ( z ) | X − C ≤ θ . X Γ ν θ sup ω ∈ Γ ν θ (cid:16) ( q + r − k k b ν θ ( ω ) Π ωθ ∇ t , x T z A k k L r x (cid:17) dz (cid:19) / (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt ( R ) ≤ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:18)X l ∈ Z X k ≤ l − C k − l Z R + z | m (1) l ( z ) | X − C ≤ θ . X Γ ν θ sup ω ∈ Γ ν θ (cid:16) ( q + r − k k b ν θ ( ω ) Π ωθ ∇ t , x T z A k k L r x (cid:17) dz (cid:19) / (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt ( R ) is bounded by k∇ t , x A k L x by Strichartz estimates, the assertion follows by ﬁrst choosing C > ﬃ ciently large and then suitably choosing the intervals I j . (cid:3) Proposition 3.17.

For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ and all integers $k \le k' \pm O(1) < 0$, it holds that
$$\bigl\| Q_k\bigl( \chi_{I_j} e^{-i\psi_\pm}_{k'}(D,y,t) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \, 2^{-\frac{1}{2}k} \, 2^{\delta(k-k')} \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.43}$$

Proof. The proof proceeds analogously to the one of Proposition 3.15 using Lemma 3.6 in place of Lemma 3.5. $\Box$
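The bounds on frequency localized time cutoffs that enter these arguments, for instance $\|Q_{\ge k-C}(\chi_I)\|_{L^2_t} \lesssim 2^{-\frac{1}{2}k}$ uniformly in the interval $I$, follow from a direct computation. The sketch below assumes that $Q_{\ge m}$ acts on a function of $t$ alone as a Fourier cutoff to time frequencies $|\tau| \gtrsim 2^m$, and writes $I = [a,b]$; the Fourier normalization is likewise an assumption.

```latex
% Fourier transform in time of the sharp cutoff to I = [a,b]:
\[
  \widehat{\chi_I}(\tau) = \int_a^b e^{-i t \tau} \, dt
  = \frac{e^{-i a \tau} - e^{-i b \tau}}{i \tau},
  \qquad
  |\widehat{\chi_I}(\tau)| \le \frac{2}{|\tau|} .
\]
% Plancherel, restricted to time frequencies |\tau| \gtrsim 2^m:
\[
  \| Q_{\ge m} \chi_I \|_{L^2_t}^2
  \lesssim \int_{|\tau| \gtrsim 2^m} \frac{d\tau}{\tau^2}
  \lesssim 2^{-m},
  \qquad \text{hence} \qquad
  \| Q_{\ge m} \chi_I \|_{L^2_t} \lesssim 2^{-\frac{1}{2} m} .
\]
```

The bound does not depend on the endpoints $a$ and $b$, which is what allows the estimates to hold uniformly over the intervals $I_j$; with $m = k - C$ the constant $2^{\frac{1}{2}C}$ is absorbed into the implicit constant.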

3.6. Proof of the $N_0 \to N_0$ and $N_0^* \to N_0^*$ bounds (3.18) for $\chi_{I_j} e^{\pm i\psi_\pm}_{<0}$.

Proposition 3.18. For $j = 1, \ldots, J$ it holds that
$$\bigl\| \chi_{I_j} e^{\pm i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|P_0\phi\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \tag{3.44}$$
and
$$\bigl\| \chi_{I_j} e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|P_0\phi\|_{N_0(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.45}$$

Proof. We begin with the proof of (3.44). To simplify the notation we denote an interval $I_j$ just by $I$ in what follows. If $\phi$ is an $L^1_t L^2_x$ atom, the claim follows immediately from the fixed-time $L^2_x \to L^2_x$ estimate for $e^{\pm i\psi_\pm}_{<0}(t,x,D)$. The key point is therefore to show that if $\phi$ is an $X^{0,-\frac{1}{2}}_1$ atom at modulation $k$, then we have
$$\bigl\| \chi_I e^{\pm i\psi_\pm}_{<0}(t,x,D) Q_k P_0\phi \bigr\|_{N_0} \lesssim 2^{-\frac{1}{2}k} \|P_0\phi\|_{L^2_t L^2_x}.$$
By duality, this is equivalent to proving
$$\bigl\| Q_k\bigl( \chi_I e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{-\frac{1}{2}k} \|P_0\phi\|_{N_0^*}.$$
As in [22, Proposition 9.1] we now write
$$\begin{aligned}
Q_k\bigl( \chi_I e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr) &= Q_k\bigl( \chi_I e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr) + Q_k\bigl( \chi_I ( e^{\pm i\psi_\pm}_{<0} - e^{\pm i\psi_\pm}_{<k-C} )(D,y,t) P_0\phi \bigr) \\
&= Q_k\bigl( Q_{<k-C}(\chi_I) e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr) + Q_k\bigl( Q_{\ge k-C}(\chi_I) e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr) \\
&\quad + Q_k\bigl( \chi_I ( e^{\pm i\psi_\pm}_{<0} - e^{\pm i\psi_\pm}_{<k-C} )(D,y,t) P_0\phi \bigr).
\end{aligned}$$
For the first term we obtain
$$2^{\frac{1}{2}k} \bigl\| Q_k\bigl( Q_{<k-C}(\chi_I) e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \lesssim \|P_0\phi\|_{X^{0,\frac{1}{2}}_\infty} \lesssim \|P_0\phi\|_{N_0^*},$$
because the output modulation directly transfers to $\phi$. We estimate the second term by
$$2^{\frac{1}{2}k} \bigl\| Q_k\bigl( Q_{\ge k-C}(\chi_I) e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{1}{2}k} \bigl\| Q_{\ge k-C}(\chi_I) \bigr\|_{L^2_t} \bigl\| e^{\pm i\psi_\pm}_{<k-C}(D,y,t) P_0\phi \bigr\|_{L^\infty_t L^2_x} \lesssim \|P_0\phi\|_{N_0^*},$$
where we used that $\| Q_{\ge k-C}(\chi_I) \|_{L^2_t} \lesssim 2^{-\frac{1}{2}k}$. To deal with the last term we use Proposition 3.17.

The proof of (3.45) works similarly using Proposition 3.15. $\Box$

In a similar vein we obtain the following $N_0^* \to N_0^*$ bounds.

Proposition 3.19.

For $j = 1, \ldots, J$ it holds that
$$\bigl\| \chi_{I_j} e^{\pm i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \tag{3.46}$$
and
$$\bigl\| \chi_{I_j} e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \lesssim \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.47}$$

3.7. Proof of the $N_0 \to \varepsilon N_0$ and $N_0^* \to \varepsilon N_0^*$ bounds (3.19) for $\chi_{I_j} \partial_t e^{\pm i\psi_\pm}_{<0}$.

Proposition 3.20. For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ it holds that
$$\bigl\| \chi_{I_j} \partial_t e^{\pm i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \tag{3.48}$$
and
$$\bigl\| \chi_{I_j} \partial_t e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.49}$$

Proof. We proceed as in the proof of Proposition 3.18 using the $L^2_x \to \varepsilon L^2_x$ bound for $\partial_t e^{\pm i\psi_\pm}_{<0}$ and that we have for $k \le k' \pm O(1)$,
$$\bigl\| Q_k\bigl( \chi_{I_j} \partial_t e^{-i\psi_\pm}_{k'} P_0\phi \bigr) \bigr\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \, 2^{-\frac{1}{2}k} \, 2^{\delta(k-k')} \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}$$
for both left and right quantization. The latter estimate can be proven similarly to the proof of Proposition 3.15. $\Box$

Proposition 3.21. For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ it holds that
$$\bigl\| \chi_{I_j} \partial_t e^{\pm i\psi_\pm}_{<0}(t,x,D) P_0\phi \bigr\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \tag{3.50}$$
and
$$\bigl\| \chi_{I_j} \partial_t e^{\pm i\psi_\pm}_{<0}(D,y,t) P_0\phi \bigr\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.51}$$

3.8. Proof of the renormalization error estimate (3.20).

Proposition 3.22. For any $\varepsilon > 0$ the intervals $I_j$ can be chosen such that uniformly for all $j = 1, \ldots, J$ we have
$$\bigl\| \chi_{I_j} \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) e^{+i\psi_\pm}_{<0}(D,y,t) - 1 \bigr) P_0\phi \bigr\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0(\mathbb{R} \times \mathbb{R}^4)} \tag{3.52}$$
and
$$\bigl\| \chi_{I_j} \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) e^{+i\psi_\pm}_{<0}(D,y,t) - 1 \bigr) P_0\phi \bigr\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}. \tag{3.53}$$

Proof.

We prove the $N_0^* \to \varepsilon N_0^*$ estimate (3.53). The bound (3.52) then follows by duality. The $L^\infty_t L^2_x$ part of (3.53) follows immediately from the fixed-time $L^2_x \to \varepsilon L^2_x$ estimate (3.36). The $X^{0,\frac{1}{2}}_\infty$ part reduces to showing that we can choose the intervals $I_j$ such that uniformly for $j = 1, \ldots, J$ and all $k \in \mathbb{Z}$,
$$2^{\frac{1}{2}k} \bigl\| Q_k\bigl( \chi_{I_j} \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) - 1 \bigr) P_0\phi \bigr) \bigr\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \varepsilon \|P_0\phi\|_{N_0^*(\mathbb{R} \times \mathbb{R}^4)}.$$
We use the notation
$$R_{<k} = e^{-i\psi_\pm}_{<k}(t,x,D) \, e^{i\psi_\pm}_{<k}(D,y,t)$$
to write
$$\begin{aligned}
Q_k\bigl( \chi_{I_j} (R_{<0} - 1) P_0\phi \bigr) &= Q_k\bigl( \chi_{I_j} (R_{<0} - 1) Q_{>k-C} P_0\phi \bigr) + Q_k\bigl( \chi_{I_j} (R_{<k-C} - 1) Q_{\le k-C} P_0\phi \bigr) \\
&\quad + Q_k\bigl( \chi_{I_j} (R_{<0} - R_{<k-C}) Q_{\le k-C} P_0\phi \bigr).
\end{aligned} \tag{3.54}$$
Using the fixed-time $L^2_x \to \varepsilon L^2_x$ estimate (3.36) for $(R_{<0} - 1)$ we bound the first term by
$$2^{\frac{1}{2}k} \bigl\| Q_k\bigl( \chi_{I_j} (R_{<0} - 1) Q_{>k-C} P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{1}{2}k} \bigl\| (R_{<0} - 1) Q_{>k-C} P_0\phi \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{1}{2}k} \varepsilon \|Q_{>k-C} P_0\phi\|_{L^2_t L^2_x} \lesssim \varepsilon \|P_0\phi\|_{X^{0,\frac{1}{2}}_\infty}.$$
To estimate the second term in (3.54) we observe that we have
$$Q_k\bigl( \chi_{I_j} (R_{<k-C} - 1) Q_{\le k-C} P_0\phi \bigr) = Q_k\bigl( (Q_{[k-C,k+C]}\chi_{I_j}) (R_{<k-C} - 1) Q_{\le k-C} P_0\phi \bigr)$$
and hence by the fixed-time $L^2_x \to \varepsilon L^2_x$ estimate for $(R_{<k-C} - 1)$ we obtain
$$2^{\frac{1}{2}k} \bigl\| Q_k\bigl( \chi_{I_j} (R_{<k-C} - 1) Q_{\le k-C} P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \lesssim 2^{\frac{1}{2}k} \bigl\| Q_{[k-C,k+C]}(\chi_{I_j}) \bigr\|_{L^2_t} \bigl\| (R_{<k-C} - 1) Q_{\le k-C} P_0\phi \bigr\|_{L^\infty_t L^2_x} \lesssim \varepsilon \|P_0\phi\|_{L^\infty_t L^2_x}.$$
Finally, we expand the third term in (3.54) as follows
$$\begin{aligned}
Q_k\bigl( \chi_{I_j} (R_{<0} - R_{<k-C}) Q_{\le k-C} P_0\phi \bigr) &= Q_k\Bigl( \chi_{I_j} \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) - e^{-i\psi_\pm}_{<k-C}(t,x,D) \bigr) e^{i\psi_\pm}_{<0}(D,y,t) Q_{\le k-C} P_0\phi \Bigr) \\
&\quad + Q_k\Bigl( \chi_{I_j} e^{-i\psi_\pm}_{<k-C}(t,x,D) \bigl( e^{i\psi_\pm}_{<0}(D,y,t) - e^{i\psi_\pm}_{<k-C}(D,y,t) \bigr) Q_{\le k-C} P_0\phi \Bigr).
\end{aligned}$$
To handle the first term in the above expansion we use Proposition 3.15 and the $N_0^* \to N_0^*$ estimate (3.47) for $e^{i\psi_\pm}_{<0}(D,y,t)$ to find that
$$\begin{aligned}
&2^{\frac{1}{2}k} \Bigl\| Q_k\Bigl( \chi_{I_j} \bigl( e^{-i\psi_\pm}_{<0}(t,x,D) - e^{-i\psi_\pm}_{<k-C}(t,x,D) \bigr) e^{i\psi_\pm}_{<0}(D,y,t) Q_{\le k-C} P_0\phi \Bigr) \Bigr\|_{L^2_t L^2_x} \\
&\quad \lesssim \sum_{k-C \le k' < 0} 2^{\frac{1}{2}k} \bigl\| Q_k\bigl( \chi_{I_j} e^{-i\psi_\pm}_{k'}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) Q_{\le k-C} P_0\phi \bigr) \bigr\|_{L^2_t L^2_x} \\
&\quad \lesssim \sum_{k-C \le k' < 0} \varepsilon \, 2^{\delta(k-k')} \bigl\| e^{i\psi_\pm}_{<0}(D,y,t) Q_{\le k-C} P_0\phi \bigr\|_{N_0^*} \\
&\quad \lesssim \varepsilon \|P_0\phi\|_{N_0^*}.
\end{aligned}$$
Observing that
$$\begin{aligned}
&Q_k\Bigl( \chi_{I_j} e^{-i\psi_\pm}_{<k-C}(t,x,D) \bigl( e^{i\psi_\pm}_{<0}(D,y,t) - e^{i\psi_\pm}_{<k-C}(D,y,t) \bigr) Q_{\le k-C} P_0\phi \Bigr) \\
&\quad = Q_k\Bigl( e^{-i\psi_\pm}_{<k-C}(t,x,D) \, Q_{k+O(1)}\Bigl( \chi_{I_j} \bigl( e^{i\psi_\pm}_{<0}(D,y,t) - e^{i\psi_\pm}_{<k-C}(D,y,t) \bigr) Q_{\le k-C} P_0\phi \Bigr) \Bigr),
\end{aligned}$$
we estimate the second term analogously using the fixed-time $L^2_x \to L^2_x$ estimate for $e^{-i\psi_\pm}_{<k-C}(t,x,D)$ and Proposition 3.17. $\Box$

3.9. Proof of the renormalization error estimate (3.21). This estimate can be proven by adapting the proof in [22, Section 10.2] to our large data setting using similar ideas as above. The additional errors generated by the high-angle cutoff for intermediate frequencies in the definition of the phase correction $\psi_\pm$ can be controlled by divisibility of suitable space-time norms of $A$.

3.10. Proof of the dispersive estimate (3.22). Since the $S$ space is compatible with time localizations by Lemma 2.1, the dispersive estimate (3.22) follows immediately from the estimate (83) in [22].

4. Breakdown criterion

Definition 4.1.

Let $T_0, T_1 > 0$. For any admissible solution $(A, \phi)$ to the MKG-CG system on $(-T_0, T_1) \times \mathbb{R}^4$, we define

$$\|(A, \phi)\|_{S^1((-T_0, T_1) \times \mathbb{R}^4)} := \sup_{0 < T < T_0,\; 0 < T' < T_1} \Big( \sum_{j=1}^{4} \|A_j\|_{S^1([-T, T'] \times \mathbb{R}^4)} + \|\phi\|_{S^1([-T, T'] \times \mathbb{R}^4)} \Big).$$

We establish the following blowup criterion for admissible solutions to the MKG-CG system.

Proposition 4.2.

Let $(-T_0, T_1)$ be the maximal interval of existence of an admissible solution $(A, \phi)$ to the MKG-CG system. If $\|(A, \phi)\|_{S^1((-T_0, T_1) \times \mathbb{R}^4)} < \infty$, then it must hold that $T_0 = T_1 = \infty$.

The idea of the proof of Proposition 4.2 is to establish an a priori bound on a subcritical norm

$$\sup_{t \in (-T_0, T_1)} \sum_{j=1}^{4} \big\| A_j[t] \big\|_{H^s_x \times H^{s-1}_x} + \big\| \phi[t] \big\|_{H^s_x \times H^{s-1}_x} < \infty$$

for some $s > 1$. By the local well-posedness result [36] for the MKG-CG system it then follows that the solution can be smoothly extended beyond the time interval $(-T_0, T_1)$. To this end, we will use Tao's device of frequency envelopes. For sufficiently small $\sigma > 0$ we define for all $k \in \mathbb{Z}$,

$$c_k := \bigg( \sum_{l \in \mathbb{Z}} 2^{-\sigma |k - l|} \Big( \sum_{j=1}^{4} \big\| P_l A_j[0] \big\|^2_{\dot{H}^1_x \times L^2_x} + \big\| P_l \phi[0] \big\|^2_{\dot{H}^1_x \times L^2_x} \Big) \bigg)^{1/2}.$$

Proposition 4.2 is then a consequence of the following result.

Proposition 4.3.

Let $(-T_0, T_1)$ be the maximal interval of existence of an admissible solution $(A, \phi)$ to the MKG-CG system. If $\|(A, \phi)\|_{S^1((-T_0, T_1) \times \mathbb{R}^4)} < \infty$, there exists $C = C\big(\|(A, \phi)\|_{S^1((-T_0, T_1) \times \mathbb{R}^4)}\big) < \infty$ such that for all $k \in \mathbb{Z}$,

$$\big\| P_k A \big\|_{S^1_k((-T_0, T_1) \times \mathbb{R}^4)} + \big\| P_k \phi \big\|_{S^1_k((-T_0, T_1) \times \mathbb{R}^4)} \leq C c_k. \tag{4.1}$$

Proof. A sketch of the proof is given in Subsection 7.4. □
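To make the role of the frequency envelope concrete, here is a minimal numerical sketch; it is an illustration and not part of the paper's argument. It assumes the $\ell^2$-type form $c_k = \big(\sum_l 2^{-\sigma|k-l|} a_l^2\big)^{1/2}$, where $a_l$ stands for the dyadic energy norm of the data at frequency $2^l$. The finite list `a`, the function name `frequency_envelope` and the value of `sigma` are illustrative assumptions. The sketch exhibits the two properties that make envelopes useful: $c_k$ dominates $a_k$, and $c_k$ varies slowly in $k$.

```python
def frequency_envelope(a, sigma):
    """Return the envelope c with c[k]^2 = sum_l 2^(-sigma*|k-l|) * a[l]^2.

    a     : list of nonnegative dyadic norms, indexed by frequency index l
    sigma : small positive decay parameter
    """
    n = len(a)
    return [
        sum(2.0 ** (-sigma * abs(k - l)) * a[l] ** 2 for l in range(n)) ** 0.5
        for k in range(n)
    ]


# Toy data concentrated at a single dyadic frequency (hypothetical values).
a = [0.0, 0.0, 1.0, 0.0, 0.0]
sigma = 0.1
c = frequency_envelope(a, sigma)

# Constant in the bound sum_k c_k^2 <= C * sum_l a_l^2, from the geometric series.
C = (1.0 + 2.0 ** (-sigma)) / (1.0 - 2.0 ** (-sigma))
```

In this toy case the envelope spreads the single spike over neighbouring frequencies with slow exponential decay, while its square sum stays comparable to that of `a`. This is the mechanism by which a bound of the form (4.1) transfers the frequency localization of the data to the solution.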

5. A concept of weak evolution

In order to implement the contradiction argument after the concentration compactness step, we have to define the notion of a solution to the MKG-CG system that is merely of energy class. In the context of critical wave maps in [20] this is achieved by first approximating an energy class datum by Schwartz class data in the energy topology. One then defines the energy class solution as a suitable limit of the associated Schwartz class solutions. Using perturbation theory, one shows that this limit is well-defined and independent of the approximating sequence.

For the MKG-CG system we have to argue more carefully, because it appears that the strong perturbative step in the context of the critical wave maps in [20] is not available due to a low frequency divergence. However, the problem with evolving irregular data is really a "high frequency issue" and it appears that truncating high frequencies away does not lead to the same problems as a general perturbative step. More concretely, consider Coulomb energy class data at time $t = 0$. By truncating in frequency, we can assume that the frequency support of either input is at $|\xi| \leq K$ for some $K > 0$. Then the problem becomes to show that we can add high-frequency perturbations to the data, i.e. supported in frequency space at $|\xi| > K$ at time $t = 0$, and obtain a perturbed global evolution.

Proposition 5.1.

Let $(A, \phi)$ be an admissible solution to the MKG-CG system on $[-T_0, T_1] \times \mathbb{R}^4$ for some $T_0, T_1 > 0$. Assume that $(A, \phi)[0]$ have frequency support at $|\xi| \leq K$ for some $K > 0$ and that

$$\|(A, \phi)\|_{S^1([-T_0, T_1] \times \mathbb{R}^4)} = L < \infty.$$

Then there exists $\delta_1(L) > 0$ with the following property: Let $(A + \delta A, \phi + \delta\phi)$ be any other admissible solution to the MKG-CG system defined locally around $t = 0$ such that

$$E(\delta A, \delta\phi)(0) = \delta_0 < \delta_1(L)$$

and such that $(\delta A, \delta\phi)[0]$ have frequency support at $|\xi| > K$. Then $(A + \delta A, \phi + \delta\phi)$ extends to an admissible solution to the MKG-CG system on the whole time interval $[-T_0, T_1]$ and satisfies

$$\|(A + \delta A, \phi + \delta\phi)\|_{S^1([-T_0, T_1] \times \mathbb{R}^4)} \leq \tilde{L}(L, \delta_0).$$

Moreover, we have $\|(\delta A, \delta\phi)\|_{S^1([-T_0, T_1] \times \mathbb{R}^4)} \to 0$ as $\delta_0 \to 0$.

Proof. A sketch of the proof is given in Subsection 7.4. □
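The splitting of data into a part with frequency support at $|\xi| \leq K$ and a high-frequency perturbation at $|\xi| > K$ can be illustrated with a toy computation. The sketch below is a deliberate simplification under stated assumptions: it uses a one-dimensional periodic model with integer frequencies (the paper works on $\mathbb{R}^4$), the coefficients are made up, and the names `h1_seminorm_sq` and `split_at_frequency` are hypothetical. It shows that the two pieces are orthogonal in the homogeneous $\dot{H}^1$ seminorm and that the high-frequency remainder shrinks as $K$ grows, which is the sense in which low frequency truncations approximate energy class data.

```python
def h1_seminorm_sq(coeffs):
    """Squared homogeneous H^1 seminorm of a function given by Fourier coefficients.

    coeffs : dict mapping an integer frequency xi to its complex coefficient
    """
    return sum((xi ** 2) * abs(c) ** 2 for xi, c in coeffs.items())


def split_at_frequency(coeffs, K):
    """Split coefficients into the parts supported at |xi| <= K and |xi| > K."""
    low = {xi: c for xi, c in coeffs.items() if abs(xi) <= K}
    high = {xi: c for xi, c in coeffs.items() if abs(xi) > K}
    return low, high


# Hypothetical datum with coefficients decaying like |xi|^(-2).
coeffs = {xi: 1.0 / abs(xi) ** 2 for xi in range(-64, 65) if xi != 0}
total = h1_seminorm_sq(coeffs)

# Size of the high-frequency remainder for increasing truncation levels K.
tails = []
for K in (1, 4, 16, 64):
    _, high_part = split_at_frequency(coeffs, K)
    tails.append(h1_seminorm_sq(high_part))

# Orthogonality of the two pieces at a fixed truncation level.
low, high = split_at_frequency(coeffs, 4)
orthogonality_defect = abs(h1_seminorm_sq(low) + h1_seminorm_sq(high) - total)
```

The decreasing list `tails` mirrors the smallness assumption on the perturbation in Proposition 5.1: once $K$ is large enough, the discarded high-frequency part of the data has small energy.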


The above high-frequency perturbation result suggests that we could define the MKG-CG evolution of energy class Coulomb data as a suitable limit of the evolutions of low frequency approximations of the energy class data. More precisely, for Coulomb data $(A, \phi)[0] \in \dot{H}^1_x \times L^2_x$, we pick a sequence of smoothings $(A_n, \phi_n)[0]$ by truncating the frequency support of $(A, \phi)[0]$ so that

$$\lim_{n \to \infty} (A_n, \phi_n)[0] = (A, \phi)[0]$$

in the sense of $\dot{H}^1_x \times L^2_x$. Here the rather technical question arises whether there exists a smooth (local) solution $(A_n, \phi_n)$ to the MKG-CG system with initial data $(A_n, \phi_n)[0]$. The hypothesis $(A, \phi)[0] \in \dot{H}^1_x \times L^2_x$ does not guarantee that $A(0)$ and $\phi(0)$ are $L^2_x$ integrable in the low frequencies. For this reason we cannot directly invoke the local well-posedness result [36] to obtain a smooth local solution. The natural way around this is to localize in physical space. This will be explained in more detail in Subsection 5.2 below.

For each smooth local solution $(A_n, \phi_n)$ to the MKG-CG system with initial data $(A_n, \phi_n)[0]$ we then define

$$I_n := \bigcup_{\tilde{I} \in \mathcal{A}_n} \tilde{I},$$

where

$$\mathcal{A}_n := \Big\{ \tilde{I} \subset \mathbb{R} \text{ open interval with } 0 \in \tilde{I} \,:\, \sup_{J \subset \tilde{I},\, J \text{ closed}} \|(A_n, \phi_n)\|_{S^1(J \times \mathbb{R}^4)} < \infty \Big\}.$$

We call $I_n$ the maximal lifespan of the solution $(A_n, \phi_n)$.

In order to define a canonical evolution of Coulomb energy class data, we have to show that the low frequency approximations $(A_n, \phi_n)$ exist on some joint time interval and satisfy uniform $S^1$ norm bounds there.

Proposition 5.2.

Let $(A, \phi)[0]$ be Coulomb energy class data and let $\{(A_n, \phi_n)[0]\}_n$ be a sequence of smooth low frequency truncations of $(A, \phi)[0]$ such that

$$\lim_{n \to \infty} (A_n, \phi_n)[0] = (A, \phi)[0]$$

in the sense of $\dot{H}^1_x \times L^2_x$. Denote by $(A_n, \phi_n)$ the smooth solutions to the MKG-CG system with initial data $(A_n, \phi_n)[0]$ and with maximal intervals of existence $I_n$. Then there exists a time $T \equiv T(A, \phi) > 0$ such that $[-T, T] \subset I_n$ for all sufficiently large $n$ and

$$\limsup_{n \to \infty} \|(A_n, \phi_n)\|_{S^1([-T, T] \times \mathbb{R}^4)} \leq C(A, \phi),$$

where $C(A, \phi) > 0$ is a constant that depends only on the energy class data $(A, \phi)[0]$.

Proof. The proof is given in Subsection 5.1 below. □

Using Proposition 5.1 and Proposition 5.2, we may introduce the following notion of energy class solutions to the MKG-CG system that we outlined above.

Deﬁnition 5.3.

Let $(A, \phi)[0]$ be Coulomb energy class data and let $\{(A_n, \phi_n)[0]\}_n$ be a sequence of smooth low frequency truncations of $(A, \phi)[0]$ such that

$$\lim_{n \to \infty} (A_n, \phi_n)[0] = (A, \phi)[0]$$

in the sense of $\dot{H}^1_x \times L^2_x$. We denote by $(A_n, \phi_n)$ the smooth solutions to the MKG-CG system with initial data $(A_n, \phi_n)[0]$ and define $I = (-T_0, T_1) = \bigcup \tilde{I}$ to be the union of all open time intervals $\tilde{I}$ containing $0$ with the property that

$$\sup_{J \subset \tilde{I},\, J \text{ closed}} \liminf_{n \to \infty} \|(A_n, \phi_n)\|_{S^1(J \times \mathbb{R}^4)} < \infty.$$

Then we define the MKG-CG evolution of $(A, \phi)[0]$ on $I \times \mathbb{R}^4$ to be

$$(A, \phi)[t] := \lim_{n \to \infty} (A_n, \phi_n)[t], \quad t \in I,$$

where the limit is taken in the energy topology. We call $I$ the maximal lifespan of $(A, \phi)$. For any closed interval $J \subset I$, we set

$$\|(A, \phi)\|_{S^1(J \times \mathbb{R}^4)} := \lim_{n \to \infty} \|(A_n, \phi_n)\|_{S^1(J \times \mathbb{R}^4)}.$$

We obtain the following characterization of the maximal lifespan $I$ of an energy class solution.

Lemma 5.4.

Let $(A, \phi)$, $(A_n, \phi_n)$ and $I$ be as in Definition 5.3. Assume in addition that $I \neq (-\infty, \infty)$. Then

$$\sup_{J \subset I,\, J \text{ closed}} \liminf_{n \to \infty} \|(A_n, \phi_n)\|_{S^1(J \times \mathbb{R}^4)} = \infty.$$

We call an energy class solution $(A, \phi)$ with maximal lifespan $I$ singular, if either $I \neq \mathbb{R}$, or if $I = \mathbb{R}$ and

$$\sup_{J \subset I,\, J \text{ closed}} \|(A, \phi)\|_{S^1(J \times \mathbb{R}^4)} = \infty.$$

Proof of Proposition 5.2.

A natural idea is to localize the data $(A_n, \phi_n)[0]$ in physical space to ensure smallness of the energy and to then try to "patch together" the local solutions obtained from the small energy global well-posedness result [22]. The problem is that the MKG-CG system does not have the finite speed of propagation property due to non-local terms in the equation for the magnetic potential $A$. To overcome this difficulty, we exploit that the Maxwell-Klein-Gordon system enjoys gauge invariance.

We first describe how we suitably localize the data $(A_n, \phi_n)[0]$ in physical space to obtain admissible Coulomb data with small energy that can be globally evolved by [22]. Let $\chi \in C_c^\infty(\mathbb{R}^4)$ be a smooth cutoff function with support in a ball centered at the origin and such that $\chi \equiv 1$ on $B(0, 1)$. For $x_0 \in \mathbb{R}^4$ and $r_0 > 0$, we set $\chi_{\{|x - x_0| \lesssim r_0\}}(x) := \chi\big(\frac{x - x_0}{r_0}\big)$. Then we define

$$\gamma_n(0, \cdot) := \Delta^{-1} \partial_j \big( \chi_{\{|x - x_0| \lesssim r_0\}}(\cdot)\, A_n^j(0, \cdot) \big) \tag{5.1}$$

and for $j = 1, \ldots, 4$,

$$\tilde{A}_{n,j}(0, \cdot) := \chi_{\{|x - x_0| \lesssim r_0\}}(\cdot)\, A_{n,j}(0, \cdot) - \partial_j \gamma_n(0, \cdot). \tag{5.2}$$

We determine $\tilde{A}_{n,0}(0, \cdot)$ as the solution to the elliptic equation

$$\Delta \tilde{A}_{n,0} = - \mathrm{Im}\big( \chi_{\{|x - x_0| \lesssim r_0\}} \phi_n \, \overline{\chi_{\{|x - x_0| \lesssim r_0\}} \partial_t \phi_n} \big) + \big| \chi_{\{|x - x_0| \lesssim r_0\}} \phi_n \big|^2 A_{n,0} \quad \text{on } \mathbb{R}^4, \tag{5.3}$$

where $\phi_n$ and $A_{n,0}$ are evaluated at time $t = 0$. We note that $\tilde{A}_n$ is in Coulomb gauge. Then we set

$$\partial_t \gamma_n(0, \cdot) := A_{n,0}(0, \cdot) - \tilde{A}_{n,0}(0, \cdot) \tag{5.4}$$

and define $\partial_t \tilde{A}_{n,j}(0, \cdot)$ for $j = 1, \ldots, 4$, first just on $B(x_0, r_0)$, by setting

$$\partial_t \tilde{A}_{n,j}\big|_{B(x_0, r_0)}(0, \cdot) := \big( \partial_t A_{n,j}(0, \cdot) - \partial_j \partial_t \gamma_n(0, \cdot) \big)\big|_{B(x_0, r_0)}. \tag{5.5}$$

We observe that $\Delta\big( A_{n,0}(0, \cdot) - \tilde{A}_{n,0}(0, \cdot) \big) = 0$ on $B(x_0, r_0)$ by the definition of $\tilde{A}_{n,0}(0, \cdot)$. Thus, the data $\partial_t \tilde{A}_n|_{B(x_0, r_0)}(0, \cdot)$ satisfy the Coulomb compatibility condition $\partial_j (\partial_t \tilde{A}_n^j)(0, \cdot) = 0$ on $B(x_0, r_0)$. Using [7, Proposition 2.1], we extend $(\partial_t \tilde{A}_{n,j})(0, \cdot)|_{B(x_0, r_0)}$ to the whole of $\mathbb{R}^4$ while maintaining the Coulomb compatibility condition and such that $\|\partial_t \tilde{A}_{n,j}\|_{L^2_x(\mathbb{R}^4)} \lesssim \|\partial_t \tilde{A}_{n,j}\|_{L^2_x(B(x_0, r_0))}$. Finally, we define

$$\tilde{\phi}_n(0, \cdot) := e^{i \gamma_n(0, \cdot)} \chi_{\{|x - x_0| \lesssim r_0\}}(\cdot)\, \phi_n(0, \cdot) \tag{5.6}$$

and

$$\partial_t \tilde{\phi}_n(0, \cdot) := i \partial_t \gamma_n(0, \cdot)\, e^{i \gamma_n(0, \cdot)} \chi_{\{|x - x_0| \lesssim r_0\}}(\cdot)\, \phi_n(0, \cdot) + e^{i \gamma_n(0, \cdot)} \chi_{\{|x - x_0| \lesssim r_0\}}(\cdot)\, \partial_t \phi_n(0, \cdot). \tag{5.7}$$

In the next lemma we prove that by choosing $r_0 > 0$ sufficiently small, we can ensure that the Coulomb data $(\tilde{A}_n, \tilde{\phi}_n)[0]$ have small energy for all sufficiently large $n$. Here we exploit that the convergence $(A_n, \phi_n)[0] \to (A, \phi)[0]$ in the energy topology as $n \to \infty$ implies a uniform non-concentration property of the energy of the data $(A_n, \phi_n)[0]$. We denote by $\varepsilon_0 > 0$ the small energy threshold of the small energy global well-posedness result [22].

Lemma 5.5.

Let $(\tilde{A}_n, \tilde{\phi}_n)$ be defined as in (5.1)–(5.7). Given $\varepsilon > 0$ there exists $r_0 > 0$ such that uniformly for all $x_0 \in \mathbb{R}^4$ and for all sufficiently large $n$, it holds that

$$E(\tilde{A}_n, \tilde{\phi}_n) < \varepsilon.$$

Proof.

We start with the components ˜ A n , j . Suppressing that A n is evaluated at time t =

0, we havefor j = , . . . , k∇ x ˜ A n , j k L x ( R ) . k∇ x ( χ {| x − x | . r } A n , j ) k L x ( R ) + k∇ x ∂ j γ n k L x ( R ) . X i = k∇ x ( χ {| x − x | . r } A n , i ) k L x ( R ) . X i = r Z B ( x , r ) | A n , i ( x ) | dx + Z B ( x , r ) |∇ x A n , i ( x ) | dx . X i = (cid:18)Z B ( x , r ) | A n , i ( x ) | dx (cid:19) / + Z B ( x , r ) |∇ x A n , i ( x ) | dx . (5.8)Next we note that we can pick r > A thatsup x ∈ R X i = Z B ( x , r ) |∇ x A i ( x ) | dx + Z B ( x , r ) | A i ( x ) | dx ≪ ε . Since A n → A in ˙ H x ( R ) as n → ∞ , we also obtain for su ﬃ ciently large n thatsup x ∈ R X i = Z B ( x , r ) |∇ x A n , i ( x ) | dx + Z B ( x , r ) | A n , i ( x ) | dx ≪ ε . From (5.8) we conclude that k∇ x ˜ A n , j k L x ( R ) . ε . In a similar manner we argue that r > ﬃ ciently large n we also have X i = k ∂ t ˜ A n , i k L x ( R ) + k∇ x ˜ A n , k L x ( R ) + k∇ t , x ˜ φ n k L x ( R ) . ε and hence, E ( ˜ A n , ˜ φ n ) . ε . (cid:3) By Lemma 5.5 we can pick r > A n , ˜ φ n )[0] can be globally evolved forsu ﬃ ciently large n by the small energy global well-posedness result [22] and we obtain global S norm bounds on their evolutions ( ˜ A n , ˜ φ n ). For t > ∂ t γ n ( t , · ) : = A n , ( t , · ) − ˜ A n , ( t , · ) , which implies that(5.10) γ n ( t , · ) = γ n (0 , · ) + Z t (cid:0) A n , ( s , · ) − ˜ A n , ( s , · ) (cid:1) ds . Our next goal is to relate the evolutions ( ˜ A n , ˜ φ n ) and ( A n , φ n ) on the light cone K x , r = (cid:8) ( t , x ) : 0 ≤ t < r , | x − x | < r − t (cid:9) over the ball B ( x , r ). These identities will be the key ingredient to recover S norm bounds for( A n , φ n ) from those of ( ˜ A n , ˜ φ n ). Lemma 5.6.

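Let $(\tilde{A}_n, \tilde{\phi}_n)$ and $\gamma_n$ be defined as in (5.1)–(5.7) and (5.9)–(5.10) such that $E(\tilde{A}_n, \tilde{\phi}_n) < \varepsilon_0$. For all sufficiently large $n$ it holds that

$$\tilde{A}_{n,j} = A_{n,j} - \partial_j \gamma_n \quad \text{on } K_{x_0, r_0}$$

for $j = 1, \ldots, 4$ and that

$$\tilde{\phi}_n = e^{i \gamma_n} \phi_n \quad \text{on } K_{x_0, r_0}.$$

Proof.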

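To simplify the notation we omit the subscript $n$. Using the equations that $(A, \phi)$, $(\tilde{A}, \tilde{\phi})$, and $\gamma$ satisfy, we obtain that

$$\Box \tilde{A}_j = - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_j \tilde{\phi}} \big) + \partial_j \Delta^{-1} \partial^i\, \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_i \tilde{\phi}} \big) \quad \text{on } \mathbb{R}_t \times \mathbb{R}^4_x \tag{5.11}$$

and

$$\Box (A_j - \partial_j \gamma) = - \mathrm{Im}\big( \phi\, \overline{D_j \phi} \big) + \partial_j \Delta^{-1} \partial^i\, \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_i \tilde{\phi}} \big) + \partial_j \int_0^t \Big\{ \mathrm{Im}\big( \phi\, \overline{D_t \phi} \big) - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_t \tilde{\phi}} \big) \Big\}\, ds \quad \text{on } K_{x_0, r_0}, \tag{5.12}$$

where we use the notation $\tilde{D}_\alpha = \partial_\alpha + i \tilde{A}_\alpha$. Next we introduce the quantities

$$B_j = \tilde{A}_j - (A_j - \partial_j \gamma) \quad \text{and} \quad \psi = \tilde{\phi} - e^{i\gamma} \phi.$$

From (5.11) and (5.12) we infer that

$$\Box B_j = \mathrm{Im}\big( \phi\, \overline{D_j \phi} \big) - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_j \tilde{\phi}} \big) - \partial_j \int_0^t \Big\{ \mathrm{Im}\big( \phi\, \overline{D_t \phi} \big) - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_t \tilde{\phi}} \big) \Big\}\, ds \quad \text{on } K_{x_0, r_0}.$$

The first two terms in the above equation can be rewritten as

$$\mathrm{Im}\big( \phi\, \overline{D_j \phi} \big) - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_j \tilde{\phi}} \big) = B_j |\phi|^2 - \mathrm{Im}\big( \psi\, \overline{(\partial_j + i \tilde{A}_j)(\psi + e^{i\gamma} \phi)} \big) - \mathrm{Im}\big( e^{i\gamma} \phi\, \overline{(\partial_j + i \tilde{A}_j) \psi} \big)$$

and similarly we obtain for the last term that

$$\mathrm{Im}\big( \phi\, \overline{D_t \phi} \big) - \mathrm{Im}\big( \tilde{\phi}\, \overline{\tilde{D}_t \tilde{\phi}} \big) = - \mathrm{Im}\big( \psi\, \overline{(\partial_t + i \tilde{A}_0)(\psi + e^{i\gamma} \phi)} \big) - \mathrm{Im}\big( e^{i\gamma} \phi\, \overline{(\partial_t + i \tilde{A}_0) \psi} \big).$$

We conclude that the wave equation for $B_j$ on the light cone $K_{x_0, r_0}$ is of the schematic form

$$\Box B_j = f_1 B_j + f_2 |\psi|^2 + f_3 \psi + f_4 \overline{\psi} + f_5 (\partial_j \psi) + f_6 (\overline{\partial_j \psi}) + \partial_j \int_0^t \Big\{ f_7 |\psi|^2 + f_8 \psi + f_9 \overline{\psi} + f_{10} (\partial_t \psi) + f_{11} (\overline{\partial_t \psi}) \Big\}\, ds, \tag{5.13}$$

where $f_1, \ldots, f_{11}$ are smooth functions on $K_{x_0, r_0}$. To obtain a wave equation for $\psi$, we note that $B_0 = \tilde{A}_0 - (A_0 - \partial_t \gamma) = 0$ and compute

$$\begin{aligned}
0 &= \Box_{\tilde{A}} \tilde{\phi} - e^{i\gamma} \Box_A \phi \\
&= \Box_{\tilde{A}} (\psi + e^{i\gamma} \phi) - e^{i\gamma} \Box_A \phi \\
&= \Box_{\tilde{A}} \psi + \Box_{B + A - \partial\gamma} (e^{i\gamma} \phi) - e^{i\gamma} \Box_A \phi \\
&= \Box_{\tilde{A}} \psi + \Box_B (e^{i\gamma} \phi) - \Box (e^{i\gamma} \phi) - 2 B_j (A^j - \partial^j \gamma) e^{i\gamma} \phi + \Box_{A - \partial\gamma} (e^{i\gamma} \phi) - e^{i\gamma} \Box_A \phi \\
&= \Box_{\tilde{A}} \psi + i (\partial_j B^j)(e^{i\gamma} \phi) + 2 i B_j \partial^j (e^{i\gamma} \phi) - B_j B^j (e^{i\gamma} \phi) - 2 B_j (A^j - \partial^j \gamma) e^{i\gamma} \phi.
\end{aligned}$$

Here the last step uses the gauge covariance $\Box_{A - \partial\gamma}(e^{i\gamma} \phi) = e^{i\gamma} \Box_A \phi$. Thus, $\psi$ satisfies a wave equation on the light cone $K_{x_0, r_0}$ of the schematic form

$$\Box \psi = f \psi + f^\alpha \partial_\alpha \psi + g^j B_j + g B_j B^j + h (\partial_j B^j), \tag{5.14}$$

where $f, f^\alpha, g, g^j, h$ are smooth functions on $K_{x_0, r_0}$. Since $B[0]$ and $\psi[0]$ vanish on $B(x_0, r_0)$ by our choice of the initial data $(\tilde{A}, \tilde{\phi})[0]$, we conclude from (5.13) and (5.14) by a standard energy argument that indeed $\tilde{A}_j = A_j - \partial_j \gamma$ on $K_{x_0, r_0}$ and $\tilde{\phi} = e^{i\gamma} \phi$ on $K_{x_0, r_0}$. □

It is clear that given ε >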

0, there exists R > ﬃ ciently large n , it holds that E (cid:0) χ {| x | > R } ( · ) A n (0 , · ) , χ {| x | > R } ( · ) φ n (0 , · ) (cid:1) < ε . For our later purposes we have to localize the initial data ( A n , φ n ) outside the large ball B (0 , R ) ina scaling invariant way. For any x l ∈ R with | x l | ∼ R m for some m ∈ N , we set r l : = R m − .Then we deﬁne γ ( l ) n (0 , · ) : = ∆ − ∂ j (cid:0) χ {| x − x l | . r l } ( · ) A jn (0 , · ) (cid:1) and for j = , . . . ,

4, ˜ A ( l ) n , j (0 , · ) : = χ {| x − x l | . r l } ( · ) A n , j (0 , · ) − ∂ j γ ( l ) n (0 , · ) . We deﬁne ˜ A ( l ) n , (0 , · ) , ∂ t ˜ A ( l ) n , j (0 , · ) , ˜ φ ( l ) n (0 , · ) , ∂ t ˜ φ ( l ) n (0 , · ) analogously to (5.3) – (5.7) and γ ( l ) n ( t , · ), ∂ t γ ( l ) n ( t , · )for t > Lemma 5.7.

Given $\varepsilon > 0$ there exists $R > 0$ such that the initial data $(\tilde{A}^{(l)}_n, \tilde{\phi}^{(l)}_n)$ defined as above satisfy for all sufficiently large $n$ that

$$E(\tilde{A}^{(l)}_n, \tilde{\phi}^{(l)}_n) < \varepsilon,$$

and

Lemma 5.8.

For all sufficiently large $n$ it holds that

$$\tilde{A}^{(l)}_{n,j} = A_{n,j} - \partial_j \gamma^{(l)}_n \quad \text{on } K_{x_l, r_l}$$

for $j = 1, \ldots, 4$, and that

$$\tilde{\phi}^{(l)}_n = e^{i \gamma^{(l)}_n} \phi_n \quad \text{on } K_{x_l, r_l},$$

where $K_{x_l, r_l} := \big\{ (t, x) : 0 \leq t < r_l,\ |x - x_l| < r_l - t \big\}$.

We now begin with the proof of Proposition 5.2 where we suitably "patch together" the small energy global evolutions constructed above.
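The gauge correction $\gamma = \Delta^{-1} \partial_j (\chi A^j)$ used in the localization above is easiest to see on the Fourier side. The following sketch, under stated assumptions, treats a single nonzero frequency $\xi \in \mathbb{R}^4$: `a_hat` stands for the Fourier coefficient of the cut-off spatial potential $\chi A$ at $\xi$ (an illustrative input, not data from the paper), and `coulomb_project_mode` and `curvature_mode` are hypothetical helper names. Subtracting $\partial_j \gamma$ removes the divergence, so the result is in Coulomb gauge, while the curvature $\partial_i A_j - \partial_j A_i$ is left unchanged.

```python
def coulomb_project_mode(xi, a_hat):
    """Apply A -> A - grad(gamma), gamma = Laplacian^{-1} div A, at one frequency.

    xi    : 4 real frequency components (not all zero)
    a_hat : 4 complex Fourier coefficients of the spatial potential at xi
    Returns (gamma_hat, a_tilde_hat).
    """
    xi_sq = sum(c * c for c in xi)
    if xi_sq == 0:
        raise ValueError("the zero frequency must be treated separately")
    # Fourier symbols: d_j -> i * xi_j and Laplacian -> -|xi|^2.
    div_hat = sum(1j * c * a for c, a in zip(xi, a_hat))
    gamma_hat = div_hat / (-xi_sq)
    a_tilde_hat = [a - 1j * c * gamma_hat for c, a in zip(xi, a_hat)]
    return gamma_hat, a_tilde_hat


def curvature_mode(xi, a_hat):
    """Fourier coefficients of F_ij = d_i A_j - d_j A_i at frequency xi."""
    n = len(xi)
    return [[1j * (xi[i] * a_hat[j] - xi[j] * a_hat[i]) for j in range(n)]
            for i in range(n)]


# Hypothetical frequency and coefficients, chosen only for illustration.
xi = (1.0, 2.0, -1.0, 0.5)
a_hat = [1 + 2j, -0.5j, 3.0 + 0j, 2 - 1j]

gamma_hat, a_tilde_hat = coulomb_project_mode(xi, a_hat)
divergence = sum(1j * c * a for c, a in zip(xi, a_tilde_hat))
```

The invariance of the curvature under this correction is what later allows the potentials to be recovered from the gauge-invariant quantities $F_{n,ij}$ when the local evolutions are patched together.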

Proof of Proposition 5.2.

By time reversibility, it su ﬃ ces to only prove the statement in forwardtime. We pick r > ﬃ ciently small and R > ﬃ ciently large according to Lemma 5.5and Lemma 5.7. Then we cover the ball B (0 , R ) ⊂ R by the supports of ﬁnitely many cuto ﬀ s χ {| x − x l | . r l } with r l = r for l = , . . . , L for some L ∈ N . We divide the complement B (0 , R ) c of theball B (0 , R ) into dyadic annulli A m : = (cid:8) x ∈ R : 2 R m − < | x | ≤ R m (cid:9) , m ∈ N , and cover each A m by the supports of ﬁnitely many suitable cuto ﬀ s χ {| x − x l | . r l } ( · ) with | x l | ∼ R m and r l ∼ R m − .This can be carried out in such a way that (cid:8) supp( χ {| x − x l | . r l } ) (cid:9) ∞ l = is a uniformly ﬁnitely overlappingcovering of R . We denote by ( ˜ A ( l ) n , ˜ φ ( l ) n ) the associated global solutions to MKG-CG with smallenergy data given by Lemma 5.5, respectively Lemma 5.7. Fix 0 < T ≪ r such that[0 , T ] × R ⊂ ∞ [ l = K x l , r l . Then Lemma 5.6 and Lemma 5.8 imply that the evolutions ( A n , φ n ) exist on the time interval [0 , T ]uniformly for all su ﬃ ciently large n . The covering of R by the supports of the cuto ﬀ s χ {| x − x l | . r l } ( · )can be done in such a way that there exists a uniformly ﬁnitely overlapping, smooth partition ofunity { χ l } l ∈ N ⊂ C ∞ c ( R × R ),(5.15) 1 = ∞ X l = χ l on [0 , T ] × R , so that each cuto ﬀ function χ l ( t , x ) is non-zero only for t ∈ [ − T , T ] and satisﬁes K x l , r l ∩{ t }× R ⊂ supp( χ l ( t , · )) ⊂ K x l , r l ∩ { t } × R for t ∈ [0 , T ].In order to obtain uniform S norm bounds on the evolutions of ( A n , φ n ) on [0 , T ] × R , itsu ﬃ ces to establish uniform bounds on the Strichartz and X , ∞ components of the S norms of( A n , φ n ) on [0 , T ] × R . These bounds then imply uniform bounds on the full S norms of ( A n , φ n )on [0 , T ] × R by a bootstrap argument as in the proof of Proposition 8.7, see the key Observation 1and Observation 2 there. 
Since the argument in Proposition 8.7 is self-contained, we omit the details here. To facilitate the notation in the following, we introduce the $\tilde{S}^1$ norm

$$\|u\|^2_{\tilde{S}^1} := \sum_{k \in \mathbb{Z}} \|P_k \nabla_{t,x} u\|^2_{\tilde{S}_k}.$$

Its dyadic subspaces $\tilde{S}_k$ are given by

$$\|u\|_{\tilde{S}_k} := \|u\|_{S^{Str}_k} + \|u\|_{X^{0, \frac{1}{2}}_\infty},$$

where we recall that

$$S^{Str}_k = \bigcap_{\frac{1}{q} + \frac{3/2}{r} \leq \frac{3}{4}} 2^{(\frac{1}{q} + \frac{4}{r} - 2) k} L^q_t L^r_x.$$

We begin by deriving uniform $\tilde{S}^1$ norm bounds on the evolutions $A_n$ on $[0, T] \times \mathbb{R}^4$. To this end we define for $i, j = 1, \ldots, 4$ and $l \in \mathbb{N}$ the curvature tensors

$$F_{n,ij} = \partial_i A_{n,j} - \partial_j A_{n,i}$$

and

$$\tilde{F}^{(l)}_{n,ij} = \partial_i \tilde{A}^{(l)}_{n,j} - \partial_j \tilde{A}^{(l)}_{n,i}.$$

From Lemma 5.6 and Lemma 5.8 we conclude that

$$F_{n,ij} = \sum_{l=1}^{\infty} \chi_l F_{n,ij} = \sum_{l=1}^{\infty} \chi_l \tilde{F}^{(l)}_{n,ij} \quad \text{on } [0, T] \times \mathbb{R}^4.$$

Using the Coulomb gauge, we find for $j = 1, \ldots, 4$ that

$$A_{n,j} = \Delta^{-1} \partial^i F_{n,ij} = \sum_{l=1}^{\infty} \Delta^{-1} \partial^i \big( \chi_l \tilde{F}^{(l)}_{n,ij} \big) \quad \text{on } [0, T] \times \mathbb{R}^4.$$

In order to infer $\tilde{S}^1$ norm bounds on $A_{n,j}$ from the finite $S^1$ norm bounds of the globally defined evolutions $\tilde{A}^{(l)}_n$, we invoke the following almost orthogonality estimate. We defer its proof to the end of this subsection.

Lemma 5.9.

There exists a constant $C(A, \phi) > 0$ so that uniformly for all $n$, we have for $j = 1, \ldots, 4$ that

$$\bigg\| \sum_{l=1}^{\infty} \Delta^{-1} \partial^i \big( \chi_l \tilde{F}^{(l)}_{n,ij} \big) \bigg\|_{\tilde{S}^1([0, T] \times \mathbb{R}^4)} \leq C(A, \phi) \bigg( \sum_{l=1}^{\infty} \Big\| \Delta^{-1} \partial^i \big( \chi_l \tilde{F}^{(l)}_{n,ij} \big) \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigg)^{1/2}. \tag{5.16}$$

The constant $C(A, \phi) > 0$ depends only on the size of $T > 0$, which is determined by the energy class data $(A, \phi)[0]$.

Hence, by (5.16) we obtain for $j = 1, \ldots, 4$ that

$$\begin{aligned}
\|A_{n,j}\|_{\tilde{S}^1([0, T] \times \mathbb{R}^4)} &= \bigg\| \sum_{l=1}^{\infty} \Delta^{-1} \partial^i \big( \chi_l \tilde{F}^{(l)}_{n,ij} \big) \bigg\|_{\tilde{S}^1([0, T] \times \mathbb{R}^4)} \\
&\leq C(A, \phi) \bigg( \sum_{l=1}^{\infty} \Big\| \Delta^{-1} \partial^i \big( \chi_l \tilde{F}^{(l)}_{n,ij} \big) \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigg)^{1/2} \\
&\lesssim C(A, \phi) \bigg( \sum_{l=1}^{\infty} \Big\| \Delta^{-1} \nabla_x \big( \chi_l \nabla_x \tilde{A}^{(l)}_n \big) \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigg)^{1/2}.
\end{aligned} \tag{5.17}$$

Next we will invoke the following multiplier bound for the $\tilde{S}^1$ norm that will be proven at the end of this subsection.

Lemma 5.10.

Let χ ∈ C ∞ ( R × R ) satisfy max k = , ,..., k∇ kt , x χ k L qt L rx ( R × R ) ≤ D for all ≤ q , r ≤ ∞ for some D > . Then there exists a constant C > independent of χ such that for all ψ ∈ ˜ S ( R × R ) ,it holds that (5.18) (cid:13)(cid:13)(cid:13) ∆ − ∇ x (cid:0) χ ∇ x ψ (cid:1)(cid:13)(cid:13)(cid:13) ˜ S ( R × R ) ≤ CD k ψ k ˜ S ( R × R ) . By scaling invariance of the ˜ S norm and the scaling invariant setup of the partition of unity { χ l } l ∈ N , we are in a position to apply Lemma 5.10 uniformly for all multipliers χ l to estimate theright-hand side of (5.17) by C ( A , φ ) ∞ X l = (cid:13)(cid:13)(cid:13) ˜ A ( l ) n (cid:13)(cid:13)(cid:13) S ( R × R ) ! / . By the small energy global well-posedness result [22], this is in turn bounded by(5.19) C ( A , φ ) ∞ X l = k∇ t , x ˜ φ ( l ) n (0) k L x ( R ) + k∇ t , x ˜ A ( l ) n (0) k L x ( R ) ! / . It remains to square sum in l over the ˙ H x × L x norms of the initial data ( ˜ φ ( l ) n , ˜ A ( l ) n )[0], which we deferto the end of the proof of Proposition 5.2.To deduce uniform ˜ S norm bounds on the evolutions φ n on [0 , T ] × R , we use Lemma 5.6 andLemma 5.8 to write(5.20) φ n = ∞ X l = χ l φ n = ∞ X l = χ l e − i γ ( l ) n ˜ φ ( l ) n on [0 , T ] × R . Next, we apply the following almost orthogonality estimate whose proof we defer to the end of thissubsection.

Lemma 5.11.

There exists a constant $C(A, \phi) > 0$ so that uniformly for all $n$,

$$\bigg\| \sum_{l=1}^{\infty} \chi_l e^{-i \gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \bigg\|_{\tilde{S}^1([0, T] \times \mathbb{R}^4)} \leq C(A, \phi) \bigg( \sum_{l=1}^{\infty} \Big\| \chi_l e^{-i \gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigg)^{1/2}. \tag{5.21}$$

The constant $C(A, \phi) > 0$ depends only on the size of $T > 0$, which is determined by the energy class data $(A, \phi)[0]$.

Thus, by (5.20) and (5.21) we find that

$$\|\phi_n\|_{\tilde{S}^1([0, T] \times \mathbb{R}^4)} \leq C(A, \phi) \bigg( \sum_{l=1}^{\infty} \Big\| \chi_l e^{-i \gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigg)^{1/2}. \tag{5.22}$$

Here it is not immediate how to obtain $\tilde{S}^1$ norm bounds for $\chi_l e^{-i \gamma^{(l)}_n} \tilde{\phi}^{(l)}_n$ from the finite $S^1$ norm bounds of the globally defined $\tilde{\phi}^{(l)}_n$, because $\gamma^{(l)}_n$ implicitly depends on the unknown quantity $\phi_n$. Indeed, we defined in (5.10) for $t > 0$,

$$\gamma^{(l)}_n(t, \cdot) = \gamma^{(l)}_n(0, \cdot) + \int_0^t \big( A_{n,0}(s, \cdot) - \tilde{A}^{(l)}_{n,0}(s, \cdot) \big)\, ds$$

and we have $\Delta A_{n,0} = - \mathrm{Im}\big( \phi_n \overline{D_t \phi_n} \big)$. We will overcome this difficulty by exploiting that $\partial_t \gamma^{(l)}_n$ is a harmonic function on every fixed-time slice of $K_{x_l, r_l}$ in view of Lemma 5.6, respectively Lemma 5.8, and its definition

$$\partial_t \gamma^{(l)}_n(t, \cdot) = A_{n,0}(t, \cdot) - \tilde{A}^{(l)}_{n,0}(t, \cdot) \quad \text{for } t \geq 0.$$

It therefore enjoys the interior derivative estimates for harmonic functions on every ﬁxed-time sliceof K ( x l , r l ). The partition of unity (5.15) was chosen in such a way that the cuto ﬀ functions χ l satisfy for all 0 ≤ t ≤ T and x ∈ supp( χ l ( t , · )) that B ( x , r l ) ⊂ K x l , r l ∩ { t } × R . Thus, for allintegers k ≥

0, we obtain from the interior derivative estimates for harmonic functions that (cid:12)(cid:12)(cid:12) χ l ∇ kx ∂ t γ ( l ) n ( t , x ) (cid:12)(cid:12)(cid:12) . C ( k ) r + kl (cid:13)(cid:13)(cid:13)(cid:0) A n , − ˜ A ( l ) n , (cid:1) ( t , · ) (cid:13)(cid:13)(cid:13) L x ( B ( x , rl )) . C ( k ) r + kl (cid:13)(cid:13)(cid:13)(cid:0) A n , − ˜ A ( l ) n , (cid:1) ( t , · ) (cid:13)(cid:13)(cid:13) L x ( R ) . C ( k ) r + kl E ( A , φ ) / . We may therefore conclude that(5.23) r + kl (cid:13)(cid:13)(cid:13) χ l ∇ kx ∂ t γ ( l ) n (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( R × R ) ≤ C ( k , A , φ ) . Similarly, we observe that by Lemma 5.6, ∂ t γ ( l ) n ( t , · ) = ∂ t A n , ( t , · ) − ∂ t ˜ A n , ( t , · )is harmonic on every ﬁxed-time slice of K ( x l , r ). The interior derivative estimates for harmonicfunctions then yield for all integers k ≥ (cid:12)(cid:12)(cid:12) χ l ∇ kx ∂ t γ ( l ) n ( t , x ) (cid:12)(cid:12)(cid:12) . C ( k ) r + kl (cid:13)(cid:13)(cid:13)(cid:0) ∂ t A n , − ∂ t ˜ A n , (cid:1) ( t , · ) (cid:13)(cid:13)(cid:13) L x ( B ( x , rl )) . C ( k ) r + kl (cid:13)(cid:13)(cid:13)(cid:0) ∂ t A n , − ∂ t ˜ A n , )( t , · ) (cid:13)(cid:13)(cid:13)(cid:13) L x ( R ) . Since we have (cid:13)(cid:13)(cid:13) ∂ t A n ( t , · ) (cid:13)(cid:13)(cid:13) L x ( R ) . (cid:13)(cid:13)(cid:13) ∇ x ∂ t A n ( t , · ) (cid:13)(cid:13)(cid:13) L x . X i = (cid:13)(cid:13)(cid:13) Im (cid:0) φ n D i φ n (cid:1)(cid:13)(cid:13)(cid:13) L x . X i = k φ n k L x k D i φ n k L x . E ( A , φ )and analogously for k ∂ t ˜ A n ( t , · ) k L x ( R ) , it follows that(5.24) r + kl (cid:13)(cid:13)(cid:13) χ l ∇ kx ∂ t γ ( l ) n (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( R × R ) . C ( k , A , φ ) . Next, we note that γ ( l ) n (0 , · ) as deﬁned in (5.1) is harmonic on the ball B ( x l , r l ). 
As before, theinterior derivative estimates for harmonic functions give for all integers k ≥ r kl (cid:13)(cid:13)(cid:13) χ l ∇ kx γ ( l ) n (0 , · ) (cid:13)(cid:13)(cid:13) L ∞ x ( R ) ≤ C ( k , A , φ ) . We then obtain from γ ( l ) n ( t , x ) = γ ( l ) n (0 , x ) + Z t ∂ t γ ( l ) n ( s , x ) ds that(5.26) r kl (cid:13)(cid:13)(cid:13) χ l ∇ kx γ ( l ) n (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( R × R ) . C ( k , A , φ ) . From (5.23) – (5.26) we conclude that for all integers k ≥ C ( k , A , φ ) > k and the energy class data ( A , φ )[0], so that for all su ﬃ ciently large n and all l ∈ N ,(5.27) max m = , , r k + ml (cid:13)(cid:13)(cid:13) ∇ kx ∂ mt (cid:0) χ l e − i γ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( R × R ) ≤ C ( k , A , φ ) . Similarly to Lemma 5.10, we also have the following multiplier bound for the ˜ S norm. Lemma 5.12.

Let χ ∈ C ∞ ( R × R ) satisfy (5.28) max k = ,..., max m = , , (cid:13)(cid:13)(cid:13) ∇ kx ∂ mt χ (cid:13)(cid:13)(cid:13) L qt L rx ( R × R ) ≤ D for all ≤ q , r ≤ ∞ for some D > . Then there exists a constant C > independent of χ such that for all ψ ∈ ˜ S ( R × R ) , (5.29) k χψ k ˜ S ( R × R ) ≤ CD k ψ k ˜ S ( R × R ) . In view of (5.27), the scaling invariance of the ˜ S norm and the scaling invariant setup of thepartition of unity { χ l } l ∈ N , we can apply Lemma 5.12 uniformly for all multipliers χ l to estimate theright hand side of (5.22) by C ( A , φ ) ∞ X l = (cid:13)(cid:13)(cid:13) ˜ φ ( l ) n (cid:13)(cid:13)(cid:13) S ( R × R ) ! / . By the small energy global well-posedness result [22], this is in turn bounded by(5.30) C ( A , φ ) (cid:18) ∞ X l = k∇ t , x ˜ φ ( l ) n (0) k L x + k∇ t , x ˜ A ( l ) n (0) k L x (cid:19) / . It remains to square sum in l over the ˙ H x × L x norms of the initial data ( ˜ φ ( l ) n , ˜ A ( l ) n )[0] in (5.19) and in(5.30). Here we have, for example, from the deﬁnition˜ φ ( l ) n (0 , · ) : = e i γ ( l ) n (0 , · ) χ {| x − x l | . r l } ( · ) φ n (0 , · )that ∞ X l = Z R |∇ x ˜ φ ( l ) n (0 , x ) | dx . ∞ X l = Z R (cid:0) |∇ x γ ( l ) n (0 , x ) | | χ {| x − x l | . r l } ( x ) | + |∇ x χ {| x − x l | . r l } ( x ) | (cid:1) | φ n (0 , x ) | dx + ∞ X l = Z R | χ {| x − x l | . r l } ( x ) | |∇ x φ n (0 , x ) | dx . (5.31)By the construction of the partition of unity, we have uniformly for all l ∈ N and x ∈ R that |∇ x χ {| x − x l | . r l } ( x ) | . C ( A , φ ) | x | and, using also (5.25), that |∇ x γ ( l ) n (0 , x ) | | χ {| x − x l | . r l } ( x ) | . C ( A , φ ) | x | . By Hardy’s inequality and the uniformly ﬁnite overlap of the supports of the cuto ﬀ s χ {| x − x l | . r l } ( · ),we conclude that (5.31) is bounded by C ( A , φ ) k∇ x φ n (0 , · ) k L x . C ( A , φ ) E ( A , φ )uniformly for all su ﬃ ciently large n . 
Proceeding similarly with the other terms in (5.30), we finally obtain that (5.30) is bounded by $C(A, \phi) E(A, \phi)$ uniformly for all sufficiently large $n$. This finishes the proof of Proposition 5.2. □

Next, we turn to the proofs of Lemma 5.9 and of Lemma 5.11. We only give the proof of Lemma 5.11, the other one being similar.


Proof of Lemma 5.11.

In view of the setup of the partition of unity { χ l } l ∈ N , we may assume in thisproof that the spatial support of χ l is at scale ∼ l for l ∈ N . Moreover, we recall that χ l ( t , · ) isnon-zero only for t ∈ [ − T , T ].We ﬁrst consider the S S trk component of the ˜ S norm. Here we want to show that X k ∈ Z (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k ∞ X l = ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) S Strk ([0 , T ] × R ) . ∞ X l = X k ∈ Z (cid:13)(cid:13)(cid:13) P k ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13) S Strk ( R × R ) . Let ( q , r ) be a wave-admissible pair. Then we have(5.32) X k ∈ Z q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k ∞ X l = ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) . X k ≤ q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k − k X l = ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) + X k ∈ Z q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k X l > − k ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) . In order to bound the ﬁrst term on the right-hand side of (5.32), we introduce slightly fattened cuto ﬀ functions ˜ χ l ∈ C ∞ c ( R × R ) such that supp( χ l ) ⊂ supp( ˜ χ l ). Then we obtain from H ¨older’s inequalityand Bernstein’s estimate that X k ≤ q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k − k X l = ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) . X k ≤ T q ( q + k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) − k X l = ˜ χ l ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L ∞ t L x ([0 , T ] × R ) ! . 
X k ≤ − k X l = T q k k ˜ χ l k L ∞ t L x (cid:13)(cid:13)(cid:13) ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x ([0 , T ] × R ) ! . Since the spatial support of ˜ χ l is at scale 2 l , this is bounded by T q X k ≤ − k X l = k + l ) (cid:13)(cid:13)(cid:13) ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x ([0 , T ] × R ) ! . Finally, using Young’s inequality, we arrive at the desired bound T q ∞ X l = (cid:13)(cid:13)(cid:13) ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x ([0 , T ] × R ) . T q ∞ X l = (cid:13)(cid:13)(cid:13) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:13)(cid:13)(cid:13) S ( R × R ) . Regarding the second term on the right-hand side of (5.32),(5.33) X k ∈ Z q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)X l > − k P k ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) , we note that the spatial support of the cuto ﬀ χ l is at scale 2 l , while the projection P k lives at spatialscale 2 − k . Thus, for l > − k the projection P k approximately preserves the spatial localizations of the cuto ﬀ s χ l , up to exponential tails that can be treated easily. Since the family of cuto ﬀ s { χ l } l ∈ N isuniformly ﬁnitely overlapping, we may therefore bound (5.33) schematically by X k ∈ Z X l > − k q + r − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k ∇ t , x (cid:0) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L qt L rx ([0 , T ] × R ) . ∞ X l = (cid:13)(cid:13)(cid:13) χ l e − i γ ( l ) n ˜ φ ( l ) n (cid:13)(cid:13)(cid:13) S ( R × R ) , which is of the desired form.It remains to consider the X , ∞ component of the ˜ S norm. 
Here our goal is to prove that
$$\sum_{k \in \mathbb{Z}} \Big\| P_k \sum_{l=1}^{\infty} \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{X^{0,\frac{1}{2}}_\infty([0,T] \times \mathbb{R}^4)} \lesssim \sum_{l=1}^{\infty} \Big\| \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)}.$$
To this end we distinguish between small and large modulations. For small modulations $j \leq 0$, we may just dispose of the projection $Q_j$ and trivially estimate
$$\sum_{k \in \mathbb{Z}} \sup_{j \leq 0} 2^{j} \Big\| P_k Q_j \sum_{l=1}^{\infty} \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \sum_{k \in \mathbb{Z}} \Big\| \sum_{l=1}^{\infty} P_k \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}.$$
By the space-time support properties of the cutoffs $\chi_l$ and Hölder's inequality in time, this is bounded by
$$\lesssim \sum_{k \in \mathbb{Z}} T \, \Big\| P_k \sum_{l=1}^{\infty} \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^\infty_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}$$
and then we obtain, as in the previous considerations on the $S^{\mathrm{Str}}_k$ component of the $\tilde{S}^1$ norm, the desired bound
$$\lesssim T \sum_{l=1}^{\infty} \sum_{k \in \mathbb{Z}} \Big\| P_k \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^\infty_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}.$$

For large modulations $j > 0$ and frequencies $k > 0$, the space-time supports of the cutoffs $\chi_l$ are approximately preserved, up to exponential tails that can be treated easily. Denoting by $\tilde{\chi}_l$ slightly fattened versions of the cutoffs $\chi_l$, we may therefore estimate schematically
$$\begin{aligned} \sum_{k > 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l=1}^{\infty} P_k Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} &\simeq \sum_{k > 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l=1}^{\infty} \tilde{\chi}_l P_k Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \\ &\lesssim \sum_{l=1}^{\infty} \sum_{k > 0} \sup_{j > 0} 2^{j} \Big\| P_k Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \\ &\lesssim \sum_{l=1}^{\infty} \sum_{k > 0} \Big\| P_k \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{X^{0,\frac{1}{2}}_\infty}, \end{aligned}$$
which is of the desired form.

It therefore remains to consider the case of large modulations $j > 0$ and frequencies $k \leq 0$. Here we distinguish the cases $l > -k$ and $1 \leq l \leq -k$. For $l > -k$, the projection $P_k Q_j$ approximately preserves the space-time localization of $\chi_l$ and we immediately obtain the schematic estimate
$$\sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l > -k} P_k Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim \sum_{k \leq 0} \sum_{l > -k} \Big\| P_k \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{X^{0,\frac{1}{2}}_\infty(\mathbb{R} \times \mathbb{R}^4)},$$
which is of the desired form. Finally, for $1 \leq l \leq -k$ and large modulations $j > 0$, the projection $Q_j$ approximately preserves the time localization of $\chi_l$. Thus, we obtain for slightly fattened cutoffs $\tilde{\chi}_l$ that
$$\sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l=1}^{-k} P_k Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \simeq \sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l=1}^{-k} P_k \Big( \tilde{\chi}_l Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)}.$$
Since $\chi_l$ and $\tilde{\chi}_l$ approximately live at frequency $2^{-l}$, this is basically a high-high interaction term and we may write schematically
$$\begin{aligned} &\simeq \sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big\| \sum_{l=1}^{-k} P_k \Big( P_{-l} \big( \tilde{\chi}_l \big) P_{-l} \big( Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \big) \Big) \Big\|^2_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \\ &\lesssim \sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big( \sum_{l=1}^{-k} \Big\| P_k \Big( P_{-l} \big( \tilde{\chi}_l \big) P_{-l} \big( Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \big) \Big) \Big\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \Big)^2. \end{aligned}$$
By Bernstein's estimate and Hölder's inequality we then find
$$\begin{aligned} &\lesssim \sum_{k \leq 0} \sup_{j > 0} 2^{j} \Big( \sum_{l=1}^{-k} 2^{2k} \big\| \tilde{\chi}_l \big\|_{L^\infty_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \Big\| P_{-l} Q_j \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|_{L^2_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \Big)^2 \\ &\lesssim \sum_{k \leq 0} \Big( \sum_{l=1}^{-k} 2^{2(k+l)} \Big\| P_{-l} \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|_{X^{0,\frac{1}{2}}_\infty(\mathbb{R} \times \mathbb{R}^4)} \Big)^2, \end{aligned}$$
where in the last line we used that the spatial support of $\tilde{\chi}_l$ is at scale $2^l$. Using Young's inequality, we arrive at the desired bound
$$\lesssim \sum_{l=1}^{\infty} \Big\| P_{-l} \nabla_{t,x} \big( \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \big) \Big\|^2_{X^{0,\frac{1}{2}}_\infty(\mathbb{R} \times \mathbb{R}^4)} \lesssim \sum_{l=1}^{\infty} \Big\| \chi_l e^{-i\gamma^{(l)}_n} \tilde{\phi}^{(l)}_n \Big\|^2_{S^1(\mathbb{R} \times \mathbb{R}^4)}.$$
This finishes the proof of Lemma 5.11. $\square$
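The two appeals to Young's inequality in the proof above are instances of a discrete convolution bound. The following is only a sketch, assuming that the weights in the double sums are $2^{2(k+l)}$; the nonnegative numbers $a_l$ (standing for the norms of the individual pieces) and the kernel $K$ are notation introduced here for illustration.

```latex
% Discrete Young inequality \ell^1 * \ell^2 \subset \ell^2 with the kernel
% K(j) = 2^{-2j} for j >= 0 and K(j) = 0 for j < 0.
% Substitute m = -k >= 1 and extend a_l by zero for l <= 0.
\begin{align*}
\sum_{k \le -1} \Big( \sum_{l=1}^{-k} 2^{2(k+l)} a_l \Big)^2
  &= \sum_{m \ge 1} \Big( \sum_{l=1}^{m} 2^{-2(m-l)} a_l \Big)^2
   = \| K * a \|_{\ell^2}^2 \\
  &\le \| K \|_{\ell^1}^2 \, \| a \|_{\ell^2}^2
   = \Big( \sum_{j \ge 0} 4^{-j} \Big)^2 \sum_{l \ge 1} a_l^2
   = \frac{16}{9} \sum_{l \ge 1} a_l^2 .
\end{align*}
```

The term $k = 0$ contributes nothing because the inner sum is then empty, so the sum over $k \leq 0$ is controlled in the same way.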

It remains to prove Lemma 5.10 and Lemma 5.12. We only give the proof of Lemma 5.12, the other one being similar.

Proof of Lemma 5.12.

We have to prove that for any $\psi \in \tilde{S}^1(\mathbb{R} \times \mathbb{R}^4)$,
$$\| \chi \psi \|_{\tilde{S}^1(\mathbb{R} \times \mathbb{R}^4)} = \Big( \sum_{k \in \mathbb{Z}} \big\| P_k \nabla_{t,x} ( \chi \psi ) \big\|^2_{S^{\mathrm{Str}}_k(\mathbb{R} \times \mathbb{R}^4)} + \big\| P_k \nabla_{t,x} ( \chi \psi ) \big\|^2_{X^{0,\frac{1}{2}}_\infty(\mathbb{R} \times \mathbb{R}^4)} \Big)^{1/2} \leq C D \, \| \psi \|_{\tilde{S}^1(\mathbb{R} \times \mathbb{R}^4)}.$$
To this end we will constantly invoke the assumed space-time bounds (5.28) for the multiplier $\chi$.

We first consider the $S^{\mathrm{Str}}_k$ component of the $\tilde{S}^1$ norm. Here we denote by $(q, r)$ any wave-admissible exponent pair, i.e. satisfying $2 \leq q, r \leq \infty$ and $\frac{1}{q} + \frac{3}{2r} \leq \frac{3}{4}$. For any $k \in \mathbb{Z}$ we have
$$\big\| P_k \nabla_{t,x} ( \chi \psi ) \big\|_{S^{\mathrm{Str}}_k} \leq \big\| P_k \big( ( \nabla_{t,x} \chi ) \psi \big) \big\|_{S^{\mathrm{Str}}_k} + \big\| P_k \big( \chi \nabla_{t,x} \psi \big) \big\|_{S^{\mathrm{Str}}_k} \tag{5.34}$$
and begin with estimating the first term on the right-hand side of (5.34). If $k \leq$

0, we obtain by theBernstein and Sobolev inequalities uniformly for all ( q , r ) that2 ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx . q k k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x k ψ k L ∞ t L x . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x k∇ x ψ k L ∞ t L x . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x k ψ k ˜ S . Here we used that k∇ x ψ k L ∞ t L x . X k ∈ Z k P k ∇ x ψ k L ∞ t L x ! / . X k ∈ Z k P k ∇ t , x ψ k S Strk ! / . k ψ k ˜ S . If k >

0, we have2 ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx . ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0)(cid:0) P > k − C ( ∇ t , x χ ) (cid:1) ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx + ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0)(cid:0) P ≤ k − C ( ∇ t , x χ ) (cid:1) P k + O (1) ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx . q k (cid:13)(cid:13)(cid:13) P > k − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x k ψ k L ∞ t L x + ( q + r − k (cid:13)(cid:13)(cid:13) P ≤ k − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x (cid:13)(cid:13)(cid:13) P k + O (1) ψ (cid:13)(cid:13)(cid:13) L qt L rx . q k − k (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x k∇ x ψ k L ∞ t L x + ( q + r − k (cid:13)(cid:13)(cid:13) P ≤ k − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x − k (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) L qt L rx . − k (cid:13)(cid:13)(cid:13) ∇ x ∂ t χ (cid:13)(cid:13)(cid:13) L qt L x k ψ k ˜ S + (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( q + r − k (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) L qt L rx , where we used the reverse Bernstein inequality (cid:13)(cid:13)(cid:13) P > k − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x . − k (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L qt L x . Square-summing over k ∈ Z yields the desired bound. We continue with the second term on theright hand side of (5.34). If k ≤

0, we use Bernstein’s inequality to bound2 ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0) χ ∇ t , x ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx . q k k (cid:13)(cid:13)(cid:13) χ (cid:13)(cid:13)(cid:13) L qt L x k∇ t , x ψ k L ∞ t L x . k (cid:13)(cid:13)(cid:13) χ (cid:13)(cid:13)(cid:13) L qt L x k ψ k ˜ S . For k > ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0) χ ∇ t , x ψ (cid:1)(cid:13)(cid:13)(cid:13) L qt L rx . ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0)(cid:0) P > k − C χ (cid:1) ∇ t , x ψ (cid:13)(cid:13)(cid:13) L qt L rx + ( q + r − k (cid:13)(cid:13)(cid:13) P k (cid:0)(cid:0) P ≤ k − C χ (cid:1) P k + O (1) ∇ t , x ψ (cid:13)(cid:13)(cid:13) L qt L rx . q k (cid:13)(cid:13)(cid:13) P > k − C χ (cid:13)(cid:13)(cid:13) L qt L ∞ x k∇ t , x ψ k L ∞ t L x + ( q + r − k (cid:13)(cid:13)(cid:13) P ≤ k − C χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x (cid:13)(cid:13)(cid:13) P k + O (1) ∇ t , x ψ (cid:13)(cid:13)(cid:13) L qt L rx . − k (cid:13)(cid:13)(cid:13) ∇ x χ (cid:13)(cid:13)(cid:13) L qt L ∞ x k ψ k ˜ S + (cid:13)(cid:13)(cid:13) χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( q + r − k (cid:13)(cid:13)(cid:13) P k + O (1) ∇ t , x ψ (cid:13)(cid:13)(cid:13) L qt L rx . The desired bound again follows after square-summing over k ∈ Z .Next we consider the X , ∞ component of the ˜ S norm. For any k ∈ Z we have(5.35) (cid:13)(cid:13)(cid:13) P k ∇ t , x (cid:0) χψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ ≤ (cid:13)(cid:13)(cid:13) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ + (cid:13)(cid:13)(cid:13) P k (cid:0) χ ∇ t , x ψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ . We start with the ﬁrst term on the right hand side of (5.35). If k ≤

0, we split into a small and alarge modulation term(5.36) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1) = P k Q ≤ (cid:0) ( ∇ t , x χ ) ψ (cid:1) + P k Q > (cid:0) ( ∇ t , x χ ) ψ (cid:1) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 51

We easily estimate the small modulation term using Bernstein’s inequality, (cid:13)(cid:13)(cid:13) P k Q ≤ (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ . (cid:13)(cid:13)(cid:13) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k L ∞ t L x . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S . To estimate the large modulation term we consider for any j > j (cid:13)(cid:13)(cid:13) P k Q j (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P > j − C ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x + j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q > j − C ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x + j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q ≤ j − C ( ∇ t , x χ ) Q j + O (1) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . (5.37)We bound the ﬁrst term using the reverse Bernstein inequality,2 j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P > j − C ( ∇ t , x χ ) (cid:1) ψ (cid:13)(cid:13)(cid:13) L t L x . k j (cid:13)(cid:13)(cid:13)(cid:0) P > j − C ( ∇ t , x χ ) (cid:1) ψ (cid:13)(cid:13)(cid:13) L t L x . k − j (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S . For the second term on the right hand side of (5.37) we obtain from a reverse Bernstein estimate intime that 2 j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q > j − C ( ∇ t , x χ ) (cid:1) ψ (cid:13)(cid:13)(cid:13) L t L x . k − j (cid:13)(cid:13)(cid:13) ∇ t , x ∂ t χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S . The third term on the right hand side of (5.37) can be estimated via a Littlewood-Paley trichotomy2 j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q ≤ j − C ( ∇ t , x χ ) (cid:1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x . 
j − C X l = k + C j (cid:13)(cid:13)(cid:13)(cid:0) P l Q ≤ j − C ( ∇ t , x χ )) (cid:1) P l + O (1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x + j (cid:13)(cid:13)(cid:13)(cid:0) P k + O (1) Q ≤ j − C ( ∇ t , x χ ) (cid:1) P ≤ k + O (1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x + j (cid:13)(cid:13)(cid:13)(cid:0) P ≤ k − C Q ≤ j − C ( ∇ t , x χ ) (cid:1) P k + O (1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x . (5.38)We bound the high-high case by j − C X l = k + C k j (cid:13)(cid:13)(cid:13) P l Q ≤ j − C ( ∇ t , x χ ) (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P l + O (1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x . k j − C X l = k + C (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x − l (cid:13)(cid:13)(cid:13) P l + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) X , ∞ . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x k ψ k ˜ S . The high-low case is estimated by X l ≤ k + O (1) j (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ j − C ( ∇ t , x χ ) (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P l Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L ∞ x . X l ≤ k + O (1) (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x l (cid:13)(cid:13)(cid:13) P l ∇ x ψ (cid:13)(cid:13)(cid:13) X , ∞ . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x k ψ k ˜ S and the low-high case by2 k j (cid:13)(cid:13)(cid:13) P ≤ k − C Q ≤ j − C ( ∇ t , x χ ) (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P k + O (1) Q j + O (1) ψ (cid:13)(cid:13)(cid:13) L t L x . k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) X , ∞ . Thus, we obtain the following estimate for the large modulation term in (5.36), (cid:13)(cid:13)(cid:13) P k Q > (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ . 
k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S + k (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x k ψ k ˜ S + (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) X , ∞ . If k > P k (cid:0) ( ∇ t , x χ ) ψ (cid:1) = P k Q ≤ k (cid:0) ( ∇ t , x χ ) ψ (cid:1) + P k Q > k (cid:0) ( ∇ t , x χ ) ψ (cid:1) . We can immediately dispose of the small modulation term as follows (cid:13)(cid:13)(cid:13) P k Q ≤ k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) X , ∞ . k (cid:13)(cid:13)(cid:13) P k (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . − k (cid:16)(cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k L ∞ t L x + (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L ∞ x k∇ x ψ k L ∞ t L x (cid:17) . − k (cid:16)(cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x + (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) k ψ k ˜ S . To treat the large modulation term, we ﬁnd that for any j > k ,2 j (cid:13)(cid:13)(cid:13) P k Q j (cid:0) ( ∇ t , x χ ) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P > j − C ( ∇ t , x χ ) (cid:1) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x + j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q > j − C ( ∇ t , x χ ) (cid:1) ψ (cid:1) k L t L x + j (cid:13)(cid:13)(cid:13) P k Q j (cid:0)(cid:0) P ≤ j − C Q ≤ j − C ( ∇ t , x χ ) (cid:1) Q j + O (1) ψ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . (5.40)We estimate the ﬁrst term by2 j (cid:13)(cid:13)(cid:13) P > j − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k L ∞ t L x . − j (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k∇ x ψ k L ∞ t L x . − k (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S . 
The second term on the right hand side of (5.40) is bounded by2 j (cid:13)(cid:13)(cid:13) P ≤ j − C Q > j − C ∇ t , x χ (cid:13)(cid:13)(cid:13) L t L x k ψ k L ∞ t L x . − j (cid:13)(cid:13)(cid:13) ∇ t , x ∂ t χ (cid:13)(cid:13)(cid:13) L t L x k∇ x ψ k L ∞ t L x . − k (cid:13)(cid:13)(cid:13) ∇ t , x ∂ t χ (cid:13)(cid:13)(cid:13) L t L x k ψ k ˜ S . Using a Littlewood-Paley trichotomy we obtain the following estimate of the third term in (5.40)2 − k (cid:16)(cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x + (cid:13)(cid:13)(cid:13) ∇ x ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:17) k ψ k ˜ S + (cid:13)(cid:13)(cid:13) ∇ t , x χ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ψ (cid:13)(cid:13)(cid:13) X , ∞ . Finally, square-summing over k ∈ Z gives the desired bound for the ﬁrst term on the right hand sideof (5.35). The second term on the right hand side of (5.35) can be handled similarly. (cid:3) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 53
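The reverse Bernstein inequality invoked repeatedly in the proof above can be recalled as follows. This is a standard sketch rather than a statement taken from the text: the Lebesgue exponent $p$ and the convolution kernels $K^{(j)}_m$ are notation introduced here for illustration, and the implicit constants are allowed to depend on $C$.

```latex
% Reverse Bernstein in the spatial variable on R^4, for 1 <= p <= infinity.
% Assumption of this sketch: K^{(j)}_m has Fourier symbol
%   2^{m} (-i \xi_j) |\xi|^{-2} \tilde\varphi(2^{-m} \xi),
% where \tilde\varphi is a smooth bump equal to 1 on the frequency support
% of P_m. By scaling, \| K^{(j)}_m \|_{L^1_x} is bounded uniformly in m.
\begin{align*}
P_m f &= 2^{-m} \sum_{j=1}^{4} K^{(j)}_m * \partial_j P_m f,
\qquad\text{hence}\qquad
\| P_m f \|_{L^p_x} \lesssim 2^{-m} \| \nabla_x f \|_{L^p_x}, \\
\| P_{>k-C} f \|_{L^p_x}
  &\le \sum_{m > k-C} \| P_m f \|_{L^p_x}
   \lesssim \sum_{m > k-C} 2^{-m} \| \nabla_x f \|_{L^p_x}
   \lesssim 2^{-k} \| \nabla_x f \|_{L^p_x}.
\end{align*}
```

The same argument in the time variable, with the modulation cutoff $Q_{>j-C}$ in place of $P_{>k-C}$ and $\partial_t$ in place of $\nabla_x$, gives a gain of $2^{-j}$; mixed norms $L^q_t L^p_x$ are handled by applying the bound at each fixed time.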

Localizing in physical space.

In this subsection we consider Coulomb data ( A , φ )[0] ∈ ˙ H sx × ˙ H s − x for all s ≥

1. We show that there exists T > C ∞ solution of the same regularity class oneach time slice of the space-time slab [ − T , T ] × R satisfying the required S norm bound k ( A , φ ) k S ([ − T , T ] × R ) < ∞ . To this end we ﬁx a large R >

1. For each R ≥ R , we consider a cuto ﬀ χ R ∈ C ∞ c ( R ) thatequals 1 on the ball B R (0) and has support in a dilate of B R (0). In the previous Subsection 5.1 wedemonstrated that upon writing ˜ A R : = χ R A − ∇ x γ R for the spatial components of a new connection form ˜ A R , where γ R = ∆ − ∂ j (cid:0) χ R A j (cid:1) , there is a way to pick the remaining data ∂ t ˜ A R (0) and ˜ φ R [0], so that the corresponding data are allof class H + x × L + x and of Coulomb class. Thus, we obtain local solutions with these data fromthe local well-posedness result [36]. We can also arrange that ˜ φ R [0] is supported within the ball ofradius R centered at the origin. It is then also easy to verify that( ˜ A R , ˜ φ R )[0] → ( A , φ )[0] as R → ∞ with respect to the ˙ H sx × ˙ H s − x topology for any s ≥

1. Moreover, the argument in the previoussubsection together with Proposition 4.3 implies that these solutions extend of class H sx × H s − x to aspace-time slab [ − T , T ] × R , where T > R ≥ R . It then remains to check thatthe corresponding local solutions on [ − T , T ] × R , call them again ( ˜ A R , ˜ φ R ), converge with respectto the S norm. This will essentially follow from the perturbation theory developed later on in thekey Step 3 of the proof of Proposition 7.4. The following proposition can be proved. Proposition 5.13.

The sequence $\big\{ ( \tilde{A}_R, \tilde{\phi}_R ) \big\}_{R \geq R_0}$ converges in $S^1([-T, T] \times \mathbb{R}^4)$ as $R \to \infty$. The limit is also of class $\dot{H}^s_x \times \dot{H}^{s-1}_x$ for all $s \geq 1$ on each time slice of $[-T, T] \times \mathbb{R}^4$, hence of class $C^\infty$, and a smooth solution to the MKG-CG system on $[-T, T] \times \mathbb{R}^4$ with initial data $(A, \phi)[0]$.

Proof. A sketch of the proof is given in Subsection 7.4. $\square$

6. How to arrive at the minimal energy blowup solution

In this section we address another delicate issue arising from the difficulties with the perturbation theory for the MKG-CG system. Assume that $(A^n, \phi^n)$ is an “essentially singular sequence” of admissible solutions to the MKG-CG system that converges at time $t = 0$ to data $(A, \phi)[0]$ with $E(A, \phi) = E_{crit}$,
$$\lim_{n \to \infty} (A^n, \phi^n)[0] = (A, \phi)[0].$$
Using the concept of MKG-CG evolution for energy class data from Definition 5.3, we obtain an energy class solution $(A, \phi)$ with maximal lifespan $I$. We then want to infer that
$$\sup_{J \subset I, \, J \text{ closed}} \| (A, \phi) \|_{S^1(J \times \mathbb{R}^4)} = \infty, \tag{6.1}$$
while by construction it holds that $E(A, \phi) = E_{crit}$. In view of Lemma 5.4, it suffices to consider the case $I = \mathbb{R}$. The problem here is that while we have
$$\lim_{n \to \infty} \| (A^n, \phi^n) \|_{S^1(I_n \times \mathbb{R}^4)} = \infty,$$
where $I_n$ are suitably chosen time intervals, we cannot use an immediate perturbative argument to obtain (6.1), as is possible for wave maps in [20]. The reason is that the $(A^n, \phi^n)$ may have non-negligible low-frequency components. Nevertheless, we obtain the following result.

Proposition 6.1.

Let $(A^n, \phi^n)$ be an essentially singular sequence of admissible solutions to the MKG-CG system. Assume that
$$\lim_{n \to \infty} (A^n, \phi^n)[0] = (A, \phi)[0]$$
in the energy topology for some Coulomb energy class data pair $(A, \phi)[0]$. Let $I$ be the maximal lifespan of the MKG-CG evolution $(A, \phi)$ of this data pair given by Definition 5.3. Then it holds that
$$\sup_{J \subset I, \, J \text{ closed}} \| (A, \phi) \|_{S^1(J \times \mathbb{R}^4)} = \infty.$$

Proof.

A sketch of the proof can be found in Subsection 7.4. $\square$

We shall later on need certain variations of the preceding proposition.

Corollary 6.2.

Let $\{ (A^n, \phi^n)[0] \}_{n \in \mathbb{N}}$ and $(A, \phi)[0]$ be Coulomb energy class data such that
$$\lim_{n \to \infty} (A^n, \phi^n)[0] = (A, \phi)[0]$$
in the energy topology and let $I$ be the maximal lifespan of the MKG-CG evolution of $(A, \phi)[0]$. If $J \subset I$ is a compact time interval, then it holds that
$$\limsup_{n \to \infty} \| (A^n, \phi^n) \|_{S^1(J \times \mathbb{R}^4)} < \infty.$$

This entails the following important corollary.

Corollary 6.3.

Let $\{ (A^n, \phi^n)[0] \}_{n \in \mathbb{N}} \subset \dot{H}^1_x \times L^2_x$ be a compact subset of Coulomb energy class data. Then there exists an open interval $I_*$ centered at $t = 0$ with the property that
$$I_* \subset I_n$$
for all $n \in \mathbb{N}$, where $I_n$ denotes the maximal lifespan of the MKG-CG evolution of $(A^n, \phi^n)[0]$ given by Definition 5.3.

Proof. We argue by contradiction. Assume that there exists a subsequence $\{ (A^{n_k}, \phi^{n_k})[0] \}_{k \in \mathbb{N}}$ for which at least one of the lifespan endpoints of the associated MKG-CG evolutions converges to $t = 0$. Passing to a further subsequence, which we again denote by $\{ (A^{n_k}, \phi^{n_k})[0] \}_{k \in \mathbb{N}}$, we may assume that
$$\lim_{k \to \infty} (A^{n_k}, \phi^{n_k})[0] = (A, \phi)[0]$$
in the energy topology for some Coulomb energy class data $(A, \phi)[0]$. The contradiction now follows from Corollary 6.2. $\square$

7. Concentration compactness step

7.1. General considerations.

We begin by sorting out the relationship between the conserved energy and the $\dot{H}^1_x \times L^2_x$-norm of solutions $(A, \phi)$ to the MKG-CG system. Recall that the conserved energy is given by the expression
$$E(A, \phi) = \frac{1}{4} \sum_{\alpha, \beta} \int_{\mathbb{R}^4} ( \partial_\alpha A_\beta - \partial_\beta A_\alpha )^2 \, dx + \frac{1}{2} \sum_{\alpha} \int_{\mathbb{R}^4} | \partial_\alpha \phi + i A_\alpha \phi |^2 \, dx.$$
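The Coulomb-gauge form of the energy used next rests on two integrations by parts. The following is a sketch for sufficiently regular and decaying $A$, assuming the Coulomb condition $\sum_j \partial_j A_j = 0$; it does not depend on the normalizing constants in front of the two parts of the energy.

```latex
% Sketch: the cross terms in the curvature vanish under the Coulomb gauge
% \sum_j \partial_j A_j = 0 (A smooth and decaying, so boundary terms drop).
\begin{align*}
\sum_{i,j} \int_{\mathbb{R}^4} (\partial_i A_j - \partial_j A_i)^2 \, dx
  &= 2 \sum_{i,j} \int_{\mathbb{R}^4} (\partial_i A_j)^2 \, dx
     - 2 \sum_{i,j} \int_{\mathbb{R}^4} \partial_i A_j \, \partial_j A_i \, dx \\
  &= 2 \sum_{i,j} \int_{\mathbb{R}^4} (\partial_i A_j)^2 \, dx
     - 2 \int_{\mathbb{R}^4} \Big( \sum_j \partial_j A_j \Big)
                             \Big( \sum_i \partial_i A_i \Big) \, dx \\
  &= 2 \sum_{i,j} \int_{\mathbb{R}^4} (\partial_i A_j)^2 \, dx, \\
\sum_i \int_{\mathbb{R}^4} (\partial_t A_i - \partial_i A_0)^2 \, dx
  &= \sum_i \int_{\mathbb{R}^4} (\partial_t A_i)^2 + (\partial_i A_0)^2 \, dx
     + 2 \int_{\mathbb{R}^4} A_0 \, \partial_t \Big( \sum_i \partial_i A_i \Big) \, dx \\
  &= \sum_i \int_{\mathbb{R}^4} (\partial_t A_i)^2 + (\partial_i A_0)^2 \, dx.
\end{align*}
```

In the first computation the mixed term is integrated by parts twice, moving $\partial_i$ onto $A_i$ and $\partial_j$ onto $A_j$; in the second, $\partial_i$ is moved from $A_0$ onto $\partial_t A_i$.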

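Using the Coulomb gauge condition, this can be written as
$$E(A, \phi) = \frac{1}{2} \sum_{i, j} \int_{\mathbb{R}^4} ( \partial_i A_j )^2 \, dx + \frac{1}{2} \sum_{i} \int_{\mathbb{R}^4} ( \partial_t A_i )^2 + ( \partial_i A_0 )^2 \, dx + \frac{1}{2} \sum_{\alpha} \int_{\mathbb{R}^4} | \partial_\alpha \phi + i A_\alpha \phi |^2 \, dx,$$
which immediately implies
$$E(A, \phi) \lesssim \sum_{i} \| \nabla_{t,x} A_i \|^2_{L^2_x} + \| \nabla_x A_0 \|^2_{L^2_x} + \| \nabla_{t,x} \phi \|^2_{L^2_x} + \sum_{\alpha} \| \nabla_x A_\alpha \|^2_{L^2_x} \| \nabla_x \phi \|^2_{L^2_x}.$$
Conversely, in order to exploit the conserved energy, we also need to show that the expression
$$\sum_{\alpha} \| \nabla_{t,x} A_\alpha \|^2_{L^2_x} + \| \nabla_{t,x} \phi \|^2_{L^2_x}$$
is bounded in terms of $E(A, \phi)$. Here the only issue comes from bounding the terms $\| \nabla_{t,x} \phi \|_{L^2_x}$ and $\| \partial_t A_0 \|_{L^2_x}$. However, the diamagnetic inequality gives the pointwise estimate
$$\big| \partial_\alpha | \phi | \big| \leq | ( \partial_\alpha + i A_\alpha ) \phi |$$
and Sobolev's inequality then yields
$$\| \phi \|_{L^4_x} \lesssim \| \nabla_x | \phi | \|_{L^2_x} \lesssim \sum_{j} \| ( \partial_j + i A_j ) \phi \|_{L^2_x}.$$
Thus, we find
$$\| \partial_\alpha \phi \|_{L^2_x} \leq \| ( \partial_\alpha + i A_\alpha ) \phi \|_{L^2_x} + \| A_\alpha \phi \|_{L^2_x} \lesssim \| ( \partial_\alpha + i A_\alpha ) \phi \|_{L^2_x} + \| A_\alpha \|^2_{L^4_x} + \| \phi \|^2_{L^4_x} \lesssim E(A, \phi)^{\frac{1}{2}} + E(A, \phi).$$
In order to bound the time derivative $\| \partial_t A_0 \|_{L^2_x}$, we use the compatibility relation
$$\Delta \partial_t A_0 = - \sum_{j} \partial_j \operatorname{Im} \big( \phi \, \overline{D_j \phi} \big)$$
to obtain
$$\| \partial_t A_0 \|_{L^2_x} \lesssim \| \nabla_x \partial_t A_0 \|_{L^{4/3}_x} \lesssim \sum_{j} \| \phi \, \overline{D_j \phi} \|_{L^{4/3}_x} \lesssim \sum_{j} \| \phi \|_{L^4_x} \| D_j \phi \|_{L^2_x} \lesssim E(A, \phi).$$
We also recall that the notation $(A, \phi)[0]$ for initial data for the MKG-CG system only refers to the prescribed data $A_j[0]$, $j = 1, \ldots, 4$, for the evolution of the spatial components of the connection form $A$. The component $A_0$ is determined via the compatibility relations.

7.2. Setting up the induction on frequency scales.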

Our final goal will be to show the following.

Let $(A, \phi)[0]$ be admissible Coulomb data. Then the corresponding MKG-CG evolution exists globally in time and, denoting its energy by
$$E = \frac{1}{4} \sum_{\alpha, \beta} \int_{\mathbb{R}^4} ( \partial_\alpha A_\beta - \partial_\beta A_\alpha )^2 \, dx + \frac{1}{2} \sum_{\alpha} \int_{\mathbb{R}^4} | \partial_\alpha \phi + i A_\alpha \phi |^2 \, dx,$$
there exists an increasing function $K : \mathbb{R}^+ \to \mathbb{R}^+$ such that
$$\| (A, \phi) \|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \leq K(E).$$

To prove this result we proceed by contradiction. By the small data global well-posedness result [22] we know that the assertion holds for sufficiently small energies. So assume that it does not hold for all energies $E > 0$. Then the set of exceptional energies has a positive infimum, which we denote by $E_{crit}$, and we can find a sequence of admissible data $\{ (A^n, \phi^n)[0] \}_{n \in \mathbb{N}}$ with evolutions $\{ (A^n, \phi^n) \}_{n \in \mathbb{N}}$ defined on $(-T^n, T^n) \times \mathbb{R}^4$ such that
$$\lim_{n \to \infty} E(A^n, \phi^n) = E_{crit}, \qquad \lim_{n \to \infty} \| (A^n, \phi^n) \|_{S^1((-T^n, T^n) \times \mathbb{R}^4)} = +\infty.$$
We call such a sequence of initial data essentially singular.

We now implement a two step Bahouri-Gérard type procedure. The first step consists in selecting frequency atoms. Here we largely follow the setup of Subsection 9.1 and Subsection 9.2 in [20], which in turn is partially based on Section III.1 of [1]. We recall the following terminology from [1]. A scale is a sequence of positive numbers $\{ \lambda^n \}_{n \in \mathbb{N}}$. We say that two scales $\{ \lambda^{na} \}_{n \in \mathbb{N}}$ and $\{ \lambda^{nb} \}_{n \in \mathbb{N}}$ are orthogonal if
$$\lim_{n \to \infty} \frac{\lambda^{na}}{\lambda^{nb}} + \frac{\lambda^{nb}}{\lambda^{na}} = +\infty.$$
Let $\{ \lambda^n \}_{n \in \mathbb{N}}$ be a scale and let $\{ (f^n, g^n) \}_{n \in \mathbb{N}}$ be a bounded sequence of functions in $\dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$. Then we say that $\{ (f^n, g^n) \}_{n \in \mathbb{N}}$ is $\lambda^n$-oscillatory if
$$\lim_{R \to +\infty} \limsup_{n \to \infty} \bigg( \int_{\{ \lambda^n |\xi| \leq \frac{1}{R} \}} \big| \widehat{\nabla_x f^n}(\xi) \big|^2 + \big| \widehat{g^n}(\xi) \big|^2 \, d\xi + \int_{\{ \lambda^n |\xi| \geq R \}} \big| \widehat{\nabla_x f^n}(\xi) \big|^2 + \big| \widehat{g^n}(\xi) \big|^2 \, d\xi \bigg) = 0,$$
and that $\{ (f^n, g^n) \}_{n \in \mathbb{N}}$ is $\lambda^n$-singular if for all $b > a > 0$,
$$\lim_{n \to \infty} \int_{\{ a \leq \lambda^n |\xi| \leq b \}} \big| \widehat{\nabla_x f^n}(\xi) \big|^2 + \big| \widehat{g^n}(\xi) \big|^2 \, d\xi = 0.$$
We obtain the following decomposition of the essentially singular sequence of data $\{ (A^n, \phi^n)[0] \}_{n \in \mathbb{N}}$ into frequency atoms.

Proposition 7.1.

Let $\{ (A^n, \phi^n)[0] \}_{n \in \mathbb{N}}$ be a sequence of admissible data with energy bounded by $E$. Up to passing to a subsequence the following holds. Given $\delta > 0$, there exists an integer $\Lambda = \Lambda(\delta, E) > 0$ and for every $n \in \mathbb{N}$ a decomposition
$$A^n[0] = \sum_{a=1}^{\Lambda} A^{na}[0] + A^n_\Lambda[0], \qquad \phi^n[0] = \sum_{a=1}^{\Lambda} \phi^{na}[0] + \phi^n_\Lambda[0].$$
For $a = 1, \ldots, \Lambda$, the frequency atoms $(A^{na}, \phi^{na})[0]$ are $\lambda^{na}$-oscillatory for a family of pairwise orthogonal frequency scales $\{ \lambda^{na} \}_{n \in \mathbb{N}}$. The error $(A^n_\Lambda, \phi^n_\Lambda)[0]$ is $\lambda^{na}$-singular for every $1 \leq a \leq \Lambda$ and satisfies the smallness condition
$$\limsup_{n \to \infty} \big\| A^n_\Lambda[0] \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} < \delta, \qquad \limsup_{n \to \infty} \big\| \phi^n_\Lambda[0] \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} < \delta.$$
Moreover, for $a = 1, \ldots, \Lambda$, the frequency atoms $(A^{na}, \phi^{na})[0]$ have sharp frequency support in the frequency intervals $|\xi| \in [ (\lambda^{na})^{-1} R_n^{-1}, (\lambda^{na})^{-1} R_n ]$ for a suitable sequence $R_n \to +\infty$. For different values of $a$, these frequency intervals $[ (\lambda^{na})^{-1} R_n^{-1}, (\lambda^{na})^{-1} R_n ]$ are mutually disjoint for sufficiently large $n$. Finally, we have asymptotic decoupling of the energy
$$E(A^n, \phi^n) = \sum_{a=1}^{\Lambda} E(A^{na}, \phi^{na}) + E(A^n_\Lambda, \phi^n_\Lambda) + o(1) \quad \text{as } n \to \infty,$$
where the temporal components $A^{na}_0$ are determined via the compatibility relation
$$\big( \Delta - | \phi^{na} |^2 \big) A^{na}_0 = - \operatorname{Im} \big( \phi^{na} \, \overline{\partial_t \phi^{na}} \big),$$
and similarly for $A^n_{\Lambda, 0}$.

Proof. We suppress the notation $[0]$ in the proof. As in Section III.1 of [1], we obtain a decomposition of the data $\{ (A^n, \phi^n) \}_{n \in \mathbb{N}}$ into frequency atoms
$$A^n = \sum_{a=1}^{\Lambda} \tilde{A}^{na} + \tilde{A}^n_\Lambda, \qquad \phi^n = \sum_{a=1}^{\Lambda} \tilde{\phi}^{na} + \tilde{\phi}^n_\Lambda,$$
where $(\tilde{A}^{na}, \tilde{\phi}^{na})$ are $\lambda^{na}$-oscillatory for a family of pairwise orthogonal scales $\{ \lambda^{na} \}_{n \in \mathbb{N}}$ for $a = 1, \ldots, \Lambda$. The error $(\tilde{A}^n_\Lambda, \tilde{\phi}^n_\Lambda)$ is $\lambda^{na}$-singular for $a = 1, \ldots, \Lambda$ and satisfies the smallness condition
$$\limsup_{n \to \infty} \big\| \tilde{A}^n_\Lambda \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} < \delta, \qquad \limsup_{n \to \infty} \big\| \tilde{\phi}^n_\Lambda \big\|_{\dot{B}^1_{2,\infty} \times \dot{B}^0_{2,\infty}} < \delta.$$
In order to get a clean separation of the frequency atoms in frequency space, we have to prepare them a bit more, because their decay away from the scale $(\lambda^{na})^{-1}$ might be arbitrarily slow. To this end let $R_n \to \infty$ be a sequence growing sufficiently slowly such that the intervals $[ (\lambda^{na})^{-1} R_n^{-1}, (\lambda^{na})^{-1} R_n ]$ are mutually disjoint for $n$ large enough and for different values of $a$. Then we replace the error $\tilde{A}^n_\Lambda$ by
$$A^n_\Lambda = P_{\cap_{a'=1}^{\Lambda} [ \mu^{na'} - \log R_n, \, \mu^{na'} + \log R_n ]^c} \tilde{A}^n_\Lambda + \sum_{a=1}^{\Lambda} P_{\cap_{a'=1}^{\Lambda} [ \mu^{na'} - \log R_n, \, \mu^{na'} + \log R_n ]^c} \tilde{A}^{na},$$
where $\mu^{na} = - \log \lambda^{na}$, and the frequency atoms $\tilde{A}^{na}$ by
$$A^{na} = P_{[ \mu^{na} - \log R_n, \, \mu^{na} + \log R_n ]} \tilde{A}^n_\Lambda + P_{[ \mu^{na} - \log R_n, \, \mu^{na} + \log R_n ]} \sum_{a'=1}^{\Lambda} \tilde{A}^{na'}$$
for $a = 1, \ldots, \Lambda$. In order to remove the dependence on $\Lambda$ in the new profiles, we may replace $\Lambda$ by $\Lambda_n$ with $\Lambda_n \to \infty$ sufficiently slowly as $n \to \infty$. Analogously, we define $\phi^{na}$ and $\phi^n_\Lambda$. This new decomposition
$$A^n = \sum_{a=1}^{\Lambda} A^{na} + A^n_\Lambda, \qquad \phi^n = \sum_{a=1}^{\Lambda} \phi^{na} + \phi^n_\Lambda, \tag{7.1}$$
has the same properties as the original one, except that we have now arranged for a sharp separation of the frequency supports of the frequency atoms.

Finally, we turn to the asymptotic decoupling of the energy. Here we recall that the “elliptic components” $A^{na}_0$ associated with a frequency atom $(A^{na}, \phi^{na})$ are determined via the elliptic compatibility equations. It therefore suffices to show that the decomposition (7.1) (which only refers to the spatial components of the connection form $A^n$) implies a similar frequency atom decomposition
$$A^n_0 = \sum_{a=1}^{\Lambda} A^{na}_0 + A^n_{\Lambda, 0} + o_{\dot{H}^1_x}(1) \quad \text{as } n \to \infty, \tag{7.2}$$
where $A^{na}_0$ is $\lambda^{na}$-oscillatory and $A^n_{\Lambda, 0}$ is $\lambda^{na}$-singular for each $a = 1, \ldots, \Lambda$.
Then the decoupling ofthe energy is an immediate consequence of the construction of the frequency atoms. For example,we have the limiting relations lim n →∞ Z R ∂ α φ na A na ′ α φ na ′′ dx = , if not all of a , a ′ , a ′′ are equal, as well aslim n →∞ Z R A na α φ na ′ A na ′′ α φ na ′′′ dx = , if not all of a , a ′ , a ′′ , a ′′′ are equal. It remains to prove the decomposition (7.2). To show this, weﬁrst observe that (at ﬁxed time t = − Im (cid:0) φ n ∂ t φ n (cid:1) = Λ X a = − Im (cid:0) φ na ∂ t φ na (cid:1) − Im (cid:0) φ n Λ ∂ t φ n Λ (cid:1) + o L x (1) as n → ∞ . It is then enough to show that (cid:0) ∆ − | φ n | (cid:1) A na = − Im (cid:0) φ na ∂ t φ na (cid:1) + o L x (1) as n → ∞ , (cid:0) ∆ − | φ n | (cid:1) A n Λ , = − Im (cid:0) φ n Λ , ∂ t φ n Λ , (cid:1) + o L x (1) as n → ∞ . This in turn will easily follow once we have shown that each A na is λ na -oscillatory, while A n Λ , is λ na -singular for a = , . . . , Λ . We demonstrate this for a =

1, where we may assume by scalinginvariance of these assertions that λ n = ∆ A n − | φ n | A n = − Im (cid:0) φ n ∂ t φ n (cid:1) and distinguish between small and large frequencies.We begin with the small frequencies. For R ≪ − ∆ P ≤ R A n − P ≤ R (cid:0) | φ n | A n (cid:1) = − P ≤ R (cid:0) Im (cid:0) φ n ∂ t φ n (cid:1)(cid:1) , where we have lim R →−∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) Im (cid:0) φ n ∂ t φ n (cid:1)(cid:1)(cid:13)(cid:13)(cid:13) L x = . Next, we split P ≤ R (cid:0) | φ n | A n (cid:1) = P ≤ R (cid:0) P ≤ R (cid:0) | φ n | (cid:1) A n (cid:1) + P ≤ R (cid:0) P > R ( | φ n | ) A n (cid:1) . Then we have lim R →−∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R | φ n | (cid:13)(cid:13)(cid:13) L x = , whence lim R →−∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) P ≤ R (cid:0) | φ n | ) A n (cid:1)(cid:13)(cid:13)(cid:13) L x = , while for the second term above, we obtain from Bernstein’s inequality that (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) P > R (cid:0) | φ n | (cid:1) A n (cid:1)(cid:13)(cid:13)(cid:13) L x ≤ X k = k + O (1) , k > R (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) P k ( | φ n | ) P k A n (cid:1)(cid:13)(cid:13)(cid:13) L x . R X k = k + O (1) , k > R (cid:13)(cid:13)(cid:13) P k | φ n | (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) P k A n (cid:13)(cid:13)(cid:13) L x . R (cid:13)(cid:13)(cid:13) φ n (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) A n (cid:13)(cid:13)(cid:13) ˙ H x . We immediately conclude thatlim R →−∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) P > R ( | φ n | ) A n (cid:1)(cid:13)(cid:13)(cid:13) L x = . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 59

Using Sobolev’s inequality, it then follows thatlim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R A n (cid:13)(cid:13)(cid:13) ˙ H x . lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) Im (cid:0) φ n ∂ t φ n (cid:1)(cid:1)(cid:13)(cid:13)(cid:13) L x + lim sup n →∞ (cid:13)(cid:13)(cid:13) P ≤ R (cid:0) | φ n | A n (cid:1)(cid:13)(cid:13)(cid:13) L x → R → −∞ . Next, we consider the large frequencies. For R ≫

1, we write ∆ P > R A n − P > R (cid:0) | φ n | A n (cid:1) = − P > R (cid:0) Im (cid:0) φ n ∂ t φ n (cid:1)(cid:1) , where we have lim R →∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P > R (cid:0) Im (cid:0) φ n ∂ t φ n (cid:1)(cid:1)(cid:13)(cid:13)(cid:13) L = . Then we split P > R (cid:0) | φ n | A n (cid:1) = P > R (cid:0) P > R ( | φ n | ) A n (cid:1) + P > R (cid:0) P ≤ R ( | φ n | ) A n (cid:1) . By frequency localization we havelim R →∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P > R ( | φ n | ) (cid:13)(cid:13)(cid:13) L x = , and thus, lim R →∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P > R (cid:0) P > R ( | φ n | ) A n (cid:1)(cid:13)(cid:13)(cid:13) L x = . On the other hand, we have (cid:13)(cid:13)(cid:13) P > R (cid:0) P ≤ R ( | φ n | ) A n (cid:1)(cid:13)(cid:13)(cid:13) L x . X k = k + O (1) , k > R (cid:13)(cid:13)(cid:13) P k (cid:0) P ≤ R ( | φ n | ) P k A n (cid:1)(cid:13)(cid:13)(cid:13) L x ≤ X k = k + O (1) , k > R (cid:13)(cid:13)(cid:13) P ≤ R ( | φ n | ) (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) P k A n (cid:13)(cid:13)(cid:13) L x . X k > R R − k (cid:13)(cid:13)(cid:13) φ n (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) P k A n (cid:13)(cid:13)(cid:13) ˙ H x . − R (cid:13)(cid:13)(cid:13) φ n (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) A n (cid:13)(cid:13)(cid:13) ˙ H x and hence, lim R →∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P > R (cid:0) P ≤ R ( | φ n | ) A n (cid:1)(cid:13)(cid:13)(cid:13) L x = . We then conclude from Sobolev’s inequality thatlim R →∞ lim sup n →∞ (cid:13)(cid:13)(cid:13) P > R A n (cid:13)(cid:13)(cid:13) ˙ H x = . 
□

Given an essentially singular sequence of initial data $\{(A^n, \phi^n)[0]\}_{n \in \mathbb{N}}$, Proposition 7.1 provides, for any $\delta > 0$, a decomposition of the form
$$A^n[0] = \sum_{a=1}^{\Lambda} A^{na}[0] + A^n_{\Lambda}[0], \qquad \phi^n[0] = \sum_{a=1}^{\Lambda} \phi^{na}[0] + \phi^n_{\Lambda}[0] \tag{7.3}$$
with
$$\limsup_{n \to \infty} \bigl\| A^n_{\Lambda}[0] \bigr\|_{\dot{B}^{1}_{2,\infty} \times \dot{B}^{0}_{2,\infty}} < \delta, \qquad \limsup_{n \to \infty} \bigl\| \phi^n_{\Lambda}[0] \bigr\|_{\dot{B}^{1}_{2,\infty} \times \dot{B}^{0}_{2,\infty}} < \delta.$$
Eventually, we will prove that necessarily only one frequency atom $(A^{na}, \phi^{na})[0]$ in the decomposition (7.3) is non-trivial and has to be asymptotically of energy $E_{crit}$. In fact, the subsequent considerations will show that if there are at least two frequency atoms $(A^{n1}, \phi^{n1})[0]$, $(A^{n2}, \phi^{n2})[0]$ that both do not vanish asymptotically, or if there is only one frequency atom $(A^{n1}, \phi^{n1})[0]$ with the error satisfying
$$\limsup_{n \to \infty} \bigl\| (A^n_1, \phi^n_1)[0] \bigr\|_{\dot{H}^1_x \times L^2_x} > 0,$$
then we get an a priori bound on the $S$ norm of the evolutions,
$$\liminf_{n \to \infty} \bigl\| (A^n, \phi^n) \bigr\|_{S((-T^n_0, T^n_1) \times \mathbb{R}^4)} < \infty,$$
contradicting the assumption that $\{(A^n, \phi^n)[0]\}_{n \in \mathbb{N}}$ is essentially singular.

We now introduce a smallness parameter $\varepsilon > 0$, to be chosen sufficiently small depending only on $E_{crit}$. In particular, we assume that $\varepsilon$ is less than the small energy threshold of the small energy global well-posedness result [22]. By passing to a suitable subsequence and by renumbering the frequency atoms, if necessary, we may assume that for some integer $\Lambda > 0$,
$$\sum_{a \geq \Lambda + 1} \limsup_{n \to \infty} E(A^{na}, \phi^{na}) < \varepsilon.$$
Moreover, we may assume that the frequency atoms $\{(A^{na}, \phi^{na})[0]\}_{n \in \mathbb{N}}$, $a = 1, \ldots, \Lambda$, have “increasing frequency supports” in the sense that $(\lambda^{na})^{-1}$ is growing in terms of $a$ (for each fixed $n$).

The key idea now is as follows. We approximate the initial data $(A^n, \phi^n)[0]$ by low frequency truncations, obtained by removing all or some of the atoms $(A^{na}, \phi^{na})[0]$, $a = 1, \ldots, \Lambda$, and inductively obtain bounds on the $S$ norm of the MKG-CG evolutions of the truncated data.
As this induction stops after $\Lambda$ many steps, we will have obtained the desired contradiction, forcing eventually that there has to be exactly one frequency atom $(A^{n1}, \phi^{n1})[0]$ that is asymptotically of energy $E_{crit}$.

Evolving the “non-atomic” lowest frequency approximation.

From now on we suppress the notation [0] for the initial data. The errors $(A^n_{\Lambda}, \phi^n_{\Lambda})$ in the decomposition (7.3) are by construction supported away in frequency space from the frequency scales $(\lambda^{na})^{-1}$, $a = 1, 2, \ldots, \Lambda$. It is then clear that the errors $\{(A^n_{\Lambda}, \phi^n_{\Lambda})\}_{n \in \mathbb{N}}$ can be written as the sum of $\Lambda + 1$ pieces, one for each of the $\Lambda + 1$ frequency regions left between the frequency supports of the atoms $(A^{na}, \phi^{na})$. Thus, we can write
$$A^n_{\Lambda} = \sum_{j=1}^{\Lambda + 1} A^{nj}_{\Lambda}, \qquad \phi^n_{\Lambda} = \sum_{j=1}^{\Lambda + 1} \phi^{nj}_{\Lambda}, \tag{7.4}$$
where the first pieces $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ have Fourier support in the region closest to the origin, i.e. in
$$|\xi| \leq (\lambda^{n1})^{-1} (R_n)^{-1}.$$
In other words, one essentially obtains the “lowest frequency approximations” $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ by removing all the atoms $(A^{na}, \phi^{na})$, $a = 1, \ldots, \Lambda$, from the data.

We then start our grand inductive procedure by showing the following proposition.

Proposition 7.2.

The parameter $\varepsilon > 0$ can be chosen sufficiently small depending only on the size of $E_{crit}$ such that the following holds. Constructing the lowest frequency approximations $\{(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})\}_{n \in \mathbb{N}}$ as described in (7.4), there exists a constant $C(E_{crit}) > 0$ such that for all sufficiently large $n$, the data given by $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ can be evolved globally in time and the corresponding solution satisfies
$$\bigl\| (A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda}) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \leq C(E_{crit}).$$

Proof.

The idea is to use a finite number of further low frequency approximations of $\{(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})\}_{n \in \mathbb{N}}$ and to inductively obtain bounds on the $S$ norms of their evolutions. Here it is essential that the number of these further approximations of $\{(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})\}_{n \in \mathbb{N}}$ is bounded by a constant $C_1(E_{crit}) > 0$. To begin with, for some sufficiently small $\delta = \delta(E_{crit}) > 0$, in particular $\delta \ll \varepsilon$, we use decompositions
$$A^{n1}_{\Lambda} = \sum_{j=1}^{\Lambda_1(\delta)} A^{n1(j)}_{\Lambda} + A^{n1}_{\Lambda(\Lambda_1)}, \qquad \phi^{n1}_{\Lambda} = \sum_{j=1}^{\Lambda_1(\delta)} \phi^{n1(j)}_{\Lambda} + \phi^{n1}_{\Lambda(\Lambda_1)},$$
where the frequency atoms $(A^{n1(j)}_{\Lambda}, \phi^{n1(j)}_{\Lambda})$ have frequency support in mutually disjoint intervals
$$\bigl[ (\lambda^{n1(j)})^{-1} (R^{(j)}_n)^{-1}, \; (\lambda^{n1(j)})^{-1} R^{(j)}_n \bigr]$$
with $R^{(j)}_n \to \infty$ as $n \to \infty$, and furthermore, we have the bound
$$\limsup_{n \to \infty} \Bigl\{ \bigl\| A^{n1}_{\Lambda(\Lambda_1)} \bigr\|_{\dot{B}^{1}_{2,\infty} \times \dot{B}^{0}_{2,\infty}} + \bigl\| \phi^{n1}_{\Lambda(\Lambda_1)} \bigr\|_{\dot{B}^{1}_{2,\infty} \times \dot{B}^{0}_{2,\infty}} \Bigr\} < \delta.$$
We may again assume that the atoms $(A^{n1(j)}_{\Lambda}, \phi^{n1(j)}_{\Lambda})$ have increasing frequency support as $j$ increases. The number of frequency atoms $\Lambda_1(\delta)$ here is potentially extremely large. It is crucial that the number of steps, i.e. the number of low frequency approximations of $\{(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})\}_{n \in \mathbb{N}}$, required in the inductive procedure is in fact much smaller, of size $C_1 = C_1(E_{crit}) \ll \Lambda_1(\delta)$. As we shall see, $C_1$ can be chosen independently of $\delta$ and $\Lambda_1(\delta)$.

We now pick the low frequency approximations of the data $\{(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})\}_{n \in \mathbb{N}}$. For $\varepsilon$ fixed as before, we inductively construct $O\bigl(\frac{E_{crit}}{\varepsilon}\bigr)$ closed frequency intervals $\tilde{J}_l$ for the variable $|\xi|$, disjoint up to the endpoints and increasing. The chosen intervals will also depend on $n$, but for notational ease we do not indicate this. So consider $n$ and $\Lambda$ fixed now. Having picked the intervals $\tilde{J}_1 = (-\infty, b_1]$, $\tilde{J}_l = [a_l, b_l]$ with $b_{l-1} = a_l$ for $l = 2, \ldots, L-1$, we pick an interval $[a_L, \tilde{b}_L]$ with $a_L = b_{L-1}$ as follows. First, pick $\tilde{b}_L$ in such a fashion that
$$E\bigl( P_{[a_L, \tilde{b}_L]} A^{n1}_{\Lambda}, \, P_{[a_L, \tilde{b}_L]} \phi^{n1}_{\Lambda} \bigr) = \varepsilon,$$
or else, if this is impossible, pick $\tilde{b}_L = \log (\lambda^{n1})^{-1} - \log R_n$, i.e. pick the upper endpoint of the frequency interval containing the lowest frequency “large atom” $(A^{n1}, \phi^{n1})$. Now, in the former case assume that
$$\tilde{b}_L \in \bigl[ \log (\lambda^{n1(j)})^{-1} - \log R^{(j)}_n, \; \log (\lambda^{n1(j)})^{-1} + \log R^{(j)}_n \bigr]$$
for some $1 \leq j \leq \Lambda_1(\delta)$, i.e. $\tilde{b}_L$ falls within the frequency support of one of the (finite number of) “small frequency atoms” $(A^{n1(j)}_{\Lambda}, \phi^{n1(j)}_{\Lambda})$ constituting $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$. Then we shift $\tilde{b}_L$ upwards to coincide with the upper limit, that is, we set
$$b_L = \log (\lambda^{n1(j)})^{-1} + \log R^{(j)}_n.$$
Otherwise, we set $b_L = \tilde{b}_L$. Then we define the interval $\tilde{J}_L = [a_L, b_L]$. Observe that for sufficiently large $n$, we have
$$E\bigl( P_{\tilde{J}_L} A^{n1}_{\Lambda}, \, P_{\tilde{J}_L} \phi^{n1}_{\Lambda} \bigr) \lesssim \varepsilon.$$
In particular, this implies that for sufficiently large $n$ the total number of intervals $\tilde{J}_l$ is $C_1 = O\bigl(\frac{E_{crit}}{\varepsilon}\bigr)$.

We now define the low frequency approximations of the data $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ by truncating the frequency support of $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ to the intervals
$$J_L := \bigcup_{l=1}^{L} \tilde{J}_l.$$
More precisely, for $1 \leq L \leq C_1$ we define the $L$-th low frequency approximation of the data $(A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda})$ by the expression
$$\bigl( P_{J_L} A^{n1}_{\Lambda}, \, P_{J_L} \phi^{n1}_{\Lambda} \bigr),$$
where by construction $C_1 = C_1(E_{crit}) \lesssim \frac{E_{crit}}{\varepsilon}$. In particular, we have
$$\bigl( P_{J_{C_1}} A^{n1}_{\Lambda}, \, P_{J_{C_1}} \phi^{n1}_{\Lambda} \bigr) = (A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda}).$$
We also state the following key lemma, whose proof is a consequence of the preceding construction.

Lemma 7.3.

For L = , . . . , C and for any R > , we have for all su ﬃ ciently large n that (cid:13)(cid:13)(cid:13) P [ a L − R , a L + R ] ∇ t , x A n Λ (cid:13)(cid:13)(cid:13) L x + (cid:13)(cid:13)(cid:13) P [ a L − R , a L + R ] ∇ t , x φ n Λ (cid:13)(cid:13)(cid:13) L x . R δ . In order to prove Proposition 7.2, we inductively show that for L = , . . . , C and for all su ﬃ -ciently large n , the evolutions of the data( P J L A n Λ , P J L φ n Λ )exist globally and satisfy the desired global S norm bounds, which of course get larger as L grows.For L = Proposition 7.4.

Let us assume that the evolution of the data
$$\bigl( P_{J_{L-1}} A^{n1}_{\Lambda}, \, P_{J_{L-1}} \phi^{n1}_{\Lambda} \bigr)$$
is globally defined for some $L < C_1$, where $C_1$ is the number of low frequency approximations. We denote this evolution by $(A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda})$. Furthermore, assume that for all sufficiently large $n$, it holds that
$$\bigl\| \bigl( A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda} \bigr) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \leq C_2 < \infty.$$
Provided $\delta^{-1} \gg C_2$ with $\delta > 0$ as above, there exists $C_3 = C_3(C_2) < \infty$ such that for all sufficiently large $n$, the data $\bigl( P_{J_L} A^{n1}_{\Lambda}, P_{J_L} \phi^{n1}_{\Lambda} \bigr)$ can be evolved globally and for the corresponding evolutions $\bigl( A^{n1,(L)}_{\Lambda}, \phi^{n1,(L)}_{\Lambda} \bigr)$, it holds that
$$\bigl\| \bigl( A^{n1,(L)}_{\Lambda}, \phi^{n1,(L)}_{\Lambda} \bigr) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \leq C_3.$$
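Proposition 7.4 is set up to be iterated. The following is only a schematic sketch of that iteration; the symbols $K_L$ and $F$ are introduced here for illustration and do not appear in the original, $C_1$ denotes the number of low frequency approximations, and the bound $K_1$ for the first approximation is taken as given.

```latex
% K_L : a bound on the S norm of the evolution of the L-th low frequency
%       approximation, valid for all sufficiently large n (illustrative notation).
% F   : the map "old bound -> new bound" supplied by Proposition 7.4.
\[
  K_L \le F(K_{L-1}), \qquad L = 2, \ldots, C_1,
\]
\[
  K_{C_1} \le \bigl( \underbrace{F \circ \cdots \circ F}_{C_1 - 1 \text{ times}} \bigr)(K_1).
\]
% Each step requires \delta^{-1} \gg K_{L-1}. Since C_1 depends only on E_crit
% and not on \delta, all of K_1, ..., K_{C_1} are determined before \delta is
% fixed, so a single sufficiently small \delta works for every step.
```

The point of the sketch is the order of the choices: the number of steps is fixed first, and the smallness parameter $\delta$ is chosen afterwards.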

Proposition 7.2 is then an immediate consequence of applying Proposition 7.4 at most $C_1$ many times, where $C_1 = C_1(E_{crit})$ is the number of low frequency approximations. We note that there exists $\delta_* = \delta_*(E_{crit}) > 0$ such that choosing $\delta < \delta_*$ in each step, Proposition 7.4 can be applied. Since $C_1$ depends only on $E_{crit}$, this results in a bound
$$\bigl\| (A^{n1}_{\Lambda}, \phi^{n1}_{\Lambda}) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \leq C(E_{crit}).$$
□

Proof of Proposition 7.4.

We proceed in several steps.

Step 1.

The assumed bound on $\bigl\| \bigl( A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda} \bigr) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)}$ implies an exponential decay for large frequencies,
$$\bigl\| \bigl( P_k A^{n1,(L-1)}_{\Lambda}, P_k \phi^{n1,(L-1)}_{\Lambda} \bigr) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \lesssim 2^{-\sigma (k - b_{L-1})} \quad \text{for } k \geq b_{L-1}.$$
This will follow once we can show that in fact
$$\bigl\| \bigl( P_k A^{n1,(L-1)}_{\Lambda}, P_k \phi^{n1,(L-1)}_{\Lambda} \bigr) \bigr\|_{S(\mathbb{R} \times \mathbb{R}^4)} \lesssim c^{(L-1)}_k,$$
where $\bigl\{ c^{(L-1)}_k \bigr\}_{k \in \mathbb{Z}}$ is a sufficiently flat frequency envelope covering the initial data $\bigl( A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda} \bigr)[0]$ at time $t = 0$. This in turn is a consequence of Proposition 4.2 whose proof will be given in Subsection 7.4.
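To see why a frequency envelope bound gives the exponential decay claimed in Step 1, it may help to recall one common construction of such an envelope. This is a standard sketch and an assumption here: the paper's own definition may differ in details, and the symbols $a_k$, $b$ and $(f, g)$ are illustrative.

```latex
% a_k : size of the data (f, g) at dyadic frequency 2^k in the energy norm.
\[
  a_k := \bigl\| P_k (f, g) \bigr\|_{\dot H^1_x \times L^2_x},
  \qquad
  c_k := \sup_{k' \in \mathbb{Z}} 2^{-\sigma |k - k'|} \, a_{k'} .
\]
% Basic properties (triangle inequality for |k - k'|, and sup <= l^2 sum):
\[
  a_k \le c_k, \qquad
  c_k \le 2^{\sigma |k - l|} c_l, \qquad
  \sum_k c_k^2 \lesssim_\sigma \sum_k a_k^2 .
\]
% If the data are supported at frequencies k' <= b + O(1), then for k >= b
% every contributing k' satisfies |k - k'| >= k - b - O(1), hence
\[
  c_k \lesssim 2^{-\sigma (k - b)} \sup_{k'} a_{k'}
  \quad \text{for } k \ge b .
\]
```

With $b = b_{L-1}$ this is exactly the shape of the decay asserted for the evolution, once the envelope bound is propagated from the data to the solution.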

Step 2.

Localizing $\bigl( A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda} \bigr)$ to suitable space-time slices. In order to ensure that we can induct on perturbations of size $\sim \varepsilon$ that are not “too small” (such as the $\delta$), we have to make sure that the $S$ norms of $\bigl( A^{n1,(L-1)}_{\Lambda}, \phi^{n1,(L-1)}_{\Lambda} \bigr)$ are not too large. To simplify the notation, we label these components by $(A, \phi)$ for the rest of this step. The idea is to localize to suitable space-time slices $I \times \mathbb{R}^4$, whose number may be very large (depending on $\|(A, \phi)\|_{S(\mathbb{R} \times \mathbb{R}^4)}$ and $E_{crit}$), but such that we have on each slice
$$\|(A, \phi)\|_{S(I \times \mathbb{R}^4)} \leq C(E_{crit}),$$
where the function $C(\cdot)$ grows at most polynomially.

Proposition 7.5.

There exist $N = N\bigl( \|(A, \phi)\|_{S(\mathbb{R} \times \mathbb{R}^4)}, E_{crit} \bigr)$ many time intervals $I_1, \ldots, I_N$ partitioning the time axis $\mathbb{R}$ such that we have for $n = 1, \ldots, N$ a decomposition (referring to the spatial components of the connection form simply by $A$)
$$A|_{I_n} = A^{free,(I_n)} + A^{nonlin,(I_n)}, \qquad \Box A^{free,(I_n)} = 0, \tag{7.5}$$
where $A^{free,(I_n)}$ and $A^{nonlin,(I_n)}$ are in Coulomb gauge and satisfy
$$\bigl\| \nabla_{t,x} A^{free,(I_n)} \bigr\|_{L^\infty_t L^2_x(\mathbb{R} \times \mathbb{R}^4)} \lesssim E_{crit}^{1/2}, \tag{7.6}$$
$$\bigl\| A^{nonlin,(I_n)} \bigr\|_{\ell^1 S(I_n \times \mathbb{R}^4)} \ll 1. \tag{7.7}$$
Moreover, we have for $n = 1, \ldots, N$ that
$$\|\phi\|_{S(I_n \times \mathbb{R}^4)} \lesssim C(E_{crit}), \tag{7.8}$$
where $C(\cdot)$ grows at most polynomially.

Proof. We first define precisely the decompositions $A = A^{free} + A^{nonlin}$ that we are using. The nonlinear structure inherent in $A^{nonlin}$ will be pivotal for controlling the equation for $\phi$. For a time interval $I \subset \mathbb{R}$, say of the form $I = [t_0, t_1]$ for some $t_0 < t_1$, we define for $i = 1, \ldots, 4$,
$$A^{nonlin,(I)}_i := -\chi_I \sum_{k,j} \Box^{-1} P_k Q_j \, \mathrm{Im} \, \mathcal{P}_i \bigl( (\chi_I \phi) \cdot \nabla_x \overline{(\chi_I \phi)} - \chi_I \, i A |\phi|^2 \bigr),$$
where $\chi_I$ is a smooth cutoff to the interval $I$ and $\Box^{-1}$ denotes multiplication by the Fourier symbol. Then we define $A^{free,(I)}$ to be the free wave with initial data at time $t_0$ given by $A[t_0] - A^{nonlin,(I)}[t_0]$. By construction, we then have
$$A = A^{free,(I)} + A^{nonlin,(I)} \quad \text{on } I \times \mathbb{R}^4.$$
We now describe how to partition the time axis into $N = N\bigl( \|(A, \phi)\|_S, E_{crit} \bigr)$ many suitable time intervals so that the bounds (7.6) – (7.8) hold on each such interval. For this, we first need the following technical lemma.

Lemma 7.6.

Given ε > , there exist M = M (cid:0) k ( A , φ ) k S ( R × R ) , ε (cid:1) many time intervals I , . . . , I M partitioning the time axis R such that for m = , . . . , M and i = , . . . , , (7.10) X k (cid:13)(cid:13)(cid:13)(cid:13) ∇ t , x X j (cid:3) − P k Q j P i (cid:0) ( χ I m φ ) · ∇ x ( χ I m φ ) − χ I m iA | φ | (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε and (7.11) (cid:13)(cid:13)(cid:13) P i (cid:0) ( χ I m φ ) · ∇ x ( χ I m φ ) − χ I m iA | φ | (cid:1)(cid:13)(cid:13)(cid:13) ( ℓ N ∩ ℓ L t ˙ H − x )( R × R ) . ε. In particular, it then holds that (cid:13)(cid:13)(cid:13) ∇ t , x A f ree , ( I m ) i (cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . E / crit + ε, (7.12) (cid:13)(cid:13)(cid:13) ∇ t , x A nonlin , ( I m ) i (cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε, (7.13) (cid:13)(cid:13)(cid:13) A nonlin , ( I m ) i (cid:13)(cid:13)(cid:13) ℓ S ( I m × R ) . ε. (7.14) Proof.

We begin with the quadratic interaction term in (7.10) and show that the time axis R can bepartitioned into M = M (cid:0) k ( A , φ ) k S , ε (cid:1) many intervals so that on each such interval I , it holds that(7.15) X k (cid:13)(cid:13)(cid:13)(cid:13) ∇ t , x X j (cid:3) − P k Q j P i (cid:0) ( χ I φ ) · ∇ x ( χ I φ ) (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε. To this end we exploit that there is an inherent null form in the above expression P i (cid:0) φ · ∇ x φ (cid:1) = ∆ − ∇ r N ir ( φ, φ ) , where N ir ( φ, ψ ) = ( ∂ i φ )( ∂ r ψ ) − ( ∂ r φ )( ∂ i ψ ) . We ﬁrst prove that on suitable intervals I ,(7.16) X k X j ≤ k + C (cid:13)(cid:13)(cid:13) ∇ t , x (cid:3) − P k Q j ∆ − ∇ r N ir (cid:0) Q ≤ j − C ( χ I φ ) , Q ≤ j − C ( χ I φ ) (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε. By a Littlewood-Paley trichotomy we may reduce to the case where both inputs are at frequency ∼ k . The singular operator (cid:3) − costs 2 − j − k , so we need to recover the factor 2 − j . From the nullform we gain 2 j − k , while the inclusion Q j L t L x ֒ → L ∞ t L x gains another 2 j . Finally, we obtaina small power in j − k from the improved Bernstein estimate P k Q j L t L x ֒ → k ( j − k ) L t L x (byinterpolating with the X s , b version of the Strichartz estimate P k Q j L t L x ֒ → k ( j − k ) L t L x ) and that ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 65 L t L x · L ∞ t L x ֒ → L t L x . Thus, we ﬁnd X k X j ≤ k + C (cid:13)(cid:13)(cid:13) ∇ t , x (cid:3) − P k Q j ∆ − ∇ r N ir (cid:0) Q ≤ j − C ( χ I φ k ) , Q ≤ j − C ( χ I φ k ) (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:18)X k (cid:16) − k k χ I ∇ x φ k k L t L x (cid:17) (cid:19) / k φ k S and smallness follows from divisibility of the L t L x ( R × R ) norm. 
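The null form identity for the projection $\mathcal{P}_i$ used above can be checked by a direct computation. The following is a short verification, assuming that $\mathcal{P}$ is the projection onto divergence-free vector fields and that the product is read as $\phi \, \nabla_x \overline{\phi}$; complex conjugation bars are not visible in the text, so their placement here is an assumption.

```latex
% Projection onto divergence-free fields, written with a curl-type expression:
\[
  \mathcal{P}_i X
    = X_i - \partial_i \Delta^{-1} \partial^r X_r
    = \Delta^{-1} \partial^r \bigl( \partial_r X_i - \partial_i X_r \bigr).
\]
% Apply this to X_j = \phi \, \partial_j \bar\phi. The second-derivative terms
% \phi \, \partial_r \partial_i \bar\phi cancel, leaving
\[
  \partial_r X_i - \partial_i X_r
    = \partial_r \phi \, \partial_i \bar\phi - \partial_i \phi \, \partial_r \bar\phi
    = N_{ir}(\bar\phi, \phi),
  \qquad
  N_{ir}(u, v) := (\partial_i u)(\partial_r v) - (\partial_r u)(\partial_i v).
\]
% Hence
\[
  \mathcal{P}_i \bigl( \phi \, \nabla_x \bar\phi \bigr)
    = \Delta^{-1} \nabla^r N_{ir}(\bar\phi, \phi).
\]
```

Placing the conjugate on the other factor swaps the two arguments of $N_{ir}$, which only changes the sign by antisymmetry and does not affect any of the estimates.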
Next, we show that on suitableintervals I , it holds that(7.17) X k X j > k + C (cid:13)(cid:13)(cid:13) ∇ t , x (cid:3) − P k Q j ∆ − ∇ r N ir (cid:0) χ I φ, χ I φ (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε. By a Littlewood-Paley trichotomy we may again reduce to the case where both inputs are at fre-quency ∼ k . Then we obtain, using the Bernstein inequality both in time and space, that X k X j > k + C (cid:13)(cid:13)(cid:13) ∇ t , x (cid:3) − P k Q j ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L x (cid:17) (cid:19) / k φ k S and smallness follows from the divisibility of the L t L x ( R × R ) norm. In view of (7.16) and (7.17),in order to ﬁnish the proof of (7.15) we may assume that one of the two inputs has the leadingmodulation. It therefore su ﬃ ces to show that on suitable intervals I we have bounds of the form(7.18) X k , j (cid:13)(cid:13)(cid:13)(cid:13) ∇ t , x (cid:3) − P k Q ≤ j − C ∆ − ∇ r N ir (cid:0) Q j ( χ I φ ) , Q ≤ j − C ( χ I φ ) (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε, where we use the convention (cid:3) − P k Q ≤ j − C = P l ≤ j − C (cid:3) − P k Q l . Using that (cid:0) (cid:3) − P k Q < j − C F (cid:1) ( t , · ) = − Z ∞ t sin(( t − s ) |∇| ) |∇| ( P k Q < j − C F )( s , · ) ds , it is enough to show X k , j (cid:13)(cid:13)(cid:13)(cid:13) P k Q ≤ j − C ∆ − ∇ r N ir (cid:0) Q j ( χ I φ ) , Q ≤ j − C ( χ I φ ) (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L t L x ( R × R ) . ε. By estimate (143) in [22] we may reduce to the case where j = k + O (1) and both inputs are atfrequency ∼ k . Then we ﬁnd X k (cid:13)(cid:13)(cid:13)(cid:13) P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) Q k ( χ I φ k ) , Q ≤ k − C ( χ I φ k ) (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . 
X k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) X , ∞ − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L x . k φ k S (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L x (cid:17) (cid:19) / and smallness follows by divisibility. Next, we consider the cubic term in (7.10). Here we have toprove that on suitable intervals I it holds that(7.19) X k (cid:13)(cid:13)(cid:13)(cid:13) ∇ t , x X j (cid:3) − P k Q j P i (cid:0) χ I A | φ | (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . ε. By similar arguments as above, this reduces to showing X k (cid:13)(cid:13)(cid:13) P k (cid:0) χ I A | φ | (cid:1)(cid:13)(cid:13)(cid:13) L t L x ( R × R ) . ε, which follows from estimate (64) in [22] and a divisibility argument. We note that the bound (7.10)implies that the estimates (7.12) and (7.13) hold on each such interval I .It remains to choose the intervals so that the bound (7.11) also holds. The energy estimatefor the S k and N k spaces together with the bounds (7.10) and (7.11) then also imply the bound(7.14). We pick M (cid:0) k ( A , φ ) k S , ε (cid:1) many time intervals I m , m = , . . . , M , on which the bound(7.10) already holds. We show that, if necessary, each time interval I m can be subdivided into M = M (cid:0) k ( A , φ ) k S , ε (cid:1) many intervals I ma , a = , . . . , M , such that we have(7.20) (cid:13)(cid:13)(cid:13) P i (cid:0) ( χ I ma φ ) · ∇ x ( χ I ma φ ) − χ I ma iA | φ | (cid:1)(cid:13)(cid:13)(cid:13) ( ℓ N ∩ ℓ L t ˙ H − x )( R × R ) . ε. For the rest of the proof of (7.20) we denote an interval I ma just by I and say that it is of the form I = [ t , t ] for some t < t . We only outline how to make the left hand side of (7.20) small in ℓ N for suitable intervals I , the ℓ L t ˙ H − x component being easier. 
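The divisibility arguments invoked repeatedly above rest on an elementary fact about norms with a finite time exponent. A minimal statement, with a generic Banach space $X$ and exponent $p$ that are not taken from the text, is as follows.

```latex
% Divisibility of L^p_t X norms for 1 <= p < infinity.
% Let F \in L^p_t(\mathbb{R}; X) and \varepsilon > 0. The function
\[
  g(t) := \int_{-\infty}^{t} \| F(s) \|_X^p \, ds
\]
% is continuous and nondecreasing with g(-\infty) = 0 and
% g(+\infty) = \|F\|_{L^p_t X}^p. Choosing times t_m with
% g(t_m) = m \, \varepsilon^p partitions \mathbb{R} into
\[
  M \le \Bigl\lceil \bigl( \|F\|_{L^p_t X} / \varepsilon \bigr)^p \Bigr\rceil
\]
% consecutive intervals I_1, ..., I_M (with M = 1 if F = 0) such that
\[
  \| F \|_{L^p_t(I_m; X)} \le \varepsilon, \qquad m = 1, \ldots, M.
\]
% The same argument applies to square sums over dyadic frequencies, since
% \sum_k \|F_k\|_{L^2_t X_k}^2 = \int \sum_k \|F_k(t)\|_{X_k}^2 \, dt.
```

The number of intervals depends only on the global norm and on $\varepsilon$, which matches the form $M = M(\|(A, \phi)\|_S, \varepsilon)$ in the statements above. The argument fails for $p = \infty$, which is why the energy norm $L^\infty_t L^2_x$ cannot be made small in this way.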
We ﬁrst estimate the quadraticinteraction term in (7.20), X k (cid:13)(cid:13)(cid:13) P i (cid:0) ( χ I φ ) · ∇ x ( χ I φ ) (cid:1)(cid:13)(cid:13)(cid:13) N k = X k (cid:13)(cid:13)(cid:13) P k ∆ − ∇ r N ir (cid:0) χ I φ, χ I φ (cid:1)(cid:13)(cid:13)(cid:13) N k . By (131) in [22], it su ﬃ ces to consider the case where both inputs are at frequency ∼ k and haveangular separation ∼ X k (cid:13)(cid:13)(cid:13) P k ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k . Here, the prime indicates the angular separation. We split into high and low modulation output. X k (cid:13)(cid:13)(cid:13) P k ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k ≤ X k (cid:13)(cid:13)(cid:13) P k Q > k − C ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k + X k (cid:13)(cid:13)(cid:13) P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k . The term with high modulation output is estimated by X k (cid:13)(cid:13)(cid:13) P k Q > k − C ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k . X k − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) (cid:19) / k φ k S and can be made small on suitable intervals I using the divisibility of the quantity X k (cid:16) − k (cid:13)(cid:13)(cid:13) ∇ x φ k (cid:13)(cid:13)(cid:13) L t L ∞ x ( R × R ) (cid:17) . k φ k S ( R × R ) . For the term with low modulation output we note that the angular separation of the inputs allows usto write schematically P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) χ I φ k , χ I φ k (cid:1) ′ = P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) Q > k − C ( χ I φ k ) , χ I φ k (cid:1) ′ + P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) Q ≤ k − C ( χ I φ k ) , Q > k − C ( χ I φ k ) (cid:1) ′ . 

Then we estimate X k (cid:13)(cid:13)(cid:13) P k Q ≤ k − C ∆ − ∇ r N ir (cid:0) Q > k − C ( χ I φ k ) , χ I φ k (cid:1) ′ (cid:13)(cid:13)(cid:13) N k . X k − k (cid:13)(cid:13)(cid:13) Q > k − C (cid:0) χ I ∇ x φ k (cid:1)(cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L ∞ x . k φ k S (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) (cid:19) / and similarly for the other term. Smallness follows as before by divisibility. The cubic interactionterm in (7.20) is much simpler to treat, it can be made small on suitable intervals I using estimate(64) from [22] and divisibility of the L t ˙ W , x norm. (cid:3) It remains to prove that the bound (7.8) in the statement of Proposition 7.5 holds. For ε > ﬃ ciently small further below, depending only on the size of k ( A , φ ) k S and E crit , thereexist M (cid:0) k ( A , φ ) k S , ε (cid:1) many intervals I m , m = , . . . , M , partitioning the time axis R on which theconclusion of Lemma 7.6 holds. We pick such an interval I m and now show that, if necessary, it canbe subdivided into M (cid:0) k ( A , φ ) k S , E crit (cid:1) many intervals I ma , a = , . . . , M , such that k φ k S ( I ma × R ) ≤ C ( E crit ) , where C ( · ) grows at most polynomially. Upon renumbering the intervals I ma , we will then haveﬁnished the proof of Proposition 7.5.For the remainder of the proof, we denote an interval I ma just by I and assume that it is of theform I = [ t , t ] for some t < t . 
From the equation (cid:3) A φ = A | I = A f ree , ( I ) + A nonlin , ( I ) provided by Lemma 7.6, we conclude that on I × R it holds that (cid:3) pA free , ( I ) φ = − i X k (cid:0) P > k − C A f ree , ( I ) j (cid:1) P k ∂ j φ − iA nonlin , ( I ) j ∂ j φ + iA ∂ t φ + i ( ∂ t A ) φ + A α A α φ ≡ M + M , (7.22)where we use the notation (cid:3) pA free , ( I ) φ = (cid:3) φ + i X k (cid:0) P ≤ k − C A f ree , ( I ) j (cid:1) P k ∂ j φ, M = − i X k (cid:0) P > k − C A f ree , ( I ) j (cid:1) P k ∂ j φ − iA nonlin , ( I ) j ∂ j φ + iA ∂ t φ, M = i ( ∂ t A ) φ + A α A α φ. We further split the term M into M ≡ X k N (cid:0) P > k − C A f ree , ( I ) , P k φ (cid:1) + N (cid:0) A nonlin , ( I ) , φ (cid:1) + N (cid:0) A , φ (cid:1) . Since A f ree , ( I ) and A nonlin , ( I ) are in Coulomb gauge, we observe that the terms N (cid:0) P > k − C A f ree , ( I ) , P k φ (cid:1) and N (cid:0) A nonlin , ( I ) , φ (cid:1) exhibit a null structure, N (cid:0) P > k − C A f ree , ( I ) , P k φ (cid:1) = − i X j , r N jr (cid:0) ∆ − ∇ j P > k − C A f ree , ( I ) r , P k φ (cid:1) , N (cid:0) A nonlin , ( I ) , φ (cid:1) = − i X j , r N jr (cid:0) ∆ − ∇ j A nonlin , ( I ) r , φ (cid:1) . We emphasize that the right hand side of (7.22) is deﬁned on the whole space-time, but which onlycoincides with (cid:3) pA free , ( I ) φ on I × R . Using the linear estimate (3.3) for the magnetic wave operator (cid:3) pA free , ( I ) and working with suitable Schwartz extensions, we obtain that k φ k S ( I × R ) . k∇ t , x φ ( t ) k L x + (cid:13)(cid:13)(cid:13) χ I (cid:0) M + M (cid:1)(cid:13)(cid:13)(cid:13) N ∩ ℓ L t ˙ H − x ( R × R ) . E crit + (cid:13)(cid:13)(cid:13) χ I (cid:0) M + M (cid:1)(cid:13)(cid:13)(cid:13) N ∩ ℓ L t ˙ H − x ( R × R ) . We note that by Theorem 3.1, the implicit constant in the above estimate for the magnetic waveoperator depends polynomially on k∇ t , x A f ree , ( I ) k L ∞ t L x and we have k∇ t , x A f ree , ( I ) k L ∞ t L x . E / crit byLemma 7.6. 
In order to prove the bound (7.8), it therefore su ﬃ ces to show that we can choose theintervals I such that (cid:13)(cid:13)(cid:13) M + M (cid:13)(cid:13)(cid:13) N ∩ ℓ L t ˙ H − x ( I × R ) . E crit . Our general strategy to achieve this consists in ﬁrst using the o ﬀ -diagonal decay in the multilinearestimates from [22] to reduce to a situation in which a suitable divisibility argument works.We only outline how to obtain smallness of the term M in N ( I × R ), the estimate of M in ℓ L t ˙ H − x and of M in N ∩ ℓ L t ˙ H − x being easier. We begin with the ﬁrst term in the deﬁnition of M , (cid:13)(cid:13)(cid:13)X k N (cid:0) P > k − C A f ree , ( I ) , P k φ (cid:1)(cid:13)(cid:13)(cid:13) N ( I × R ) . From the estimate (131) in [22], we conclude that it su ﬃ ces to bound the expression(7.23) X k (cid:13)(cid:13)(cid:13) P k N (cid:0) P k A f ree , ( I ) , P k φ (cid:1) ′ (cid:13)(cid:13)(cid:13) N k ( I × R ) , where k = k = k + O (1) and both inputs have angular separation ∼

1. Similarly to the estimateof (7.21), we bound this term by (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I P k ∇ x A f ree , ( I ) (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) (cid:19) k φ k S and a divisibility argument then yields smallness. To deal with the other two terms in M , we needto achieve (cid:13)(cid:13)(cid:13) N (cid:0) A nonlin , ( I ) , φ (cid:1) + N ( A , φ ) (cid:13)(cid:13)(cid:13) N ( I × R ) . E crit on suitable intervals I . To this end we will make similar reductions as in Section 4 of [22], peelingo ﬀ the “good parts” of N (cid:0) A nonlin , ( I ) , φ (cid:1) and of N (cid:0) A , φ (cid:1) until we are left with three quadrilinearnull form bounds.We introduce the expressions N lowhi (cid:0) A nonlin , ( I ) , φ (cid:1) = X k N (cid:0) P ≤ k − C A nonlin , ( I ) , P k φ (cid:1) and H ∗ N lowhi ( A nonlin , ( I ) , φ ) = X k X k ′ ≤ k − C X j ≤ k ′ + C Q ≤ j − C N (cid:0) Q j P k ′ A nonlin , ( I ) , Q ≤ j − C P k φ (cid:1) . By estimate (53) in [22], we have(7.24) (cid:13)(cid:13)(cid:13) N (cid:0) A nonlin , ( I ) , φ (cid:1) − N lowhi (cid:0) A nonlin , ( I ) , φ (cid:1)(cid:13)(cid:13)(cid:13) N . (cid:13)(cid:13)(cid:13) A nonlin , ( I ) (cid:13)(cid:13)(cid:13) S k φ k S ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 69 and by estimate (54) in [22], it holds that(7.25) (cid:13)(cid:13)(cid:13) N lowhi (cid:0) A nonlin , ( I ) , φ (cid:1) − H ∗ N lowhi (cid:0) A nonlin , ( I ) , φ (cid:1)(cid:13)(cid:13)(cid:13) N . (cid:13)(cid:13)(cid:13) A nonlin , ( I ) (cid:13)(cid:13)(cid:13) ℓ S k φ k S . Fixing ε > ﬃ ciently small, depending only on the size of k ( A , φ ) k S and E crit , Lemma 7.6ensures that k A nonlin , ( I ) k ℓ S is small enough so that the right hand sides of (7.24) and (7.25) arebounded by E crit . We now deﬁne H A nonlin , ( I ) i : = − χ I X k , k , kk ≤ min { k , k }− C X j ≤ k + C (cid:3) − P k Q j Im P i (cid:0) Q ≤ j − C ( χ I φ k ) · ∇ x Q ≤ j − C ( χ I φ k ) (cid:1) . 
By estimate (55) in [22] it holds that (cid:13)(cid:13)(cid:13) H ∗ N lowhi (cid:0) A nonlin , ( I ) − H A nonlin , ( I ) , φ (cid:1)(cid:13)(cid:13)(cid:13) N . (cid:13)(cid:13)(cid:13) A nonlin , ( I ) − H A nonlin , ( I ) (cid:13)(cid:13)(cid:13) Z k φ k S , so we have to make (cid:13)(cid:13)(cid:13) A nonlin , ( I ) − H A nonlin , ( I ) (cid:13)(cid:13)(cid:13) Z small. We recall the deﬁnition of the Z space, k φ k Z = X k k P k φ k Z k , k φ k Z k = sup l < C X ω l k P ω l Q k + l φ k L t L ∞ x . Using estimate (134) in [22] and that one obtains an extra gain for very negative l when estimatingin the Z space, we are reduced to bounding X k (cid:3) − P k Q k + O (1) ∆ − ∇ x N (cid:0) χ I φ k , χ I φ k (cid:1) . We easily ﬁnd that X k (cid:13)(cid:13)(cid:13) (cid:3) − P k Q k + O (1) ∆ − ∇ x N (cid:0) χ I φ k , χ I φ k (cid:1)(cid:13)(cid:13)(cid:13) L t L ∞ x . (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k (cid:13)(cid:13)(cid:13) L t L x (cid:17) (cid:19) / k φ k S , which can be made small by a divisibility argument. We are thus left with the term H ∗ N lowhi (cid:0) H A nonlin , ( I ) , φ (cid:1) . Carrying out similar reductions as in Section 4 of [22] for the “elliptic term” N ( A , φ ), we arriveat the key remaining term H ∗ N lowhi (cid:0) H A ( I )0 , φ (cid:1) , where H ∗ N lowhi (cid:0) H A ( I )0 , φ (cid:1) = X k X k ′ ≤ k − C X j ≤ k ′ + C Q ≤ j − C N (cid:0) Q j P k ′ H A ( I )0 , Q ≤ j − C P k φ (cid:1) and H A ( I )0 : = − χ I X k , k , kk ≤ min { k , k }− C X j ≤ k + C ∆ − P k Q j Im (cid:0) Q ≤ j − C ( χ I φ k ) · Q ≤ j − C ∂ t ( χ I φ k ) (cid:1) . As in [22], we combine the “hyperbolic term” H ∗ N lowhi (cid:0) H A nonlin , ( I ) , φ (cid:1) and the preceding “ellipticterm” H ∗ N lowhi (cid:0) H A ( I )0 , φ (cid:1) and wind up with the null forms (61) – (63) in [22]. We formulate theseas quadrilinear expressions as in [22] and then prove that smallness can be achieved for each ofthese. First null form ((61) in [22]).

By estimate (148) in [22], it su ﬃ ces to consider the following twocases. First, we show that X k X k = k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) (cid:3) − P k Q j (cid:0) Q ≤ j − C ( χ I φ k ) · ∂ α Q ≤ j − C ( χ I φ k ) (cid:1) , P k Q j (cid:0) ∂ α Q ≤ j − C φ k · Q ≤ j − C ψ k (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) ≪ k ψ k N ∗ , where k > k + C , j = k + O (1) and k = k + O (1) = k + O (1). Second, we prove that X k X k = k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) (cid:3) − P k Q j (cid:0) Q ≤ j − C ( χ I φ k ) · ∂ α Q ≤ j − C ( χ I φ k ) (cid:1) , P k Q j (cid:0) ∂ α Q ≤ j − C φ k · Q ≤ j − C ψ k (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) ≪ k ψ k N ∗ , where k > k + C , j = k + O (1) and k = k + O (1) = k + O (1).We begin with the ﬁrst case. Here, the inputs Q ≤ j − C ( χ I φ k ) and ∂ α Q ≤ j − C ( χ I φ k ) have Fouriersupports in identical (or opposite) angular sectors ω of size ∼ k − k . Then we bound X k X k > k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) (cid:3) − P k Q k + O (1) (cid:0) Q ≤ k − C ( χ I φ k ) · ∂ α Q ≤ k − C ( χ I φ k + O (1) ) (cid:1) , P k Q k + O (1) (cid:0) ∂ α Q ≤ k − C φ k + O (1) · Q ≤ k − C ψ k + O (1) (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) . X k X k > k + O (1) ( k − k ) (cid:18)X ω k (cid:13)(cid:13)(cid:13) P ω Q ≤ k − C ( χ I φ k ) (cid:13)(cid:13)(cid:13) L t L x (cid:19) (cid:18)X ω (cid:13)(cid:13)(cid:13) P ω Q ≤ k − C ∇ t , x ( χ I φ k ) (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:19) ×× − k (cid:13)(cid:13)(cid:13) ∇ t , x φ k + O (1) (cid:13)(cid:13)(cid:13) L t L x k ψ k + O (1) k L ∞ t L x . (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C ( χ I φ k ) (cid:13)(cid:13)(cid:13) L t L x (cid:19) k φ k S k ψ k N ∗ . The desired smallness comes from the divisibility of the quantity (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C ( χ I φ k ) (cid:13)(cid:13)(cid:13) L t L x (cid:19) . 
To see the divisibility, we write(7.26) P ω l Q ≤ k + l − C ( χ I φ k ) = P ω l Q ≤ k + l − C ( χ I P ω l Q ≤ k + l + M φ k ) + P ω l Q ≤ k + l − C ( χ I Q > k + l + M φ k )for some M > ﬃ ciently large. By disposability of the operator P ω l Q ≤ k + l − C , weestimate the ﬁrst term on the right hand side of (7.26) by (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C ( χ I P ω l Q ≤ k + l + M φ k ) (cid:13)(cid:13)(cid:13) L t L x (cid:19) . (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) χ I P ω l Q ≤ k + l + M φ k (cid:13)(cid:13)(cid:13) L t L x (cid:19) and smallness can be forced by divisibility of the quantity (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l + M φ k (cid:13)(cid:13)(cid:13) L t L x (cid:19) . k φ k S . For the second term on the right hand side of (7.26), we use P ω l Q ≤ k + l − C ( χ I Q > k + l + M φ k ) = P ω l Q ≤ k + l − C (cid:0) Q > k + l + M ( χ I ) Q > k + l + M φ k (cid:1) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 71

By Bernstein’s inequality in space and in time, we then have (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C (cid:0) Q > k + l + M ( χ I ) Q > k + l + M φ k (cid:1)(cid:13)(cid:13)(cid:13) L t L x . ( k + l ) k l (cid:13)(cid:13)(cid:13) Q > k + l + M χ I (cid:13)(cid:13)(cid:13) L t (cid:13)(cid:13)(cid:13) P ω l Q > k + l + M φ k (cid:13)(cid:13)(cid:13) L t L x . ( k + l ) k l − ( k + l + M ) (cid:13)(cid:13)(cid:13) P ω l Q > k + l + M φ k (cid:13)(cid:13)(cid:13) L t L x . k l − M (cid:13)(cid:13)(cid:13) P ω l Q > k + l + M φ k (cid:13)(cid:13)(cid:13) L t L x . Thus, we obtain (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C (cid:0) Q > k + l + M ( χ I ) Q > k + l + M φ k (cid:1)(cid:13)(cid:13)(cid:13) L t L x (cid:19) . (cid:18)X k sup l < − C l − M k∇ x φ k k X , ∞ (cid:19) . − M k φ k S and a smallness factor follows for su ﬃ ciently large M > X k X k > k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) (cid:3) − P k Q k + O (1) (cid:0) Q ≤ k − C ( χ I φ k + O (1) ) · ∂ α Q ≤ k − C ( χ I φ k + O (1) ) (cid:1) , P k Q k + O (1) (cid:0) ∂ α Q ≤ k − C φ k · Q ≤ k − C ψ k + O (1) (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) . X k X k > k + O (1) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k + O (1) (cid:13)(cid:13)(cid:13) L t L ∞ x − k (cid:13)(cid:13)(cid:13) ∇ t , x ( χ I φ k + O (1) ) (cid:13)(cid:13)(cid:13) L t L ∞ x ×× (cid:13)(cid:13)(cid:13) Q ≤ k − C ∇ t , x φ k (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) Q ≤ k − C ψ k + O (1) (cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) χ I ∇ x φ k + O (1) (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) (cid:19) k φ k S k ψ k N ∗ and immediately obtain smallness by divisibility. Second null form ((62) in [22]) . 
By the estimates (149) and (150) in [22], we only have to showthat X k X k = k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) ( (cid:3) ∆ ) − P k Q j ∂ t ∂ α (cid:0) Q ≤ j − C ( χ I φ k ) · ∂ α Q ≤ j − C ( χ I φ k ) (cid:1) , P k Q j (cid:0) ∂ t Q ≤ j − C φ k · Q ≤ j − C ψ k (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) ≪ k ψ k N ∗ , where j = k + O (1), k = k + O (1) = k + O (1) and k > k + C . Then we estimate X k X k > k + C (cid:12)(cid:12)(cid:12)(cid:10) ( (cid:3) ∆ ) − P k Q k + O (1) ∂ t ∂ α (cid:0) Q ≤ k − C ( χ I φ k + O (1) ) · ∂ α Q ≤ k − C ( χ I φ k + O (1) ) (cid:1) , P k Q k + O (1) (cid:0) ∂ t Q ≤ k − C φ k · Q ≤ k − C ψ k + O (1) (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) . X k X k > k + C (cid:13)(cid:13)(cid:13) ( (cid:3) ∆ ) − P k Q k + O (1) ∂ t ∂ α (cid:0) Q ≤ k − C ( χ I φ k + O (1) ) · ∂ α Q ≤ k − C ( χ I φ k + O (1) ) (cid:1)(cid:13)(cid:13)(cid:13) L t L ∞ x ×× (cid:13)(cid:13)(cid:13) P k Q k + O (1) (cid:0) ∂ t Q ≤ k − C φ k · Q ≤ k − C ψ k + O (1) (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . X k X k > k + C − k (cid:13)(cid:13)(cid:13) ∇ x ( χ I φ k + O (1) ) (cid:13)(cid:13)(cid:13) L t L ∞ x − k (cid:13)(cid:13)(cid:13) ∇ x ( χ I φ k + O (1) ) (cid:13)(cid:13)(cid:13) L t L ∞ x k ∂ t φ k k L ∞ t L x k ψ k + O (1) k L ∞ t L x . (cid:18)X k (cid:16) − k (cid:13)(cid:13)(cid:13) ∇ x ( χ I φ k + O (1) ) (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:17) (cid:19) / k φ k S k ψ k N ∗ and smallness follows by divisibility. Third null form ((63) in [22]).

By the estimates (152) – (154) in [22], it su ﬃ ces to consider thefollowing two cases. First, we show that X k X k = k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) ( (cid:3) ∆ ) − P k Q j ∂ i (cid:0) Q ≤ j − C ( χ I φ k ) · ∂ i Q ≤ j − C ( χ I φ k ) (cid:1) , P k Q j ∂ α (cid:0) ∂ α Q ≤ j − C φ k · Q ≤ j − C ψ k (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) ≪ k ψ k N ∗ , where k > k + C , j = k + O (1) and k = k + O (1) = k + O (1). Second, we prove that X k X k = k + O (1) (cid:12)(cid:12)(cid:12)(cid:10) ( (cid:3) ∆ ) − P k Q j ∂ i (cid:0) Q ≤ j − C ( χ I φ k ) · ∂ i Q ≤ j − C ( χ I φ k ) (cid:1) , P k Q j ∂ α (cid:0) ∂ α Q ≤ j − C φ k · Q ≤ j − C ψ k (cid:1)(cid:11)(cid:12)(cid:12)(cid:12) ≪ k ψ k N ∗ , where k > k + C , j = k + O (1) and k = k + O (1) = k + O (1).In the ﬁrst case we note that the ﬁrst two inputs have Fourier supports in identical (or opposite)angular sectors ω of size ∼ k − k . Using Bernstein’s inequality, we then place the ﬁrst input in L t L x ,the second one in L ∞ t L x , the third one in L t L ∞ x and the fourth one in L ∞ t L x . As in the ﬁrst case ofthe ﬁrst null form we obtain the desired smallness by divisibility of the quantity (cid:18)X k sup l < − C X ω k (cid:13)(cid:13)(cid:13) P ω l Q ≤ k + l − C ( χ I φ k ) (cid:13)(cid:13)(cid:13) L t L x (cid:19) . The second case is easier to deal with and we omit the details. (cid:3)
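The divisibility arguments used repeatedly in the preceding proof rest on an elementary fact about norms with a finite time exponent. The following is a minimal sketch in illustrative notation: the function $f$, the exponent $q$, the size $M$ and the threshold $\eta$ are our own placeholders and do not appear in the paper.

```latex
% Divisibility of an L^q_t norm for 1 <= q < infinity (illustrative notation).
% Assumption: f : R -> [0, infinity) is measurable with \int_R f(t)^q dt = M^q < infinity.
\[
  G(t) := \int_{-\infty}^{t} f(s)^q \, ds
  \quad \text{is continuous and nondecreasing, with }
  \lim_{t \to -\infty} G(t) = 0, \quad \lim_{t \to +\infty} G(t) = M^q .
\]
% Given \eta > 0, choose times t_1 < t_2 < \dots with G(t_j) = j \eta^q
% for every integer j >= 1 such that j \eta^q < M^q, and cut the time axis at these points.
\[
  \mathbb{R} = \bigcup_{j=1}^{N} I_j , \qquad
  \| f \|_{L^q_t(I_j)} \le \eta \quad \text{for } j = 1, \dots, N , \qquad
  N \le \Bigl( \frac{M}{\eta} \Bigr)^{q} + 1 .
\]
% The number of intervals depends only on M and \eta, not on f itself.
% For q = \infty no such partition exists in general, which is why only norms
% with a finite time exponent are called divisible.
```

In the setting above one would take $f(t)$ to be a spatial norm of a frequency localized piece of the solution at time $t$, so that the number of intervals is controlled by the a priori norm bound alone.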

Step 3. Solution of perturbative problems on suitable space-time slices.

This is the crucial technical step. We write
\[
\bigl( A^{n,(L)}_{\Lambda}, \phi^{n,(L)}_{\Lambda} \bigr) = \bigl( A^{n,(L-1)}_{\Lambda}, \phi^{n,(L-1)}_{\Lambda} \bigr) + \bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr).
\]
Then we obtain the following system of equations for the perturbations $\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)$,
\[
\Box_{A^{n,(L-1)}_{\Lambda} + \delta A^{(L)}} \bigl( \phi^{n,(L-1)}_{\Lambda} + \delta\phi^{(L)} \bigr) - \Box_{A^{n,(L-1)}_{\Lambda}} \phi^{n,(L-1)}_{\Lambda} = 0, \tag{7.27}
\]
\[
\begin{aligned}
\Box\, \delta A^{(L)} = {} & - \mathrm{Im}\, \mathcal{P} \Bigl( \phi^{n,(L-1)}_{\Lambda} \cdot \overline{\nabla_x \delta\phi^{(L)}} + \delta\phi^{(L)} \cdot \overline{\nabla_x \phi^{n,(L-1)}_{\Lambda}} + \delta\phi^{(L)} \cdot \overline{\nabla_x \delta\phi^{(L)}} \Bigr) \\
& + \mathrm{Im}\, \mathcal{P} \Bigl( i \bigl( A^{n,(L-1)}_{\Lambda} + \delta A^{(L)} \bigr) \bigl| \phi^{n,(L-1)}_{\Lambda} + \delta\phi^{(L)} \bigr|^2 - i A^{n,(L-1)}_{\Lambda} \bigl| \phi^{n,(L-1)}_{\Lambda} \bigr|^2 \Bigr).
\end{aligned} \tag{7.28}
\]
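To see which contributions to the cubic part of the equation for $\delta A^{(L)}$ are linear in the perturbation and which are of higher order, one can expand the difference of the cubic terms directly. The display below is a short algebraic sketch; for readability it uses the abbreviations $A$, $\phi$, $\delta A$, $\delta\phi$ for $A^{n,(L-1)}_{\Lambda}$, $\phi^{n,(L-1)}_{\Lambda}$, $\delta A^{(L)}$, $\delta\phi^{(L)}$, and it treats $A$ and $\delta A$ as real-valued.

```latex
% Step 1: expand the modulus squared of the perturbed scalar field.
\[
  |\phi + \delta\phi|^2 = |\phi|^2 + 2\,\mathrm{Re}\bigl( \phi\, \overline{\delta\phi} \bigr) + |\delta\phi|^2 .
\]
% Step 2: insert this into the difference of the cubic terms and sort by order in (\delta A, \delta\phi).
\begin{align*}
  (A + \delta A)\,|\phi + \delta\phi|^2 - A\,|\phi|^2
  &= \underbrace{\delta A\,|\phi|^2 + 2 A\,\mathrm{Re}\bigl( \phi\, \overline{\delta\phi} \bigr)}_{\text{linear in } (\delta A,\, \delta\phi)} \\
  &\quad + \underbrace{2\,\delta A\,\mathrm{Re}\bigl( \phi\, \overline{\delta\phi} \bigr) + (A + \delta A)\,|\delta\phi|^2}_{\text{quadratic or cubic in } (\delta A,\, \delta\phi)} .
\end{align*}
```

The linear terms carry no smallness from the perturbation beyond a single factor, so any additional smallness for them has to come from the background solution, for instance through divisible norms on short time intervals.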

We have to show that if the initial data $\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)[0]$ are less than the absolute constant $\varepsilon$ in the energy sense, then we can prove frequency localized $S^1$ norm bounds via bootstrap on any space-time slice on which certain “divisible” norms of $\bigl( A^{n,(L-1)}_{\Lambda}, \phi^{n,(L-1)}_{\Lambda} \bigr)$ are small. Furthermore, the number of such space-time slices needed to fill all of space-time depends on the a priori assumed $S^1$ norm bounds for the components $\bigl( A^{n,(L-1)}_{\Lambda}, \phi^{n,(L-1)}_{\Lambda} \bigr)$.

One technical difficulty is the formulation of the correct frequency localized $S^1$ norm bound for the propagation of $\delta\phi^{(L)}$, because there is a contribution from low frequencies of $\phi^{n,(L-1)}_{\Lambda}$, and similarly for $\delta A^{(L)}$. However, this low frequency contribution can be made arbitrarily small by picking $n$ large and $\delta$ small enough.

We note that while $\bigl( A^{n,(L-1)}_{\Lambda}, \phi^{n,(L-1)}_{\Lambda} \bigr)$ exists globally in time, $\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)$ only exists locally in time and we will have to prove global existence and $S^1$ norm bounds for it. For now, any statement we make about $\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)$ is meant locally in time on some interval $I$ around $t = 0$. We partition the time axis $\mathbb{R}$ into $N = N\bigl( \| ( A^{n,(L-1)}_{\Lambda}, \phi^{n,(L-1)}_{\Lambda} ) \|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \bigr)$ many time intervals $\{ I_j \}_{j=1}^{N}$, on which the smallness conclusions (in terms of $E_{crit}$) of Proposition 7.5 hold. We tacitly assume that these intervals are intersected with $I$ and now fix the interval $I_1$, which we assume to contain t =
0. All the arguments in this step can be carried out for any of the laterintervals I , . . . , I N . Bootstrap assumptions : Suppose that there exist decompositions δ A ( L ) = δ A ( L ) ① + δ A ( L ) ② , δφ ( L ) = δφ ( L ) ① + δφ ( L ) ② satisfying the following bounds.(i) Let (cid:8) c ( L ) δ A , k (cid:9) k ∈ Z be a frequency envelope controlling the data P k δ A ( L ) [0] at time t = (cid:8) d ( L ) δ A , k (cid:9) k ∈ Z be a frequency envelope that decays exponentially for k > b L but is otherwise notlocalized and satisﬁes the smallness condition X k d ( L ) δ A , k ≤ δ = δ ( δ ) . Then we assume that for all k ∈ Z , (cid:13)(cid:13)(cid:13) P k δ A ( L ) ① (cid:13)(cid:13)(cid:13) S ( I × R ) ≤ Cc ( L ) δ A , k , (cid:13)(cid:13)(cid:13) P k δ A ( L ) ② (cid:13)(cid:13)(cid:13) S ( I × R ) ≤ Cd ( L ) δ A , k , where C ≡ C (cid:0) E crit (cid:1) is su ﬃ ciently large.(ii) Let (cid:8) c ( L ) δφ, k (cid:9) k ∈ Z be a frequency envelope controlling the data P k δφ ( L ) [0] at time t = (cid:8) d ( L ) δφ, k (cid:9) be a frequency envelope that decays exponentially for k > b L , but is otherwise not localizedand satisﬁes the smallness condition (cid:16)X k (cid:0) d ( L ) δφ, k (cid:1) (cid:17) ≤ δ = δ ( δ ) . Then we assume that for all k ∈ Z , (cid:13)(cid:13)(cid:13) P k δφ ( L ) ① (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ Cc ( L ) δφ, k , (cid:13)(cid:13)(cid:13) P k δφ ( L ) ② (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ Cd ( L ) δφ, k , where C ≡ C (cid:0) E crit (cid:1) is su ﬃ ciently large. 
We now show that we can improve this to a similar decomposition with (cid:13)(cid:13)(cid:13) P k δ A ( L ) ① (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ C c ( L ) δ A , k , (cid:13)(cid:13)(cid:13) P k δ A ( L ) ② (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ C d ( L ) δ A , k , (cid:13)(cid:13)(cid:13) P k δφ ( L ) ① (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ C c ( L ) δφ, k , (cid:13)(cid:13)(cid:13) P k δφ ( L ) ② (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ C d ( L ) δφ, k , (7.29)provided we make the additional assumption δ ≪ δ with implied constant depending only on E crit .Observe that we have X k (cid:0) c ( L ) δ A , k (cid:1) + (cid:0) c ( L ) δφ, k (cid:1) . ε and that our smallness parameters satisfy δ ≪ δ ≪ δ ≪ ε . For the remainder of this step we simply write I ≡ I and φ ≡ φ n , ( L − Λ , δφ ≡ δφ ( L ) , A ≡ A n , ( L − Λ , δ A ≡ δ A ( L ) . Step 3a.

Reorganizing the key equation (7.27). We introduce the connection form $(A + \delta A)^{\mathrm{nonlin},(I)}$ analogously to (7.9) by setting for $i = 1, \ldots, 4$,
\[
(A + \delta A)^{\mathrm{nonlin},(I)}_i := - \chi_I \sum_{k,j} \Box^{-1} P_k Q_j \mathcal{P}_i \Bigl( \chi_I (\phi + \delta\phi) \cdot \overline{\nabla_x \bigl( \chi_I (\phi + \delta\phi) \bigr)} - \chi_I\, i (A + \delta A) |\phi + \delta\phi|^2 \Bigr), \tag{7.30}
\]
and define $(A + \delta A)^{\mathrm{free},(I)}$ as the free wave with initial data at time $t = 0$ given by
\[
(A + \delta A)^{\mathrm{free},(I)}[0] = (A + \delta A)[0] - (A + \delta A)^{\mathrm{nonlin},(I)}[0].
\]
Then we have
\[
(A + \delta A)|_I = (A + \delta A)^{\mathrm{free},(I)} + (A + \delta A)^{\mathrm{nonlin},(I)}.
\]
On $I \times \mathbb{R}^4$ we may rewrite the equation (7.27) for $\delta\phi$ into the following frequency localized form
\[
\begin{aligned}
\Box^{p}_{(A + \delta A)^{\mathrm{free},(I)}} \bigl( P_0 \delta\phi \bigr) = {} & - \bigl[ P_0, \Box^{p}_{(A + \delta A)^{\mathrm{free},(I)}} \bigr] \delta\phi - 2 P_0 \Bigl( i \sum_{k} P_{>k-C} (A + \delta A)^{\mathrm{free},(I)}_j P_k \partial^j \delta\phi \Bigr) \\
& - 2 P_0 \Bigl( i (A + \delta A)^{\mathrm{nonlin},(I)}_j \partial^j \delta\phi - i (A + \delta A)_0 \partial_t \delta\phi \Bigr) - 2 P_0 \Bigl( i (\delta A)_j \partial^j \phi - i (\delta A)_0 \partial_t \phi \Bigr) \\
& + P_0 \Bigl( i (\partial_t A_0 + \partial_t \delta A_0)(\phi + \delta\phi) - i (\partial_t A_0) \phi \Bigr) \\
& + P_0 \Bigl( (A + \delta A)^{\alpha} (A + \delta A)_{\alpha} (\phi + \delta\phi) - A^{\alpha} A_{\alpha} \phi \Bigr).
\end{aligned} \tag{7.31}
\]
We immediately see that compared to (7.22), a qualitatively new feature in (7.31) is the interaction term
\[
P_0 \Bigl( (\delta A)_j \partial^j \phi - (\delta A)_0 \partial_t \phi \Bigr). \tag{7.32}
\]
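The origin of the new interaction term can be traced by expanding the covariant d'Alembertian. The computation below is a sketch under the conventions stated in the introduction, namely $D_\alpha = \partial_\alpha + i A_\alpha$ and the metric $\mathrm{diag}(-1, +1, \ldots, +1)$; frequency projections and the splitting into free and nonlinear parts are suppressed, and $A$, $\phi$ denote the background solution.

```latex
% Expansion of the covariant d'Alembertian \Box_A = D^\alpha D_\alpha.
\begin{align*}
  \Box_A \phi
  &= (\partial^\alpha + i A^\alpha)(\partial_\alpha + i A_\alpha)\phi \\
  &= \Box \phi + 2 i A_\alpha \partial^\alpha \phi + i (\partial^\alpha A_\alpha)\, \phi - A^\alpha A_\alpha\, \phi .
\end{align*}
% Splitting the perturbed equation \Box_{A+\delta A}(\phi+\delta\phi) - \Box_A \phi = 0.
\begin{align*}
  \Box_{A + \delta A}(\phi + \delta\phi) - \Box_A \phi
  &= \Box_{A + \delta A}\, \delta\phi + \bigl( \Box_{A + \delta A} - \Box_A \bigr) \phi , \\
  \bigl( \Box_{A + \delta A} - \Box_A \bigr) \phi
  &= 2 i\, \delta A_\alpha \partial^\alpha \phi + i (\partial^\alpha \delta A_\alpha)\, \phi
     - \bigl( 2 A^\alpha \delta A_\alpha + \delta A^\alpha \delta A_\alpha \bigr) \phi .
\end{align*}
% With the metric diag(-1, +1, ..., +1) one has \partial^0 = -\partial_t and \partial^j = \partial_j, hence
\[
  \delta A_\alpha \partial^\alpha \phi = \delta A_j \partial^j \phi - \delta A_0\, \partial_t \phi .
\]
```

The first term in the expansion of $(\Box_{A+\delta A} - \Box_A)\phi$ pairs the perturbation $\delta A$ with a derivative of the background field $\phi$, which is exactly the structure of the interaction term singled out above. It has no counterpart in the equation for the unperturbed solution.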

Step 3b.

Improving the bounds for δφ using (7.31) . In order to obtain bounds on the S ( I × R )norm of P δφ by bootstrap, we work with suitable Schwartz extensions and use the linear estimate(3.3) for the magnetic wave operator (cid:3) p ( A + δ A ) free , ( I ) . We ﬁrst consider the new interaction term (7.32).As usual the main di ﬃ culty comes from the low-high interactions, so we begin with this case, i.e.the term P < ( δ A ) j P ∂ j φ − P < ( δ A ) P ∂ t φ. For the spatial components of δ A , we deﬁne( δ A ) f ree , ( I ) : = ( A + δ A ) f ree , ( I ) − A f ree , ( I ) , ( δ A ) nonlin , ( I ) : = ( A + δ A ) nonlin , ( I ) − A nonlin , ( I ) and correspondingly have on I × R that( δ A ) | I = ( δ A ) f ree , ( I ) + ( δ A ) nonlin , ( I ) . We can therefore split on I × R , P < ( δ A ) j P ∂ j φ = P < ( δ A ) f ree , ( I ) j P ∂ j φ + P < ( δ A ) nonlin , ( I ) j P ∂ j φ. The ﬁrst term on the right hand side can in turn be split into two contributions(7.33) P < ( δ A ) f ree , ( I ) j P ∂ j φ = P < ( δ A ) f ree , ( I ) j P ∂ j φ + P < ( δ A ) f ree , ( I ) j P ∂ j φ, where ( δ A ) f ree , ( I ) is the free evolution of the data ( δ A ) ( L ) [0], while ( δ A ) f ree , ( I ) is the free wave withdata (cid:18)X k , j (cid:3) − P k Q j P (cid:0) χ I ( φ + δφ ) · ∇ x (cid:0) χ I ( φ + δφ ) (cid:1) − χ I i ( A + δ A ) | φ + δφ | (cid:1)(cid:19) [0] , − (cid:18)X k , j (cid:3) − P k Q j P (cid:0) ( χ I φ ) · ∇ x ( χ I φ ) − χ I iA | φ | (cid:1)(cid:19) [0] . In order to estimate the terms on the right hand side of (7.33), we will invoke the following estimatefrom [22] for a free wave A f ree in Coulomb gauge for k ≤ k − C ,(7.34) (cid:13)(cid:13)(cid:13) P k A f reej P k ∂ j φ (cid:13)(cid:13)(cid:13) N . (cid:13)(cid:13)(cid:13) P k A f ree (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S . (cid:13)(cid:13)(cid:13) P k A f ree [0] (cid:13)(cid:13)(cid:13) ˙ H x × L x (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S . 
We begin with the ﬁrst term on the right hand side of (7.33), P < ( δ A ) f ree , ( I ) j P ∂ j φ. Here we have to take advantage of the properties of the Fourier support of the data P < ( δ A ) f ree , ( I ) j [0].It su ﬃ ces to assume that (cid:13)(cid:13)(cid:13) P k ( δ A ) f ree j [0] (cid:13)(cid:13)(cid:13) ˙ H x × L x ≤ C (cid:0) c δ A , k + d δ A , k (cid:1) for C ≡ C ( E crit ) su ﬃ ciently large. This is an assumption that will hold inductively at later initialtimes (for the intervals I , . . . , I N ). We observe that the frequency envelope (cid:8) c δ A , k (cid:9) k ∈ Z is “sharplylocalized” to the dyadic frequency interval [ a L , b L ] in the sense that it is exponentially decaying for k < a L and k > b L . By (7.34) we then have(7.35) (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) j P ∂ j φ (cid:13)(cid:13)(cid:13) N ( I × R ) . X k < c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) + X k < d δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . We begin to estimate the ﬁrst term on the right hand side of (7.35), where we only consider the casewhen a L <

0. For R > ﬃ ciently large later on, we split X k < c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) = X k ≤ a L − R c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) + X a L − R < k ≤ a L + R c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) + X a L + R < k < c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . (7.36)To make the ﬁrst term on the right hand side of (7.36) small, we use the exponential decay of thefrequency envelope (cid:8) c δ A , k (cid:9) k ∈ Z to bound X k ≤ a L − R c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . X k ≤ a L − R − σ ( a L − k ) k δ A [0] k ˙ H x × L x k P φ k S ( I × R ) . E crit − σ R k P φ k S ( I × R ) . Upon replacing the output frequency 0 by general l ∈ Z , square summing over l and choosing R > ﬃ ciently large, we bound the preceding by ≪ E crit δ , as desired. In order to make thethird term on the right hand side of (7.36) small, we exploit that by Step 1 the S norms of P l φ areexponentially decaying beyond the scale l > a L . We have X a L + R < k < c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . E crit ( | a L | − R ) c , where { c l } l ∈ Z is a su ﬃ ciently ﬂat frequency envelope covering the initial data (cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1) [0]as in Step 1. Then replacing the frequency 0 in P φ by a general dyadic frequency l > a L + R ,square summing over l and choosing R > ﬃ ciently large, we ﬁnd X l > a L + R (cid:12)(cid:12)(cid:12)(cid:12) X a L + R < k < l c δ A , k (cid:13)(cid:13)(cid:13) P l φ (cid:13)(cid:13)(cid:13) S ( I × R ) (cid:12)(cid:12)(cid:12)(cid:12) . E crit X l > a L + R ( l − a L − R ) c l . E crit − σ R ≪ E crit δ , which is acceptable. 
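The gain in $R$ for the low frequency tail comes from summing a geometric series. The display below spells this out as a sketch, assuming (as the bound above suggests) that the envelope weight below $a_L$ is of the form $2^{-\sigma(a_L - k)}$ with $\sigma > 0$, and that $k$, $a_L$ and $R$ are integers.

```latex
% Geometric tail sum below the frequency a_L - R; substitute m = a_L - k.
\[
  \sum_{k \le a_L - R} 2^{-\sigma (a_L - k)}
  = \sum_{m \ge R} 2^{-\sigma m}
  = \frac{2^{-\sigma R}}{1 - 2^{-\sigma}} .
\]
% For fixed \sigma > 0 the right hand side tends to 0 as R -> \infty,
% so the tail can be made smaller than any prescribed threshold by choosing R large.
```

The constant $(1 - 2^{-\sigma})^{-1}$ depends only on $\sigma$, so the choice of $R$ is independent of $n$ and of the stage $L$ of the induction.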
It remains to make the second term on the right hand side of (7.36) small.To this end we exploit the frequency evacuation property of the data (cid:0) A n Λ , φ n Λ (cid:1) at the edges of thefrequency intervals [ a L , b L ] that we established in Lemma 7.3. For su ﬃ ciently small δ > ﬃ ciently large n , we then have X a L − R < k ≤ a L + R c δ A , k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . R δ (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) ≪ δ (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . Upon replacing the frequency 0 in P φ by an arbitrary dyadic frequency l ∈ Z and square summing,we obtain the desired smallness ≪ E crit δ for the last estimate.The contribution of the second term on the right hand side of (7.35) is acceptable, because, uponreplacing the output frequency 0 by l ∈ Z , square summing and using the bootstrap assumptions onthe interval I , we obtain the bound (cid:18)X l (cid:12)(cid:12)(cid:12)(cid:12)X k < l d δ A , k (cid:13)(cid:13)(cid:13) P l φ (cid:13)(cid:13)(cid:13) S ( I × R ) (cid:12)(cid:12)(cid:12)(cid:12) (cid:19) . E crit δ ≪ δ , where the implied constant in . E crit depends at most polynomially on E crit .Next, we estimate the second term on the right hand side of (7.33), P < ( δ A ) f ree , ( I ) j P ∂ j φ. By (7.34) we have(7.37) (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) j P ∂ j φ (cid:13)(cid:13)(cid:13) N ( I × R ) . (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) (cid:13)(cid:13)(cid:13) ℓ S ( I × R ) (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 77

We illustrate how to obtain the desired smallness in this case by assuming for simplicity that P < ( δ A ) f ree , ( I ) is just the free evolution of the data X k < X j (cid:3) − P k Q j P (cid:0) χ I φ · ∇ x ( χ I δφ ) (cid:1) [0] . If δφ = ( δφ ) ① , we obtain by similar estimates as in the proof of Lemma 7.6 that (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) (cid:13)(cid:13)(cid:13) ℓ S ( I × R ) . X k < (cid:13)(cid:13)(cid:13) χ I φ (cid:13)(cid:13)(cid:13) S c δφ, k . Then we achieve the desired smallness for (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) j P ∂ j φ (cid:13)(cid:13)(cid:13) N ( I × R ) . E crit X k < c δφ, k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) by proceeding exactly as with the term (7.36). If instead δφ = ( δφ ) ② , we ﬁnd (cid:13)(cid:13)(cid:13) P < ( δ A ) f ree , ( I ) j P ∂ j φ (cid:13)(cid:13)(cid:13) N ( I × R ) . (cid:13)(cid:13)(cid:13) χ I φ (cid:13)(cid:13)(cid:13) S (cid:18)X k d δφ, k (cid:19) (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . δ (cid:13)(cid:13)(cid:13) χ I φ (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I × R ) . Upon replacing the output frequency 0 by l ∈ Z , square summing and using that k φ k S ( I × R ) . C ( E crit ) by the choice of the interval I , we obtain the bound . E crit δ . This is unfortunately not yetenough to close the bootstrap. To gain the extra smallness we partition the interval I further and usedivisibility arguments as in the proof of Lemma 7.6. However, the number of intervals needed forthis partition depends only on E crit (and not on the stage of the induction), which is acceptable.This ﬁnishes the estimate of the contribution of P < ( δ A ) f ree , ( I ) j P ∂ j φ and we now have to bound (cid:13)(cid:13)(cid:13) P < ( δ A ) nonlin , ( I ) j P ∂ j φ − P < ( δ A ) P ∂ t φ (cid:13)(cid:13)(cid:13) N ( I × R ) . 
At this point we can proceed by analogy to the treatment of the $\phi$ equation in the proof of Proposition 7.5. After a further partitioning of the interval $I$ and similar divisibility arguments, we can replace the output frequency $0$ by $l \in \mathbb{Z}$ and upon square summing, we obtain a bound of the desired form $\ll_{E_{crit}} \delta$.

The remaining frequency interactions in the estimate of the term (7.32) as well as all other terms on the right hand side of (7.31) are easier to control. We omit the details.

Step 3c. Improving the bounds for $\delta A$ using (7.28). In order to deduce $S^1(I \times \mathbb{R}^4)$ norm bounds on $P_0 \delta A$ from the perturbation equation (7.28) by bootstrap, we perform the same kind of divisibility arguments as in the proof of estimate (7.11) in Lemma 7.6 for the terms linear in $\delta\phi$ or $\delta A$.

Step 4. Repetition of the bootstrap on suitable space-time slices; proof that the energy of the perturbation remains small. In this final step we show that the crucial assumption on the energy of the perturbation
\[
E\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)(0) < \varepsilon
\]
remains intact along the evolution up to a very small correction. We recall that
\[
\delta A^{(L)} = A^{n,(L)}_{\Lambda} - A^{n,(L-1)}_{\Lambda}, \qquad \delta\phi^{(L)} = \phi^{n,(L)}_{\Lambda} - \phi^{n,(L-1)}_{\Lambda}.
\]

Lemma 7.7. Assuming the bounds (7.29) on the evolution of $\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)$ on $I_1 \times \mathbb{R}^4$, we have for sufficiently small $\delta > 0$ and all sufficiently large $n$ that
\[
E\bigl( \delta A^{(L)}, \delta\phi^{(L)} \bigr)(t) < \varepsilon \quad \text{for } t \in I_1.
\]

Proof.

By energy conservation for the evolutions of (cid:0) A n , ( L ) Λ , φ n , ( L ) Λ (cid:1) and of (cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1) , itsu ﬃ ces to show that (cid:12)(cid:12)(cid:12)(cid:12) E (cid:0) A n , ( L ) Λ , φ n , ( L ) Λ (cid:1) ( t ) − E (cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1) ( t ) − E (cid:0) δ A ( L ) , δφ ( L ) (cid:1) ( t ) (cid:12)(cid:12)(cid:12)(cid:12) can be made arbitrarily small uniformly for all t ∈ I by choosing δ > ﬃ ciently small and n su ﬃ ciently large. This reduces to bounding the following expression evaluated at any time t ∈ I , (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X i < j Z R (cid:0) ∂ i A n , ( L − Λ , j (cid:1)(cid:0) ∂ i δ A ( L ) j (cid:1) dx + X i Z R (cid:0) ∂ t A n , ( L − Λ , i (cid:1)(cid:0) ∂ t δ A ( L ) i (cid:1) + (cid:0) ∂ i A n , ( L ) Λ , (cid:1)(cid:0) ∂ i δ A ( L − (cid:1) dx + X α Z R (cid:12)(cid:12)(cid:12)(cid:0) A n , ( L − Λ ,α (cid:1)(cid:0) δφ ( L ) (cid:1) + (cid:0) δ A ( L ) α (cid:1)(cid:0) φ n , ( L − Λ (cid:1)(cid:12)(cid:12)(cid:12) dx + X α Re Z R (cid:0) ∂ α φ n , ( L − Λ + iA n , ( L − Λ ,α φ n , ( L − Λ (cid:1)(cid:0) ∂ α δφ ( L ) + i δ A ( L ) α δφ ( L ) (cid:1) + (cid:0) ∂ α δφ ( L ) + i δ A ( L ) α δφ ( L ) (cid:1)(cid:0) iA n , ( L − Λ ,α δφ ( L ) + i δ A ( L ) α φ n , ( L − Λ (cid:1) + (cid:0) ∂ α φ n , ( L − Λ + iA n , ( L − Λ ,α φ n , ( L − Λ (cid:1)(cid:0) iA n , ( L − Λ ,α δφ ( L ) + i δ A ( L ) α φ n , ( L − Λ (cid:1) dx (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . We note that in this expression at least one term of the form A n , ( L − Λ or φ n , ( L − Λ is paired againstat least one term of the form δ A ( L ) or δφ ( L ) . By Plancherel’s theorem (and a Littlewood-Paleytrichotomy to deal with the multilinear interactions), we reduce to estimating a sum of the form (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) X k ∈ Z X i < j Z R P k (cid:0) ∂ i A n , ( L − Λ , j (cid:1) P k (cid:0) ∂ i δ A ( L ) j (cid:1) dx + . . . (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . 
By the bounds (7.29) and Step 1, we estimate this by . X k ∈ Z (cid:13)(cid:13)(cid:13) P k ∇ x A n , ( L − Λ ( t ) (cid:13)(cid:13)(cid:13) L x (cid:13)(cid:13)(cid:13) P k ∇ x δ A ( L ) ( t ) (cid:13)(cid:13)(cid:13) L x + . . . . X k ∈ Z c ( L − k (cid:0) c ( L ) δ A , k + d ( L ) δ A , k (cid:1) + . . . To see that this expression can be made arbitrarily small, we split X k ∈ Z c ( L − k c ( L ) δ A , k = X k ≤ a L − R c ( L − k c ( L ) δ A , k + X a L − R < k ≤ a L + R c ( L − k c ( L ) δ A , k + X k > a L + R c ( L − k c ( L ) δ A , k . The ﬁrst term can be made arbitrarily small for su ﬃ ciently large R > (cid:8) c ( L ) δ A , k (cid:9) k ∈ Z beyond [ a L , b L ]. Similarly, we achieve smallness for the thirdterm for su ﬃ ciently large R > (cid:8) c ( L − k (cid:9) k ∈ Z for k > a L established inStep 1. Finally, we gain smallness for the second term for all su ﬃ ciently large n from the frequencyevacuation property in Lemma 7.3. Moreover, we have by (7.29) that X k ∈ Z c ( L − k d ( L ) δ A , k . E crit δ ( δ ) ≪ . (cid:3) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 79
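Two elementary facts underlie the preceding proof: the difference of energies reduces to cross terms, and the cross terms are small once one of the two frequency envelopes is small on the relevant frequency range. The sketch below records both in illustrative notation. The quadratic form $Q$, its bilinear form $B$, the sequences $(c_k)$ and $(c'_k)$, and the cutoff $a - R$ are placeholders of our own. The Maxwell-Klein-Gordon energy also contains terms of higher than quadratic order, which is why the expression estimated in the proof has more terms than the model identity shown here.

```latex
% (1) Polarization: for a quadratic form Q with symmetric bilinear form B, Q(u) = B(u, u),
\[
  Q(u + v) - Q(u) - Q(v) = 2 B(u, v) ,
\]
% so only cross terms pairing the background u with the perturbation v survive.
%
% (2) Cauchy-Schwarz on a restricted frequency range: for nonnegative sequences (c_k), (c'_k),
\[
  \sum_{k \le a - R} c_k\, c'_k
  \le \Bigl( \sum_{k \in \mathbb{Z}} c_k^2 \Bigr)^{1/2}
      \Bigl( \sum_{k \le a - R} (c'_k)^2 \Bigr)^{1/2} .
\]
% If (c'_k) decays exponentially for k below a, the second factor tends to 0 as R -> \infty,
% while the first factor stays bounded by the total energy of the background.
```

The same inequality applies on the range $k > a + R$ with the roles of the two sequences exchanged, which leaves only the finitely many frequencies near $a$ to be handled separately.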

We now summarize how the previous steps yield the proof of Proposition 7.4. In order to derive S norm bounds on the evolutions (cid:0) A n , ( L ) Λ , φ n , ( L ) Λ (cid:1) , we ﬁrst use Proposition 7.5 from Step 2 topartition the time axis R into N = N (cid:16)(cid:13)(cid:13)(cid:13)(cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1)(cid:13)(cid:13)(cid:13) S ( R × R ) (cid:17) many time intervals I , . . . , I N ,on which certain “divisible norms” of (cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1) are small in terms of E crit . Let I be thetime interval containing t =

0. By construction of the frequency intervals J L , 1 ≤ L ≤ C , theenergy of the perturbation (cid:0) δ A ( L ) , δφ ( L ) (cid:1) [0] = (cid:0) A n , ( L ) Λ − A n , ( L − Λ , φ n , ( L ) Λ − φ n , ( L − Λ (cid:1) [0]at time t = ε . Thus, we can prove frequency localized S normbounds for (cid:0) δ A ( L ) , δφ ( L ) (cid:1) on I × R by bootstrap as in Step 3. Crucially, Lemma 7.7 from Step 4ensures that the energy of the perturbation (cid:0) δ A ( L ) , δφ ( L ) (cid:1) [ t ], t ∈ I , is approximately conserved onthe time interval I , up to a very small error term that is controlled by the size of δ . Hence, we canensure that at the starting points of all later (or earlier) time intervals I , . . . , I N , the energy of theperturbation is still less than the absolute constant ε by choosing δ su ﬃ ciently small dependingon the number N of “divisibility intervals”, which is bounded by the size of (cid:13)(cid:13)(cid:13)(cid:0) A n , ( L − Λ , φ n , ( L − Λ (cid:1)(cid:13)(cid:13)(cid:13) S ( R × R ) ≤ C . This allows us to repeat the same bootstrap argument from Step 3 on all other time intervals I , . . . , I N . Putting all estimates together, we then obtain the desired S norm bounds (cid:13)(cid:13)(cid:13)(cid:0) A n , ( L ) Λ , φ n , ( L ) Λ (cid:1)(cid:13)(cid:13)(cid:13) S ( R × R ) ≤ C ( C ) , where the bound C only depends on the size of C . (cid:3) Interlude: Proofs of Proposition 4.3, Proposition 5.1, Proposition 5.13 and Proposition 6.1.

Proof of Proposition 4.3.

Let E denote the conserved energy of the admissible solution ( A , φ ).Analogously to the proof of Proposition 7.5, we can partition the time interval ( − T , T ) into N = N (cid:0) k ( A , φ ) k S (( − T , T ) × R ) (cid:1) many intervals I such that A | I = A f ree , ( I ) + A nonlin , ( I ) , (cid:3) A f ree , ( I ) = (cid:13)(cid:13)(cid:13) ∇ t , x A f ree , ( I ) (cid:13)(cid:13)(cid:13) L ∞ t L x ( R × R ) . E / , (cid:13)(cid:13)(cid:13) A nonlin , ( I ) (cid:13)(cid:13)(cid:13) ℓ S ( I × R ) ≪ , k φ k S ( I × R ) . C ( E ) , where C ( · ) grows at most polynomially. For each such interval I , say of the form I = [ t , t ] forsome t < t , we let { c k } k ∈ Z be a su ﬃ ciently ﬂat frequency envelope covering the data ( A , φ )[ t ] attime t . Then we show that the bootstrap assumption (cid:13)(cid:13)(cid:13) P k A (cid:13)(cid:13)(cid:13) S k ( I × R ) + (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ Dc k for D = D ( E ) su ﬃ ciently large, implies the improved bound (cid:13)(cid:13)(cid:13) P k A (cid:13)(cid:13)(cid:13) S k ( I × R ) + (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k ( I × R ) ≤ D c k . We only discuss the equation for φ , because the equation for A is easier. It su ﬃ ces to consider thecase k =

0. On I × R we may rewrite the equation for φ into the following frequency localizedform (cid:3) pA free , ( I ) (cid:0) P φ (cid:1) = − (cid:2) P , (cid:3) pA free , ( I ) (cid:3) φ − iP (cid:18)X k P > k − C A f ree , ( I ) j P k ∂ j φ (cid:19) − iP (cid:0) A nonlin , ( I ) j ∂ j φ − A ∂ t φ (cid:1) + P (cid:0) i ( ∂ t A ) φ + A α A α φ (cid:1) . (7.38)In order to close the bootstrap argument we now translate the estimates in the proof of Proposition7.5 into the language of frequency envelopes. For example, to bound the high-high interactions inthe term P (cid:18)X k P > k − C A f ree , ( I ) j P k ∂ j φ (cid:19) , we use estimate (131) from [22] to obtain X k = k + O (1) k > O (1) (cid:13)(cid:13)(cid:13) P (cid:0) P k A f ree , ( I ) j P k ∂ j φ (cid:1)(cid:13)(cid:13)(cid:13) N ( I × R ) . X k = k + O (1) k > O (1) − δ k (cid:13)(cid:13)(cid:13) P k A f ree , ( I ) (cid:13)(cid:13)(cid:13) S ( I × R ) (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S ( I × R ) . X k = k + O (1) k > O (1) − δ k (cid:13)(cid:13)(cid:13) P k A f ree , ( I ) (cid:13)(cid:13)(cid:13) S ( I × R ) c k . Summing over all su ﬃ ciently large k ≫ ≤ D c . This allows us to reduce to the case k = k + O (1) = O (1). Here we gain the necessary smallnessby further partitioning the interval I (where the total number of subintervals depends only on thesize of E ), using exactly the same divisibility argument as for the term (7.23) in the proof of Propo-sition 7.5. All other terms on the right hand side of (7.38) can be treated analogously to the aboveargument. (cid:3) Proof of Proposition 5.1.

Here we are in the situation of Step 3 of the proof of Proposition 7.4. Weobtain the bound (cid:13)(cid:13)(cid:13) ( δ A , δφ ) (cid:13)(cid:13)(cid:13) S ([ − T , T ] × R ) . L δ for su ﬃ ciently small δ by means of a bootstrap argument performed on a ﬁnite number of space-time slices, whose number depends on L . We select these space-time slices as in Proposition 7.5.The main di ﬃ culty arises from the equation for φ . As in Step 3a of the proof of Proposition 7.4, welocalize the equation for φ to frequency 0 on a suitable space-time slice I × R with 0 ∈ I . Then, asin Step 3b there, the main di ﬃ culty comes from the new low-high interaction term P < ( δ A ) j P ∂ j φ − P < ( δ A ) P ∂ t φ. Using notation from the proof of Proposition 7.4, the worst contribution comes from(7.39) P < δ A f ree , ( I ) j P ∂ j φ, where we recall that δ A f ree , ( I ) is the free evolution of the data δ A [0]. We observe that for δ A f ree , ( I ) the interaction term (7.39) vanishes by assumption on the frequency support of δ A [0] unless K ≤ A , φ )[0] and by Proposition 4.3, we obtain (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ([ − T , T ] × R ) . L σ K . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 81

More generally, replacing the frequency 0 by l ∈ Z with l ≥ K , we have (cid:13)(cid:13)(cid:13) P l φ (cid:13)(cid:13)(cid:13) S ([ − T , T ] × R ) . L − σ ( l − K ) . By estimate (7.34), we then ﬁnd for l ≥ K that (cid:13)(cid:13)(cid:13) P < l δ A f ree , ( I ) j P l ∂ j φ (cid:13)(cid:13)(cid:13) N l ( I × R ) . L δ | l − K | − σ ( l − K ) , where the extra factor | l − K | arises due to the ℓ summation over the frequencies of P < l δ A f ree , ( I ) .But then we get the bound (cid:18)X l ≥ K (cid:13)(cid:13)(cid:13) P < l δ A f ree , ( I ) j P l ∂ j φ (cid:13)(cid:13)(cid:13) N l ( I × R ) (cid:19) . L δ , which gives the required smallness for this term. Then the argument proceeds as for Proposition 7.4. (cid:3) Proof of Proposition 5.13.

We write for large $R_2 \geq R_1 \geq R_0$
\[
\big( \tilde{A}_{R_2}, \tilde{\phi}_{R_2} \big) = \big( \tilde{A}_{R_1} + \delta A, \, \tilde{\phi}_{R_1} + \delta\phi \big).
\]
Then we analyze the equations for $(\delta A, \delta\phi)$. In fact, the only new feature occurs for the $\delta\phi$ equation and so we explain this here. We obtain the equation
\[
\Box_{\tilde{A}_{R_1} + \delta A}(\delta\phi) + \big( \Box_{\tilde{A}_{R_1}+\delta A} - \Box_{\tilde{A}_{R_1}} \big) \tilde{\phi}_{R_1} = 0.
\]
Here we only retain the key difficult term that cannot be treated via a perturbative argument using suitable divisibility properties, as is done in great detail, for example, in Step 3 of the proof of Proposition 7.4. This term is given by
\[
\sum_{k \in \mathbb{Z}} P_{<k} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1}.
\]
However, since we localize to a small time interval $[-T,T]$ around $t = 0$, it will be possible to obtain good $N$ norm bounds. Note that on account of the estimates in Subsection 5.1, we may assume that
\[
\limsup_{R\to\infty} \big\| (\tilde{A}_R, \tilde{\phi}_R) \big\|_{S([-T,T]\times\mathbb{R}^4)} < \infty
\]
for all $R$ sufficiently large, provided that $T$ is sufficiently small. We shall assume, as we may, that $T < 1$. Then write
\begin{align*}
\sum_k P_{<k}(\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1}
&= \sum_k P_{<\min\{-\log R_1, k\}} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \\
&\quad + \sum_k P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1}. \tag{7.40}
\end{align*}
The last term will be estimated by taking advantage of Huygens' principle as well as our particular choice of initial data, namely that $\tilde{\phi}_{R_1}[0]$ is supported on the set $\{ |x| \leq R_1 \}$, while $\delta A[0]$ is supported on $\{ |x| \geq R_1 \}$ up to tails that essentially decay exponentially fast.

We now estimate both terms on the right hand side of (7.40). For the first term we find
\begin{align*}
\Big\| \sum_k P_{<\min\{-\log R_1, k\}} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \Big\|_{N([-T,T]\times\mathbb{R}^4)}
&\lesssim \Big( \sum_k \big\| P_{<\min\{-\log R_1, k\}} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\lesssim \sup_l \big\| P_{<\min\{-\log R_1, l\}} \delta A^{free} \big\|_{L^\infty_t L^\infty_x} \Big( \sum_k \big\| P_k \nabla_x \tilde{\phi}_{R_1} \big\|_{L^\infty_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\lesssim R_1^{-1} \, \big\| \delta A[0] \big\|_{\dot{H}^1_x \times L^2_x} \, \big\| \tilde{\phi}_{R_1} \big\|_{S([-T,T]\times\mathbb{R}^4)}
\end{align*}
and so this converges to $0$ as $R_1 \to +\infty$. For the second term we have
\begin{align*}
\Big\| \sum_k P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \Big\|_{N([-T,T]\times\mathbb{R}^4)}
&\lesssim \Big( \sum_k \big\| P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\lesssim \Big( \sum_k \big\| \chi_{\{|x| < R_1\}} P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\quad + \Big( \sum_k \big\| \chi_{\{|x| \geq R_1\}} P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2}.
\end{align*}
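The decay in $R$ obtained for the first term of (7.40) can be traced back to Bernstein's inequality in four space dimensions. The following is only a sketch under assumptions that are not stated in the text: $\log$ is taken to be the base-2 logarithm, $f$ is a placeholder for $\delta A^{free}(t,\cdot)$ at a fixed time $t$, and the last step uses energy conservation for free waves.

```latex
\begin{align*}
% Bernstein in R^4: passing from L^2 to L^\infty at frequency 2^k costs 2^{(4/2)k},
% while one derivative at frequency 2^k is worth a factor 2^k.
\| P_k f \|_{L^\infty_x(\mathbb{R}^4)}
  &\lesssim 2^{2k} \| P_k f \|_{L^2_x}
   \sim 2^{k} \| P_k \nabla_x f \|_{L^2_x}, \\
% Summing the geometric series over all frequencies below 2^{-\log R} = R^{-1}.
\| P_{<-\log R} f \|_{L^\infty_x}
  &\leq \sum_{k < -\log R} \| P_k f \|_{L^\infty_x}
   \lesssim \sum_{k < -\log R} 2^{k} \, \| \nabla_x f \|_{L^2_x}
   \lesssim R^{-1} \| \nabla_x f \|_{L^2_x}, \\
% Energy conservation for the free wave \delta A^{free}.
\| \nabla_{t,x} \delta A^{free}(t) \|_{L^2_x}
  &= \| \delta A[0] \|_{\dot{H}^1_x \times L^2_x}.
\end{align*}
```

Taking the supremum over $t \in [-T,T]$ then gives an $L^\infty_t L^\infty_x$ bound of size $R^{-1} \| \delta A[0] \|_{\dot{H}^1_x \times L^2_x}$ for the low frequency factor.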
For the last term but one we use the localization properties of $\delta A^{free}$ to conclude
\begin{align*}
&\Big( \sum_k \big\| \chi_{\{|x| < R_1\}} P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\quad \lesssim \sup_{l > -\log R_1} \big\| \chi_{\{|x| < R_1\}} P_{[-\log R_1, l]} (\delta A^{free}) \big\|_{L^\infty_t L^\infty_x([-T,T]\times\mathbb{R}^4)} \Big( \sum_k \big\| P_k \nabla_x \tilde{\phi}_{R_1} \big\|_{L^\infty_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\quad \lesssim R_1^{-M} \big\| \tilde{\phi}_{R_1} \big\|_{S([-T,T]\times\mathbb{R}^4)} \lesssim R_1^{-M},
\end{align*}
while for the last term, we get
\begin{align*}
&\Big( \sum_k \big\| \chi_{\{|x| \geq R_1\}} P_{[-\log R_1, k]} (\delta A^{free})_j \, P_k \partial^j \tilde{\phi}_{R_1} \big\|_{L^1_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \\
&\quad \lesssim \Big( \sum_k \big\| P_{[-\log R_1, k]} (\delta A^{free}) \big\|_{L^\infty_t L^\infty_x}^2 \, \big\| \chi_{\{|x| \geq R_1\}} P_k \nabla_x \tilde{\phi}_{R_1} \big\|_{L^\infty_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2}.
\end{align*}
Here we use the localization properties of $\tilde{\phi}_{R_1}$ to bound the second factor by
\[
\big\| \chi_{\{|x| \geq R_1\}} P_k \nabla_x \tilde{\phi}_{R_1} \big\|_{L^\infty_t L^2_x([-T,T]\times\mathbb{R}^4)} \lesssim \big( \max\{2^k, 1\} R_1 \big)^{-M}
\]
as long as $k > -\log R_1$, as we may assume. We also have the crude bound
\[
\big\| P_{[-\log R_1, k]} (\delta A^{free}) \big\|_{L^\infty_t L^\infty_x} \lesssim 2^k,
\]
whence we finally obtain the bound
\[
\Big( \sum_k \big\| P_{[-\log R_1, k]} (\delta A^{free}) \big\|_{L^\infty_t L^\infty_x}^2 \, \big\| \chi_{\{|x| \geq R_1\}} P_k \nabla_x \tilde{\phi}_{R_1} \big\|_{L^\infty_t L^2_x([-T,T]\times\mathbb{R}^4)}^2 \Big)^{1/2} \lesssim R_1^{-M}.
\]
Letting $R_1 \to +\infty$ then again gives the required smallness. $\Box$

Proof of Proposition 6.1.

In view of Lemma 5.4, it suffices to consider the case $I = \mathbb{R}$. We argue by contradiction. Assume that we have
\[
\| (A, \phi) \|_{S(\mathbb{R}\times\mathbb{R}^4)} < \infty. \tag{7.41}
\]
The idea is that, using this ingredient as well as a correct perturbative ansatz for the evolutions $(A_n, \phi_n)$ for $n$ large enough, we can show that the corresponding $S$ norms of $(A_n, \phi_n)$ must stay finite, contradicting the assumption. We introduce the perturbative term $\delta A_n$ for the magnetic potential by
\[
A_n = A + \delta A_n
\]
and the perturbative term $\delta\phi_n$ by means of
\[
\phi_n = \chi_I \phi + \chi_{I^c} \tilde{\phi}_{A, \delta A^{free}_n} + \delta\phi_n,
\]
where $I$ is a very large time interval centered around $t = 0$ and $I^c$ represents its complement. The function $\tilde{\phi}_{A,\delta A^{free}_n}$ solves the wave equation
\[
\widetilde{\Box}_{A + \delta A^{free}_n} \big( \tilde{\phi}_{A,\delta A^{free}_n} \big) = 0, \qquad \tilde{\phi}_{A,\delta A^{free}_n}[0] = \phi[0],
\]
where
\[
\widetilde{\Box}_{A + \delta A^{free}_n} = \Box + 2i \big( A + \delta A^{free}_n \big)_\nu \partial^\nu
\]
and in this context we let $\delta A^{free}_n$ be the actual free evolution of the data $\delta A_n[0]$ (as usual only the spatial components). We let $\chi_I, \chi_{I^c}$ be a smooth partition of unity subordinate to dilates of the intervals $I, I^c$. We note that in this argument one in fact has to replace the energy class solution $(A,\phi)$ by the evolution of a low frequency approximation of the energy class data very close to it, and then show that this implies $S$ norm bounds for $(A_n, \phi_n)$ uniformly for all sufficiently close low frequency approximations.

To begin with, observe that we can show by a variant of the proof of Lemma 7.9, proved later independently, that given any $\gamma > 0$ and choosing $I$ suitably large (depending on $A, \phi, \gamma$), we can arrange that
\[
\tilde{\phi}_{A,\delta A^{free}_n} = \big( \tilde{\phi}_{A,\delta A^{free}_n} \big)_1 + \big( \tilde{\phi}_{A,\delta A^{free}_n} \big)_2
\]
with
\[
\big\| \big( \tilde{\phi}_{A,\delta A^{free}_n} \big)_1 \big\|_S < \gamma, \qquad \big\| \chi_{I^c} \big( \tilde{\phi}_{A,\delta A^{free}_n} \big)_2 \big\|_{L^\infty_t L^\infty_x} < \gamma.
\]
Now the equation for $\delta\phi_n$ becomes the following
\begin{align*}
\Box_{A + \delta A_n} \delta\phi_n
&= - \chi_I \Box_{A+\delta A_n} \phi - \chi_{I^c} \Box_{A + \delta A_n} \tilde{\phi}_{A,\delta A^{free}_n} + (\partial_t^2 \chi_I) \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \\
&\quad + 2 (\partial_t \chi_I) \big( \partial_t \phi - \partial_t \tilde{\phi}_{A,\delta A^{free}_n} \big) + 2i (\partial_t \chi_I) (A + \delta A_n)_0 \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big).
\end{align*}
The error term $\partial_t^2(\chi_I) (\phi - \tilde{\phi}_{A,\delta A^{free}_n})$ is potentially problematic, because we cannot place the factor $(\phi - \tilde{\phi}_{A,\delta A^{free}_n})$ into $L^\infty_t L^2_x$. In fact, the latter is only possible provided we have compact spatial support (precisely, in a ball of radius $R$ with $1 \ll R \leq |I|$) according to the Huygens principle, because then the extra factor $|I|^{-1}$ stemming from $\partial_t^2(\chi_I)$ will counterbalance the factor $|I|$ in
\[
\big\| \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big\|_{L^\infty_t L^2_x(I\times\mathbb{R}^4)} \lesssim |I| \, \big\| \nabla_x \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_{L^\infty_t L^2_x(I\times\mathbb{R}^4)}.
\]
Here it is natural to truncate the data $\phi[0]$ in physical space to force this spatial localization later in time via Huygens' principle, but one needs to ensure that this does not destroy the good $S$ norm bounds for $(A,\phi)$. In fact, since we use the same $A$ data, the argument for Proposition 5.1 applies to yield a global $S$ norm bound for the new $(A,\phi)$. We then incorporate the error due to truncating the data $\phi[0]$ into $\delta\phi_n$ (while $\delta A_n[0]$ remains unchanged!), and hence infer the desired bound
\[
\big\| \partial_t^2(\chi_I) \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_N \lesssim \big\| \nabla_x \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_{L^\infty_t L^2_x(I\times\mathbb{R}^4)}.
\]
This gives the required smallness provided we can make $\big\| \nabla_x \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_{L^\infty_t L^2_x(I\times\mathbb{R}^4)}$ small.
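The way the two powers of $|I|$ cancel in the preceding bound can be made explicit. The following is only a sketch under assumptions that are not stated in the text: the cutoff is taken of the form $\chi_I(t) = \chi(t/|I|)$ for a fixed smooth profile $\chi$, the $N$ norm is controlled by $L^1_t L^2_x$, and $f$ is a placeholder for $\phi - \tilde{\phi}_{A,\delta A^{free}_n}$ with $f(t,\cdot)$ supported in a ball of radius $R \leq |I|$ centered at the origin.

```latex
\begin{align*}
% Two time derivatives of the rescaled cutoff cost |I|^{-2}; integrating in time returns |I|.
\| \partial_t^2 \chi_I \|_{L^1_t}
  &= |I|^{-2} \int_{\mathbb{R}} \big| \chi''(t/|I|) \big| \, dt
   = |I|^{-1} \| \chi'' \|_{L^1}, \\
% Support in {|x| <= R} plus Hardy's inequality in R^4 (constant 2/(n-2) = 1 for n = 4).
\| f(t) \|_{L^2_x}
  &\leq R \, \big\| |x|^{-1} f(t) \big\|_{L^2_x}
   \leq R \, \| \nabla_x f(t) \|_{L^2_x}
   \leq |I| \, \| \nabla_x f(t) \|_{L^2_x}, \\
% Hoelder in time combines the two bounds.
\big\| (\partial_t^2 \chi_I) f \big\|_{L^1_t L^2_x}
  &\leq \| \partial_t^2 \chi_I \|_{L^1_t} \, \| f \|_{L^\infty_t L^2_x}
   \lesssim |I|^{-1} \cdot |I| \, \| \nabla_x f \|_{L^\infty_t L^2_x}.
\end{align*}
```

Without the spatial localization the middle step is unavailable, which is why the data $\phi[0]$ is truncated in physical space.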
To make this quantity small, observe that (we omit the cubic interaction terms)
\[
\Box_A \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) = 2i (\delta A^{free}_n)_j \partial^j \tilde{\phi}_{A,\delta A^{free}_n} + \ldots,
\]
and further
\[
\big\| \chi_I \, 2i (\delta A^{free}_n)_j \partial^j \tilde{\phi}_{A,\delta A^{free}_n} \big\|_N \leq C(I, \phi, A) \, \big\| \delta A_n[0] \big\|_{\dot{H}^1_x \times L^2_x}.
\]
This implies
\[
\big\| \nabla_{t,x} \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_{L^\infty_t L^2_x(I\times\mathbb{R}^4)} \leq C(I,\phi,A) \, \big\| \delta A_n[0] \big\|_{\dot{H}^1_x \times L^2_x}
\]
and so we conclude that
\[
\big\| \partial_t(\chi_I) \big( \partial_t \phi - \partial_t \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_N + \big\| \partial_t^2(\chi_I) \big( \phi - \tilde{\phi}_{A,\delta A^{free}_n} \big) \big\|_N \lesssim C(I,\phi,A) \, \big\| \delta A_n[0] \big\|_{\dot{H}^1_x \times L^2_x}.
\]
Furthermore, we can write
\[
- \chi_I \Box_{A + \delta A_n} \phi = - \chi_I \Box_{A + \delta A^{nonlin}_n} \phi - \text{error},
\]
where we use the decomposition
\[
\delta A_n = \delta A^{free}_n + \delta A^{nonlin}_n
\]
with the first term on the right hand side being the free propagation of $\delta A_n[0]$. For the error term we get
\[
\| \text{error} \|_N \leq C(|I|, A) \, \big\| \delta A_n[0] \big\|_{\dot{H}^1_x \times L^2_x}.
\]
Furthermore, by using a divisibility argument and subdividing the time axis into $N\big( \|(A,\phi)\|_S \big)$ many time intervals $J$, using the argument for Proposition 7.4, we can force (for each such $J$)
\[
\big\| \chi_I \Box_{A + \delta A^{nonlin}_n} \phi \big\|_{N(J\times\mathbb{R}^4)} \ll \| \delta\phi_n \|_S + \| \delta A_n \|_S.
\]
Similarly, we have
\[
\big\| \chi_{I^c} \Box_{A + \delta A_n} \tilde{\phi}_{A,\delta A^{free}_n} \big\|_{N(J\times\mathbb{R}^4)} \ll \| \delta\phi_n \|_S + \| \delta A_n \|_S,
\]
which then suffices for the bootstrap for $\delta\phi_n$.

Next, we consider the equation for $\delta A_n$, which is of the schematic form
\begin{align*}
\Box \, \delta A_n
&= \phi \cdot \nabla_x \phi - \big( \chi_I \phi \big) \cdot \nabla_x \big( \chi_I \phi \big) \\
&\quad - \big( \chi_I \phi \big) \cdot \nabla_x \big( \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \\
&\quad - \big( \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \cdot \nabla_x \big( \chi_I \phi + \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \\
&\quad + (A + \delta A_n) \big| \chi_I \phi + \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big|^2 - A |\phi|^2.
\end{align*}

Then we make the following observations. The first line on the right hand side satisfies
\[
\big\| \phi \cdot \nabla_x \phi - \big( \chi_I \phi \big) \cdot \nabla_x \big( \chi_I \phi \big) \big\|_N \leq \nu
\]
for any prescribed $\nu > 0$, provided we pick $I$ sufficiently large. The reason for this is that this term is supported around the endpoints of $I$ (which is centered around $t = 0$), and since $\Box_A \phi = 0$, we obtain similarly to the proof of Lemma 7.9 the dispersive decay for $\phi$ at large times, which easily gives the desired smallness for the $N$ norm. For the second and third line on the right, we find
\begin{align*}
&\big\| \big( \chi_I \phi \big) \cdot \nabla_x \big( \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \big\|_{N(J\times\mathbb{R}^4)} + \big\| \big( \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \cdot \nabla_x \big( \chi_I \phi + \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big) \big\|_{N(J\times\mathbb{R}^4)} \\
&\quad \lesssim \nu + M^{-1} \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)} + C \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)}^2,
\end{align*}
where $J$ is a member of a suitable partition of the time axis into $N\big( \|(A,\phi)\|_S, M \big)$ many intervals and $C$ is a universal constant. Here we exploit the uniform dispersive decay of $\tilde{\phi}_{A,\delta A^{free}_n}$. The last line is handled similarly,
\begin{align*}
&\big\| (A + \delta A_n) \big| \chi_I \phi + \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n \big|^2 - A |\phi|^2 \big\|_{N(J\times\mathbb{R}^4)} \\
&\quad \lesssim \nu + C M^{-1} \big( \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)} + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \big) + C \big( \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)} + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \big)^2 + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)}^2.
\end{align*}
Combining these bounds, we then finally infer for the interval $J$ containing $t = 0$ that
\begin{align*}
\| \delta A_n \|_{S(J\times\mathbb{R}^4)}
&\lesssim \| \delta A_n[0] \|_{\dot{H}^1_x \times L^2_x} + \nu + C M^{-1} \big( \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)} + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \big) \\
&\quad + C \big( \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)} + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \big)^2 + \| \delta A_n \|_{S(J\times\mathbb{R}^4)} \| \delta\phi_n \|_{S(J\times\mathbb{R}^4)}^2,
\end{align*}
which suffices to bootstrap the bound for $\| \delta A_n \|_S$ on $J$. The bootstrap on the remaining intervals follows by induction (and choosing $\nu$ and $\| \delta A_n[0] \|_{\dot{H}^1_x \times L^2_x}$ sufficiently small depending on $M$ and $\|(A,\phi)\|_S$). Finally, we observe that the $S$ norm bounds on $\phi$, $\tilde{\phi}_{A,\delta A^{free}_n}$, and $\delta\phi_n$ are "inherited" by the expression
\[
\chi_I \phi + \chi_{I^c} \tilde{\phi}_{A,\delta A^{free}_n} + \delta\phi_n
\]
on account of the support properties of the functions $\phi$, $\tilde{\phi}_{A,\delta A^{free}_n}$.
$\Box$

Selecting concentration profiles and adding the first large frequency atom.

We recall that we decomposed the essentially singular sequence of data $\big\{ (A^n, \phi^n)[0] \big\}_{n\in\mathbb{N}}$ into frequency atoms
\begin{align*}
A^n[0] &= \sum_{a=1}^{\Lambda_0} A^{na}[0] + A^n_{\Lambda_0}[0], \\
\phi^n[0] &= \sum_{a=1}^{\Lambda_0} \phi^{na}[0] + \phi^n_{\Lambda_0}[0],
\end{align*}
where $\Lambda_0$ was chosen such that
\[
\sum_{a \geq \Lambda_0 + 1} \limsup_{n\to\infty} E(A^{na}, \phi^{na}) < \varepsilon_0.
\]
Moreover, we remind the reader that the frequency atoms split the errors $\big( A^n_{\Lambda_0}, \phi^n_{\Lambda_0} \big)[0]$ into $\Lambda_0 + 1$ pieces $\big( A^{nj}_{\Lambda_0}, \phi^{nj}_{\Lambda_0} \big)[0]$, $1 \leq j \leq \Lambda_0 + 1$, ordered by the size of $|\xi|$ in their Fourier supports.

Having established control over the evolution of the data $\big( A^{n1}_{\Lambda_0}, \phi^{n1}_{\Lambda_0} \big)[0]$ in the preceding subsections, we now add the components $\big( A^{n1}, \phi^{n1} \big)[0]$, i.e. we pass to the initial data
\[
\big( A^{n1}_{\Lambda_0} + A^{n1}, \, \phi^{n1}_{\Lambda_0} + \phi^{n1} \big)[0]. \tag{7.42}
\]
Here we first have to understand the lack of compactness of the "large" added term $\big( A^{n1}, \phi^{n1} \big)[0]$. To this end we carry out a careful profile decomposition in physical space of the added data $\big( A^{n1}, \phi^{n1} \big)[0]$. To obtain a profile decomposition for the magnetic potential components $A^{n1}_j[0]$, $j = 1, \ldots, 4$, we just use the standard Bahouri-Gérard method [1] to extract the profiles via the free wave evolution. However, for the $\phi$ field, we mimic [20] and select the concentration profiles by evolving the data $\phi^{n1}[0]$ using the following "covariant" wave operator
\[
\widetilde{\Box}_{A^{n1}} := \Box + 2i \big( A^{n1}_{\Lambda_0, \nu} + A^{n1,free}_\nu \big) \partial^\nu. \tag{7.43}
\]
Here, the functions $A^{n1,free}_\nu$ are defined as the solutions to the free wave equation
\[
\Box A^{n1,free}_j = 0, \qquad A^{n1,free}_j[0] = A^{n1}_j[0]
\]
for $j = 1, \ldots, 4$, while we simply put $A^{n1,free}_0 \equiv 0$. It follows from standard results that the solution $u$ to $\widetilde{\Box}_{A^{n1}} u = 0$ with initial data $u[0] \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$ exists globally in time. Moreover, the parametrix construction from Section 3 together with suitable divisibility arguments yields that this solution satisfies the global $S$ norm bound
\[
\| u \|_{S(\mathbb{R}\times\mathbb{R}^4)} \lesssim_{E_{crit}} \| \nabla_{t,x} u(0) \|_{L^2_x}. \tag{7.44}
\]

At this point we emphasize that both the influence of the evolution of the low frequency magnetic potential $A^{n1}_{\Lambda_0}$ and the influence of the free wave evolution of the data $A^{n1}[0]$ are built into the "covariant" wave operator $\widetilde{\Box}_{A^{n1}}$. This is different from the situation for critical wave maps in [20], where only the corresponding low frequency components are built into the "covariant" wave operator, see Definition 9.18 in [20]. The reason for this is that the interaction term
\[
A^{n1}_\nu \partial^\nu \phi^{n1},
\]
where both factors are essentially supported at frequency $\sim 1$, cannot be bounded due to the contribution of the free term $A^{n1,free}_\nu$. Thus, the $\phi$ field experiences not only an "asymptotic" twisting due to the contribution of the extremely low frequency components $A^{n1}_{\Lambda_0}$ (as is the case for critical wave maps), but also from the frequency $\sim 1$ components $A^{n1,free}_\nu$. This needs to be reflected by our choice of concentration profiles.

An important fact about the wave operator $\widetilde{\Box}_{A^{n1}}$ is that solutions to $\widetilde{\Box}_{A^{n1}} u = 0$ asymptotically conserve energy as $n \to \infty$; this is made precise in Lemma 7.8. By rescaling we may assume that $\lambda_n = 1$, so that $\big( A^{n1}, \phi^{n1} \big)[0]$ is uniformly concentrated around $|\xi| \sim 1$.

Lemma 7.8.

Assume that the Schwartz data $u[0]$ is essentially supported at frequency $|\xi| \sim 1$ with $\| u[0] \|_{\dot{H}^1_x \times L^2_x} \lesssim 1$. Moreover, assume that $A^{n1}$ is $1$-oscillatory and that $A^{n1}_{\Lambda_0}$ satisfies a uniform $S$ norm bound
\[
\limsup_{n\to\infty} \big\| A^{n1}_{\Lambda_0} \big\|_S < \infty
\]
as well as $\sup_n \big\| A^{n1}_j[0] \big\|_{\dot{H}^1_x \times L^2_x} < \infty$ for $j = 1, \ldots, 4$. Let $\widetilde{\Box}_{A^{n1}}$ be defined as in (7.43). Then the solutions $u(t,x)$ of the linear problem (with implicit $n$ dependence suppressed)
\[
\widetilde{\Box}_{A^{n1}} u = 0
\]
with fixed initial data $u[0]$ satisfy
\[
\lim_{R\to+\infty} \lim_{n\to\infty} \sup_{t \in \mathbb{R}^+} \Big| \| \nabla_{t,x} u(R+t, \cdot) \|_{L^2_x} - \| \nabla_{t,x} u(R, \cdot) \|_{L^2_x} \Big| = 0. \tag{7.45}
\]
The same holds even when replacing $+\infty$ by $-\infty$ and $\mathbb{R}^+$ by $\mathbb{R}^-$. Furthermore, assume that $u_k$ is a sequence of solutions to (again suppressing the $n$ dependence)
\[
\widetilde{\Box}_{A^{n1}} u_k = 0,
\]
supported at frequency $|\xi| \sim 1$ (in the sense of $1$-oscillatory), and satisfying $S$ norm bounds uniform in $k$, while $u$ is as above (with fixed data $u[0]$). Then we have
\[
\lim_{R\to+\infty} \lim_{n\to\infty} \sup_{t\in\mathbb{R}^+} \bigg| \int_{\mathbb{R}^4} \Big( \nabla_{t,x} u(t+R, x) \cdot \nabla_{t,x} u_k(t,x) - \nabla_{t,x} u(R,x) \cdot \nabla_{t,x} u_k(0,x) \Big) \, dx \bigg| = 0 \tag{7.46}
\]
uniformly in $k$, and the same holds when replacing $+\infty$ by $-\infty$, and $\mathbb{R}^+$ by $\mathbb{R}^-$.

In the proof of Lemma 7.8 we shall need the following uniform dispersive type bounds. These will also play a crucial role in controlling the interactions of the concentration profiles to be discussed below. Note that this is an analogue of Proposition 9.20 in [20] and is proved in an analogous fashion.

Lemma 7.9.

Let $u[0] \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$ be fixed initial data and consider the solution $u(t,x)$ of the linear problem
\[
\widetilde{\Box}_{A^{n1}} u = 0
\]
with given data $u[0]$ at time $t = 0$. Then for any $\gamma > 0$, there exists a decomposition
\[
u = u_1 + u_2
\]
such that $\| u_1 \|_S < \gamma$ and there exists a time $t_0 = t_0\big( u[0], \gamma \big)$ such that for any $|t| > t_0$,
\[
\| u_2(t, \cdot) \|_{L^\infty_x} < \gamma.
\]

Proof.

We first prove the dispersive type bounds for solutions to the microlocalized magnetic wave equation
\[
\Box^p_A u \equiv \Box u + 2i \sum_{k\in\mathbb{Z}} P_{\leq k - C} A^{free}_j \, P_k \partial^j u = 0
\]
with initial data $u[0] = (f,g) \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$. Here, the spatial components $A^{free}_j$ of the magnetic potential are in Coulomb gauge and are solutions to the free wave equation. We recall that the magnetic wave operator $\Box^p_A$ was treated in detail in Section 3. The asserted dispersive type bounds for solutions to $\widetilde{\Box}_{A^{n1}} u = 0$ [...]

The main difference from the argument for wave maps in [20, Proposition 9.20] is that we need to use a nested double iteration, on account of the fact that our parametrix for $\Box^p_A u = 0$ is only approximate. In particular, the phase $\psi_\pm(t,x,\xi)$ defined in (3.17) for the construction of the parametrix for, say, the frequency $0$ mode is truncated to low frequencies $k \leq -C\sigma$. This generates the additional error terms
\[
2i \sum_{-C\sigma \leq k < 0} P_k A^{free}_j \, P_0 \partial^j u.
\]
These can only be iterated away by using divisibility, i.e. by restricting to a finite number of suitable time intervals. In fact, due to the summation over $k \in [-C\sigma, 0]$, the number of these intervals depends linearly on $C$ (and also depends on the energy and $\sigma$, of course). Now we formally denote the (exact) Duhamel propagator for the equation $\Box^p_A u = F$ by
\[
u(t, \cdot) = \int_0^t \tilde{U}(t-s) F(s) \, ds.
\]
Moreover, we denote by $J_0, J_1, \ldots, J_N$ the partition of the forward time axis $[0,\infty)$ into consecutive time intervals on which the error terms
\[
N_{lh}(u) := 2i \sum_{m\in\mathbb{Z}} \sum_{-C\sigma + m \leq k < m} P_k A^{free}_j \, P_m \partial^j u
\]
as well as the remaining errors generated by the parametrix $\tilde{U}$ need to be handled by divisibility. As observed before, their number depends linearly on $C$ and implicitly on the energy and $\sigma$. We write $J_i = [t_i, t_{i+1}]$ for $0 \leq i \leq N-1$ with $t_0 = 0$, and $J_N = [t_N, \infty)$.
Then, proceeding by exact analogy to the proof of Proposition 9.20 in [20], we can write for $u^{(i)} := u|_{J_i}$,
\begin{align*}
u^{(i)} &= \sum_{l=0}^\infty u^{(J_i, l)}, \\
u^{(J_i, 0)}(t) &= \tilde{S}(t - t_i) u^{(i-1)}[t_i], \\
u^{(J_i, l)}(t) &= - \int_{t_i}^t \tilde{U}(t-s) N_{lh}\big( u^{(J_i, l-1)} \big)(s) \, ds,
\end{align*}
where $\tilde{S}$ is the homogeneous data propagator for $\Box^p_A$, while $\tilde{U}$ is the homogeneous propagator for data of the special form $(0, g)$. Then the inductive nature of the construction is revealed by the relation (see (9.74) in [20])
\[
u^{(J_i, 0)}(t) = \tilde{S}(t)(f,g) - \sum_{k=0}^{i-1} \sum_{l=0}^\infty \int_{t_k}^{t_{k+1}} \tilde{U}(t-s) N_{lh}\big( u^{(J_k, l)} \big)(s) \, ds.
\]
The new aspect of our setting is that the propagators $\tilde{U}$, $\tilde{S}$ themselves are only obtained as infinite convergent sums of further terms, which need to be analyzed. Our strategy is to reduce precisely to the situation treated in [20], by using the error analysis in [22]. Thus, denote the approximate inhomogeneous Duhamel parametrix by
\[
\int_0^t \tilde{U}^{(app)}(t-s) F(s) \, ds.
\]
Note that due to Proposition 7 in [22], the parametrix $\tilde{U}^{(app)}(t-s)$ is given by an integral kernel that satisfies the same decay estimates as the standard d'Alembertian propagator, independent of the precise potential $A^{free}$ used (but with implicit constants depending on its energy, of course). Then recall from the proof of Theorem 4 in [22] that we may write
\[
\int_0^t \tilde{U}(t-s) F(s) \, ds = \sum_{j=0}^\infty \int_0^t \tilde{U}^{(app)}(t-s) F_j(s) \, ds,
\]
where we have $F_0 = F$ and, writing inductively
\[
B_j := \int_0^t \tilde{U}^{(app)}(t-s) F_j(s) \, ds,
\]
we have for $j \geq 1$
\[
F_j = F^1_j + F^2_j + F^3_j + F^4_j
\]
with (schematic notation following [22])
\begin{align*}
P_0 F^1_j &= \Big( \Box^p_{A_{<0}} e^{-i\psi_\pm}_{<0}(t,x,D) - e^{-i\psi_\pm}_{<0}(t,x,D) \Box \Big) P_0 B_{j-1}, \\
P_0 F^2_j &= \Big( e^{-i\psi_\pm}_{<0}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) - 1 \Big) P_0 F_{j-1}, \\
P_0 F^3_j &= \Big( e^{-i\psi_\pm}_{<0}(t,x,D) |D|^{-1} e^{i\psi_\pm}_{<0}(D,y,t) - |D|^{-1} \Big) P_0 \partial_t F_{j-1}, \\
P_0 F^4_j &= \Big( e^{-i\psi_\pm}_{<0}(t,x,D) |D|^{-1} \partial_t e^{i\psi_\pm}_{<0}(D,y,t) - |D|^{-1} \Big) P_0 F_{j-1}.
\end{align*}
Here the first term, which is treated in Section 10.2 in [22], gains a smallness factor of the form $2^{-\sigma C}$, which of course overwhelms any losses polynomial in $C$ for $C \gg 1$. However, the remaining three terms do not gain smallness from $C$, but rather by divisibility, and so we have to be more careful to force smallness for them (we cannot make the number of intervals depend on the prescribed smallness threshold $\gamma$). Here we exploit the fact that due to Proposition 6 in [22], the kernels of the operators
\begin{align*}
&\tfrac{1}{2} \Big( e^{-i\psi_\pm}_{<0}(t,x,D) e^{i\psi_\pm}_{<0}(D,y,t) - 1 \Big), \\
&\Big( e^{-i\psi_\pm}_{<0}(t,x,D) |D|^{-1} e^{i\psi_\pm}_{<0}(D,y,t) - |D|^{-1} \Big), \\
&\Big( e^{-i\psi_\pm}_{<0}(t,x,D) |D|^{-1} \partial_t e^{i\psi_\pm}_{<0}(D,y,t) - |D|^{-1} \Big)
\end{align*}
are rapidly decaying away from the diagonal $x = y$. This means that up to small errors (which may be incorporated into the small energy part of $u$), we may think of these operators as local ones, and then the estimates in the proof of Proposition 9.20 in [20], which rely on the inductive bound (9.81) there, go through for the error terms $F^r_j$, $r = 2, 3, 4$, as long as $F^r_{j-1}$, $r = 2, 3, 4$, satisfy these bounds. This means that the inductive argument in [20] goes through here as well. $\Box$
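The claim in the proof above that the gain $2^{-\sigma C}$ overwhelms any polynomial loss in $C$ can be quantified by an elementary calculus bound. The following is only a sketch: the exponent $m \geq 1$ of the polynomial loss is a hypothetical placeholder, since the text does not specify the degree.

```latex
\begin{align*}
% Maximize g(C) = C^m e^{-aC} with a = (\sigma \ln 2)/2: the maximum is attained
% at C = m/a and equals (m/(ea))^m.
\sup_{C > 0} C^m \, 2^{-\sigma C / 2}
  &= \Big( \frac{2m}{e \, \sigma \ln 2} \Big)^{m}, \\
% Split 2^{-\sigma C} into two halves and absorb the polynomial loss into one of them.
C^m \, 2^{-\sigma C}
  &\leq \Big( \frac{2m}{e \, \sigma \ln 2} \Big)^{m} \, 2^{-\sigma C / 2}
   \longrightarrow 0 \quad \text{as } C \to \infty.
\end{align*}
```

In particular, for fixed $m$ and $\sigma$ the product falls below any prescribed threshold once $C$ is chosen large enough, in contrast to the three remaining error terms, which must be handled by divisibility.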

We are now in a position to prove the asymptotic energy conservation for solutions to $\widetilde{\Box}_{A^{n1}} u = 0$.

Proof of Lemma 7.8.

We consider the natural energy functional E A n ( u )( t ) = Z R X α = (cid:12)(cid:12)(cid:12)(cid:0) ∂ α u + i (cid:0) A n Λ ,α + A n , f ree α (cid:1) u (cid:1) ( t , x ) (cid:12)(cid:12)(cid:12) dx , where it is to be kept in mind that the potential A is in Coulomb gauge. Di ﬀ erentiating this energyfunctional with respect to t and using that e (cid:3) A n u =

0, we infer the following relation E A n ( u )( R + T ) − E A n ( u )( R ) = Re Z R + TR Z R (cid:0) ∂ t A n Λ , (cid:1) u (cid:0) ∂ t + iA n Λ , (cid:1) u dx dt + Re Z R + TR Z R (cid:16) − (cid:0) A n Λ , (cid:1) + X j (cid:0) A n Λ , j + A n , f reej (cid:1) (cid:17) u (cid:0) ∂ t + iA n Λ , (cid:1) u dx dt + X j Re Z R + TR Z R (cid:0) ∂ j + i (cid:0) A n Λ , j + A n , f reej (cid:1)(cid:1) u i (cid:0) ∂ t A n Λ , j + ∂ t A n , f reej − ∂ j A n Λ , (cid:1) u dx dt . (7.47)We now show that uniformly in T ≥

0, the terms on the right hand side converge to zero as n → ∞ and then R → + ∞ .The quartic and quintic terms are all expected to be straightforward and so we focus on the moredi ﬃ cult cubic interaction terms. Here we note that the cubic interaction terms Z R + TR Z R (cid:0) ∂ t A n Λ , (cid:1) u ∂ t u dx dt and X j Z R + TR Z R ∂ j u i (cid:0) ∂ j A n Λ , (cid:1) u dx dt are also easier to treat due to the inherent quadratic nonlinear struture of the temporal components A n Λ , as solutions to the elliptic compatibility equation of MKG-CG.So we now consider the delicate cubic interaction terms X j Re Z R + TR Z R ∂ j u i (cid:0) ∂ t A n Λ , j (cid:1) u dx dt = − X j Z R + TR Z R Im (cid:0) ∂ j uu (cid:1) (cid:0) ∂ t A n Λ , j (cid:1) dx dt , (7.48) X j Re Z R + TR Z R ∂ j u i (cid:0) ∂ t A n , f reej (cid:1) u dx dt = − X j Z R + TR Z R Im (cid:0) ∂ j uu (cid:1) (cid:0) ∂ t A n , f reej (cid:1) dx dt . (7.49)We begin with the ﬁrst term (7.48). The Coulomb condition satisﬁed by ∂ t A n Λ , j allows us to projectthe term Im (cid:0) ∂ j uu (cid:1) onto its divergence-free part, which means that we can replace this by a nullform of the schematic type Im (cid:0) ∂ j uu (cid:1) −→ ∆ − ∂ i N i j (cid:0) u , u (cid:1) . Thus we reduce to bounding uniformly the following schematic integral X j Z R + TR Z R ∆ − ∂ i N i j (cid:0) u , u (cid:1) (cid:0) ∂ t A n Λ , j (cid:1) dx dt . Now we claim the microlocalized bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R + TR Z R ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k (cid:0) ∂ t A n Λ , j (cid:1) dx dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . σ (min { k , k , k }− max { k , k , k } ) (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k A n Λ (cid:13)(cid:13)(cid:13) S for suitable σ >

0. Since there are at least two comparable frequencies in the above, this is enoughto give the desired result in view of the frequency localizations of u and A n Λ . In order to prove ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 91 this, we localize the above expression further and also omit the localization to the time interval[ R , R + T ], as we may get rid of it via a suitable cuto ﬀ (which is compatible with the S norms), Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k (cid:0) ∂ t A n Λ , j (cid:1) dx dt = Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q > k (cid:0) ∂ t A n Λ , j (cid:1) dx dt + Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q ≤ k (cid:0) ∂ t A n Λ , j (cid:1) dx dt . Here we only estimate the more di ﬃ cult second term on the right hand side. We write this term as Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q ≤ k (cid:0) ∂ t A n Λ , j (cid:1) dx dt = X l ≤ k Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q l (cid:0) ∂ t A n Λ , j (cid:1) dx dt . By symmetry we may assume k ≤ k . Then we distinguish the following cases. Case 1: k = k + O (1) > k + O (1). Since the Q l transfers to the null form N i j , we save2 k − k + ( l − k ) . Thus, we obtain (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q l (cid:0) ∂ t A n Λ , j (cid:1) dx dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . k − k + ( l − k ) − k (cid:13)(cid:13)(cid:13) P k ∇ x u (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k ∇ x u (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k Q l (cid:0) ∂ t A n Λ , j (cid:1)(cid:13)(cid:13)(cid:13) L t L x , where we observe that the exponent pair (4 ,

3) is Strichartz admissible in four space dimensions.Then we use the improved Sobolev type bound (cid:13)(cid:13)(cid:13) P k Q l (cid:0) ∂ t A n Λ , j (cid:1)(cid:13)(cid:13)(cid:13) L t L x . k γ ( l − k ) (cid:13)(cid:13)(cid:13) P k Q l (cid:0) ∂ t A n Λ , j (cid:1)(cid:13)(cid:13)(cid:13) L t L x for suitable γ > (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q l (cid:0) ∂ t A n Λ , j (cid:1) dx dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . k − k + ( l − k ) − k k k − l k γ ( l − k ) (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k A n Λ (cid:13)(cid:13)(cid:13) S , which in turn can be bounded by . ( k − k ) γ ( l − k ) (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k A n Λ (cid:13)(cid:13)(cid:13) S . Summing over l ≤ k yields the desired bound in this case. Case 2: k = k + O (1) > k . We distinguish between the cases l ≤ k and l > k . Case 2a: l ≤ k . Here we estimate (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q l (cid:0) ∂ t A n Λ , j (cid:1) dx dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . − k ( l − k ) (cid:13)(cid:13)(cid:13) ∇ t , x P k u (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) ∇ t , x P k u (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k Q l (cid:0) ∂ t A n Λ (cid:1)(cid:13)(cid:13)(cid:13) L t L x . − k k ( l − k ) − l (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k A n Λ (cid:13)(cid:13)(cid:13) S . To get summability over l one can replace the norm k · k L t L x by k · k L t L + x and then use k · k L t L − x insteadfor the second factor. Case 2b: l > k . 
Here we simply get the bound (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) Z R + ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k Q l (cid:0) ∂ t A n Λ , j (cid:1) dx dt (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) . ( k − k ) (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k u (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k A n Λ (cid:13)(cid:13)(cid:13) S , which can then be summed over k > l > k to give the desired bound. This in essence ﬁnishes theestimate of the cubic interaction term (7.48).Next, we consider the other delicate cubic interaction term (7.49). Using that ∂ t A n , f reej alsosatisﬁes the Coulomb condition, we reduce as before to bounding uniformly the expression X j Z R + TR Z R ∆ − ∂ i N i j (cid:0) P k u , P k u (cid:1) P k (cid:0) ∂ t A n , f reej (cid:1) dx dt . Compared with the treatment of the previous cubic term, the issue here is how to deal with theinteractions of u and A n , f ree , which are now both 1-oscillatory. We may assume that all frequencies2 k , , ∼

1, otherwise smallness follows from the treatment of the previous cubic interaction term(7.48). Choosing R > ﬃ ciently large, we obtain from the dispersive decay from Lemma 7.9and interpolation with the endpoint Strichartz estimate that (cid:13)(cid:13)(cid:13) P k , u (cid:13)(cid:13)(cid:13) L t L + x ([ R , R + T ] × R ) ≪ T ≥ n (recalling that the implicit dependence of u on n is suppressed). Onthe other hand, for the factor P k (cid:0) ∂ t A n , f reej (cid:1) , we can use L t L − x instead.The last statement of the lemma follows similarly, by expressing the inner product in terms ofthe energies of u and u k , and reducing to bounding expressions such as X j Z R + TR Z R ∆ − ∂ i N i j (cid:0) P k u , P k u k (cid:1) P k (cid:0) ∂ t A n , f reej (cid:1) dx dt . (cid:3) We now begin to quantify the lack of compactness for the functions (cid:8) ( A n , φ n )[0] (cid:9) n ∈ N . To clarifythe notation and make it adapted to the ensuing induction procedure, we replace the superscript 1in ( A n , φ n )[0] by a to indicate the frequency level of the large frequency atom, although we areonly considering a = (cid:8) φ na [0] (cid:9) n ∈ N . We evolveeach of these using the ﬂow of the covariant wave operator e (cid:3) A na and extract concentration proﬁles.The method for this follows along the lines of the modiﬁed Bahouri-G´erard proﬁle extraction pro-cedure of Lemma 9.23 in [20]. However, we have to use the asymptotic energy conservation fromLemma 7.8 instead of the stronger asymptotic energy conservation in [20, Lemma 9.19], whichforces us to modify the asymptotic orthogonality relation for the free energies of the proﬁles. Weﬁrst introduce the following terminology. Deﬁnition 7.10.

Given initial data $u[0] \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$, we denote by $S_{A^{na}}\bigl(u[0]\bigr)$ the solution to the initial value problem $\widetilde{\Box}_{A^{na}} u = 0$ with data $u[0]$ at time $t = 0$.

Following [20], which in turn mimics [1], we introduce the set $\mathcal{U}_{A^{na}}(\phi^{na}[0])$, which consists of all functions that can be extracted as weak limits in the following fashion:
\[
\mathcal{U}_{A^{na}}(\phi^{na}[0]) = \Bigl\{ V \in L^2_{t,loc} H^1_x \cap C^1 L^2_x \,:\, \exists \, \{(t_n, x_n)\}_{n \geq 1} \subset \mathbb{R} \times \mathbb{R}^4 \text{ s.t. } S_{A^{na}}\bigl(\phi^{na}[0]\bigr)(t + t_n, x + x_n) \rightharpoonup V(t,x) \Bigr\}.
\]
Here the weak limit is in the sense of $L^2_{t,loc} H^1_x$. We emphasize that the sequences $\{(t_n, x_n)\}_{n \geq 1} \subset \mathbb{R} \times \mathbb{R}^4$ are completely arbitrary. We observe that for a non-trivial profile $V \in \mathcal{U}_{A^{na}}(\phi^{na}[0])$ with associated sequence of space-time translations $\{(t^{ab}_n, x^{ab}_n)\}_{n \geq 1}$, by passing to a further subsequence, we may assume that either
\[
S\bigl(A^{na}[0]\bigr)(t + t^{ab}_n, x + x^{ab}_n) \rightharpoonup 0 \quad \text{or} \quad S\bigl(A^{na}[0]\bigr)(t + t^{ab}_n, x + x^{ab}_n) \rightharpoonup A^{ab}(t,x).
\]
Here, $S(\cdot)(t,x)$ denotes the free wave propagator and $A^{ab}$ are free waves, see Proposition 7.12 below. Noting that the contribution of $A^n_\Lambda$ in the definition of $\widetilde{\Box}_{A^{na}}$ vanishes in the limit, in the former case we have $\Box V = 0$, i.e. $V$ is actually a weak solution to the free wave equation, while in the latter situation, $V$ solves the linear magnetic wave equation $\bigl(\Box + 2i A^{ab}_j \partial^j\bigr) V = 0$. We then define
\[
\eta_{A^{na}}\bigl(\phi^{na}[0]\bigr) := \sup \bigl\{ E(V) \,:\, V \in \mathcal{U}_{A^{na}}\bigl(\phi^{na}[0]\bigr) \bigr\} < \infty,
\]
where $E$ refers to the functional
\[
E(V) = \int_{\mathbb{R}^4} \bigl| \nabla_{t,x} V(0,x) \bigr|^2 \, dx.
\]
Observe that for temporally unbounded sequences, i.e. $|t_n| \to \infty$, the energy $E(V)$ is identical to the "asymptotic free energy" associated with solutions to $\bigl(\Box + 2i A^{ab}_j \partial^j\bigr) V = 0$.

We now turn to the linear profile decomposition for the data $\{\phi^{na}[0]\}_{n \in \mathbb{N}}$, which is at the core of the second stage of the modified Bahouri-Gérard procedure for MKG-CG. Recall that we consider $a = 1$.

Proposition 7.11.

There exists a collection of sequences $\{(t^{ab}_n, x^{ab}_n)\}_{n \in \mathbb{N}} \subset \mathbb{R} \times \mathbb{R}^4$, $b \geq 1$, as well as a corresponding family of concentration profiles
\[
\phi^{ab}[0] \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4), \quad b \geq 1,
\]
with the following properties: Introducing the space-time translated gauge potentials
\[
\tilde{A}^{nab}_\nu(t,x) := A^{na}_{\Lambda,\nu}(t + t^{ab}_n, x + x^{ab}_n) + A^{na,\mathrm{free}}_\nu(t + t^{ab}_n, x + x^{ab}_n), \quad \nu = 0, 1, \ldots, 4,
\]
we have

• For any $B \geq 1$, there exists a decomposition
\[
S_{A^{na}}\bigl(\phi^{na}[0]\bigr)(t,x) = \sum_{b=1}^{B} S_{\tilde{A}^{nab}}\bigl(\phi^{ab}[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n) + \phi^{naB}(t,x), \tag{7.50}
\]
where each of the functions
\[
\tilde{\phi}^{nab}(t,x) := S_{\tilde{A}^{nab}}\bigl(\phi^{ab}[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n), \quad \phi^{naB}(t,x)
\]
solves the covariant wave equation $\widetilde{\Box}_{A^{na}} u = 0$. Moreover, the error satisfies the crucial asymptotic vanishing condition
\[
\lim_{B \to \infty} \eta_{A^{na}}\bigl(\phi^{naB}[0]\bigr) = 0. \tag{7.51}
\]

• The sequences are mutually divergent, by which we mean that for $b \neq b'$,
\[
\lim_{n \to \infty} \bigl( |t^{ab}_n - t^{ab'}_n| + |x^{ab}_n - x^{ab'}_n| \bigr) = \infty. \tag{7.52}
\]

• There is asymptotic energy partition
\[
E(\phi^{na}[0]) = \sum_{b=1}^{B} E(\tilde{\phi}^{nab}[0]) + E(\phi^{naB}[0]) + o(1), \tag{7.53}
\]
where the meaning of $o(1)$ here is $\limsup_{n \to \infty} o(1) = 0$.

• All profiles $\phi^{ab}[0]$ as well as all errors $\phi^{naB}[0]$ are 1-oscillatory.

Before we begin with the proof of Proposition 7.11, we introduce the following important distinction between two possible types of profiles.

• Temporally unbounded profiles: Those profiles for which
\[
\lim_{n \to \infty} |t^{ab}_n| = \infty.
\]

• Temporally bounded profiles: Those profiles for which
\[
\liminf_{n \to \infty} |t^{ab}_n| < \infty.
\]
By passing to a subsequence we may then as well assume that for all $n \in \mathbb{N}$, $t^{ab}_n = 0$. For two distinct such profiles corresponding to $b \neq b'$, we must have
\[
\lim_{n \to \infty} |x^{ab}_n - x^{ab'}_n| = \infty.
\]

Proof of Proposition 7.11.

There is nothing to do if η A na (cid:0) φ na [0] (cid:1) = . Let us therefore assume that this quantity is strictly greater than 0. Then we pick a proﬁle φ a ∈ L t , loc H x ∩ C L x and an associated sequence (cid:8) ( t a n , x a n ) (cid:9) n ∈ N ⊂ R × R such that(7.54) S A na (cid:0) φ na [0] (cid:1) ( t + t a n , x + x a n ) ⇀ φ a ( t , x )with E ( φ a ) ≥ η A na (cid:0) φ na [0] (cid:1) . Then we have S A na (cid:0) φ na [0] (cid:1) ( t + t a n , x + x a n ) − S A na (cid:16) S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , · − x a n ) (cid:17) ( t + t a n , x + x a n ) = S A na (cid:0) φ na [0] (cid:1) ( t + t a n , x + x a n ) − S ˜ A na (cid:0) φ a [0] (cid:1) ( t , x ) ⇀ n → ∞ by the construction. Furthermore, it holds that E (cid:0) φ na [0] (cid:1) = E (cid:16) ˜ φ na [0] (cid:17) + E (cid:16) φ na [0] − ˜ φ na [0] (cid:17) + Z R ∇ t , x (cid:16) S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , x − x a n ) (cid:17) ·· ∇ t , x (cid:16) φ na (0 , x ) − S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , x − x a n ) (cid:17) dx , (7.55) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 95 where in the last term we ignored that the φ ﬁeld is complex-valued. If φ a [0] is a temporallyunbounded proﬁle, we may without loss of generality assume that t a n → + ∞ . In view of (7.46)from Lemma 7.8 the last term on the right hand side of (7.55) can be arbitrarily well approximatedby 2 Z Z R ∇ t , x S A na (cid:16) S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , · − x a n ) (cid:17) ( t − R + t a n , x + x a n ) ·· ∇ t , x S A na (cid:16) φ na [0] − S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , · − x a n ) (cid:17) ( t − R + t a n , x + x a n ) dx dt (7.56)as n → ∞ by choosing R > ﬃ ciently large. 
Then we observe that the ﬁrst factor in the integrandin (7.56) satisﬁes ∇ t , x S A na (cid:16) S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , · − x a n ) (cid:17) ( t − R + t a n , x + x a n ) = ∇ t , x φ a ( t − R , x ) + o L x (1)as n → ∞ , while by construction S A na (cid:16) φ na [0] − S ˜ A na (cid:0) φ a [0] (cid:1) (0 − t a n , · − x a n ) (cid:17) ( · + t a n , · + x a n ) ⇀ L t , loc H x as n → ∞ . Thus, we conclude that E (cid:0) φ na [0] (cid:1) = E (cid:16) ˜ φ na [0] (cid:17) + E (cid:16) φ na [0] − ˜ φ na [0] (cid:17) + o (1)as n → ∞ . If instead φ a [0] is a temporally bounded proﬁle, we may and shall have t a n = n ∈ N . Then the last term on the right hand side of (7.55) is given by2 Z R ∇ t , x φ a (0 , x − x a n ) · ∇ t , x (cid:0) φ na (0 , x ) − φ a (0 , x − x a n ) (cid:1) dx , which vanishes as n → ∞ by the weak convergence (7.54) and therefore yields the desired asymp-totic energy partition (7.53).Now we repeat this procedure, but replace φ na [0] by φ na [0] − ˜ φ na [0] . Thus, if η A na (cid:0) φ na [0] − ˜ φ na [0] (cid:1) >

0, we select a sequence (cid:8) ( t a n , x a n ) (cid:9) n ∈ N and a concentration proﬁle φ a ( t , x ) such that E ( φ a ) ≥ η A na (cid:0) φ na [0] − ˜ φ na [0] (cid:1) and S A na (cid:0) φ na [0] − ˜ φ na [0] (cid:1) ( t + t a n , x + x a n ) ⇀ φ a ( t , x ) . We observe that we must necessarily havelim n →∞ (cid:0) | t a n − t a n | + | x a n − x a n | (cid:1) = ∞ . Iterating this process yields the decomposition (7.50) together with (7.52) and (7.53).Finally, we turn to proving the crucial asymptotic vanishing conditionlim B →∞ η A na (cid:0) φ naB [0] (cid:1) = . Here we observe that the ﬁxed proﬁles φ ab [0] satisfy φ ab (0 , x ) = S A na (cid:0) ˜ φ nab [0] (cid:1) (0 + t abn , x + x abn ) . Then the global S norm bounds (7.44) for solutions to the covariant wave equation e (cid:3) A na u = k∇ t , x φ ab (0 , · ) k L x . (cid:13)(cid:13)(cid:13) S A na (cid:0) ˜ φ nab [0] (cid:1)(cid:13)(cid:13)(cid:13) S . E crit k∇ t , x ˜ φ nab (0 , · ) k L x . E crit E ( ˜ φ nab [0]) , where the implied constant is independent of n . From the asymptotic energy partition (7.53) weconclude that for any B ≥

1, by passing to a subsequence in n , if necessary, we have B X b = lim sup n →∞ E ( ˜ φ nab [0]) ≤ lim sup n →∞ E ( φ na [0]) . E crit . Thus, we have that uniformly in B , B X b = E ( φ ab [0]) . E crit . By construction, the error η A na ( φ naB ) must therefore vanish as B → ∞ . This ﬁnishes the proof ofProposition 7.11. (cid:3) We emphasize that in the preceding linear proﬁle decomposition for the φ na ﬁelds, the asymp-totic energy partition (7.53) does not yield a sharp energy bound for the actual proﬁles φ ab [0] oftemporally unbounded character, which is in contrast to the standard Bahouri-G´erard proﬁle decom-position [1] and the modiﬁed Bahouri-G´erard proﬁle decomposition in the context of critical wavemaps [20, Lemma 9.23]. Fortunately, this will not doom the construction of the nonlinear concen-tration proﬁles, because there is a kind of “asymptotic orthogonality statement”, see Lemma 7.13,in particular (7.59). This will allow us to circumvent the problem.Having selected the linear concentration proﬁles for the φ na ﬁelds, it remains to pick correspond-ing proﬁles for the magnetic potential components A naj for j = , . . . ,

4. In fact, for the latter, we simply use the standard Bahouri-Gérard method [1] to extract the profiles via the free wave evolution. By passing to suitable subsequences, one obtains an intertwined linear profile decomposition for $(A^{na}, \phi^{na})[0]$. Thus, the same sequences of space-time shifts $\{(t^{ab}_n, x^{ab}_n)\}_{n \geq 1}$, $b \geq 1$, are being used for the linear concentration profiles for $A^{na}[0]$ and for $\phi^{na}[0]$. This will be crucial later on when we construct the associated nonlinear profiles, as the truly nonlinear behavior of both $(A, \phi)$ will be exhibited in space-time boxes centered around the points $(t^{ab}_n, x^{ab}_n)$, see Step 1 in the proof of Theorem 7.14. We quote

Proposition 7.12.

There exists a collection of sequences $\{(t^{ab}_n, x^{ab}_n)\}_{n \in \mathbb{N}} \subset \mathbb{R} \times \mathbb{R}^4$, $b \geq 1$, as well as a corresponding family of concentration profiles
\[
A^{ab}_j[0] \in \dot{H}^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4), \quad b \geq 1,
\]
for $j = 1, \ldots, 4$, with the following properties:

• For any $B \geq 1$, we have a decomposition
\[
S\bigl(A^{na}_j[0]\bigr)(t,x) = \sum_{b=1}^{B} S\bigl(A^{ab}_j[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n) + A^{naB}_j(t,x),
\]
where $S(\cdot)(t,x)$ denotes the free wave propagator. Then each of the functions
\[
S\bigl(A^{ab}_j[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n), \quad A^{naB}_j(t,x)
\]
solves the linear wave equation $\Box u = 0$. Moreover, the error satisfies the crucial asymptotic vanishing condition
\[
\lim_{B \to \infty} \eta\bigl(A^{naB}_j[0]\bigr) = 0. \tag{7.57}
\]

• The sequences are mutually divergent, by which we mean that for $b \neq b'$,
\[
\lim_{n \to \infty} \bigl( |t^{ab}_n - t^{ab'}_n| + |x^{ab}_n - x^{ab'}_n| \bigr) = \infty.
\]

• There is asymptotic energy partition
\[
E(A^{na}_j[0]) = \sum_{b=1}^{B} E(A^{ab}_j[0]) + E(A^{naB}_j[0]) + o(1),
\]
where the meaning of $o(1)$ is $\limsup_{n \to \infty} o(1) = 0$.

• All profiles $A^{ab}_j[0]$ as well as all errors $A^{naB}_j[0]$ are 1-oscillatory. Moreover, they all satisfy the Coulomb condition.

In the preceding propositions on the linear profile decompositions for the $\phi^{na}$ fields and for the spatial components $A^{na}_j$ of the connection form, we established an asymptotic orthogonality of the profiles with respect to the standard free energy functional. However, for our induction on energy procedure, we have to use the energy functional of the Maxwell-Klein-Gordon system, which involves nonlinear interactions between the $\phi$ field and the connection form $A$. In the next proposition we carefully analyze the asymptotic orthogonality relations of the linear profiles with respect to this proper energy functional.

Lemma 7.13.

Given any $\delta > 0$, there exists $B = B(\delta)$ such that
\[
\limsup_{n \to \infty} \Bigl| E(A^{na}, \phi^{na})(0) - \sum_{b=1}^{B} E(\tilde{A}^{nab}, \tilde{\phi}^{nab})(0) - E(A^{naB}, \phi^{naB})(0) \Bigr| < \delta, \tag{7.58}
\]
where $E$ refers to the energy functional of the Maxwell-Klein-Gordon system. Here we denote
\[
\tilde{\phi}^{nab} := S_{\tilde{A}^{nab}}\bigl(\phi^{ab}[0]\bigr)(0 - t^{ab}_n, x - x^{ab}_n), \quad \tilde{A}^{nab}_j := S\bigl(A^{ab}_j[0]\bigr)(0 - t^{ab}_n, x - x^{ab}_n), \quad j = 1, \ldots, 4,
\]
and the temporal components $\tilde{A}^{nab}_0(0)$ are determined in terms of $\tilde{\phi}^{nab}[0]$ via the elliptic compatibility equation, and similarly for $A^{naB}_0(0)$. In particular, if there are at least two non-zero concentration profiles $(A^{ab}, \phi^{ab})[0]$ (corresponding to two distinct values of $b$), then there exists $\delta > 0$ such that for all $b$,
\[
\limsup_{n \to \infty} E(\tilde{A}^{nab}, \tilde{\phi}^{nab})(0) < E_{\mathrm{crit}} - \delta.
\]
Moreover, for a temporally unbounded profile $(\tilde{A}^{nab}, \tilde{\phi}^{nab})$ with, say, $t^{ab}_n \to +\infty$ as $n \to \infty$, we have
\[
E(\tilde{A}^{nab}, \tilde{\phi}^{nab})(0) = E(\tilde{A}^{nab}, \tilde{\phi}^{nab})(t^{ab}_n - R_b) + \kappa_{ab}(n, R_b), \tag{7.59}
\]
where
\[
\lim_{R_b \to +\infty} \limsup_{n \to \infty} \kappa_{ab}(n, R_b) = 0.
\]

The $B$ also depends on the sequence of linear concentration profiles, but we omit this dependency here.

Proof.

We check the various interaction terms and show that they become small when choosing B as well as n su ﬃ ciently large. (1) Two temporally bounded proﬁles. This is straightforward since lim n →∞ | x abn − x ab ′ n | = ∞ . In fact,we immediately infer that schematicallylim n →∞ X temporally bounded proﬁles , b , b ′ (cid:12)(cid:12)(cid:12)(cid:12) Z R Re (cid:0) ( ∂ α ˜ φ nab + i ˜ A nab α ˜ φ nab ) · ( ∂ α ˜ φ nab ′ + i ˜ A nab ′ α ˜ φ nab ′ ) (cid:1) dx (cid:12)(cid:12)(cid:12)(cid:12) + lim n →∞ X temporally bounded proﬁles , b , b ′ X j = (cid:12)(cid:12)(cid:12)(cid:12) Z R ∇ t , x ˜ A nabj · ∇ t , x ˜ A nab ′ j dx (cid:12)(cid:12)(cid:12)(cid:12) + lim n →∞ X temporally bounded proﬁles , b , b ′ X j = (cid:12)(cid:12)(cid:12)(cid:12) Z R ∇ x ˜ A nab · ∇ x ˜ A nab ′ dx (cid:12)(cid:12)(cid:12)(cid:12) = . (2) One temporally bounded and one temporally unbounded proﬁle. Here we exploit that the ampli-tude of the temporally unbounded proﬁle vanishes asymptotically (at time t =

0) as n → ∞ , whilethe temporally bounded proﬁle has bounded support. We conclude that schematicallylim n →∞ X b temporally bounded b ′ temporally unbounded (cid:12)(cid:12)(cid:12)(cid:12) Z R Re (cid:0) ( ∂ α ˜ φ nab + i ˜ A nab α ˜ φ nab ) · ( ∂ α ˜ φ nab ′ + i ˜ A nab ′ α ˜ φ nab ′ ) (cid:1) dx (cid:12)(cid:12)(cid:12)(cid:12) + lim n →∞ X b temporally bounded b ′ temporally unbounded 4 X j = (cid:12)(cid:12)(cid:12)(cid:12) Z R ∇ t , x ˜ A nabj · ∇ t , x ˜ A nab ′ j dx (cid:12)(cid:12)(cid:12)(cid:12) + lim n →∞ X b temporally bounded b ′ temporally unbounded 4 X j = (cid:12)(cid:12)(cid:12)(cid:12) Z R ∇ x ˜ A nab · ∇ x ˜ A nab ′ dx (cid:12)(cid:12)(cid:12)(cid:12) = . (3) Two temporally unbounded proﬁles. Here we exploit the asymptotic energy conservation andthat the functions φ ab [0] , S ˜ A nab ′ (cid:0) φ ab ′ [0] (cid:1) ( t abn − t ab ′ n , x − x ab ′ n )are asymptotically orthogonal. Similarly, we argue for the interaction terms between the compo-nents of the proﬁles ˜ A nab and ˜ A nab ′ . (4) Weakly small error φ naB and proﬁles. This is handled like the interaction of a temporallybounded and a temporally unbounded proﬁle. One uses the fact that we get φ naB = φ naB + φ naB , where we have the bounds (cid:13)(cid:13)(cid:13) φ naB (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x < δ , (cid:13)(cid:13)(cid:13) ∇ t , x φ naB (cid:13)(cid:13)(cid:13) L ∞ t L x < δ , provided B is su ﬃ ciently large. Of course, choosing B large means that more and more inter-actions have to be controlled, and we can no longer simply use the choice of extremely large n to ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 99 “asymptotically kill” all such interactions as in the preceding cases. Thus, one has to argue carefullyas follows: Given δ >

0, we pick ˜ B su ﬃ ciently large such that for any B ≥ ˜ B , we havelim sup n →∞ B X b = ˜ B (cid:16) E ( ˜ φ nab ) + E ( ˜ A nab ) (cid:17) ≪ δ , where E indicates the standard free energy. Then, passing to the interaction terms in the Maxwell-Klein-Gordon energy functional corresponding to φ naB and A naB with the sum B X b = ˜ B ˜ φ nab , B X b = ˜ B ˜ A nab leads to terms bounded by ≪ δ for any B ≥ ˜ B , provided n is chosen su ﬃ ciently large (dependingon B ). But then picking B large enough, we can also ensure that the sum of all the interactionsin E ( A , φ ) generated by the proﬁles ˜ φ nab , ˜ A nab , 1 ≤ b ≤ ˜ B are small, since B ≥ ˜ B can be chosenindependently.The last assertion (7.59) is again a consequence of the asymptotic energy conservation fromLemma 7.8 and the asymptotic vanishing of the amplitude of a temporally unbounded proﬁle at t = n → ∞ . (cid:3) We now begin with the construction of the nonlinear concentration proﬁles. In what follows, weassume that the linear concentration proﬁles (cid:0) A ab , φ ab (cid:1) [0], b ≥

1, have been chosen, as well as the parameter sequences $\{(t^{ab}_n, x^{ab}_n)\}_{n \geq 1}$. We recall that when the profile is temporally bounded, i.e.
\[
\limsup_{n \to \infty} |t^{ab}_n| < \infty,
\]
we may and shall have $t^{ab}_n = 0$ for all $n$. We also recall the notation
\[
\tilde{\phi}^{nab}(t,x) := S_{\tilde{A}^{nab}}\bigl(\phi^{ab}[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n), \quad \tilde{A}^{nab}_j(t,x) := S\bigl(A^{ab}_j[0]\bigr)(t - t^{ab}_n, x - x^{ab}_n), \quad j = 1, \ldots, 4.
\]
Thus, if the profile is temporally bounded, it holds that
\[
\tilde{A}^{nab}[0] = A^{ab}[0], \quad \tilde{\phi}^{nab}[0] = \phi^{ab}[0].
\]

We can now state the key result of this subsection.

Theorem 7.14.

Let $a = 1$. Assume that there exist at least two non-zero profiles $(A^{ab}, \phi^{ab})[0]$, or all such profiles are zero, or else there exists only one such profile but with
\[
\liminf_{n \to \infty} E(\tilde{A}^{nab}, \tilde{\phi}^{nab})(0) < E_{\mathrm{crit}}.
\]
Then the initial data $\bigl(A^n_\Lambda + A^{n1}, \phi^n_\Lambda + \phi^{n1}\bigr)[0]$ can be evolved globally in time, resulting in a solution with finite $S^1$ norm bounds uniformly for all sufficiently large $n$.

Proof. We proceed in several steps.

Step 1:

Construction of the nonlinear concentration proﬁles.

We distinguish between temporally bounded and unbounded $(\tilde{A}^{nab}, \tilde{\phi}^{nab})$. In what follows we shall use the notation
\[
A^{na,\mathrm{low}} := A^n_\Lambda, \quad \phi^{na,\mathrm{low}} := \phi^n_\Lambda.
\]


Temporally bounded case:

Here we have $(\tilde{A}^{nab}[0], \tilde{\phi}^{nab}[0]) = (A^{ab}[0], \phi^{ab}[0])$ with $A^{ab}$ as usual in the Coulomb gauge. Then we define the nonlinear concentration profile $\bigl(\mathcal{A}^{nab}, \Phi^{nab}\bigr)$ as follows. Pick a large time $T_b > 0$. On $[-T_b, T_b] \times \mathbb{R}^4$, we define the profiles to be the solutions $(\mathcal{A}^{ab}, \Phi^{ab})$ to the MKG-CG system with data $(A^{ab}, \phi^{ab})[0]$ at time $t = 0$, which exist globally in time by Lemma 7.13 and the assumption of the theorem with a global finite $S^1$ norm bound
\[
\bigl\| \bigl(\mathcal{A}^{ab}, \Phi^{ab}\bigr) \bigr\|_{S^1} < \infty.
\]
Here the profiles do not depend on $n$, but we include this superscript since the profiles on the rest of space-time will be $n$-dependent.

On the complement $[-T_b, T_b]^c \times \mathbb{R}^4$, we define the profiles as follows. On $[T_b, \infty) \times \mathbb{R}^4$, we let
\[
\Box \mathcal{A}^{nab} = 0
\]
with data $\mathcal{A}^{ab}[T_b]$ given by the profile constructed on $[-T_b, T_b] \times \mathbb{R}^4$, and we proceed analogously on $(-\infty, -T_b] \times \mathbb{R}^4$. As for the $\Phi$-field, we postulate on $[-T_b, T_b]^c \times \mathbb{R}^4$ the linear equation
\[
\Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB}} \Phi^{nab} = 0
\]
with data at time $T_b$, respectively $-T_b$, given by the profile on $[-T_b, T_b] \times \mathbb{R}^4$. Note that in order for this to make sense, we also need to know the definition of the temporally unbounded $\mathcal{A}^{nab'}$, which is, of course, accomplished below without knowing the temporally bounded $\Phi^{nab}$ to avoid circularity.

Temporally unbounded case:

Assume, for example, that $\lim_{n \to \infty} t^{ab}_n = +\infty$. Using Lemma 7.13 and the assumption of the theorem, we can pick $R_b > 0$ sufficiently large such that
\[
\tilde{\phi}^{nab}(t^{ab}_n - R_b, \cdot) = S_{\tilde{A}^{nab}}(\phi^{ab}[0])(-R_b, \cdot - x^{ab}_n)
\]
satisfies
\[
E\Bigl( S(A^{ab}[0])(-R_b, \cdot - x^{ab}_n), \, S_{\tilde{A}^{nab}}(\phi^{ab}[0])(-R_b, \cdot - x^{ab}_n) \Bigr) < E_{\mathrm{crit}}.
\]
Then we use the data
\[
\Bigl( S(A^{ab}[0])[-R_b](\cdot - x^{ab}_n), \, S_{\tilde{A}^{nab}}(\phi^{ab}[0])[-R_b](\cdot - x^{ab}_n) \Bigr)
\]
at time $t = t^{ab}_n - R_b$, and evolve them forward in time using the MKG-CG system up to time $t^{ab}_n + R_b$, say, resulting in the nonlinear profiles
\[
\bigl(\mathcal{A}^{nab}, \Phi^{nab}\bigr) \quad \text{on } [t^{ab}_n - R_b, t^{ab}_n + R_b] \times \mathbb{R}^4.
\]
Observe that this construction does not require knowledge of the other profiles $\bigl(\mathcal{A}^{nab'}, \Phi^{nab'}\bigr)$. Finally, on the complement $[t^{ab}_n - R_b, t^{ab}_n + R_b]^c \times \mathbb{R}^4$, we evolve $\mathcal{A}^{nab}$ via the free equation $\Box \mathcal{A}^{nab} = 0$, and $\Phi^{nab}$ via the linear evolution
\[
\Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB}} \Phi^{nab} = 0,
\]
with data given at time $t^{ab}_n - R_b$, respectively $t^{ab}_n + R_b$, by the profiles constructed on $[t^{ab}_n - R_b, t^{ab}_n + R_b] \times \mathbb{R}^4$.

Step 2:

Making an ansatz for the evolution $(\mathcal{A}^n, \Phi^n)$ of the full data $\bigl(A^n_\Lambda + A^{n1}, \phi^n_\Lambda + \phi^{n1}\bigr)[0]$. We now assemble the pieces that we have constructed. We shall write
\[
\mathcal{A}^n := A^{na,\mathrm{low}} + \sum_{b=1}^{B} \mathcal{A}^{nab} + \mathcal{A}^{naB} + \delta^n_A, \tag{7.60}
\]
where $\mathcal{A}^{naB}$ is actually simply given by $A^{naB}$ from Proposition 7.12. We immediately observe the crucial fact that
\[
\delta^n_A[0] = 0,
\]
i.e. the choice of profiles matches the data. We proceed analogously for $\Phi^n$, writing
\[
\Phi^n := \phi^{na,\mathrm{low}} + \sum_{b=1}^{B} \Phi^{nab} + \Phi^{naB} + \delta^n_\Phi, \tag{7.61}
\]
where $\Phi^{naB}$ is actually simply given by $\phi^{naB}$ from Proposition 7.11. We finally observe that by truncating the frequency support of the data of the $\Phi^{nab}$ to a set $\{|\xi| \leq K\}$ for some very large $K$ and incorporating the error into $\delta^n_\Phi$, we may assume that the $\Phi^{nab}$ have frequency support in $|\xi| \leq K$ up to (slowly) exponentially decaying tails. This will be of use later on when controlling the errors.

Step 3:

Showing accuracy of the ansatz.

Here we ﬁnally prove the following key proposition.

Proposition 7.15.

Assuming the conditions of Theorem 7.14 and given any $\delta > 0$, there exists $B$ sufficiently large (depending on the bounds on $(A^{na,\mathrm{low}}, \phi^{na,\mathrm{low}})$, the actual concentration profiles and on $\delta$) such that for all sufficiently large $n$,
\[
\bigl\| \delta^n_\Phi \bigr\|_{S^1} + \bigl\| \delta^n_A \bigr\|_{\ell^1 S^1} < \delta.
\]

In light of the immediately verified facts that
\[
\limsup_{n \to \infty} \sum_{b=1}^{B} \bigl\| \Phi^{nab} \bigr\|_{S^1} + \limsup_{n \to \infty} \sum_{b=1}^{B} \bigl\| \mathcal{A}^{nab} \bigr\|_{S^1} < \infty
\]
and
\[
\limsup_{n \to \infty} \bigl\| \Phi^{naB} \bigr\|_{S^1} + \limsup_{n \to \infty} \bigl\| \mathcal{A}^{naB} \bigr\|_{S^1} < \infty,
\]
this proposition then implies Theorem 7.14. $\Box$

Proof of Proposition 7.15.

For the most part, this consists in checking that the (very large number of) interaction terms sum up to something negligible upon correct choice of $B$ and $n$. We start with the equation for $\delta^n_\Phi$. To begin with, we note that $\delta^n_\Phi[0]$ is not necessarily $0$, since the asymptotic evolution of the profiles $\Phi^{nab}$ given by
\[
\Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB}} \Phi^{nab} = 0
\]
is different than the one used to extract the concentration profiles, i.e. $\widetilde{\Box}_{A^{na}} u = 0$. But we also observe that each profile $\mathcal{A}^{nab'}$ differs from the corresponding linear component in Proposition 7.12 given by
\[
S(A^{ab'}[0])(t - t^{ab'}_n, x - x^{ab'}_n)
\]
by a possibly large term, which however lives in a better space:
\[
\bigl\| \mathcal{A}^{nab'}(t,x) - S(A^{ab'}[0])(t - t^{ab'}_n, x - x^{ab'}_n) \bigr\|_{\ell^1 S^1} < \infty.
\]
Denote this difference by $B^{nab'}(t,x)$. Then it suffices to show

Lemma 7.16. For any temporally unbounded profile $\Phi^{nab}$ we have
\[
\lim_{n \to \infty} \sum_{1 \leq b' \leq B, \, b' \neq b} \bigl\| 2i B^{nab'}_\nu \partial^\nu \Phi^{nab} \bigr\|_N = 0.
\]


Proof.

We proceed as in the proof of Proposition 7.5, expressing the di ﬀ erence B nab ′ in the schematicform X k , j (cid:3) − P k Q j P (cid:0) φ · ∇ x φ + A | φ | ) , or else as a free wave satisfying a Besov ℓ -bound for the data instead of the weaker energy bound.Using the multilinear estimates from [22] and that all factors as well as Φ nab are 1-oscillatory, wereduce to a diagonal situation, where the frequency of all factors as well as the output modulationare essentially restricted to ∼

1, and have generic position, i.e. the Fourier supports do not haveangular alignment. Then, using that the proﬁles A nab ′ disperse away from t ab ′ n uniformly in n byLemma 7.9, we easily infer the claim.To be more precise, we ﬁrst consider the case when A nj , j = , , ,

4, are free waves, which are1-oscillatory, obey the Coulomb condition, and satisfy (cid:13)(cid:13)(cid:13) A nj (cid:13)(cid:13)(cid:13) ℓ S < ∞ . Moreover, assume that Φ n is 1-oscillatory and satisﬁessup n (cid:13)(cid:13)(cid:13) Φ n (cid:13)(cid:13)(cid:13) S < ∞ and in view of the dispersive bounds from Lemma 7.9 alsolim n →∞ (cid:13)(cid:13)(cid:13) Φ n (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x = . We now prove that lim n →∞ (cid:13)(cid:13)(cid:13) iA nj ∂ j Φ n (cid:13)(cid:13)(cid:13) N = . By the 1-oscillatory character of the inputs and the ℓ -Besov bound for A n , one may restrict tofrequencies ∼ ∼ (cid:0) , (cid:1) for the ﬁrst factor, andan interpolate of ( ∞ , ∞ ) with that same space for the second factor to place the output into L t L x .Next, consider the case where A nj is of the schematic form X k , j (cid:3) − P k Q j P i (cid:0) φ · ∇ φ + A | φ | ) . We only consider the most di ﬃ cult case, where the space-time frequency localizations have beenimplemented and the null form structure revealed as in [22, Theorem 12.1]. For example, consideran expression (cid:3) − P k Q j (cid:0) Q ≤ j − C P k φ n ∂ α Q ≤ j − C P k φ n (cid:1) ∂ α Q ≤ j − C P k Φ n , where the k j indicate frequency localizations, all inputs are 1-oscillatory, and satisfy uniform S norm bounds, and Φ n satisﬁes the same vanishing relation as above. Also, from [22] we havethe alignments k = k + O (1), k ≥ k + O (1), j ≤ k + O (1). One may then in fact assume j = k + O (1), since else one gets smallness, and the 1-oscillatory character allows us to assume k , , = k + O (1) = O (1). Then one places the output into L t L x by using the Strichartz exponents (cid:0) , (cid:1) for the ﬁrst two factors, and an interpolate of (cid:0) − , + (cid:1) with (cid:0) ∞ , ∞ (cid:1) for the last factor. Theremaining null forms (see (62) and (63) in [22]) are handled similarly. 
$\Box$

From the preceding lemma, we infer that we can force
\[
\bigl\| \delta^n_\Phi[0] \bigr\|_{\dot{H}^1_x \times L^2_x} \ll \delta,
\]
provided we pick $n$ sufficiently large. The equation for $\delta^n_\Phi$ is given by
\[
\Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB} + \delta^n_A} \Bigl( \phi^{na,\mathrm{low}} + \sum_{b=1}^{B} \Phi^{nab} + \Phi^{naB} + \delta^n_\Phi \Bigr) = 0. \tag{7.62}
\]
We rewrite this in the following form
\[
\Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB} + \delta^n_A} \delta^n_\Phi = - I - II - III, \tag{7.63}
\]
where we put
\begin{align*}
I &:= \Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB} + \delta^n_A} \bigl( \phi^{na,\mathrm{low}} \bigr), \\
II &:= \Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB} + \delta^n_A} \Bigl( \sum_{b=1}^{B} \Phi^{nab} \Bigr), \\
III &:= \Box_{A^{na,\mathrm{low}} + \sum_{b'=1}^{B} \mathcal{A}^{nab'} + \mathcal{A}^{naB} + \delta^n_A} \bigl( \Phi^{naB} \bigr).
\end{align*}
Now the idea is to show smallness of all these terms (in the $N$ norm sense) provided $B$ and then $n$ are chosen sufficiently large. Of course, one needs to be careful with the fact that increasing $B$ also leads to more and more terms in the sums
\[
\sum_{b'=1}^{B} \mathcal{A}^{nab'}, \quad \sum_{b=1}^{B} \Phi^{nab}.
\]
To deal with this, we use

Lemma 7.17.

Given $\delta > 0$, there is a $B_0 > 0$ such that for all $B \geq B_0$ and all sufficiently large $n$ (depending on $B$), it holds that
\[
\Bigl\| \sum_{b=B_0}^{B} \mathcal{A}^{nab} \Bigr\|_{S^1} < \delta, \quad \Bigl\| \sum_{b=B_0}^{B} \Phi^{nab} \Bigr\|_{S^1} < \delta.
\]

Proof.

By construction we have (cid:3) A nab = − χ I nb P Im (cid:0) Φ nab D x Φ nab (cid:1) , where I nb = [ − T b , T b ] for temporally bounded proﬁles and I nb = [ t abn − R b , t abn + R b ] for temporallyunbounded ones. By picking B su ﬃ ciently large, so that E ( A nab , Φ nab ) ≪ , b ≥ B , we get B X b = B (cid:13)(cid:13)(cid:13)(cid:13) χ I nb P Im (cid:0) Φ nab D Φ nab (cid:1)(cid:13)(cid:13)(cid:13)(cid:13) N . B X b = B E ( ˜ A nab , ˜ φ nab ) , where we recall the notation˜ φ nab = S ˜ A nab (cid:0) φ ab [0] (cid:1) (0 − t abn , x − x abn ) , ˜ A nab = S (cid:0) A abj [0] (cid:1) (0 − t abn , x − x abn ) , j = , . . . , . But then since lim sup n →∞ (cid:13)(cid:13)(cid:13)(cid:13) B X b = B A nab [0] (cid:13)(cid:13)(cid:13)(cid:13) ˙ H x × L x < δ , lim sup n →∞ B X b = B E ( ˜ A nab , ˜ φ nab ) < δ ,

04 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION upon choosing B large enough, the ﬁrst bound of the lemma follows. To get the second bound, oneuses that for B and n large enough, as well as making some small additional assumption on the data φ ab [0] (see Remark 7.18 below),(7.64) lim n →∞ (cid:13)(cid:13)(cid:13)(cid:13) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB (cid:18) B X b = B χ I nb Φ nab + (1 − χ I nb ) ˜ Φ nab (cid:19)(cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ , where now χ I nb are suitable smooth time cuto ﬀ s and Φ nab is as in Step 1 with I nb = [ − T b , T b ] or I nb = [ t abn + R b , t abn − R b ], while ˜ Φ nab is as in Step 1 but on the complement ( I nb ) c . Againlim sup n →∞ (cid:13)(cid:13)(cid:13)(cid:13) B X b = B Φ nab [0] (cid:13)(cid:13)(cid:13)(cid:13) ˙ H x × L x < δ , provided B is chosen su ﬃ ciently large. We then infer (cid:13)(cid:13)(cid:13)(cid:13) B X b = B Φ nab (cid:13)(cid:13)(cid:13)(cid:13) S ≪ δ , provided δ is su ﬃ ciently small. Note that there are small error terms due to the cuto ﬀ , whichhowever are harmless and can be made arbitrarily small by picking the cuto ﬀ suitably, see [20]. Infact, we make the Remark 7.18.

To ensure smallness of the errors generated by the cuto ﬀ s χ I nb and − χ I nb , it su ﬃ cesto localize each φ ab [0] in physical space to a ball of radius | I nb | , and each A ab [0] to a ball ofradius | I nb | , say. The errors committed thereby may be included in Φ naB , respectively A naB . Observe that for the term (cid:3) A na , low + P Bb ′ = A nab ′ + A naB P Bb = B χ I nb Φ nab , one generates errors of the schematicform χ ′′ I nb Φ nab , χ ′ I nb ( A na , low + A nab ′ + A naB ) Φ nab , χ I nb (cid:0) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB − (cid:3) A nab (cid:1) Φ nab . Then by using the crude bound (cid:13)(cid:13)(cid:13) χ I nb A nab ′ ∇ t , x Φ nab (cid:13)(cid:13)(cid:13) L t L x . | I nb | (cid:13)(cid:13)(cid:13) χ C bn A nab ′ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) ∇ t , x Φ nab (cid:13)(cid:13)(cid:13) L ∞ t L x , where C bn is a suitable space-time cube of width ∼ | I nb | centered around ( t abn , x abn ), and the impliedconstant depends on the frequency support cuto ﬀ for the Φ nab (see the end of Step 2), we see thatin light of the decay properties of the A nab ′ for b ′ , b , the norm converges to zero as n → ∞ . Oneargues similarly for (cid:13)(cid:13)(cid:13) χ I nb A nab ′ A nab ′′ Φ nab (cid:13)(cid:13)(cid:13) L t L x , b ′ , b , as well as those terms generated when we replace A nab ′ by A na , low or A naB , which then takes careof the third expression χ I nb (cid:0) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB − (cid:3) A nab (cid:1) Φ nab . Note that since we can force things to be arbitrarily small here if we simply choose n large enough,we can also sum over b ∈ [ B , B ], while maintaining smallness. 
The terms χ ′′ I nb Φ nab , χ ′ I nb ( A na , low + A nab ′ + A naB ) Φ nab ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 105 almost cancel the corresponding ones generated by (cid:3) A na , low + P Bb ′ = A nab ′ + A naB (cid:16) B X b = B (1 − χ I nb ) ˜ Φ nab (cid:17) , except the ˜ Φ nab used in the latter di ﬀ ers from Φ nab by a term δ Φ nab whose energy is bounded by (cid:13)(cid:13)(cid:13) ∇ t , x δ Φ nab (cid:13)(cid:13)(cid:13) L ∞ t L x ( I nb × R ) . (cid:13)(cid:13)(cid:13) χ I nb (cid:0) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB − (cid:3) A nab (cid:1) Φ nab (cid:13)(cid:13)(cid:13) N , and the expression on the right here, even when summed over b ∈ [ B , B ], is ≪ δ if we pick n su ﬃ ciently large. This then su ﬃ ces to bound B X b = B (cid:13)(cid:13)(cid:13) χ ′′ I nb δ Φ nab (cid:13)(cid:13)(cid:13) L t L x + B X b = B (cid:13)(cid:13)(cid:13) χ ′ I nb ( A na , low + A nab ′ + A naB ) δ Φ nab (cid:13)(cid:13)(cid:13) L t L x ≪ δ for su ﬃ ciently large n , where we take advantage of the spatial support properties that we assumeabout the data for Φ nab . (cid:3) Next, we show that each of the terms I – III can be made arbitrarily small up to certain errorterms by picking B and then n su ﬃ ciently large. The contribution of I.

One writes schematically (cid:3) A na , low + P Bb ′ = A nab ′ + A naB + δ nA φ na , low = (cid:3) A na , low + P Bb ′ = A nab ′ + A naB φ na , low + i ( δ nA ) ν ∂ ν φ na , low + (cid:16) A na , low + B X b ′ = A nab ′ + A naB + δ nA (cid:17) δ nA φ na , low . Then one has for any B , lim n →∞ (cid:13)(cid:13)(cid:13) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB φ na , low (cid:13)(cid:13)(cid:13) N = P Bb ′ = A nab ′ , A naB and φ na , low as well as due to the fact that by construction we have (cid:3) A na , low φ na , low = . More precisely, one uses an argument as in the proof of Proposition 7.4. We then still have the errorterms(7.65) 2 i ( δ nA ) ν ∂ ν φ na , low and (cid:16) A na , low + B X b ′ = A nab ′ + A naB + δ nA (cid:17) δ nA φ na , low . The second term here shall be straightforward to treat by means of a simple divisibility argument,while the ﬁrst will require the equation satisﬁed by δ nA in conjunction with a divisibility argument.
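The schematic expansion above can be made explicit. The following is only a sketch, assuming the conventions $D_\alpha = \partial_\alpha + iA_\alpha$ and $\Box_A = D^\alpha D_\alpha$ from the introduction; the symbols $A$ (for the full background connection) and $\delta$ (for the perturbation $\delta^n_A$) are shorthand introduced here for illustration.

```latex
% Shorthand (assumption for this sketch): A = background connection, \delta = \delta^n_A.
\begin{align*}
\Box_A \phi
  &= (\partial^\alpha + iA^\alpha)(\partial_\alpha \phi + iA_\alpha \phi) \\
  &= \Box \phi + 2i A_\alpha \partial^\alpha \phi
     + i(\partial^\alpha A_\alpha)\phi - A^\alpha A_\alpha \phi, \\
% Replace A by A + \delta and subtract:
\Box_{A+\delta}\phi - \Box_A \phi
  &= 2i\,\delta_\alpha \partial^\alpha \phi
     + i(\partial^\alpha \delta_\alpha)\phi
     - \big(2A^\alpha + \delta^\alpha\big)\delta_\alpha \phi .
\end{align*}
```

Up to signs, constants and the divergence term $i(\partial^\alpha \delta_\alpha)\phi$ (whose spatial part drops out if $\delta$ satisfies the Coulomb condition), these are the two types of error terms recorded in (7.65): one linear in $\delta^n_A$ with a derivative falling on the scalar field, and one of the form (connection) $\cdot\, \delta^n_A \cdot$ (scalar field).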


The contribution of II.

We write schematically (cid:3) A na , low + P Bb ′ = A nab ′ + A naB + δ nA (cid:16) B X b = Φ nab (cid:17) = B X b = χ I nb (cid:16) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB − (cid:3) A nab (cid:17) Φ nab + B X b = i ( δ nA ) ν ∂ ν Φ nab + B X b = (cid:16) A na , low + B X b ′ = A nab ′ + A naB + δ nA (cid:17) δ nA Φ nab . Here the time intervals I nb correspond to [ − T b , T b ] for the temporally bounded proﬁles and to [ t abn − R b , t abn + R b ] for the temporally unbounded ones. We shall henceforth make the following additionalassumption that | I nb | = M ∀ b chosen very large (eventually depending on δ and the proﬁles). Then we observe that given any δ >

0, we can pick B large enough such that for any su ﬃ ciently large n , we have (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb (cid:0) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB − (cid:3) A nab (cid:1) Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ . To show this, we need (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb i (cid:16) A na , low + B X b ′ = b ′ , b A nab ′ + A naB (cid:17) ν ∂ ν Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ , (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb (cid:16) ( A na , low + B X b ′ = A nab ′ + A naB ) − ( A nab ) (cid:17) Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ . For the ﬁrst expression, observe that the interactions of A nab ′ , b ′ , b , with Φ nab are easily seento vanish as n → ∞ , using crude bounds, due to the time localization from χ I nb , and the divergingsupports of these proﬁles or their dispersive decay. Similarly, the interaction of A na , low with Φ nab isseen to vanish asymptotically as n → ∞ , due to the divergent frequency supports and again takingadvantage of the extra cuto ﬀ χ I nb . Note that at this point we have not yet used the parameter B .Finally, we also need to bound (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb A naB ν ∂ ν Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N , and it is here that we shall take advantage of the size of B . Precisely, we divide the above term intotwo. First, pick B very large, depending on the parameter M (which controls the I nb via | I nb | ≤ M ),such that we have for any B ≥ B ,lim sup n →∞ (cid:13)(cid:13)(cid:13)(cid:13) B X b = B χ I nb A naB ν ∂ ν Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 107

That this is possible follows from Lemma 7.17. Then, with B chosen, pick B ≥ B su ﬃ cientlylarge such that lim sup n →∞ (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb A naB ν ∂ ν Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ . Here we take advantage of the fact that we essentially havelim sup n →∞ (cid:13)(cid:13)(cid:13) A naB ν (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x + L ∞ t ˙ H x → B → ∞ . In fact, we have to be a bit careful here, because in Remark 7.18 we assume thatwe have incorporated some extra errors into the tail terms A naB and Φ naB , which do not vanish as B → ∞ . However, considering the term corresponding to a ﬁxed b ∈ [1 , B ], we have that the extracontribution to A naB (coming from truncating A nab [0]) interacts weakly with Φ nab (in the sensethat it vanishes as | I b | → ∞ ), see e.g. the proof of Proposition 5.13. The remaining contributionsfrom truncating A nab ′ [0] are easily seen to result in interactions vanishing as n → ∞ . The cubicterm (cid:13)(cid:13)(cid:13)(cid:13) B X b = χ I nb (cid:16) ( A na , low + B X b ′ = A nab ′ + A naB ) − ( A nab ) (cid:17) Φ nab (cid:13)(cid:13)(cid:13)(cid:13) N is actually simpler, because the temporal cuto ﬀ s χ I nb are not even necessary to get the desired bound.This completes the estimate for II except for the error terms(7.66) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB + δ nA (cid:16) B X b = Φ nab (cid:17) − (cid:3) A na , low + P Bb ′ = A nab ′ + A naB (cid:16) B X b = Φ nab (cid:17) . The contribution of III.

Here we take advantage of the fact that Φ naB satisﬁes the equation e (cid:3) A na u = B large enough such that for all su ﬃ ciently large n , (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB Φ naB (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ≪ δ . Of course, the proﬁles A nab ′ are not free waves, but they di ﬀ er from free waves by terms that arenegligible as far as interactions with Φ naB are concerned. In fact, we recall that (cid:13)(cid:13)(cid:13) A nab ′ ( t , x ) − S ( A ab ′ [0])( t − t ab ′ n , x − x ab ′ n ) (cid:13)(cid:13)(cid:13) ℓ S < ∞ . Using Lemma 7.17, we can reﬁne this to a tail estimate as follows. There exists B su ﬃ ciently largesuch that denoting B nab ′ : = A nab ′ ( t , x ) − S ( A ab ′ [0])( t − t ab ′ n , x − x ab ′ n ) , we have for any B ≥ B , lim sup n →∞ B X b ′ = B (cid:13)(cid:13)(cid:13) i B nab ′ ν ∂ ν Φ naB (cid:13)(cid:13)(cid:13) N ≪ δ . On the other hand, with this B ﬁxed, we can use the argument for Lemma 7.16 to conclude thatthere exists B ≥ B such that we havelim sup n →∞ B X b ′ = (cid:13)(cid:13)(cid:13) i B nab ′ ν ∂ ν Φ naB (cid:13)(cid:13)(cid:13) N ≪ δ . Finally, we are still left with the error terms(7.67) (cid:3) A na , low + P Bb ′ = A nab ′ + A naB + δ nA Φ naB − (cid:3) A na , low + P Bb ′ = A nab ′ + A naB Φ naB .


We have now shown smallness of the terms I – III up to errors that are at least linear in δ nA givenby (7.65) – (7.67).Having dealt with the equation for δ n Φ , we now come to the equation for δ nA given by (cid:3) (cid:16) A na , lowj + B X b ′ = A nab ′ j + A naBj + ( δ nA ) j (cid:17) = −P j Im (cid:18)(cid:16) φ na , low + B X b = Φ nab + Φ naB + δ n Φ (cid:17) D x (cid:16) φ na , low + B X b = Φ nab + Φ naB + δ n Φ (cid:17)(cid:19) , (7.68)where the covariant derivative D x uses the underlying connection form A na , low + B X b ′ = A nab ′ + A naB + ( δ nA ) . We rewrite this in the form (cid:3) ( δ nA ) = − IV − V , (7.69)where we put schematically IV : = Im (cid:18)(cid:16) φ na , low + B X b = Φ nab + Φ naB + δ n Φ (cid:17) D x (cid:16) φ na , low + B X b = Φ nab + Φ naB + δ n Φ (cid:17)(cid:19) − Im (cid:18)(cid:16) B X b = Φ nab + Φ naB + δ n Φ (cid:17) D x (cid:16) B X b = Φ nab + Φ naB + δ n Φ (cid:17)(cid:19) − Im (cid:16) φ na , low D x φ na , low (cid:17) , V : = Im (cid:18)(cid:16) B X b = Φ nab + Φ naB + δ n Φ (cid:17) D x (cid:16) B X b = Φ nab + Φ naB + δ n Φ (cid:17)(cid:19) − B X b = (cid:3) A nab . The term IV can be written in terms of null forms as well as cubic terms involving at least one lowfrequency factor φ na , low as well as at least one high frequency term from B X b = Φ nab + Φ naB , or else error terms involving at least one factor δ n Φ . The former type of interaction is easily seen toconverge to zero with respect to k · k N as n → ∞ , and so only the latter type of error term needs tobe kept. As for term V , again ignoring the terms involving at least one factor δ n Φ , we reduce this toIm (cid:18)(cid:16) B X b = Φ nab + Φ naB (cid:17) D x (cid:16) B X b = Φ nab + Φ naB (cid:17)(cid:19) − B X b = (cid:3) A nab . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 109

Then from the deﬁnition of the proﬁles Φ nab , we can write this for some large B and B ≥ B as B X b = χ ( I nb ) c Im (cid:16) Φ nab D x Φ nab (cid:17) + Im (cid:18)(cid:16) B X b = Φ nab (cid:17) D x (cid:16) B X b = Φ nab (cid:17)(cid:19) − B X b = Im (cid:16) Φ nab D x Φ nab (cid:17) + Im (cid:18)(cid:16) B X b = B Φ nab (cid:17) D x (cid:16) B X b = Φ nab + Φ naB (cid:17)(cid:19) + Im (cid:18) Φ naB D x (cid:16) B X b = B Φ nab (cid:17)(cid:19) − B X b = B χ I nb Im (cid:16) Φ nab D x Φ nab (cid:17) + Im (cid:16) Φ naB D x Φ naB (cid:17) ≡ ( V ) + ( V ) + ( V ) − ( V ) + ( V ) . Then given a δ > B su ﬃ ciently large such that for all su ﬃ cientlylarge n we have (cid:13)(cid:13)(cid:13) ( V ) (cid:13)(cid:13)(cid:13) N + (cid:13)(cid:13)(cid:13) ( V ) (cid:13)(cid:13)(cid:13) N ≪ δ , using Lemma 7.17. Then one picks n large enough such that (cid:13)(cid:13)(cid:13) ( V ) (cid:13)(cid:13)(cid:13) N ≪ δ . Further, with B ﬁxed, pick B ≥ B su ﬃ ciently large such that (cid:13)(cid:13)(cid:13) ( V ) (cid:13)(cid:13)(cid:13) N ≪ δ . Finally, with B ﬁxed, we choose M = | I nb | large enough (depending on the proﬁles Φ nab , b = , . . . , B , where these of course depend on the n -independent φ ab [0]), such that (cid:13)(cid:13)(cid:13) ( V ) (cid:13)(cid:13)(cid:13) N ≪ δ . This is then the M that needs to be used in the analysis of the δ n Φ equation in the “additionalassumption” there. (cid:3) Conclusion of the induction on frequency process.

In the preceding subsection we obtainedglobal S norm bounds for the MKG-CG evolution of the data (cid:0) A n Λ + A n , φ n Λ + φ n (cid:1) [0]under the assumption that (cid:0) A n , φ n (cid:1) [0] has at least two non-zero concentration proﬁles, or all suchproﬁles are zero, or else there exists only one such proﬁle ( ˜ A n b , ˜ φ n b ) but withlim inf n →∞ E ( ˜ A n b , ˜ φ n b )(0) < E crit . We now make this assumption and continue the process by considering the data(7.70) (cid:0) A n Λ + A n + A n Λ , φ n Λ + φ n + φ n Λ (cid:1) [0]at time t =

0. Proceeding almost identically to Subsection 7.3, we prove that the MKG-CG evolu-tion of this data exists globally and satisﬁes a priori S norm bounds. These bounds depend on E crit

10 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION and the a priori bounds on the evolution of the data (cid:0) A n Λ + A n , φ n Λ + φ n (cid:1) [0]. The only di ﬀ erencehere is that in the decompositions (see Subsection 7.3) A n Λ [0] = Λ ( δ ) X j = A n j ) Λ [0] + A n Λ ( Λ ) [0] ,φ n Λ [0] = Λ ( δ ) X j = φ n j ) Λ [0] + φ n Λ ( Λ ) [0] , we now have to make sure that (cid:13)(cid:13)(cid:13) A n Λ ( Λ ) [0] (cid:13)(cid:13)(cid:13) ˙ B , ∞ × ˙ B , ∞ + (cid:13)(cid:13)(cid:13) φ n Λ ( Λ ) [0] (cid:13)(cid:13)(cid:13) ˙ B , ∞ × ˙ B , ∞ is small enough depending both on E crit and the a priori bounds for the MKG-CG evolution of thedata (cid:0) A n Λ + A n , φ n Λ + φ n (cid:1) [0]. Then we continue by adding the second frequency atom (cid:0) A n , φ n (cid:1) [0]to the data at time t = e (cid:3) A n : = (cid:3) + i (cid:0) A n Λ ,ν + A n ν + A n Λ ,ν + A n , f ree ν (cid:1) ∂ ν , where A n Λ ,ν + A n ν + A n Λ ,ν is given by the global MKG-CG evolution of the data (7.70).All in all we may carry out this process Λ many times in order to ﬁnally conclude that if eitherthere are at least two frequency atoms, or else there is only one frequency atom but withlim inf n →∞ E ( A n , φ n ) < E crit , or if we do have lim n →∞ E ( A n , φ n ) = E crit , but such that there are at least two concentration proﬁles, or ﬁnally if there is only one frequencyatom of asymptotic energy E crit and only one concentration proﬁle ( ˜ A n b , ˜ φ n b ) withlim inf n →∞ E ( ˜ A n b , ˜ φ n b )(0) < E crit , then the sequence ( A n , φ n ) cannot possibly have been essentially singular, resulting in a contradic-tion to our assumption. We can then formulate the following Corollary 7.19.

Assume that $(A^n, \phi^n)$ is an essentially singular sequence. Then by re-scaling we may assume that the sequence of data $(A^n, \phi^n)[0]$ is $1$-oscillatory, and that there exist sequences $\{(t_n, x_n)\}_{n \in \mathbb{N}} \subset \mathbb{R} \times \mathbb{R}^4$ and fixed profiles $(A, \phi)[0] \in (\dot H^1_x \times L^2_x)^4 \times (\dot H^1_x \times L^2_x)$ with $A$ satisfying the Coulomb condition, such that we have for $j = 1, \dots, 4$,

$$A^n_j[0] = S\big(A_j[0]\big)(\cdot - t_n, \cdot - x_n)[0] + o_{\dot H^1_x \times L^2_x}(1)$$

as $n \to \infty$. Here, $S(\cdot)(t, x)$ denotes the standard free wave propagator. Furthermore, define for $j = 1, \dots, 4$,

$$\tilde A_j(t, x) = S\big(A_j[0]\big)(t, x)$$

and denote by $S_{\tilde A}\big(u[0]\big)(t, x)$ the solution to

$$\big(\Box + 2i \tilde A_j \partial^j\big) u = 0$$

with data $u[0] \in \dot H^1_x \times L^2_x$ at time $t = 0$. Then we have

$$\phi^n[0] = S_{\tilde A}\big(\phi[0]\big)(\cdot - t_n, \cdot - x_n)[0] + o_{\dot H^1_x \times L^2_x}(1)$$

as $n \to \infty$.

If the sequence $(t_n)_{n \in \mathbb{N}}$ admits a subsequence that is bounded, then by passing to this subsequence, we may as well replace $t_n$ by $t_n = 0$ for all $n$, and correspondingly obtain up to rescaling and spatial translations that

$$(A^n, \phi^n)[0] = (A, \phi)[0] + o_{\dot H^1_x \times L^2_x}(1).$$

Then Proposition 6.1 implies that the evolution $(A^\infty, \Phi^\infty)$ of $(A, \phi)[0]$ is in fact a minimal energy blowup solution. In the case that $t_n \to +\infty$ (or $t_n \to -\infty$), we need to introduce the concept of a minimum regularity MKG-CG evolution associated with scattering data, or "a solution at infinity". Here we have the following

Proposition 7.20.

Let ( A , φ )[0] be Coulomb energy class data and let { ( t n , x n ) } n ∈ N ⊂ R × R witht n → + ∞ . We introduce the scattering data A nj [0] = S (cid:0) A j [0] (cid:1) ( · − t n , · − x n )[0] , j = , . . . , , Φ n [0] = S ˜ A (cid:0) φ [0] (cid:1) ( · − t n , · − x n )[0] , (7.71) where we use the notation ˜ A j ( t , x ) = S (cid:0) A j [0] (cid:1) ( t , x ) for j = , . . . , . Moreover, we denote by ( A n , Φ n )( t , x ) the MKG-CG evolution (in the sense of Section 5) of the Coulomb data ( A n , Φ n )[0] .Then there exists a su ﬃ ciently large C ∈ R + such that there exists an energy class solution (cid:0) A ∞ , Φ ∞ (cid:1) to MKG-CG on ( −∞ , − C ) × R , which is the limit of admissible solutions as in Section 5 with (cid:13)(cid:13)(cid:13)(cid:0) A ∞ , Φ ∞ (cid:1)(cid:13)(cid:13)(cid:13) S (( −∞ , − C ] × R ) < ∞ ∀ C > C , and such that for any t ∈ ( −∞ , − C ) we have in the energy topology lim n →∞ (cid:0) A n , Φ n (cid:1) ( t + t n , x + x n ) = (cid:0) A ∞ , Φ ∞ (cid:1) ( t , x ) . In particular, the expressions on the left are well-deﬁned (in the sense of Section 5) for n su ﬃ cientlylarge.Proof. This is a perturbative argument, which exploits the dispersive behaviour as evidenced byamplitude decay of the functions A n [0] and Φ n [0]. We write A n ( t , x ) = A n ( t , x ) + δ A n ( t , x ) , Φ n ( t , x ) = Φ n ( t , x ) + δ Φ n ( t , x ) , where we use the notation A nj ( t , x ) = S (cid:0) A j [0] (cid:1) ( t − t n , x − x n ) , j = , . . . , , Φ n ( t , x ) = S ˜ A (cid:0) φ [0] (cid:1) ( t − t n , x − x n ) . Also, keep in mind that ( A n , Φ n )( t , x ) denotes the MKG-CG evolution (in the sense of Section 5) ofthe data ( A n , Φ n )[0]. Then we show that ( δ A n , δ Φ n ) satisfy good S -bounds on ( −∞ , t n − C ) × R for some C > ﬃ ciently large, and all n large enough. This means that the evolutions ( A n , Φ n )are well-deﬁned on ( −∞ , t n − C ) × R . 
Furthermore, assuming as we may that t n is monotonouslyincreasing, we will show that for n ′ > n , we havelim n , n ′ →∞ n ′ > n (cid:13)(cid:13)(cid:13) A n ′ [ t n ′ − t n ] − A n [0] (cid:13)(cid:13)(cid:13) ℓ ˙ H x × ℓ L x + lim n , n ′ →∞ n ′ > n (cid:13)(cid:13)(cid:13) Φ n ′ [ t n ′ − t n ] − Φ n [0] (cid:13)(cid:13)(cid:13) ˙ H x × L x = ,

12 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION which together with standard perturbation theory then results in the fact thatlim n →∞ (cid:0) A n , Φ n (cid:1) ( t + t n , x + x n ) = (cid:0) A ∞ , Φ ∞ (cid:1) ( t , x ) , provided t ∈ ( −∞ , − C ), and the right hand side is a solution to MKG-CG in the sense of Section 5.To get the desired bounds on ( δ A n , δ Φ n ), we record the schematic system of equations that theysatisfy 0 = (cid:3) A n Φ n + (cid:0) (cid:3) A n + δ A n − (cid:3) A n (cid:1) Φ n + (cid:3) A n + δ A n δ Φ n , (7.72) (cid:3) ( δ A nj ) = P j (cid:0) Φ n D x Φ n (cid:1) + P j (cid:0) δ Φ n D x Φ n (cid:1) + P j (cid:0) Φ n D x δ Φ n (cid:1) + P j (cid:0) δ Φ n D x δ Φ n (cid:1) . (7.73)We then show that given δ >

0, there exists C = C ( δ, A [0] , φ [0]) such that we have (cid:13)(cid:13)(cid:13) δ A n (cid:13)(cid:13)(cid:13) ℓ S (( −∞ , t n − C ] × R ) + (cid:13)(cid:13)(cid:13) δ Φ n (cid:13)(cid:13)(cid:13) S (( −∞ , t n − C ] × R ) < δ. This follows as usual via a bootstrap argument. We show here how to obtain smallness of thenon-perturbative source terms on the right hand side, i.e. the terms (cid:3) A n Φ n , P i (cid:0) Φ n D x Φ n (cid:1) , while the remaining terms are handled either via the smallness of δ (provided they are quadratic in δ A n , δ Φ n ), or else via a standard divisibility argument, just as in the proof of Proposition 7.4. Nowthe ﬁrst term on the right is in e ﬀ ect equal to A n ν A n ,ν Φ n . To treat it, we note that we may reduce all inputs as well as the output to frequency ∼

1, since elsewe gain smallness for the L t L x -norm of the output by using standard Strichartz norms. Then weestimate the remainder by (cid:13)(cid:13)(cid:13) P O (1) A n ν P O (1) A n ,ν P O (1) Φ n (cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) . (cid:13)(cid:13)(cid:13) P O (1) A n ν (cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) (cid:13)(cid:13)(cid:13) P O (1) A n ,ν (cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) (cid:13)(cid:13)(cid:13) P O (1) Φ n (cid:13)(cid:13)(cid:13) L t L ∞ x (( −∞ , t n − C ] × R ) . Then by exploiting the L ∞ x decay and interpolation, for example, we get (cid:13)(cid:13)(cid:13) P O (1) A n ,ν (cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) ≪ δ for C su ﬃ ciently large, uniformly in n , and this su ﬃ ces to get the necessary smallness on accountof the fact that uniformly in n , (cid:13)(cid:13)(cid:13) P O (1) A n ν (cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) + (cid:13)(cid:13)(cid:13) P O (1) Φ n (cid:13)(cid:13)(cid:13) L t L ∞ x (( −∞ , t n − C ] × R ) . (cid:13)(cid:13)(cid:13) ( A [0] , φ [0]) (cid:13)(cid:13)(cid:13) ˙ H x × L x . As for the quadratic term P j (cid:0) Φ n D x Φ n (cid:1) , its inherent null structure allows to reduce to the case of frequencies ∼ ∼

1. In that situationwe have (cid:13)(cid:13)(cid:13) P j (cid:0) Φ n D x Φ n (cid:1)(cid:13)(cid:13)(cid:13) N (( −∞ , t n − C ] × R ) . (cid:13)(cid:13)(cid:13) P j (cid:0) Φ n D x Φ n (cid:1)(cid:13)(cid:13)(cid:13) L t L x (( −∞ , t n − C ] × R ) , which can be estimated by placing one input into L t L x (( −∞ , t n − C ] × R ) and the other one into L t L x (( −∞ , t n − C ] × R ). The latter norm is small uniformly in n for C su ﬃ ciently large on accountof (a variant of) Lemma 7.9. (cid:3) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 113

Remark 7.21.

The preceding proof implies in particular that if $(A, \phi)[0]$ is Coulomb energy class data, then there exists $t_0 > 0$ sufficiently large such that the initial data

$$\Big( S\big(A[0]\big)(\cdot - t_0, \cdot),\ S_{\tilde A}\big(\phi[0]\big)(\cdot - t_0, \cdot) \Big)[0],$$

where $\tilde A_j(t, x) = S\big(A_j[0]\big)(t, x)$ for $j = 1, \dots, 4$, can be evolved in the sense of Section 5 on $(-\infty, 0] \times \mathbb{R}^4$ and satisfies a global $S$-bound there. This is the analogue of Proposition 7.15 in [20].

Extracting a minimal blowup solution in the case of one temporally unbounded profile is still not a direct consequence of the preceding proposition on account of the somewhat delicate perturbation theory, but follows by a slightly indirect argument. Here we state

Proposition 7.22.

Assume that the essentially singular sequence $(A^n, \phi^n)$ satisfies

$$A^n[0] = S\big(A[0]\big)(\cdot - t_n, \cdot - x_n)[0] + o_{\dot H^1_x \times L^2_x}(1),$$
$$\phi^n[0] = S_{\tilde A}\big(\phi[0]\big)(\cdot - t_n, \cdot - x_n)[0] + o_{\dot H^1_x \times L^2_x}(1),$$

where $t_n \to +\infty$, say, and we use the same notation as in the preceding Corollary 7.19. Then denoting the corresponding MKG-CG evolution of these data by $(A^n, \phi^n)(t, x)$, its lifespan comprises $(-\infty, t_n - C)$ for $C$ sufficiently large, uniformly in $n$. Also, the sequence

$$\big\{ (A^n, \phi^n)[t_n - C] \big\}_{n \in \mathbb{N}}$$

forms a pre-compact set in the energy topology. Denoting a limit point (any such satisfies the Coulomb condition) by $(A^\infty, \Phi^\infty)[0]$, we have $E(A^\infty, \Phi^\infty) = E_{crit}$, and moreover, denoting the lifespan of its MKG-CG evolution by $I$, we get

$$\sup_{J \subset I} \big\| (A^\infty, \Phi^\infty) \big\|_{S(J \times \mathbb{R}^4)} = \infty.$$

Proof.

The fact that the evolution of ( A n , φ n )( t , x ) is deﬁned and has ﬁnite S -bounds on ( −∞ , t n − C )follows by exactly the same method as in the proof of the preceding proposition. We set A n ( t , x ) = A n ( t , x ) + δ A n ( t , x ) ,φ n ( t , x ) = S ˜ A (cid:0) φ [0] (cid:1) ( t − t n , x − x n ) + δφ n ( t , x ) , where we let A n be the free wave evolution of A n [0], i.e. for j = , . . . , A nj ( t , x ) = S (cid:0) A j [0] (cid:1) ( t − t n , x − x n ) + o ˙ H x × L x (1) , and ˜ A j ( t , x ) = S (cid:0) A j [0] (cid:1) ( t , x ) for j = , . . . ,

4. Also, note that δ A n [0] =

0. Then choosing C largeenough, we infer the bounds (cid:13)(cid:13)(cid:13) ( δ A n , δφ n ) (cid:13)(cid:13)(cid:13) ( ℓ S × S )( −∞ , t n − C ) × R ≪ A n , φ n ) is essentially singular, we know by the preceding results that the data( A n , φ n )[ t n − C ]are concentrated at ﬁxed frequency ∼ (cid:8) ( A n , φ n )[ t n − C ] (cid:9) n ≥ is pre-compact in the energy topology. Extracting a limiting proﬁle ( A ∞ , Φ ∞ )[0], the last statementof the proposition follows directly from Proposition 6.1. (cid:3)


To conclude this section, we finally state the following crucial compactness property of the minimal blowup solution $(A^\infty, \Phi^\infty)$ extracted in the preceding.

Theorem 7.23.

Denote the lifespan of $(A^\infty, \Phi^\infty)$ by $I$. There exist continuous functions $x : I \to \mathbb{R}^4$, $\lambda : I \to \mathbb{R}^+$, so that each of the families of functions

$$\bigg\{ \bigg( \frac{1}{\lambda(t)} A^\infty_j\Big(t, \frac{\cdot - x(t)}{\lambda(t)}\Big),\ \frac{1}{\lambda(t)^2} \partial_t A^\infty_j\Big(t, \frac{\cdot - x(t)}{\lambda(t)}\Big) \bigg) : t \in I \bigg\}$$

for $j = 1, \dots, 4$ and

$$\bigg\{ \bigg( \frac{1}{\lambda(t)} \Phi^\infty\Big(t, \frac{\cdot - x(t)}{\lambda(t)}\Big),\ \frac{1}{\lambda(t)^2} \partial_t \Phi^\infty\Big(t, \frac{\cdot - x(t)}{\lambda(t)}\Big) \bigg) : t \in I \bigg\}$$

is pre-compact in $\dot H^1_x(\mathbb{R}^4) \times L^2_x(\mathbb{R}^4)$.

The proof of this follows exactly as for Corollary 9.36 in [20], using the preceding Remark 7.21.

8. Rigidity argument

In this final section we rule out the existence of a minimal blowup solution $(A^\infty, \Phi^\infty)$ with the compactness property from Theorem 7.23. To this end we largely follow the scheme of the rigidity argument by Kenig-Merle [10].

In Subsection 8.1 we derive several energy and virial identities for energy class solutions to MKG-CG. Then we prove some preliminary properties of the minimal blowup solution $(A^\infty, \Phi^\infty)$, in particular that its momentum must vanish. Denoting by $I$ the lifespan of $(A^\infty, \Phi^\infty)$, we distinguish between $I^+ := I \cap [0, \infty)$ being a finite or an infinite time interval. In the next Subsection 8.2, we exclude the existence of a minimal blowup solution $(A^\infty, \Phi^\infty)$ with infinite time interval $I^+$ using the virial identities, the fact that the momentum of $(A^\infty, \Phi^\infty)$ must vanish and an additional Vitali covering argument introduced in [20]. Moreover, we reduce the case of finite lifespan $I^+$ to a self-similar blowup scenario. In the last Subsection 8.3, we then derive a suitable Lyapunov functional for the Maxwell-Klein-Gordon system in self-similar variables, which will finally enable us to also rule out the self-similar case.

8.1. Preliminary properties of minimal blowup solutions with the compactness property.

We will sometimes use the following notation for the covariant derivatives

$$D_\alpha = \partial_\alpha + i A^\infty_\alpha$$

and the curvature components

$$F^\infty_{\alpha\beta} = \partial_\alpha A^\infty_\beta - \partial_\beta A^\infty_\alpha$$

associated with the minimal blowup solution $(A^\infty, \Phi^\infty)$.

Lemma 8.1.

Let $(A, \phi)$ be an energy class solution to MKG-CG in the sense of Definition 5.3 with lifespan $I$ containing $0$. For given $\varepsilon > 0$, let $R > 0$ be such that

$$\int_{\{|x| \ge R\}} \bigg( \frac14 \sum_{\alpha,\beta} F_{\alpha\beta}(0, x)^2 + \frac12 \sum_\alpha |D_\alpha \phi(0, x)|^2 \bigg)\, dx \le \varepsilon.$$

Then we have for any $t \in I^+$ that

$$\int_{\{|x| \ge R + t\}} \bigg( \frac14 \sum_{\alpha,\beta} F_{\alpha\beta}(t, x)^2 + \frac12 \sum_\alpha |D_\alpha \phi(t, x)|^2 \bigg)\, dx \le \varepsilon.$$

Proof.

Let $(A, \phi)$ be an admissible solution to MKG-CG with lifespan $I$ containing $0$. For $R > 0$ and $t \in I^+$, we define

$$E_R(t) = \int_{\{|x| \ge R + t\}} \bigg( \frac14 \sum_{\alpha,\beta} F_{\alpha\beta}(t, x)^2 + \frac12 \sum_\alpha |D_\alpha \phi(t, x)|^2 \bigg)\, dx.$$

Using that the energy-momentum tensor for the Maxwell-Klein-Gordon system

$$T_{\alpha\beta} = F_{\alpha\gamma} F_\beta^{\ \gamma} - \frac14 m_{\alpha\beta} F_{\gamma\delta} F^{\gamma\delta} + \mathrm{Re}\big( D_\alpha \phi\, \overline{D_\beta \phi} \big) - \frac12 m_{\alpha\beta} D_\gamma \phi\, \overline{D^\gamma \phi},$$

with $m_{\alpha\beta}$ denoting the Minkowski metric, is divergence free,

$$\partial^\alpha T_{\alpha\beta} = 0,$$

we easily obtain from the divergence theorem that for any $t_1, t_2 \in I^+$ with $t_1 < t_2$,

$$E_R(t_1) = E_R(t_2) + \int_{M_{t_1}^{t_2}} \Big( T_{00}(t, x) + \frac{x_j}{|x|} T_{0j}(t, x) \Big)\, d\sigma(t, x). \tag{8.1}$$

Here, $M_{t_1}^{t_2}$ denotes the part of the mantle of the forwards light cone $\{ (t, x) \in I^+ \times \mathbb{R}^4 : |x| \le R + t \}$ enclosed by the time slices $\{t_1\} \times \mathbb{R}^4$ and $\{t_2\} \times \mathbb{R}^4$, and $d\sigma$ denotes the standard surface measure. One easily verifies that the flux

$$T_{00}(t, x) + \frac{x_j}{|x|} T_{0j}(t, x)$$

is non-negative using the general identity

$$\sum_{j,k} \big( \omega_j r_k - \omega_k r_j \big)^2 = 2 \big( |r|^2 - (r \cdot \omega)^2 \big) \le 2 |r|^2$$

for $r, \omega \in \mathbb{R}^4$ with $|\omega| = 1$. We conclude that

$$E_R(t_2) \le E_R(t_1). \tag{8.2}$$

Since an energy class solution to MKG-CG in the sense of Definition 5.3 is a locally uniform limit of admissible solutions, the corresponding inequality (8.2) follows by passing to the limit. This implies the claim. □
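For the reader's convenience, here is one way the non-negativity of the flux can be checked from the identity above. This is a sketch under the assumption that the energy-momentum tensor is normalized with the factors $\frac14$ and $\frac12$ and that the metric has signature $(-,+,+,+,+)$; the shorthand $\omega = x/|x|$ and $E_k = F_{0k}$ is introduced only for this illustration.

```latex
% Shorthand for this sketch: \omega = x/|x|, E_k = F_{0k}; j,k run over 1,...,4.
\begin{align*}
T_{00} + \omega_j T_{0j}
  &= \tfrac12 |E|^2 + \tfrac14 \sum_{j,k} F_{jk}^2
     + \sum_{j,k} \omega_j E_k F_{jk} \\
  &\quad + \tfrac12 |D_0\phi|^2 + \tfrac12 \sum_k |D_k\phi|^2
     + \mathrm{Re}\big( D_0\phi\, \overline{\omega_j D_j \phi} \big).
\end{align*}
% Field part: antisymmetry of F_{jk}, Cauchy-Schwarz, then the identity with r = E.
\begin{align*}
\Big| \sum_{j,k} \omega_j E_k F_{jk} \Big|
  = \tfrac12 \Big| \sum_{j,k} (\omega_j E_k - \omega_k E_j) F_{jk} \Big|
  \le \tfrac12 \big( 2|E|^2 \big)^{1/2} \Big( \sum_{j,k} F_{jk}^2 \Big)^{1/2}
  \le \tfrac12 |E|^2 + \tfrac14 \sum_{j,k} F_{jk}^2 .
\end{align*}
% Scalar part: Cauchy-Schwarz with |\omega| = 1, then Young's inequality.
\begin{align*}
\big| \mathrm{Re}\big( D_0\phi\, \overline{\omega_j D_j\phi} \big) \big|
  \le |D_0\phi| \Big( \sum_k |D_k\phi|^2 \Big)^{1/2}
  \le \tfrac12 |D_0\phi|^2 + \tfrac12 \sum_k |D_k\phi|^2 .
\end{align*}
```

Both cross terms are therefore dominated by the corresponding positive terms, so the flux density is non-negative regardless of the sign convention chosen for $T_{0j}$.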

Next, we prove the following energy and virial identities for energy class solutions to MKG-CG.

Proposition 8.2.

Let $(A, \phi)$ be an energy class solution to MKG-CG in the sense of Definition 5.3. Then the following identities hold.

• Energy conservation

$$\frac{d}{dt} \int_{\mathbb{R}^4} \bigg( \frac14 \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \frac12 \sum_\alpha |D_\alpha \phi|^2 \bigg)\, dx = 0. \tag{8.3}$$

• Momentum conservation

$$\frac{d}{dt} \int_{\mathbb{R}^4} \Big( F_{0j} F_k^{\ j} + \mathrm{Re}\big( D_0 \phi\, \overline{D_k \phi} \big) \Big)\, dx = 0 \tag{8.4}$$

for $k = 1, \dots, 4$.

• Weighted energy

$$\frac{d}{dt} \int_{\mathbb{R}^4} x_k \varphi_R \bigg( \frac14 \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \frac12 \sum_\alpha |D_\alpha \phi|^2 \bigg)\, dx = - \int_{\mathbb{R}^4} \Big( F_{0j} F_k^{\ j} + \mathrm{Re}\big( D_0 \phi\, \overline{D_k \phi} \big) \Big)\, dx + O(r(R)) \tag{8.5}$$

for $k = 1, \dots, 4$.

• Weighted momentum monotonicity

$$\frac{d}{dt} \int_{\mathbb{R}^4} x_k \varphi_R \Big( F_{0j} F_k^{\ j} + \mathrm{Re}\big( D_0 \phi\, \overline{D_k \phi} \big) \Big)\, dx + \frac{d}{dt} \int_{\mathbb{R}^4} \varphi_R\, \mathrm{Re}\big( \phi\, \overline{D_0 \phi} \big)\, dx = - \int_{\mathbb{R}^4} \bigg( \sum_k F_{0k}^2 + |D_0 \phi|^2 \bigg)\, dx + O(r(R)). \tag{8.6}$$

Here, $\varphi \in C^\infty_c(\mathbb{R}^4)$ is a smooth cutoff with $\varphi(x) = 1$ for $|x| \le 1$ and $\varphi(x) = 0$ for $|x| \ge 2$. Moreover, for $R > 0$ we define $\varphi_R(x) = \varphi\big(\frac{x}{R}\big)$ and

$$r(R) := \int_{\{|x| \ge R\}} \bigg( \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \sum_\alpha |D_\alpha \phi|^2 + \frac{|\phi|^2}{|x|^2} \bigg)\, dx. \tag{8.7}$$

Proof.

It suffices to verify these identities for admissible solutions to MKG-CG. Since energy class solutions in the sense of Definition 5.3 are locally uniform limits of admissible solutions, the corresponding identities follow by passing to the limit in an integrated formulation.

So let $(A, \phi)$ be an admissible solution to MKG-CG. Then the energy conservation (8.3) and momentum conservation (8.4) identities follow immediately from the divergence theorem and the fact that the energy-momentum tensor of the Maxwell-Klein-Gordon system

$$T_{\alpha\beta} = F_{\alpha\gamma} F_\beta^{\ \gamma} - \frac14 m_{\alpha\beta} F_{\gamma\delta} F^{\gamma\delta} + \mathrm{Re}\big( D_\alpha \phi\, \overline{D_\beta \phi} \big) - \frac12 m_{\alpha\beta} D_\gamma \phi\, \overline{D^\gamma \phi}$$

for $\alpha, \beta \in \{0, 1, \dots, 4\}$ is divergence free,

$$\partial^\alpha T_{\alpha\beta} = 0. \tag{8.8}$$

To prove the weighted energy identity (8.5), we also use the divergence-free property (8.8) of $T_{\alpha\beta}$ and compute for $k = 1, \dots, 4$,

$$\frac{d}{dt} \int_{\mathbb{R}^4} x_k \varphi_R(x)\, T_{00}\, dx = \int_{\mathbb{R}^4} x_k \varphi_R(x)\, \partial^j T_{0j}\, dx = - \int_{\mathbb{R}^4} \varphi_R(x)\, T_{0k}\, dx - \int_{\mathbb{R}^4} \frac{x_k}{R} (\partial_j \varphi)\big(\tfrac{x}{R}\big)\, T_{0j}\, dx = - \int_{\mathbb{R}^4} T_{0k}\, dx + O(r(R)),$$

where we integrated by parts in the second to last step. This yields (8.5). Finally, to show the weighted momentum monotonicity identity (8.6), we compute

$$\frac{d}{dt} \int_{\mathbb{R}^4} x_k \varphi_R(x)\, T_{0k}\, dx = \int_{\mathbb{R}^4} x_k \varphi_R(x)\, \partial^j T_{jk}\, dx = - \int_{\mathbb{R}^4} \varphi_R(x) \bigg( \sum_{k=1}^4 T_{kk} \bigg)\, dx - \int_{\mathbb{R}^4} \frac{x_k}{R} (\partial_j \varphi)\big(\tfrac{x}{R}\big)\, T_{jk}\, dx = - \int_{\mathbb{R}^4} \bigg( \sum_{k=1}^4 F_{0k}^2 + 2 |D_0 \phi|^2 - \sum_{k=1}^4 |D_k \phi|^2 \bigg)\, dx + O(r(R)). \tag{8.9}$$
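The coefficients on the right hand side of (8.9) come from tracing the spatial part of the energy-momentum tensor. A short sketch of that computation follows, assuming the normalization of $T_{\alpha\beta}$ with factors $\frac14$ and $\frac12$ and the metric $m = \mathrm{diag}(-1, +1, +1, +1, +1)$; spatial indices $j, k$ run over $1, \dots, 4$.

```latex
% Contractions with m = diag(-1,+1,+1,+1,+1):
\begin{align*}
F_{\gamma\delta} F^{\gamma\delta}
  &= -2 \sum_k F_{0k}^2 + \sum_{j,k} F_{jk}^2, \qquad
D_\gamma\phi\, \overline{D^\gamma\phi}
  = -|D_0\phi|^2 + \sum_k |D_k\phi|^2, \\
\sum_{k=1}^4 F_{k\gamma} F_k^{\ \gamma}
  &= -\sum_k F_{0k}^2 + \sum_{j,k} F_{jk}^2 .
\end{align*}
% Spatial trace; the factor 4 is the trace of the spatial part of m in R^{1+4}.
\begin{align*}
\sum_{k=1}^4 T_{kk}
  &= \sum_k F_{k\gamma} F_k^{\ \gamma}
     - \tfrac14 \cdot 4\, F_{\gamma\delta} F^{\gamma\delta}
     + \sum_k |D_k\phi|^2
     - \tfrac12 \cdot 4\, D_\gamma\phi\, \overline{D^\gamma\phi} \\
  &= \sum_k F_{0k}^2 + 2|D_0\phi|^2 - \sum_k |D_k\phi|^2 .
\end{align*}
```

The magnetic components $F_{jk}$ cancel exactly in four space dimensions, which reflects the energy criticality of the problem, while the scalar part still carries the term $-\sum_k |D_k\phi|^2$ with the wrong sign for monotonicity.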

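Since the right hand side of (8.9) does not yet exhibit the desired monotonicity, we also consider

$$\frac{d}{dt} \int_{\mathbb{R}^4} \varphi_R(x)\, \mathrm{Re}\big( \phi\, \overline{D_0 \phi} \big)\, dx = \int_{\mathbb{R}^4} \varphi_R(x)\, \mathrm{Re}\big( \partial_t \phi\, \overline{D_0 \phi} \big)\, dx + \int_{\mathbb{R}^4} \varphi_R(x)\, \mathrm{Re}\big( \phi\, \overline{\partial_t D_0 \phi} \big)\, dx = \int_{\mathbb{R}^4} \varphi_R(x)\, |D_0 \phi|^2\, dx + \int_{\mathbb{R}^4} \varphi_R(x)\, \mathrm{Re}\big( \phi\, \overline{D_0 D_0 \phi} \big)\, dx.$$

Inserting the equation for $\phi$ and integrating by parts leads to

$$\frac{d}{dt} \int_{\mathbb{R}^4} \varphi_R(x)\, \mathrm{Re}\big( \phi\, \overline{D_0 \phi} \big)\, dx = \int_{\mathbb{R}^4} \varphi_R(x)\, |D_0 \phi|^2\, dx - \sum_{k=1}^4 \int_{\mathbb{R}^4} \varphi_R(x)\, |D_k \phi|^2\, dx - \int_{\mathbb{R}^4} \frac{1}{R} (\partial_k \varphi)\big(\tfrac{x}{R}\big)\, \mathrm{Re}\big( \phi\, \overline{D_k \phi} \big)\, dx = \int_{\mathbb{R}^4} \bigg( |D_0 \phi|^2 - \sum_{k=1}^4 |D_k \phi|^2 \bigg)\, dx + O(r(R)). \tag{8.10}$$

Putting together (8.9) and (8.10), we obtain (8.6). □

If $I^+$ is a finite time interval, we obtain a lower bound on $\lambda(t)$ from Theorem 7.23.

Lemma 8.3.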

Assume that $I^+$ is finite and after re-scaling that $I^+ = [0, 1)$. Let $\lambda : I^+ \to \mathbb{R}^+$ be as in Theorem 7.23. Then there exists a constant $C(K) > 0$ such that

$$0 < \frac{C(K)}{1 - t} \le \lambda(t)$$

for all $0 \le t < 1$.

Proof. The proof follows exactly as in [20, Lemma 10.4] by combining Corollary 6.3 and Theorem 7.23. □
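To see how the scaling parameter $\lambda(t)$ enters the estimates that follow, it may help to record the change of variables behind the normalization in Theorem 7.23. The computation below is a sketch for generic functions $u, g$ on $\mathbb{R}^4$ and fixed $\lambda > 0$, $x_0 \in \mathbb{R}^4$ (names chosen here for illustration), assuming the rescaled pair carries the factors $\lambda^{-1}$ and $\lambda^{-2}$.

```latex
% Substitution z = (y - x_0)/\lambda, so dy = \lambda^4 dz in R^4.
\begin{align*}
v(y) &:= \frac{1}{\lambda}\, u\Big( \frac{y - x_0}{\lambda} \Big), &
\|\nabla v\|_{L^2(\mathbb{R}^4)}^2
  &= \lambda^{-4} \int_{\mathbb{R}^4} |\nabla u|^2\Big( \frac{y - x_0}{\lambda} \Big)\, dy
   = \|\nabla u\|_{L^2(\mathbb{R}^4)}^2, \\
w(y) &:= \frac{1}{\lambda^2}\, g\Big( \frac{y - x_0}{\lambda} \Big), &
\|w\|_{L^2(\mathbb{R}^4)}^2
  &= \lambda^{-4} \int_{\mathbb{R}^4} |g|^2\Big( \frac{y - x_0}{\lambda} \Big)\, dy
   = \|g\|_{L^2(\mathbb{R}^4)}^2 .
\end{align*}
% The same substitution applied to a tail of the rescaled function:
\begin{align*}
\int_{\{|y| \ge R'\}} |\nabla v(y)|^2\, dy
  = \int_{\{ |z + x_0/\lambda| \ge R'/\lambda \}} |\nabla u(z)|^2\, dz .
\end{align*}
```

In particular, uniform smallness of the tails of a pre-compact rescaled family translates into smallness of the original function outside balls of radius $R'/\lambda$ around $-x_0/\lambda$, and this radius shrinks to zero when $\lambda \to \infty$, as Lemma 8.3 forces near the endpoint of $I^+$.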

Moreover, when $I^+$ is a finite time interval, we conclude the following sharp support properties of $\Phi^\infty$ and the curvature components $F^\infty_{\alpha\beta}$.

Lemma 8.4.

Under the same assumptions as in Lemma 8.3 there exists $x_0 \in \mathbb{R}^4$ such that

$$\mathrm{supp}\Big( F^\infty_{\alpha\beta}(t, \cdot),\ \Phi^\infty(t, \cdot) \Big) \subset B(x_0, 1 - t)$$

for all $0 \le t < 1$ and all $\alpha, \beta \in \{0, 1, \dots, 4\}$.

Proof. We follow the proof of Lemma 4.8 in [10]. Consider a sequence $\{t_n\}_n \subset [0, 1)$ with $t_n \to 1$ as $n \to \infty$. From the preceding Lemma 8.3 we know that $\lambda(t_n) \to \infty$ as $n \to \infty$. Together with the compactness property expressed in Theorem 7.23, we obtain for every $R > 0$ and $\varepsilon > 0$ that for all sufficiently large $n$, it holds that

$$\int_{\big\{ \big| x + \frac{x(t_n)}{\lambda(t_n)} \big| \ge R \big\}} \bigg( \sum_\alpha \big| \nabla_{t,x} A^\infty_\alpha(t_n, x) \big|^2 + \big| \nabla_{t,x} \Phi^\infty(t_n, x) \big|^2 \bigg)\, dx \le \varepsilon,$$

$$\int_{\big\{ \big| x + \frac{x(t_n)}{\lambda(t_n)} \big| \ge R \big\}} \bigg( \sum_\alpha \big| A^\infty_\alpha(t_n, x) \big|^4 + \big| \Phi^\infty(t_n, x) \big|^4 \bigg)\, dx \le \varepsilon.$$


Applying Lemma 8.1 backwards in time, we conclude for every R > ε >

0, and s ∈ [0 ,

1) thatwe have for all su ﬃ ciently large n ,(8.11) Z(cid:8) | x + x ( tn ) λ ( tn ) |≥ R + t n − s (cid:9)(cid:18) X α,β F ∞ αβ ( s , x ) + X α (cid:12)(cid:12)(cid:12) D α Φ ∞ ( s , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx ≤ ε . Next, we show that there exists M > (cid:12)(cid:12)(cid:12)(cid:12) x ( t ) λ ( t ) (cid:12)(cid:12)(cid:12)(cid:12) ≤ M for all 0 ≤ t <

1. Suppose not. Thenit su ﬃ ces to consider a sequence t n → (cid:12)(cid:12)(cid:12)(cid:12) x ( t n ) λ ( t n ) (cid:12)(cid:12)(cid:12)(cid:12) → ∞ . For all R >

0, we have for su ﬃ cientlylarge n that (cid:8) x : | x | ≤ R (cid:9) ⊂ (cid:26) x : (cid:12)(cid:12)(cid:12)(cid:12) x + x ( t n ) λ ( t n ) (cid:12)(cid:12)(cid:12)(cid:12) ≥ R + t n (cid:27) . But then we obtain from (8.11) with s = R > Z {| x |≤ R } (cid:18) X α,β F ∞ αβ (0 , x ) + X α (cid:12)(cid:12)(cid:12) D α Φ ∞ (0 , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx ≤ ε . Since ε > t n → x ( t n ) λ ( t n ) → − x ∈ R . Now observe that for every η > s ∈ [0 , ﬃ ciently large n that (cid:8) x : | x − x | ≥ η + − s (cid:9) ⊂ (cid:26) x : (cid:12)(cid:12)(cid:12)(cid:12) x + x ( t n ) λ ( t n ) (cid:12)(cid:12)(cid:12)(cid:12) ≥ η + t n − s (cid:27) . Hence, we obtain from (8.11) that for every ε > η > s ∈ [0 , Z {| x − x |≥ η + − s } (cid:18) X α,β F ∞ αβ ( s , x ) + X α (cid:12)(cid:12)(cid:12) D α Φ ∞ ( s , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx ≤ ε . We conclude that supp (cid:16) F ∞ αβ ( t , · ) , (cid:0) D α Φ ∞ (cid:1) ( t , · ) (cid:17) ⊂ B ( x , − t )for all 0 ≤ t < α, β = , , . . . ,

4. The claim then follows from the diamagnetic inequality. (cid:3)
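The diamagnetic inequality invoked in the last step is, pointwise, the bound $\bigl| \partial_\alpha |\phi| \bigr| \leq |D_\alpha \phi|$ wherever $\phi \neq 0$; it follows from $\mathrm{Re}(\overline{\phi}\, D_\alpha \phi) = \mathrm{Re}(\overline{\phi}\, \partial_\alpha \phi)$, because the potential term $iA_\alpha|\phi|^2$ is purely imaginary. The following minimal Python sketch checks this pointwise bound on random samples. The function names `diamagnetic_gap` and `min_gap`, the sampling ranges, and the cutoff near $\phi = 0$ are illustrative assumptions and are not taken from the paper.

```python
import random

def diamagnetic_gap(phi, dphi, a):
    """Return |D phi| - |d|phi|| for one partial derivative at a point with phi != 0.

    phi  : complex value of the scalar field at the point
    dphi : complex value of the partial derivative of phi at the point
    a    : real value of the matching component of the potential
    """
    # derivative of the modulus: d|phi| = Re(conj(phi) * dphi) / |phi|
    d_abs = (phi.conjugate() * dphi).real / abs(phi)
    # covariant derivative: D phi = dphi + i * a * phi
    cov = dphi + 1j * a * phi
    return abs(cov) - abs(d_abs)

def min_gap(samples=10000, seed=1):
    """Smallest gap over random samples (hypothetical sampling ranges)."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        phi = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        if abs(phi) < 1e-3:
            continue  # the pointwise formula for d|phi| needs phi != 0
        dphi = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
        a = rng.uniform(-5, 5)
        worst = min(worst, diamagnetic_gap(phi, dphi, a))
    return worst
```

A non-negative value of `min_gap()` (up to rounding) is consistent with the inequality. In the argument above it is used in the form: if $D_\alpha \Phi^\infty$ vanishes outside a ball, then so does $\nabla |\Phi^\infty|$.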

In the next key proposition we prove that the momentum of the minimal blowup solution $(A^\infty, \Phi^\infty)$ must vanish. This will later allow us to control the movement of the “center of mass”, or more precisely a weighted energy of $(A^\infty, \Phi^\infty)$. For technical reasons we have to distinguish between the case of finite and infinite lifespan.

Proposition 8.5. Let $(A^\infty, \Phi^\infty)$ be as above. Assume that $I^+$ is a finite interval. Then we have for $k = 1, \ldots, 4$ and all $t \in I^+$ that
$$
\int_{\mathbb{R}^4} \biggl( \sum_{j=1}^4 F^\infty_{0j} F^\infty_{kj} + \mathrm{Re}\bigl( D_0 \Phi^\infty \, \overline{D_k \Phi^\infty} \bigr) \biggr)(t,x) \, dx = 0. \tag{8.12}
$$

As for the critical focusing nonlinear wave equation [10] and for critical wave maps [20], the Lorentz invariance of the Maxwell-Klein-Gordon system and transformational properties of the energy under Lorentz transformations are essential ingredients in the proof of Proposition 8.5. We begin by considering the relativistic invariance properties of our system. Assume that $L : \mathbb{R}^{1+4} \to \mathbb{R}^{1+4}$ is a Lorentz transformation, acting on column vectors via multiplication with the matrix $L$. Then $\phi$ transforms according to
$$
\phi \mapsto \phi^L := \phi\bigl( L(t,x) \bigr), \tag{8.13}
$$
which results in
$$
\nabla_{t,x} \phi \mapsto L^t \, \nabla_{t,x} \phi \bigl( L(t,x) \bigr).
$$
Then the potential $A_\alpha$ needs to transform accordingly, i.e. writing this as a column vector indexed by $\alpha$, we transform
$$
A \mapsto A^L := L^t A\bigl( L(t,x) \bigr). \tag{8.14}
$$
Then the expression $\partial^\beta F_{\alpha\beta}$, when interpreted as a column vector in $\alpha$, also transforms according to multiplication with $L^t$, as does the expression $\mathrm{Im}\bigl( \phi \, \overline{D_\alpha \phi} \bigr)$. Under these transformations, the Maxwell-Klein-Gordon system is then invariant. However, the conserved energy does not remain invariant under general Lorentz transformations, and our first step is to quantify this. In the sequel we only consider very specific Lorentz transformations of the form
$$
L = \begin{pmatrix}
\frac{1}{\sqrt{1-d^2}} & \frac{-d}{\sqrt{1-d^2}} & 0 & 0 & 0 \\
\frac{-d}{\sqrt{1-d^2}} & \frac{1}{\sqrt{1-d^2}} & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix} \tag{8.15}
$$
for small $d \in \mathbb{R}$.

Lemma 8.6. Let $(A, \phi)$ be an admissible global solution to MKG-CG and let $L : \mathbb{R}^{1+4} \to \mathbb{R}^{1+4}$ be a Lorentz transformation of the form (8.15) for some $d \in \mathbb{R}$. Then we have for all $t \in \mathbb{R}$ that
$$
\begin{aligned}
E\bigl( A^L, \phi^L \bigr)(t) &= \int_{\mathbb{R}^4} \biggl( \frac{1}{4} \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \frac{1}{2} \sum_\alpha |D_\alpha \phi|^2 \biggr)\bigl( L(t,x) \bigr) \, dx \\
&\quad + \frac{d^2}{1-d^2} \int_{\mathbb{R}^4} \biggl( \sum_{j=2}^4 \bigl( F_{0j}^2 + F_{1j}^2 \bigr) + \sum_{\alpha=0}^1 |D_\alpha \phi|^2 \biggr)\bigl( L(t,x) \bigr) \, dx \\
&\quad - \frac{2d}{1-d^2} \int_{\mathbb{R}^4} \biggl( \sum_{j=2}^4 F_{0j} F_{1j} + \mathrm{Re}\bigl( D_0 \phi \, \overline{D_1 \phi} \bigr) \biggr)\bigl( L(t,x) \bigr) \, dx.
\end{aligned} \tag{8.16}
$$

Proof. The potential $A$ is transformed into $A^L$ as follows:
$$
\begin{aligned}
A^L_0(t,x) &= \frac{1}{\sqrt{1-d^2}} A_0(L(t,x)) - \frac{d}{\sqrt{1-d^2}} A_1(L(t,x)), \\
A^L_1(t,x) &= -\frac{d}{\sqrt{1-d^2}} A_0(L(t,x)) + \frac{1}{\sqrt{1-d^2}} A_1(L(t,x)), \\
A^L_j(t,x) &= A_j(L(t,x)), \quad j = 2, 3, 4.
\end{aligned}
$$


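Then we compute the corresponding curvature components
$$
\begin{aligned}
F^L_{01} &= \partial_t A^L_1 - \partial_1 A^L_0 \\
&= -\frac{d}{\sqrt{1-d^2}} \Bigl( \frac{1}{\sqrt{1-d^2}} \partial_t A_0 - \frac{d}{\sqrt{1-d^2}} \partial_1 A_0 \Bigr) + \frac{1}{\sqrt{1-d^2}} \Bigl( \frac{1}{\sqrt{1-d^2}} \partial_t A_1 - \frac{d}{\sqrt{1-d^2}} \partial_1 A_1 \Bigr) \\
&\quad - \frac{1}{\sqrt{1-d^2}} \Bigl( -\frac{d}{\sqrt{1-d^2}} \partial_t A_0 + \frac{1}{\sqrt{1-d^2}} \partial_1 A_0 \Bigr) + \frac{d}{\sqrt{1-d^2}} \Bigl( -\frac{d}{\sqrt{1-d^2}} \partial_t A_1 + \frac{1}{\sqrt{1-d^2}} \partial_1 A_1 \Bigr) \\
&= F_{01}.
\end{aligned}
$$
Here the right hand side has to be evaluated at $L(t,x)$. We use this convention for the remainder of the proof. Further, we obtain
$$
F^L_{02} = \frac{1}{\sqrt{1-d^2}} \partial_t A_2 - \frac{d}{\sqrt{1-d^2}} \partial_1 A_2 - \partial_2 \Bigl( \frac{1}{\sqrt{1-d^2}} A_0 - \frac{d}{\sqrt{1-d^2}} A_1 \Bigr) = \frac{1}{\sqrt{1-d^2}} F_{02} - \frac{d}{\sqrt{1-d^2}} F_{12}
$$
as well as
$$
F^L_{03} = \frac{1}{\sqrt{1-d^2}} F_{03} - \frac{d}{\sqrt{1-d^2}} F_{13}, \qquad F^L_{04} = \frac{1}{\sqrt{1-d^2}} F_{04} - \frac{d}{\sqrt{1-d^2}} F_{14}.
$$
Similarly, we compute
$$
F^L_{12} = -\frac{d}{\sqrt{1-d^2}} \partial_t A_2 + \frac{1}{\sqrt{1-d^2}} \partial_1 A_2 + \frac{d}{\sqrt{1-d^2}} \partial_2 A_0 - \frac{1}{\sqrt{1-d^2}} \partial_2 A_1 = -\frac{d}{\sqrt{1-d^2}} F_{02} + \frac{1}{\sqrt{1-d^2}} F_{12}
$$
and
$$
F^L_{13} = -\frac{d}{\sqrt{1-d^2}} F_{03} + \frac{1}{\sqrt{1-d^2}} F_{13}, \qquad F^L_{14} = -\frac{d}{\sqrt{1-d^2}} F_{04} + \frac{1}{\sqrt{1-d^2}} F_{14}.
$$
Finally, we have for $i, j \geq 2$ that $F^L_{ij} = F_{ij}$. In summary, we have found that
$$
\frac{1}{4} \sum_{\alpha,\beta} \bigl( F^L_{\alpha\beta} \bigr)^2 = \frac{1}{4} \sum_{\alpha,\beta} F_{\alpha\beta}^2 + \frac{d^2}{1-d^2} \sum_{j=2}^4 \bigl( F_{0j}^2 + F_{1j}^2 \bigr) - \frac{2d}{1-d^2} \sum_{j=2}^4 F_{0j} F_{1j}. \tag{8.17}
$$
We have to carry out the analogous computations for the part of the energy associated with the scalar field $\phi$. Here we have
$$
\begin{aligned}
&\bigl| \bigl( \partial_t + i A^L_0 \bigr) \phi^L \bigr|^2 + \bigl| \bigl( \partial_1 + i A^L_1 \bigr) \phi^L \bigr|^2 \\
&\quad = \Bigl| \frac{1}{\sqrt{1-d^2}} \partial_t \phi - \frac{d}{\sqrt{1-d^2}} \partial_1 \phi + i \Bigl( \frac{1}{\sqrt{1-d^2}} A_0 - \frac{d}{\sqrt{1-d^2}} A_1 \Bigr) \phi \Bigr|^2 \\
&\qquad + \Bigl| -\frac{d}{\sqrt{1-d^2}} \partial_t \phi + \frac{1}{\sqrt{1-d^2}} \partial_1 \phi + i \Bigl( -\frac{d}{\sqrt{1-d^2}} A_0 + \frac{1}{\sqrt{1-d^2}} A_1 \Bigr) \phi \Bigr|^2 \\
&\quad = \frac{1+d^2}{1-d^2} \bigl( |D_0 \phi|^2 + |D_1 \phi|^2 \bigr) - \frac{4d}{1-d^2} \mathrm{Re}\bigl( D_0 \phi \, \overline{D_1 \phi} \bigr)
\end{aligned}
$$
and for $j = 2, 3, 4$,
$$
\bigl| \bigl( \partial_j + i A^L_j \bigr) \phi^L \bigr|^2 = |D_j \phi|^2.
$$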
Thus, we obtain that
$$
\frac{1}{2} \sum_\alpha \bigl| \bigl( \partial_\alpha + i A^L_\alpha \bigr) \phi^L \bigr|^2 = \frac{1}{2} \sum_\alpha |D_\alpha \phi|^2 + \frac{d^2}{1-d^2} \bigl( |D_0 \phi|^2 + |D_1 \phi|^2 \bigr) - \frac{2d}{1-d^2} \mathrm{Re}\bigl( D_0 \phi \, \overline{D_1 \phi} \bigr). \tag{8.18}
$$
The assertion now follows from (8.17) and (8.18). □

The identity (8.16) strongly suggests that if it is impossible to lower the energy by means of a Lorentz transform of the form (8.15) for very small $d$ with a suitable sign, then the momentum must vanish. To make this observation rigorous, we also need to establish a relation between the S norm of an admissible global solution $(A, \phi)$ to MKG-CG and the S norm of a suitable evolution of the data $(A^L, \phi^L)[0]$ obtained from the Lorentz transformed solution $(A^L, \phi^L)$. Here we first observe that for an admissible global solution $(A, \phi)$ to MKG-CG, the Lorentz transformed solution $(A^L, \phi^L)$ is actually globally defined. We can therefore consider the data pair $(A^L, \phi^L)[0]$ and note that $(A^L, \phi^L)[0]$ is $C^\infty$-smooth, but not in Coulomb gauge. Moreover, if $(t,x) \in \mathbb{R}^{1+4}$ are restricted to a space-like hyperplane containing the origin, then we have
$$
\bigl| F_{jk}(t,x) \bigr| \lesssim \bigl( 1 + |t| + |x| \bigr)^{-N}
$$
for $j, k \in \{1, \ldots, 4\}$ and any $N \geq 1$. From the equation satisfied by $F_{\alpha\beta}$ we obtain after integration in time that
$$
\bigl| F_{0k}(t,x) \bigr| \lesssim \bigl( 1 + |t| + |x| \bigr)^{-3}
$$
for $k = 1, \ldots, 4$. Thus, the curvature components of $(A^L, \phi^L)[0]$ decay like $\langle x \rangle^{-3}$ as $|x| \to \infty$, which ensures $L^2_x$-integrability, and the components $\nabla_{t,x} \phi^L$ decay rapidly with respect to $x$. In particular, upon transforming $(A^L, \phi^L)[0]$ into Coulomb gauge, it is meaningful to consider its MKG-CG evolution and its S norm. Then we prove the following technical

Proposition 8.7.

Let $(A, \phi)$ be an admissible global solution to MKG-CG and let $L : \mathbb{R}^{1+4} \to \mathbb{R}^{1+4}$ be a Lorentz transformation of the form (8.15) for sufficiently small $|d|$. Let $(A^L, \phi^L)[0]$ be the data pair obtained from the Lorentz transformed solution $(A^L, \phi^L)$. Assume that $(A^L, \phi^L)[0]$, when transformed into the Coulomb gauge, results in a smooth global solution $\bigl( \tilde{A}^L, \tilde{\phi}^L \bigr)$ to MKG-CG satisfying
$$
\bigl\| \bigl( \tilde{A}^L, \tilde{\phi}^L \bigr) \bigr\|_{S} < \infty.
$$
Then we have for the original evolution $(A, \phi)$ that
$$
\bigl\| (A, \phi) \bigr\|_{S} \leq C\Bigl( \bigl\| \bigl( \tilde{A}^L, \tilde{\phi}^L \bigr) \bigr\|_{S}, L \Bigr).
$$

We defer the technical proof of Proposition 8.7 to the end of this subsection and first prove Proposition 8.5 by combining Lemma 8.6 and Proposition 8.7.

Proof of Proposition 8.5.

In order to be able to apply Proposition 8.7, we have to use smooth solutions that are globally defined, because otherwise we cannot meaningfully apply a Lorentz transformation. In fact, we may exploit that by the preceding Lemma 8.4, the function $\Phi^\infty$ is compactly supported, which means that its Fourier transform cannot also be compactly supported (we may of course assume $\Phi^\infty[0]$ to be non-vanishing, since otherwise the solution extends trivially in a global fashion and cannot be singular). But then, truncating the data $(A^\infty, \Phi^\infty)[0]$ in Fourier space as in Proposition 5.1 and the discussion following it, we may construct a sequence of smooth Coulomb data $(A_n, \phi_n)[0]$ converging to $(A^\infty, \Phi^\infty)[0]$, and if necessary, multiplying the $\phi_n[0]$ in the resulting $(A_n, \phi_n)[0]$ by a small scalar $\lambda_n \in [0,1]$ with $\lambda_n \to 1$ as $n \to \infty$, we may force that for all $n \geq 1$,
$$
E(A_n, \phi_n) < E_{crit}. \tag{8.19}
$$
Note that then the perturbation theory developed in Proposition 5.1 still applies in relation to $(A^\infty, \Phi^\infty)$, since we have not changed the data for $A_n$. This means that the data $(A_n, \phi_n)[0]$ do admit a global MKG-CG evolution by definition of $E_{crit}$, and can thus be Lorentz transformed. In order to justify various conservation laws for the Lorentz transformed $(A_n, \phi_n)$, we observe that we may also localize the data $(A_n, \phi_n)[0]$ in physical space to a sufficiently large ball, using the argument in Subsection 5.2 as well as [7], such that the Lorentz transformed solution also has compact support on bounded time slices, and we still have the above inequality (8.19) for the energy.

We make the hypothesis that the momentum of $(A^\infty, \Phi^\infty)$ does not vanish. Then without loss of generality, there exists $\gamma > 0$ such that for all sufficiently large $n$, we have
$$
\int_{\mathbb{R}^4} \biggl( \sum_{j=1}^4 F_{n,0j} F_{n,1j} + \mathrm{Re}\bigl( (\partial_t + i A_{n,0}) \phi_n \, \overline{(\partial_1 + i A_{n,1}) \phi_n} \bigr) \biggr)(t,x) \, dx \geq \gamma, \tag{8.20}
$$
where $F_{n,\alpha\beta}$ denote the curvature components of $(A_n, \phi_n)$. It suffices to show that a suitable Lorentz transformation $L$ of the form (8.15) exists such that the transformed solutions $(A^L_n, \phi^L_n)$ to the Maxwell-Klein-Gordon system have energies
$$
E(A^L_n, \phi^L_n) \leq E_{crit} - \kappa(\gamma, A^\infty, \Phi^\infty) \tag{8.21}
$$
uniformly in $n$ for some κ(γ, A^∞, Φ^∞) >

0. Then, upon transforming ( A Ln , φ Ln )[0] into the Coulombgauge, we obtain a global solution to MKG-CG with a ﬁnite global S norm bound, and usingProposition 8.7, we can infer a global S norm bound for ( A n , φ n ) uniformly in n , which contradictsthat ( A ∞ , Φ ∞ ) is a singular solution. To implement this strategy, we combine the argument forProposition 4.10 in [10] with Lemma 8.6.By energy conservation for ( A Ln , φ Ln ), we have the relation14 E (cid:0) A Ln , φ Ln (cid:1) (0) = Z E (cid:0) A Ln , φ Ln (cid:1) ( t ) dt , where we recall that for a solution ( A , φ ) to the Maxwell-Klein-Gordon system the energy at time t ∈ R is given by E (cid:0) A , φ (cid:1) ( t ) = Z R (cid:18) X α,β F αβ + X α | D α φ | (cid:19) ( t , x ) dx . According to Lemma 8.6, we can write14 E (cid:0) A Ln , φ Ln (cid:1) (0) = I + I , where I = Z Z R (cid:18) X α,β F n ,αβ + X α (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19)(cid:0) L ( t , x ) (cid:1) dx dt + d − d Z Z R (cid:18) X j = (cid:0) F n , j + F n , j (cid:1) + X α = (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19)(cid:0) L ( t , x ) (cid:1) dx dt ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 123 and I = − d − d Z Z R (cid:18) X j = F n , j F n , j + Re (cid:0) ( ∂ t + iA n , ) φ n ( ∂ + iA n , ) φ n (cid:1)(cid:19)(cid:0) L ( t , x ) (cid:1) dx dt . We recall that the above integrands are evaluated at L ( t , x ) = (cid:18) t − dx √ − d , x − dt √ − d , x , x , x (cid:19) . Next we compute the derivative of I + I with respect to d . To this end we note that for a regularfunction f of compact support, it holds that (see page 173 in [10]) ∂∂ d Z R f (cid:0) L ( t , x ) (cid:1) dx = − − d ∂∂ t Z R x f (cid:0) L ( t , x ) (cid:1) dx . 
Using our assumption that the spatial components of A n as well as φ n are compactly supported onﬁxed time slices, we thus obtain ∂∂ d I ( d ) = − − d Z ∂∂ t Z R x (cid:18) X α,β F n ,αβ + X α (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19)(cid:0) L ( t , x ) (cid:1) dx dt + d (1 − d ) Z Z R (cid:18) X j = (cid:0) F n , j + F n , j (cid:1) + X α = (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19)(cid:0) L ( t , x ) (cid:1) dx dt − d (1 − d ) Z ∂∂ t Z R x (cid:18) X j = (cid:0) F n , j + F n , j (cid:1) + X α = (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19)(cid:0) L ( t , x ) (cid:1) dx dt and ∂∂ d I ( d ) = − + d (1 − d ) Z Z R (cid:18) X j = F n , j F n , j + Re (cid:0) ( ∂ t + iA n , ) φ n ( ∂ + iA n , ) φ n (cid:1)(cid:19)(cid:0) L ( t , x ) (cid:1) dx dt + d (1 − d ) Z ∂∂ t Z R x (cid:18) X j = F n , j F n , j + Re (cid:0) ( ∂ t + iA n , ) φ n ( ∂ + iA n , ) φ n (cid:1)(cid:19)(cid:0) L ( t , x ) (cid:1) dx dt . But then ∂∂ d ( I + I ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) d = = − Z R x (cid:18) X α,β F n ,αβ + X α (cid:12)(cid:12)(cid:12) ( ∂ α + iA n ,α ) φ n (cid:12)(cid:12)(cid:12) (cid:19) ( t , x ) dx (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) t = t = − Z Z R (cid:18) X j = F n , j F n , j + Re (cid:0) ( ∂ t + iA n , ) φ n ( ∂ + iA n , ) φ n (cid:1)(cid:19) ( t , x ) dx dt . Using the weighted energy identity (8.5) (for R → ∞ ) and (8.20), we conclude that ∂∂ d ( I + I ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) d = = − Z Z R (cid:18) X j = F n , j F n , j + Re (cid:0) ( ∂ t + iA n , ) φ n ( ∂ + iA n , ) φ n (cid:1)(cid:19) ( t , x ) dx dt ≤ − γ uniformly for all su ﬃ ciently large n . Also, by energy conservation for ( A n , φ n ), we have for all n that ( I + I )( d = = E ( A n , φ n ) < E crit .


Hence, we can find a small $d > 0$ such that
$$
E\bigl( A^L_n, \phi^L_n \bigr)(0) \leq E_{crit} - \kappa
$$
uniformly for all sufficiently large $n$ for some $\kappa \equiv \kappa(\gamma, A^\infty, \Phi^\infty) > 0$, which yields (8.21) and thus finishes the proof of Proposition 8.5. □
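Two computational ingredients of the preceding argument can be sanity-checked numerically: the pointwise quadratic identity behind (8.16), obtained by adding (8.17) and (8.18), and the differentiation formula in $d$ quoted from [10]. The Python sketch below does both under simplifying assumptions that are not part of the paper: the curvature $F$ and the covariant gradient $D\phi$ are replaced by random values at a single point, and the differentiation formula is tested in one space variable $x_1$ for a hypothetical Gaussian-type test function `f` (the remaining variables $x_2, x_3, x_4$ are untouched by (8.15)). All function names are illustrative.

```python
import math
import random

def boost(d):
    """Matrix of the Lorentz transformation (8.15) acting on R^{1+4}."""
    g = 1.0 / math.sqrt(1.0 - d * d)
    L = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
    L[0][0] = L[1][1] = g
    L[0][1] = L[1][0] = -d * g
    return L

def energy_density(F, D):
    """(1/4) sum_{a,b} F_ab^2 + (1/2) sum_a |D_a phi|^2 at one point."""
    quad_F = sum(F[a][b] ** 2 for a in range(5) for b in range(5))
    return 0.25 * quad_F + 0.5 * sum(abs(z) ** 2 for z in D)

def density_defect(d, seed=0):
    """Left side minus right side of the pointwise identity behind (8.16)."""
    rng = random.Random(seed)
    L = boost(d)
    # random antisymmetric F and random complex D phi (illustrative data)
    F = [[0.0] * 5 for _ in range(5)]
    for a in range(5):
        for b in range(a + 1, 5):
            F[a][b] = rng.uniform(-1, 1)
            F[b][a] = -F[a][b]
    D = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
    # transformation rules following (8.13)-(8.14): F^L = L^t F L, D^L = L^t D
    FL = [[sum(L[m][a] * F[m][n] * L[n][b] for m in range(5) for n in range(5))
           for b in range(5)] for a in range(5)]
    DL = [sum(L[m][a] * D[m] for m in range(5)) for a in range(5)]
    c2 = d * d / (1.0 - d * d)
    c1 = 2.0 * d / (1.0 - d * d)
    square_terms = sum(F[0][j] ** 2 + F[1][j] ** 2 for j in range(2, 5))
    square_terms += abs(D[0]) ** 2 + abs(D[1]) ** 2
    cross_terms = sum(F[0][j] * F[1][j] for j in range(2, 5))
    cross_terms += (D[0] * D[1].conjugate()).real
    rhs = energy_density(F, D) + c2 * square_terms - c1 * cross_terms
    return energy_density(FL, DL) - rhs

def f(t, x):
    """Hypothetical smooth, rapidly decaying test function of (t, x_1)."""
    return math.exp(-(t - 0.3) ** 2 - 2.0 * x * x) * (1.0 + 0.5 * t * x)

def integral(t, d, weight, n=4000, half_width=12.0):
    """Riemann sum of weight(x) * f(L(t, x)) over x in [-half_width, half_width]."""
    g = 1.0 / math.sqrt(1.0 - d * d)
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        total += weight(x) * f(g * (t - d * x), g * (x - d * t))
    return total * h

def derivative_defect(t, d, eps=1e-4):
    """d/dd of int f(L) dx plus (1 - d^2)^(-1) d/dt of int x f(L) dx, by central differences."""
    one = lambda x: 1.0
    ident = lambda x: x
    dd = (integral(t, d + eps, one) - integral(t, d - eps, one)) / (2.0 * eps)
    dt = (integral(t + eps, d, ident) - integral(t - eps, d, ident)) / (2.0 * eps)
    return dd + dt / (1.0 - d * d)
```

Both `density_defect` and `derivative_defect` should return values close to zero, up to rounding and discretization error. This is only a consistency check of the algebra and does not replace the analytic argument.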

We next state the analogous result to Proposition 8.5 when $I^+$ is infinite. Its proof essentially follows the argument of the proof of Proposition 4.11 in [10] using the same modifications as in the preceding proof of Proposition 8.5.

Proposition 8.8. Let $(A^\infty, \Phi^\infty)$ be as above. Assume that $I^+ = [0, \infty)$. Suppose in addition that $\lambda(t) \geq \lambda_0 > 0$ for all $t \geq 0$. Then we have for $k = 1, \ldots, 4$ and all $t \geq 0$ that
$$
\int_{\mathbb{R}^4} \biggl( \sum_{j=1}^4 F^\infty_{0j} F^\infty_{kj} + \mathrm{Re}\bigl( D_0 \Phi^\infty \, \overline{D_k \Phi^\infty} \bigr) \biggr)(t,x) \, dx = 0. \tag{8.22}
$$

It remains to give the proof of Proposition 8.7.

Proof of Proposition 8.7.

We are given an admissible global solution $(A, \phi)$ to MKG-CG and a Lorentz transformation $L : \mathbb{R}^{1+4} \to \mathbb{R}^{1+4}$ of the form (8.15) for small $d \in \mathbb{R}$. Applying the Lorentz transformation $L$ to $(A, \phi)$, we obtain a global solution $(A^L, \phi^L)$ to the Maxwell-Klein-Gordon system. Next we define the gauge transform
$$
\gamma = \sum_{l=1}^4 \Delta^{-1} \bigl( \partial_l A^L_l \bigr) = \Delta^{-1} \bigl( \partial^l A^L_l \bigr)
$$
and set
$$
\tilde{\phi}^L = e^{i\gamma} \phi^L, \qquad \tilde{A}^L_\alpha = A^L_\alpha - \partial_\alpha \gamma, \quad \alpha = 0, 1, \ldots, 4.
$$
Then $(\tilde{A}^L, \tilde{\phi}^L)$ is in Coulomb gauge and a global solution to MKG-CG. By assumption we have that
$$
\bigl\| \bigl( \tilde{A}^L, \tilde{\phi}^L \bigr) \bigr\|_{S} < \infty.
$$
Now the difficulty in controlling the S norm of $(A, \phi)$ is that this norm is far from invariant under the operation of Lorentz transformations. Nonetheless, one can establish control over a certain set of norms of $(A, \phi)$ that are essentially invariant under Lorentz transformations, and which in turn imply control over the full S norm of $(A, \phi)$. We do this in the following observations.

Observation 1:

For C = C ( L ) with C ( L ) → ∞ as L → Id, i.e. as d → , we have the bounds (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) ∇ x P k Q [ k + C , k + C ] c φ L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S , (8.23) (cid:18)X k ∈ Z − ν k (cid:13)(cid:13)(cid:13) ∇ x P k Q [ k + C , k + C ] φ L (cid:13)(cid:13)(cid:13) L t L + x (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S (8.24) for some ν > . Similarly for A L , we have the bounds (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) ∇ x P k Q [ k + C , k + C ] c A L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S , (8.25) (cid:18)X k ∈ Z − (1 + ) k (cid:13)(cid:13)(cid:13) ∇ x P k Q [ k + C , k + C ] A L (cid:13)(cid:13)(cid:13) L t L + x (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S . (8.26) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 125

Moreover, we have the bounds (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) ∇ x P k Q ≤ k + C φ L (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S , (8.27) (cid:18)X k ∈ Z − k (cid:13)(cid:13)(cid:13) P k Q ≤ k + C φ L (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:19) . (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S , (8.28) Here the implicit constants may also depend on (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S .Proof of Observation 1. We ﬁrst derive suitable estimates on the gauge transform γ , which will thenallow us to obtain the claimed bounds in the statement of Observation 1. To this end we compute γ in terms of ˜ A L , for which we already have good bounds by assumption. Note that in view of (8.14),we have (cid:0) ˜ A L (cid:1) L − = A − ( ∇ t , x γ ) L − = A − ∇ t , x (cid:0) γ ( L − · ) (cid:1) and so we get γ ( L − · ) = − ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l . Thus, we ﬁnd γ ( t , x ) = − (cid:16) ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) L ( t , x ) = − (cid:16) ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) ( L ( t , x )) . Now for any ﬁxed dyadic frequency k ∈ Z , we can write (cid:16) P k ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) ( t , x ) = Z R m lk ( a ) (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l ( t , x − a ) da for suitable L x -functions m lk ( a ) with L x -mass ∼ − k , and further (cid:16) P k ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) L ( t , x ) = Z R m lk ( a ) (cid:16)(cid:0) L − (cid:1) t ˜ A L (cid:17) l (cid:0) ( t , x ) − L − (0 , a ) (cid:1) da . 
Also, if j ≤ k + C for suitable C = C ( L ), then Fourier localization to dyadic modulation 2 j andspatial frequency 2 k essentially commute with the Lorentz transformation, provided C is not toolarge depending on d , and so we have (cid:16) P k Q j ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) L ( t , x ) = P k Q j (cid:18)Z R m lk ( a ) (cid:16) P k + O (1) Q j + O (1) (cid:16)(cid:0) L − (cid:1) t ˜ A L (cid:17) l (cid:0) ( t , x ) − L − (0 , a ) (cid:1) da (cid:19) , where we note that the right hand side is a linear combination of all the components (cid:0) ˜ A L (cid:1) α . Thisimmediately implies for j ≤ k + C that2 j (cid:13)(cid:13)(cid:13) P k Q j ∇ x γ (cid:13)(cid:13)(cid:13) L t L x = j (cid:13)(cid:13)(cid:13) P k Q j ∇ x (cid:16) ∆ − ∂ l (cid:0)(cid:0) ˜ A L (cid:1) L − (cid:1) l (cid:17) L (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) P k ˜ A L (cid:13)(cid:13)(cid:13) X , ∞ . Similarly, one shows that for j > k + C we have2 j (cid:13)(cid:13)(cid:13) P k Q j ∇ x γ (cid:13)(cid:13)(cid:13) L t L x . k − j (cid:13)(cid:13)(cid:13) P j ∇ t , x ˜ A L (cid:13)(cid:13)(cid:13) X , ∞ . In fact, here, the very large modulation j then gets transferred to the frequency after Lorentz trans-form. Finally, for the expression P k Q [ k + C , k + C ] γ , the Lorentz transformation may lead to smallfrequencies . k , which is why we can only place the expression into L t L + x then via Bernstein, i.e. (cid:18)X k ∈ Z − (1 + ) k (cid:13)(cid:13)(cid:13) P k Q [ k + C , k + C ] ∇ t , x γ (cid:13)(cid:13)(cid:13) L t L + x (cid:19) . (cid:13)(cid:13)(cid:13) ˜ A L (cid:13)(cid:13)(cid:13) S .


One also infers by similar reasoning that (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) P k ∇ t , x γ (cid:13)(cid:13)(cid:13) L t L x ∩ L t ˙ W , x (cid:19) . (cid:13)(cid:13)(cid:13) ˜ A L (cid:13)(cid:13)(cid:13) S as well as (cid:13)(cid:13)(cid:13) P k Q ≤ k + C ∇ t , x γ (cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) P k + O (1) ˜ A L (cid:13)(cid:13)(cid:13) L ∞ t L x . With these bounds on γ in hand, we can now start the derivation of the bounds for φ L = e − i γ ˜ φ L .For j ≤ k + C we write P k Q j (cid:0) e − i γ ˜ φ L (cid:1) = P k Q j (cid:0) P ≤ j − Q ≤ j − ( e − i γ ) ˜ φ L (cid:1) + P k Q j (cid:0) P ≤ j − Q > j − ( e − i γ ) ˜ φ L (cid:1) + P k Q j (cid:0) P > j − ( e − i γ ) ˜ φ L (cid:1) . (8.29)For the ﬁrst term on the right, we have P k Q j (cid:0) P ≤ j − Q ≤ j − ( e − i γ ) ˜ φ L (cid:1) = P k Q j (cid:0) P ≤ j − Q ≤ j − ( e − i γ ) P k + O (1) Q j + O (1) ˜ φ L (cid:1) , and so we infer(8.30) 2 j (cid:13)(cid:13)(cid:13) ∇ t , x P k Q j (cid:0) P ≤ j − Q ≤ j − ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) P k ∇ t , x ˜ φ L (cid:13)(cid:13)(cid:13) X , ∞ . For the second term on the right hand side of (8.29), we write schematically P k Q j (cid:0) P ≤ j − Q > j − ( e − i γ ) ˜ φ L (cid:1) = − j P k Q j (cid:0) P ≤ j − Q > j − ( ∂ t γ e − i γ ) ˜ φ L (cid:1) and so we get from the preceding2 j (cid:13)(cid:13)(cid:13) ∇ t , x P k Q j (cid:0) P ≤ j − Q > j − ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j · − j (cid:13)(cid:13)(cid:13) P ≤ j − Q > j − ( ∂ t γ e − i γ ) (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k ∇ t , x ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) ˜ A L (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k ∇ t , x ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . 
(8.31)For the last term on the right hand side of (8.29), write it as P k Q j (cid:0) P > j − ( e − i γ ) ˜ φ L (cid:1) = P k Q j (cid:0) P [ j − , k − ( e − i γ ) ˜ φ L (cid:1) + P k Q j (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) + P k Q j (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) . (8.32)The ﬁrst term on the right is bounded by(8.33) 2 j (cid:13)(cid:13)(cid:13) ∇ x P k Q j (cid:0) P [ j − , k − ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j k − j (cid:13)(cid:13)(cid:13) P [ j − , k − ( ∇ x γ e − i γ ) (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) ∇ x P k ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . ( k − j ) (cid:13)(cid:13)(cid:13) ˜ A L (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P k ∇ x ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . For the second term on the right hand side of (8.32), we write it schematically as P k Q j (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) = − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P ≤ k − Q ≤ k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1) + − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P ≤ k − Q > k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1) + − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P > k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1) . (8.34) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 127

Then we estimate the ﬁrst term on the right of (8.34) by(8.35) 2 j (cid:13)(cid:13)(cid:13) ∇ x − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P ≤ k − Q ≤ k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j (cid:13)(cid:13)(cid:13) P [ k − , k + Q ≤ k + C ∇ x γ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P ≤ k + Q ≤ k ˜ φ L (cid:13)(cid:13)(cid:13) L t L ∞ x + j (cid:13)(cid:13)(cid:13) P [ k − , k + ∇ x γ (cid:13)(cid:13)(cid:13) L t L + x (cid:13)(cid:13)(cid:13) P ≤ k + Q > k ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L − x . ( j − k ) (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ˜ A L (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) ˜ φ L (cid:13)(cid:13)(cid:13) S . Further, we get for the second term on the right of (8.34) that2 j (cid:13)(cid:13)(cid:13) ∇ x − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P ≤ k − Q > k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . j − k (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) ∂ t γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P ≤ k + ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . j − k k (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) ˜ φ L (cid:13)(cid:13)(cid:13) S . (8.36)The third term on the right hand side of (8.34)2 − k P k Q j (cid:0) P [ k − , k + ( ∇ x γ P > k − ( e − i γ )) P ≤ k + ˜ φ L (cid:1) is handled similarly, which concludes the treatment of the contribution of the second term on theright hand side of (8.32), namely P k Q j (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) . To treat the third term on the right hand side of (8.32), i.e. 
the high-high interaction term P k Q j (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) , we write it schematically as P k Q j (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) = X k > k + k = k + O (1) P k Q j (cid:0) P k ( e − i γ ) P k ˜ φ L (cid:1) = X k > k + k = k + O (1) − k P k Q j (cid:0) P k ( ∇ x γ e − i γ ) P k ˜ φ L (cid:1) and so we can estimate this by2 j (cid:13)(cid:13)(cid:13) ∇ x P k Q j (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L x . X k > k + k = k + O (1) ( j + k ) k − k k − k (cid:13)(cid:13)(cid:13) ∇ x γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k ∇ x ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . (8.37)Combining the bounds (8.30) – (8.37) and square-summing over k , the estimate (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) ∇ t , x P k Q ≤ k + C φ L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) . (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S with implied constant also depending on (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S easily follows. We omit the estimate for (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) ∇ t , x P k Q > k + C φ L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) as it is similar. This proves the ﬁrst bound (8.23) in the statement of Observation 1.


Next, we turn to the proof of the second bound (8.24) and consider P k Q [ k + C , k + C ] ( e − i γ ˜ φ L ) = P k Q [ k + C , k + C ] (cid:0) P ≤ k − Q ≤ k − ( e − i γ ) ˜ φ L (cid:1) + P k Q [ k + C , k + C ] (cid:0) P ≤ k − Q > k − ( e − i γ ) ˜ φ L (cid:1) + P k Q [ k + C , k + C ] (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) + P k Q [ k + C , k + C ] (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) . (8.38)Each of these terms is straightforward to estimate. For the ﬁrst term on the right, we obtain (cid:13)(cid:13)(cid:13) P k Q [ k + C , k + C ] (cid:0) P ≤ k − Q ≤ k − ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L + x . (cid:13)(cid:13)(cid:13) P k + O (1) Q k + O (1) ˜ φ L (cid:13)(cid:13)(cid:13) L t L + x . ( ν − k (cid:13)(cid:13)(cid:13) P k ˜ φ L (cid:13)(cid:13)(cid:13) S . Also, we get (cid:13)(cid:13)(cid:13) P k Q [ k + C , k + C ] ( P ≤ k − Q > k − ( e − i γ ) ˜ φ L ) (cid:13)(cid:13)(cid:13) L t L + x . − k (cid:13)(cid:13)(cid:13) ∂ t γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k + O (1) ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L + x . ( ν − k (cid:13)(cid:13)(cid:13) ∂ t γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P k ˜ φ L (cid:13)(cid:13)(cid:13) S , and (cid:13)(cid:13)(cid:13) P k Q [ k + C , k + C ] (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L t L + x . − k (cid:13)(cid:13)(cid:13) ∇ x γ (cid:13)(cid:13)(cid:13) L t L x (cid:13)(cid:13)(cid:13) P ≤ k + O (1) ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L + x . ( ν − k (cid:13)(cid:13)(cid:13) ∇ x γ (cid:13)(cid:13)(cid:13) L t L x X l ≤ k + O (1) ν ( l − k ) (cid:13)(cid:13)(cid:13) P l ˜ φ L (cid:13)(cid:13)(cid:13) S . The last term on the right hand side of (8.38) can be handled similarly. 
These estimates then yieldthe second inequality (8.24) of Observation 1.We also observe that the estimates on γ established earlier yield the required bounds (8.25) and(8.26) for A L = ˜ A L + ∇ t , x γ .Now we turn to the last bounds (8.27) and (8.28) in the statement of Observation 1. We onlyprove (8.27), the proof of (8.28) being similar. We write P k Q ≤ k + C φ L = P k Q ≤ k + C (cid:0) P ≤ k − ( e − i γ ) ˜ φ L (cid:1) + P k Q ≤ k + C (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) + P k Q ≤ k + C (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) . (8.39)The ﬁrst term is directly bounded by(8.40) (cid:13)(cid:13)(cid:13) ∇ x P k Q ≤ k + C (cid:0) P ≤ k − ( e − i γ ) ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . The second term on the right of (8.39) is a bit more complicated. We write it schematically as P k Q ≤ k + C (cid:0) P [ k − , k + ( e − i γ ) ˜ φ L (cid:1) = P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ e − i γ ) P ≤ k + O (1) ˜ φ L (cid:1) = P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − Q ≤ k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1) + P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − Q > k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1) + P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P > k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1) + P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − ( e − i γ )) P > k − ˜ φ L (cid:1) . (8.41) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 129

Then we get for the ﬁrst term of the last list of four terms (cid:13)(cid:13)(cid:13) ∇ x P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − Q ≤ k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) P [ k − , k + Q ≤ k + C + ∇ x γ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P ≤ k − ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x . (cid:13)(cid:13)(cid:13) P k + O (1) ˜ A L (cid:13)(cid:13)(cid:13) L ∞ t L x X l ≤ k − l (cid:13)(cid:13)(cid:13) P l ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x ˜ A L (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) ˜ φ L (cid:13)(cid:13)(cid:13) S , (8.42)where we have taken advantage of our previous considerations on the structure of γ . For the secondterm on the above list (8.41), we get(8.43) (cid:13)(cid:13)(cid:13) ∇ x P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − Q > k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) P ≤ k + O (1) ∇ x γ (cid:13)(cid:13)(cid:13) L ∞ t L + x (cid:13)(cid:13)(cid:13) P ≤ k − Q > k − ( e − i γ ) (cid:13)(cid:13)(cid:13) L ∞ t L + x (cid:13)(cid:13)(cid:13) P ≤ k − ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L ∞− x . (cid:13)(cid:13)(cid:13) ˜ A L (cid:13)(cid:13)(cid:13) S X l ≤ k − − σ ( k − l ) (cid:13)(cid:13)(cid:13) P l ˜ φ L (cid:13)(cid:13)(cid:13) S . The term P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P > k − ( e − i γ )) P ≤ k − ˜ φ L (cid:1) is handled similarly. Finally, we have (cid:13)(cid:13)(cid:13) ∇ x P k Q ≤ k + C (cid:0) − k P [ k − , k + ( ∇ x γ P ≤ k − ( e − i γ )) P > k − ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) L ∞ t L x . (cid:13)(cid:13)(cid:13) P k + O (1) ∇ x γ (cid:13)(cid:13)(cid:13) L ∞ t L x (cid:13)(cid:13)(cid:13) P k + O (1) ˜ φ L (cid:13)(cid:13)(cid:13) L ∞ t L x . (8.44)The bounds (8.40) – (8.44) su ﬃ ce to perform the square summation over k in the last inequality ofObservation 1. 
The last term on the right hand side of (8.39) P k Q ≤ k + C (cid:0) P > k + ( e − i γ ) ˜ φ L (cid:1) is treated similarly and hence omitted here. (cid:3) Observation 2:

We have the bound X k ≤ k − k (cid:13)(cid:13)(cid:13) P k Q ±≤ k + C φ L P k Q ±≤ k + C ∇ t , x φ L (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S . Moreover, for any L -space-time integrable weight function m ( a ) , a ∈ R + , we have X k ≤ k − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)Z R + m ( a ) P k Q ±≤ k + C φ L ( · − a ) P k Q ±≤ k + C ∇ t , x φ L ( · − a ) da (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S . Similar bounds can be obtained upon replacing one or more factors by A L . We make the crucialobservation that these bounds are essentially invariant under mild Lorentz transformations. Thus,we infer similar bounds for A and φ .Proof of Observation 2. Here one places the low frequency input P k Q ±≤ k + C φ L into L t L ∞ x and the high frequency input P k Q ±≤ k + C ∇ t , x φ L into L ∞ t L x by using Observation 1. (cid:3)


Using Observation 1 and Observation 2, we can now move toward controlling the norm k ( A , φ ) k S .From above, we know a priori that we control (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) P k Q [ k + C , k + C ] c ∇ t , x φ L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) as well as norms of the form X k ≤ k − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)Z R + m ( a ) P k Q ±≤ k + C φ L ( · − a ) P k Q ±≤ k + C ∇ t , x φ L ( · − a ) da (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x ! / . The latter has the crucial divisibility property, i.e. for any δ > R into intervals I , I , . . . , I N , with N depending on the size of the norm as well as δ such that we havefor each n = , . . . , N that X k ≤ k − k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) χ I n ( t ) Z R + m ( a ) P k Q ±≤ k + C φ L ( · − a ) P k Q ±≤ k + C ∇ t , x φ L ( · − a ) da (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x ! / ≤ δ. Of course, we get a similar statement for weakened versions of the former norm such as (cid:18)X k ∈ Z (cid:13)(cid:13)(cid:13) P k Q [ k − C , k + C ] ∇ t , x φ L (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) . In order to infer the desired S norm bound on ( A , φ ), we shall partition the time axis R into ﬁnitelymany intervals I , . . . , I N , whose number depends on (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S and such that on each of these I n , we can infer via a direct bootstrap argument a bound on k ( A , φ ) k S ( I n × R ) . This will then su ﬃ ce to obtain the desired bound on k ( A , φ ) k S ( R × R ) . We do this in two steps, whichwe outline below. Step 1:

Given δ > and δ > , using the known a priori bound on (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S and choosing theintervals I n suitably as above (whose number will depend on δ , δ , as well as the assumed boundon (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S ) , we infer from the equation for A, upon writingA | I n = A f ree , ( I n ) + A nonlin , ( I n ) , that there exists a decompositionA nonlin , ( I n ) = A nonlin , ( I n ) + A nonlin , ( I n ) , where we have schematicallyA nonlin = X k (cid:3) − P k Q k + O (1) ( P k + O (1) Q < k − C φ ∇ x P k + O (1) Q < k − C φ ) , while we also have the bound (cid:13)(cid:13)(cid:13) A nonlin (cid:13)(cid:13)(cid:13) ℓ S ( I n × R ) ≤ δ + δ (cid:16) k ( A , φ ) k S ( I n × R ) + k ( A , φ ) k S ( I n × R ) (cid:17) for all n = , , . . . , N. Moreover, it holds that (cid:13)(cid:13)(cid:13) A nonlin (cid:13)(cid:13)(cid:13) S ( I n × R ) ≤ δ for all n = , , . . . , N. ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 131

The idea behind this bound is to insert it into the equation for $\phi$ and pick $\delta_1, \delta_2$ small depending on $E_{crit}$.

Proof of Step 1.

We proceed as in the proof of Lemma 7.6 and write the source term in the equation for $A$ in the schematic form $\Box A_i = \Delta^{-1} \nabla_r \mathcal{N}_{ir}(\phi, \phi) + A |\phi|^2$. Localizing this to frequency k =

0, we write the right hand side in the form P (cid:16) ∆ − ∇ r N ir (cid:0) φ, φ (cid:1) + A | φ | (cid:17) = P (cid:18) X k , k ∆ − ∇ r N ir (cid:0) P k φ, P k φ (cid:1) + X k , k , k P k AP k φ P k φ (cid:19) . We ﬁrst deal with the quadratic null form term and reduce this to moderate frequencies by observingthat for C = C ( δ ) large enough, we obtain (for a suitable absolute constant σ independent of allother constants) (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X | k | > C , k P (cid:16) ∆ − ∇ r N ir ( P k φ, P k φ ) (cid:17)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ≤ δ X k > C − σ | k | (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S . Generalizing to arbitrary output frequencies, one easily gets from here the bound X k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X | k − k | > C , k P k (cid:16) ∆ − ∇ r N ir (cid:0) P k φ, P k φ (cid:1)(cid:17)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N k . δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S . Next, we pick C = C ( E crit ) such that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P (cid:16) X | k | , | k | < C ∆ − ∇ r N ir (cid:0) P k Q > C φ, P k φ ) (cid:17)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ≤ δ X | k | , | k | < C (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k , and generalizing to general output frequencies, we then reduce to X k P k (cid:18) X | k , − k | < C ∆ − ∇ r N ir (cid:0) P k Q ≤ k + C φ, P k Q ≤ k + C φ (cid:1)(cid:19) . Depending on our choice of C , we may assume the Lorentz transform L to be chosen su ﬃ cientlyclose to the identity, i.e. | d | su ﬃ ciently small, such that according to Observation 1 we have (cid:18) X k (cid:13)(cid:13)(cid:13) P k Q [ k − C , k + C ] ∇ t , x φ (cid:13)(cid:13)(cid:13) X , ∞ (cid:19) . (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S . As observed before, this norm has the divisibility property, so that restricting to suitable time inter-vals I n , n = , . 
. . , N , which form a partition of the time axis R , we may assume (cid:18)X k (cid:13)(cid:13)(cid:13) P k Q [ k − C , k + C ] ∇ t , x φ (cid:13)(cid:13)(cid:13) X , ∞ ( I n × R ) (cid:19) ≤ δ for all n = , . . . , N . But then we easily infer the bound (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P (cid:18) X | k | , | k | < C ∆ − ∇ r N ir (cid:0) P k Q [ k − C , k + C ] φ, P k φ (cid:1)(cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:13)(cid:13)(cid:13) P k Q [ k − C , k + C ] φ (cid:13)(cid:13)(cid:13) L t L x ( I n × R ) (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) L t L ∞ x ≤ δ (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k ,

32 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION and this su ﬃ ces again, after generalizing this to arbitrary output frequency. In fact, we get X k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P k (cid:18) X | k , − k | < C ∆ − ∇ r N ir (cid:0) P k Q [ k − C , k + C ] φ, P k φ (cid:1)(cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N k ( I n × R ) . (cid:18) X k (cid:13)(cid:13)(cid:13) P k Q [ k − C , k + C ] ∇ t , x φ (cid:13)(cid:13)(cid:13) X , ∞ ( I n × R ) (cid:19) (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S ( I n × R ) ≤ δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S ( I n × R ) . We have now reduced to the expression X k P k (cid:18) X | k , − k | < C ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) . The last reduction here consists in removing extremely small angular separation between the inputs P k Q ≤ k − C φ, P k Q ≤ k − C φ. Thus, there exists C = C ( δ ) such that we have (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) P (cid:18) X | k , | < C ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) ′ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ≤ δ X | k , | < C (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) S k , where the prime indicates that the inputs are reduced to have closely aligned Fourier supports ofangular separation C − . Finally, we write X | k , | < C P (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) = X | k , | < C P (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) ′ + X | k , | < C P (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) ′′ , where the second term is of the form A nonlin as required for Step 1. In fact, the angular separationof the inputs and small modulation forces the output to have modulation ∼

1. Moreover, replacingthe output frequency by k and square-summing over k results in a small norm due to the fact that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X | k , | < C P (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) ′′ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X | k , | < C P (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ ) (cid:19) ′′ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x ( I n × R ) and we can then take advantage of Observation 2 to obtain (cid:18) X k (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X | k , − k | < C P k (cid:18) ∆ − ∇ r N ir (cid:0) P k Q ≤ k − C φ, P k Q ≤ k − C φ (cid:1)(cid:19) ′′ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N k ( I n × R ) (cid:19) ≤ δ by choosing the intervals I n suitably. The cubic term P k , , P k AP k φ P k φ is handled similarly. (cid:3) ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 133
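The divisibility property invoked repeatedly in the proof of Step 1 rests on an elementary fact about square-summable families of functions that are square-integrable in time. The following sketch records it for a generic family $(f_k)_k$; the family, the constant $M$ and the breakpoints $t_n$ are notation assumed for this illustration, and the time exponent $2$ is an assumption since the exponents were lost in extraction.

```latex
% Assumed setting: M^2 := \sum_k \| f_k \|_{L^2_t L^2_x}^2 < \infty.
\[
g(t) := \sum_k \| f_k(t) \|_{L^2_x}^2 , \qquad \int_{\mathbb{R}} g(t)\, dt = M^2 .
\]
% The map t \mapsto \int_{-\infty}^{t} g is continuous and nondecreasing, so one can pick
% breakpoints -\infty = t_0 < t_1 < \dots < t_N = +\infty with
\[
\int_{t_{n-1}}^{t_n} g(t)\, dt \le \delta^2 \quad (n = 1, \dots, N),
\qquad N \le \lceil M^2 / \delta^2 \rceil .
\]
% On each interval I_n with endpoints t_{n-1}, t_n this gives
\[
\Big( \sum_k \| \chi_{I_n} f_k \|_{L^2_t L^2_x}^2 \Big)^{1/2}
  = \Big( \int_{I_n} g(t)\, dt \Big)^{1/2} \le \delta .
\]
```

In particular the number of intervals depends only on $\delta$ and on the size $M$ of the global norm, which matches the dependence of $N$ stated in the text.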

Step 2:

Choosing the time intervals I n suitably as in Step 1, we obtain the equation X k ∈ Z (cid:3) A free < k P k φ = F , where we have k F k N ( I n × R ) ≤ δ + δ (cid:0) k ( A , φ ) k S ( I n × R ) + (cid:13)(cid:13)(cid:13) ( A , φ ) (cid:13)(cid:13)(cid:13) S ( I n × R ) (cid:1) , where A f ree is the free wave evolution of the data for A at the beginning endpoint of I n .Proof of Step 2. This to a large extent mimics the argument for the proof of Proposition 7.5. In fact,we recall from there that we can write F = P k ∈ Z F k with F k = − iP k (cid:0) A f ree ≥ k ,ν ∂ ν φ (cid:1) − [ P k , (cid:3) A free < k ] φ − P k (cid:0) ( (cid:3) A − (cid:3) A free ) φ (cid:1) + P k (cid:0) (cid:3) A free < k φ + i (cid:0) A f ree ≥ k ,ν ∂ ν φ (cid:1) − (cid:3) A free φ (cid:1) . As usual, we treat each term separately.

First term . Similar to the proof of Proposition 7.5, we reduce it to − iP k (cid:0) P k + O (1) A f ree ,ν ∂ ν P k + O (1) φ (cid:1) up to terms satisfying the conclusion of Step 2. Then using divisibility for the norm X k ∈ Z − k (cid:13)(cid:13)(cid:13) P k A f ree ,ν (cid:13)(cid:13)(cid:13) L t L ∞ x as well as the inequality (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X k ∈ Z − iP k (cid:16) P k + O (1) A f ree ,ν ∂ ν P k + O (1) φ (cid:17)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:18) X k ∈ Z − k (cid:13)(cid:13)(cid:13) χ I n P k A f ree ,ν (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:19) k φ k S ( I n × R ) , we get the conclusion of Step 2 by choosing the intervals I n suitably and by subdividing the intervalsobtained from Step 1, if necessary. Second term . This is handled like the ﬁrst term, since it can be written in the form X k ˜ P k (cid:0) i ∇ x A f ree < k ,ν ∂ ν φ (cid:1) + ˜ P k (cid:0) ∇ x (( A f ree < k ) ) φ (cid:1) . Third term . As usual this term is the most di ﬃ cult one, since it contains X k ∈ Z iP < k A nonlin ν ∂ ν P k φ. We essentially follow the reductions performed in the proof of Proposition 7.5, whence we shall becorrespondingly brief.

Reduction to H ∗ N lowhi . Using the same notation as in that proof and restricting to frequency k = (cid:13)(cid:13)(cid:13) N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1) − H ∗ N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:13)(cid:13)(cid:13) A nonlin , ( I n ) (cid:13)(cid:13)(cid:13) S ( I n × R ) (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I n × R ) . Hence, replacing the output frequency by general k ∈ Z and square-summing gives the bound . k ( A , φ ) k S ( I n × R ) (cid:16) δ + δ (cid:0) k ( A , φ ) k S ( I n × R ) + k ( A , φ ) k S ( I n × R ) (cid:1)(cid:17) ,

34 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION which is of the desired form. This then reduces the estimate of N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1) − H ∗ N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1) to the contribution of P < A nonlin , ( I n ) , whose explicit form we recall from Step 1. This means wehave to estimate the expression (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) ∇ t , x P φ (cid:19) . The idea here is to use the a priori bounds from Observation 1 and Observation 2 to arrive at therequired estimate. For this, we split the above expression into the following (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) ∇ t , x P φ (cid:19) = (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ C , C ] ∇ t , x P φ (cid:19) + (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ k − C , C ] ∇ t , x P φ (cid:19) + (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q ≤ k − C ∇ t , x P φ (cid:19) + (cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q > C ∇ t , x P φ (cid:19) ≡ I + II + III + IV . We now estimate each of the terms on the right in turn.

Estimate of term I . We distinguish between very small k and k = O (1). In the latter case, weschematically estimate the term in the following fashion. We shall suppress the distinction betweenspace-time translates of φ and φ , as our norms are invariant under these, and also keep in mind thatthe operator (cid:3) − P k Q k + O (1) is given by (space-time) convolution with a kernel of L -mass ∼ − k . Then we get in case k = O (1), (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ C , C ] ∇ t , x P φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) ≤ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ C , C ] ∇ t , x P φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) L t L x ( I n × R ) . − k (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q [ C , C ] ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x . Here the second factor is essentially invariant under mild Lorentz transformations, and so we get(up to changing the meaning of the constants slightly) (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q [ C , C ] ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ L Q [ C , C ] ∇ t , x P ≤ O (1) φ L (cid:13)(cid:13)(cid:13) L t L x . We estimate the last norm using Observation 1, resulting in the bound (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q [ C , C ] ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x . 
(cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ L (cid:13)(cid:13)(cid:13) L ∞ t L − x (cid:13)(cid:13)(cid:13) Q [ C , C ] ∇ t , x P ≤ O (1) φ L (cid:13)(cid:13)(cid:13) L t L + x ≤ C (cid:16)(cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S (cid:17) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 135

Then by divisibility of the norm (cid:18) X l ∈ Z − l (cid:16) X k − l = O (1) (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q [ l − C , l + C ] ∇ t , x P l φ (cid:13)(cid:13)(cid:13) L t L x (cid:17) (cid:19) , we arrive upon suitable choice of the intervals I n at the conclusion that (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X l ∈ Z (cid:0) − H ∗ (cid:1)(cid:18) X | k − l | = O (1) (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ · ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ l + C , l + C ] ∇ t , x P l φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) ≤ δ k φ k S ( I n × R ) . This completes the contribution of the term I when k = O (1). On the other hand, when k ≪ −

1, thesmallness gain comes directly from k . Indeed, we can then estimate (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:0) − H ∗ (cid:1)(cid:18) X k ≪− (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ C , C ] ∇ t , x P φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . X k ≪− − k (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( I n × R ) (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L t L ∞ x ( I n × R ) (cid:13)(cid:13)(cid:13) Q [ C , C ] ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x ( I n × R ) . X k ≪− k (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I n × R ) k φ k S ( I n × R ) ≤ δ (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S ( I n × R ) . Replacing P φ by P l φ , l ∈ Z , and square-summing over l results in the desired bound. This com-pletes the estimate for term I . Estimate of term II . Here we use the bound (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:0) − H ∗ (cid:1)(cid:18) X k < (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q [ k − C , C ] ∇ t , x P φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . X k < l ∈ [ k − C , C ] − k (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L t L ∞ x ( I n × R ) (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x ( I n × R ) (cid:13)(cid:13)(cid:13) Q l ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x ( I n × R ) . Now if we further restrict the above term to | k − l | ≫

1, we easily bound it by ≤ δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S ( I n × R ) (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) X , ∞ ( I n × R ) ≤ δ k φ k S ( I n × R ) , which is as desired. On the other hand, when restricting the modulation of Q [ k − C , C ] ∇ t , x P φ to l = k + O (1), we use the fact that for k ≪ − (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q k + O (1) ∇ t , x P φ (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ L Q k + O (1) ∇ t , x P φ L (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ k − C ∇ x φ L (cid:13)(cid:13)(cid:13) L ∞ t L ∞ x (cid:13)(cid:13)(cid:13) Q k + O (1) ∇ t , x P φ L (cid:13)(cid:13)(cid:13) L t L x . Then changing the frequency 0 to general m ∈ Z and using Observation 1, we infer X k < m − C − k (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q k + O (1) ∇ t , x P m φ (cid:13)(cid:13)(cid:13) L t L x ≤ C (cid:16)(cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S (cid:17) .


Also, the square-sum norm on the left has the divisibility property, whence by restricting to suitabletime intervals I n , we may arrange it to be ≪ δ . Finally, we infer the bound X m (cid:13)(cid:13)(cid:13)(cid:13)(cid:13)(cid:0) − H ∗ (cid:1)(cid:18) X k < m − C (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) Q k + O (1) ∇ t , x P m φ (cid:19)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:18) X k − k (cid:13)(cid:13)(cid:13) P k + O (1) Q ≤ k − C φ (cid:13)(cid:13)(cid:13) L t L ∞ x (cid:19) (cid:18) X k < m − C − k (cid:13)(cid:13)(cid:13) ∇ x P k + O (1) Q ≤ k − C φ Q k + O (1) ∇ t , x P m φ (cid:13)(cid:13)(cid:13) L t L x (cid:19) ≤ δ k φ k S ( I n × R ) . Estimate of term III . This follows the same pattern as for term I , by placing the product ∇ x P k + O (1) Q ≤ k − C φ Q ≤ k − C ∇ t , x P φ into L t L x and using Observation 2. Estimate of term IV . Here one places (cid:3) − P k Q k + O (1) (cid:0) P k + O (1) Q ≤ k − C φ ∇ x P k + O (1) Q ≤ k − C φ (cid:1) into L t L ∞ x and Q > C ∇ t , x P φ into L t L x , keeping in mind that C ≫ C = C ( E crit ) is very large. Reduction to H ∗ N lowhi (cid:0) H P < A nonlin , ( I n ) , P φ (cid:1) . To begin with, recall the notation from the proof ofProposition 7.5 for the deﬁnition of the symbol H applied to bilinear expressions. To reduce to thisterm, we need to estimate the di ﬀ erence (cid:13)(cid:13)(cid:13) H ∗ N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1) − H ∗ N lowhi (cid:0) H P < A nonlin , ( I n ) , P φ (cid:1)(cid:13)(cid:13)(cid:13) N ( I n × R ) . Here we recall that H k M ( φ, ψ ) = X j ≤ k + C Q j M (cid:0) Q ≤ j − C φ, Q ≤ j − C ψ (cid:1) as well as H M (cid:0) φ, ψ (cid:1) = X k ≤ k , − C , k ≤ min { k , k }− C H k M (cid:0) P k φ, P k ψ (cid:1) . 
Then write for the spatial components of (cid:0) I − H (cid:1) P < A nonlin , ( I n ) , (cid:0) I − H (cid:1) P < A nonlin , ( I n ) = X k < k > max { k , k }− C (cid:3) − P k P x (cid:0) P k φ ∇ x P k φ (cid:1) + X k ≤ k , − Cj > k + C (cid:3) − P k Q j P x (cid:0) P k φ ∇ x P k φ (cid:1) + X k ≤ k , − Cj ≤ k + C (cid:3) − P k Q j P x (cid:0) P k Q > j − C φ ∇ x P k φ (cid:1) + X k ≤ k , − Cj ≤ k + C (cid:3) − P k Q j P x (cid:0) P k Q ≤ j − C φ ∇ x P k Q > j − C φ (cid:1) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 137

For the ﬁrst term on the right, employing notation introduced in [22] and also used in the proof ofProposition 7.5, we get upon further restricting to (cid:12)(cid:12)(cid:12) max { k , } − min { k , } (cid:12)(cid:12)(cid:12) ≫ , the smallness gain (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X k < , k ≥ max { k , k }− C (cid:3) − P k P x (cid:0) P k φ ∇ x P k φ (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) Z ≪ δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S and the corresponding contribution to H ∗ N lowhi (cid:0) ( I − H ) P < A nonlin , ( I n ) , P φ (cid:1) can then be boundedwith respect to (cid:13)(cid:13)(cid:13) · (cid:13)(cid:13)(cid:13) N ( I n × R ) by ≤ δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S (cid:13)(cid:13)(cid:13) P φ (cid:13)(cid:13)(cid:13) S , which upon replacing 0 by general frequencies and square summing gives the desired bound. Sim-ilarly, for the remaining terms on the right above, one may reduce to k , = k + O (1), see estimate(134) in [22]. Finally, in each of these terms, we may reduce the output to modulation ∼ k , sinceelse one gains smallness due to the null form structure for (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) H ∗ N lowhi (cid:0) P < A nonlin , ( I n ) , P φ (cid:1) − H ∗ N lowhi (cid:0) H P < A nonlin , ( I n ) , P φ (cid:1)(cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . Thus we have now reduced to estimating (and gaining a smallness factor) for the schematic expres-sion X k < , k , = k + O (1) (cid:3) − P k Q k + O (1) P x (cid:0) P k φ ∇ x P k φ (cid:1) ∂ ν Q ≤ k − C P φ. Here we can suppress the operator (cid:3) − P k Q k + O (1) , which is given by convolution with a space-timekernel of L -norm ∼ − k , and then schematically estimate the preceding via (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X k < k , = k + O (1) (cid:3) − P k Q k + O (1) P x (cid:0) P k φ ∇ x P k φ (cid:1) ∂ ν Q ≤ k − C P φ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . 
X k = k + O (1) < O (1) − k (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) L t L ∞ x ( I n × R ) (cid:13)(cid:13)(cid:13) ∇ x P k φ Q ≤ k − C P φ (cid:13)(cid:13)(cid:13) L t L x ( I n × R ) . Here we exploit Lorentz invariance of the norm of the right factor to obtain X k < − k (cid:13)(cid:13)(cid:13) ∇ x P k φ Q ≤ k − C P φ (cid:13)(cid:13)(cid:13) L t L x . X k < − k (cid:13)(cid:13)(cid:13) ∇ t , x ( P k φ ) L Q ≤ k − C ( P φ ) L (cid:13)(cid:13)(cid:13) L t L x . (cid:13)(cid:13)(cid:13) ( ˜ A L , ˜ φ L ) (cid:13)(cid:13)(cid:13) S . In fact, distinguishing as usual between di ﬀ erent frequency / modulation conﬁgurations for either ofthe factors, one estimates the L t L x -norm of the input by placing the ﬁrst input into L t L ∞ x and thesecond into L ∞ t L x , both of which are controlled by Observation 1. Using divisibility of the L t L x norm, it now follows that upon proper choice of the intervals I n , whose number of course only

38 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION depends on (cid:13)(cid:13)(cid:13)(cid:0) ˜ A L , ˜ φ L (cid:1)(cid:13)(cid:13)(cid:13) S , we get the estimate (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) X k < k , = k + O (1) (cid:3) − P k Q k + O (1) P x (cid:0) P k φ ∇ x P k φ (cid:1) ∂ ν Q ≤ k − C P φ (cid:13)(cid:13)(cid:13)(cid:13)(cid:13) N ( I n × R ) . (cid:18) X k < − k (cid:13)(cid:13)(cid:13) P k φ (cid:13)(cid:13)(cid:13) L t L ∞ x ( I n × R ) (cid:19) (cid:18) X k < − k (cid:13)(cid:13)(cid:13) ∇ x P k φ Q ≤ k − C P φ (cid:13)(cid:13)(cid:13) L t L x ( I n × R ) (cid:19) ≪ δ (cid:13)(cid:13)(cid:13) φ (cid:13)(cid:13)(cid:13) S ( I n × R ) . Of course, one gets the same bound upon replacing the frequency 0 by general l ∈ Z and squaresumming.As usual, similar reductions can be applied to the elliptic interaction term P < A ∂ t P φ . Dealing with H ∗ N lowhi (cid:0) H P < A nonlin , ( I n ) , P φ (cid:1) . Here we exploit the null structure arising fromcombining the elliptic as well as hyperbolic terms, just as in the proof of Proposition 7.5, or asin [22]. Correspondingly, we have to analyze three null forms, each in turn. The ﬁrst null form . We can write it as X j ≤ k < , k , > k + O (1) (cid:3) − P k Q j (cid:0) Q ≤ j − C P k φ∂ α Q ≤ j − C P k φ (cid:1) ∂ α Q ≤ j − C P φ. From (148) in [22], it follows that we may restrict to j = k + O (1), as otherwise the desired smallnessjust follows from the o ﬀ -diagonal decay of the estimate (even without restriction to smaller timeintervals). Furthermore, if k , k <

0, then we gain exponentially in the di ﬀ erence k − k , while if k , k ≥

0, we gain exponentially in k . So we may further restrict to X k < , k , = k + O (1) (cid:3) − P k Q k + O (1) (cid:0) Q ≤ k − C P k φ∂ α Q ≤ k − C P k φ (cid:1) ∂ α Q ≤ k − C P φ and from here the argument proceeds exactly as before by suppressing the operator (cid:3) − P k Q k + O (1) and schematically estimating (cid:13)(cid:13)(cid:13) ∂ α Q ≤ k − C P k φ∂ α Q ≤ k − C P φ (cid:13)(cid:13)(cid:13) L t L x using Observation 2, while placing Q ≤ k − C P k φ into L t L ∞ x . The second and third null forms . These are treated identically and hence omitted here. (cid:3)
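The step of "suppressing" the operator $\Box^{-1} P_k Q_{k+O(1)}$, used several times in the proof of Step 2, amounts to an application of Young's inequality. The sketch below assumes that the kernel bound lost in extraction is $2^{-2k}$ in $L^1_{t,x}$; the symbol $K_k$ for the kernel and the generic function $G$ are notation introduced here, and the symbol computation is only a heuristic for the size of the kernel.

```latex
% Assumed: K_k is the space-time convolution kernel of \Box^{-1} P_k Q_{k+O(1)}.
% Heuristic: on the support of the multiplier, |\xi| \sim 2^k and | |\tau| - |\xi| | \sim 2^k, so
\[
\big| \tau^2 - |\xi|^2 \big|
  = \big| |\tau| - |\xi| \big| \, \big( |\tau| + |\xi| \big) \sim 2^{2k},
\qquad \| K_k \|_{L^1_{t,x}} \lesssim 2^{-2k} .
\]
% Young's inequality for mixed norms (Minkowski's integral inequality together with
% translation invariance of the norm) gives, for 1 \le p, q \le \infty,
\[
\| \Box^{-1} P_k Q_{k+O(1)} G \|_{L^p_t L^q_x}
  = \| K_k * G \|_{L^p_t L^q_x}
  \le \| K_k \|_{L^1_{t,x}} \, \| G \|_{L^p_t L^q_x}
  \lesssim 2^{-2k} \, \| G \|_{L^p_t L^q_x} .
\]
```

This is why the operator can be replaced by the scalar factor $2^{-2k}$ in the schematic estimates, at the cost of not distinguishing a function from its space-time translates.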

This completes Step 2. Together with Step 1, the linear theory for the operator $\sum_k \Box_{A^{free}_{<k}} P_k$ and a standard bootstrap argument, this yields the bounds claimed in Proposition 8.7 for the localized norms $\|(A, \phi)\|_{S^1(I_n \times \mathbb{R}^4)}$. From there one can glue the localized components together to get the global bounds. □

Rigidity I: Infinite time interval and reduction to the self-similar case for finite time intervals.

As in [10, Theorem 5.1], our goal is now to establish the following rigidity result.

Proposition 8.9.

Let $(A^\infty, \Phi^\infty)$ be as before with lifespan $I = (-T_0, T_1)$. Then it is not possible to have $T_0 < \infty$ or $T_1 < \infty$. Moreover, if $\lambda(t) \ge \lambda_0 > 0$ for all $t \in \mathbb{R}$, then we necessarily have $(A^\infty, \Phi^\infty) = (0, 0)$, whence there cannot be a minimal energy blowup solution under the given assumptions.

We begin the proof of Proposition 8.9 in the case when $T_1 = \infty$ and $\lambda(t) \ge \lambda_0 > 0$ on $[0, \infty)$. To this end we follow the method of proof in [20], which in turn follows the strategy in [10], but also adds a crucial Vitali type covering argument that is inspired by the covering argument in [42]. Using the assumption $\lambda(t) \ge \lambda_0 > 0$ on $[0, \infty)$ and the compactness property expressed in Theorem 7.23, we obtain that for any $\varepsilon > 0$, there exists $R_0(\varepsilon) > 0$ such that for all $t \ge 0$,

(8.45) $\int_{\{ |x + x(t)/\lambda(t)| \ge R_0(\varepsilon) \}} \Big( \sum_\alpha |\nabla_{t,x} A^\infty_\alpha(t,x)|^2 + |\nabla_{t,x} \Phi^\infty(t,x)|^2 + \frac{|\Phi^\infty(t,x)|^2}{|x|^2} \Big) \, dx \le \varepsilon.$

Then we have in perfect analogy with [20, Lemma 10.9] and [10, Lemma 5.4] the following

Lemma 8.10.

There exist $\varepsilon_1 > 0$ and $C > 0$ such that if $\varepsilon \in (0, \varepsilon_1)$, there exists $R_0(\varepsilon)$ so that if $R > 2 R_0(\varepsilon)$, then there exists $t_0 = t_0(R, \varepsilon)$ with $0 \le t_0 \le CR$ and the property that for all $0 < t < t_0$ we have

$\Big| \frac{x(t)}{\lambda(t)} \Big| < R - R_0(\varepsilon), \qquad \Big| \frac{x(t_0)}{\lambda(t_0)} \Big| = R - R_0(\varepsilon).$

Proof.

We adjust the proof of [20, Lemma 10.9] by using the weighted momentum monotonicityidentity (8.6) from Proposition 8.2. To begin with, we show that there exists α > Z I Z R (cid:18)X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx dt ≥ α > I ⊂ [0 , ∞ ) of unit length. We argue by contradiction. Suppose not, then there existsa sequence of intervals J n = [ t n , t n +

1] with t n → ∞ such that(8.47) Z J n Z R (cid:18)X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx dt ≤ n . For a sequence of times { s n } n with s n ∈ J n , the set ((cid:18) λ ( s n ) ∇ t , x A ∞ x (cid:18) s n , · − x ( s n ) λ ( s n ) (cid:19) , λ ( s n ) ∇ t , x Φ ∞ (cid:18) s n , · − x ( s n ) λ ( s n ) (cid:19)(cid:19)) n is pre-compact in (cid:0) L x ( R ) (cid:1) by Theorem 7.23. Then by Corollary 6.3, there exists a non-emptyopen interval I ∗ around t = λ ( s n ) (cid:18) ∇ t , x A ∞ x , ∇ t , x Φ ∞ (cid:19)(cid:18) s n + t λ ( s n ) − , · − x ( s n ) λ ( s n ) (cid:19) converges to a limiting function (cid:0) ∇ t , x A ∗ x , ∇ t , x Φ ∗ (cid:1) ∈ C (cid:0) I ∗ , ( L x ( R )) (cid:1) as n → ∞ in the given topology. ( A ∗ , Φ ∗ ) is a weak solution to MKG-CG on I ∗ × R in the L t ˙ H x -sense and satisﬁes the Coulomb condition.We now distinguish two cases: Either there exists a sequence of times { s n } n with s n ∈ J n suchthat { λ ( s n ) } n remains bounded or { λ ( s n ) } n does not remain bounded for any sequence of times { s n } n with s n ∈ J n . In the ﬁrst case, noting that λ ( t ) ≥ λ >

0, we may replace I ∗ by a smaller non-emptytime interval I † and assume that s n + λ ( s n ) − I † ⊂ J n for all n ≥

1. From (8.47) we infer that Z I † Z R (cid:12)(cid:12)(cid:12)(cid:0) ∂ t Φ ∗ + i A ∗ Φ ∗ (cid:1) ( t , x ) (cid:12)(cid:12)(cid:12) dx dt = ,

40 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION whence ∂ t Φ ∗ + i A ∗ Φ ∗ ≡ I † × R . But then we have in the weak sense that X k = (cid:0) ∂ k + i A ∗ k (cid:1) Φ ∗ ≡ I † × R . This implies Φ ∗ | I † × R ≡ ∂ t Φ ∗ | I † × R ≡

0. We conclude that ( A ∗ , Φ ∗ ) must bea “trivial” solution in that the spatial components of A ∗ are ﬁnite energy free waves, while thetemporal component vanishes, and we have Φ ∗ ≡

0. But this solution has ﬁnite S -bounds, whichis a contradiction upon applying Proposition 7.20.Next, we consider the case that { λ ( s n ) } n does not remain bounded for any sequence of times { s n } n with s n ∈ J n . Then we essentially replicate the preceding argument, but need to also add a Vitalitype covering trick. We write for each n ≥ J n = [ s ∈ J n (cid:2) s − λ ( s ) − , s + λ ( s ) − (cid:3) ∩ J n . Applying Vitali’s covering lemma, we may pick a disjoint subcollection of intervals { I s } s ∈ A n with I s : = [ s − λ ( s ) − , s + λ ( s ) − ] ∩ J n for some subset A n ⊂ J n such that X s ∈ A n | I s | ≥ . It follows that we may pick a sequence of times { s n } n with s n ∈ J n such that we have Z I sn Z R (cid:18)X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) (cid:19) dx dt = o ( λ ( s n ) − ) . In particular, we obtain Z − Z R χ J n (cid:18)X k (cid:0) F ∞ k (cid:1) + (cid:12)(cid:12)(cid:12) D Φ ∞ (cid:12)(cid:12)(cid:12) (cid:19)!(cid:0) s n + t λ ( s n ) − , x (cid:1) dx dt = o (1) . But then, passing to a subsequence, we may again extract a limiting function from1 λ ( s n ) (cid:18) ∇ t , x A ∞ x , ∇ t , x Φ ∞ (cid:19)(cid:18) s n + t λ ( s n ) − , · − x ( s n ) λ ( s n ) (cid:19) , which yields a time independent solution and leads to a contradiction as before.This shows that (8.46) is indeed valid for suitable α >

0. We note that λ ( t ) ≥ λ > x (0) =

0. If the assertion of the lemma was false, then we would have (cid:12)(cid:12)(cid:12)(cid:12) x ( t ) λ ( t ) (cid:12)(cid:12)(cid:12)(cid:12) < R − R ( ε )for all 0 ≤ t < CR with C > ﬃ ciently large later on. We now use the weightedmomentum monotonicity identity (8.6) to obtain a contradiction. In view of (8.45), we concludethat the corresponding remainder term (8.7) satisﬁes r ( R ) . ε. Now choose ε > I of unit length, Z I Z R X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) dx + O ( r ( R )) ! dt ≥ α , ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 141 provided CR is su ﬃ ciently large. In particular, this implies Z CR Z R X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) dx + O ( r ( R )) ! dt ≥ α CR − . On the other hand, integrating (8.6) in time from 0 to CR , we ﬁnd Z CR Z R X k F ∞ k ( t , x ) + (cid:12)(cid:12)(cid:12) D Φ ∞ ( t , x ) (cid:12)(cid:12)(cid:12) dx + O ( r ( R )) ! dt . RE crit with a universal implied constant. The two preceding bounds contradict each other for C large. (cid:3) To ﬁnish o ﬀ the proof of Proposition 8.9 in the case when T = ∞ and λ ( t ) ≥ λ > , ∞ ),we now use Proposition 8.8 to conclude a contradiction to the preceding Lemma 8.10. Lemma 8.11.

There exist ε > , R ( ε ) > , C > such that if R > R ( ε ) , t = t ( R , ε ) are as inLemma 8.10, then for < ε < ε , t ( R , ε ) ≥ C R ε . Proof.

The proof proceeds exactly as in [10, Lemma 5.5] using the weighted energy identity (8.5) and that the minimal blowup solution $(A^\infty, \Phi^\infty)$ has vanishing momentum by Proposition 8.8. □

It remains to prove Proposition 8.9 when $T < \infty$. As in [10] and [20], we first reduce this case to a self-similar blowup scenario. After rescaling we may assume that $T = 1$. We recall from Lemma 8.3 that there exists a constant $C_0(K) > 0$ such that
\[
0 < \frac{C_0(K)}{1-t} \leq \lambda(t)
\]
for all $0 \leq t < 1$. Moreover, we know from Lemma 8.4 that after spatial translation
\[
\operatorname{supp}\big( F^\infty_{\alpha\beta}(t,\cdot), \Phi^\infty(t,\cdot) \big) \subset B(0, 1-t) \tag{8.48}
\]
for all $0 \leq t < 1$ and $\alpha, \beta \in \{0, 1, \ldots, 4\}$. Next, we prove an upper bound on $\lambda(t)$.

Lemma 8.12. Let $(A^\infty, \Phi^\infty)$ be as above with $T = 1$. Then there exists $C_1(K) > 0$ such that
\[
\lambda(t) \leq \frac{C_1(K)}{1-t}
\]
for all $0 \leq t < 1$.

Proof. We follow the argument in Lemma 10.11 in [20]. Suppose the claim was false. Then consider for $0 \leq t < 1$ the quantity
\[
z(t) = \int_{\mathbb{R}^4} \sum_k x_k \Big( \sum_j F^\infty_{0j} F^\infty_{kj} + \operatorname{Re}\big( \overline{D_0\Phi^\infty}\, D_k\Phi^\infty \big) \Big) dx + \int_{\mathbb{R}^4} \operatorname{Re}\big( \overline{\Phi^\infty}\, D_0\Phi^\infty \big)\, dx .
\]
From the weighted momentum monotonicity identity (8.6) in Proposition 8.2 we obtain that
\[
z'(t) = - \int_{\mathbb{R}^4} \Big( \sum_k \big( F^\infty_{0k} \big)^2 + \big| D_0\Phi^\infty \big|^2 \Big) dx .
\]
Since we have by (8.48) and Hardy's inequality that $z(t) \to 0$ as $t \to 1$, we can write
\[
z(t) = \int_t^1 \int_{\mathbb{R}^4} \Big( \sum_k F^\infty_{0k}(s,x)^2 + \big| D_0\Phi^\infty(s,x) \big|^2 \Big) dx\, ds .
\]


Now we distinguish between two possibilities: Either there exists $\alpha > 0$ such that for all $0 \leq t < 1$,
\[
\int_t^1 \int_{\mathbb{R}^4} \Big( \sum_k F^\infty_{0k}(s,x)^2 + \big| D_0\Phi^\infty(s,x) \big|^2 \Big) dx\, ds \geq \alpha (1-t),
\]
or else there exists a sequence $\{t_n\}_n \subset [0,1)$ with $t_n \to 1$ such that for the intervals $J_n := [t_n, 1)$ we have
\[
\frac{1}{|J_n|} \int_{J_n} \int_{\mathbb{R}^4} \Big( \sum_k F^\infty_{0k}(s,x)^2 + \big| D_0\Phi^\infty(s,x) \big|^2 \Big) dx\, ds \to 0 .
\]
In the former case, we argue exactly as in [10, Lemma 5.6] to obtain the conclusion of the lemma. In particular, here we invoke Proposition 8.5. In the latter case, a contradiction ensues as follows. Using the same Vitali covering argument as in the proof of Lemma 8.10, we can select a sequence of intervals $J_n' = [s_n - \lambda(s_n)^{-1}, s_n + \lambda(s_n)^{-1}]$ with $s_n \in J_n$ such that
\[
\frac{1}{|J_n'|} \int_{J_n'} \int_{\mathbb{R}^4} \Big( \sum_k F^\infty_{0k}(s,x)^2 + \big| D_0\Phi^\infty(s,x) \big|^2 \Big) dx\, ds \to 0 .
\]
But then, using compactness, we again extract a trivial limiting solution, and obtain a contradiction as in the proof of Lemma 8.10. □

We are now in a position to reduce to the exactly self-similar case.

Corollary 8.13.

Let ( A ∞ , Φ ∞ ) be as above with T = . Then the set (cid:26)(cid:16) (1 − t ) (cid:0) ∇ t , x A ∞ x (cid:1) ( t , (1 − t ) x ) , (1 − t ) (cid:0) ∇ t , x Φ ∞ (cid:1) ( t , (1 − t ) x ) (cid:17) : 0 ≤ t < (cid:27) is pre-compact in (cid:0) L x ( R ) (cid:1) .Proof. Here we can proceed similarly to the proof of Proposition 5.7 in [10]. Our point of departureis Theorem 7.23. From Lemma 8.3 and Lemma 8.12 we know that C ( K ) ≤ (1 − t ) λ ( t ) ≤ C ( K )for all 0 ≤ t <

1. Using the sharp support properties (8.48) and that E crit >

0, we also conclude that | x ( t ) | ≤ C for all 0 ≤ t < C >

0. Then the claim follows from the compactnessassertion in Theorem 7.23. (cid:3)

Rigidity II: The self-similar case.

In this subsection we rule out the existence of a minimal blowup solution $(A^\infty, \Phi^\infty)$ as in Corollary 8.13. To this end we use self-similar variables and derive a suitable Lyapunov functional. For ease of notation we drop the superscript $\infty$ and denote the minimal blowup solution from Corollary 8.13 just by $(A, \Phi)$. Following [10] and [20], we introduce the self-similar variables
\[
y = \frac{x}{1-t}, \qquad s = -\log(1-t), \qquad 0 \leq t < 1,
\]
and set
\[
\widetilde{\Phi}(s,y,0) = e^{-s}\, \Phi(1 - e^{-s}, e^{-s}y), \qquad \widetilde{A}_\alpha(s,y,0) = e^{-s} A_\alpha(1 - e^{-s}, e^{-s}y), \quad 0 \leq \alpha \leq 4 .
\]
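As a quick consistency check on these definitions, the chain rule relates $\partial_s$ in self-similar variables to $\partial_t$ in the original variables. The following sketch assumes only the definition of $\widetilde{\Phi}(s,y,0)$ above, with the shorthand $t = 1 - e^{-s}$ and $x = e^{-s}y$; it indicates why first derivatives of $\Phi$ naturally carry the factor $e^{-2s}$ in self-similar variables.

```latex
\begin{align*}
\partial_s \widetilde{\Phi}(s,y,0)
  &= -e^{-s}\Phi(t,x) + e^{-2s}(\partial_t\Phi)(t,x)
     - e^{-2s}\, y\cdot(\nabla_x\Phi)(t,x), \\
y\cdot\nabla_y \widetilde{\Phi}(s,y,0)
  &= e^{-2s}\, y\cdot(\nabla_x\Phi)(t,x), \\
\bigl(\partial_s + 1 + y\cdot\nabla_y\bigr)\widetilde{\Phi}(s,y,0)
  &= e^{-2s}(\partial_t\Phi)(t,x).
\end{align*}
```

The same computation applied to $\widetilde{A}_\alpha$ gives the analogous relation for the connection components.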

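We also define the associated covariant derivatives of $\widetilde{\Phi}$ and the curvature 2-form associated with $\widetilde{A}$ in self-similar variables by
\[
\begin{aligned}
\widetilde{D_\alpha\Phi}(s,y,0) &= e^{-2s} D_\alpha\Phi(1 - e^{-s}, e^{-s}y), \quad 0 \leq \alpha \leq 4, \\
\widetilde{F_{\alpha\beta}}(s,y,0) &= e^{-2s} F_{\alpha\beta}(1 - e^{-s}, e^{-s}y), \quad 0 \leq \alpha, \beta \leq 4 .
\end{aligned} \tag{8.49}
\]
Observe that $(\widetilde{A}, \widetilde{\Phi})(s,y,0)$ are defined for $0 \leq s < \infty$. Moreover, in view of (8.48), $\widetilde{\Phi}(s,\cdot,0)$ and the curvature components $\widetilde{F_{\alpha\beta}}(s,\cdot,0)$ have support in $\{ y \in \mathbb{R}^4 : |y| \leq 1 \}$. For small $\delta > 0$, we also define
\[
y = \frac{x}{1+\delta-t}, \qquad s = -\log(1+\delta-t), \qquad 0 \leq t < 1,
\]
and set
\[
\widetilde{\Phi}(s,y,\delta) = e^{-s}\, \Phi(1+\delta-e^{-s}, e^{-s}y), \qquad \widetilde{A}_\alpha(s,y,\delta) = e^{-s} A_\alpha(1+\delta-e^{-s}, e^{-s}y), \quad 0 \leq \alpha \leq 4 .
\]
Analogously to (8.49), we introduce $\widetilde{D_\alpha\Phi}(s,y,\delta)$ for $0 \leq \alpha \leq 4$ and $\widetilde{F_{\alpha\beta}}(s,y,\delta)$ for $0 \leq \alpha, \beta \leq 4$. Then $(\widetilde{A}, \widetilde{\Phi})(s,y,\delta)$ is defined for $-\log(1+\delta) \leq s < \log(\frac{1}{\delta})$.

In self-similar variables the Maxwell-Klein-Gordon system is given by
\[
\begin{aligned}
\partial^k \widetilde{F_{0k}} &= \operatorname{Im}\big( \widetilde{\Phi}\, \overline{\widetilde{D_0\Phi}} \big), \\
-\big( \partial_s + 2 + y\cdot\nabla_y \big) \widetilde{F_{j0}} + \partial^k \widetilde{F_{jk}} &= \operatorname{Im}\big( \widetilde{\Phi}\, \overline{\widetilde{D_j\Phi}} \big), \\
\big( \partial_s + i\widetilde{A}_0 + 2 + y\cdot\nabla_y \big) \widetilde{D_0\Phi} &= \big( \partial^k + i\widetilde{A}^k \big) \widetilde{D_k\Phi},
\end{aligned} \tag{8.50}
\]
where $\partial_k$ denotes partial differentiation with respect to the $y$ variable. We begin by stating the following properties of $(\widetilde{A}, \widetilde{\Phi})$.

Lemma 8.14. (i) For fixed $\delta > 0$, we have for all $0 \leq s < \log(\frac{1}{\delta})$ that
\[
\operatorname{supp}\big( \widetilde{\Phi}(s,\cdot,\delta) \big) \subset \{ y \in \mathbb{R}^4 : |y| \leq 1-\delta \}, \qquad \operatorname{supp}\big( \widetilde{F_{\alpha\beta}}(s,\cdot,\delta) \big) \subset \{ y \in \mathbb{R}^4 : |y| \leq 1-\delta \}. \tag{8.51}
\]
(ii) Uniformly for all $\delta > 0$ and all $0 \leq s < \log(\frac{1}{\delta})$, it holds that
\[
\int_{B_1} \sum_{\alpha=0}^{4} \big| \widetilde{D_\alpha\Phi}(s,y,\delta) \big|^2\, dy \lesssim E_{crit} \tag{8.52}
\]
and
\[
\int_{B_1} \frac{|\widetilde{\Phi}(s,y,\delta)|^2}{(1-|y|^2)^2}\, dy \lesssim E_{crit} . \tag{8.53}
\]

Proof. (i) For $0 \leq s < \log(\frac{1}{\delta})$ we infer from the support properties (8.48) that
\[
\operatorname{supp}\big( \widetilde{\Phi}(s,\cdot,\delta) \big) \subset \Big\{ y \in \mathbb{R}^4 : |y| \leq \frac{1-t}{1+\delta-t} = \frac{e^{-s}-\delta}{e^{-s}} \leq 1-\delta \Big\}
\]
and similarly for the support of $\widetilde{F_{\alpha\beta}}(s,\cdot,\delta)$.

(ii) The estimate (8.52) follows immediately from a change of variables. Noting that $\widetilde{\Phi}(s,\cdot,\delta) \in H^1_0(B_1)$ for all $\delta > 0$ and $0 \leq s < \log(\frac{1}{\delta})$, we then use the Hardy-type inequality (0.5) from [3] together with the diamagnetic inequality to conclude that
\[
\int_{B_1} \frac{|\widetilde{\Phi}|^2}{(1-|y|^2)^2}\, dy \leq \int_{B_1} \frac{|\widetilde{\Phi}|^2}{(1-|y|)^2}\, dy \lesssim \int_{B_1} \big| \nabla_y |\widetilde{\Phi}| \big|^2\, dy \lesssim \int_{B_1} \sum_k \big| \widetilde{D_k\Phi} \big|^2\, dy \lesssim E_{crit} .
\]
□

For small $\delta > 0$ and $(\widetilde{A}, \widetilde{\Phi})(s,y,\delta)$ with associated covariant derivatives $\widetilde{D_\alpha\Phi}(s,y,\delta)$ and curvature components $\widetilde{F_{\alpha\beta}}(s,y,\delta)$ as above, we now introduce a Lyapunov functional
\[
\begin{aligned}
\widetilde{E}(s) &= \int_{B_1} \Big( \frac{1}{2} \sum_j \widetilde{F_{j0}}^2 + \frac{1}{4} \sum_{j,k} \widetilde{F_{jk}}^2 - \sum_{j,k} y_k \widetilde{F_{j0}} \widetilde{F_{jk}} \Big) \frac{dy}{(1-|y|^2)^{1/2}} \\
&\quad + \int_{B_1} \Big( \frac{1}{2} \sum_\alpha \big| \widetilde{D_\alpha\Phi} \big|^2 - \sum_k y_k \operatorname{Re}\big( \widetilde{D_k\Phi}\, \overline{\widetilde{D_0\Phi}} \big) - \operatorname{Re}\big( \widetilde{\Phi}\, \overline{\widetilde{D_0\Phi}} \big) - \frac{1}{2} \frac{|\widetilde{\Phi}|^2}{1-|y|^2} \Big) \frac{dy}{(1-|y|^2)^{1/2}}
\end{aligned}
\]
and define the non-negative quantity
\[
\begin{aligned}
\widetilde{\Xi}(s) &= \int_{B_1} \sum_k \Big( \widetilde{F_{k0}} - \Big( \sum_j \frac{y_j}{|y|} \widetilde{F_{j0}} \Big) \frac{y_k}{|y|} + \sum_j y_j \widetilde{F_{jk}} \Big)^2 \frac{dy}{(1-|y|^2)^{3/2}} + \int_{B_1} \frac{1}{|y|^2} \Big( \sum_j y_j \widetilde{F_{j0}} \Big)^2 \frac{dy}{(1-|y|^2)^{1/2}} \\
&\quad + \int_{B_1} \Big| \widetilde{D_0\Phi} - \sum_k y_k \widetilde{D_k\Phi} - \widetilde{\Phi} \Big|^2 \frac{dy}{(1-|y|^2)^{3/2}} .
\end{aligned}
\]
We emphasize that both $\widetilde{E}$ and $\widetilde{\Xi}$ are gauge invariant quantities. They are well-defined for all $\delta > 0$. The next proposition concerns the functional $\widetilde{E}$.

Proposition 8.15.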

Let $(\widetilde{A}, \widetilde{\Phi})(s,y,\delta)$ for $\delta > 0$ be as above. Then we have for $0 \leq s_1 < s_2 < \log(\frac{1}{\delta})$ that
\[
\widetilde{E}(s_2) - \widetilde{E}(s_1) = \int_{s_1}^{s_2} \widetilde{\Xi}(s)\, ds . \tag{8.54}
\]
Moreover, it holds that
\[
\lim_{s \to \log(\frac{1}{\delta})} \widetilde{E}(s) \leq E_{crit} . \tag{8.55}
\]

The crucial monotonicity identity (8.54) can be derived in a gauge invariant manner. However, the computations simplify significantly by imposing the Cronström-type gauge condition
\[
\sum_{k=1}^{4} x_k A_k(t,x) = 0 \tag{8.56}
\]
for all $0 \leq t < 1$ and $x \in \mathbb{R}^4$. This does not change the energy regularity of $(A, \Phi)$. In self-similar variables the gauge condition (8.56) reads
\[
\sum_{k=1}^{4} y_k \widetilde{A}_k(s,y,\delta) = 0 \tag{8.57}
\]
for all $\delta > 0$, $0 \leq s < \log(\frac{1}{\delta})$ and $y \in \mathbb{R}^4$. Under the gauge condition (8.57) the functional $\widetilde{E}$ can be written as
\[
\begin{aligned}
\widetilde{E}(s) &= \int_{B_1} \Big( \frac{1}{2} \sum_j \big( \partial_s \widetilde{A}_j - \partial_j \widetilde{A}_0 \big)^2 + \frac{1}{4} \sum_{j,k} \big( \partial_j \widetilde{A}_k - \partial_k \widetilde{A}_j \big)^2 - \frac{1}{2} \sum_j \big( (1 + y\cdot\nabla_y) \widetilde{A}_j \big)^2 \Big) \frac{dy}{(1-|y|^2)^{1/2}} \\
&\quad + \int_{B_1} \Big( \frac{1}{2} \big| (\partial_s + i\widetilde{A}_0) \widetilde{\Phi} \big|^2 + \frac{1}{2} \sum_k \big| (\partial_k + i\widetilde{A}_k) \widetilde{\Phi} \big|^2 - \frac{1}{2} \big| (1 + y\cdot\nabla_y) \widetilde{\Phi} \big|^2 - \frac{1}{2} \frac{|\widetilde{\Phi}|^2}{1-|y|^2} \Big) \frac{dy}{(1-|y|^2)^{1/2}}
\end{aligned}
\]
and the quantity $\widetilde{\Xi}$ reads
\[
\begin{aligned}
\widetilde{\Xi}(s) &= \int_{B_1} \sum_k \Big( \partial_k \widetilde{A}_0 - \Big( \frac{y}{|y|}\cdot\nabla_y \widetilde{A}_0 \Big) \frac{y_k}{|y|} - \partial_s \widetilde{A}_k \Big)^2 \frac{dy}{(1-|y|^2)^{3/2}} + \int_{B_1} \frac{1}{|y|^2} \big( y\cdot\nabla_y \widetilde{A}_0 \big)^2 \frac{dy}{(1-|y|^2)^{1/2}} \\
&\quad + \int_{B_1} \big| (\partial_s + i\widetilde{A}_0) \widetilde{\Phi} \big|^2 \frac{dy}{(1-|y|^2)^{3/2}},
\end{aligned}
\]
where $\partial_k$ denotes partial differentiation with respect to the $y$ variable. It is not obvious that the above expressions for $\widetilde{E}$ and $\widetilde{\Xi}$ in the Cronström-type gauge (8.57) are even well-defined for all $\delta > 0$. However, this follows from the gauge invariant support properties of $\widetilde{\Phi}$ and $\widetilde{F_{\alpha\beta}}$, and the following easily verified identities (assuming the gauge condition (8.57))
\[
\begin{aligned}
\partial_s \widetilde{A}_j - \partial_j \widetilde{A}_0 &= \widetilde{F_{0j}} - \sum_k y_k \widetilde{F_{kj}}, \\
\partial_j \widetilde{A}_k - \partial_k \widetilde{A}_j &= \widetilde{F_{jk}}, \\
y\cdot\nabla_y \widetilde{A}_0 &= \sum_j y_j \widetilde{F_{j0}}, \\
\big( 1 + y\cdot\nabla_y \big) \widetilde{A}_j &= \sum_k y_k \widetilde{F_{kj}}, \\
\big( \partial_s + i\widetilde{A}_0 \big) \widetilde{\Phi} &= \widetilde{D_0\Phi} - \sum_k y_k \widetilde{D_k\Phi} - \widetilde{\Phi}, \\
\big( \partial_k + i\widetilde{A}_k \big) \widetilde{\Phi} &= \widetilde{D_k\Phi}, \\
y\cdot\nabla_y \widetilde{\Phi} &= \sum_k y_k \widetilde{D_k\Phi} .
\end{aligned} \tag{8.58}
\]

Proof of Proposition 8.15.

In order to justify the computations in the derivation of the monotonicity identity (8.54), we shall have to assume smoothness of $(A, \Phi)$. However, smoothing the components as in Definition 5.3 destroys the crucial support properties of $\Phi$ and the curvature components $F_{\alpha\beta}$, and thus of $\widetilde{\Phi}$ and $\widetilde{F_{\alpha\beta}}$. In that sense certain expressions below involving weights of the form $(1-|y|^2)^{-1/2}$ or $(1-|y|^2)^{-3/2}$ become singular at $|y| = 1$. To deal with this, we need to introduce an additional smooth cutoff $\chi\big( \frac{1-|y|}{\varepsilon} \big)$ that smoothly localizes away from the boundary, but such that $\lim_{\varepsilon \to 0} \chi\big( \frac{1-|y|}{\varepsilon} \big) = \chi_{(0,1)}(|y|)$, where $\chi_{(0,1)}$ is the sharp characteristic cutoff to the interval $(0,1)$. The weights are then replaced by truncated versions such as
\[
\chi\Big( \frac{1-|y|}{\varepsilon} \Big) (1-|y|^2)^{-1/2},
\]
which will lead to additional error terms localized near the boundary. But then letting the frequency cutoff converge toward $|\xi| = +\infty$ in the regularization for fixed but sufficiently small $\varepsilon > 0$, it will be easy to convince oneself that the additional errors vanish in the limit due to the support properties (8.48) of the underlying $(A, \Phi)$. We shall formally omit this additional cutoff.

Further, in order to simplify the computations below, we impose the Cronström-type gauge condition (8.56) on $(A, \Phi)$. This leads to another technical complication in that the $C^\infty$ smoothness of the regularized $(A, \Phi)$ in Coulomb gauge will be lost. This can again be dealt with via smooth truncation of the functional, this time away from the origin by including the cutoff $\chi\big( \frac{|y|}{\varepsilon} \big)$. Since all the integrations by parts to be performed below involve an operator $y\cdot\nabla_y$, the error terms are seen to be controllable in terms of the energy on smaller and smaller balls, and hence negligible in the limit as $\varepsilon \to 0$. Again, we shall gloss over this technicality in the formulas below.

We now begin with the derivation of the monotonicity identity (8.54), where we assume that $(\widetilde{A}, \widetilde{\Phi})(s,y,\delta)$ satisfy the Cronström-type gauge condition
\[
\sum_{k=1}^{4} y_k \widetilde{A}_k(s,y,\delta) = 0 \tag{8.59}
\]
for all $0 \leq s < \log(\frac{1}{\delta})$, $y \in \mathbb{R}^4$ and are smooth solutions to the Maxwell-Klein-Gordon system (8.50) in self-similar variables. In order to make the notation less heavy in this derivation, we write $(\tilde{A}, \tilde{\phi})$ instead of $(\widetilde{A}, \widetilde{\Phi})$, and $\widetilde{D_\alpha\phi}$, $\widetilde{F_{\alpha\beta}}$ instead of $\widetilde{D_\alpha\Phi}$, $\widetilde{F_{\alpha\beta}}$. We will repeatedly apply the following easily verified identities without further referencing,
\[
\begin{aligned}
(\partial_s + i\tilde{A}_0 + 1 + y\cdot\nabla_y) \tilde{\phi} &= \widetilde{D_0\phi}, \\
(\partial_k + i\tilde{A}_k)(\partial_s + i\tilde{A}_0) &= (\partial_s + i\tilde{A}_0)(\partial_k + i\tilde{A}_k) + i(\partial_k \tilde{A}_0 - \partial_s \tilde{A}_k),
\end{aligned}
\]
where $\partial_k$ denotes partial differentiation with respect to the $y$ variable. We also recall the identities (8.58). Moreover, we use the notation
\[
\rho(y) = (1-|y|^2)^{-1/2}
\]
and observe that
\[
\partial_k \rho(y) = y_k\, \rho^3(y), \qquad (1 + y\cdot\nabla_y) \rho(y) = \rho^3(y) .
\]
The equation for $\tilde{\phi}$ can be written in expanded form as
\[
(\partial_s + i\tilde{A}_0 + 2 + y\cdot\nabla_y)(\partial_s + i\tilde{A}_0 + 1 + y\cdot\nabla_y) \tilde{\phi} = \sum_k (\partial_k + i\tilde{A}_k)^2 \tilde{\phi},
\]
or alternatively as
\[
(\partial_s + i\tilde{A}_0)^2 \tilde{\phi} + (3 + 2 y\cdot\nabla_y)(\partial_s + i\tilde{A}_0) \tilde{\phi} + (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \tilde{\phi} - i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi} = \sum_k (\partial_k + i\tilde{A}_k)^2 \tilde{\phi} .
\]
We start analyzing the derivative with respect to $s$ of the following energy functional
\[
\frac{d}{ds} \int \frac{1}{2} \big| (\partial_s + i\tilde{A}_0) \tilde{\phi} \big|^2 \rho(y)\, dy = \int \operatorname{Re}\Big( \partial_s (\partial_s + i\tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy = \int \operatorname{Re}\Big( (\partial_s + i\tilde{A}_0)^2 \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy .
\]
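The two weight identities used repeatedly in the integrations by parts can be verified directly. The sketch below assumes the weight $\rho(y) = (1-|y|^2)^{-1/2}$ on $|y| < 1$ in four space dimensions, which is the weight appearing in $\widetilde{E}$. The last line records the resulting rule for integrating $y\cdot\nabla_y$ by parts against $\rho$, with boundary terms dropped (as the text arranges by means of the cutoffs); the functions $f$, $g$ are generic placeholders introduced only for this illustration.

```latex
\begin{align*}
\partial_k \rho(y)
  &= -\tfrac{1}{2}(1-|y|^2)^{-3/2}\cdot(-2y_k) = y_k\,\rho(y)^3, \\
(1 + y\cdot\nabla_y)\rho(y)
  &= \rho(y) + |y|^2\rho(y)^3
   = \rho(y)^3\bigl((1-|y|^2) + |y|^2\bigr) = \rho(y)^3, \\
\int_{\mathbb{R}^4} (y\cdot\nabla_y f)\, g\, \rho \, dy
  &= -\int_{\mathbb{R}^4} f\,\bigl((3 + y\cdot\nabla_y) g\bigr)\rho \, dy
     - \int_{\mathbb{R}^4} f\, g\, \rho^3 \, dy .
\end{align*}
```

The third identity uses $\nabla_y\cdot y = 4$ together with $y\cdot\nabla_y\rho = \rho^3 - \rho$, which is a restatement of the second line.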

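Inserting the equation for $\tilde{\phi}$, we obtain
\[
\begin{aligned}
\int \operatorname{Re}\Big( (\partial_s + i\tilde{A}_0)^2 \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy
&= \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k)^2 \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad - \int \operatorname{Re}\Big( (3 + 2 y\cdot\nabla_y)(\partial_s + i\tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad - \int \operatorname{Re}\Big( (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad + \int \operatorname{Re}\Big( i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\equiv I + II + III + IV .
\end{aligned}
\]
Integrating by parts in the term $I$, we find
\[
\begin{aligned}
I &= - \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k) \tilde{\phi}\; \overline{(\partial_k + i\tilde{A}_k)(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy - \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \partial_k \rho(y)\, dy \\
&= - \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0)(\partial_k + i\tilde{A}_k) \tilde{\phi}} \Big) \rho(y)\, dy - \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k) \tilde{\phi}\; \overline{i (\partial_k \tilde{A}_0 - \partial_s \tilde{A}_k) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad - \sum_k \int \operatorname{Re}\Big( (\partial_k + i\tilde{A}_k) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) y_k\, \rho^3(y)\, dy \\
&= - \frac{d}{ds} \int \Big( \frac{1}{2} \sum_k \big| (\partial_k + i\tilde{A}_k) \tilde{\phi} \big|^2 \Big) \rho(y)\, dy - \sum_k \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_k\phi}} \Big) (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, \rho(y)\, dy \\
&\quad - \sum_k \int \operatorname{Re}\Big( \partial_k \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) y_k\, \rho^3(y)\, dy,
\end{aligned}
\]
where in the last step we took advantage of the gauge condition (8.59). We expect the second to last term to cancel against a corresponding term from a suitable energy functional for $\tilde{A}$. On the other hand, the last term is expected to cancel against other terms from the equation for $\tilde{\phi}$. Next, integrating by parts in the term $II$ yields
\[
II = - \int \operatorname{Re}\Big( (3 + 2 y\cdot\nabla_y)(\partial_s + i\tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy = \int \big| (\partial_s + i\tilde{A}_0) \tilde{\phi} \big|^2 \rho^3(y)\, dy .
\]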
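The evaluation of $II$ amounts to a single integration by parts once the weight identity $y\cdot\nabla_y\rho = \rho^3 - \rho$ is available. A sketch, writing $g = (\partial_s + i\tilde{A}_0)\tilde{\phi}$ as a shorthand introduced here, assuming the coefficient of the transport term is $3 + 2y\cdot\nabla_y$ (as obtained by expanding the factored equation for $\tilde{\phi}$) and that boundary terms vanish:

```latex
\begin{align*}
II &= -\int \operatorname{Re}\bigl((3 + 2y\cdot\nabla_y)g\;\overline{g}\bigr)\rho\,dy
    = -3\int |g|^2\rho\,dy - \int \bigl(y\cdot\nabla_y|g|^2\bigr)\rho\,dy \\
   &= -3\int |g|^2\rho\,dy + \int |g|^2\bigl(4\rho + y\cdot\nabla_y\rho\bigr)\,dy
    = \int |g|^2\rho^3\,dy .
\end{align*}
```

The positive sign of the result is what makes this term the source of the monotonicity.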


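Performing another round of integration by parts, now in the term $III$, we find that
\[
\begin{aligned}
III &= - \int \operatorname{Re}\Big( (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&= \int \operatorname{Re}\Big( (1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0)(1 + y\cdot\nabla_y) \tilde{\phi}} \Big) \rho(y)\, dy + \int \operatorname{Re}\Big( (1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad + \int \operatorname{Re}\Big( y\cdot\nabla_y \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho^3(y)\, dy + \int \operatorname{Re}\Big( \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho^3(y)\, dy,
\end{aligned}
\]
and thus,
\[
\begin{aligned}
III &= \frac{d}{ds} \int \frac{1}{2} \big| (1 + y\cdot\nabla_y) \tilde{\phi} \big|^2 \rho(y)\, dy + \int \operatorname{Re}\Big( (1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad + \sum_k \int \operatorname{Re}\Big( \partial_k \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) y_k\, \rho^3(y)\, dy + \frac{d}{ds} \int \frac{1}{2} |\tilde{\phi}|^2 \rho^3(y)\, dy .
\end{aligned}
\]
Now we note that the second term on the right hand side of $III$ and the term $IV$ nicely combine to give
\[
\begin{aligned}
&\int \operatorname{Re}\Big( (1 + y\cdot\nabla_y) \tilde{\phi}\; \overline{i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy + \int \operatorname{Re}\Big( i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0) \tilde{\phi}} \Big) \rho(y)\, dy \\
&\quad = \int \operatorname{Re}\Big( i (y\cdot\nabla_y \tilde{A}_0) \tilde{\phi}\; \overline{(\partial_s + i\tilde{A}_0 + 1 + y\cdot\nabla_y) \tilde{\phi}} \Big) \rho(y)\, dy = - \int (y\cdot\nabla_y \tilde{A}_0) \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_0\phi}} \Big) \rho(y)\, dy .
\end{aligned}
\]
Moreover, we observe that the last term on the right hand side of $I$ cancels out with the third term on the right hand side of $III$. We summarize the preceding computations as follows
\[
\begin{aligned}
&\frac{d}{ds} \int \Big( \frac{1}{2} \big| (\partial_s + i\tilde{A}_0) \tilde{\phi} \big|^2 + \frac{1}{2} \sum_k \big| (\partial_k + i\tilde{A}_k) \tilde{\phi} \big|^2 - \frac{1}{2} \big| (1 + y\cdot\nabla_y) \tilde{\phi} \big|^2 - \frac{1}{2} |\tilde{\phi}|^2 \rho^2(y) \Big) \rho(y)\, dy \\
&\quad = \int \big| (\partial_s + i\tilde{A}_0) \tilde{\phi} \big|^2 \rho^3(y)\, dy - \sum_k \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_k\phi}} \Big) (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, \rho(y)\, dy - \int (y\cdot\nabla_y \tilde{A}_0) \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_0\phi}} \Big) \rho(y)\, dy .
\end{aligned} \tag{8.60}
\]
We expect the last two terms on the right hand side to cancel against corresponding terms generated by differentiating a suitable energy functional for $\tilde{A}$, while the first term furnishes the key monotonicity.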

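At this point we have to pass to the corresponding equation for $\tilde{A}$. It is given in expanded form for $j = 1, \ldots, 4$ by
\[
\begin{aligned}
&\partial_s (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0) + (3 + 2 y\cdot\nabla_y)(\partial_s \tilde{A}_j) + (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \tilde{A}_j \\
&\quad - (2 + y\cdot\nabla_y)(\partial_j \tilde{A}_0) - \sum_k \partial_k (\partial_k \tilde{A}_j - \partial_j \tilde{A}_k) = \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_j\phi}} \Big) .
\end{aligned}
\]
We begin with a tentative ansatz for the correct energy functional for $\tilde{A}$ to leading order, which we differentiate with respect to $s$,
\[
\begin{aligned}
&\frac{d}{ds} \int \Big( \frac{1}{2} \sum_j (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)^2 + \frac{1}{4} \sum_{j,k} (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)^2 \Big) \rho(y)\, dy \\
&\quad = \int \sum_j \partial_s (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy + \int \frac{1}{2} \sum_{j,k} \partial_s (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, \rho(y)\, dy \\
&\quad \equiv ① + ② .
\end{aligned}
\]
Inserting the equation for $\tilde{A}$ in the term ①, we obtain
\[
\begin{aligned}
① &= - \sum_j \int (3 + 2 y\cdot\nabla_y)(\partial_s \tilde{A}_j)\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad - \sum_j \int (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \tilde{A}_j\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_j \int (2 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_{j,k} \int \partial_k (\partial_k \tilde{A}_j - \partial_j \tilde{A}_k)\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_j \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_j\phi}} \Big) (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&= \widetilde{I} + \widetilde{II} + \widetilde{III} + \widetilde{IV} + \widetilde{V},
\end{aligned}
\]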

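where we already see that the term $\widetilde{V}$ cancels against the second term on the right hand side of (8.60). The term ② can be rewritten as
\[
\begin{aligned}
② &= \frac{1}{2} \sum_{j,k} \int \Big( \partial_j (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0) - \partial_k (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0) \Big) (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, \rho(y)\, dy \\
&= \sum_{j,k} \int \partial_j (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, \rho(y)\, dy \\
&= - \sum_{j,k} \int (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, \partial_j (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, \rho(y)\, dy - \sum_{j,k} \int (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, y_j\, \rho^3(y)\, dy \\
&= - \sum_{j,k} \int (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, \partial_j (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)\, \rho(y)\, dy - \sum_k \int (\partial_s \tilde{A}_k - \partial_k \tilde{A}_0)\, (1 + y\cdot\nabla_y) \tilde{A}_k\, \rho^3(y)\, dy,
\end{aligned}
\]
where in the second to last step we integrated by parts and in the last step we used that
\[
\sum_j y_j (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j) = (1 + y\cdot\nabla_y) \tilde{A}_k
\]
due to the gauge condition (8.59). We see that the term $\widetilde{IV}$ on the right hand side of ① cancels against the first term on the right hand side of ②. Next, we integrate by parts in the term $\widetilde{I}$ to find
\[
\begin{aligned}
\widetilde{I} &= - \sum_j \int (3 + 2 y\cdot\nabla_y)(\partial_s \tilde{A}_j)\, (\partial_s \tilde{A}_j)\, \rho(y)\, dy + \sum_j \int (3 + 2 y\cdot\nabla_y)(\partial_s \tilde{A}_j)\, (\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&= \sum_j \int (\partial_s \tilde{A}_j)^2 \rho^3(y)\, dy - \sum_j \int (\partial_s \tilde{A}_j)\, (5 + 2 y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy - 2 \sum_j \int (\partial_s \tilde{A}_j)\, (\partial_j \tilde{A}_0)\, y\cdot\nabla_y \rho(y)\, dy .
\end{aligned}
\]
Integrating by parts also in the term $\widetilde{II}$ yields
\[
\begin{aligned}
\widetilde{II} &= \sum_j \int (1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy + \sum_j \int (1 + y\cdot\nabla_y) \tilde{A}_j\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho^3(y)\, dy \\
&= \frac{d}{ds} \int \Big( \frac{1}{2} \sum_j \big| (1 + y\cdot\nabla_y) \tilde{A}_j \big|^2 \Big) \rho(y)\, dy - \sum_j \int (1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_j \int (1 + y\cdot\nabla_y) \tilde{A}_j\, (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho^3(y)\, dy
\end{aligned}
\]
and we observe that the third term on the right hand side of $\widetilde{II}$ cancels against the second term on the right hand side of ②. Another round of integration by parts, now in the term $\widetilde{III}$, leads to
\[
\widetilde{III} = \sum_j \int (2 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, (\partial_s \tilde{A}_j)\, \rho(y)\, dy + \sum_j \int \frac{1}{2} (\partial_j \tilde{A}_0)^2\, y\cdot\nabla_y \rho(y)\, dy .
\]
Combining the above expressions, we are thus reduced to
\[
\begin{aligned}
① + ② &= \sum_j \int (\partial_s \tilde{A}_j)^2 \rho^3(y)\, dy + \frac{d}{ds} \int \Big( \frac{1}{2} \sum_j \big| (1 + y\cdot\nabla_y) \tilde{A}_j \big|^2 \Big) \rho(y)\, dy \\
&\quad + \sum_j \int \frac{1}{2} (\partial_j \tilde{A}_0)^2\, y\cdot\nabla_y \rho(y)\, dy - 2 \sum_j \int (\partial_s \tilde{A}_j)\, (\partial_j \tilde{A}_0)\, y\cdot\nabla_y \rho(y)\, dy \\
&\quad - \sum_j \int (\partial_s \tilde{A}_j)\, (3 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy - \sum_j \int (1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_j \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_j\phi}} \Big) (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy .
\end{aligned}
\]
We reformulate this as
\[
\begin{aligned}
① + ② &= \sum_j \int (\partial_s \tilde{A}_j)^2 \rho^3(y)\, dy + \frac{d}{ds} \int \Big( \frac{1}{2} \sum_j \big| (1 + y\cdot\nabla_y) \tilde{A}_j \big|^2 \Big) \rho(y)\, dy \\
&\quad + \sum_j \int \frac{1}{2} (\partial_j \tilde{A}_0)^2\, y\cdot\nabla_y \rho(y)\, dy - 2 \sum_j \int (\partial_s \tilde{A}_j)\, (\partial_j \tilde{A}_0)\, \rho^3(y)\, dy \\
&\quad - \sum_j \int (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad + \sum_j \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_j\phi}} \Big) (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy .
\end{aligned} \tag{8.61}
\]
Next, we further analyze the second to last term on the right hand side of the above identity. Integration by parts gives
\[
\begin{aligned}
&- \sum_j \int (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad = - \sum_j \int (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, \partial_j (y\cdot\nabla_y \tilde{A}_0)\, \rho(y)\, dy \\
&\quad = \sum_j \int \partial_j (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (y\cdot\nabla_y \tilde{A}_0)\, \rho(y)\, dy + \sum_j \int (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (y\cdot\nabla_y \tilde{A}_0)\, y_j\, \rho^3(y)\, dy \\
&\quad = \sum_j \int \partial_j (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (y\cdot\nabla_y \tilde{A}_0)\, \rho(y)\, dy,
\end{aligned} \tag{8.62}
\]
where in the last step we used that due to the gauge condition (8.59),
\[
\sum_j y_j (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j = 0 .
\]
Moreover, one easily verifies that
\[
\sum_j \partial_j (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j = \sum_j \partial_j^2 \tilde{A}_0 + \sum_j \partial_j \widetilde{F_{0j}} = \sum_j \partial_j^2 \tilde{A}_0 + \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_0\phi}} \Big), \tag{8.63}
\]
where in the last equality we linked with the equation for $\tilde{A}_0$. Inserting (8.63) back into (8.62) and integrating by parts several times more, we conclude that
\[
\begin{aligned}
&- \sum_j \int (\partial_s + 1 + y\cdot\nabla_y) \tilde{A}_j\, (1 + y\cdot\nabla_y)(\partial_j \tilde{A}_0)\, \rho(y)\, dy \\
&\quad = \sum_j \int (\partial_j \tilde{A}_0)^2 \rho(y)\, dy + \sum_j \int \frac{1}{2} (\partial_j \tilde{A}_0)^2\, y\cdot\nabla_y \rho(y)\, dy - \int (y\cdot\nabla_y \tilde{A}_0)^2 \rho^3(y)\, dy + \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_0\phi}} \Big) (y\cdot\nabla_y \tilde{A}_0)\, \rho(y)\, dy .
\end{aligned} \tag{8.64}
\]
Finally, inserting (8.64) back into (8.61) and combining terms, we may summarize the preceding computations as follows
\[
\begin{aligned}
&\frac{d}{ds} \int \Big( \frac{1}{2} \sum_j (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)^2 + \frac{1}{4} \sum_{j,k} (\partial_j \tilde{A}_k - \partial_k \tilde{A}_j)^2 - \frac{1}{2} \sum_j \big| (1 + y\cdot\nabla_y) \tilde{A}_j \big|^2 \Big) \rho(y)\, dy \\
&\quad = \int \sum_j (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)^2 \rho^3(y)\, dy - \int (y\cdot\nabla_y \tilde{A}_0)^2 \rho^3(y)\, dy \\
&\qquad + \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_0\phi}} \Big) (y\cdot\nabla_y \tilde{A}_0)\, \rho(y)\, dy + \sum_j \int \operatorname{Im}\Big( \tilde{\phi}\, \overline{\widetilde{D_j\phi}} \Big) (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)\, \rho(y)\, dy .
\end{aligned} \tag{8.65}
\]
We observe that the last two terms on the right hand side cancel against the last two terms on the right hand side of (8.60). However, it is not yet obvious that the first two terms on the right hand side of the above identity (8.65) yield the desired monotonicity. To this end we decompose the 4-vector $(\partial_j \tilde{A}_0)_{j=1}^{4}$ into its radial and angular part. The gauge condition (8.59) then allows us to rewrite this as
\[
\begin{aligned}
&\int \sum_j (\partial_s \tilde{A}_j - \partial_j \tilde{A}_0)^2 \rho^3(y)\, dy - \int (y\cdot\nabla_y \tilde{A}_0)^2 \rho^3(y)\, dy \\
&\quad = \int \sum_j \Big( \partial_s \tilde{A}_j - \partial_j \tilde{A}_0 + \Big( \frac{y}{|y|}\cdot\nabla_y \tilde{A}_0 \Big) \frac{y_j}{|y|} \Big)^2 \rho^3(y)\, dy + \int \Big( \frac{y}{|y|}\cdot\nabla_y \tilde{A}_0 \Big)^2 \rho^3(y)\, dy - \int \big( y\cdot\nabla_y \tilde{A}_0 \big)^2 \rho^3(y)\, dy \\
&\quad = \int \sum_j \Big( \partial_s \tilde{A}_j - \partial_j \tilde{A}_0 + \Big( \frac{y}{|y|}\cdot\nabla_y \tilde{A}_0 \Big) \frac{y_j}{|y|} \Big)^2 \rho^3(y)\, dy + \int \frac{1}{|y|^2} \big( y\cdot\nabla_y \tilde{A}_0 \big)^2 \rho(y)\, dy .
\end{aligned} \tag{8.66}
\]
Combining (8.60), (8.65), and (8.66) finishes the proof of the monotonicity identity (8.54).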
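The passage to (8.66) rests on a pointwise orthogonal decomposition. A sketch, with the shorthand $\hat{y} = y/|y|$, $a_j = \partial_s\tilde{A}_j$ and $b_j = \partial_j\tilde{A}_0$ (notation introduced here only for illustration), and assuming the gauge condition (8.59), which gives $\hat{y}\cdot a = 0$, together with the weight $\rho = (1-|y|^2)^{-1/2}$:

```latex
\begin{align*}
v_j &:= a_j - b_j + (\hat y\cdot b)\,\hat y_j, \qquad
\hat y\cdot v = \hat y\cdot a - \hat y\cdot b + \hat y\cdot b = 0, \\
\sum_j (a_j - b_j)^2
  &= \sum_j \bigl(v_j - (\hat y\cdot b)\,\hat y_j\bigr)^2
   = \sum_j v_j^2 + (\hat y\cdot b)^2, \\
(\hat y\cdot b)^2\rho^3 - (y\cdot b)^2\rho^3
  &= \frac{(y\cdot b)^2}{|y|^2}\,(1-|y|^2)\,\rho^3
   = \frac{(y\cdot b)^2}{|y|^2}\,\rho .
\end{align*}
```

Both resulting terms are manifestly non-negative, which is the sign information that (8.65) alone does not display.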

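It remains to prove (8.55). Using the gauge invariant formulation of the Lyapunov functional $\widetilde{E}$, we proceed exactly as in the proof of Proposition 6.2 (iii) in [10] to show that for all $\delta > 0$,
\[
\lim_{s \to \log(\frac{1}{\delta})} \int_{B_1} \Big( \frac{1}{2} \sum_j \widetilde{F_{j0}}^2 + \frac{1}{4} \sum_{j,k} \widetilde{F_{jk}}^2 + \frac{1}{2} \sum_\alpha \big| \widetilde{D_\alpha\Phi} \big|^2 \Big) \frac{dy}{(1-|y|^2)^{1/2}} \leq E_{crit},
\]
while
\[
\lim_{s \to \log(\frac{1}{\delta})} \int_{B_1} \Big( \sum_{j,k} y_k \widetilde{F_{j0}} \widetilde{F_{jk}} + \sum_k y_k \operatorname{Re}\big( \widetilde{D_k\Phi}\, \overline{\widetilde{D_0\Phi}} \big) + \operatorname{Re}\big( \widetilde{\Phi}\, \overline{\widetilde{D_0\Phi}} \big) + \frac{1}{2} \frac{|\widetilde{\Phi}|^2}{1-|y|^2} \Big) \frac{dy}{(1-|y|^2)^{1/2}} = 0 .
\]
□

Next, we prove upper and lower bounds for the Lyapunov functional $\widetilde{E}(s)$ uniformly in $\delta > 0$ and $0 \leq s < \log(\frac{1}{\delta})$.

Lemma 8.16.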

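For all $\delta > 0$ and all $0 \leq s < \log(\frac{1}{\delta})$, we have
\[
- C E_{crit} \leq \widetilde{E}(s) \leq E_{crit} \tag{8.67}
\]
for some absolute constant $C > 0$.

Proof. The upper bound is immediate from (8.55) and the monotonicity property (8.54) of the functional $\widetilde{E}$. In order to prove the lower bound, we work with the gauge invariant formulation of the Lyapunov functional $\widetilde{E}$ and first observe that for $|y| \leq 1$, the quantities
\[
\frac{1}{2} \sum_\alpha \big| \widetilde{D_\alpha\Phi} \big|^2 - \sum_k y_k \operatorname{Re}\big( \widetilde{D_k\Phi}\, \overline{\widetilde{D_0\Phi}} \big)
\qquad \text{and} \qquad
\frac{1}{2} \sum_j \widetilde{F_{j0}}^2 + \frac{1}{4} \sum_{j,k} \widetilde{F_{jk}}^2 - \sum_{j,k} y_k \widetilde{F_{j0}} \widetilde{F_{jk}}
\]
are non-negative. This is straightforward to see for the first expression, while for the second one we use that
\[
\sum_{j,k} y_k \widetilde{F_{j0}} \widetilde{F_{jk}} = \frac{|y|}{2} \sum_{j,k} \Big( \frac{y_k}{|y|} \widetilde{F_{j0}} - \frac{y_j}{|y|} \widetilde{F_{k0}} \Big) \widetilde{F_{jk}} \leq \frac{|y|}{4} \sum_{j,k} \Big( \frac{y_k}{|y|} \widetilde{F_{j0}} - \frac{y_j}{|y|} \widetilde{F_{k0}} \Big)^2 + \frac{|y|}{4} \sum_{j,k} \widetilde{F_{jk}}^2 .
\]
From the general identity
\[
\sum_{j,k} \big( \omega_k r_j - \omega_j r_k \big)^2 = 2 \big( |r|^2 - (r\cdot\omega)^2 \big) \leq 2 |r|^2
\]
for $r, \omega \in \mathbb{R}^4$ with $|\omega| = 1$, we then conclude that
\[
\sum_{j,k} y_k \widetilde{F_{j0}} \widetilde{F_{jk}} \leq \frac{|y|}{2} \sum_j \widetilde{F_{j0}}^2 + \frac{|y|}{4} \sum_{j,k} \widetilde{F_{jk}}^2 .
\]
It therefore suffices to obtain an upper bound on
\[
\int_{B_1} \Big( \operatorname{Re}\big( \widetilde{\Phi}\, \overline{\widetilde{D_0\Phi}} \big) + \frac{1}{2} \frac{|\widetilde{\Phi}|^2}{1-|y|^2} \Big) \frac{dy}{(1-|y|^2)^{1/2}}
\]
uniformly for all $\delta > 0$ and $0 \leq s < \log(\frac{1}{\delta})$. From Hölder's inequality, (8.52) and (8.53) we easily infer that
\[
\int_{B_1} \Big( \operatorname{Re}\big( \widetilde{\Phi}\, \overline{\widetilde{D_0\Phi}} \big) + \frac{1}{2} \frac{|\widetilde{\Phi}|^2}{1-|y|^2} \Big) \frac{dy}{(1-|y|^2)^{1/2}} \lesssim \int_{B_1} \big| \widetilde{D_0\Phi} \big|^2\, dy + \int_{B_1} \frac{|\widetilde{\Phi}|^2}{(1-|y|^2)^2}\, dy \lesssim E_{crit} .
\]
□

As a corollary of Proposition 8.15 and Lemma 8.16, we obtain the following decay property as $\delta \to 0$.

Corollary 8.17.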

For each δ > , there exists s δ ∈ (cid:0) log( δ ) , log( δ ) (cid:1) such that (8.68) Z s δ + log( δ ) s δ e Ξ ( s ) ds ≤ CE crit log( δ ) . Proof.

From (8.54) and Lemma 8.16 we have that
\[
\int_0^{\log(\frac{1}{\delta})} \widetilde{\Xi}(s)\, ds \leq C E_{crit} .
\]
Then the claim is immediate. □

Our goal is now to extract a limiting solution (cid:0) A ∗ , Φ ∗ (cid:1) and to eventually show that Φ ∗ mustvanish. This will yield a contradiction to the minimal blowup solution (cid:0) A , Φ (cid:1) having inﬁnite S norm.Let t δ = + δ − e − s δ , where s δ is as in Corollary 8.17. By Corollary 8.13 we can pick a sequence δ l → l → ∞ such that (cid:16) (1 − t δ l ) ∇ t , x A x (cid:0) t δ l , (1 − t δ l ) x (cid:1) , (1 − t δ l ) ∇ t , x Φ (cid:0) t δ l , (1 − t δ l ) x (cid:1)(cid:17) → (cid:16) ∇ t , x A ∗ x ( x ) , ∇ t , x Φ ∗ ( x ) (cid:17) strongly in (cid:0) L x ( R ) (cid:1) as δ l →

0. We may also arrange that(8.69) (cid:16) (1 + δ l − t δ l ) ∇ t , x A x (cid:0) t δ l , (1 + δ l − t δ l ) x (cid:1) , (1 + δ l − t δ l ) ∇ t , x Φ (cid:0) t δ l , (1 + δ l − t δ l ) x (cid:1)(cid:17) → (cid:16) ∇ t , x A ∗ x ( x ) , ∇ t , x Φ ∗ ( x ) (cid:17) in (cid:0) L x ( R ) (cid:1) as δ l →

0. We now consider the MKG-CG evolutions in the sense of Deﬁnition 5.3of the energy class Coulomb data given by the left hand side of (8.69). Denote these evolutions by (cid:0) A l ∗ , Φ l ∗ (cid:1) . By the perturbative results from Corollary 6.3, these evolutions exist on some ﬁxed timeinterval [0 , T ∗ ], where we may assume that 0 < T ∗ <

1. Moreover, we have on [0 , T ∗ ] that A l ∗ = (1 + δ l − t δ l ) A (cid:0) t δ l + (1 + δ l − t δ l ) t , (1 + δ l − t δ l ) x (cid:1) , Φ l ∗ ( t , x ) = (1 + δ l − t δ l ) Φ (cid:0) t δ l + (1 + δ l − t δ l ) t , (1 + δ l − t δ l ) x (cid:1) , and (cid:0) ∇ t , x A l ∗ x ( t , · ) , ∇ t , x Φ l ∗ ( t , · ) (cid:1) → (cid:0) ∇ t , x A ∗ x ( t , · ) , ∇ t , x Φ ∗ ( t , · ) (cid:1) in (cid:0) L x ( R ) (cid:1) as l → ∞ uniformly for all 0 ≤ t ≤ T ∗ , where (cid:0) A ∗ , Φ ∗ (cid:1) is a weak solution to MKG-CGon [0 , T ∗ ] × R . Note that on account of these identities we havesupp (cid:0) Φ l ∗ ( t , · ) (cid:1) ⊂ (cid:26) x ∈ R : | x | ≤ − t δ l + δ l − t δ l − t < − t (cid:27) and similarly supp (cid:0) ( ∂ α A l ∗ β − ∂ β A l ∗ α )( t , · ) (cid:1) ⊂ (cid:8) x ∈ R : | x | < − t (cid:9) . ONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION 155

We now switch to the self-similar variables s = − log(1 − t ) , y = x − t , ≤ t ≤ T ∗ , and deﬁne g A l ∗ α ( s , y ) = e − s A l ∗ α (1 − e − s , e − s y ) , f Φ l ∗ ( s , y ) = e − s Φ l ∗ (1 − e − s , e − s y ) , and similarly for (cid:0) f A ∗ , f Φ ∗ (cid:1) . We conclude exactly as in [10] after Remark 6.8 there that g A l ∗ α ( s , y ) = e A α ( s δ l + s , y , δ l ) , f Φ l ∗ ( s , y ) = e Φ ( s δ l + s , y , δ l ) , (8.70)and(8.71) (cid:0) ∇ s , y g A l ∗ x , ∇ s , y f Φ l ∗ (cid:1) ( s , · ) → (cid:0) ∇ s , y f A ∗ x , ∇ s , y f Φ ∗ (cid:1) ( s , · )in (cid:0) L y ( R ) (cid:1) as l → ∞ uniformly for all 0 ≤ s ≤ − log(1 − T ∗ ) = : S . Then (cid:0) f A ∗ , f Φ ∗ (cid:1) is a weaksolution to the Maxwell-Klein-Gordon system in self-similar variables (8.50). Denoting by ] D α Φ ∗ and g F ∗ αβ the covariant derivatives and curvature components in self-similar variables associated with (cid:0) f A ∗ , f Φ ∗ (cid:1) , we conclude that supp (cid:8)f Φ ∗ ( s , · ) (cid:9) ⊂ { y ∈ R : | y | ≤ } , supp (cid:8)g F ∗ αβ ( s , · ) (cid:9) ⊂ { y ∈ R : | y | ≤ } . (8.72) Lemma 8.18.

Let $(\widetilde{A^*}, \widetilde{\Phi^*})$ be as above. Then it holds that
\[
\sum_j y_j \widetilde{F^*_{j0}} \equiv 0, \qquad \widetilde{D_0\Phi^*} - \sum_k y_k \widetilde{D_k\Phi^*} - \widetilde{\Phi^*} \equiv 0 . \tag{8.73}
\]

56 CONCENTRATION COMPACTNESS FOR THE CRITICAL MKG EQUATION

Proof.

For large l we obtain from (8.70), (8.71), and Corollary 8.17 that Z S Z B | y | (cid:18) X j y j g F ∗ j ( s , y ) (cid:19) dy (1 − | y | ) ds + Z S Z B (cid:12)(cid:12)(cid:12)(cid:12)(cid:16) ] D Φ ∗ − X k y k ] D k Φ ∗ − f Φ ∗ (cid:17) ( s , y ) (cid:12)(cid:12)(cid:12)(cid:12) dy (1 − | y | ) ds ≤ lim inf l →∞ ( Z S Z B | y | (cid:18) X j y j g F j ( s δ l + s , y , δ l ) (cid:19) dy (1 − | y | ) ds + Z S Z B (cid:12)(cid:12)(cid:12)(cid:12)(cid:16) ] D Φ − X k y k g D k Φ − e Φ (cid:17) ( s δ l + s , y , δ l ) (cid:12)(cid:12)(cid:12)(cid:12) dy (1 − | y | ) ds ) ≤ lim inf l →∞ ( Z s δ l + Ss δ l Z B | y | (cid:18) X j y j g F j ( s , y , δ l ) (cid:19) dy (1 − | y | ) ds + Z s δ l + Ss δ l Z B (cid:12)(cid:12)(cid:12)(cid:12)(cid:16) ] D Φ − X k y k g D k Φ − e Φ (cid:17) ( s , y , δ l ) (cid:12)(cid:12)(cid:12)(cid:12) dy (1 − | y | ) ds ) ≤ lim inf l →∞ CE crit log( δ l ) = . (cid:3) Proposition 8.19.

Let $(\widetilde{A^*}, \widetilde{\Phi^*})$ be as above. Then we have $\widetilde{\Phi^*} \equiv 0$.

Going back to the $(t,x)$ coordinates, the preceding proposition implies that $A^*_k$ is a free wave for $k = 1, \ldots, 4$, while $A^*_0 \equiv 0$. This contradicts Proposition 6.1 and hence completes the proof of Proposition 8.9.

Proof of Proposition 8.19.

In order to simplify the computations below, we assume that $(\widetilde{A^*}, \widetilde{\Phi^*})$ satisfy the Cronström-type gauge condition (in self-similar variables)
\[
\sum_{k=1}^{4} y_k \widetilde{A^*_k}(s,y) = 0 \tag{8.74}
\]
for all $0 \leq s \leq S$ and $y \in \mathbb{R}^4$. Then the properties (8.73) of the limiting solution $(\widetilde{A^*}, \widetilde{\Phi^*})$ can be written as
\[
y\cdot\nabla_y \widetilde{A^*_0} \equiv 0, \qquad \big( \partial_s + i\widetilde{A^*_0} \big) \widetilde{\Phi^*} \equiv 0 .
\]
In particular, the equation for $\widetilde{\Phi^*}$ simplifies to
\[
(2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \widetilde{\Phi^*} = \sum_k \big( \partial_k + i\widetilde{A^*_k} \big)^2 \widetilde{\Phi^*} .
\]
Integrating this equation against $\overline{\widetilde{\Phi^*}}$, we find
\[
\int_{\mathbb{R}^4} \Big( (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \widetilde{\Phi^*} \Big) \overline{\widetilde{\Phi^*}}\, dy = - \sum_k \int_{\mathbb{R}^4} \big| \big( \partial_k + i\widetilde{A^*_k} \big) \widetilde{\Phi^*} \big|^2\, dy . \tag{8.75}
\]
A simple integration by parts shows that the left hand side of (8.75) is given by
\[
\int_{\mathbb{R}^4} \Big( (2 + y\cdot\nabla_y)(1 + y\cdot\nabla_y) \widetilde{\Phi^*} \Big) \overline{\widetilde{\Phi^*}}\, dy = 4 \int_{\mathbb{R}^4} \big| \widetilde{\Phi^*} \big|^2\, dy - \int_{\mathbb{R}^4} \big| y\cdot\nabla_y \widetilde{\Phi^*} \big|^2\, dy .
\]
Decomposing the 4-vector $\big( \partial_k \widetilde{\Phi^*} \big)_{k=1}^{4}$ into its radial and angular part, we observe that the gauge condition (8.74) allows us to rewrite the right hand side of (8.75) as
\[
\begin{aligned}
- \sum_k \int_{\mathbb{R}^4} \big| (\partial_k + i\widetilde{A^*_k}) \widetilde{\Phi^*} \big|^2\, dy
&= - \int_{\mathbb{R}^4} \Big( \Big| \frac{y}{|y|}\cdot\nabla_y \widetilde{\Phi^*} \Big|^2 + \sum_k \Big| \partial_k \widetilde{\Phi^*} - \Big( \frac{y}{|y|}\cdot\nabla_y \widetilde{\Phi^*} \Big) \frac{y_k}{|y|} + i\widetilde{A^*_k} \widetilde{\Phi^*} \Big|^2 \Big) dy \\
&\leq - \int_{\mathbb{R}^4} \Big| \frac{y}{|y|}\cdot\nabla_y \widetilde{\Phi^*} \Big|^2\, dy .
\end{aligned}
\]
Thus, we find that
\[
4 \int_{\mathbb{R}^4} \big| \widetilde{\Phi^*} \big|^2\, dy \leq - \Big( \int_{\mathbb{R}^4} \Big| \frac{y}{|y|}\cdot\nabla_y \widetilde{\Phi^*} \Big|^2\, dy - \int_{\mathbb{R}^4} \big| y\cdot\nabla_y \widetilde{\Phi^*} \big|^2\, dy \Big),
\]
and in view of the support properties (8.72) of $\widetilde{\Phi^*}$, the right hand side is non-positive, so we must have $\widetilde{\Phi^*} \equiv 0$. □

To conclude the rigidity argument, we need to reduce to the additional assumption $\lambda(t) \geq \lambda_0 > 0$ for $t \in \mathbb{R}$ made in the statement of Proposition 8.9. However, this follows as in Lemma 10.18 in [20].

Finally, we summarize the proof of the global existence assertion in Theorem 1.2 and address the proof of the scattering assertion.

Proof of Theorem 1.2.

From the concentration compactness step in Section 7 and the rigidity argument in this section, we infer the existence of a non-decreasing function $K : (0, \infty) \to (0, \infty)$ with the following property: Let $(A_x, \phi)[0]$ be admissible Coulomb class data of energy $E$. Then there exists a unique global admissible solution $(A, \phi)$ to MKG-CG with initial data $(A_x, \phi)[0]$ satisfying the a priori bound
\[
\bigl\| (A_x, \phi) \bigr\|_{S^1(\mathbb{R} \times \mathbb{R}^4)} \leq K(E).
\]
It remains to prove that the dynamical variables $(A_x, \phi)$ of the global solution $(A, \phi)$ to MKG-CG scatter to finite energy free waves. To this end it suffices to show that
\[
\| \Box A_j \|_{N(\mathbb{R} \times \mathbb{R}^4)} < \infty \quad \text{for } j = 1, \ldots, 4, \qquad \| \Box \phi \|_{N(\mathbb{R} \times \mathbb{R}^4)} < \infty.
\]
Here the only concern is to bound the low-high interactions in the magnetic interaction term $-2i A^{\mathrm{free}}_j \partial^j \phi$ in the equation for $\phi$, where $A^{\mathrm{free}}_j$ is the free wave evolution of the initial data $A_j[0]$. In this case, the bound $\| (A_x, \phi) \|_{S^1(\mathbb{R} \times \mathbb{R}^4)} < \infty$ does not suffice and we have to invest our strong assumptions about the spatial decay of the initial data. More precisely, from [22] we have the following estimate for dyadic frequencies $k_1 \leq k_2 - C$,
\[
\bigl\| P_{k_1} A^{\mathrm{free}}_j \, P_{k_2} \partial^j \phi \bigr\|_{N} \lesssim \bigl\| P_{k_1} A_x[0] \bigr\|_{\dot{H}^1_x \times L^2_x} \bigl\| P_{k_2} \phi \bigr\|_{S^1}.
\]
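The step from this dyadic estimate to a bound on the full low-high sum can be sketched as follows. The sketch rests on two assumptions that are not restated in this section: that the $N$ and $S^1$ norms are square sums over dyadic frequency pieces, and that for $k_1 \leq k_2 - C$ the product $P_{k_1} A^{\mathrm{free}}_j \, P_{k_2} \partial^j \phi$ is localized at frequency $\sim 2^{k_2}$.
\[
\begin{aligned}
\Bigl\| \sum_{k_2 \in \mathbb{Z}} P_{\leq k_2 - C} A^{\mathrm{free}}_j \, P_{k_2} \partial^j \phi \Bigr\|_{N}
&\lesssim \biggl( \sum_{k_2 \in \mathbb{Z}} \Bigl( \sum_{k_1 \leq k_2 - C} \bigl\| P_{k_1} A^{\mathrm{free}}_j \, P_{k_2} \partial^j \phi \bigr\|_{N} \Bigr)^2 \biggr)^{1/2} \\
&\lesssim \Bigl( \sum_{k_1 \in \mathbb{Z}} \bigl\| P_{k_1} A_x[0] \bigr\|_{\dot{H}^1_x \times L^2_x} \Bigr) \Bigl( \sum_{k_2 \in \mathbb{Z}} \bigl\| P_{k_2} \phi \bigr\|_{S^1}^2 \Bigr)^{1/2}.
\end{aligned}
\]
Under these assumptions the last factor is comparable to $\| \phi \|_{S^1}$, while the first factor is an $\ell^1$ sum over frequencies. This is why square-summable (energy class) control of the data alone is not enough for this term.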


Thus, we may bound the low-high interactions in the magnetic interaction term $-2i A^{\mathrm{free}}_j \partial^j \phi$ by
\[
\Bigl\| \sum_{k \in \mathbb{Z}} P_{\leq k - C} A^{\mathrm{free}}_j \, P_k \partial^j \phi \Bigr\|_{N} \lesssim \bigl\| A_x[0] \bigr\|_{\ell^1 ( \dot{H}^1_x \times L^2_x )} \| \phi \|_{S^1}.
\]
To see that $\| A_x[0] \|_{\ell^1 ( \dot{H}^1_x \times L^2_x )}$ is finite, we observe that in the Coulomb gauge we have for $j = 1, \ldots, 4$,
\[
A_j = - \Delta^{-1} \partial^l F_{jl}.
\]
Hence, we obtain for $j = 1, \ldots, 4$,
\[
\sum_{k \in \mathbb{Z}} \| P_k A_j(0) \|_{\dot{H}^1_x} \lesssim \sum_{k \in \mathbb{Z}} \sum_{l=1}^{4} \| P_k \Delta^{-1} \partial_l F_{jl}(0) \|_{\dot{H}^1_x} \lesssim \sum_{k \in \mathbb{Z}} \sum_{l=1}^{4} \| P_k F_{jl}(0) \|_{L^2_x} < \infty,
\]
since the spatial curvature components $F_{jl}(0)$ are of Schwartz class by assumption. Similarly, we conclude that $\| \partial_t A_x(0) \|_{\ell^1 L^2_x} < \infty$, which finishes the proof. $\square$

References

1. Hajer Bahouri and Patrick Gérard, High frequency approximation of solutions to critical nonlinear wave equations, Amer. J. Math. (1999), no. 1, 131–175.
2. Jean Bourgain, Fourier transform restriction phenomena for certain lattice subsets and applications to nonlinear evolution equations. II. The KdV-equation, Geom. Funct. Anal. (1993), no. 3, 209–262.
3. Haïm Brezis and Moshe Marcus, Hardy's inequalities revisited, Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) (1997), no. 1-2, 217–237 (1998).
4. Scipio Cuccagna, On the local existence for the Maxwell-Klein-Gordon system in $\mathbb{R}^{3+1}$, Comm. Partial Differential Equations (1999), no. 5-6, 851–867.
5. Douglas M. Eardley and Vincent Moncrief, The global existence of Yang-Mills-Higgs fields in 4-dimensional Minkowski space. I. Local existence and smoothness properties, Comm. Math. Phys. (1982), no. 2, 171–191.
6. Douglas M. Eardley and Vincent Moncrief, The global existence of Yang-Mills-Higgs fields in 4-dimensional Minkowski space. II. Completion of proof, Comm. Math. Phys. (1982), no. 2, 193–212.
7. Tosio Kato, Marius Mitrea, Gustavo Ponce, and Michael Taylor, Extension and representation of divergence-free vector fields on bounded domains, Math. Res. Lett. (2000), no. 5-6, 643–650.
8. Markus Keel, Tristan Roy, and Terence Tao, Global well-posedness of the Maxwell-Klein-Gordon equation below the energy norm, Discrete Contin. Dyn. Syst. (2011), no. 3, 573–621.
9. Carlos E. Kenig and Frank Merle, Global well-posedness, scattering and blow-up for the energy-critical, focusing, non-linear Schrödinger equation in the radial case, Invent. Math. (2006), no. 3, 645–675.
10. Carlos E. Kenig and Frank Merle, Global well-posedness, scattering and blow-up for the energy-critical focusing non-linear wave equation, Acta Math. (2008), no. 2, 147–212.
11. Sergiu Klainerman, Long time behaviour of solutions to nonlinear wave equations, Proceedings of the International Congress of Mathematicians, Vol. 1, 2 (Warsaw, 1983), PWN, Warsaw, 1984, pp. 1209–1215.
12. Sergiu Klainerman, Geometric and Fourier Methods in Nonlinear Wave Equations, Lectures delivered at IPAM workshop in Oscillatory integrals and PDE, March 19–25, 2001.
13. Sergiu Klainerman and Matei Machedon, Space-time estimates for null forms and the local existence theorem, Comm. Pure Appl. Math. (1993), no. 9, 1221–1268.
14. Sergiu Klainerman and Matei Machedon, On the Maxwell-Klein-Gordon equation with finite energy, Duke Math. J. (1994), no. 1, 19–44.
15. Sergiu Klainerman and Matei Machedon, Finite energy solutions of the Yang-Mills equations in $\mathbb{R}^{3+1}$, Ann. of Math. (2) (1995), no. 1, 39–119.
16. Sergiu Klainerman and Matei Machedon, Smoothing estimates for null forms and applications, Duke Math. J. (1995), no. 1, 99–133 (1996).
17. Sergiu Klainerman, Igor Rodnianski, and Jérémie Szeftel, The bounded $L^2$ curvature conjecture, Invent. Math. (2015), no. 1, 91–216.
18. Sergiu Klainerman and Daniel Tataru, On the optimal local regularity for Yang-Mills equations in $\mathbb{R}^{4+1}$, J. Amer. Math. Soc. (1999), no. 1, 93–116.
19. Joachim Krieger, Global regularity of wave maps from $\mathbb{R}^{2+1}$ to $H^2$. Small energy, Comm. Math. Phys. (2004), no. 3, 507–580.
20. Joachim Krieger and Wilhelm Schlag, Concentration compactness for critical wave maps, EMS Monographs in Mathematics, European Mathematical Society (EMS), Zürich, 2012.


21. Joachim Krieger and Jacob Sterbenz, Global regularity for the Yang-Mills equations on high dimensional Minkowski space, Mem. Amer. Math. Soc. (2013), no. 1047, vi+99.
22. Joachim Krieger, Jacob Sterbenz, and Daniel Tataru, Global well-posedness for the Maxwell-Klein-Gordon equation in 4+1 dimensions: small energy, Duke Math. J. (2015), no. 6, 973–1040.
23. Pierre-Louis Lions, The concentration-compactness principle in the calculus of variations. The locally compact case. I, Ann. Inst. H. Poincaré Anal. Non Linéaire (1984), no. 2, 109–145.
24. Pierre-Louis Lions, The concentration-compactness principle in the calculus of variations. The locally compact case. II, Ann. Inst. H. Poincaré Anal. Non Linéaire (1984), no. 4, 223–283.
25. Pierre-Louis Lions, The concentration-compactness principle in the calculus of variations. The limit case. I, Rev. Mat. Iberoamericana (1985), no. 1, 145–201.
26. Pierre-Louis Lions, The concentration-compactness principle in the calculus of variations. The limit case. II, Rev. Mat. Iberoamericana (1985), no. 2, 45–121.
27. Matei Machedon and Jacob Sterbenz, Almost optimal local well-posedness for the (3+1)-dimensional Maxwell-Klein-Gordon equations, J. Amer. Math. Soc. (2004), no. 2, 297–359 (electronic).
28. Frank Merle and Luis Vega, Compactness at blow-up time for $L^2$ solutions of the critical nonlinear Schrödinger equation in 2D, Internat. Math. Res. Notices (1998), no. 8, 399–425.
29. G. Métivier and S. Schochet, Trilinear resonant interactions of semilinear hyperbolic waves, Duke Math. J. (1998), no. 2, 241–304.
30. Sung-Jin Oh, Gauge choice for the Yang-Mills equations using the Yang-Mills heat flow and local well-posedness in $H^1$, J. Hyperbolic Differ. Equ. (2014), no. 1, 1–108.
31. Sung-Jin Oh, Finite energy global well-posedness of the Yang–Mills equations on $\mathbb{R}^{1+3}$: An approach using the Yang–Mills heat flow, Duke Math. J. (2015), no. 9, 1669–1732.
32. Sung-Jin Oh and Daniel Tataru, Energy dispersed solutions for the (4+1)-dimensional Maxwell-Klein-Gordon equation, arXiv:1503.01561.
33. Sung-Jin Oh and Daniel Tataru, Global well-posedness and scattering of the (4+1)-dimensional Maxwell-Klein-Gordon equation, arXiv:1503.01562.
34. Sung-Jin Oh and Daniel Tataru, Local well-posedness of the (4+1)-dimensional Maxwell-Klein-Gordon equation at energy regularity, arXiv:1503.01560.
35. Igor Rodnianski and Terence Tao, Global regularity for the Maxwell-Klein-Gordon equation with small critical Sobolev norm in high dimensions, Comm. Math. Phys. (2004), no. 2, 377–426.
36. Sigmund Selberg, Almost optimal local well-posedness of the Maxwell-Klein-Gordon equations in 1+4 dimensions, Comm. Partial Differential Equations (2002), no. 5-6, 1183–1227.
37. Sigmund Selberg and Achenef Tesfahun, Finite-energy global well-posedness of the Maxwell-Klein-Gordon system in Lorenz gauge, Comm. Partial Differential Equations (2010), no. 6, 1029–1057.
38. Jalal Shatah and Michael Struwe, Regularity results for nonlinear wave equations, Ann. of Math. (2) (1993), no. 3, 503–518.
39. Jacob Sterbenz, Global regularity and scattering for general non-linear wave equations. II. (4+1) dimensional Yang-Mills equations in the Lorentz gauge, Amer. J. Math. (2007), no. 3, 611–664.
40. Jacob Sterbenz and Daniel Tataru, Energy dispersed large data wave maps in 2+1 dimensions, Comm. Math. Phys. (2010), no. 1, 139–230.
41. Jacob Sterbenz and Daniel Tataru, Regularity of wave-maps in dimension 2+1, Comm. Math. Phys. (2010), no. 1, 231–264.
42. Michael Struwe, Equivariant wave maps in two space dimensions, Comm. Pure Appl. Math. (2003), no. 7, 815–823.
43. Michael Struwe, Variational methods, fourth ed., Ergebnisse der Mathematik und ihrer Grenzgebiete. 3. Folge, vol. 34, Springer-Verlag, Berlin, 2008.
44. Jérémie Szeftel, Parametrix for wave equations on a rough background I: regularity of the phase at initial time, arXiv:1204.1768.
45. Jérémie Szeftel, Parametrix for wave equations on a rough background II: construction and control at initial time, arXiv:1204.1769.
46. Jérémie Szeftel, Parametrix for wave equations on a rough background III: space-time regularity of the phase, arXiv:1204.1770.
47. Jérémie Szeftel, Parametrix for wave equations on a rough background IV: control of the error term, arXiv:1204.1771.
48. Jérémie Szeftel, Sharp Strichartz estimates for the wave equation on a rough background, arXiv:1301.0112.
49. Terence Tao, Global regularity of wave maps. III–VII


50. Terence Tao, Global regularity of wave maps. I. Small critical Sobolev norm in high dimension, Internat. Math. Res. Notices (2001), no. 6, 299–328.
51. Terence Tao, Global regularity of wave maps. II. Small energy in two dimensions, Comm. Math. Phys. (2001), no. 2, 443–544.
52. Daniel Tataru, On global existence and scattering for the wave maps equation, Amer. J. Math. (2001), no. 1, 37–77.
53. Daniel Tataru, Rough solutions for the wave maps equation, Amer. J. Math. (2005), no. 2, 293–377.

Bâtiment des Mathématiques, EPFL, Station 8, 1015 Lausanne, Switzerland

Departement Mathematik, ETH Zürich, 8092 Zürich, Switzerland