Inflation after Planck: and the winners are
Jérôme Martin
Institut d'Astrophysique de Paris, UMR 7095-CNRS, Université Pierre et Marie Curie, 98bis boulevard Arago, 75014 Paris (France)
We review the constraints that the recently released Cosmic Microwave Background (CMB) Planck data put on inflation and we argue that single field slow-roll inflationary scenarios (with minimal kinetic term) are favored. Then, within this class of models, by means of Bayesian inference, we show how one can rank the scenarios according to their performances, leading to the identification of "the best models of inflation".
1 Introduction

The theory of inflation 1,2,3,4,5,6 is currently the favorite paradigm for describing the physical conditions that prevailed in the very early Universe. The recently released Planck 2013 data 7,8 are in good agreement with its main predictions. The Universe is found to be spatially flat, Ω_K = −0.0005 (+0.0065, −0.0066) at 95% CL, which is of course very consistent with inflation, and the cosmological fluctuations are adiabatic (at 95% CL) and Gaussian, f_NL^loc = 2.7 ± 5.8, f_NL^eq = −42 ± 75 and f_NL^ortho = −25 ± 39 7. Another important message of the Planck data 8 is the fact that a tilt in the power spectrum has now been detected at a significant statistical level, n_S = 0.9603 ± 0.0073, that is to say a deviation from exact scale invariance at more than 5σ. In addition, neither a significant running nor a significant running of the running have been detected, since it was found that dn_S/d ln k = −0.013 ± 0.009 (Planck+WP) and d²n_S/d(ln k)² = 0.02 ± 0.016 (WMAP+WP), with a pivot scale chosen at k_* = 0.05 Mpc⁻¹.

Based on the above discussion, it is clear that single field slow-roll models (with a minimal kinetic term) are favored from an observational point of view, since this class of models precisely predicts no entropy perturbations and negligible non-Gaussianities. Of course, this does not mean that other inflationary scenarios are ruled out, but simply that they are not needed to explain the data. Inflation therefore appears as a simple and non-trivial, but non-exotic, theory. It should however be clear that, even if we restrict our considerations to this simple class of models, there still remains a very large number of possible models 9. Then come the questions of how one can constrain these models, estimate their performances and rank them, in a statistically well-defined fashion, in order to find "the best model(s) of inflation". Once a well-justified method has been designed, it can be applied to all inflationary models in order to actually identify which scenario is favored by the Planck data. Answering and discussing these questions is the main subject of the present paper.

This article is organized as follows. In the next section, Sec. 2, we briefly review slow-roll inflation. Then, in Sec. 3, we define and discuss what is meant by "a model A is better than a model B". For this purpose, we review the Bayesian model comparison approach, we quickly recall how the Bayesian evidence of a slow-roll inflationary model can be estimated, and we present the results of Ref. 10, which give the model winners. Finally, in the conclusion, Sec. 4, we summarize our results.

2 Slow-roll inflation

Slow-roll inflation is a very simple system.
It consists in one scalar field with a minimal kinetic term and a potential V(φ), and its behavior is controlled by the Friedmann-Lemaître and Klein-Gordon equations, namely

    H^2 = \frac{1}{3 M_\mathrm{Pl}^2} \left[ \frac{\dot{\phi}^2}{2} + V(\phi) \right], \qquad \ddot{\phi} + 3 H \dot{\phi} + V_\phi = 0,    (1)

where H ≡ ȧ/a denotes the Hubble parameter, a(t) being the Friedmann-Lemaître-Robertson-Walker (FLRW) scale factor and ȧ its derivative with respect to cosmic time t. M_Pl ≡ (8πG)^{−1/2} denotes the reduced Planck mass. A subscript φ means a derivative with respect to the inflaton field. Therefore, the only unknown function is the potential and, here, we try to constrain its shape using the Planck data.

When the potential is no longer flat enough (this usually happens when the system approaches its ground state, i.e. the minimum of the potential), inflation stops, the inflaton field decays 11,
12, the decay products thermalize 13, and this is how inflation is smoothly connected to the standard hot Big Bang phase. Let ρ and P be the energy density and pressure of the effective fluid dominating the Universe during reheating and w_reh ≡ P/ρ the corresponding "instantaneous" equation of state. One can also define the mean equation of state parameter, w̄_reh, by 14

    \bar{w}_\mathrm{reh} \equiv \frac{1}{\Delta N} \int_{N_\mathrm{end}}^{N_\mathrm{reh}} w_\mathrm{reh}(n)\, \mathrm{d}n,    (2)

where ΔN ≡ N_reh − N_end is the total number of e-folds during reheating, N_end being the number of e-folds at the end of inflation and N_reh the number of e-folds at which reheating is completed and the radiation dominated era begins. Then, one introduces a new parameter 14

    \ln R_\mathrm{rad} \equiv \frac{\Delta N}{4} \left( -1 + 3 \bar{w}_\mathrm{reh} \right).    (3)

As discussed in detail in Ref. 14, this parameter completely characterizes the reheating phase and its knowledge is necessary in order to work out the inflationary predictions for the CMB. In particular, it can be related to the so-called reheating temperature through 14

    T_\mathrm{reh} = \left( \frac{30\, \rho_\mathrm{end}}{\pi^2 g_\star} \right)^{1/4} R_\mathrm{rad}^{3 (1 + \bar{w}_\mathrm{reh}) / (1 - 3 \bar{w}_\mathrm{reh})},    (4)

where ρ_end is the energy density at the end of inflation, which is known when V(φ) has been chosen, and g_⋆ is the number of degrees of freedom at that time.

Let us now turn to the description of inflationary perturbations. Two types of fluctuations are relevant for inflation: density perturbations and primordial gravity waves. The density perturbations are described in terms of the Mukhanov-Sasaki variable v(η, x). In the Schrödinger approach, the quantum state of the system is described by a wavefunctional, Ψ[v(η, x)], which can be factorized into mode components as 15

    \Psi\left[ v(\eta, \mathbf{x}) \right] = \prod_{\mathbf{k}} \Psi_{\mathbf{k}} \left( v_{\mathbf{k}}^\mathrm{R}, v_{\mathbf{k}}^\mathrm{I} \right) = \prod_{\mathbf{k}} \Psi_{\mathbf{k}}^\mathrm{R} \left( v_{\mathbf{k}}^\mathrm{R} \right) \Psi_{\mathbf{k}}^\mathrm{I} \left( v_{\mathbf{k}}^\mathrm{I} \right),    (5)

where v_k^R denotes the real part of v_k and v_k^I its imaginary part. Each wavefunction obeys a Schrödinger equation with a Hamiltonian that can be deduced from a second order expansion of the action "gravity + inflaton field".
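As a brief aside, the reheating bookkeeping of Eqs. (2)-(4) can be sketched numerically. The snippet below is a toy illustration only: the values of ΔN, w̄_reh, ρ_end and g_⋆ are arbitrary placeholders (ρ_end expressed in reduced Planck units), not numbers taken from any specific model.

```python
import numpy as np

def ln_R_rad(delta_N, wbar):
    """ln R_rad = (Delta N / 4)(-1 + 3 wbar_reh), Eq. (3)."""
    return 0.25 * delta_N * (-1.0 + 3.0 * wbar)

def T_reh(rho_end, g_star, delta_N, wbar):
    """Reheating temperature of Eq. (4); valid for wbar != 1/3.

    rho_end is in reduced Planck units, so T_reh comes out in units of M_Pl.
    """
    R_rad = np.exp(ln_R_rad(delta_N, wbar))
    prefactor = (30.0 * rho_end / (np.pi**2 * g_star)) ** 0.25
    exponent = 3.0 * (1.0 + wbar) / (1.0 - 3.0 * wbar)
    return prefactor * R_rad**exponent

# Toy numbers: matter-like reheating (wbar = 0) lasting Delta N = 10 e-folds
print(T_reh(rho_end=1e-16, g_star=100.0, delta_N=10.0, wbar=0.0))
```

Note that for w̄_reh = 1/3 the exponent is singular: in that limit ln R_rad = 0 and a radiation-like reheating phase cannot be distinguished from the subsequent radiation era. For a matter-like reheating (w̄_reh = 0), a longer reheating phase lowers T_reh, as expected.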
Then, one can show that the solution is explicitly time-dependent and given by a Gaussian (η being the conformal time)

    \Psi_{\mathbf{k}}^{\mathrm{R},\mathrm{I}} \left( \eta, v_{\mathbf{k}}^{\mathrm{R},\mathrm{I}} \right) = N_{\mathbf{k}}(\eta)\, \mathrm{e}^{-\Omega_{\mathbf{k}}(\eta) \left( v_{\mathbf{k}}^{\mathrm{R},\mathrm{I}} \right)^2},    (6)

where the functions N_k(η) and Ω_k(η) can be expressed as 15

    \left| N_{\mathbf{k}} \right| = \left( \frac{2\, \Re\mathrm{e}\, \Omega_{\mathbf{k}}}{\pi} \right)^{1/4}, \qquad \Omega_{\mathbf{k}} = -\frac{i}{2} \frac{f_k'}{f_k}.    (7)

The function f_k obeys the equation of motion of a parametric oscillator, namely f_k'' + ω² f_k = 0, where the time-dependent frequency of this oscillator is given by ω²(η, k) = k² − (a√ε₁)''/(a√ε₁), k being the wavenumber of the mode under consideration and ε₁ ≡ −Ḣ/H² the first slow-roll parameter characterizing the cosmological expansion during inflation. For gravitational waves, one also obtains a Gaussian wavefunction, except that the fundamental frequency of the oscillator f_k is now given by ω² = k² − a''/a.

One of the great advantages of inflation is that it is possible to choose well-justified initial conditions. In brief, this is because, at the beginning of inflation, the physical wavelengths of Fourier modes of cosmological relevance today are much smaller than the Hubble radius. These modes do not feel spacetime expansion and, as a consequence, it is natural to choose the vacuum state as their initial state. Technically, this amounts to taking Ω_k = k/2 initially.

The evolution during inflation is conveniently described by the hierarchy of slow-roll parameters, defined by

    \epsilon_{n+1} \equiv \frac{\mathrm{d} \ln |\epsilon_n|}{\mathrm{d} N}, \qquad n \geq 0,    (8)

where ε₀ ≡ H_ini/H. The slow-roll conditions refer to a situation where all the ε_n's satisfy ε_n ≪
1. From this definition, we see that ω²(k, η) for density perturbations depends on ε₁, ε₂ and ε₃ while, for gravity waves, it only depends on ε₁. Notice that, since H(φ) and V(φ) are related through the Einstein equations, the parameters ε_n can also be expressed in terms of the successive derivatives of the potential, namely

    \epsilon_1 \simeq \frac{M_\mathrm{Pl}^2}{2} \left( \frac{V_\phi}{V} \right)^2,    (9)

    \epsilon_2 \simeq 2 M_\mathrm{Pl}^2 \left[ \left( \frac{V_\phi}{V} \right)^2 - \frac{V_{\phi\phi}}{V} \right],    (10)

    \epsilon_2 \epsilon_3 \simeq 2 M_\mathrm{Pl}^4 \left[ \frac{V_{\phi\phi\phi} V_\phi}{V^2} - 3 \frac{V_{\phi\phi}}{V} \left( \frac{V_\phi}{V} \right)^2 + 2 \left( \frac{V_\phi}{V} \right)^4 \right].    (11)

The slow-roll approximation also allows us to solve the equation that controls the evolution of the function f_k and, therefore, of the wavefunction. Since the initial conditions are also completely specified (see the above discussion), the function f_k and, hence, the wavefunction, is completely known. One can then calculate the two-point correlation function of the Mukhanov-Sasaki variable or, in Fourier space, the power spectrum^a. This involves a double expansion. The power spectrum is first expanded around a chosen pivot scale k_* such that

    \frac{\mathcal{P}(k)}{\mathcal{P}_0} = a_0 + a_1 \ln \left( \frac{k}{k_*} \right) + \frac{a_2}{2} \ln^2 \left( \frac{k}{k_*} \right) + \dots,    (13)

where, for density perturbations, \mathcal{P}_{\zeta 0} = H^2 / \left( 8 \pi^2 \epsilon_1 M_\mathrm{Pl}^2 \right) and, then, the coefficients a_n are expanded in terms of the slow-roll parameters. Concretely, for scalar perturbations, at second order in the slow-roll approximation, one obtains 16,17

    a_0 = 1 - 2 (C + 1) \epsilon_{1*} - C \epsilon_{2*} + \left( 2 C^2 + 2 C + \frac{\pi^2}{2} - 5 \right) \epsilon_{1*}^2 + \left( C^2 - C + \frac{7 \pi^2}{12} - 7 \right) \epsilon_{1*} \epsilon_{2*} + \left( \frac{C^2}{2} + \frac{\pi^2}{8} - 1 \right) \epsilon_{2*}^2 + \left( -\frac{C^2}{2} + \frac{\pi^2}{24} \right) \epsilon_{2*} \epsilon_{3*},    (14)

    a_1 = -2 \epsilon_{1*} - \epsilon_{2*} + 2 (2 C + 1) \epsilon_{1*}^2 + (2 C - 1) \epsilon_{1*} \epsilon_{2*} + C \epsilon_{2*}^2 - C \epsilon_{2*} \epsilon_{3*},    (15)

    a_2 = 4 \epsilon_{1*}^2 + 2 \epsilon_{1*} \epsilon_{2*} + \epsilon_{2*}^2 - \epsilon_{2*} \epsilon_{3*},    (16)

where C ≡ γ_E + ln 2 − 2 ≈ −0.7296, γ_E being the Euler constant. ε_{n*} denotes the value of the function ε_n at Hubble radius crossing during inflation. For gravitational waves, the power spectrum has the same structure but the expressions of the coefficients a_n differ.
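As an illustration of Eqs. (9) and (10), the snippet below evaluates the first two slow-roll parameters for a large-field monomial potential V(φ) ∝ φ^p, a standard textbook example (not tied to any specific model discussed here), for which ε₁ = (p²/2)(M_Pl/φ)² and ε₂ = 2p (M_Pl/φ)²:

```python
M_PL = 1.0  # reduced Planck mass, set to one

def eps1(V, V_phi, phi):
    """First slow-roll parameter, Eq. (9): eps1 ~ (M_Pl^2/2)(V_phi/V)^2."""
    return 0.5 * M_PL**2 * (V_phi(phi) / V(phi)) ** 2

def eps2(V, V_phi, V_phiphi, phi):
    """Second slow-roll parameter, Eq. (10): eps2 ~ 2 M_Pl^2 [(V_phi/V)^2 - V_phiphi/V]."""
    return 2.0 * M_PL**2 * ((V_phi(phi) / V(phi)) ** 2 - V_phiphi(phi) / V(phi))

# Large-field potential V ~ phi^p with p = 2 (the overall scale drops out of eps_n)
p = 2
V        = lambda phi: phi**p
V_phi    = lambda phi: p * phi**(p - 1)
V_phiphi = lambda phi: p * (p - 1) * phi**(p - 2)

phi = 15.0  # field value in Planck units
print(eps1(V, V_phi, phi), eps2(V, V_phi, V_phiphi, phi))
# eps1 = p^2/(2 phi^2) = 2/225, eps2 = 2p/phi^2 = 4/225: both << 1, deep in slow roll
```

For this potential, ε₁ reaches unity (the end of inflation) at φ = p/√2 in Planck units, consistent with the statement above that inflation stops when the potential is no longer flat enough.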
^a For density perturbations, the definition of the power spectrum reads

    \mathcal{P}_\zeta(k) \equiv \frac{k^3}{4 \pi^2 M_\mathrm{Pl}^2} \left| \frac{v_k}{a \sqrt{\epsilon_1}} \right|^2.    (12)

In order to make concrete predictions, we must calculate the numerical values of the quantities ε_{n*}. In order to do so, one needs to know the slow-roll trajectory and to calculate accurately when inflation stops. As a result, ε_{n*} usually depends on θ_inf, the parameters of the potential V(φ), and on the reheating temperature: ε_{n*} = ε_{n*}(θ_inf, T_reh).

The above considerations explain how the CMB can tell us something about inflation. Indeed, CMB measurements constrain the power spectrum, that is to say, given the expression of P(k) above, the values of the parameters ε_{n*}(θ_inf, T_reh). These parameters carry information about the shape of the potential (recall the expression of the slow-roll parameters in terms of the derivatives of the potential) and about the reheating temperature. As a consequence, one can infer the properties of the inflaton potential V(φ) and learn about the physical conditions that prevailed in the early universe.

3 Bayesian model comparison

In the previous section, we have described how one can calculate the predictions of a given inflationary model. However, we would also like to compare the performances of the different inflationary scenarios, and one way to achieve this program is to compare the quality of the fits provided by the different models.

Let us now briefly describe how this can be achieved 18,19,
20. Let us call M₁ and M₂ two competing models aiming at explaining some data D (here, of course, we have in mind the Cosmic Microwave Background measurements), model one depending on one parameter, θ, and model two depending on two parameters, α and β. Their likelihood functions can be written as

    \mathcal{L}_1(D|\theta) = \mathcal{L}_{1,\mathrm{max}}\, \mathrm{e}^{-\chi_1^2(\theta)/2}, \qquad \mathcal{L}_2(D|\alpha, \beta) = \mathcal{L}_{2,\mathrm{max}}\, \mathrm{e}^{-\chi_2^2(\alpha, \beta)/2},    (17)

where χ² is the effective chi-squared of the corresponding model, which we do not need to specify at this stage. The quality of the fits can be estimated by computing the ratio of the maxima of the two likelihoods. However, this does not give us information regarding the complexity of the two models^b. If, for instance, model M₂ achieves a very good fit only at the price of a fine-tuning, while M₁ "naturally" performs well, one may wish to penalize M₂ for its complexity. This "Occam's razor" criterion is automatically included if one characterizes a model by its Bayesian evidence 19. The Bayesian evidence is the integral of the likelihood function over the prior space. Concretely, for M₁ and M₂, this leads to

    E_1 = \int \mathcal{L}_1(D|\theta)\, \pi(\theta)\, \mathrm{d}\theta, \qquad E_2 = \int \mathcal{L}_2(D|\alpha, \beta)\, \pi(\alpha, \beta)\, \mathrm{d}\alpha\, \mathrm{d}\beta.    (18)

The prior distributions π(θ) and π(α, β), satisfying ∫π(θ) dθ = 1 [and a similar expression for π(α, β)], encode what we know about the parameter θ before our information is updated when we learn about the data D. Let us notice that the likelihood functions are not normalized, in the sense that ∫L(D|θ) dθ ≠ 1. For simplicity, let us now assume that the prior π(θ) is flat in the range [θ_min, θ_max] and vanishes elsewhere. Because the distribution is normalized, one has π(θ) = 1/Δθ with Δθ = θ_max − θ_min. Let us also assume that the likelihood function has a bell shape (for instance, but not necessarily, a Gaussian function) characterized by the width δθ.
Let us finally suppose that the data give more information than the prior, in other words that the likelihood is more peaked than the prior. In that case, the Bayesian evidence of model M₁ can be approximated by

    E_1 \simeq \mathcal{L}_{1,\mathrm{max}}\, \frac{\delta\theta}{\Delta\theta}.    (19)

^b In the following, we will introduce a quantity called the "Bayesian complexity". Here, we use the word "complexity" in the standard sense, i.e. a model is more complicated than another if, for instance, it has more parameters or more fine-tuning. At this stage, it should not be confused with the Bayesian complexity.

In the same fashion, with the same assumptions (and obvious notations), the evidence of model M₂ can be expressed as

    E_2 \simeq \mathcal{L}_{2,\mathrm{max}}\, \frac{\delta\alpha}{\Delta\alpha}\, \frac{\delta\beta}{\Delta\beta}.    (20)

Then, applying Bayes' theorem, the probability of model M₁ is given by p(M₁|D) = E₁ π(M₁)/p(D), and a similar formula holds for p(M₂|D). In this expression, π(M₁) represents the prior of model M₁ and the quantity p(D) is a normalization factor. If we say that, initially, the two models are equally probable, that is to say π(M₁) = π(M₂), then the ratio of their posterior probabilities, the so-called Bayes factor, can be expressed as

    B_{21} \equiv \frac{p(M_2|D)}{p(M_1|D)} = \frac{E_2}{E_1} = \frac{\mathcal{L}_{2,\mathrm{max}}}{\mathcal{L}_{1,\mathrm{max}}}\, \frac{\delta\alpha}{\Delta\alpha}\, \frac{\delta\beta}{\Delta\beta}\, \frac{\Delta\theta}{\delta\theta}.    (21)

We see that the Bayes factor is controlled by the ratio L_{2,max}/L_{1,max}, but now weighted by a factor, the so-called Occam factor, which penalizes the more complicated model, M₂, for any wasted parameter space. If, for instance, we take δα/Δα = δβ/Δβ = δθ/Δθ = 0.01, then B₂₁ = 0.01 L_{2,max}/L_{1,max}, and the more complicated model can win only if its likelihood at the "best fit point" is two orders of magnitude larger than that of M₁. So the best model is the model which achieves the best compromise between simplicity and quality of the fit.

From the previous considerations, we see that the Bayesian evidence is an ideal tool to rank models and to find the best model. Nevertheless, it has the following property, which could be considered as a shortcoming. Suppose we define a model M₃ such that it is in fact model M₂ but with a third parameter, say γ, such that this new parameter does not affect in any way the fit to the data; in other words, such that the likelihood is flat along γ. In that case, the evidence of model M₃ is given by

    E_3 = \int \mathcal{L}(D|\alpha, \beta, \gamma)\, \pi(\alpha)\, \pi(\beta)\, \pi(\gamma)\, \mathrm{d}\alpha\, \mathrm{d}\beta\, \mathrm{d}\gamma = \int \mathcal{L}(D|\alpha, \beta)\, \pi(\alpha)\, \pi(\beta)\, \pi(\gamma)\, \mathrm{d}\alpha\, \mathrm{d}\beta\, \mathrm{d}\gamma    (22)

    = \int \mathcal{L}(D|\alpha, \beta)\, \pi(\alpha)\, \pi(\beta)\, \mathrm{d}\alpha\, \mathrm{d}\beta \int \pi(\gamma)\, \mathrm{d}\gamma = E_2.    (23)

Therefore, the two models have the same evidence, despite the fact that M₂ is obviously simpler than M₃. In order to break this degeneracy, one has to introduce another quantity, the Bayesian complexity 18, which allows us to distinguish M₂ and M₃.

In order to discuss the definition of the complexity, we work with a one-parameter model only, i.e. M₁ (the generalization to an arbitrary number of parameters is straightforward), and we explicitly assume that the likelihood of the model is a Gaussian, namely

    \mathcal{L}(D|\theta) = \mathcal{L}_{1,\mathrm{max}}\, \mathrm{e}^{-(\theta - d)^2/(2 \sigma^2)},    (24)

where d represents a measurement of the parameter θ. Regarding the prior, instead of considering a flat distribution as before, we now assume that it is given by a Gaussian centered at θ = μ,

    \pi(\theta) = \frac{1}{\Sigma \sqrt{2\pi}}\, \mathrm{e}^{-(\theta - \mu)^2/(2 \Sigma^2)}.    (25)

We can check that this distribution is properly normalized.
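With the Gaussian likelihood of Eq. (24) and the Gaussian prior of Eq. (25), the evidence integral of Eq. (18) can be done in closed form, as shown below. The following sketch cross-checks the closed form of Eq. (29) against a brute-force quadrature, and also evaluates the Bayesian complexity of Eq. (32); the values of d, σ, μ and Σ are arbitrary toy numbers:

```python
import numpy as np

def evidence_numerical(d, sigma, mu, Sigma, L_max=1.0, n=200001, half_width=50.0):
    """Brute-force quadrature of E1 = integral of L(D|theta) pi(theta) dtheta."""
    theta = np.linspace(mu - half_width, mu + half_width, n)
    L = L_max * np.exp(-((theta - d) ** 2) / (2.0 * sigma**2))
    prior = np.exp(-((theta - mu) ** 2) / (2.0 * Sigma**2)) / (Sigma * np.sqrt(2.0 * np.pi))
    f = L * prior
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(theta)))  # trapezoid rule

def evidence_analytic(d, sigma, mu, Sigma, L_max=1.0):
    """Closed form, Eq. (29): E1 = L_max sigma/sqrt(sigma^2+Sigma^2) exp(-(mu-d)^2/[2(sigma^2+Sigma^2)])."""
    return (L_max * sigma / np.sqrt(sigma**2 + Sigma**2)
            * np.exp(-((mu - d) ** 2) / (2.0 * (sigma**2 + Sigma**2))))

def complexity(sigma, Sigma):
    """Bayesian complexity for this Gaussian case, Eq. (32): C_b = 1/(1 + sigma^2/Sigma^2)."""
    return 1.0 / (1.0 + sigma**2 / Sigma**2)

# Toy values with the data more informative than the prior (sigma << Sigma)
d, sigma, mu, Sigma = 0.3, 0.1, 0.0, 2.0
print(evidence_numerical(d, sigma, mu, Sigma), evidence_analytic(d, sigma, mu, Sigma))
print(complexity(sigma, Sigma))  # close to 1: the parameter is well measured
```

Since σ/Σ = 0.05 here, the evidence is suppressed relative to L_max by roughly the Occam factor σ/Σ, illustrating the penalty for wasted prior volume discussed around Eq. (21).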
These new assumptions are made for convenience only and do not change the above discussion (in fact, not quite exactly, see below). In particular, now, δθ is clearly given by σ and Δθ by Σ, so that the condition that the data are more informative than the prior, δθ ≪ Δθ, corresponds to σ ≪ Σ. Then one can calculate the posterior distribution of the parameter θ,

    p(\theta|D) = \frac{1}{E_1}\, \mathcal{L}(D|\theta)\, \pi(\theta)    (26)

    = \frac{1}{\sqrt{2\pi}} \sqrt{\frac{1}{\Sigma^2} + \frac{1}{\sigma^2}}\, \exp\left[ -\frac{1}{2} \left( \frac{1}{\Sigma^2} + \frac{1}{\sigma^2} \right) \left( \theta - \frac{d + \mu\, \sigma^2/\Sigma^2}{1 + \sigma^2/\Sigma^2} \right)^2 \right],    (27)

which is a properly normalized Gaussian with mean and standard deviation respectively given by

    \frac{d + \mu\, \sigma^2/\Sigma^2}{1 + \sigma^2/\Sigma^2}, \qquad \frac{\sigma}{\sqrt{1 + \sigma^2/\Sigma^2}}.    (28)

On the other hand, the evidence of the model can be expressed as

    E_1 = \mathcal{L}_{1,\mathrm{max}}\, \frac{\sigma}{\sqrt{\sigma^2 + \Sigma^2}}\, \mathrm{e}^{-(\mu - d)^2/[2 (\sigma^2 + \Sigma^2)]}.    (29)

This result is compatible with the previous discussion. Indeed, if the likelihood is more informative than the prior, then Σ/σ ≫ 1 and E₁ ∼ L_{1,max} σ/Σ, which is equivalent to L_{1,max} δθ/Δθ and shows that the Occam factor is simply σ/Σ.

We now come to the definition of the Bayesian complexity, denoted by C_b in what follows. It reads 18

    C_\mathrm{b} = \left\langle \chi^2(\theta) \right\rangle - \chi^2\left( \langle \theta \rangle \right),    (30)

where the symbol ⟨···⟩ means an average with a weight given by the posterior p(θ|D). In the above expression, the effective χ² is defined by −2 ln L, which, in the present case, reads

    \chi^2(\theta) = \frac{(\theta - d)^2}{\sigma^2} - 2 \ln \mathcal{L}_{1,\mathrm{max}}.    (31)

Then, using the explicit expression for the posterior distribution, see Eq. (26), and the previous expression for the χ², one obtains the following formula for the Bayesian complexity

    C_\mathrm{b} = \int p(\theta|D)\, \chi^2(\theta)\, \mathrm{d}\theta - \chi^2\left[ \int p(\theta|D)\, \theta\, \mathrm{d}\theta \right] = \frac{1}{1 + \sigma^2/\Sigma^2}.    (32)

Therefore, if σ ≪ Σ, one has C_b ≃ 1. In other words, since the likelihood function is much more peaked than the prior, the parameter θ is well measured and the complexity is one. If, on the contrary, σ ≫ Σ, then C_b ≃ 0 and the data do not allow us to measure the parameter θ. In the multidimensional case (i.e. a model with n parameters), one has C_b = Σ_{i=1}^{n} 1/(1 + σ_i²/Σ_i²), and the complexity gives the number of parameters that have been measured with the data D or, in other words, the number of eigendirections in which the likelihood is more informative than the prior.

Finally, to conclude this section, let us try to derive the complexity for another very simple one-parameter model, similar to the example we treated at the beginning of this article. This will help us to understand the meaning of complexity in another context 21. We assume that the likelihood is flat, centered at θ = 0, with a width given by δθ and a height L_max. We also assume that the prior is flat in the range [−Δθ/2, Δθ/2] and has height 1/Δθ (and is less informative than the likelihood). In that case, it is straightforward to estimate the evidence of the model, which is E = L_max δθ/Δθ. On the other hand, the posterior on the parameter θ can be expressed as

    p(\theta|D) = \frac{\mathcal{L}_\mathrm{max}}{\Delta\theta\, E} = \frac{1}{\delta\theta}, \qquad \text{for} \quad -\frac{\delta\theta}{2} < \theta < \frac{\delta\theta}{2},    (33)

and vanishes otherwise. As a consequence, one finds that the complexity can be written as

    C_\mathrm{b} = -\int_{-\delta\theta/2}^{\delta\theta/2} \frac{2}{\delta\theta} \ln \mathcal{L}\, \mathrm{d}\theta + 2 \ln \mathcal{L}_\mathrm{max} = 0.    (34)

We see that one can no longer interpret the complexity as we did before. The reason is that the model we have used is too far from a Gaussian model, and the concept of complexity cannot really be defined in that case. This illustrates the limitation of this statistical tool, which is efficient only if the underlying statistics is not too far from Gaussian. This is a warning that should be kept in mind in the following.

|ln B_{i,REF}|     Odds        Strength of evidence
< 1.0              < 3:1       Inconclusive
1.0 - 2.5          ~ 3:1       Weak evidence
2.5 - 5.0          ~ 12:1      Moderate evidence
> 5.0              ~ 150:1     Strong evidence
Table 1: Jeffreys scale for evaluating the strength of evidence when comparing two models, M_i versus a reference model M_REF.

Following the above considerations, it should now be clear that one way to estimate the performances of inflationary models (in explaining the recently released Planck data) is to calculate their evidence and their complexity. Then, one can rank them in a statistically consistent way and find the best scenarios. The predictions of all single field scenarios have been worked out and compared to Planck data in Encyclopædia Inflationaris 9. For each model M_i, one computes the Bayes factor

    B_{i,\mathrm{REF}} \equiv \frac{E(D|\mathcal{M}_i)}{E(D|\mathcal{M}_\mathrm{REF})},    (35)

where the reference model was taken to be the Starobinsky model. The "Jeffreys scale", see Table 1, gives an empirical prescription for translating the values of B_{i,REF} into strengths of belief.

One can summarize our results as follows. Firstly, for convenience, one can change the reference point of the Bayes factor and estimate the quantity B_{i,BEST} ≡ E(D|M_i)/E(D|M_BEST) (rather than B_{i,REF} as before) with non-committal model priors. Then, one uses the Jeffreys scale with B_{i,BEST}, instead of B_{i,REF}, and counts the number of models in the "inconclusive", "weak evidence", "moderate evidence" and "strong evidence" zones. The models in the "inconclusive" category can be viewed as the best models. We have found that this is the case for 52 models out of a total of 193 models, that is to say 26% of the models. Therefore, this means that ≃ 73% of the inflationary scenarios can now be considered as disfavored and/or ruled out by the Planck data.

Secondly, one determines the number of unconstrained parameters, N_uc^i, which is the number of parameters of model M_i, N_param^i, minus its complexity C_b^i:

    N_\mathrm{uc}^i = N_\mathrm{param}^i - C_\mathrm{b}^i.    (36)

Then, among the models in the "inconclusive" region, one should prefer models for which N_uc^i ≃ 0. If one retains the criterion 0 < N_uc^i < 1, then one reduces the number of "good models" to 17, that is to say ≃ 9% of the Encyclopædia Inflationaris scenarios.

These results are summarized in Fig. 1, which shows the histogram corresponding to the number of models in each Jeffreys category with a given value of N_uc^i. A complete analysis and the list of the best models can be found in Ref. 10.

Figure 1 – Histogram representing the number of inflationary models after Planck 2013 according to the Jeffreys category and the number of unconstrained parameters.

4 Conclusions

In these proceedings, we have analyzed the implications of the recently released Planck data for inflation. We have argued that single field slow-roll scenarios with a minimal kinetic term are favored by Planck 2013. Then, we have designed specific Bayesian tools to further constrain the models within the class of favored scenarios. We have shown that Planck 2013 can then single out about ∼ 10% of the models, thus strongly reducing the inflationary landscape compatible with the astrophysical observations. Our results demonstrate concretely that CMB data can constrain the physics of the early universe in an efficient way. In the near future, the next release of Planck measurements should allow us to learn even more about inflation.
Acknowledgments
I would like to thank C. Ringeval and V. Vennin for careful reading of the manuscript.
References
1. A. Starobinsky, Phys. Lett. B 91, 99 (1980).
2. A. Guth, Phys. Rev. D 23, 347 (1981).
3. V. Mukhanov and G. Chibisov, JETP Lett. 33, 532 (1981).
4. A. Linde, Phys. Lett. B 108, 389 (1982).
5. A. Starobinsky, Phys. Lett. B 117, 175 (1982).
6. J. Martin, Lect. Notes Phys. 738, 193 (2008).
7. P. Ade et al., arXiv:1303.5084.
8. P. Ade et al., arXiv:1303.5076.
9. J. Martin, C. Ringeval and V. Vennin, arXiv:1303.3787.
10. J. Martin, C. Ringeval, R. Trotta and V. Vennin, arXiv:1312.3529.
11. M. Turner, Phys. Rev. D 28, 1243 (1983).
12. L. Kofman, A. Linde and A. Starobinsky, Phys. Rev. D 56, 3258 (1997).
13. D. Podolsky, G. Felder, L. Kofman and M. Peloso, Phys. Rev. D 73, 023501 (2006).
14. J. Martin and C. Ringeval, Phys. Rev. D 82, 023511 (2010).
15. J. Martin, V. Vennin and P. Peter, Phys. Rev. D 86, 103524 (2012).
16. D. Schwarz and C. Terrero-Escalante, JCAP 0408, 003 (2004).
17. J. Martin, C. Ringeval and V. Vennin, JCAP 1306, 021 (2013).
18. M. Kunz, R. Trotta and D. Parkinson, Phys. Rev. D 74, 023503 (2006).
19. R. Trotta, Contemp. Phys. 49, 71 (2008).
20. J. Martin, C. Ringeval and R. Trotta, Phys. Rev. D 83, 063524 (2011).
21. F. Feroz, K. Cranmer, M. Hobson, R. de Austri and R. Trotta, JHEP 1106, 042 (2011).
22. C. Ringeval, arXiv:1312.2347.