How well do experience curves predict technological progress? A method for making distributional forecasts
François Lafond, Aimee Gotway Bailey, Jan David Bakker, Dylan Rebois, Rubina Zadourian, Patrick McSharry, J. Doyne Farmer
Affiliations: Institute for New Economic Thinking at the Oxford Martin School; Smith School of Enterprise and the Environment, University of Oxford; London Institute for Mathematical Sciences; U.S. Department of Energy; Department of Economics, University of Oxford; Mathematical Institute, University of Oxford; Max Planck Institute for the Physics of Complex Systems, Germany; Carnegie Mellon University Africa, Rwanda; African Center of Excellence in Data Science, University of Rwanda; Oxford Man Institute of Quantitative Finance, University of Oxford; Computer Science Department, University of Oxford; Santa Fe Institute, USA
September 18, 2017
Abstract
Experience curves are widely used to predict the cost benefits of increasing the deployment of a technology. But how good are such forecasts? Can one predict their accuracy a priori? In this paper we answer these questions by developing a method to make distributional forecasts for experience curves. We test our method using a dataset with proxies for cost and experience for 51 products and technologies and show that it works reasonably well. The framework that we develop helps clarify why the experience curve method often gives similar results to simply assuming that costs decrease exponentially. To illustrate our method we make a distributional forecast for prices of solar photovoltaic modules.

JEL: C53, O30, Q47. Keywords: Forecasting, Technological progress, Experience curves.

∗ Acknowledgements: This project was primarily supported by the European Commission project FP7-ICT-2013-611272 (GROWTHCOM) and by Partners for a New Economy. We also gratefully acknowledge support from the European Commission project H2020-730427 (COP21 RIPPLES) and the Institute for New Economic Thinking. We are grateful to Giorgio Triulzi and two anonymous referees for comments on an earlier draft. Contact: [email protected], [email protected].

Since Wright's (1936) study of airplanes, it has been observed that for many products and technologies the unit cost of production tends to decrease by a constant factor every time cumulative production doubles (Thompson 2012). This relationship, also called the experience or learning curve, has been studied in many domains. It is often argued that it can be useful for forecasting and planning the deployment of a particular technology (Ayres 1969, Sahal 1979, Martino 1993). However, in practice experience curves are typically used to make point forecasts, neglecting prediction uncertainty. Our central result in this paper is a method for making distributional forecasts that explicitly take prediction uncertainty into account. We use historical data to test this method and demonstrate that it works reasonably well.

Forecasts with experience curves are usually made by regressing historical costs on cumulative production. In this paper we recast the experience curve as a time series model expressed in first differences: the change in costs is determined by the change in experience. We derive a formula for how the accuracy of prediction varies as a function of the time horizon for the forecast, the number of data points the forecast is based on, and the volatility of the time series. We are thus able to make distributional rather than point forecasts. Our approach builds on earlier work by Farmer & Lafond (2016) that showed how to do this for univariate forecasting based on a generalization of Moore's law (the autocorrelated geometric random walk with drift). Here we apply our new method based on experience curves to solar photovoltaic (PV) modules and compare to the univariate model.

Other than Farmer & Lafond (2016), the two closest papers to our contribution here are Alberth (2008) and Nagy et al. (2013).
Both papers tested the forecast accuracy of the experience curve model and performed comparisons with the time trend model. See also Yelle (1979), Dutton & Thomas (1984), Anzanello & Fogliatto (2011), and, for energy technologies, Neij (1997), Isoard & Soria (2001), Nemet (2006), Kahouli-Brahmi (2009), Junginger et al. (2010) and Candelise et al. (2013). Alberth (2008) performed forecast evaluation by keeping some of the available data for comparing forecasts with actual realized values. Here, we build on the methodology developed by Nagy et al. (2013) and Farmer & Lafond (2016), which consists in performing systematic hindcasting. That is, we use an estimation window of a constant (small) size and perform as many forecasts as possible. As in Alberth (2008) and Nagy et al. (2013), we use several datasets and we pool forecast errors to construct a distribution of forecast errors. We think that out-of-sample forecasts are indeed good tests for models that aim at predicting technological progress. However, when a forecast error is observed, it is generally not clear whether it is "large" or "small" from a statistical point of view. And it is not clear that it makes sense to aggregate forecast errors from technologies that are more or less volatile and have high or low learning rates.

A distinctive feature of our work is that we actually calculate the expected forecast errors. As in Farmer & Lafond (2016), we derive an approximate formula for the theoretical variance of the forecast errors, so that forecast errors from different technologies can be normalized, and thus aggregated in a theoretically grounded way. As a result, we can check whether our empirical forecast errors are in line with the model. We show how in our model forecast errors depend on future random shocks but also on parameter uncertainty, which is only seldom acknowledged in the literature (for exceptions, see Vigil & Sarper (1994) and Van Sark (2008)). Alberth (2008) and Nagy et al.
(2013) compared the forecasts from the experience curve, which we call Wright's law, with those from a simple univariate time series model of exponential progress, which we call Moore's law. While Alberth (2008) found that the experience curve model was vastly superior to an exogenous time trend, our results (and method and dataset) are closer to the findings of Nagy et al. (2013): univariate and experience curve models tend to perform similarly, due to the fact that for many technologies cumulative production grows at a roughly constant exponential rate. Alberth (2008) produced forecasts for a number (1, 2, . . .
6) of doublings of cumulative production; here instead we use time series methods, so it is more natural to compute everything in terms of calendar forecast horizon.

The experience curve, like any model, is only an approximation. Its simplicity is both a virtue and a detriment. The virtue is that the model is so simple that its parameters can usually be estimated well enough to have predictive value based on the short data sets that are typically available. We limit ourselves to showing that the forecast errors are compatible with our model being correct; we do not try to show that they could be compatible with the experience curve model being spurious. The detriment is that such a simple model neglects many effects that are likely to be important. A large literature starting with Arrow (1962) has convincingly argued that learning-by-doing occurs during the production (or investment) process, leading to decreasing unit costs. But innovation is a complex process relying on a variety of interacting factors such as economies of scale, input prices, R&D and patents, knowledge depreciation effects, or other effects captured by exogenous time trends. For instance, Candelise et al. (2013) argue that there is a lot of variation around the experience curve trend in solar PV, due to a number of unmodelled mechanisms linked to industrial dynamics and international trade, and Sinclair et al. (2000) argued that the relationship between costs and experience is due to experience driving expectations of future production and thus incentives to invest in R&D. Besides, some have argued that simple exponential time trends are more reliable than experience curves. For instance, Funk & Magee (2015) noted that significant technological improvements can take place even though production experience does not really accumulate, and Magee et al.
(2016) found that in domains where experience (measured as annual patent output) did not grow exponentially, costs still had an exponentially decreasing pattern, breaking down the experience curve. Finally, another important aspect that we do not address is reverse causality (Kahouli-Brahmi 2009, Nordhaus 2014, Witajewski-Baltvilks et al. 2015): if demand is elastic, a decrease in price should lead to an increase in production. Here we have intentionally focused on the simplest case in order to develop the method. Note that for short data sets such as most of those used here, fitting more than one parameter often results in degradation in out-of-sample performance (Nagy et al. 2013). For examples of papers discussing these effects within the experience curves framework, see Argote et al. (1990), Berndt (1991), Isoard & Soria (2001), Papineau (2006), Söderholm & Sundqvist (2007), Jamasb (2007), Kahouli-Brahmi (2009), Bettencourt et al. (2013), Benson & Magee (2015) and Nordhaus (2014).

Empirical framework
Experience curves postulate that unit costs decrease by a constant factor for every doubling of cumulative production. This implies a linear relationship between the log of the cost, which we denote y, and the log of cumulative production, which we denote x:

y_t = y_0 + ω x_t.   (1)

This relationship has also often been called "the learning curve" or the experience curve. We will often call it "Wright's law" in reference to Wright's original study, and to express our agnostic view regarding the causal mechanism. For other parametric models relating experience to costs see Goldberg & Touw (2003) and Anzanello & Fogliatto (2011). Generally, experience curves are estimated as

y_t = y_0 + ω x_t + ι_t,   (2)

where ι_t is i.i.d. noise. However, it has sometimes been noticed that residuals may be autocorrelated. For instance, Womer & Patterson (1983) noticed that autocorrelation "seems to be an important problem" and Lieberman (1984) "corrected" for autocorrelation using the Cochrane-Orcutt procedure. See also McDonald (1987), Hall & Howell (1985), and Goldberg & Touw (2003) for further discussion of the effect of autocorrelation on different estimation techniques. Bailey et al. (2012) proposed to estimate Eq. (1) in first differences,

y_t − y_{t−1} = ω(x_t − x_{t−1}) + η_t,   (3)

where the η_t are i.i.d. errors, η_t ∼ N(0, σ_η²). In Eq. (3) noise accumulates, so that in the long run the variables in levels can deviate significantly from the deterministic relationship. To see this, note that (assuming x_0 = log(1) = 0) Eq. (3) can be rewritten as

y_t = y_0 + ω x_t + Σ_{i=1}^{t} η_i,

which is the same as Eq. (2) except that the noise is accumulated across the entire time series. In contrast, Eq. (2) implies that even in the long run the two variables should be close to their deterministic relationship. If y and x are I(1) (a variable is I(1), or integrated of order one, if its first difference y_{t+1} − y_t is stationary), Eq. (2) defines a cointegrated relationship.

We have not tested for cointegration rigorously, mostly because unit root and cointegration tests have uncertain properties in small samples, and our time series are typically short and all of different lengths. Nevertheless, we have run some analyses suggesting that the difference model may be more appropriate. First, in about half the cases we found that model (2) resulted in a Durbin-Watson statistic lower than the R², indicating a risk of spurious regression and suggesting that first-differencing may be appropriate. Note, however, that since we do not include an intercept in the difference model and since the volatility of experience is low, first differencing is not a full solution to the spurious regression problem. Second, the variance of the residuals of the level model was generally higher, so that the tests proposed in Harvey (1980) generally favored the first-difference model. Third, we ran cointegration tests in the form of Augmented Dickey-Fuller tests on the residuals of regression (2), again generally suggesting a lack of cointegration. While a lengthy study using different tests and paying attention to differing sample sizes would shed more light on this issue, in this paper we will use Eq. (3) (with autocorrelated noise). The simplicity of this specification is also motivated by the fact that we want to have the same model for all technologies, we want to be able to calculate the variance of the forecast errors, and we want to estimate parameters with very short estimation windows so as to obtain as many forecast errors as possible.

We will compare our forecasts using Wright's law with those of a univariate time series model which we call Moore's law,

y_t − y_{t−1} = μ + n_t.   (4)

This is a random walk with drift. The forecast errors have been analyzed for i.i.d. normal n_t and for n_t = v_t + θv_{t−1} with i.i.d.
normal v_t (keeping the simplest forecasting rule) in Farmer & Lafond (2016). As we will note throughout the paper, if cumulative production grows at a constant logarithmic rate of growth, i.e. x_{t+1} − x_t = r for all t, Moore's and Wright's laws are observationally equivalent in the sense that Eq. (3) becomes Eq. (4) with μ = ωr. This equivalence has already been noted by Sahal (1979) and Ferioli & Van der Zwaan (2009) for the deterministic case. Nagy et al. (2013), using a dataset very close to ours, showed that using trend stationary models to estimate the three parameters independently (Eq. (2) and regressions of the (log) costs and experience levels on a time trend), the identity μ̂ = ω̂r̂ holds very well for most technologies. Here we will replicate this result using difference stationary models.

To evaluate the predictive ability of the models, we follow closely Farmer & Lafond (2016) by using hindcasting to compute as many forecast errors as possible and using a surrogate data procedure to test their statistical compatibility with our models. Pretending to be in the past, we make pseudo forecasts of values that we are able to observe and compute the errors of our forecasts. More precisely, our procedure is as follows. We consider all periods for which we have (m + 1) years of observations (i.e. m year-to-year growth rates) to estimate the parameters, and at least one year ahead to make a forecast (unless otherwise noted we choose m = 5). For each of these periods, we estimate the parameters and make all the forecasts for which we can compute forecast errors. Because of our focus on testing the method and comparing with univariate forecasts, throughout the paper we assume that cumulative production is known in advance.
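The hindcasting loop just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: the window size m = 5 follows the text, while the synthetic data and parameter values are made up for demonstration.

```python
import numpy as np

def hindcast_errors(y, x, m=5):
    """Rolling-window pseudo out-of-sample forecasts of Wright's law (Eq. 3).
    y, x: log cost and log cumulative production (equal-length arrays).
    m: number of year-to-year growth rates in each estimation window.
    Returns a list of (horizon, forecast error) pairs, pooling all windows."""
    errors = []
    T = len(y)
    for t in range(m, T - 1):                 # t = last year of the estimation window
        Y = np.diff(y[t - m:t + 1])           # m growth rates of log cost
        X = np.diff(x[t - m:t + 1])           # m growth rates of log experience
        omega_hat = (X @ Y) / (X @ X)         # OLS through the origin, Eq. (5)
        for tau in range(1, T - t):           # every horizon we can still observe
            y_hat = y[t] + omega_hat * (x[t + tau] - x[t])   # point forecast, Eq. (11)
            errors.append((tau, y[t + tau] - y_hat))
    return errors

# Toy example: experience growing at 10%/yr, Wright's law with omega = -0.3
rng = np.random.default_rng(1)
T = 20
x = 0.1 * np.arange(T)
y = np.cumsum(-0.3 * np.diff(x, prepend=0.0) + rng.normal(0, 0.05, T))
errs = hindcast_errors(y, x, m=5)
print(len(errs))   # one forecast error per (window, horizon) pair
```

Note that, as in the paper, future values of x are treated as known when forming the forecast.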
Having obtained a set of forecast errors, we compute a number of indicators, such as the distribution of the forecast errors or the mean squared forecast error, and compare the empirical values to what we expect given the size and structure of our dataset. To know what we expect to find, we use an analytical approach as well as a surrogate data procedure. The analytical approach simply consists in deriving an approximation of the distribution of forecast errors. However, the hindcasting procedure generates forecast errors which, for a single technology, are not independent. (For a review of forecasting ability tests and a discussion of how the estimation scheme affects the forecast errors, see West (2006) and Clark & McCracken (2013).) In this paper we have many short time series, so the problem is somewhat limited (see Appendix A). Nevertheless, we deal with it by using a surrogate data procedure: we simulate many datasets similar to ours and perform the same analysis, thereby determining the sampling distribution of any statistic of interest.

To simplify notation a bit, let Y_t = y_t − y_{t−1} and X_t = x_t − x_{t−1} be the changes of y and x in period t. We estimate Wright's exponent from Eq. (3) by running an OLS regression through the origin. Assuming that we have data for times i = 1 ... (m + 1), minimizing the squared errors gives

ω̂ = Σ_{i=2}^{m+1} X_i Y_i / Σ_{i=2}^{m+1} X_i².   (5)

Substituting ωX_i + η_i for Y_i, we have

ω̂ = ω + Σ_{i=2}^{m+1} X_i η_i / Σ_{i=2}^{m+1} X_i².   (6)

The variance of the noise, σ_η², is estimated as the squared regression standard error

σ̂_η² = (1/(m − 1)) Σ_{i=2}^{m+1} (Y_i − ω̂X_i)².   (7)

For comparison, parameter estimation in the univariate model Eq. (4) as done in Farmer & Lafond (2016) yields the sample mean μ̂ and variance K̂ of Y_t ∼ N(μ, K). Throughout the paper, we will use the hat symbol for estimated parameters when the estimation is made using only the m + 1 years of data on which the forecasts are based.
When we provide full sample estimates we use the tilde symbol.

Forecast errors

Let us first recall that for the univariate model Eq. (4), the variance of the forecast errors is given by (Sampson 1991, Clements & Hendry 2001, Farmer & Lafond 2016)

E[E²_{M,τ}] = K (τ + τ²/m),   (8)

where τ is the forecast horizon and the subscript M indicates forecast errors obtained using "Moore's" model. It shows that in the simplest model, the expected squared forecast error grows due to future noise accumulating (the term τ) and to estimation error (the term τ²/m). These terms will reappear later, so we will use the shorthand

A ≡ τ + τ²/m.   (9)

We now compute the variance of the forecast errors for Wright's model. If we are at time t = m + 1 and look τ steps ahead into the future, we know that

y_{t+τ} = y_t + ω(x_{t+τ} − x_t) + Σ_{i=t+1}^{t+τ} η_i.   (10)

To make the forecasts we assume that the future values of x are known, i.e. we are forecasting costs conditional on a given growth of future experience. This is a common practice in the literature (Meese & Rogoff 1983, Alberth 2008). More formally, the point forecast at horizon τ is

ŷ_{t+τ} = y_t + ω̂(x_{t+τ} − x_t).   (11)

The forecast error is the difference between Eqs. (10) and (11), that is,

E_τ ≡ y_{t+τ} − ŷ_{t+τ} = (ω − ω̂) Σ_{i=t+1}^{t+τ} X_i + Σ_{i=t+1}^{t+τ} η_i.   (12)

We can derive the expected squared error. Since the X_i's are known constants, using ω̂ from Eq. (6) and the notation m + 1 = t, we find

E[E_τ²] = σ_η² ( τ + (Σ_{i=t+1}^{t+τ} X_i)² / Σ_{i=2}^{t} X_i² ).   (13)

Sahal (1979) was the first to point out that in the deterministic limit the combination of exponentially increasing cumulative production and exponentially decreasing costs gives Wright's law. Here we generalize this result in the presence of noise and show how variability in the production process affects this relationship. Under the assumption that experience growth rates are constant (X_i = r) and letting m = t − 1, Eq. (13) gives the result that the variance of Wright's law forecast errors is precisely the same as the variance of Moore's law forecast errors given in Eq. (8), with K̂ = σ̂_η². To see how fluctuations in the growth rate of experience impact forecast errors, we can rewrite Eq. (13) as

E[E_τ²] = σ_η² ( τ + (τ²/m) r̂²_{(f)} / (σ̂²_{x,(p)} + r̂²_{(p)}) ),   (14)

where σ̂²_{x,(p)} refers to the estimated variance of past experience growth rates, r̂_{(p)} to the estimated mean of past experience growth rates, and r̂_{(f)} to the estimated mean of future experience growth rates. (The past refers to data at times (1, ..., t) and the future to times (t + 1, ..., t + τ).)

This makes it clear that the higher the volatility of experience (σ_x), the lower the forecast errors. This comes from a simple, well-known fact of regression analysis: high variance of the regressor makes the estimate of the slope more precise. Conversely, low σ_x implies high standard errors in the estimation of ω, which increase the part of the forecast error variance due to parameter estimation, associated with the term τ²/m.

This result shows that, assuming Wright's law is correct, for Wright's law forecasts to work well (and in particular to outperform Moore's law), it is better to have cumulative production growth rates that fluctuate a great deal. Unfortunately, for our data this is typically not the case. Instead, empirically cumulative production follows a fairly smooth exponential trend. To explain this, we model production as a geometric random walk, with the log production increments having drift g and volatility σ_q.
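How smooth cumulative production becomes can be seen directly by simulation. The sketch below (with arbitrary, made-up drift and volatility) generates log production as a random walk with drift g and volatility σ_q, accumulates it in log space to avoid numerical overflow, and recovers the relationship stated formally in Eq. (15) below: experience growth rates with mean close to g and variance close to σ_q² tanh(g/2).

```python
import numpy as np

rng = np.random.default_rng(4)
g, sigma_q, T = 0.1, 0.15, 20000   # illustrative drift and volatility of log production

q = np.cumsum(rng.normal(g, sigma_q, T))    # log production: random walk with drift
logZ = np.logaddexp.accumulate(q)           # log cumulative production, overflow-safe
X = np.diff(logZ)[500:]                     # experience growth rates, after burn-in

print(X.mean(), g)                              # E[X] close to g
print(X.var(), sigma_q**2 * np.tanh(g / 2))     # Var[X] close to sigma_q^2 tanh(g/2)
```

The log-space accumulation (`np.logaddexp.accumulate`) matters here: with drift 0.1 over 20000 steps, production itself would overflow a float64.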
In Appendix B, using a saddle point approximation in the long time limit, we find that

E[X] ≡ r ≈ g  and  Var[X] ≡ σ_x² ≈ σ_q² tanh(g/2),   (15)

where tanh is the hyperbolic tangent function. We have tested this remarkably simple relationship using artificially generated data and we find that it works reasonably well. These results show that cumulative production grows at the same rate as production. More importantly, since 0 < tanh(g/2) < 1 (and here we assume g > 0), the volatility of cumulative production is lower than the volatility of production. This is not surprising: it is well known that integration acts as a low-pass filter, in this case making cumulative production smoother than production. Thus if production follows a geometric random walk with drift, experience is a smoothed version of it, making it hard to distinguish from an exogenous exponential trend. When this happens, Wright's law and Moore's law yield similar predictions. This theoretical result is relevant to our case, as can be seen in the time series of production and experience plotted in Fig. 2 and Fig. 3, and the low volatility of experience compared to production reported in Table 1 below.

We now turn to an extension of the basic model. As we will see, the data shows some evidence of autocorrelation. Following Farmer & Lafond (2016), we augment the model to allow for first-order moving average autocorrelation. For the autocorrelated Moore's law model ("Integrated Moving Average of order 1"),

y_t − y_{t−1} = μ + v_t + θv_{t−1},

Farmer & Lafond (2016) obtained a formula for the forecast error variance when the forecasts are performed assuming no autocorrelation,

E[E²_{M,τ}] = σ_v² [ −2θ + ((1 + θ)² − 2θ/m) A ],   (16)

where A = τ + τ²/m (Eq. (9)). Here we extend this result to the autocorrelated Wright's law model

y_t − y_{t−1} = ω(x_t − x_{t−1}) + u_t + ρu_{t−1},   (17)

where u_t ∼ N(0, σ_u²). We treat ρ as a known parameter.
Moreover, we will assume that it is the same for all technologies, and we will estimate it as the average of the ρ̃_j estimated on each technology separately (as described in the next section). This is a delicate assumption, but it is motivated by the fact that many of our time series are too short to estimate a technology-specific ρ_j reliably, and assuming a universal, known value of ρ allows us to keep analytical tractability.

The forecasts are made exactly as before, but the forecast error now is

E_τ = Σ_{j=2}^{m+1} H_j [u_j + ρu_{j−1}] + Σ_{T=t+1}^{t+τ} [u_T + ρu_{T−1}],   (18)

where the H_j are defined as

H_j = − ( Σ_{i=t+1}^{t+τ} X_i / Σ_{i=2}^{t} X_i² ) X_j.   (19)

The forecast error can be decomposed as a sum of independent normally distributed variables, from which the variance can be computed as

E[E_τ²] = σ_u² ( ρ²H_2² + Σ_{j=2}^{m} (H_j + ρH_{j+1})² + (ρ + H_{m+1})² + (τ − 1)(1 + ρ)² + 1 ).   (20)

When we do real forecasts (Section 4), we will take the rate of growth of future cumulative production as constant. If we also assume that the growth rates of past cumulative production were constant, we have X_i = r and thus H_i = −τ/m for all i. As expected from Sahal's identity, simplifying Eq. (20) under this assumption gives Eq. (16) with θ substituted by ρ and σ_v² substituted by σ_u²,

E[E_τ²] = σ_u² [ −2ρ + ((1 + ρ)² − 2ρ/m) A ].   (21)

In practice we compute σ̂_η² using Eq. (7), so that σ_u² may be estimated as σ̂_u² = σ̂_η²/(1 + ρ²), suggesting the normalized error E_τ/σ̂_η. To gain more intuition on Eq. (21), and to propose a simple, easy-to-use formula, note that for τ ≫ 1 and m ≫ 1 it can be approximated as

E[(E_τ/σ_η)²] ≈ ((1 + ρ)²/(1 + ρ²)) (τ + τ²/m).   (22)

For all models (Moore and Wright, with and without autocorrelated noise), having determined the variance of the forecast errors we can normalize them so that they follow a standard normal distribution,

E_τ / √E[E_τ²] ∼ N(0, 1).   (23)

In what follows we will replace σ_η by its estimated value, so that when E and σ̂_η are independent the reference distribution is Student. However, for small sample size m the Student distribution is only a rough approximation, as shown in Appendix A, where it is also shown that the theory works well when the variance is known, or when the variance is estimated but m is large enough.

One objective of the paper is to compare Moore's law and Wright's law forecasts. To normalize Moore's law forecasts, Farmer & Lafond (2016) used the estimate of the variance of the delta log cost time series, K̂, as suggested by Eq. (8), i.e.

ε_M = E_M / √K̂.   (24)

To compare the two models, we propose that Wright's law forecast errors can be normalized by the very same value,

ε_W = E_W / √K̂.   (25)

Using this normalization, we can plot the normalized mean squared errors from Moore's and Wright's models at each forecast horizon. These are directly comparable, because the raw errors are divided by the same value, and they are meaningful because Moore's normalization ensures that the errors from different technologies are comparable and can reasonably be aggregated. In the context of comparing Moore and Wright, when pooling the errors of different forecast horizons we also use the normalization from Moore's model (neglecting autocorrelation for simplicity), A ≡ τ + τ²/m (see Farmer & Lafond (2016) and Eqs. (8) and (9)).
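The error-variance formulas above are easy to cross-check numerically. The sketch below, with illustrative parameter values (not estimates from the paper's dataset), verifies Eq. (13) by Monte Carlo, confirms that Eq. (14) is an equivalent rewriting of Eq. (13), and checks that the general expression of Eq. (20) collapses to the constant-growth closed form of Eq. (21).

```python
import numpy as np

rng = np.random.default_rng(3)
omega, sigma_eta, m, tau = -0.3, 0.05, 5, 3

# --- Eq. (13) by Monte Carlo, and Eq. (14) as an identity -------------------
X_past = np.array([0.08, 0.12, 0.10, 0.15, 0.05])   # known experience growth rates
X_fut = np.array([0.10, 0.11, 0.09])

eta = rng.normal(0, sigma_eta, (100_000, m))
omega_hat = omega + eta @ X_past / (X_past @ X_past)            # Eq. (6)
E = (omega - omega_hat) * X_fut.sum() \
    + rng.normal(0, sigma_eta, (100_000, tau)).sum(axis=1)      # Eq. (12)
mc = (E**2).mean()

eq13 = sigma_eta**2 * (tau + X_fut.sum()**2 / (X_past @ X_past))
# population variance (ddof=0) makes the rewriting exact
eq14 = sigma_eta**2 * (tau + (tau**2 / m) * X_fut.mean()**2
                       / (X_past.var() + X_past.mean()**2))
print(mc, eq13, eq14)      # mc close to eq13; eq13 equals eq14 up to rounding

# --- Eq. (20) reduces to Eq. (21) when all X_i equal r ----------------------
def wright_var_ma1(X_past, X_fut, rho, sigma_u):
    """Forecast error variance of the MA(1) Wright model, Eqs. (19)-(20)."""
    tau = len(X_fut)
    H = -(X_fut.sum() / (X_past @ X_past)) * X_past   # Eq. (19): H_2, ..., H_{m+1}
    return sigma_u**2 * ((rho * H[0])**2
                         + ((H[:-1] + rho * H[1:])**2).sum()
                         + (rho + H[-1])**2
                         + (tau - 1) * (1 + rho)**2 + 1)

rho, sigma_u, r = 0.3, 0.05, 0.1
general = wright_var_ma1(np.full(m, r), np.full(tau, r), rho, sigma_u)
A = tau + tau**2 / m
closed = sigma_u**2 * (-2 * rho + ((1 + rho)**2 - 2 * rho / m) * A)   # Eq. (21)
print(general, closed)     # identical up to floating point
```

Setting rho = 0 in either expression recovers the uncorrelated variance σ²A of Eq. (8).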
Figure 1: Scatter plot of unit costs against cumula-tive production.
We mostly use data from the performance curve database created at the Santa Fe Institute by Bela Nagy and collaborators from personal communications and from Colpier & Cornland (2002), Goldemberg et al. (2004), Lieberman (1984), Lipman & Sperling (1999), Zhao (1999), McDonald & Schrattenholzer (2001), Neij et al. (2003), Moore (2006), Nemet (2006) and Schilling & Esmundo (2009). The data can be accessed at pcdb.santafe.edu. We augmented the dataset with data on solar photovoltaics modules taken from public releases of
Figure 2: Production time series.
Figure 3: Experience time series
This database gives a proxy for unit costs for a number of technologies over variable periods of time. In a few cases (milk and automotive) a measure of performance is used instead of costs, and automotive's experience is computed based on distance driven; the main results would not be severely affected by the exclusion of these two time series. Also, unit cost is generally computed from total cost and production over a year or batch, not from actual observation of every unit cost. Different methods may give slightly different results (Gallant 1968, Womer & Patterson 1983, Goldberg & Touw 2003), but our dataset is too heterogeneous to attempt any correction. Obviously, changes in unit costs do not come from technological progress only, and it is difficult to account for changes in quality, but unit costs are nevertheless a widely used and defensible proxy.

In principle we would prefer to have data on unit costs, but often these are unavailable and the data is about prices. This implies a bias whenever prices and costs do not have the same growth rate, as is typically the case when pricing strategies are dynamic and account for learning effects (for instance predatory pricing); for a review of the industrial organization literature on this topic, see Thompson (2010). Since the database is built from experience curves found in the literature, rather than from a representative sample of products/technologies, there are limits to the external validity of our study, but unfortunately we do not know of a database that contains suitably normalized unit costs for all products.

We have selected technologies to minimize correlations between the different time series, by removing technologies that are too similar (e.g. other data on solar photovoltaics). We have also refrained from including very long time series that would represent a disproportionate share of our forecast errors, make the problem of autocorrelation of forecast errors very pronounced, and prevent us from generating many random datasets for reasons of limited computing power. Starting with the set of 53 technologies with a significant improvement rate from Farmer & Lafond (2016), we removed DNA sequencing, for which no production data was available, and Electric Range, which had a zero production growth rate so that we cannot apply the correction to cumulative production described below. We are left with 51 technologies belonging to different sectors (chemicals, energy, hardware, consumer durables, and food), although chemicals from the historical reference (Boston Consulting Group 1972) represent a large part of the dataset. For some technologies the number of years is slightly different from Farmer & Lafond (2016) because we had to remove observations for which data on production was not available.
A potentially serious problem in experience curve studies is that one generally does not observe the complete history of the technology, so that simply summing up observed production misses the experience previously accumulated. There is no perfect solution to this problem. For each technology, we infer the initial cumulative production using a procedure common in the "R&D capital" literature (Hall & Mairesse 1995), although not often used in experience curve studies (for an exception see Nordhaus (2014)). It assumes that production grows as Q_{t+1} = Q_t(1 + g_d) and that experience accumulates as Z_{t+1} = Z_t + Q_t, so that it can be shown that Z_t = Q_t/g_d. We estimate the discrete annual growth rate as ĝ_d = exp(log(Q_T/Q_1)/(T − 1)) − 1, where Q_1 is production during the first available year and T is the number of available years. We then construct the experience variable as Z_1 = Q_1/ĝ_d for the first year, and Z_{t+1} = Z_t + Q_t afterwards.

Note that this formulation implies that experience at time t does not include production of time t, so the change in experience from time t to t + 1 does not include how much is produced during year t + 1. In this sense we assume that the growth of experience affects technological progress with a certain time lag. We have experimented with slightly different ways of constructing the experience time series, and the aggregated results do not change much, due to the high persistence of production growth rates.

A more important consequence of this correction is that products with a small production growth rate receive a very large correction for initial cumulative production. In turn, this large correction of initial cumulative production leads to significantly lower values for the annual growth rates of cumulative production. As a result, the experience exponent ω̂ becomes larger than if there were no correction.
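The construction just described can be written compactly. This is an illustrative implementation with a made-up toy series, not one of the 51 technologies in the dataset:

```python
import numpy as np

def build_experience(Q):
    """Experience series Z_t from production Q_t, with the initial stock
    inferred as Z_1 = Q_1 / g_d (the correction described in the text)."""
    Q = np.asarray(Q, dtype=float)
    T = len(Q)
    g_d = np.exp(np.log(Q[-1] / Q[0]) / (T - 1)) - 1   # discrete annual growth rate
    Z = np.empty(T)
    Z[0] = Q[0] / g_d                    # inferred pre-sample cumulative production
    for t in range(1, T):
        Z[t] = Z[t - 1] + Q[t - 1]       # experience excludes current-year production
    return Z

Q = 100.0 * 1.05 ** np.arange(10)        # toy series growing exactly 5% per year
Z = build_experience(Q)
print(Z[0])                              # 100 / 0.05 = 2000
```

With exactly geometric growth the identity Z_t = Q_t/g_d holds for every year, which is also why a small growth rate (say milk's) produces such a large inferred initial stock.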
This explains why products like milk, which have a very low rate of growth of production, have such a large experience exponent. Depending on the product, this correction may be small or large, and it may be meaningful or not. Here we have decided to use this correction for all products.

Table 1 summarizes our dataset, showing in particular the parameters estimated using the full sample. Fig. 4 complements the table by showing histograms for the distribution of the most important parameters. Note that we denote estimated parameters using a tilde because we use the full sample (when using the small rolling window, we used the hat notation). To estimate the ρ̃_j, we have used maximum likelihood estimation of Eq. (17) (letting ω̃_MLE differ from ω̃).

Figure 4: Histograms of estimated parameters (ω̃, ρ̃, r̃, σ̃_x).
Fig. 5 compares the variance of the noise estimated from Wright's and Moore's models. These key quantities express how much of the change in (log) cost is left unexplained by each model; they also enter as a direct factor in the expected mean squared forecast error formulas. The lower the value, the better the fit and the more reliable the forecasts. The figure shows that for each technology the two models give similar values (see Table 1).
Figure 5: Comparison of residual standard deviations ($\tilde{\sigma}_\eta$ against $\tilde{K}$) from Moore's and Wright's models.
Next, we show Sahal's identity as in Nagy et al. (2013). Sahal's observation is that if cumulative production and costs both have exponential trends, $r$ and $\mu$ respectively, then costs and production have a power law (constant elasticity) relationship parametrized by $\omega = \mu/r$. One way to check the validity of this relationship is to measure $\mu$, $r$ and $\omega$ independently and plot $\omega$ against $\mu/r$. Fig. 6 shows the results and confirms the relevance of Sahal's identity.
Figure 6: Illustration of Sahal’s identity.
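Sahal's identity can be checked directly on the full-sample estimates in Table 1. The snippet below does this for three technologies from the table; as the figure suggests, the identity holds only approximately, and slow-growing products like milk deviate more:

```python
# Check Sahal's identity omega = mu / r using full-sample estimates
# from Table 1: (mu, r, omega) per technology.
table = {
    "Photovoltaics": (-0.121, 0.318, -0.380),
    "DRAM": (-0.446, 0.634, -0.680),
    "Milk": (-0.020, 0.007, -2.591),
}
for name, (mu, r, omega) in table.items():
    print(f"{name}: mu/r = {mu / r:.3f}, omega = {omega:.3f}")
```

For photovoltaics the match is almost exact ($\mu/r \approx -0.380$); for milk the two quantities agree only in order of magnitude.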
To explain why Sahal’s identity works so welland Moore’s and Wright’s laws have similar ex-planatory power, in Section 2.4 we have shownthat in theory if production grows exponentially,cumulative production grows exponentially withan even lower volatility. Fig. 7 shows how thistheoretical result applies to our dataset. Know-ing the drift and volatility of the (log) produc-tion time series, we are able to predict the driftand volatility of the (log) cumulative productiontime series fairly well. − − − − − σ q2 ^ tanh ( g^ 2 ) σ x ^ Figure 7: Test of Eq. 15 relating the volatility ofcumulative production to the drift and volatility ofproduction. The inset shows the drift of cumulativeproduction ˆ r against the drift of production ˆ g . Moore’s law forecasts are based only on the costtime series, whereas Wright’s law forecasts useinformation about future experience to predictfuture costs. Thus, we expect that in princi-ple Wright’s forecasts should be better. We nowcompare Wright’s and Moore’s models in a num-ber of ways. The first way is simply to showa scatter plot of the forecast errors from thetwo models. Fig. 8 shows this scatter plot forthe errors normalized by ˆ K √ A (i.e. “Moore-normalized”, see Eqs 8 – 9 and 24 and 25), withthe identity line as a point of comparison. Itis clear that they are highly correlated. When11 ost Production Cumul. Prod. 
Wright’s lawT ˜ µ ˜ K ˜ g ˜ σ q ˜ r ˜ σ x ˜ ω ˜ σ η ˜ ρ Automotive 21 -0.076 0.047 0.026 0.011 0.027 0.000 -2.832 0.048 1.000Milk 78 -0.020 0.023 0.008 0.020 0.007 0.000 -2.591 0.023 0.019IsopropylAlcohol 9 -0.039 0.024 0.022 0.074 0.023 0.002 -1.677 0.024 -0.274Neoprene Rubber 13 -0.021 0.020 0.015 0.055 0.015 0.001 -1.447 0.020 0.882Phthalic Anhydride 18 -0.076 0.152 0.061 0.094 0.058 0.006 -1.198 0.155 0.321TitaniumDioxide 9 -0.037 0.049 0.031 0.029 0.031 0.001 -1.194 0.049 -0.400Sodium 16 -0.013 0.023 0.013 0.081 0.012 0.001 -1.179 0.023 0.407Pentaerythritol 21 -0.050 0.066 0.050 0.107 0.050 0.006 -0.954 0.068 0.343Methanol 16 -0.082 0.143 0.097 0.081 0.091 0.004 -0.924 0.142 0.289Hard Disk Drive 19 -0.593 0.314 0.590 0.307 0.608 0.128 -0.911 0.364 -0.569Geothermal Electricity 26 -0.049 0.022 0.043 0.116 0.052 0.013 -0.910 0.023 0.175Phenol 14 -0.078 0.090 0.088 0.055 0.089 0.004 -0.853 0.092 -1.000Transistor 38 -0.498 0.240 0.585 0.157 0.582 0.122 -0.849 0.226 -0.143Formaldehyde 11 -0.070 0.061 0.086 0.078 0.085 0.005 -0.793 0.063 0.489Ethanolamine 18 -0.059 0.042 0.076 0.076 0.080 0.005 -0.748 0.041 0.355Caprolactam 11 -0.103 0.075 0.136 0.071 0.142 0.009 -0.746 0.071 0.328Ammonia 13 -0.070 0.099 0.096 0.049 0.102 0.007 -0.740 0.095 1.000Acrylic Fiber 13 -0.100 0.057 0.127 0.126 0.137 0.020 -0.726 0.056 -0.141Ethylene Glycol 13 -0.062 0.059 0.089 0.107 0.083 0.006 -0.711 0.062 -0.428DRAM 37 -0.446 0.383 0.626 0.253 0.634 0.185 -0.680 0.380 0.116Benzene 16 -0.056 0.083 0.087 0.114 0.087 0.012 -0.621 0.085 -0.092Aniline 12 -0.072 0.095 0.110 0.099 0.113 0.008 -0.620 0.097 -1.000VinylAcetate 13 -0.082 0.061 0.131 0.080 0.129 0.010 -0.617 0.065 0.341Vinyl Chloride 11 -0.083 0.050 0.136 0.085 0.137 0.008 -0.613 0.049 -0.247Polyethylene LD 15 -0.085 0.076 0.135 0.075 0.139 0.009 -0.611 0.075 0.910Acrylonitrile 14 -0.084 0.108 0.121 0.178 0.134 0.025 -0.605 0.109 1.000Styrene 15 -0.068 0.047 0.112 0.089 0.113 0.008 -0.585 0.050 0.759Maleic Anhydride 14 -0.069 0.114 
0.116 0.143 0.119 0.013 -0.551 0.116 0.641Ethylene 13 -0.060 0.057 0.114 0.054 0.114 0.005 -0.526 0.057 -0.290Urea 12 -0.062 0.094 0.121 0.073 0.127 0.011 -0.502 0.093 0.003Sorbitol 8 -0.032 0.046 0.067 0.025 0.067 0.002 -0.473 0.046 -1.000Polyester Fiber 13 -0.121 0.100 0.261 0.132 0.267 0.034 -0.466 0.094 -0.294Bisphenol A 14 -0.059 0.048 0.136 0.136 0.135 0.012 -0.437 0.048 -0.056Paraxylene 11 -0.103 0.097 0.259 0.326 0.228 0.054 -0.417 0.104 -1.000Polyvinylchloride 22 -0.064 0.057 0.137 0.136 0.144 0.024 -0.411 0.062 0.319Low Density Polyethylene 16 -0.103 0.064 0.213 0.164 0.237 0.069 -0.400 0.071 0.473Sodium Chlorate 15 -0.033 0.039 0.076 0.077 0.084 0.006 -0.397 0.039 0.875TitaniumSponge 18 -0.099 0.099 0.196 0.518 0.241 0.196 -0.382 0.075 0.609Photovoltaics 41 -0.121 0.153 0.315 0.202 0.318 0.133 -0.380 0.145 -0.019Monochrome Television 21 -0.060 0.072 0.093 0.365 0.130 0.093 -0.368 0.074 -0.444Cyclohexane 17 -0.055 0.052 0.134 0.214 0.152 0.034 -0.317 0.057 0.375Polyethylene HD 15 -0.090 0.075 0.250 0.166 0.275 0.074 -0.307 0.079 0.249CarbonBlack 9 -0.013 0.016 0.046 0.051 0.046 0.002 -0.277 0.016 -1.000Laser Diode 12 -0.293 0.202 0.708 0.823 0.824 0.633 -0.270 0.227 0.156Aluminum 17 -0.015 0.044 0.056 0.075 0.056 0.004 -0.264 0.044 0.761Polypropylene 9 -0.105 0.069 0.383 0.207 0.414 0.079 -0.261 0.059 0.110Beer 17 -0.036 0.042 0.137 0.091 0.146 0.016 -0.235 0.043 -1.000Primary Aluminum 39 -0.022 0.080 0.088 0.256 0.092 0.040 -0.206 0.080 0.443Polystyrene 25 -0.061 0.086 0.205 0.361 0.214 0.149 -0.163 0.097 0.074Primary Magnesium 39 -0.031 0.089 0.135 0.634 0.158 0.211 -0.131 0.088 -0.037Wind Turbine 19 -0.038 0.047 0.336 0.570 0.357 0.337 -0.071 0.050 0.750
Table 1: Parameter estimates using the full sample. $T$ is the number of available years; $\tilde{\mu}$ and $\tilde{K}$ refer to cost, $\tilde{g}$ and $\tilde{\sigma}_q$ to production, $\tilde{r}$ and $\tilde{\sigma}_x$ to cumulative production, and $\tilde{\omega}$, $\tilde{\sigma}_\eta$ and $\tilde{\rho}$ to Wright's law.

Figure 8: Scatter plot of Moore-normalized forecast errors $\epsilon_M$ and $\epsilon_W$ (forecasts are made using $m = 5$). This shows that in the vast majority of cases Wright's and Moore's law forecast errors have the same sign and a similar magnitude, but not always.

In Fig. 9, the main plot shows the mean squared Moore-normalized forecast errors $\epsilon_M$ and $\epsilon_W$ (see Section 2.7), where the average is taken over all available forecast errors of a given forecast horizon (note that some technologies are more represented than others). The solid diagonal line is the benchmark for the Moore model without autocorrelation, i.e. the line $y = \frac{m-1}{m-3}A$ (Farmer & Lafond 2016). Wright's model appears slightly better at the longest horizons; however, there are not many forecasts at these horizons, so we do not put too much emphasis on this finding. The two insets show the distribution of the rescaled Moore-normalized errors $\epsilon/\sqrt{A}$, either as a cumulative distribution function (top left) or using the probability integral transform (bottom right). All three visualizations confirm that Wright's model only slightly outperforms Moore's model. The probability integral transform allows one to compare data against a theoretical distribution by transforming the data and comparing the result against the uniform distribution; see for example Diebold et al. (1998), who used it to construct a test for evaluating density forecasts.
Figure 9: Comparison of Moore-normalized forecast errors from Moore's and Wright's models. The main chart shows the mean squared forecast errors at different forecast horizons. The insets show the distribution of the normalized forecast errors as an empirical cumulative distribution function against the Student distribution (top left) and as a probability integral transform against a uniform distribution (bottom right).
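As a concrete illustration of the hindcasting and Moore-normalization procedure, the sketch below rolls an estimation window over a synthetic log-cost series, forecasts with the estimated drift, and normalizes each error by $\hat{K}\sqrt{A}$. The form $A = \tau + \tau^2/m$ follows Farmer & Lafond (2016); the paper's own Eqs. 8–9 are not reproduced in this excerpt, and the toy series is illustrative:

```python
import numpy as np

def hindcast_moore(y, m=5, tau_max=20):
    """Rolling-window Moore's-law hindcasts of a log-cost series y.

    For each window of m+1 points, estimate the drift mu_hat and noise
    level K_hat from the m first differences, forecast tau steps ahead,
    and normalize each error by K_hat * sqrt(A) with A = tau + tau**2/m.
    """
    errors = []
    for start in range(len(y) - m):
        diffs = np.diff(y[start:start + m + 1])
        mu_hat = diffs.mean()
        K_hat = diffs.std(ddof=1)
        T0 = start + m                       # last index of the window
        for tau in range(1, min(tau_max, len(y) - 1 - T0) + 1):
            forecast = y[T0] + mu_hat * tau
            A = tau + tau**2 / m
            errors.append((tau, (y[T0 + tau] - forecast) / (K_hat * np.sqrt(A))))
    return errors

# Toy series: exponentially falling cost with noise
rng = np.random.default_rng(1)
y = np.cumsum(-0.1 + 0.05 * rng.standard_normal(60))
errs = hindcast_moore(y)
```

Each error is thus put on a common scale across technologies and horizons, which is what makes the pooled comparisons in Figs. 8–10 possible.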
In this section, we analyze in detail the forecast errors from Wright's model. We will use the proper normalization derived in Section 2.4, but since it does not allow us to look at horizon-specific errors, we first look again at the horizon-specific Moore-normalized mean squared forecast errors. Fig. 10 shows the results for different values of $m$ (for $m = 5$ the empirical errors are the same as in Fig. 9). The confidence intervals are created using the surrogate data procedure described in Section 2.2, in which we simulate many random datasets using the autocorrelated Wright's law model (Eq. 17) and the parameters of Table 1, forcing $\rho_j = \rho^*$ (see below). We then apply the same hindcasting, error normalization and averaging procedure to the simulated data.

Figure 10: Mean squared Moore-normalized forecast errors of the Wright's law model (mean $\epsilon_W^2$) versus forecast horizon, for $m$ = 5, 8, 11 and 14. The 95% intervals (dashed lines) and the mean (solid line) are computed using simulations as described in the text. The grey line, associated with the right axis, shows the number of forecast errors used to make an average.

We now analyze the forecast errors from Wright's model normalized using the (approximate) theory of Section 2.4. Again we use the hindcasting procedure and, unless otherwise noted, an estimation window of $m = 5$ points (i.e. 6 years) and a maximum forecasting horizon $\tau_{max} = 20$. To normalize the errors, we need to choose a value of $\rho$. This is a difficult problem, because for simplicity we have assumed that $\rho$ is the same for all technologies, but in reality it probably is not. We have experimented with different methods of choosing $\rho$ based on modelling the forecast errors, for instance by looking for the value of $\rho$ which makes the distribution of normalized errors closest to a Student distribution. While these methods may suggest a particular value of $\rho$
, they generally give different values of $\rho$ for different values of $m$ and $\tau_{max}$ (which indicates that some non-stationarity or misspecification is present). Moreover, since the theoretical forecast errors do not exactly follow a Student distribution (see Appendix A), this estimator is biased. For simplicity, we use the average value of $\tilde{\rho}$ in our dataset, after removing the 9 values of $\tilde{\rho}$ whose absolute value was greater than 0.99 (which may indicate a misspecified model). Throughout the paper, we will thus use $\rho^* = 0.19$.

Figure 11: Cumulative distribution of the normalized forecast errors, $m = 5$, $\tau_{max} = 20$, and the associated $\rho^* = 0.19$, compared with the Student distribution; curves are shown for $\rho = 0$ and $\rho = \rho^*$.

Figure 12: Probability integral transform of the normalized forecast errors, for different values of $m$.
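The trimmed average described above can be reproduced directly from the $\tilde{\rho}$ column of Table 1:

```python
# rho-tilde estimates for all 51 technologies, in the order of Table 1.
rho = [1.000, 0.019, -0.274, 0.882, 0.321, -0.400, 0.407, 0.343, 0.289,
       -0.569, 0.175, -1.000, -0.143, 0.489, 0.355, 0.328, 1.000, -0.141,
       -0.428, 0.116, -0.092, -1.000, 0.341, -0.247, 0.910, 1.000, 0.759,
       0.641, -0.290, 0.003, -1.000, -0.294, -0.056, -1.000, 0.319, 0.473,
       0.875, 0.609, -0.019, -0.444, 0.375, 0.249, -1.000, 0.156, 0.761,
       0.110, -1.000, 0.443, 0.074, -0.037, 0.750]
kept = [r for r in rho if abs(r) <= 0.99]   # drops the 9 values at +/-1
rho_star = sum(kept) / len(kept)
print(len(kept), round(rho_star, 2))        # 42 technologies remain
```

Averaging the 42 remaining estimates gives $\rho^* \approx 0.19$.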
In Fig. 11, we show the empirical cumulative distribution function of the normalized errors (for $\rho^*$ and $\rho = 0$) and compare it to the Student prediction. In Fig. 12 we show the probability integral transform of the normalized errors (assuming a Student distribution, and using $\rho = \rho^*$). In addition, Fig. 12 shows the confidence intervals obtained by the surrogate data method, using data simulated under the assumption $\rho = \rho^*$. Again, the results confirm that the empirical forecast errors are compatible with Wright's law, Eq. (17).

In this section we apply our method to solar photovoltaic modules. Technological progress in solar photovoltaics (PV) is a very prominent example of the use of experience curves. Of course, the limitations of using experience curve models are valid in the context of solar PV modules; we refer the reader to the recent studies by Zheng & Kammen (2014) for a discussion of economies of scale, innovation output and policies; by de La Tour et al. (2013) for a discussion of the effects of input prices; by Hutchby (2014) for a detailed study of levelized costs (module costs represent only part of the cost of producing solar electricity); and to Farmer & Makhijani (2010) for a prediction of levelized solar photovoltaic costs made in 2010.

Historically (Wright 1936, Alchian 1963), the estimation of learning curves suggested that costs would drop by 20% for every doubling of cumulative production, although these early examples are almost surely symptoms of a sample bias. This corresponds to an estimated elasticity of about $\omega = -0.32$. As it turns out, estimates for PV have been relatively close to this number: here $\tilde{\omega} = -0.38$ implies that costs drop by $1 - 2^{-0.38} \approx 23\%$ per doubling. Experience curve studies of PV include Neij (1997), Isoard & Soria (2001), Schaeffer et al. (2004), Van der Zwaan & Rabl (2004), Nemet (2006), Papineau (2006), Swanson (2006), Alberth (2008), Kahouli-Brahmi (2009), Junginger et al. (2010), Candelise et al. (2013) and, recently, de La Tour et al. (2013).
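The arithmetic connecting the experience exponent to the learning rate is worth making explicit: each doubling of cumulative production multiplies cost by $2^{\omega}$, so the learning rate is $1 - 2^{\omega}$.

```python
# Learning rate implied by an experience exponent omega:
# each doubling of cumulative production multiplies cost by 2**omega.
for omega in (-0.32, -0.38):   # historical estimate; PV (Table 1)
    print(f"omega = {omega}: learning rate = {1 - 2**omega:.1%}")
```

This recovers the roughly 20% historical figure and the roughly 23% figure for PV quoted in the text.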
Our distributional forecast is

$$y_{T+\tau} \sim \mathcal{N}\left(\hat{y}_{T+\tau},\, V(\hat{y}_{T+\tau})\right). \quad (26)$$

Following the discussion of Sections 2.4 and 2.6, the point forecast is $\hat{y}_{T+\tau} = y_T + \tilde{\omega}(x_{T+\tau} - x_T)$ and the variance is

$$V(y_{T+\tau}) = \hat{\sigma}_\eta^2 \left( \rho^{*2} H_2^2 + \sum_{j=2}^{T-1} \left(H_j + \rho^* H_{j+1}\right)^2 + \left(\rho^* + H_T\right)^2 + (\tau-1)(1+\rho^*)^2 + 1 \right), \quad (27)$$

where $H_j = -\frac{\sum_{i=T+1}^{T+\tau} X_i}{\sum_{i=2}^{T} X_i^2}\, X_j$, recalling that $X_t \equiv x_t - x_{t-1}$.

We now assume that the growth of cumulative production is exactly $\tilde{r}$ in the future, so the point forecast simplifies to

$$\hat{y}_{T+\tau} = y_T + \tilde{\omega}\tilde{r}\tau. \quad (28)$$

Regarding the variance, Eq. 27 is cumbersome but it can be greatly simplified. First, we assume that past growth rates of experience were constant, that is $X_i = \tilde{r}$ for $i = 1 \ldots T$ (i.e. $\hat{\sigma}_x \approx 0$), leading to the equivalent of Eq. 21. As long as production grows exponentially this approximation is likely to be defensible (see Eq. 15). Although from Table 1 we see that solar PV is not a particularly favourable example, we find that this assumption does not affect the variance of forecast errors significantly, at least for short or medium run horizons.

Since Eq. 21 is still a bit complicated, we can further assume that $\tau \gg 1$ and $T \gg 1$, so that we arrive at the equivalent of Eq. 22, that is

$$V(y_{T+\tau}) \approx \hat{\sigma}_\eta^2 (1+\rho^*)^2 \left( \tau + \frac{\tau^2}{T-1} \right). \quad (29)$$

This equation is very simple and we will see that it gives results extremely close to Eq. 27, so that it can be used in applications.

Our point of comparison is the distributional forecast of Farmer & Lafond (2016) based on Moore's law with autocorrelated noise, estimating $\theta^*$ in the same way as $\rho^*$ (as the average across all technologies of the measured MA(1) coefficient, after removing the values $|\tilde{\theta}_j| \approx 1$). All other parameters are taken from Table 1. Fig. 13 shows the forecast for the mean log cost and its prediction interval for the two models. The point forecasts of the two models are almost exactly the same because $\tilde{\omega}\tilde{r} = -0.38 \times 0.318 \approx -0.121 = \tilde{\mu}$.
Moreover, Wright’s law predic-tion intervals are slightly smaller because ˆ σ η =0 . < ˆ K = 0 . . Overall, the forecasts arevery similar as shown in Fig. 13. Fig 13 doesalso show the prediction intervals from Eq. 29,in red dotted lines, but they are so close to those calculated using Eq. 27 that the difference canbarely be seen. PV m odu l e p r i c e i n $ / W p Figure 13: Comparison of Moore’s and Wright’s lawdistributional forecasts (95% prediction intervals). PV m odu l e p r i c e i n $ / W p ± ± ± Figure 14: Distributional forecast for the price ofPV modules up to 2025, using Eqs. 26, 27 and 28.
In Fig. 14, we show the Wright's law-based distributional forecast, but against cumulative production. We show the forecast intervals corresponding to 1, 1.5 and 2 standard deviations (corresponding approximately to 68, 87 and 95% confidence intervals, respectively). The figure also makes clear the large scale deployment assumed by the forecast, with a cumulative PV production (log) growth rate of 32% per year. Again, we note as a caveat that exponential diffusion leads to fairly high numbers as compared to expert opinions (Bosetti et al. 2012) and the academic (Gan & Li 2015) and professional (Masson et al. 2013, International Energy Agency 2014) literature, which generally assumes that PV deployment will slow down for a number of reasons such as intermittency and energy storage issues. But other studies (Zheng & Kammen 2014, Jean et al. 2015) do take more optimistic assumptions as working hypotheses, and it is outside the scope of this paper to model diffusion explicitly.
We presented a method to test the accuracy and validity of experience curve forecasts. It leads to a simple method for producing distributional forecasts at different forecast horizons. We compared the experience curve forecasts with those from a univariate time series model (Moore's law of exponential progress), and found that they are fairly similar. This is due to the fact that production tends to grow exponentially (recall that we selected technologies with a strictly positive growth rate of production), so that cumulative production tends to grow exponentially with low fluctuations, mimicking an exogenous exponential time trend. We applied the method to solar photovoltaic modules, showing that if the exponential trend in diffusion continues, they are likely to become very inexpensive in the near future.

There are a number of limitations and caveats that are worth reiterating here: our time series are examples from the literature, so the dataset is likely to have a strong sample bias, which limits the external validity of the results. Also, many time series are quite short, measure technical performance imperfectly, and we had to estimate initial experience in a way that is largely untested. Clearly, the experience curve model also omits important factors such as R&D. Finally, we make predictions conditional on future experience, which is not the same as doing prediction solely based on time. In settings where production is a decision variable, e.g. Way et al. (2017), forecasts conditional on experience are the most useful. However, it remains true that to make an unconditional forecast for a point in time in the future, using Wright's law also requires an additional assumption about the speed of technology diffusion. Thus in a situation of business as usual where experience grows exponentially, using Moore's law is simpler and almost as accurate.

The method we introduce here is closely analogous to that introduced in Farmer & Lafond (2016).
Although Moore’s law and Wright’s lawtend to make forecasts of similar quality, it is im-portant to emphasize that when it comes to pol-icy, the difference is potentially very important.While the correlation between costs and cumu-lative production is well-established, we shouldstress that the causal relationship is not. But tothe extent that Wright’s law implies that cumu-lative production causally influences cost, costscan be driven down by boosting cumulative pro-duction. In this case one no longer expects thetwo methods to make similar predictions, andthe method we have introduced here plays a use-ful role in making it possible to think about notjust what the median effect would be, but ratherthe likelihood of effects of different magnitudes.
Appendix

A Comparison of analytical results to simulations
To check whether the analytical theory is reasonable, we use the following setting. We simulate 200 technologies for 50 periods. A single time series of cumulative production is generated by assuming that production follows a geometric random walk with drift $g$ and volatility $\sigma_q$ (no correction for previous production is made). Cost is generated assuming Wright's law with noise parameters $\sigma_\eta$, $\omega$ and $\rho$.

Forecast errors are computed by the hindcasting methodology, and normalized using either the true $\rho$ or $\rho = 0$. The results are presented in Fig. 15 for $m = 5$, and for the estimated or true noise standard deviation ($\hat{\sigma}_v$ or $\sigma_v$). In all cases, using the proper normalization factor $\rho = \rho^*$ makes the distribution very close to the predicted distribution (Normal or Student). When $m = 5$ and the variance is estimated, we observe a slight departure from the theory, as in Farmer & Lafond (2016), which seems to be lower for large $m$ or when the true $\sigma_v$ is known.

Figure 15: Test of the theory for forecast errors. Top: $m = 5$; bottom: $m = 40$. Left: estimated variance; right: true variance.

Figure 16: Test of the theory for forecast errors. In the 3 cases $m = 5$. On the left, the variance is estimated. In the center, errors are normalized using the true variance. On the right, we also used the true variance but the errors are i.i.d.

To see the deviation from the theory more clearly, we repeat the exercise but this time we apply the probability integral transform to the resulting normalized forecast errors. We use the same parameters, and another realization of the (unique) production time series and of the (200) cost time series. As a point of comparison, we also apply the probability integral transform to randomly generated values from the reference distribution (Student when the variance is estimated, Normal when the known variance is used), so that confidence intervals can be plotted.
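A minimal version of this simulation experiment is sketched below: one shared cumulative-production path, 200 synthetic cost series obeying Wright's law, and one-step-ahead hindcast errors from rolling windows. For simplicity the noise here is i.i.d. ($\rho = 0$), and the parameter values are illustrative, since the originals are garbled in this copy:

```python
import numpy as np

rng = np.random.default_rng(42)
T, n_tech, m = 50, 200, 5
g, sigma_q, omega, sigma_eta = 0.10, 0.10, -0.4, 0.10

q = np.cumsum(g + sigma_q * rng.standard_normal(T))   # log production
x = np.log(np.cumsum(np.exp(q)))                      # log cumulative production

errors = []
for _ in range(n_tech):
    # Wright's law in first differences, with i.i.d. noise (rho = 0)
    dy = omega * np.diff(x) + sigma_eta * rng.standard_normal(T - 1)
    y = np.concatenate(([0.0], np.cumsum(dy)))        # log cost (level arbitrary)
    for start in range(T - m - 1):
        dx_w = np.diff(x[start:start + m + 1])
        dy_w = np.diff(y[start:start + m + 1])
        w_hat = np.sum(dx_w * dy_w) / np.sum(dx_w**2)  # OLS slope through origin
        T0 = start + m
        forecast = y[T0] + w_hat * (x[T0 + 1] - x[T0])
        errors.append(y[T0 + 1] - forecast)
errors = np.asarray(errors)
print(errors.std())
```

The pooled error standard deviation comes out slightly above $\sigma_\eta$, reflecting the extra variance contributed by estimating $\omega$ on short windows, which is exactly the effect the normalization theory accounts for.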
This allows us to see more clearly the departure from the Student distribution when the variance is estimated and $m$ is small (left panel). When the true variance is used (center panel), there is still some departure, but it is much smaller. Finally, for the last panel (right), instead of generating 200 series of 50 periods, we generated 198,000 time series of 7 periods, so that we have the same number of forecast errors but they do not suffer from being correlated due to the moving estimation window (only one forecast error per time series is computed). In this case we find that normalized forecast errors and independently drawn normal values are similar.

Overall these simulation results confirm under what conditions our theoretical results are useful (namely, $m$ large enough, or knowing the true variance). For this reason, we have used the surrogate data procedure when testing the errors with small $m$ and estimated variance, and we have used the normal approximation when forecasting solar costs based on almost 40 years of data.

B Derivation of the properties of cumulative production
Here we give an approximation for the volatility of the log of cumulative production, assuming that production follows a geometric random walk with drift $g$ and volatility $\sigma_q$. We use saddle point methods to compute the expected value of the log of cumulative production $E[\log Z]$, its variance $\mathrm{Var}(\log Z)$ and eventually our quantity of interest $\sigma_x^2 \equiv \mathrm{Var}(X) \equiv \mathrm{Var}(\Delta \log Z)$. The essence of the saddle point method is to approximate the integral by taking into account only that portion of the range of the integration where the integrand assumes large values. More specifically, in our calculation we find the maxima of the integrands and approximate fluctuations around these points, keeping quadratic and neglecting higher order terms.

Assuming the initial condition $Z(0) = 1$, we can write the cumulative production at time $t$ as $Z_t = 1 + \sum_{i=1}^{t} e^{gi + \sum_{j=1}^{i} a_j}$, where $a_1, \ldots, a_t$ are normally distributed i.i.d. random variables with mean zero and variance $\sigma_q^2$, describing the noise in the production process. $E[\log Z]$ is defined by the (multiple) integral over the $a_i$,

$$E(\log Z) = \int_{-\infty}^{\infty} \log Z \prod_{i=1}^{t} \frac{da_i}{\sqrt{2\pi\sigma_q^2}} \exp\left[-\frac{a_i^2}{2\sigma_q^2}\right] = \int_{-\infty}^{\infty} e^{S(\{a_i\})} \prod_{i=1}^{t} \frac{da_i}{\sqrt{2\pi\sigma_q^2}} \quad (30)$$

with $S(\{a_i\}) = \log(\log Z) - \sum_{i=1}^{t} \frac{a_i^2}{2\sigma_q^2}$, which we will calculate by the saddle point method assuming $\sigma_q \ll 1$.

The saddle point is defined by the system of equations $\partial_i S(\{a_i^*\}) = 0$, $\partial_i = \partial/\partial a_i$, $i = 1 \cdots t$, for which we can write

$$S(\{a_i\}) = S(\{a_i^*\}) - \sum_{ij} (a_i - a_i^*)(a_j - a_j^*)\, G_{ij} + O\left(\{(a_i - a_i^*)^3\}\right) \quad (31)$$

where $a_i^*$ is the solution of the saddle point equations and $G_{ij} = -\frac{1}{2}\,\partial_i \partial_j S(\{a_i\})\big|_{a_i = a_i^*}$. In the saddle point approximation we restrict ourselves to quadratic terms in the expansion (31), which makes the integral (30) Gaussian.
Then we obtain

$$E(\log Z) = (\det G)^{-1/2}\, e^{S(\{a_i^*\})}\, (2\sigma_q^2)^{-t/2}. \quad (32)$$

The saddle point equation leads to $\partial_n \left( \log(\log Z) - \frac{a_n^2}{2\sigma_q^2} \right) = 0$, which can be written as

$$a_i = \sigma_q^2\, \frac{\partial_i Z}{Z \log Z} = a_i^* + O(\sigma_q^4), \qquad a_i^* = \sigma_q^2\, \frac{\partial_i Z}{Z \log Z}\bigg|_{a_i=0}. \quad (33)$$

Substituting this $a_i^*$ into the $e^{S(\{a_i^*\})}$ term in (32) we obtain after some algebra

$$e^{S(\{a_i^*\})} = \left( \log Z + \frac{\sigma_q^2 \sum_{i=1}^{t} (\partial_i Z)^2}{2 Z^2 \log Z} \right)\bigg|_{a_i=0} + O(\sigma_q^4). \quad (34)$$

The calculation of $G_{ij}$ as a second derivative gives

$$G_{i,j} = \frac{1}{2\sigma_q^2}\left( \delta_{i,j} + \sigma_q^2 \left( \frac{(1+\log Z)\, \partial_i Z\, \partial_j Z}{Z^2 \log^2 Z} - \frac{\partial_i \partial_j Z}{Z \log Z} \right)\bigg|_{a_i=0} \right) + O(\sigma_q^2), \quad (35)$$

which leads to

$$(2\sigma_q^2)^{-t/2} (\det G)^{-1/2} = 1 + \frac{\sigma_q^2}{2}\left( \frac{\sum_{i=1}^{t} \partial_i^2 Z}{Z \log Z} - \frac{\sum_{i=1}^{t} (\partial_i Z)^2 (1+\log Z)}{Z^2 \log^2 Z} \right)\bigg|_{a_i=0} + O(\sigma_q^4). \quad (36)$$

Here we used the formula $\det G = \exp(\mathrm{tr} \log G)$ and an easy expansion of $\log G$ over $\sigma_q^2$. Now putting formulas (34) and (36) into (32) we obtain

$$E(\log Z) = \log Z\big|_{a_i=0} + \frac{\sigma_q^2}{2} \sum_{i=1}^{t} \left( \frac{\partial_i^2 Z}{Z} - \frac{(\partial_i Z)^2}{Z^2} \right)\bigg|_{a_i=0} + O(\sigma_q^4). \quad (37)$$

The calculation of $Z$ and its derivatives at $a_i = 0$ is straightforward. If $g > 0$, for large $t$ it gives the very simple formula

$$E(\log Z(t))\big|_{t\to\infty} = g(t+1) - \log(e^g - 1) + O(\sigma_q^2). \quad (38)$$

With the same procedure as for (30–38) we calculate $E\left((\log Z)^2\right)$ and $E(\log Z(t) \log Z(t+1)) - E(\log Z(t))E(\log Z(t+1))$, which leads to formulas similar to (33–37), but with different coefficients. The results for $g > 0$ and $t \to \infty$ read

$$\mathrm{Var}(\log Z(t)) = E\left((\log Z)^2\right) - E(\log Z)^2 = \sigma_q^2 \left( \frac{e^g + 1}{1 - e^g} + t \right) + O(\sigma_q^4) \quad (39)$$

and

$$\mathrm{Var}(\Delta \log Z) = E\left((\log Z(t+1) - \log Z(t))^2\right) - \left(E(\log Z(t+1) - \log Z(t))\right)^2 = \sigma_q^2 \tanh\left(\frac{g}{2}\right) + O(\sigma_q^4). \quad (40)$$

References
Alberth, S. (2008), 'Forecasting technology costs via the experience curve: Myth or magic?',
Technological Forecasting and Social Change (7), 952–983.

Alchian, A. (1963), 'Reliability of progress curves in airframe production', Econometrica (4), 679–693.

Anzanello, M. J. & Fogliatto, F. S. (2011), 'Learning curve models and applications: Literature review and research directions', International Journal of Industrial Ergonomics (5), 573–583.

Argote, L., Beckman, S. L. & Epple, D. (1990), 'The persistence and transfer of learning in industrial settings', Management Science (2), 140–154.

Arrow, K. J. (1962), 'The economic implications of learning by doing', The Review of Economic Studies pp. 155–173.

Ayres, R. U. (1969),
Technological forecasting and long-range planning, McGraw-Hill Book Company.

Bailey, A., Bui, Q. M., Farmer, J. D., Margolis, R. & Ramesh, R. (2012), Forecasting technological innovation, in 'ARCS Workshops (ARCS), 2012', pp. 1–6.

Benson, C. L. & Magee, C. L. (2015), 'Quantitative determination of technological improvement from patent data', PloS One (4), e0121635.

Berndt, E. R. (1991), The practice of econometrics: classic and contemporary, Addison-Wesley, Reading, MA.

Bettencourt, L. M., Trancik, J. E. & Kaur, J. (2013), 'Determinants of the pace of global innovation in energy technologies',
PloS One (10), e67864.

Bosetti, V., Catenacci, M., Fiorese, G. & Verdolini, E. (2012), 'The future prospect of PV and CSP solar technologies: An expert elicitation survey', Energy Policy, 308–317.

Boston Consulting Group (1972), Perspectives on Experience, 3 edn, The Boston Consulting Group, Inc., One Boston Place, Boston, Massachusetts 02106.

Candelise, C., Winskel, M. & Gross, R. J. (2013), 'The dynamics of solar PV costs and prices as a challenge for technology forecasting',
Renewable and Sustainable Energy Reviews, 96–107.

Clark, T. & McCracken, M. (2013), 'Advances in forecast evaluation', Handbook of Economic Forecasting, 1107–1201.

Clements, M. P. & Hendry, D. F. (2001), 'Forecasting with difference-stationary and trend-stationary models', Econometrics Journal (1), 1–19.

Colpier, U. C. & Cornland, D. (2002), 'The economics of the combined cycle gas turbine, an experience curve analysis', Energy Policy (4), 309–316.

de La Tour, A., Glachant, M. & Ménière, Y. (2013), 'Predicting the costs of photovoltaic solar modules in 2020 using experience curve models', Energy, 341–348.

Diebold, F. X., Gunther, T. A. & Tay, A. S. (1998), 'Evaluating density forecasts with applications to financial risk management', International Economic Review pp. 863–883.

Dutton, J. M. & Thomas, A. (1984), 'Treating progress functions as a managerial opportunity',
Academy of Management Review (2), 235–247.

Farmer, J. D. & Lafond, F. (2016), 'How predictable is technological progress?', Research Policy (3), 647–665.

Farmer, J. & Makhijani, A. (2010), 'A U.S. nuclear future: Not wanted, not needed', Nature, 391–393.

Ferioli, F. & Van der Zwaan, B. (2009), 'Learning in times of change: A dynamic explanation for technological progress',
Environmental Science & Technology (11), 4002–4008.

Funk, J. L. & Magee, C. L. (2015), 'Rapid improvements with no commercial production: How do the improvements occur?', Research Policy (3), 777–788.

Gallant, A. (1968), 'A note on the measurement of cost/quantity relationships in the aircraft industry', Journal of the American Statistical Association (324), 1247–1252.

Gan, P. Y. & Li, Z. (2015), 'Quantitative study on long term global solar photovoltaic market', Renewable and Sustainable Energy Reviews, 88–99.

Goldberg, M. S. & Touw, A. (2003), Statistical methods for learning curves and cost analysis, Institute for Operations Research and the Management Sciences (INFORMS).

Goldemberg, J., Coelho, S. T., Nastari, P. M. & Lucon, O. (2004), 'Ethanol learning curve, the Brazilian experience',
Biomass and Bioenergy (3), 301–304.

Hall, B. H. & Mairesse, J. (1995), 'Exploring the relationship between R&D and productivity in French manufacturing firms', Journal of Econometrics (1), 263–293.

Hall, G. & Howell, S. (1985), 'The experience curve from the economist's perspective', Strategic Management Journal (3), 197–212.

Harvey, A. C. (1980), 'On comparing regression models in levels and first differences', International Economic Review pp. 707–720.

Hutchby, J. A. (2014), 'A "Moore's law"-like approach to roadmapping photovoltaic technologies',
Renewable and Sustainable Energy Reviews, 883–890.

International Energy Agency (2014), Technology roadmap: Solar photovoltaic energy (2014 ed.), Technical report, OECD/IEA, Paris.

Isoard, S. & Soria, A. (2001), 'Technical change dynamics: evidence from the emerging renewable energy technologies', Energy Economics (6), 619–636.

Jamasb, T. (2007), 'Technical change theory and learning curves: patterns of progress in electricity generation technologies', The Energy Journal (3), 51–71.

Jean, J., Brown, P. R., Jaffe, R. L., Buonassisi, T. & Bulović, V. (2015), 'Pathways for solar photovoltaics', Energy & Environmental Science (4), 1200–1219.

Junginger, M., van Sark, W. & Faaij, A. (2010), Technological learning in the energy sector: lessons for policy, industry and science, Edward Elgar Publishing.

Kahouli-Brahmi, S. (2009), 'Testing for the presence of some features of increasing returns to adoption factors in energy system dynamics: An analysis via the learning curve approach',
Ecological Economics (4), 1195–1212.

Lieberman, M. B. (1984), 'The learning curve and pricing in the chemical processing industries', The RAND Journal of Economics (2), 213–228.

Lipman, T. E. & Sperling, D. (1999), 'Forecasting the costs of automotive PEM fuel cell systems: Using bounded manufacturing progress functions', Paper presented at IEA International Workshop, Stuttgart, Germany, 10–11 May.

Magee, C. L., Basnet, S., Funk, J. L. & Benson, C. L. (2016), 'Quantitative empirical trends in technical performance',
Technological Fore-casting and Social Change , 237–246.Martino, J. P. (1993),
Technological forecastingfor decision making , McGraw-Hill, Inc.Masson, G., Latour, M., Rekinger, M., The-ologitis, I.-T. & Papoutsi, M. (2013), ‘Globalmarket outlook for photovoltaics 2013-2017’,
European Photovoltaic Industry Association pp. 12–32.McDonald, A. & Schrattenholzer, L. (2001),‘Learning rates for energy technologies’,
En-ergy Policy (4), 255–261.McDonald, J. (1987), ‘A new model for learningcurves, DARM’, Journal of Business & Eco-nomic Statistics (3), 329–335.Meese, R. A. & Rogoff, K. (1983), ‘Empirical ex-change rate models of the seventies: Do theyfit out of sample?’, Journal of InternationalEconomics (1), 3–24.Moore, G. E. (2006), ‘Behind the ubiquitous mi-crochip’.Nagy, B., Farmer, J. D., Bui, Q. M. &Trancik, J. E. (2013), ‘Statistical basis forpredicting technological progress’, PloS One (2), e52669.Neij, L. (1997), ‘Use of experience curves toanalyse the prospects for diffusion and adop-tion of renewable energy technology’, EnergyPolicy (13), 1099–1107.Neij, L., Andersen, P. D., Durstewitz, M.,Helby, P., Hoppe-Kilpper, M. & Morthorst, P.(2003), Experience curves: A tool for energypolicy assessment , Environmental and EnergySystems Studies, Univ. Nemet, G. F. (2006), ‘Beyond the learningcurve: factors influencing cost reductions inphotovoltaics’,
Energy Policy (17), 3218–3232.Nordhaus, W. D. (2014), ‘The perils of the learn-ing model for modeling endogenous technolog-ical change’, The Energy Journal (1), 1–13.Papineau, M. (2006), ‘An economic perspectiveon experience curves and dynamic economiesin renewable energy technologies’, EnergyPolicy (4), 422–432.Sahal, D. (1979), ‘A theory of progress func-tions’, AIIE Transactions (1), 23–29.Sampson, M. (1991), ‘The effect of parameteruncertainty on forecast variances and confi-dence intervals for unit root and trend sta-tionary time-series models’, Journal of Ap-plied Econometrics (1), 67–76.Schaeffer, G. J., Seebregts, A., Beurskens, L.,Moor, H. d., Alsema, E., Sark, W. v.,Durstewicz, M., Perrin, M., Boulanger, P.,Laukamp, H. et al. (2004), ‘Learning fromthe sun; analysis of the use of experiencecurves for energy policy purposes: The caseof photovoltaic power. final report of the pho-tex project’, Report ECN DEGO: ECN-C–04-035, ECN Renewable Energy in the Built En-vironment .Schilling, M. A. & Esmundo, M. (2009), ‘Tech-nology s-curves in renewable energy alterna-tives: Analysis and implications for industryand government’,
Energy Policy (5), 1767– 1781.Sinclair, G., Klepper, S. & Cohen, W. (2000),‘What’s experience got to do with it? sourcesof cost reduction in a large specialty chemicalsproducer’, Management Science (1), 28–45.Söderholm, P. & Sundqvist, T. (2007), ‘Empir-ical challenges in the use of learning curvesfor assessing the economic prospects of renew-able energy technologies’, Renewable Energy (15), 2559–2578.22wanson, R. M. (2006), ‘A vision for crys-talline silicon photovoltaics’, Progress inphotovoltaics: Research and Applications (5), 443–453.Thompson, P. (2010), ‘Learning by doing’, Handbook of the Economics of Innovation , 429–476.Thompson, P. (2012), ‘The relationship be-tween unit cost and cumulative quantity andthe evidence for organizational learning-by-doing’, The Journal of Economic Perspectives (3), 203–224.Van der Zwaan, B. & Rabl, A. (2004), ‘Thelearning potential of photovoltaics: impli-cations for energy policy’, Energy Policy (13), 1545–1554.Van Sark, W. (2008), ‘Introducing errors inprogress ratios determined from experiencecurves’, Technological Forecasting and SocialChange (3), 405–415.Vigil, D. P. & Sarper, H. (1994), ‘Estimatingthe effects of parameter variability on learningcurve model predictions’, International Jour-nal of Production Economics (2), 187–200.Way, R., Lafond, F., Farmer, J. D., Lillo,F. & Panchenko, V. (2017), ‘Wright meetsMarkowitz: How standard portfolio theorychanges when assets are technologies follow-ing experience curves’, arxiv .West, K. D. (2006), ‘Forecast evaluation’, Hand-book of Economic Forecasting , 99–134.Witajewski-Baltvilks, J., Verdolini, E. &Tavoni, M. (2015), ‘Bending the learningcurve’, Energy Economics , S86–S99.Womer, N. K. & Patterson, J. W. (1983),‘Estimation and testing of learning curves’, Journal of Business & Economic Statistics (4), 265–272.Wright, T. P. (1936), ‘Factors affecting the costof airplanes’, Journal of the Aeronautical Sci-ences (4), 122—128. Yelle, L. E. 
(1979), ‘The learning curve: Histor-ical review and comprehensive survey’, Deci-sion Sciences (2), 302–328.Zhao, J. (1999), ‘The diffusion and costs of nat-ural gas infrastructures’.Zheng, C. & Kammen, D. M. (2014), ‘Aninnovation-focused roadmap for a sustainableglobal photovoltaic industry’, Energy Policy67