Renewable Generation Data for European Energy System Analysis

Abstract

In the process of decarbonization, the global energy mix is shifting from fossil fuels to renewables. To study decarbonization pathways, large-scale energy system models are utilized. These models require accurate data on renewable generation to develop their full potential. Using different data can lead to conflicting results and policy advice. In this work, we compare several datasets that are commonly used to study the transition towards a highly renewable European power system. We find significant differences between these datasets, and the resulting differences in the energy mix lead to cost differences of about 10%. We conclude that much more attention must be paid to the large uncertainties of the input data.



Alexander Kies, Bruno U. Schyska, Mariia Bilousova, Omar El Sayed, Jakub Jurasz, Horst Stoecker

Frankfurt Institute for Advanced Studies, Goethe-University Frankfurt, Ruth-Moufang-Str. 1, 60438 Frankfurt, Germany
German Aerospace Center (DLR), Institute of Networked Energy Systems, Carl-von-Ossietzky-Straße 15, 26129 Oldenburg, Germany
Goethe-University, Frankfurt, Germany
Faculty of Environmental Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
Faculty of Management, AGH University, 30-059 Cracow, Poland
School of Business, Society and Engineering, Mälardalens Högskola, 72113 Västerås, Sweden
GSI Helmholtzzentrum für Schwerionenforschung, Planckstraße 1, 64291 Darmstadt, Germany


Keywords: Energy System Analysis, Renewable Generation Data, Energy Meteorology, MERRA-2, ERA5, renewables.ninja, EMHires

Preprint submitted to Journal Name, January 22, 2021 (arXiv, physics.soc-ph)

1. Introduction

Sustainable energy sources are a major solution for the imminent threat of climate change [1, 2]. Increasing the number of installations of renewable generation capacities and electrifying energy sectors such as heating and transportation foster decarbonization. As part of the Paris Agreement, many countries around the world committed to reducing their greenhouse gas emissions. In turn, the EU set an aim to reduce emissions by at least 50% by 2030, as compared to 1990 levels [3]. In this context, the sensitivity of energy systems to weather and climate rises [4, 5]. Energy system models play an important role in understanding and investigating energy systems of various scales and scopes. For the renewable future, the use of adequate meteorological data in the field of energy system analysis and modelling is essential.

Common large-scale energy models for Europe cover the EU [6] and aggregate quantities such as generation from renewable sources to country levels. There are multiple datasets that provide data on hourly generation potentials from renewable sources such as wind and solar PV. They are based on reanalysis data, which is an assimilation of historical measurements and numerical models into a consistent estimate of the state of the atmosphere. These datasets provide relevant variables such as wind speed, irradiation or temperature, which are used to compute potential generation from renewables.

A comparison of two of these datasets was performed by Moraes et al. [7]. They found that datasets diverge from each other even if they are based on the same meteorological source. This is likely due to differences in the technological assumptions that are made to convert meteorological data to generation. These include, for instance, assumptions on the wind turbines used, which can be based on historical data or on projections into the future such as increasing hub heights [8], as well as on their placement.

In this paper, we compare seven datasets which provide time-resolved aggregated generation data on the country level. Only five of them cover the same time period (2003-2012); therefore, we focus on them in particular. Section 2 discusses these datasets and shows the major steps in converting meteorological data to energy system model input. Section 3 analyses the datasets with respect to different measures such as annual capacity factors, ramp rates and optimised mixes. Section 4 discusses the results and implications and Section 5 concludes this paper.

This paper contains novelties relevant for the research community. It provides the broadest comparison of renewable generation datasets to date, introduces energy system-related measures to compare them and derives important conclusions for research on the energy transition.

2. Data

Data on renewable generation potentials is essential for the modelling and analysis of energy systems with significant shares of generation from renewable sources. Among the most used datasets in the study of weather and climate are reanalyses. A meteorological reanalysis is a method to create long-term weather data using numerical weather prediction models and assimilating historical data. Reanalyses are used to study climate variability [9] and are commonly employed to study energy systems [10]. Wind speeds for the datasets investigated in this work are mostly based on two reanalyses: ERA5 and MERRA-2. MERRA [11] is provided by NASA. The official data production was launched in 2008 with the use of the up-to-date GEOS-5 (Goddard Earth Observing System Data Assimilation System Version 5) produced at the NASA GMAO (Global Modeling and Assimilation Office). ERA5 [12] is the newest global reanalysis by the European Centre for Medium-Range Weather Forecasts (ECMWF). Wind speed data are provided for heights of 10 m as well as 100 m. RE-Europe is the only considered dataset that uses ECMWF forecasts and the regional COSMO-REA6 reanalysis for the modelling of wind power [13]. For solar data, CM SAF SARAH is commonly used alongside the reanalyses [14]. It is a satellite-based climate data record of irradiance data and other variables.

MERRA-2 and ERA5 have been compared by various researchers. Olauson [15] compared ERA5 and MERRA-2 for the modelling of wind power on the country level as well as for individual turbines. His findings indicate that ERA5 performs considerably better than MERRA-2 on both levels. Camargo et al. [16] used ERA5 data to model multi-annual time series of solar PV generation. They performed a validation with hourly data of PV plants in Chile and found a slightly superior performance of ERA5 as compared to modelling based on MERRA-2. Gruber et al. [17] compared MERRA-2 and ERA5 for wind power simulation, bias-corrected with the Global Wind Atlas, for the US, Brazil, New Zealand and South Africa and found ERA5 to outperform MERRA-2. Urraca et al. [18] evaluated global horizontal irradiance estimates from ERA5 and COSMO-REA6, a regional reanalysis from the German Meteorological Service (DWD), and concluded that both reanalyses reduce the quality gap between reanalysis and satellite data. Jourdier [19] investigated ERA5, MERRA-2 and other datasets to examine wind power production in France and found that ERA5 is skilful but underestimates wind speeds, especially in mountainous areas. Piasecki et al. [20] compared ERA5 data with measurements at several locations in Poland and concluded that they are in good agreement for solar PV, with hourly correlation coefficients above 0.9, while the wind comparison showed a large variability with differences in capacity factors of up to 15 percentage points.

Dataset            Years       Wind    PV      Capacity layout   Temporal resolution   Ref.
renewables.ninja   1980-2019   MERRA   Sarah   yes               1h                    [21]
EMHires            1986-2015   MERRA   Sarah   yes               1h                    [22]
Restore            2003-2012   MERRA   Sarah   no                1h                    [23]
UReading-E         1979-2018   ERA5    ERA5    yes               3h                    [24]
UReading-M         1979-2018   MERRA   MERRA   yes               1h                    [25]
PyPSA-Eur          2013        ERA5    Sarah   yes               1h                    [26]
RE-Europe          2012-2014   diff.   diff.   no                1h                    [27]

Table 1: Datasets providing data on renewable generation that were analysed and compared in this work. Abbreviations used in this work are Ninja, URead-E/M and RE-Eur-E/C.

Table 1 shows key features of each dataset. Figure 1 shows the countries covered. To use a sufficiently long time period for every dataset, we set the time frame of the following analyses to the period 2003-2012 and compare the five datasets covering it.

For the sake of readability, we only briefly summarize the major steps of modelling renewable energy generation for the datasets investigated in this section, except for the Restore dataset, which we describe in more detail. In general, similar steps are taken for all datasets (e.g. Fig. 2). A meteo model converts meteorological data to the needed input for a power model, which yields space- and time-dependent power per installed capacity. The output of this model is aggregated by a capacity model, which represents how much capacity is installed at what location. The aggregated sum of this model yields the generation time series for the renewable generation.

[Figure 1 panels: emhires/rninja, reading, restore]

Figure 1: Country coverage of the datasets. Purple-colored countries include both solar PV and wind time series, yellow-colored ones include only wind. Both UReading datasets exclude Estonia. The Restore dataset excludes Norway and, for Ireland, Denmark, Finland and Estonia, provides only wind time series.
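The last step of the chain described above (power model output aggregated by a capacity model) can be illustrated with a minimal sketch. The function name, the list-based data layout and the example numbers are assumptions made for illustration; none of the datasets is claimed to be implemented this way.

```python
def aggregate_generation(power_per_unit, capacity_layout):
    """Aggregate per-cell power (per unit of capacity) to one normalised series.

    power_per_unit: one list per grid cell, each holding per-unit power (0..1)
                    for every time step (output of the "power model").
    capacity_layout: installed capacity per grid cell in MW ("capacity model").
    """
    if len(power_per_unit) != len(capacity_layout):
        raise ValueError("need one capacity value per grid cell")
    total_capacity = sum(capacity_layout)
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    n_steps = len(power_per_unit[0])
    series = []
    for t in range(n_steps):
        # Capacity-weighted sum over all grid cells at time step t.
        generated = sum(cap * cell[t]
                        for cap, cell in zip(capacity_layout, power_per_unit))
        series.append(generated / total_capacity)
    return series


# Two hypothetical grid cells, two time steps, capacities of 1 MW and 3 MW.
cells = [[1.0, 0.0], [0.0, 0.5]]
print(aggregate_generation(cells, [1.0, 3.0]))  # [0.25, 0.375]
```

A dataset without a capacity layout based on existing installations, such as Restore, would under this sketch simply pass a layout derived from the resource itself.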

2.1. Restore

The Restore dataset [23] was produced in the context of the RESTORE 2050 project. Besides wind and solar PV generation time series, it also contains time series of other generation technologies, such as CSP, wave and hydro inflow. To calculate wind power per grid cell in the Restore dataset, wind speeds are interpolated to the grid of the COSMO-EU model [28] and extrapolated to the turbine hub height using the logarithmic wind speed profile

$$\frac{s(z_1)}{s(z_2)} = \frac{\log(z_1/z_0)}{\log(z_2/z_0)}, \tag{1}$$

where $z_1$, $z_2$ are the heights at which wind speeds are given and desired, respectively, and $z_0$ is the surface roughness length provided by the dataset. Wind speeds are corrected via a linear regression against COSMO-EU wind speeds. The power curve of an Enercon E-126 turbine is then used to convert wind speeds to wind power at 100 m hub height. Additionally, the dataset provides capacity factors generated with the same power curve for hub heights of 140 m and 180 m. In the following comparison, we use the dataset based on 100 m wind speeds. Any wind turbine power curve mapping wind speed $s$ to power output $g$ is approximately given by [29, 30]

$$g(s) \begin{cases} = 0 & \text{for } s < s_0 \,\vee\, s > s_{\max} \\ \propto (s^3 - s_0^3) & \text{for } s_0 < s < s_{\mathrm{nom}} \\ = G & \text{for } s_{\mathrm{nom}} < s < s_{\max}. \end{cases} \tag{2}$$

For low wind speeds, the torque caused by the wind on the blades of a turbine is too low to generate angular momentum. Above a certain speed, referred to as cut-in speed $s_0$, turbines start to rotate and produce power. The power curve now resembles the characteristic $v^3$-dependency of the kinetic energy of wind passing through a certain area. At the rated wind speed $s_{\mathrm{nom}}$, the rated power of the turbine is reached. This power is kept constant by the turbine beyond this speed, commonly by adjusting the blade angles. If speeds increase further, they can reach a critical level, referred to as cut-out speed $s_{\max}$, at which the rotor blades are turned out of the wind to prevent structural damage to the turbine.

For PV power, global horizontal irradiance is converted to irradiation on inclined surfaces based on the Klucher model [31]. For tilt angles, optimized values per country are used. The efficiency of the PV modules as a function of incoming irradiation $I$ and temperature $T$ is modelled via a parametric model for a standard temperature $T_0$,

$$\eta(I, T_0) = a_1 + a_2 \times I + a_3 \ln I, \tag{3}$$

where $a_1$, $a_2$, $a_3$ are the parameters of the model. The temperature dependency is included via a linear correction with a temperature coefficient $c_T$,

$$\eta(I, T) = \eta(I, T_0)\,\bigl(1 - c_T\,(T - T_0)\bigr). \tag{4}$$

PV power output is then directly computable via

$$g_t = \eta_t I_t A. \tag{5}$$

Unlike the other datasets investigated in this work, the Restore dataset does not take the locations of existing generation capacities into account. Instead, country-wise capacities are distributed proportionally to the underlying resource [23].

[Figure 2, PV chain: satellite images; meteo model (cloud index n, clear-sky index k*, global horizontal irradiance, irradiance on inclined planes, ambient temperature from NWP MERRA, temperature downscaling, module temperature); power model (module efficiency, module configuration from meta-study, dc-power, inverter model, ac-power, feed-in per grid point); capacity model (capacity distribution Germany 2012, resource-dependent distribution, mapping of the distribution function to all countries and all years, country-level installed capacity from meta-studies, installed capacity at each grid point, accumulation on country level); PV feed-in per country.]

[Figure 2, wind chain: wind speed from MERRA reanalysis; meteo model (spatial interpolation and vertical extrapolation, linear regression, statistical downscaling of wind speed, vertically interpolated wind speeds from COSMO-EU analysis); power model (power curve of the Enercon E-126, feed-in per grid point); capacity model (capacity distribution Germany, February 2013, resource-dependent distribution, mapping of the distribution function to all countries and all years, country-level installed capacity from meta-studies, installed capacity at each grid point, accumulation on country level); wind feed-in per country.]

Figure 2: Schematic flowchart for PV/wind generation modelling for the RESTORE dataset. Figure from Kies et al. [23].

2.2. UReading

The UReading datasets [24, 25] were produced by researchers from the University of Reading, UK. They contain hourly values of aggregated power generation from wind and solar based on a representative distribution of wind and solar farms as well as ERA5 (UReading-E) or MERRA-2 (UReading-M) reanalysis data for 28 European countries. In addition, a daily time series of electricity demand is provided. It is used in this paper to optimise generation mixes.
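As an illustration of the wind conversion described for the Restore dataset, the following sketch combines the logarithmic profile of Eq. (1) with a cubic power curve of the shape in Eq. (2). The cut-in, rated and cut-out speeds are placeholder assumptions, not the Enercon E-126 values, and the function names are hypothetical.

```python
import math


def extrapolate_log_profile(speed_given, z_given, z_hub, z0):
    """Scale a wind speed from height z_given to z_hub with the log profile.

    z0 is the surface roughness length; all heights are in metres.
    """
    return speed_given * math.log(z_hub / z0) / math.log(z_given / z0)


def power_curve(speed, cut_in=3.0, rated_speed=13.0, cut_out=25.0,
                rated_power=1.0):
    """Idealised power curve: zero, cubic rise, constant rated power, zero.

    The default speeds are illustrative assumptions only.
    """
    if speed < cut_in or speed > cut_out:
        return 0.0
    if speed >= rated_speed:
        return rated_power
    # Cubic section, normalised so that it reaches rated power at rated_speed.
    return rated_power * (speed**3 - cut_in**3) / (rated_speed**3 - cut_in**3)


# 5 m/s measured at 10 m, roughness length 0.1 m, hub height 100 m.
hub_speed = extrapolate_log_profile(5.0, 10.0, 100.0, 0.1)
print(round(hub_speed, 3), round(power_curve(hub_speed), 3))
```

The per-unit output of `power_curve` corresponds to the power per installed capacity that the capacity model subsequently aggregates.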

2.3. Renewables.ninja

Renewables.ninja [21] is a web tool that provides potential generation of wind and solar PV for single locations or countries globally. Pfenninger and Staffell [32, 33] also found that significant correction factors are necessary to model renewable feed-in from reanalyses in Europe. To model PV power, they use the model by Huld et al. [34] with irradiation based on SARAH for different azimuth/tilt combinations. For wind power, they use a virtual wind farm model [35] and bias-correct it.

2.4. EMHires

European Meteorological derived HIgh resolution RES generation time series for present and future scenarios (EMHires) is a dataset produced by the Joint Research Centre (JRC) with the aim of allowing users to assess the impact of meteorological and climate variability on renewable generation in Europe [22, 36, 37]. For wind power, this dataset combines MERRA-2 reanalysis data with wind farm data from thewindpower.net [38]. Wind speeds are statistically downscaled and interpolated to the desired hub height using a wind profile power law. Wind speed data are then converted to wind power using a specific power curve assigned to each wind farm, considering the characteristics of the wind farm, such as the manufacturer. Wind power time series are further normalised to the reported ENTSO-E annual production statistics. For PV generation, the PVGIS model [39, 40, 41] is used to provide generation as a function of irradiation and solar module parameters. Together with assumptions such as inclinations, PVGIS output is then aggregated to country levels.
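Two of the steps named for EMHires, the power-law height extrapolation and the normalisation to reported annual production, can be sketched as follows. The shear exponent of 1/7 is a common textbook assumption and not a value taken from the dataset, and both function names are hypothetical.

```python
def extrapolate_power_law(speed_given, z_given, z_hub, alpha=1.0 / 7.0):
    """Wind profile power law: s(z_hub) = s(z_given) * (z_hub / z_given)**alpha.

    alpha is the shear exponent; 1/7 is an illustrative assumption.
    """
    return speed_given * (z_hub / z_given) ** alpha


def normalise_to_reported(modelled_generation, reported_annual_energy):
    """Rescale a modelled series so that its sum matches a reported total."""
    modelled_total = sum(modelled_generation)
    if modelled_total <= 0:
        raise ValueError("modelled series must have a positive sum")
    factor = reported_annual_energy / modelled_total
    return [factor * value for value in modelled_generation]


# Hypothetical modelled generation (MWh per step) scaled to a reported 12 MWh.
scaled = normalise_to_reported([1.0, 2.0, 5.0], 12.0)
print(scaled)  # [1.5, 3.0, 7.5]
```

A single scaling factor preserves the temporal shape of the series and changes only its level, which is one reason why year-to-year variations can agree between datasets even when absolute capacity factors differ.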

2.5. PyPSA-Eur

PyPSA-Eur is a dataset for European generation and transmission expansion planning studies built from freely available data [26]. Besides time series of renewable generation, it contains additional data to model a renewable European energy system, such as alternating/direct current transmission lines, substations, data on conventional generators and demand as well as renewable capacity installation potentials. For renewable generation data, it uses data from the ERA5 reanalysis with a logarithmic wind profile to extrapolate wind speeds to the desired hub height for wind power modelling. For PV conversion, it also uses the PVGIS model [40].

2.6. RE-Europe

RE-Europe [27] is a dataset produced at the Technical University of Denmark for the modelling of a highly renewable European power system. Analogously to PyPSA-Eur, it contains other data alongside renewable generation data, for instance data on transmission lines. To provide meteorological variables for both wind and solar PV, meteorological data from the ECMWF forecasts as well as the COSMO-REA6 reanalysis are used. For the final conversion step, it uses a smoothed power curve of a specific wind turbine (Siemens SWT 107) as well as a specific solar panel (Scheuten P6-54 215 Multisol Integra Gold) to convert meteorological variables to potential generation. Generation capacities are assigned to sub-national regions using two heuristic capacity layouts.

3. Results

In this section, we study different properties of the provided time series of generation per unit $g^{i}_{n,s,t}$. Here $i$ denotes the dataset, $n$ is the country, $s$ is the technology and $t$ is the time step.

The capacity factor of a generator is the ratio of the potential energy output over a given period of time to the maximum possible energy output as given by the rated capacity over that period [42],

$$\mathrm{cf} = \frac{\sum_t g^{+}_{n,s,t}}{G_{n,s}}. \tag{6}$$

For renewable generators, the capacity factor heavily depends on the resource availability as well as on the technical parameters. Occasionally, it is referred to as full load hours. Besides reductions due to resource unavailability, one can, for instance, factor grid congestion into effective capacity factors [43].

Figure 3 shows annual capacity factors of the different datasets. Annual reported capacity factors were calculated by dividing reported generation by reported installed capacities obtained from IRENA [44]. For wind, overall capacity factors differ significantly between datasets, while relative year-to-year changes are quite similar. UReading-M shows the highest capacity factor in almost all cases for wind and solar PV, while EMHires shows the lowest. Absolute capacity factor differences reach around 20 percentage points (UK wind), and capacity factors differ in some cases by a factor of almost 3 (Italy wind). The UReading-M and UReading-E datasets use different meteorological reanalyses as input but are based on the same assumptions. They show relatively good agreement for wind and are closest to each other in all cases except for the UK. For solar PV, UReading-M has relatively low agreement with the other datasets. This potentially supports the observation by Camargo et al. [16] that modelled PV power based on ERA5 agrees better with measurements than PV power based on MERRA-2. For wind, switching between MERRA-2 and ERA5 seems to have a smaller effect than choosing a different methodological approach, as demonstrated by the proximity of capacity factors for UReading-M and UReading-E.

Another important point is the difference between the modelled values and the reported capacity factors. In some cases, such as German wind, reported values are well captured by the Ninja and EMHires datasets, while results for PV do not seem to match the modelled values, in both absolute terms and year-to-year variations. However, one should keep in mind that not all datasets aim to reproduce historical values of capacity factors. Another likely interpretation of the fact that year-to-year changes of reported values differ from the modelled datasets is that modelled values are subject to meteorological year-to-year changes only, while reported values are also influenced by changing capacity layouts or technological parameters. The discrepancy between modelled and realized capacity factors, with the latter being considerably lower, has already led to public debates [45]. There is also a discrepancy between wind capacity factor estimates and realized values based on oversimplifications [46]. For PyPSA-Eur, the year 2013 and for RE-Europe the years 2012-2014 were considered. Capacity factors in both datasets look unremarkable with the exception of solar PV for RE-Europe; its capacity factors are substantially higher in all considered countries based on both ECMWF forecasts and COSMO-REA6.

[Figure 3 panels: wind CF and solar PV CF for UK, DE, FR, ES and IT]

Figure 3: Annual capacity factors for wind and solar PV for the chosen countries. RE-Europe does not cover the UK. Solar PV capacity factors are the highest for the RE-Europe dataset. Year-to-year changes are captured well by all datasets for wind, implying that the differences result from the conversion of meteorological data to generation. For solar PV, capacities in some countries such as the UK and Italy were very small (well below 1 GW) at the beginning of the investigated period. Strong changes in year-to-year reported capacity factors are potentially just caused by rounding or reporting errors. Also taking the large year-to-year changes in reported capacity factors into account, it seems questionable how valuable these values are in many cases for a comparison with modelled data.

[Figure 4 panels: Solar PV and Wind; axes: Longitude [Deg. East], Latitude [Deg. North]]

Figure 4: Average capacity factors for wind and solar PV in the years 2003-2012. The greyscale bar indicates the average capacity factor. The colorbar plots per country show the average deviation in percentage points from the mean of the ensemble (Restore, EMHires, URead-E, URead-M and Ninja). Colors are: EMHires (red), URead-E (blue), URead-M (green), Restore (purple), Ninja (orange) and reported (yellow). For solar PV, trends seem to be consistent across Europe. For wind, there is more variation. For instance, Restore produces comparably small capacity factors in South-East Europe and Norway, while for other North Sea countries Restore capacity factors are above average.

Figure 4 shows averaged capacity factors per country for five datasets and reported values. Reported values for solar PV are, for most countries, significantly below the modelled data mean, except for Sweden and Finland. URead-M reports the highest solar PV capacity factors for almost all countries. For solar PV, it can be concluded that the effect of the methodology and the chosen dataset is largely independent of the location. For wind, the picture looks different. Restore, for instance, has mostly lower capacity factors in Southern Europe, while in Northern and Northwestern Europe they are close to or above the ensemble mean. For EMHires, the picture is the opposite, with capacity factors below average in the North and Northwest and close to or above average in the South-East.
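The two kinds of capacity factor compared above, and the two ways of expressing their difference, can be made concrete with a short sketch. It assumes hourly per-unit availability series and a year of 8760 hours; the function names and example numbers are hypothetical.

```python
HOURS_PER_YEAR = 8760  # assumption: non-leap year, hourly resolution


def modelled_capacity_factor(availability):
    """Mean of a per-unit availability series (values between 0 and 1)."""
    return sum(availability) / len(availability)


def reported_capacity_factor(generation_mwh, capacity_mw,
                             hours=HOURS_PER_YEAR):
    """Reported annual generation divided by the maximum possible generation."""
    return generation_mwh / (capacity_mw * hours)


def full_load_hours(capacity_factor, hours=HOURS_PER_YEAR):
    """Capacity factor expressed as equivalent hours at rated power."""
    return capacity_factor * hours


def compare(cf_a, cf_b):
    """Absolute difference in percentage points and relative ratio."""
    return {
        "percentage_points": 100.0 * abs(cf_a - cf_b),
        "ratio": max(cf_a, cf_b) / min(cf_a, cf_b),
    }


# Hypothetical capacity factors of 0.15 and 0.45: about 30 percentage points
# apart in absolute terms and a factor of 3 apart in relative terms.
print(compare(0.15, 0.45))
```

Keeping both measures apart matters because a small absolute difference can still be a large relative one when capacity factors are low.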

The Pearson correlation coefficient measures the linear correlation between two variables. In the context of renewable generation, the Pearson correlation coefficient is defined as

$$\rho_{g^{\alpha}, g^{\beta}} = \frac{\mathrm{cov}\left(g^{\alpha}, g^{\beta}\right)}{\sigma_{g^{\alpha}}\, \sigma_{g^{\beta}}}. \tag{7}$$

The Pearson correlation coefficient is commonly used to study whether different renewable generation sources, or renewable generation sources at different locations that are connected via transmission, complement each other [47, 48, 49]. Here, we are interested in the correlations of the fast (on the order of hours) variations of the wind and solar PV time series. The slow variations caused by large-scale meteorological phenomena and the seasonal cycle of the sun should be similar in all datasets. Therefore, we apply a local normalisation as described in Schäfer and Guhr [50]: wind time series are normalised by computing the average and standard deviation over a moving window of 30 days. Solar PV time series are normalised separately for each hour of the day and with a window size of 20 days. This approach was used to reduce the dependency of the time series on the diurnal cycle in the case of solar PV and on large-scale synoptic conditions for wind. Figure 5 shows pairwise correlation coefficients over the ten years from 2003-2012 for exemplary countries. In general, correlations are very high between datasets despite emphasising the differences by using the moving window approach, except [...]

[Figure 5 panels: DE wind, DE solar PV, FR wind, FR solar PV, UK wind, UK solar PV, ES wind, ES solar PV, IT wind, IT solar PV. Each panel is a pairwise correlation matrix between EMHires, UReading-E, UReading-M, Restore and Ninja.]

Matrix values extracted next to the panel title "DE wind":

             EMHires   UReading-E   UReading-M   Restore   Ninja
EMHires      1         0.95         0.91         0.99      0.98
UReading-E   0.95      1            0.89         0.96      0.95
UReading-M   0.91      0.89         1            0.92      0.89
Restore      0.99      0.96         0.92         1         0.98
Ninja        0.98      0.95         0.89         0.98      1

Matrix values extracted next to the panel title "IT solar PV":

             EMHires   UReading-E   UReading-M   Restore   Ninja
EMHires      1         0.8          0.82         0.96      0.94
UReading-E   0.8       1            0.77         0.82      0.8
UReading-M   0.82      0.77         1            0.82      0.82
Restore      0.96      0.82         0.82         1         0.93
Ninja        0.94      0.8          0.82         0.93      1

Figure 5: Correlation of locally normalised three-hourly values from 2003-2012. Wind time series were normalised by computing the average and standard deviation over a moving window of 30 days. For solar PV, this was done for each hour of the day separately and with a window size of 20 days.
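The local normalisation followed by a Pearson correlation can be sketched as below. The centred window, the handling of the series edges and the function names are assumptions made for illustration; the window lengths are given in time steps and would have to be chosen to match 30 days (wind) or 20 days per hour of the day (solar PV).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def local_normalise(series, window):
    """Subtract the moving mean and divide by the moving standard deviation.

    Assumption: centred window, truncated at the edges of the series.
    """
    half = window // 2
    result = []
    for t in range(len(series)):
        chunk = series[max(0, t - half): t + half + 1]
        mean = sum(chunk) / len(chunk)
        std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))
        result.append((series[t] - mean) / std if std > 0 else 0.0)
    return result


def local_normalise_by_hour(series, steps_per_day, window_days):
    """Normalise every hour of the day separately (used here for solar PV)."""
    result = [0.0] * len(series)
    for hour in range(steps_per_day):
        # All values belonging to the same hour of the day form one sub-series.
        result[hour::steps_per_day] = local_normalise(
            series[hour::steps_per_day], window_days)
    return result


# Two hypothetical wind series that differ only by scale and offset.
a = [0.1, 0.5, 0.3, 0.9, 0.2, 0.7, 0.4, 0.8]
b = [3.0 * v + 2.0 for v in a]
print(round(pearson(local_normalise(a, 5), local_normalise(b, 5)), 6))
```

Because the normalisation removes local level and scale, two datasets that differ only by a bias or a constant factor remain perfectly correlated under this sketch, while differences in the fast variations lower the coefficient.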

Low generation events are times when resources for both wind and solar PV are not available. In Germany, these events are referred to as "Dunkelflaute", which can be translated as dark doldrums. Low-wind-power events for Germany were studied by Ohlendorf and Schill [51]. They found that long low-wind-power events are rare. An average wind capacity factor below 10% occurs for around five consecutive days every year and for a period of around eight days every ten years. We characterise these events by the aggregated generation from wind and solar PV of all countries together. First, normalised time series are multiplied with installed capacities per country and aggregated,

$$G_t = \sum_{n,s} G_{n,s}\, g_{n,s,t}. \tag{8}$$

ctry   wind   solar     ctry   wind   solar     ctry   wind   solar
AT      2.1     0.8     DE     44.9    38.2     PL      3.8     0.0
BE      2.0     3.1     GR      2.0     2.6     PT      4.9     0.4
BG      0.7     1.0     HU      0.3     0.0     RO      3.0     1.3
CR      0.3     0.0     IE      2.3     0.0     SK      0.0     0.6
CY      0.2     0.0     IT      8.7    18.4     SI      0.0     0.3
CZ      0.3     2.1     LV      0.0     0.0     ES     23.0     4.8
DK      4.8     0.6     LT      0.3     0.1     SE      5.4     0.1
EE      0.3     0.0     LU      0.1     0.1     UK     12.4     5.2
FI      0.6     0.0     MT      0       0.1
FR      9.3     5.6     NL      2.8     1.1

Table 2: Capacities per country in GW [52, 53] used to produce Fig. 6 and Fig. 7.

Generation capacities by country from wind and solar PV for 2014 are given in Table 2. The resulting time series $\{G_0, G_1, \ldots, G_{|t|}\}$ is then split into disjoint subsets $\{G_i, \ldots, G_j\}$ with $G_k < \alpha \sum_{n,s} G_{n,s}\ \forall k \in [i, \ldots, j]$, where $G_{i-1} \geq \alpha \sum_{n,s} G_{n,s}$ or $i = 0$, and $G_{j+1} \geq \alpha \sum_{n,s} G_{n,s}$ or $j = |t|$. The number of events is then given by the cardinality of the set of subsets $|\{\{G_i, \ldots, G_j\}, \ldots\}|$ and the average length by the average number of elements of these subsets. $\alpha$ is the threshold below which an event is considered to be a low generation event.

[Figure 6 panels, both titled "Events of low generation": number of events (top) and average length of event [h] (bottom), each against the threshold of generation capacity (0.00 to 0.10), for EMHires, UReading-E, Ninja, Restore and UReading-M.]

Figure 6: Number of EU28-wide low generation events (top) and their average length (bottom). EMHires shows the highest number of events and length. It should be noted that UReading-E provides only three-hourly data, which makes the direct comparison difficult. Besides, the other ERA5-based datasets show similar slopes for growing thresholds.
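The event definition above amounts to finding maximal runs of consecutive time steps below the threshold. A minimal sketch, with a hypothetical function name and made-up example values, could look like this:

```python
def low_generation_events(total_generation, total_capacity, alpha):
    """Return the lengths of all maximal runs with generation below threshold.

    total_generation: aggregated series G_t (same unit as total_capacity).
    total_capacity:   sum of all installed capacities.
    alpha:            threshold as a fraction of total capacity.
    """
    threshold = alpha * total_capacity
    lengths = []
    current = 0
    for value in total_generation:
        if value < threshold:
            current += 1          # event continues (or starts)
        elif current:
            lengths.append(current)  # event ended at the previous step
            current = 0
    if current:
        lengths.append(current)      # event reaching the end of the series
    return lengths


# Hypothetical aggregated generation in GW for 100 GW of capacity, alpha = 2%.
series = [5.0, 1.0, 1.0, 6.0, 0.0, 7.0, 1.0]
events = low_generation_events(series, 100.0, 0.02)
print(len(events), sum(events) / len(events))  # 3 events, mean length 4/3
```

With three-hourly data, as for UReading-E, each counted step covers three hours, so event counts and lengths are not directly comparable to those from hourly data.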

Figure 6 shows the statistics on low generation events for the datasets investigated. Average observed lengths are a few hours, and the number of events seems to grow roughly linearly with the threshold up to a threshold of 10% of total renewable generation capacity. While all datasets show a similar shape, the overall number of events differs considerably. It should be noted that for the UReading-E dataset the number of events is influenced by its three-hourly resolution. It is also interesting that the Ninja dataset shows a stronger growth rate, while most datasets show a similar growth of the number of events with increasing threshold. Truly extreme low generation events are very scarce in Ninja; however, their number grows rapidly with the threshold, overtaking the UReading-M dataset for α = 0.[...]

Another important measure for a power system is the ramp rate. Ramp rates describe changes in power output that can impact power system operation disproportionately. Batteries are considered to be important components that could deal with ramp rates of renewable generators [54, 55]. Ramp rates are often considered in energy resource assessment studies [56, 57]. Ramp rates of a power plant are calculated as the difference of the output over the elapsed time:

$$RR_{t,\delta t} = g_{t+\delta t} - g_t. \tag{9}$$

Figure 7 shows three-hourly Europe-wide ramp rates of the different datasets for wind and PV generation. While the distribution for wind is unimodal and symmetric, the solar PV ramp rate distributions show minor peaks at around ±5% of generation capacities. These are likely caused by the deterministic diurnal pattern of PV generation resulting from the rotation of the Earth. UReading-M ramp rates differ significantly from ramp rates calculated based on the other datasets if wind and solar PV are considered separately. However, if wind and solar PV are added together, these effects cancel out and the distributions become very similar, with the exception of the relatively prominent second peaks at values > ± [...].

Generation data are commonly used to study highly renewable power systems. These include a variety of technologies, such as conventional and renewable generation technologies, storage, transmission technologies, etc.

[Figure 7 panels: "Ramp rates wind + solar PV", "Ramp rates wind" and "Ramp rates solar PV"; x-axis: normalised three-hourly change in generation; y-axis: density; datasets: restore, emhires, reading, rninja, reading_merra.]
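The ramp rate of Eq. (9) is a simple lagged difference. The sketch below assumes a per-unit series at hourly resolution, so that a lag of three steps yields three-hourly ramp rates; the function name and the example values are hypothetical.

```python
def ramp_rates(generation, delta=1):
    """Differences g[t + delta] - g[t] for all t where both values exist."""
    return [generation[t + delta] - generation[t]
            for t in range(len(generation) - delta)]


# Hypothetical per-unit generation at hourly resolution.
hourly = [0.10, 0.20, 0.40, 0.35, 0.30, 0.10]
print(ramp_rates(hourly, delta=3))  # three-hourly changes
```

To compare wind plus solar PV as in the combined panel of the ramp rate figure, the two series would first be weighted by their capacities and divided by the total installed capacity before taking differences.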

Figure 7: EU28-wide three-hourly ramp rates relative to installed renewable capacities. The small shoulder peaks for solar PV are likely caused by the deterministic diurnal pattern of the sun. Note that the URead-M data show considerably higher ramp rates for wind than the other datasets. From the system point of view, high ramp rates lead to an additional need for balancing.

[...] ∝ capacity factor) [60, 61, 62]. To check the influence of the different datasets on a cost-optimal power system, we present the results of a simplified optimisation in this section. We optimise the mix of wind, solar PV and open cycle gas turbines (OCGT) as flexible backup power to cover the demand of the EU-28+CH+NO countries. Gas was chosen because it is often considered to be a bridge technology towards clean energy that buys time to foster the energy transition [63, 64].

The optimisation objective reads

$$\min_{g,G,F} \left( \sum_{n,s} c_{n,s} \cdot G_{n,s} + \sum_{n,s,t} o_s \cdot g_{n,s,t} \right). \tag{10}$$

It consists of capital costs $c_{n,s}$ for the installed capacity $G_{n,s}$ of a carrier $s$ (wind/solar PV) at node $n$ and marginal costs of generation $o_s$ for the energy generation $g_{n,s,t}$ of carrier $s$ at node $n$ and time $t$. It is furthermore subject to the following constraints. Demand in space and time needs to be met by dispatched generation from the various generating technologies,

$$\sum_s g_{n,s,t} - d_{n,t} = 0 \quad \forall\, n, t, \tag{11}$$

and dispatched generation is limited by the generation capacity times a weather-dependent availability $g^{+}_{n,s,t}$ for renewables,

$$0 \cdot G_{n,s} \leq g_{n,s,t} \leq g^{+}_{n,s,t} \cdot G_{n,s} \quad \forall\, n, t. \tag{12}$$

This availability is the hourly renewable capacity factor studied at the beginning of the Results section.

Technology   Capital Cost [USD/MW/a]   Marginal Cost [USD/MWh]   Emissions CO2 [ton/MWh]
OCGT         47,235                    58.385                    0.635
Wind         136,428                   0.005                     0
Solar PV     76,486                    0.003                     0

Table 3: Annualised cost assumptions for generation and storage technologies derived from different sources [65, 66].
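To give a feeling for the objective in Eq. (10) and the constraints in Eqs. (11) and (12), the sketch below solves a single-node toy version by brute force. This is not the linear programme used in the paper: it swaps in a grid search over wind and solar capacities with a merit-order dispatch, which is optimal for fixed capacities only because the renewable marginal costs are far below those of gas. Cost figures are taken from Table 3; the function names, the grid resolution and the assumption that each time step is one hour of a full year are illustrative.

```python
# Annualised cost assumptions from Table 3.
CAPITAL = {"ocgt": 47235.0, "wind": 136428.0, "solar": 76486.0}  # USD/MW/a
MARGINAL = {"ocgt": 58.385, "wind": 0.005, "solar": 0.003}       # USD/MWh
OCGT_EMISSIONS = 0.635                                            # t CO2/MWh


def system_cost(wind_cap, solar_cap, wind_avail, solar_avail, demand,
                co2_price=0.0):
    """Total cost for given renewable capacities (single node, 1 h steps)."""
    gas_marginal = MARGINAL["ocgt"] + co2_price * OCGT_EMISSIONS
    cost = CAPITAL["wind"] * wind_cap + CAPITAL["solar"] * solar_cap
    peak_gas = 0.0
    for w, s, d in zip(wind_avail, solar_avail, demand):
        # Merit order: solar first, then wind, gas covers the remainder.
        solar_gen = min(d, s * solar_cap)
        wind_gen = min(d - solar_gen, w * wind_cap)
        gas_gen = d - solar_gen - wind_gen
        peak_gas = max(peak_gas, gas_gen)
        cost += (MARGINAL["solar"] * solar_gen + MARGINAL["wind"] * wind_gen
                 + gas_marginal * gas_gen)
    # Gas capacity must cover the largest residual demand.
    return cost + CAPITAL["ocgt"] * peak_gas


def optimise_mix(wind_avail, solar_avail, demand, co2_price=0.0,
                 step=0.5, max_cap=10.0):
    """Grid search over capacities; returns (cost, wind_cap, solar_cap)."""
    best = None
    n = int(round(max_cap / step))
    for i in range(n + 1):
        for j in range(n + 1):
            cost = system_cost(i * step, j * step, wind_avail, solar_avail,
                               demand, co2_price)
            if best is None or cost < best[0]:
                best = (cost, i * step, j * step)
    return best


# Toy case without any renewable resource: the optimum is gas only.
print(optimise_mix([0.0, 0.0], [0.0, 0.0], [1.0, 1.0]))
```

Running the same search with availability series from different datasets, and with increasing `co2_price`, mimics in a very reduced form how the input data shift the cost-optimal mix.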

Cost assumptions for generation capacities $c_{n,s}$ and marginal costs $o_{s}$ as well as emission assumptions are given in Table 3.

[Figure 8 plot: wind share and solar share of generation and capacity versus CO2 price [USD/ton], 0 to 200; datasets: reading_e, reading_merra, emhires, restore, rninja.]

Figure 8: Shares of installed renewable capacities and generation. Capacity curves are less smooth (i.e. less monotonic) than generation curves. Besides, there is a shift in capacity between wind and solar for the datasets Restore and Ninja at low CO2 prices, indicating a flat optimum with respect to capacity. The remaining shares comprise gas power plants.

Figure 8 shows the shares of generation and capacity for renewables as a function of the price for emitting carbon dioxide. This price is multiplied by the emissions per unit of generation and added to the marginal cost of generation. A growing CO2 price supports the cost-competitiveness of renewables and increases their shares.
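As a worked example of this CO2 price adjustment, using the OCGT values from Table 3 and a price of 100 USD/ton (the symbols $\tilde{o}_{s}$, $p_{\mathrm{CO_2}}$ and $e_{s}$ are introduced here for illustration only and are not notation from the paper):

```latex
\tilde{o}_{s} = o_{s} + p_{\mathrm{CO_2}} \cdot e_{s},
\qquad
\tilde{o}_{\mathrm{OCGT}} = 58.385 + 100 \cdot 0.635 = 121.885\ \mathrm{USD/MWh}.
```

For wind and solar PV the emission factor in Table 3 is zero, so their marginal costs remain unchanged and only gas becomes more expensive.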

[Figure 9 plots: stacked bars of LCOE [USD/MWh] per dataset (reading_e, reading_merra, emhires, restore, rninja) at CO2 prices of 0 USD/ton and 100 USD/ton; components: OCGT CAPEX, Solar CAPEX, Wind CAPEX, OCGT OPEX.]

Figure 9: Levelized cost of electricity at CO2 prices of 0 USD/ton (top) and 100 USD/ton (bottom), consisting of capital expenditures (CAPEX) and operational expenditures (OPEX), which are negligible for renewables. Note that overall differences are not significantly larger at a high CO2 price. Costs are calculated without the imposed carbon emission price. At a low CO2 price, some datasets (URead-E and URead-M) already render wind generation relatively cost-competitive, while the other datasets do not.

Figure 9 shows the levelized cost of electricity and its composition for CO2 prices of 0 USD/ton and 100 USD/ton without the cost contribution from emitting carbon dioxide. Significant differences can be observed. Based on [...] price, wind contributes more than 10% to LCOE, indicating relatively large shares of installed capacities. At 100 USD per ton of carbon dioxide emissions, renewables gain significant ground with respect to their cost-competitiveness. Now, for all datasets, wind contributes around 30% to LCOE and solar PV around 10%.

[Figure 10 plot: two maps; axes: Longitude [Deg. East], Latitude [Deg. North]; values in USD/MWh.]

Figure 10: Country-wise LCOE for each meteorological dataset, compared with the ensemble mean. Colors are: EMHires (red), URead-E (blue), URead-M (green), Restore (purple), Ninja (orange) and reported (yellow). These LCOE values reflect the cost of a self-sufficient (autarkic) energy supply per country from renewables plus gas as a backup transition technology. LCOE depends strongly on the dataset at high LCOE values. Datasets that provide relatively low capacity factors, like EMHires in the case of Denmark, lead to an LCOE that is higher by more than 25%.

Figure 10 shows the levelized cost of electricity on the country level for the different datasets and carbon emission prices of 0 and 100 USD/ton. At 0 USD/ton, most countries rely entirely on gas, because renewables are not cost-competitive. However, for some countries, such as the UK, Ireland, or Denmark, where renewables successfully compete with gas, differences between the datasets are significant. Using datasets such as URead-E and URead-M, which provide relatively high capacity factors, lowers LCOE significantly, by about 10%. The difference in LCOE for countries that rely entirely on gas is based on differences in the demand patterns, especially between winter and summer. The difference in LCOE between datasets is more pronounced at the CO2 price of 100 USD/ton. Cost differences are drastic in some countries such as Denmark, where the LCOE results of different datasets can differ by more than 50%. The country-resolved plot shows that differences between countries can be significant if different meteorological datasets for power system optimisation are considered. If the whole continent is considered, they level out to a large degree.
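The decomposition of LCOE into CAPEX and OPEX shares used in Figure 9 can be sketched as follows. The source does not spell out its LCOE formula, so this assumes the common definition of annual cost divided by annual demand served, with the cost values of Table 3 and no CO2 price included. The function name and the example system are hypothetical.

```python
# Annualised cost assumptions taken from Table 3.
CAPITAL_COST = {"ocgt": 47_235.0, "wind": 136_428.0, "solar": 76_486.0}  # USD/MW/a
MARGINAL_COST = {"ocgt": 58.385, "wind": 0.005, "solar": 0.003}          # USD/MWh


def lcoe_breakdown(capacity_mw, generation_mwh, demand_mwh):
    """Return LCOE components in USD/MWh of demand, plus their total.

    capacity_mw    -- installed capacity per technology in MW
    generation_mwh -- annual generation per technology in MWh
    demand_mwh     -- annual demand served in MWh
    """
    parts = {}
    for tech in capacity_mw:
        parts[f"{tech} CAPEX"] = CAPITAL_COST[tech] * capacity_mw[tech] / demand_mwh
        parts[f"{tech} OPEX"] = MARGINAL_COST[tech] * generation_mwh[tech] / demand_mwh
    parts["total"] = sum(parts.values())
    return parts


# Hypothetical gas-only system: 1 MW of OCGT serving a flat 1 MW demand.
gas_only = lcoe_breakdown({"ocgt": 1.0}, {"ocgt": 8760.0}, 8760.0)
for name, value in gas_only.items():
    print(f"{name}: {value:.2f} USD/MWh")
```

In this definition a dataset with lower capacity factors requires more installed capacity per unit of energy, which raises the CAPEX components and therefore the LCOE.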

4. Discussion

Data on generation from renewable sources are commonly used in energy system models to study their behaviour and transition pathways or to create policy advice. Different data can lead to conflicting policy advice and the misallocation of large amounts of money.

The integration of renewable energy is a challenging task due to its weather dependency. To cope with non-dispatchable renewable energy sources, various concepts can be applied, such as optimizing the mix of different generation technologies, energy storage [67, 68], transmission, demand-side management [69, 70] or sector-coupling [58, 59]. When studying multiple datasets that are commonly used as input into large-scale energy system models, we observed significant differences between them. As the differences are also observed in datasets based on the same meteorological database, they can likely be attributed to different assumptions in the conversion process to country-aggregated time series.

Low generation events are defined as periods of continent-wide low generation from both wind and solar PV. As these occur continent-wide, a transmission grid does not help to tackle them. Instead, storage is a viable solution. With an average length of a few hours, daily and synoptic storage options seem to be a suitable choice to cope with these low generation events. Commonly, lithium-ion batteries are proposed as daily storage [71], and hydrogen storage or pumped hydro storage (PHS) is a viable option for the synoptic scale [72, 59]. Energy system models suffer from uncertainties not only in meteorological data, but also in the assumptions made. Various approaches have been studied to tackle the problem of uncertainty in energy system models [73, 74, 75, 76, 77].

Besides uncertainty from the generation data source itself and the assumptions made in creating it, additional uncertainty arises from the choice of period. Renewable generation resources tend to vary on decadal scales, and models predict that climate change also has profound effects on renewable generation: Schlott et al. [78] investigated the effects of climate change on a future European energy system and found an increasing competitiveness of solar PV due to changing correlation patterns. Wohland et al. [79] found an increasing need for backup energy due to climate change; Weber et al. [80] concluded the same and additionally found an increasing need for storage. Kozarcanin et al. [81] studied various metrics, such as variability of renewable generation or short-term dispatchable capacity, under climate change but concluded that there is no discernible effect on these measures. Bloomfield et al. [82] saw significant uncertainty in power system design due to climate change and called for a better understanding of this climate uncertainty. Besides uncertainty arising from meteorological data, other uncertainties affect power systems, such as uncertainty in cost assumptions as well as technological developments [83, 84]. The relevance of these effects should be compared to the relevance of uncertainty arising from meteorological data.

Considering the LCOE studied in subsection 3.5, it is interesting to note that differences in the results did not appear to get larger as renewable shares grew. This seems to contradict the naive expectation that growing renewable shares should increase the dependency of the results on the choice of meteorological data. However, in a fully detailed energy system model, complex interdependencies might exist that lead to a smaller or larger effect of different datasets. One should also note that capacity factors are not the sole factor in determining the cost-efficiency of renewables, because non-dispatchable renewables might start cannibalising themselves at high penetration levels, reducing their market value [85, 86]. Brown and Reichenberg [87] have argued that this is the result of policy. A possible way to deal with reductions in market value is system-friendly renewable generator designs [88, 89] that aim at producing more at times of higher prices, for instance PV modules with different azimuths/tilts [90].

A number of systematic differences complicate the analysis of the scenarios compared in this paper. First, the datasets are given with different temporal resolutions. While hourly data are considered sufficient to model country-spanning renewable energy systems with sufficient robustness [91], for three-hourly values this is less evident. However, owing to computational limitations and the fact that linear optimisation problems are P-complete, we had to find a sweet spot between model detail, temporal and spatial resolution [92, 93] and computational feasibility. Finally, we decided to focus on onshore wind. Offshore wind energy was treated differently in the different datasets. However, this effect is rather small, as the installed offshore capacities contributed only a small percentage of the overall wind capacities as of 2014. Nevertheless, the offshore share is rising quickly due to technical advances, cost reductions and limited onshore wind potentials [94, 95, 96, 97], although installation potentials vary significantly in the literature and some researchers suggest far higher numbers [98].
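The difference in temporal resolution mentioned above can be handled by bringing hourly series onto a three-hourly grid before comparison. The sketch below does this by block averaging, which is an assumption made here for illustration; the source does not state how the datasets were harmonised. The function name and the sample values are hypothetical.

```python
def to_three_hourly(hourly):
    """Average consecutive blocks of three hourly values.

    Assumes the series starts on a three-hour boundary and that its length
    is a multiple of three; otherwise a ValueError is raised.
    """
    if len(hourly) % 3 != 0:
        raise ValueError("series length must be a multiple of 3")
    return [sum(hourly[i:i + 3]) / 3.0 for i in range(0, len(hourly), 3)]


# Hypothetical hourly capacity factors for six hours.
hourly_cf = [0.0, 0.3, 0.6, 0.9, 0.9, 0.9]
print(to_three_hourly(hourly_cf))
```

Averaging preserves the mean capacity factor but smooths short peaks, which is one reason why results from three-hourly data may be less robust than those from hourly data.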

5. Summary and Conclusions

This paper compares several distinct datasets that provide renewable generation time series for both wind and solar PV on the country level. Different measures are used, such as ramp rates, correlation coefficients, annual capacity factors and optimized mixes of generation from wind, solar and gas. From the presented results, the following conclusions can be drawn:

• Differences between model statistics are significant, even if they are based on the same meteorological database. These differences are likely due to different assumptions about the conversion from weather to energy. There is a significant need for more research on the effects and interactions of choices made at every step of the weather-to-energy modelling chain, to achieve a better understanding.

• These significant differences have severe consequences for the optimisation and further studies of power systems. Differences in capacity factors directly affect both CAPEX and OPEX of renewable energy generation, transport and storage technologies. Hence their optimized shares in power system expansion models are quite sensitive to these uncertainties.

• Emphasis in renewable energy systems research must be put on the use of adequate generation data and the discussion of its properties before we can offer reliable research results and robust policy advice.

Future research must focus on the study of these critical issues using a reliable power system model that can capture complex dependencies between spatial and temporal effects as well as between different types of technologies. The methods proposed by Schyska and Kies [75], Nacken et al. [73] and Neumann and Brown [74] should be considered to improve the study of the effects of differences in the input weather data on the resulting output of large-scale energy system models.

Acknowledgements

Research is funded by the Federal Ministry of Economic Affairs and Energy (BMWi) under grant nr. FKZ03EI1028A (EnergiesysAI) and the Federal Ministry of Research and Education (BMBF) under grant nr. FKZ03EK3055C (CoNDyNet II). HS acknowledges the Judah Eisenberg Professor Laureatus of the Fachbereich Physik and the Walter Greiner Gesellschaft.

Data Availability

The datasets analysed in this work are available in a harmonised form under https://github.com/alexfias/compare met data/.

References