Global Household Energy Model: A Multivariate Hierarchical Approach to Estimating Trends in the Use of Polluting and Clean Fuels for Cooking

Abstract

In 2017 an estimated 3 billion people used polluting fuels and technologies as their primary cooking solution, with 3.8 million deaths annually attributed to household exposure to the resulting fine particulate matter air pollution. Currently, health burdens are calculated using aggregations of fuel types, e.g. solid fuels, as country-level estimates of the use of specific fuel types, e.g. wood and charcoal, are unavailable. To expand the knowledge base about impacts of household air pollution on health, we develop and implement a Bayesian hierarchical model, based on Generalized Dirichlet Multinomial distributions, that jointly estimates non-linear trends in the use of eight key fuel types, overcoming several data-specific challenges including missing or combined fuel use values. We assess model fit using within-sample predictive analysis and an out-of-sample prediction experiment to evaluate the model's forecasting performance.


Oliver Stoner, Gavin Shaddick and Theo Economou

Department of Mathematics, University of Exeter, Exeter, UK

Sophie Gumy, Jessica Lewis, Itzel Lucio, Giulia Ruggeri and Heather Adair-Rohani

World Health Organization, Geneva, Switzerland


Keywords: Air pollution; Bayesian hierarchical model; Forecasting; Generalized Dirichlet; Household; Solid fuels.

1. Introduction

In 2017, an estimated 3 billion people, or 39% of the global population, used a solid fuel (charcoal, coal, crop residues, dung, or wood) or kerosene as their primary fuel for cooking. This results in the emission of dangerous levels of pollutants, including fine particulate matter (PM2.5) and carbon monoxide (World Health Organization, 2014). The World Health Organization (WHO) has estimated that about 3.8 million deaths per year worldwide can be attributed to pollution from household cooking (World Health Organization, 2018b). This harm is compounded by the burden on people, notably women and children, who must dedicate large amounts of time to fuel collection which might otherwise be spent on education or work, and the risk of burn injuries.

To address this leading cause of disease and premature death in low- and middle-income countries, the 2030 Agenda for Sustainable Development, adopted by all United Nations member states, set a target of universal access to clean fuels and technologies for cooking (Sustainable Development Goal (SDG) 7.1.2) and to substantially reduce the number of deaths from the joint effects of ambient and household air pollution (SDG 3.9). Although there have been improvements in the proportion with access to clean fuels and technologies in some regions, globally these have been largely outpaced by population growth. This means that the absolute number of people without access to clean fuels and technologies has stagnated, decreasing only by 3% between 2000 and 2017. As a result, the world is only projected to achieve 74% clean fuel use by 2030 under current policy scenarios (SDG 7 Custodial Agencies, 2019).

In 2016 the World Health Assembly adopted a roadmap consisting of four priority areas of action to tackle the health risks of air pollution, notably ‘expanding the knowledge base about impacts of air pollution on health’ (World Health Organization, 2016).
Currently, the WHO publishes estimates of ‘polluting fuel use’ and ‘clean fuel and technology use’, representing the combined use of all polluting fuels and all clean fuels and technologies, respectively, for SDG monitoring. Here ‘use’ is defined as the proportion of people primarily relying on a given fuel or technology for cooking. In addition, the WHO presently assumes that ‘clean fuel use’ = ‘clean fuel and technology use’, due to the limited availability of data on the types of stoves used for cooking and the current absence of any scalable biomass stoves which can be considered ‘clean’ for health. These estimates are available for most countries, separately for urban and rural areas where fuel use trends often differ systematically, and for each year between 1990 and 2017. Conventionally, these estimates then serve as a practical surrogate for estimating the global burden of disease associated with using polluting fuels for cooking (Bonjour et al., 2013). However, basing estimates of health impacts on the combined use of polluting fuels fails to take into account variation in the risks associated with different fuels and technologies. Recently, Shupler et al. (2018) introduced a method for estimating exposure for several specific fuel types that takes into account variation in exposure between countries. Despite this, global burden of disease estimates based on the use of specific fuels remain unavailable, as this would also require global estimates of specific fuel use.

In this article, by developing and implementing a model for the use of eight specific fuel types, we make a substantial contribution to the expansion of the knowledge base on the impacts of household air pollution.
Our aims are to:

(i) Estimate trends in specific fuel usage, together with coherent estimates of uncertainty.
(ii) Provide meaningful estimates of individual fuel usage for countries where data is limited.
(iii) Predict present-day fuel usage, addressing lags in data collection, and project estimated trends into the future.

Trends in the use of specific fuel types are modelled together with survey sampling variability, which may vary between urban and rural areas and by country. Where data for a given country is limited, the model structure can derive information from regional trends. The model allows for different fuel use trends in urban and rural areas and is able to produce predictions (with associated uncertainty) of future use of different fuel types, providing policy makers with a baseline against which they can evaluate the effectiveness of future interventions.

The remainder of the paper is organized as follows: Section 2 provides details of the available data and the proposed modelling approach, including the implementation of the model using Markov chain Monte Carlo (MCMC); Section 3 presents posterior predictive model checking and a future forecasting experiment; and, finally, Section 4 provides an overall summary and a concluding discussion of the model’s impact.

2. Methodology

Information on the types of technologies and fuels used by households for cooking is regularly collected in nationally-representative household surveys or censuses and compiled in the WHO Household Energy Database (World Health Organization, 2018a). As of mid 2019, the database contains over 1100 surveys, with over 150 countries having at least one survey over the period 1990 to 2017. For each survey, the database contains the proportion of surveyed households using as their primary cooking fuel each of 10 key types: biogas; charcoal; coal; crop residues; dung; electricity; kerosene; liquefied petroleum gas (LPG); natural gas; and wood.

Over the period 1990 to 2017, the average number of surveys per country per year is around 0.3. Even if survey coverage were far greater, survey sampling variability means that individual surveys would still not be a reliable indicator on which to base policy decisions. Statistical models can be used to separate trends from sampling variability, while also allowing uncertainty in the trends to be appropriately quantified. Information from other sources, such as economic or social indicators, can also be included to allow for more reliable inference in countries with few surveys. For example, Rehfuess et al. (2006) use regression methods to quantify the association between solid fuel usage and a number of socio-economic factors to predict usage in countries where no data was available. An alternative source of information which can be exploited by statistical models is that the proportion of people using each fuel type as their primary cooking fuel tends to be more similar, on average, between countries in the same region, than between countries in different regions. Figure 1 illustrates differences in wood use by WHO region, with smooth density estimates of the proportion of households using wood as their primary cooking fuel, from surveys in years 1990 to 2017.
For example, the density estimates suggest that use of wood is more prevalent in African countries than in European countries over this period.

Using this data to evaluate trends in the use of specific fuels presents a number of challenges related to inconsistencies in both the quality and quantity of information that is available from the surveys. We specifically address four of these issues in our modelling approach:

(a) Many surveys report fuel values which are in some sense incomplete. This often includes combining more than one specific fuel type (e.g. LPG and natural gas) into a single option in the survey (e.g. gas). In some cases this can arise due to cultures and/or languages having a single term which includes several distinct fuel types (e.g. the French language term ‘charbon’ which can include both coal and charcoal). Another common problem is inconsistency in how sub-fuels are categorised: for example, grass may be included in the crop residues category in one survey and in the dung category in another. Other, less common, issues include non-exhaustive lists of individual fuel options, with key fuels included in an ‘other’ category, resulting in missing values for those fuels. These issues mean that the time series of survey values for some fuels in some countries can be highly unstable.

(b) The total number of respondents is only available for approximately 50% of surveys in the database. For surveys where this information is not available, only the proportions using each fuel are given and the original counts (the number of respondents using each fuel) are non-recoverable.

(c) Information on trends in the use of specific fuels is required for both urban and rural areas but, in many cases, surveys only provide data for the overall population.

[Figure 1: Primary Reliance on Wood for Cooking by Region. x-axis: Proportion of Respondents Mainly Using Wood; y-axis: Density; colour: WHO Region (Africa, Americas, Eastern Mediterranean, Europe, South-East Asia, Western Pacific). All Surveys (1990-2017). Source: Household Energy Database 2019 (WHO).]

Fig. 1. Smooth density estimates of the proportion of survey respondents relying on wood as their primary cooking fuel by WHO region, from all surveys contained in the WHO Household Energy Database (1990-2017).

For clarity of exposition, the following explanation relates to $y_i$, the number of respondents in a survey using fuel type $i$ as their primary fuel for cooking, ignoring for now any indices related to the country and the year. If we knew the total number of survey respondents $n$ for all data, a first approach to modelling could be to assume that data on $\boldsymbol{y} = \{y_i\}$ arise from a Multinomial($\boldsymbol{p}$, $n$) distribution. Then $p_i$ would represent the proportion of people in the population using fuel $i$. This assumes that the survey sample is representative of the overall population. In reality, survey samples are imperfect and the Multinomial model may not be sufficiently flexible to capture the extra variability caused by flaws in the survey design. For instance, the survey may not cover the whole geographical area of interest.

A flexible extension of this approach is to model $\boldsymbol{y}$ using a Generalized Dirichlet Multinomial($\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, $n$) (GDM) distribution, a mixture of the Generalized Dirichlet (GD) with probability density function (pdf):

$$p(p_1, p_2, \dots, p_k \mid \boldsymbol{\alpha}, \boldsymbol{\beta}) = p_k^{\beta_{k-1}-1} \prod_{i=1}^{k-1} \left[ \frac{p_i^{\alpha_i-1}}{B(\alpha_i, \beta_i)} \left( \sum_{j=i}^{k} p_j \right)^{\beta_{i-1}-(\alpha_i+\beta_i)} \right] \tag{1}$$

and the Multinomial distribution, so that

$$\boldsymbol{p} \sim \text{Generalized-Dirichlet}(\boldsymbol{\alpha}, \boldsymbol{\beta}); \qquad \boldsymbol{y} \mid \boldsymbol{p} \sim \text{Multinomial}(\boldsymbol{p}, n). \tag{2}$$

The marginal probability mass function (pmf) of the GDM is then:

$$p(y_1, y_2, \dots, y_k \mid \boldsymbol{\alpha}, \boldsymbol{\beta}, n) = \frac{\Gamma(n+1)}{\Gamma(y_k+1)} \prod_{i=1}^{k-1} \left[ \frac{\Gamma(y_i+\alpha_i)\, \Gamma\!\left(\sum_{j=i+1}^{k} y_j + \beta_i\right)}{B(\alpha_i, \beta_i)\, \Gamma(y_i+1)\, \Gamma\!\left(\alpha_i+\beta_i+\sum_{j=i}^{k} y_j\right)} \right]. \tag{3}$$

Any additional variability caused by non-representative sampling can be potentially captured by the GD component.
The GD also has a very flexible covariance structure compared to the Dirichlet, which it reduces to in the special case that $\beta_i = \alpha_{i+1} + \beta_{i+1}$ for $i \in \{1, \dots, k-2\}$ and $\beta_{k-1} = \alpha_k$.

Recall from Section 2 (b) that, for around half of the available data, only the proportion $\boldsymbol{x} = \{y_i/n\}$ of respondents using each fuel is available, with the total number of respondents $n$ being unknown. This means that we cannot use the GDM to directly model the number of respondents primarily using each fuel, if we wish to use all of the available data. However, as the principal interest lies in estimating or predicting trends in the fuel usage proportions $\boldsymbol{x}$, an alternative approach would be to model the proportions themselves, for example using a GD distribution. In that case, though, the presence of many 0% and 100% fuel usage observations (which fall outside the range space of the GD) makes this impractical. Instead, we opt for an approximate procedure for modelling $\boldsymbol{x}$, namely by transforming observations of $x_i$ into conceptual counts $v_i$, out of a chosen total $N$. To ensure that the sum of the transformed counts does not exceed $N$, one can compute $v_i = \lfloor N x_i \rfloor$ (using the floor function, as opposed to rounding). The counts $\boldsymbol{v}$ can then be modelled as GDM($\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, $N$), so that predictions are based on $v_i/N$. The idea behind this is that the flexibility of the GDM means that we can still capture the distribution of $\boldsymbol{x}$ well: any variability lost or gained from the Multinomial component, by respectively using a larger or a smaller $N$ compared to the original $n$, can be accounted for by appropriate adjustment in the parameters of the GD component.

In Appendix A, we present a simulation study using the observed sample sizes $n$ from the data. We illustrate that this approximate method yields an inference for the population-wide fuel usage which converges (as $N$ increases) to the inference obtained by modelling $\boldsymbol{y}$ directly.
Our simulation experiment suggested that values greater than $N = 10000$ are likely sufficiently large, so we conservatively opt for $N = 100000$. This results in a virtually zero contribution to the variability of $\boldsymbol{v}/N$ from the Multinomial component, bearing in mind that the GD component can absorb any additional variation associated with smaller sample sizes.
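The conceptual-count transformation and the GDM pmf in (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' NIMBLE implementation: the function names, the example proportions and the parameter values are hypothetical, and the total $n$ in (3) is taken to be the sum of the supplied counts.

```python
import math


def conceptual_counts(x, N):
    """Floor-transform proportions x into conceptual counts out of N.

    Flooring (rather than rounding) guarantees sum(v) <= N. Note that
    floating-point products can land marginally below an integer.
    """
    return [math.floor(N * xi) for xi in x]


def log_beta(a, b):
    """log B(a, b) computed via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gdm_log_pmf(y, alpha, beta):
    """Log of the GDM pmf in equation (3), with n taken as sum(y).

    y has k entries; alpha and beta have k - 1 entries each.
    """
    k = len(y)
    tail = sum(y)  # sum_{j=i}^{k} y_j, starting at i = 1 (equals n)
    logp = math.lgamma(tail + 1) - math.lgamma(y[k - 1] + 1)
    for i in range(k - 1):
        rest = tail - y[i]  # sum_{j=i+1}^{k} y_j
        logp += (
            math.lgamma(y[i] + alpha[i])
            + math.lgamma(rest + beta[i])
            - log_beta(alpha[i], beta[i])
            - math.lgamma(y[i] + 1)
            - math.lgamma(alpha[i] + beta[i] + tail)
        )
        tail = rest
    return logp


# Hypothetical survey proportions for three fuels (illustrative values only).
x = [0.5, 0.25, 0.25]
v = conceptual_counts(x, N=100000)
log_lik = gdm_log_pmf(v, alpha=[2.0, 1.5], beta=[3.0, 0.7])
print(v, log_lik)
```

Working on the log scale avoids overflow in the Gamma functions, which matters when the counts are of the order of $N = 100000$.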

To motivate the way in which we will employ the GDM for these data, it is instructive to consider Figure 2, which illustrates key cooking fuel types and how they are typically aggregated into more general classifications, e.g. solid fuels. In principle, it is possible to model the use of specific fuels directly using the GDM:

$$v_1, \dots, v_{11} \sim \text{GDM}(\boldsymbol{\alpha}, \boldsymbol{\beta}, N); \tag{4}$$

$$\{1, \dots, 11\} \equiv \{\text{wood}, \text{cropwaste}, \text{dung}, \text{charcoal}, \text{coal}, \text{kerosene}, \text{electricity}, \text{LPG}, \text{natural gas}, \text{biogas}, \text{others}\}. \tag{5}$$

Predictions for aggregate groups, e.g. solid fuels, can then be achieved by aggregating predictions for the individual fuels. However, recall that one of the key challenges with modelling this data, (a), is inconsistency in data collection. For example, some surveys combine more than one fuel type (e.g. charcoal and coal) into a single category. Furthermore, there is sometimes inconsistency in the way surveys categorise sub-fuels (e.g. grass). The result of this issue is that, for some countries, the time series of affected individual fuels are unstable. As such, modelling the use of all individual fuel types with one GDM (as in (4)) will adversely impact estimates for the mean trends, sampling variability and any associated uncertainty, not just for affected fuels but for the other fuels as well, owing to the multivariate nature of the model and the data.
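The aggregation of individual fuel predictions into more general groups can be sketched as follows. This is a minimal illustration under stated assumptions: the group memberships follow the hierarchy of Figure 2, while the function name and the three "posterior draws" are hypothetical numbers rather than model output. The point of the sketch is that aggregation is done draw by draw, before summarising, so that summaries of the aggregate reflect the joint uncertainty in its members.

```python
from statistics import median

# Membership of each aggregate category, following the hierarchy in Figure 2.
GROUPS = {
    "biomass": ["wood", "cropwaste", "dung"],
    "solid": ["wood", "cropwaste", "dung", "charcoal", "coal"],
    "gas": ["LPG", "natural gas", "biogas"],
}


def aggregate_draw(draw):
    """Add aggregate categories to one draw of individual fuel proportions."""
    out = dict(draw)
    for group, members in GROUPS.items():
        out[group] = sum(draw[m] for m in members)
    return out


# Three hypothetical posterior draws (illustrative numbers, not model output).
draws = [
    {"wood": 0.40, "cropwaste": 0.05, "dung": 0.05, "charcoal": 0.10,
     "coal": 0.02, "kerosene": 0.03, "electricity": 0.05, "LPG": 0.20,
     "natural gas": 0.05, "biogas": 0.01, "others": 0.04},
    {"wood": 0.38, "cropwaste": 0.06, "dung": 0.04, "charcoal": 0.10,
     "coal": 0.02, "kerosene": 0.03, "electricity": 0.05, "LPG": 0.21,
     "natural gas": 0.05, "biogas": 0.01, "others": 0.05},
    {"wood": 0.42, "cropwaste": 0.05, "dung": 0.05, "charcoal": 0.10,
     "coal": 0.02, "kerosene": 0.02, "electricity": 0.05, "LPG": 0.19,
     "natural gas": 0.05, "biogas": 0.01, "others": 0.04},
]

# Aggregate within each draw first, then summarise across draws.
solid = [aggregate_draw(d)["solid"] for d in draws]
solid_median = median(solid)
print(solid_median)
```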

[Figure 2: Hierarchy of Cooking Fuel Types (Global Household Energy Model). Solid Fuels: Biomass (Wood, Crop Waste, Dung), Charcoal, Coal; Kerosene; Gaseous Fuels: LPG, Natural Gas, Biogas; Electricity; Other Fuels.]

Fig. 2. Hierarchy of cooking fuel types in the Global Household Energy Model.

Fortunately, as they are the result of ‘confusion’ among certain fuel types, these issues can be resolved by aggregating individual fuels into more general fuel types. For example, confusion between wood, cropwaste and dung can be resolved by aggregating data for these fuels into the more general category ‘biomass’ (which in this paper includes raw/unprocessed biomass fuels but excludes charcoal), while any outstanding confusion between charcoal and coal or between charcoal and wood can be resolved by aggregating into ‘solid fuels’. Similarly, LPG and natural gas are very commonly combined at the survey level, which can be recognised by the formation of a ‘gas’ aggregate category.

This motivates the adoption of a tiered approach, where the use of the most aggregated fuel categories (e.g. solid fuels and gaseous fuels) is modelled as GDM at the ‘top’ tier (note that the tier does not relate to the merits or abundance of each fuel, only how we organise the fuels for modelling purposes), alongside other fuels that are unlikely to be confused or combined (e.g. kerosene and electricity) and an aggregation of other minor fuels and technologies (e.g. alcohol, solar stoves):

$$\{v_{\text{solid}}, v_{\text{kerosene}}, v_{\text{gas}}, v_{\text{electricity}}, v_{\text{others}}\} \sim \text{GDM}(\boldsymbol{\alpha}, \boldsymbol{\beta}, N). \tag{6}$$

This ensures that any instabilities arising from erroneous convolution of individual fuel types, e.g. charcoal and coal, do not propagate into the other fuel categories in the top tier. These categories can then be progressively disaggregated through nested GDM models. As in some countries there is convolution between biomass fuel types (e.g. wood and cropwaste), fully disaggregating solid fuels means that, in these countries, predictions for charcoal and coal will still be needlessly impacted. To address this, a ‘mid’ tier is introduced to aggregate the biomass fuel types and model these alongside charcoal and coal:

$$\{v_{\text{biomass}}, v_{\text{charcoal}}, v_{\text{coal}}\} \sim \text{GDM}(\boldsymbol{\alpha}, \boldsymbol{\beta}, v_{\text{solid}}). \tag{7}$$

The biomass fuel types can then be disaggregated in the ‘lower’ tier with a third GDM model:

$$\{v_{\text{wood}}, v_{\text{cropwaste}}, v_{\text{dung}}\} \sim \text{GDM}(\boldsymbol{\alpha}, \boldsymbol{\beta}, v_{\text{biomass}}). \tag{8}$$

We could then disaggregate ‘gas’ into the three individual gaseous fuels with a fourth GDM model (a parallel mid-tier). This is however not essential for our application (estimating population exposure to household air pollution) as the difference between the different gaseous fuels in terms of pollutant concentrations is minimal compared to the difference between the gaseous fuels and the polluting fuels (World Health Organization, 2014). Following this approach, the result is that a joint predictive inference for 8 individual fuel types is achieved, but in a way which prevents inconsistency in particular fuel types from affecting the others.

Recall that an additional challenge, (a), is that occasionally a value $x_i$ (and thus $v_i$) is missing for at least one individual fuel (for a given country-year combination). To model this data in a way that easily allows prediction of the missing fuel values, we implement each GDM (from the three tiers) using the implicit conditional densities rather than the joint one. Specifically, for counts $\boldsymbol{v}$ and total $N$, the conditional distribution of (fuel) $v_i$ given the others is: $v_i \mid \boldsymbol{v}_{-i}, \boldsymbol{\alpha}, \boldsymbol{\beta} \sim \text{Beta-Binomial}(\alpha_i, \beta_i, n_i = N - \sum_{j}$

1) represent the mean proportion of survey respondents living in an urban area, in country $c$ and year $t$. To capture structured demographic variability between countries and over time, UN estimates (United Nations, 2018) of the proportion of people living in an urban area for each country and year, $P_{c,t}$, are included as offsets in the model for $\pi_{c,t}$. For each country, any remaining structured variability in the urban proportion is modelled using a smooth function $g_c(t)$. These functions should ideally be flexible enough to capture the mean urban proportions well. However, from a modelling perspective, they also introduce extra degrees of freedom to capture the overall survey observations well. Therefore, to avoid over-fitting, we once again employ penalized thin-plate splines for $g_c(t)$:

$$g_c(t) = \kappa_{0,c} + \kappa_{1,c} X_{t,1} + \sum_{k=2}^{K} \kappa_{k,c} X_{t,k}; \tag{27}$$

$$\kappa_{0,c} \sim \text{Normal}(0, \sigma^{2(\kappa)}_{0}); \tag{28}$$

$$\kappa_{1,c} \sim \text{Normal}(0, \sigma^{2(\kappa)}_{1}); \tag{29}$$

$$\{\kappa_{2,c}, \dots, \kappa_{K,c}\} \sim \text{Multivariate-Normal}(\boldsymbol{0}, \Omega^{-1(\kappa)}_{c}). \tag{30}$$

Each precision matrix $\Omega^{-1(\kappa)}_{c}$ is, for one final time, a known matrix, scaled by a penalty parameter $\lambda^{(\kappa)}_{c}$ for smoothness. Then, $\log(\lambda^{(\kappa)}_{c}) \sim \text{Normal}(\upsilon^{(\kappa)}, \sigma^{2(\kappa)}_{2})$. Unlike the splines for $\nu_{i,j,c,t}$, the prior expectations are zero, as opposed to regional or super-regional. This is because we have no prior belief that residual deviation from UN estimates in the sampling of urban respondents should be regionally structured. Employing thin-plate splines here allows $g_c(t)$ to capture non-linear deviations from $P_{c,t}$ over time, but only when there is ample evidence in the data for a given country.

In addition to the main data-specific modelling challenges highlighted in Section 2, the database contains some recorded values which truly defy the observed trend in their country.
These values often can’t be explained by normal survey variability alone, and can have an undue influence on the estimated trend if treated like ordinary observations. While the Beta-Binomial conditional models we employ are already more robust to outliers than equivalent Binomial models, severe outliers can still cause issues, including pulling the estimated trend away from other surveys and towards the outlier, or the over-estimation of survey variability.

To address this problem, we model each observation as arising from a mixture distribution, which combines the Beta-Binomial conditional model with a discrete Uniform distribution. The extent to which the model is either Beta-Binomial or Uniform is controlled by the mixing parameter $\rho$: as $\rho$ approaches 0, the mixture becomes Beta-Binomial, and as $\rho$ approaches 1 it becomes Uniform:

$$p(v_i \mid \boldsymbol{v}_{-i}, \boldsymbol{\alpha}, \boldsymbol{\beta}) = (1-\rho) \binom{n_i}{v_i} \frac{B(v_i+\alpha_i, n_i-v_i+\beta_i)}{B(\alpha_i, \beta_i)} + \rho\, \frac{1}{n_i}. \tag{31}$$

This approach effectively allows the model to decide, given sufficient evidence in the data, whether or not a survey observation could plausibly have arisen from the same model as other nearby (in time) surveys for that country and area. The degree of evidence required can be controlled through the prior distribution specified for each $\rho$. For example, a strong prior distribution with most of the probability mass close to 0 for each $\rho$ corresponds to a strong belief that each survey value is very unlikely to be an outlier.
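The effect of the mixture can be illustrated with a short Python sketch. It follows the convention stated in the text, namely that $\rho$ close to 0 recovers the Beta-Binomial, and writes the Uniform term as $1/n_i$ as in (31). The function names and all numerical values are hypothetical and chosen only to show how a small $\rho$ keeps the likelihood of a severe outlier from collapsing towards zero.

```python
import math


def log_beta(a, b):
    """log B(a, b) computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(v, n, a, b):
    """Beta-Binomial pmf: C(n, v) * B(v + a, n - v + b) / B(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(v + 1) - math.lgamma(n - v + 1)
    return math.exp(log_choose + log_beta(v + a, n - v + b) - log_beta(a, b))


def mixture_pmf(v, n, a, b, rho):
    """Outlier mixture: weight (1 - rho) on the Beta-Binomial and rho on
    the Uniform term, the latter written as 1 / n as in equation (31)."""
    return (1.0 - rho) * beta_binomial_pmf(v, n, a, b) + rho / n


# Hypothetical setting: the trend implies usage near 20%, but one survey
# reports 90% (all numbers are illustrative assumptions).
n, a, b = 1000, 20.0, 80.0
typical, outlier = 200, 900
for rho in (0.0, 0.01, 0.5):
    print(rho, mixture_pmf(typical, n, a, b, rho), mixture_pmf(outlier, n, a, b, rho))
```

With $\rho = 0$ the outlying value has a vanishingly small likelihood, so fitting it would require distorting the trend or inflating the variability; with even a small positive $\rho$ the Uniform term dominates for that observation, which is what lets the trend stay with the other surveys.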

For this application, we introduce one $\rho$ for each unique survey. This means that if a survey has an urban, rural, and an overall value, a single $\rho$ controls the extent of mixing for all three. The reason for this is that if, for example, the model indicates an urban value is a very severe outlier, we would prefer to also reduce the effect of the corresponding rural value on estimated trends and uncertainty. Including this layer in the model means that estimated trends are considerably more robust to outliers, as we will highlight in Section 3.1. Additionally, predictions for $\rho$ are useful as an indicator to efficiently flag surveys that may warrant further investigation.

For all hyper-parameters $\upsilon$ which are the mean of a Normal distribution (e.g. $\upsilon^{(\beta)}_{i,j}$), we specified non-informative Normal(0, ) prior distributions. For all hyper-parameters $\sigma$ which are the standard deviation of a Normal distribution (e.g. $\sigma^{(\beta)}_{0,i,j}$), we specified non-informative positive-truncated Normal(0, ) prior distributions.

All code was written and executed using R (R Core Team, 2018) and the model was implemented using NIMBLE (de Valpine et al., 2017), a facility for highly flexible implementation of Markov Chain Monte Carlo (MCMC) models. For this application, we needed to add the Beta-Binomial distribution to NIMBLE, which was straightforward using only a few lines of R code. Four MCMC chains were run for 80,000 iterations from different randomly generated initial values and with different random number generator seeds. The first 40,000 samples were discarded as burn-in and, to limit system memory usage, the remaining samples were thinned by 10. Convergence of the MCMC chains is discussed in Appendix B. The model was applied to a subset of the data consisting of 1084 surveys and predictions were made for all countries with at least one survey (after selection). Survey selection criteria are discussed in Appendix D.
Associated NIMBLE model code is included as supplementary material and data is available on request.
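As a quick check on the MCMC settings described above, the number of posterior samples retained can be worked out directly. The sketch below assumes that thinning keeps exactly every 10th post-burn-in sample; the function name is hypothetical and the calculation is independent of any particular MCMC software.

```python
def retained_samples(n_chains, n_iter, burn_in, thin):
    """Samples kept per chain and in total after burn-in and thinning."""
    per_chain = (n_iter - burn_in) // thin
    return per_chain, per_chain * n_chains


per_chain, total = retained_samples(n_chains=4, n_iter=80_000, burn_in=40_000, thin=10)
print(per_chain, total)  # 4000 16000
```

Under that assumption, each chain contributes 4,000 samples, giving 16,000 posterior samples in total.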

3. Model Checking

The task of assessing the validity of the statistical model is divided into two parts: basic procedures to check there are no systematic issues with reproducing the observed data and a forecasting experiment to evaluate the ability of the model to predict future fuel usage values.

Given the Bayesian implementation of the model, assessing the fit to both in-sample and out-of-sample data is based on posterior predictive model checking (Gelman et al., 2014). For in-sample data, this involves using samples from the joint posterior distribution of parameters and random effects (which are already available from MCMC) to simulate $v_i$ from the conditional Multinomial distribution. This results in samples from the posterior predictive distribution for replicates $\tilde{\boldsymbol{x}} \mid \boldsymbol{x}$ of the observed fuel proportions $\boldsymbol{x}$. The statistical properties of these replicates can then be compared to properties of the corresponding observations. For brevity, we present predictive checking for solid fuel use in this subsection and for all of the other fuel types in Appendix C.

[Figure 3: Solid. x-axis: Median Predicted Usage; y-axis: Observed Usage.]

Fig. 3. Scatter plots comparing posterior means of solid fuel usage replicates $\tilde{x}_{\text{solid},j,c,t}$ to their corresponding observed values.

In the first instance, scatter plots comparing the posterior means of the replicates with the observed values can give an indication of any systematic issues. These are shown for solid fuels in Figure 3 and, for the most part, there are no obvious systematic problems. Also shown are coverage values: the proportion of observed solid fuel use values which lie within the 95% posterior predictive intervals, computed from the corresponding replicates. A coverage substantially lower than 95% would mean a high proportion of observed values are extreme values with respect to the posterior predictive model, implying a poor fit. In this case, the coverage values for the 95% credible intervals were higher than 95% for all fuels and areas. Taken together, these two checks indicate that the model captures the observed data well.
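The coverage check described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, quantiles are computed by simple linear interpolation, and an evenly spaced grid stands in for the posterior predictive draws that MCMC would provide.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def coverage(observed, replicates, level=0.95):
    """Share of observations inside their central posterior predictive interval.

    replicates[i] holds the posterior predictive draws for observed[i].
    """
    tail = (1.0 - level) / 2.0
    inside = 0
    for obs, draws in zip(observed, replicates):
        s = sorted(draws)
        if quantile(s, tail) <= obs <= quantile(s, 1.0 - tail):
            inside += 1
    return inside / len(observed)


# Hypothetical replicates: an evenly spaced grid standing in for MCMC output.
grid = [i / 100 for i in range(101)]
print(coverage([0.50, 0.99], [grid, grid]))  # 0.5
```

In this toy example the 95% interval is roughly (0.025, 0.975), so the observation 0.50 is covered and 0.99 is not, giving a coverage of 0.5.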

[Figure 4: India. Rows: Wood, Cropwaste, Dung, Charcoal, Coal, Kerosene, Gas*, Electricity; columns: Urban, Rural, Overall.]

Fig. 4. Predicted fuel usage trends (median and 95% prediction intervals) for India. Coloured points are survey observations and black points are removed surveys. For each fuel, the left, central and right plots show urban, rural and overall usage, respectively.

Another way of checking the model is to compare predicted trends to survey observations on an individual country basis. Figure 4 shows the median predicted proportion using each fuel in each segment (urban, rural and overall) of India, with associated 95% posterior predictive intervals. Here it can be seen that the predicted trends follow the observed trends well, with prediction intervals that envelop a reasonable number of surveys. Moreover, by examining the tightness of the prediction intervals with respect to the variance of the observations, we can see that the high coverage values obtained for the replicate prediction intervals are not simply caused by excessively high model uncertainty.

[Figure: predicted fuel usage for Colombia. Columns: Urban, Rural, Overall. Rows: Wood, Cropwaste, Dung, Charcoal, Coal, Kerosene, Gas*, Electricity.]

Fig. 5. Predicted fuel usage trends (median and 95% prediction intervals) for Colombia.

A similar plot is shown for Colombia in Figure 5. Looking at the use of gas in 2010, we can see there is one survey with an unusually high overall value. Through the ρ corresponding to this survey, the model suggests that this value is likely an outlier, such that the estimated trend and variability are not adversely affected. This illustrates the effectiveness of incorporating mixture distributions (as described in Section 2.6) in making the model more robust to outliers.

Note that to check whether the model reproduces the observed data well, the overall predictions in Figures 4 and 5 incorporate the model’s prediction of any systematic deviation ($g_c(t)$) from the UN estimates of urban and rural proportions, in the sampling of urban and rural respondents. If desired, predictions of overall fuel usage can instead be based solely on the UN estimates of urban and rural proportions (rather than based on the proportions in the surveys). This is achieved by removing $g_c(t)$ from (26) during simulation.

Predicted fuel usage plots which include survey sampling variability (as in Figures 4 and 5) are included as supplementary material for the 8 most populous countries (as of late 2019, excluding the US and Russia).

We can also inspect the model’s ability to capture structured between-country and temporal variability in the proportions of urban and rural respondents in the survey samples: Figure 6 shows the proportion of (unweighted) respondents recorded as urban in the fuel surveys for Kenya (left) and Malawi (right) compared to UN estimates and predicted values from the model. The plot for Kenya shows evidence that the proportion of urban respondents in the surveys is, on average, higher than the UN estimates ($g_c(t) >$ […] $g_c(t) \approx$ […]

[Figure: urban proportion of survey respondents for Kenya (left panel) and Malawi (right panel). y-axis: Urban Proportion. Legend: Surveys, Model Prediction, U.N. Estimate.]

Fig. 6. Predicted mean survey urban proportions for Kenya (left) and Malawi (right), compared to observed survey urban proportions and associated U.N. urban population estimates.

The model’s ability to predict (forecast) fuel usage beyond the range of the data can be assessed using out-of-sample predictive testing. This is important to validate the model’s use for predicting present-day fuel use, as there is a lag in data collection of 1-2 years, and for projecting estimated trends into the future, to provide a baseline against which the effects of interventions can be compared. To emulate a hypothetical forecasting scenario, the model was fitted only to surveys up to and including year 2012, therefore excluding 5 years (approximately 22% of the data). We then used the model to predict 5 years into the future and produce predictive distributions for the out-of-sample surveys. As it is not our primary interest to forecast how any systematic trends in the sampling of urban and rural respondents will progress in the future, we focus on checking the out-of-sample prediction of urban and rural surveys.

Figure 7 shows scatter plots comparing the out-of-sample survey values to the mean predicted values from the model. While there are some values which are not captured well (some potentially due to errors in data entry), generally the model does not seem to systematically over- or under-predict. Notably, the coverage values tend to be quite high, indicating that the model produces reliable uncertainty estimates when predicting into the future.

To guard against high coverage values through unreasonably uncertain prediction intervals, we can assess the model’s performance when forecasting by examining predictive plots for individual countries. Figure 8 shows predictive fuel usage plots for Ghana, from the model where surveys from 2013 onwards are excluded. Here, the removed surveys are generally well within the 95% predictive intervals, which grow reasonably larger for predictions further into the future, but are not so wide that they are impractical.
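The hold-out design described above amounts to splitting the surveys by year before fitting. The sketch below illustrates only that split; the record layout, field names and toy values are hypothetical and are not taken from the WHO database or the authors' code.

```python
def split_by_year(surveys, last_fit_year=2012):
    """Split survey records into a fitting set and a held-out set by year."""
    fit = [s for s in surveys if s["year"] <= last_fit_year]
    held_out = [s for s in surveys if s["year"] > last_fit_year]
    return fit, held_out

# Toy records, made up for illustration only.
surveys = [
    {"country": "A", "year": 2008, "gas": 0.20},
    {"country": "A", "year": 2012, "gas": 0.25},
    {"country": "A", "year": 2015, "gas": 0.31},
    {"country": "B", "year": 2013, "gas": 0.10},
]

fit, held_out = split_by_year(surveys)
# Share of records withheld from fitting in this toy example.
held_out_share = len(held_out) / len(surveys)
```

The model would then be fitted to `fit` only, and its predictive distributions compared against the values in `held_out`.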
[Figure: scatter plots for the removed surveys (2013-2017). Panels: Biomass, Charcoal, Coal, Kerosene, Gas, Electricity. x-axis: Median Predicted Usage; y-axis: Observed Usage.]

Fig. 7. Scatter plots of mean predicted fuel usage values from 2013 onwards, versus their observed values, from the model which was only supplied data from 2012 or earlier.

[Figure: predicted fuel usage for Ghana. Columns: Urban, Rural, Overall. Rows: Wood, Cropwaste, Dung, Charcoal, Coal, Kerosene, Gas*, Electricity.]

Fig. 8. Predicted fuel usage trends (median and 95% prediction intervals) for Ghana, from the model where surveys from 2013 onwards were excluded. The black points from 2013 onwards show excluded surveys.

4. Discussion

Currently, the health burdens associated with exposure to air pollution from the use of polluting fuels for cooking are assessed based on groupings of fuel types (i.e. solid fuels or polluting fuels). However, this fails to take into account changes in the use of specific fuel types that may affect the impacts on health. For example, the results of the analyses performed here suggest that over the last few decades a substantial proportion of urban households in Sub-Saharan Africa have switched from raw biomass fuels (i.e. wood, cropwaste and dung) to charcoal, which has very different emissions characteristics. To expand the knowledge base about the impacts of air pollution on health, burden of disease calculations should instead be based on the use of specific fuels, but until now country-specific estimates of specific fuel usage have been unavailable.

To address this, we have developed and implemented a multivariate hierarchical model for specific fuel types which aims to: (i) estimate trends and associated measures of uncertainty, for specific fuels, for every country, and separately for urban and rural areas within a coherent modelling framework; (ii) provide meaningful estimates in countries where there is limited data; (iii) forecast fuel usage up to present day and into the future.

Based on Generalized Dirichlet Multinomial distributions, the Global Household Energy Model (GHEM) automatically constrains the proportions of populations using each of eight key fuel types, ensuring that their sum does not exceed one. Set within a Bayesian modelling framework, parametric and predictive uncertainty is quantified (e.g. by 95% prediction intervals) and verified using within-sample posterior predictive checking (see Section 3.1). Where data availability is limited within a country, the model is able to ‘borrow’ information from neighbouring countries using nested country, regional and super-regional random effects, reducing predictive uncertainty. The model can forecast a number of years beyond the extent of the data, with assessment of forecasted values performed using an out-of-sample predictive experiment (see Section 3). This allows present-day fuel use to be evaluated, as data collection lags behind by 1-2 years. In addition, fuel use predictions for future years provide a baseline representation of what might be expected in the absence of intervention, to which future surveys conducted post-intervention can be compared.

In achieving these aims, the model overcomes a number of challenges associated with using these survey data: (a) inconsistency in survey design and collection, together with missing values, which can lead to highly unstable time series for some individual fuels in some countries; (b) the total number of respondents is unavailable for around half of surveys; (c) for many surveys, fuel use values are not available separately for urban and rural areas.

To address (a), we adopted a tiered approach (Section 2.2) where we first modelled combined fuel use (e.g. solid fuels), which is progressively disaggregated into the component fuels. This ensures that excess variability and uncertainty among ‘confused’ fuels does not propagate into those unaffected, and that predictions for the aggregate quantities are stable. In order to address the problem where the total number of respondents is unknown, (b), we approximate a GDM model for the number of respondents, transforming the proportions using each fuel into counts from an artificial sample size (Section 2.1). We illustrate that this results in approximately the same inference for population-wide fuel usage as modelling the (unavailable) number of respondents in their original count form, through a simulation experiment (presented in Appendix A). We addressed the unavailability of information on separate urban and rural fuel use for all surveys, (c), by including a layer in the model which links the urban, rural and overall fuel use values for each survey. Structured between-country and temporal variability in the proportion of urban respondents was then accounted for by combining UN estimates with smooth functions of time for each country. Finally, in addition to addressing these data-specific challenges, mixture distributions were employed to make the model more robust to potential outliers.

To date, the model has been adopted by the WHO to produce estimates of the proportion of people in each country who rely on polluting fuels as their primary fuel and technology for cooking and has played a central role in monitoring SDG 7.1.2 (SDG 7 Custodial Agencies, 2019). It has also played an important role in identifying data that appear to be out-of-line with general country-level patterns for further investigation. Ultimately, the proposed modelling approach provides policy-makers with decision-quality information and enables a ground-breaking reassessment of the health impacts of cooking with polluting fuels and technologies.
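The constraint noted in this section, that the fuel proportions cannot sum to more than one, is a general property of Generalized Dirichlet-type constructions, which can be built by "stick-breaking": each component takes a fraction of whatever share remains after the previous components. The sketch below shows only this arithmetic; the function name and the use of fixed relative fractions are assumptions for illustration, and it is not the GHEM itself.

```python
import numpy as np

def stick_breaking(relative):
    """Map relative fractions in [0, 1] to proportions summing to at most 1.

    relative[k] is the fraction of the share left over after
    components 0..k-1 that is assigned to component k.
    """
    relative = np.asarray(relative, dtype=float)
    # Share still unassigned before each component is allocated.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - relative)[:-1]))
    return relative * remaining

proportions = stick_breaking([0.5, 0.5, 0.5])  # 0.5, 0.25, 0.125
```

Whatever relative fractions are supplied, the resulting proportions are non-negative and their total never exceeds one, so no separate normalisation step is needed.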

Acknowledgments

This work was supported by a Natural Environment Research Council GW4+ Doctoral Training Partnership studentship [NE/L002434/1] and the World Health Organization contract APW 201790695.

Appendix A Simulation Experiment

To illustrate the validity of our approximation for modelling the proportions using each fuel type $x = y/n$, we present a simulation experiment using the 598 observed survey sample sizes $n$. The majority are in the range 1000-100000, with a mode of around 10000. At these large values, the contribution of the Multinomial variance to the total variance of $x$ would be small.

For each available $n_i$ ($i = 1, \dots, 598$), we simulate $y_i = \{y_{i,1}, y_{i,2}, y_{i,3}, y_{i,4}\}$ from a GDM model. Here, each country has a different (time constant) marginal mean vector $\mu_c$ and variance parameters $\phi_c$ (preserving the original associations between the countries and observed $n_i$ in the data, and ignoring countries with no observed $n_i$). Note that some countries will only have one $y_i$ and others will have several (each with its own unique $n_i$). We simulate all of the $\mu_c$ from a Dirichlet( ) distribution, and all of the $\phi_c$ independently from a Gamma(4, .1) distribution (inducing a moderately high degree of over-dispersion, compared to the Multinomial):

$$y_i \sim \mathrm{GDM}(\mu_c, \phi_c, n_i); \quad \mu_c \sim \mathrm{Dirichlet}(\ ); \quad \phi_c \sim \mathrm{Gamma}(4, .1). \tag{32}$$

In the baseline scenario, to which we will compare our approximate method, we have observations for all of the $n_i$ and all of the $y_i$. This allows us to implement the above model directly, which we do in a Bayesian setting using a Dirichlet( ) prior for each $\mu_c$ and a non-informative Exponential(0.001) prior for each $\phi_c$.

In the second scenario, we do not know any of the $n_i$ or the $y_i$, but we do have observations for $x_i = y_i/n_i$. In this scenario, we can apply our approximate method (from Section 2.1), where we fit the GDM to constructed counts $v_i = \lfloor N x_i \rfloor$. We proceed to apply this method whilst varying $N$ over a range of values (10, 20, 30, 50, 100, 300, 1000, 3000, 10000, 30000, 100000, 300000, 1000000), so that we can investigate the impact of this choice on parameter inference.
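The constructed counts used by the approximate method can be illustrated directly. This is a minimal sketch of the floor transformation only, with a made-up proportion vector; the function name `constructed_counts` is an assumption for illustration and no model fitting is shown.

```python
import numpy as np

def constructed_counts(x, N):
    """Constructed counts floor(N * x) from observed proportions x."""
    x = np.asarray(x, dtype=float)
    return np.floor(N * x).astype(int)

# Made-up proportions for four fuel categories (illustration only).
x_i = [0.5, 0.25, 0.125, 0.125]

# A few of the artificial sample sizes considered in the experiment.
counts_by_N = {N: constructed_counts(x_i, N) for N in (10, 1000, 10000)}
```

Because of the floor, the constructed counts can sum to slightly less than $N$ (here 9 when $N = 10$), a discrepancy that becomes negligible relative to $N$ as $N$ grows.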

[Figure: two panels for the marginal mean proportions, plotted against N. Top panel y-axis: Mean Squared Error; bottom panel y-axis: Var[μ | x]. Legend: Approximation, Baseline.]

Fig. 9. The top panel shows the median, interquartile range (dark) and 95% interval (light) of the mean squared differences between the posterior samples of the marginal mean proportions $\mu_{1,c}, \dots, \mu_{4,c}$ and their corresponding true values, from the approximate model with varying $N$. Similarly, the bottom plot shows the median, interquartile range and 95% interval of the posterior standard deviations of $\mu_{1,c}, \dots, \mu_{4,c}$. The dashed lines represent these results from the baseline model.

Recall that in our application we are primarily interested in correct inference for the marginal mean proportions $\mu_c$ (the population-wide fuel use in each country), and we claimed that a sufficiently large choice of $N$ yields a parameter inference approximately the same as if we had modelled the $y_i$ directly, along with the sample sizes $n_i$. To assess this, we begin by examining the models’ accuracy when predicting the true marginal mean proportions $\mu_c$. For each posterior sample, we can compute the mean squared error between the predicted values of $\mu_c$ and the true values. The top panel of Figure 9 shows the median of this statistic, for varying $N$, as well as the inter-quartile range (dark), and 95% prediction interval (light). Compared to the same statistics for the baseline model, shown as horizontal lines, we can see that the distribution of mean squared errors for the approximate method does indeed converge to the baseline model as $N$ increases, from about $N = 10000$ onwards.

We can also examine how the approximate method quantifies uncertainty in $\mu_c$. For each individual $\mu_{1,c}, \dots, \mu_{4,c}$, we compute the standard deviation of the posterior samples. The median of these posterior standard deviations is then shown for each $N$ in the bottom panel of Figure 9, once again alongside the inter-quartile range and 95% interval. The distribution of posterior standard deviations for the approximate method also converges to the baseline model, but does so for a much lower $N$ (between 100 and 1000) than the mean squared error.

Finally, if we choose a single value of $N$, we can compare more closely the approximate method to the baseline model when estimating $\mu_c$. Figure 10 compares the 2.5%, 50%, and 97.5% posterior quantiles for the $\mu_{1,c}, \dots, \mu_{4,c}$ from the approximate model with $N = 10000$, to the quantiles from the baseline model. The quantiles are virtually identical, suggesting that for this simulated data the same inference for $\mu_c$ would be achieved either by modelling the true counts $y_i$ directly or by modelling the constructed counts $v_i = \lfloor 10000 \ast x_i \rfloor$.

[Figure: scatter plots for the marginal mean proportions. x-axis: Baseline Model; y-axis: Approximate Model (N = 10000).]

Fig. 10. Scatter plots comparing the 2.5%, 50%, and 97.5% posterior quantiles, and posterior standard deviations, for the marginal mean proportions $\mu_{1,c}, \dots, \mu_{4,c}$, from the approximate model with $N = 10000$, to the baseline model.

Appendix B Convergence of MCMC Chains

One way to assess the convergence of MCMC chains is to compute the Potential Scale Reduction Factor (PSRF) for a number of key parameters. This compares the variance between the MCMC chains to the variance within the chains (Brooks and Gelman, 1998). A PSRF of 1 is obtained when the two variances are the same, so starting the chains from different initial values and obtaining a PSRF close to 1 (typically taken to be less than 1.05) gives a good indication that the chains have converged to the parameter’s posterior distribution.
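For a single scalar parameter, the between-chain versus within-chain comparison can be sketched as follows. This is the basic form of the statistic, written under the assumption that the draws are stored as an array with one row per chain; refinements such as a degrees-of-freedom correction are omitted, and the function is an illustration rather than the code used for the results reported here.

```python
import numpy as np

def psrf(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: array of shape (m_chains, n_draws), after discarding burn-in.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    # W: average of the within-chain sample variances.
    W = chains.var(axis=1, ddof=1).mean()
    # B: n times the sample variance of the chain means.
    B = n * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 2000))                        # same target
stuck = mixed + np.array([0.0, 0.0, 5.0, 5.0])[:, None]   # shifted chains
psrf_mixed, psrf_stuck = psrf(mixed), psrf(stuck)
```

Chains sampling the same distribution give a value very close to 1, while chains centred on different values give a value well above the 1.05 threshold mentioned above.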

[Figure: two frequency histograms. Panels: Relative Means, Variance Parameters. x-axis: P.S.R.F.; y-axis: Frequency.]

Fig. 11. Histograms of the Potential Scale Reduction Factor (PSRF) for the relative means $\nu_{i,j,c,t}$ and variance parameters $\phi_{i,j,c}$.

We computed the PSRF for the (26016) relative means $\nu_{i,j,c,t}$ corresponding to the survey observations and the (3576) variance parameters $\phi_{i,j,c}$. Figure 11 presents these respectively in frequency histograms. For both sets of parameters, the overwhelming majority of the values lie in the closest bin to 1, suggesting that the model has converged.

Appendix C Further Model Checking

As discussed in Section 3.1, it is important to verify that the model is able to reproduce the observed data well. We do this by comparing replicates (predictions) of the observed data to the actual observations. In Section 3.1 we checked the replicates of solid fuel use and here we check the remaining fuels.

[Figure: scatter plots. Panels: Kerosene, Gas, Electricity. x-axis: Median Predicted Usage; y-axis: Observed Usage.]

Fig. 12. Scatter plots comparing the posterior means of kerosene, gas and electricity use replicates to their corresponding observed values.

Figure 12 shows scatter plots comparing the mean predicted replicates for the three other main top-tier fuel types, kerosene, gas and electricity, to their corresponding observed values. Similarly, Figure 13 shows the same plots for the three mid-tier fuel types, biomass, charcoal and coal, and Figure 14 shows the three lower-tier fuel types, wood, cropwaste and dung. In general the points are scattered fairly evenly about the diagonal line, indicating a good model fit for the different fuels. Notably, however, the fit of the model is more precise for fuel types in the upper tiers (e.g. electricity) than for those in the lower tier (e.g. dung). This makes sense, as the upper-tier fuels are less likely to be affected by the issues described in Sections 2 and 2.3, such as the combination of certain fuel types, where some of the observed values are likely to be erroneous and difficult for the model to capture well. Regardless, the coverage of the 95% intervals is very high for all fuels.
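The within-sample check described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: it assumes that posterior predictive replicates are available as a list of draws per observation, and the names `quantile`, `summarise_fit`, `replicates` and `observed` are hypothetical. The data are synthetic stand-ins, not values from the WHO Household Energy Database. The sketch computes the replicate mean for each observation (the quantity plotted against the observed value) and the share of observations that fall inside the central 95% interval of their replicates.

```python
import random
import statistics


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def summarise_fit(replicates, observed, level=0.95):
    """Return (replicate means, coverage of the central `level` intervals).

    replicates[i] holds the posterior predictive draws for observation i;
    observed[i] is the corresponding observed value.
    """
    tail = (1 - level) / 2
    means, hits = [], 0
    for draws, y in zip(replicates, observed):
        ordered = sorted(draws)
        means.append(statistics.fmean(ordered))
        lo, hi = quantile(ordered, tail), quantile(ordered, 1 - tail)
        hits += lo <= y <= hi  # True counts as 1, False as 0
    return means, hits / len(observed)


def clip(p):
    """Keep a simulated usage proportion inside [0, 1]."""
    return min(1.0, max(0.0, p))


# Synthetic stand-in data (assumption): 200 surveys, 500 replicates each,
# with observations and replicates drawn from the same noise model.
rng = random.Random(1)
true_use = [rng.uniform(0.05, 0.95) for _ in range(200)]
observed = [clip(p + rng.gauss(0, 0.03)) for p in true_use]
replicates = [[clip(p + rng.gauss(0, 0.03)) for _ in range(500)] for p in true_use]

means, coverage = summarise_fit(replicates, observed)
print(f"coverage of 95% intervals: {coverage:.2f}")
```

Plotting `means` against `observed` would give a scatter plot of the kind shown in Figures 12 to 14. Because the synthetic observations and replicates share one distribution here, coverage close to the nominal level is expected by construction; with a real model fit, coverage well below the nominal level would point to intervals that are too narrow.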

Appendix D Survey Selection

The model was applied to a selection of the WHO Household Energy Database. Surveys were excluded from the analyses if they:

• only reported the usage of ‘solid fuels’ as a group, rather than the usage of at least one individual fuel type.
• included an excessively high proportion ( >
• were flagged in the database as unsuitable for modelling.

Surveys which were not included for modelling are shown as black points in the plots of predicted fuel use provided as supplementary material.

Fig. 13. Scatter plots comparing the posterior means of biomass, charcoal and coal use replicates to their corresponding observed values. Panels: Biomass, Charcoal, Coal. Axes: Median Predicted Usage and Observed Usage.

Fig. 14. Scatter plots comparing the posterior means of wood, cropwaste and dung use replicates to their corresponding observed values. Panels: Wood, Cropwaste, Dung. Axes: Median Predicted Usage and Observed Usage.



Submitted on 9 Jan 2019 (v1), last revised 22 Nov 2019 (this version, v2)
