Who is more ready to get back in shape?
Rajius Idzalika
Pulse Lab Jakarta [email protected]
Abstract
This empirical study estimates resilience (adaptive capacity) around the period of the 2013 heavy flood in Cambodia. We use nearly 1.2 million microfinance institution (MFI) customer records and apply unsupervised learning methods. Our results highlight the opportunity to develop resilience through a better understanding of which areas are likely to be more or less resilient, based on the characteristics of the MFI customers, and of the individual choices or situations that support stronger adaptiveness. We also discuss the limitations of this approach.
The progress of sustainable growth in many developing and low-income countries is constantly challenged by stresses and shocks, including extreme climate events and the recent COVID-19 pandemic. The capacity to adapt to those hardships is crucial to avoid disrupting development achievements and future development trajectories. Resilience, as the positive result of such successful adaptation [5][3], can help overcome the negative consequences of tragic events. The existing literature on this topic is fairly limited, however, and data are typically collected through surveys. Where the environment for quality data collection is weak, finding and reusing alternative data sources that were not collected to study resilience, but whose features are fairly relevant to it, could be an interesting option. This paper introduces the possibility of using financial data and unsupervised learning to study the dynamics of adaptive capacity in Cambodia, one of the most vulnerable low-income countries in the world. This work covers one piece of resilience. A more comprehensive study would include other aspects such as access to basic services, social safety nets, and coping strategies. We hope, nevertheless, that our work inspires more resilience studies in countries with few resources, using existing non-conventional data sets coupled with machine learning.
Agriculture is a dominant sector in Cambodia: it contributes 35 percent of total GDP and employs the majority of the population [2]. Due to its geographical location, the country is highly prone to natural disasters that heavily affect the agriculture sector. A major flood in 2011 hit 18 of 24 provinces, affecting around 52,000 households (13 percent of the population). The next flood arrived in 2013 and affected six times as many households as the previous one, followed by El Niño events in 2015-2016. The Post Flood Early Recovery Assessment (PFERNA) recorded that the most common coping mechanisms are reducing expenses, relying on assistance from NGOs and the government, or taking loans from MFIs [6].

One of the fundamental pillars of the resilience framework is adaptive capacity [4]. [1] explicitly applied the adaptive capacity model in the Philippines through a composite index that combines survey data with experts' opinion, which determines the weights of the indicators. [7] studied adaptive capacity with a gender lens in Vietnam. To the best of our knowledge, no quantitative study on resilience or adaptive capacity in Cambodia has taken place to date.

Table 1: Adaptive capacity indicators

Indicators            Weights  Proxy of sub-indicators
Human resources       0.072    Age, schooled
Physical resources    0.231    The loan is for agriculture
Financial resources   0.339    Number of loan accounts, proportion of local currency, interest rate, loan term, loan balance
Information           0.236    Urban * Female
Livelihood diversity  0.122    The number of income sources is more than two

Figure 1: The weights of adaptive capacity variables, 2012

The data for this study is generated from a business process of an anonymous MFI in Cambodia, accessed through a partnership with the UN Capital Development Fund (UNCDF). It is longitudinal data with annual frequency between 2012 and 2015.
The raw data is at the account level and stores financial features such as annual loan balance, interest rate, loan term, and the choice of currency, namely the Cambodian Riel (KHR) or a foreign currency, Thai Baht (THB) or US Dollar (USD). The corresponding customer information is stored together at the account level and reflects the customer's background at the time they first applied for a loan account. As a result, time-variant variables might reflect their true values only for the first year of application.

Extensive data pre-processing was required because there are several different data entry styles, even within a single year. Age was specifically corrected according to the year of birth in order to maintain its time-variant nature. The last pre-processing step aggregates the data to the customer level for each available year by taking the average. The final number of observations used for the analysis is close to 1.2 million.
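The aggregation step described above can be illustrated with a minimal sketch. The field names (`customer_id`, `year`, `birth_year`, `loan_balance`, `interest_rate`) and the sample records are hypothetical and are not taken from the MFI data. The sketch only shows the general idea of averaging account-level features per customer and year and recomputing age from the year of birth.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical account-level records (field names and values are illustrative).
accounts = [
    {"customer_id": "C1", "year": 2012, "birth_year": 1980, "loan_balance": 500.0, "interest_rate": 0.30},
    {"customer_id": "C1", "year": 2012, "birth_year": 1980, "loan_balance": 300.0, "interest_rate": 0.20},
    {"customer_id": "C1", "year": 2013, "birth_year": 1980, "loan_balance": 400.0, "interest_rate": 0.25},
    {"customer_id": "C2", "year": 2012, "birth_year": 1975, "loan_balance": 1000.0, "interest_rate": 0.18},
]


def aggregate_to_customer_year(rows, numeric_fields):
    """Average account-level features for each (customer, year) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["customer_id"], row["year"])].append(row)

    result = {}
    for (customer_id, year), members in groups.items():
        summary = {field: mean(r[field] for r in members) for field in numeric_fields}
        # Number of loan accounts held by the customer in that year.
        summary["n_accounts"] = len(members)
        # Recompute age from the year of birth so that it varies over time.
        summary["age"] = year - members[0]["birth_year"]
        result[(customer_id, year)] = summary
    return result


customer_year = aggregate_to_customer_year(accounts, ["loan_balance", "interest_rate"])
for key in sorted(customer_year):
    print(key, customer_year[key])
```

Keying the result by (customer, year) keeps one observation per customer per year, which matches the unit of analysis described in the text.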
To use the same indicators as [1], together with their associated weights, we make two assumptions. The first is that this particular MFI in Cambodia targets rural societies predominantly composed of agricultural communities, which is also the population observed by [1]. The second is that those weights, and likewise the agricultural livelihood, are transferable across neighboring countries with similar characteristics. We further defined the proxy sub-indicators according to the matched features and, where available, the previous literature, such as the interaction between urban location and female networks for gathering information [7]. The precise proxy sub-indicators used are listed in Table 1.

Proxy indicators with only one component, such as physical resources, information, and livelihood diversity, automatically receive the experts' weight from [1]. Proxy indicators with more than one component, such as human resources and financial resources, go through one further step to obtain the weight of each component, as described in the next paragraph.

(a) Adaptive capacity in Cambodia (b) Clustering adaptive capacity
Figure 2: Exploratory analysis

Principal Component Analysis (PCA) was run separately for the variables of each of these two indicators, for each year, to produce standardized principal component scores. The principal components (PCs) become the new set of proxy sub-indicators, with associated weights given by the proportion of variance explained. The final weight of each new proxy sub-indicator is the product of the experts' weight and the proportion of variance explained. Thus, the weights within each of the two indicators, which sum to one as proportions of variance explained, are rescaled to sum to the corresponding experts' weight. Each variable now has a weight based on its status, either as an indicator or as a new proxy sub-indicator. Figure 1 illustrates the variables and their associated weights in 2012, where bold font indicates inclusion in the calculation of the adaptive capacity index. The indicator weights are the same every year, whereas the PC weights vary over time. The individual adaptive capacity index is simply the sum of the weighted variables for each observation in a given year. The index was then aggregated at the district level for exploratory and cluster analyses.

Model-based clustering was performed at the individual and district levels to capture patterns before, during, and after the 2013 flood, using the mclust package in R [8], which employs Gaussian finite mixture models. The best model is determined simultaneously with the best number of clusters using the Bayesian Information Criterion (BIC). Further, key determinants of the clusters were selected through a combination of visual examination and a decision tree. Comparing the substantial determinants of the lowest and highest adaptability clusters might indicate whether any of those features can be improved by public policy interventions to develop resilience capacity.
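As a worked illustration of the weighting scheme described earlier in this section, the sketch below multiplies the experts' weights from Table 1 by proportions of variance explained and sums the weighted variables into an index. The proportions of variance explained, the variable names, and the example observation are hypothetical placeholders, not results from the study. The sketch also assumes that all PCs are retained, which the text does not state explicitly.

```python
import math

# Experts' weights, as listed in Table 1.
EXPERT_WEIGHTS = {
    "human": 0.072,
    "physical": 0.231,
    "financial": 0.339,
    "information": 0.236,
    "livelihood": 0.122,
}

# Hypothetical proportions of variance explained by each PC (assumed values).
EXPLAINED_VARIANCE = {
    "human": [0.6, 0.4],                         # two variables: age, schooled
    "financial": [0.40, 0.25, 0.15, 0.12, 0.08],  # five financial variables
}


def build_weights(expert_weights, explained_variance):
    """Split multi-component indicators across their PCs; keep the rest as is."""
    weights = {}
    for indicator, expert_weight in expert_weights.items():
        if indicator in explained_variance:
            for i, share in enumerate(explained_variance[indicator], start=1):
                weights[f"{indicator}_pc{i}"] = expert_weight * share
        else:
            weights[indicator] = expert_weight
    return weights


def adaptive_capacity_index(values, weights):
    """Weighted sum of the variables for one observation in one year."""
    return sum(weights[name] * values[name] for name in weights)


weights = build_weights(EXPERT_WEIGHTS, EXPLAINED_VARIANCE)

# Hypothetical observation: standardized PC scores plus single-component indicators.
observation = {
    "human_pc1": 0.3, "human_pc2": -0.1,
    "physical": 1.0,      # the loan is for agriculture
    "financial_pc1": 0.8, "financial_pc2": -0.2, "financial_pc3": 0.1,
    "financial_pc4": 0.0, "financial_pc5": 0.4,
    "information": 0.0,   # urban * female
    "livelihood": 1.0,    # more than two income sources
}

print(weights)
print(adaptive_capacity_index(observation, weights))
```

Under these assumptions the PC weights within an indicator sum to that indicator's expert weight, so the full set of weights still sums to one.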
The exploratory analysis in Figure 2 shows how the adaptive capacity scores are further used to understand the dynamics of resilience in Cambodia. After the index is aggregated at the district level and pooled across all years, Figure 2a shows the normalized index, where a darker shade represents a higher score. Some districts show contrasting shades between non-disaster and disaster years, such as some areas along the northwest border and on the coast of the Gulf of Thailand (west border). They show a better capacity for adaptation during the disaster period.

Although this snapshot is imperfect because of selection bias, and because not all adaptive capacity necessarily translates into adaptation, it offers some potential insights for disaster relief management and longer-term development projects. The clustering-based classification in Figure 2b additionally shows that the number of clusters at the individual level peaks in 2013. This may indicate immediate divergence or inequality within communities after the flood.

Figure 3 contrasts the vital characteristics of the clusters with the highest and the lowest adaptive capacity over time to reveal the dynamics of the determinants of adaptation ability. Insights from these charts potentially help explain the different coping strategies used to respond to disasters, which could be useful to inform interventions. For instance, the most adaptable cluster shows a strong preference for borrowing money for agricultural purposes and a general tendency to take loans with lower interest rates, while the least adaptable cluster shows the opposite. Policy responses could capitalize on this information by investigating further how those choices or situations are linked to improving resilience capacity. Another potential line of examination is the individual considerations or structural problems behind those two aspects.

Figure 3: The heatmap of key determinants on adaptability
The adaptive capacity index was validated against two external data sources, the Demographic and Health Survey (DHS) 2014 and Finscope 2015, via their matched variables. The results consistently show that customers of this particular MFI are narrowly segmented compared to the general population. The observations are skewed toward customers who have formal education, are female, and are middle-aged. Results from Finscope offer the additional insight that the amount they borrowed is relatively smaller than that of the average population.
First, the unsupervised model used in this study caps the number of clusters at nine. The divergence in Figure 2b might be more extreme without this limit. Second, the coverage is segmented, so the results do not hold for the whole population of Cambodia. Third, the data comes from the financial sector, which is known for its strict confidentiality, so our approach might be hard to scale up or reproduce in other settings. We hope that, in the future, advanced techniques or strategies can strike a good balance between data privacy and data as a public good or infrastructure, so that more data from the private sector can be used responsibly for greater social impact.
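The first limitation concerns the range of cluster counts searched during BIC-based model selection. The sketch below illustrates that kind of search on toy one-dimensional data. It is not the mclust procedure used in the study: it fits a plain 1-D Gaussian mixture by EM with unequal variances, whereas mclust also compares several covariance structures on multivariate data. The toy scores, the cap of three components, and the variance floor are all assumptions made for illustration. BIC is written as 2 log L minus the penalty, so larger is better, which is the sign convention mclust reports.

```python
import math


def em_loglik_1d(xs, k, n_iter=100, var_floor=1e-2):
    """Run EM for a k-component 1-D Gaussian mixture.

    Returns the log-likelihood evaluated at the last E-step.
    """
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: means at evenly spaced quantiles, pooled variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    pooled = max(sum((x - centre) ** 2 for x in xs) / n, var_floor)
    variances = [pooled] * k
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point.
        resp, loglik = [], 0.0
        for x in xs:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: update mixing weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)
    return loglik


def bic(loglik, k, n):
    """BIC with 'larger is better' sign; 3k - 1 free parameters in 1-D."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 mixing weights
    return 2.0 * loglik - n_params * math.log(n)


# Toy index scores: two well-separated groups (illustrative values only).
offsets = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
scores = [centre + o for centre in (0.0, 10.0) for o in offsets]

MAX_CLUSTERS = 3  # the study caps this at nine; a smaller cap suits the toy data
bics = {k: bic(em_loglik_1d(scores, k), k, len(scores))
        for k in range(1, MAX_CLUSTERS + 1)}
best_k = max(bics, key=bics.get)
print(bics, best_k)
```

Raising `MAX_CLUSTERS` widens the search. If the selected number of components sits at the cap, that suggests the cap is binding, which is the situation the first limitation describes.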
Vulnerability to natural disasters in many low-income countries exacerbates their already devastating situation. Given the lack of sufficient public data infrastructure and typical budget constraints, suitable alternative data sources from the private sector are a potential quick solution to inform resilience planning and design through machine learning. In this paper, we present an exploratory study of adaptive capacity in Cambodia that applies unsupervised learning methods to MFI data. Such a study would help identify the baseline level of the ability to adapt and indicate its essential characteristics. These insights could inform a broader study of resilience or a more responsive public policy in humanitarian spaces.

Acknowledgements
Dr. Jong Gun Lee is acknowledged for pitching the initial research idea. Sriganesh Lokanathan is acknowledged for editorial support. George Hodge is acknowledged for peer support. UNCDF, in particular Robin Gravesteijn Ph.D, is acknowledged for the continuous support on the data partnership. Demographic and Health Survey is acknowledged for the data access. Finally, the support of the Government of Australia in the funding aspect of this work is gratefully acknowledged.
References
[1] G. Defiesta, C. Rapera, et al. Measuring adaptive capacity of farmers to climate change and variability: Application of a composite index to an agricultural community in the Philippines. Journal of Environmental Science and Management, 17(2), 2014.
[2] FAO. Cambodia at a glance, 2020. URL .
[3] J. B. Houston. Bouncing forward: Assessing advances in community resilience assessment, intervention, and theory to guide future work, 2015.
[4] R. II. Analysing resilience for better targeting and action.
[5] B. Manyena, G. O'Brien, P. O'Keefe, and J. Rose. Disaster resilience: a bounce back or bounce forward ability? Local Environment: The International Journal of Justice and Sustainability, 16(5):417–424, 2011.
[6] K. of Cambodia. Post-Flood Early Recovery Need Assessment Report, 2014.
[7] L. T. Phan, S. C. Jou, and J.-H. Lin. Gender inequality and adaptive capacity: The role of social capital on the impacts of climate change in Vietnam. Sustainability, 11(5):1257, 2019.
[8] L. Scrucca, M. Fop, T. B. Murphy, and A. E. Raftery. mclust 5: clustering, classification and density estimation using Gaussian finite mixture models.