What Drives the International Development Agenda? An NLP Analysis of the United Nations General Debate 1970-2016
Alexander Baturo
School of Law and Government, Dublin City University. Email: [email protected]
Niheer Dasandi
School of Government and Society, University of Birmingham. Email: [email protected]
Slava J. Mikhaylov
Institute for Analytics and Data Science, Department of Government, University of Essex. Email: [email protected]
Abstract: There is surprisingly little known about agenda setting for international development in the United Nations (UN) despite it having a significant influence on the process and outcomes of development efforts. This paper addresses this shortcoming using a novel approach that applies natural language processing techniques to countries’ annual statements in the UN General Debate. Every year UN member states deliver statements during the General Debate on their governments’ perspective on major issues in world politics. These speeches provide invaluable information on state preferences on a wide range of issues, including international development, but have largely been overlooked in the study of global politics. This paper identifies the main international development topics that states raise in these speeches between 1970 and 2016, and examines the country-specific drivers of international development rhetoric.
I. INTRODUCTION
Decisions made in international organisations are fundamental to international development efforts and initiatives. It is in these global governance arenas that the rules of the global economic system, which have a huge impact on development outcomes, are agreed on; decisions are made about large-scale funding for development issues, such as health and infrastructure; and key development goals and targets are agreed on, as can be seen with the Millennium Development Goals (MDGs). More generally, international organisations have a profound influence on the ideas that shape international development efforts [1].

Yet surprisingly little is known about the agenda-setting process for international development in global governance institutions. This is perhaps best demonstrated by the lack of information on how the different goals and targets of the MDGs were decided, which led to much criticism and concern about the global governance of development [2]. More generally, we know little about the types of development issues that different countries prioritise, or whether country-specific factors such as wealth or democracy make countries more likely to push for specific development issues to be put on the global political agenda.

The lack of knowledge about the agenda-setting process in the global governance of development is in large part due to the absence of obvious data sources on states’ preferences about international development issues. To address this gap we employ a novel approach based on the application of natural language processing (NLP) to countries’ speeches in the UN.
Every September, the heads of state and other high-level country representatives gather in New York at the start of a new session of the United Nations General Assembly (UNGA) and address the Assembly in the General Debate. The General Debate (GD) provides the governments of the almost two hundred UN member states with an opportunity to present their views on key issues in international politics, including international development. As such, the statements made during the GD are an invaluable, and largely untapped, source of information on governments’ policy preferences on international development over time.

An important feature of these annual country statements is that they are not institutionally connected to decision-making in the UN. This means that governments face few external constraints when delivering these speeches, enabling them to raise the issues that they consider the most important. Therefore, the General Debate acts “as a barometer of international opinion on important issues, even those not on the agenda for that particular session” [3]. In fact, the GD is usually the first item for each new session of the UNGA, and as such it provides a forum for governments to identify like-minded members, and to put on the record the issues they feel the UNGA should address. Therefore, the GD can be viewed as a key forum for governments to put different policy issues on the international agenda.

We use a new dataset of GD statements from 1970 to 2016, the UN General Debate Corpus (UNGDC), to examine the international development agenda in the UN [4]. Our application of NLP to these statements focuses in particular on structural topic models (STMs) [5].
The paper makes two contributions using this approach: (1) it sheds light on the main international development issues that governments prioritise in the UN; and (2) it identifies the key country-specific factors associated with governments discussing development issues in their GD statements.

Footnote: The UNGDC is publicly available at the Harvard Dataverse at http://dx.doi.org/10.7910/DVN/0TJX8Y

[Figure 1: scatter plot titled "Selecting optimal number of topics". x-axis: semantic coherence; y-axis: exclusivity.]
Fig. 1. Optimal model search. Semantic coherence and exclusivity results for a model search from 3 to 50 topics. Models above the regression line provide a better trade-off. The largest positive residual is a 16-topic model.

THE UN GENERAL DEBATE AND INTERNATIONAL DEVELOPMENT
In the analysis we consider the nature of international development issues raised in the UN General Debates, and the effect of structural covariates on the level of developmental rhetoric in the GD statements. To do this, we first implement a structural topic model [5]. This enables us to identify the key international development topics discussed in the GD. We model topic prevalence in the context of the structural covariates. In addition, we control for region fixed effects and a time trend. The aim is to allow the observed metadata to affect the frequency with which a topic is discussed in General Debate speeches. This allows us to test the degree of association between covariates (and region/time effects) and the average proportion of a document discussing a topic.
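To make the role of prevalence covariates concrete, the following is a minimal sketch of how document-level metadata can shift expected topic proportions through a logistic-normal prior. It is an illustration under stated assumptions, not the estimation routine used in the paper: the function name, the isotropic noise scale `sigma`, and the toy inputs are hypothetical, and STM itself estimates a full covariance matrix rather than a single scale.

```python
import numpy as np


def simulate_topic_proportions(X, gamma, sigma=0.5, seed=0):
    """Draw document-topic proportions whose prior mean depends on covariates.

    X:     (D, P) document-level covariates (e.g. wealth, region dummies, year).
    gamma: (P, K-1) coefficients mapping covariates to topic prevalence.
    sigma: illustrative isotropic noise scale (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    mu = X @ gamma  # covariate-dependent prior mean, shape (D, K-1)
    eta = mu + sigma * rng.standard_normal(mu.shape)
    # Fix the last topic's score at 0, a common identification convention.
    eta = np.hstack([eta, np.zeros((eta.shape[0], 1))])
    # Softmax, shifted by the row maximum for numerical stability.
    eta = eta - eta.max(axis=1, keepdims=True)
    theta = np.exp(eta)
    return theta / theta.sum(axis=1, keepdims=True)
```

Each row of the returned array is one document's topic proportions; changing a column of `X` changes the prior mean of those proportions, which is the sense in which metadata "affects the frequency with which a topic is discussed".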
A. Estimation of topic models
We assess the optimal number of topics that need to be specified for the STM analysis. We follow the recommendations of the original STM paper and focus on exclusivity and semantic coherence measures. Mimno et al. [6] propose a semantic coherence measure, which is closely related to the point-wise mutual information measure posited by Newman et al. [7] to evaluate topic quality. They show that semantic coherence corresponds to expert judgments and more general human judgments in Amazon’s Mechanical Turk experiments [6].

Exclusivity scores for each topic follow [8]. Highly frequent words in a given topic that do not appear very often in other topics are viewed as making that topic exclusive. Cohesive and exclusive topics are more semantically useful. Following [9] we generate a set of candidate models ranging between 3 and 50 topics. We then plot the exclusivity and semantic coherence (numbers closer to 0 indicate higher coherence), with a linear regression overlaid (Figure 1). Models above the regression line have a “better” exclusivity-semantic coherence trade-off. We select the 16-topic model, which has the largest positive residual in the regression fit, and provides higher exclusivity at the same level of semantic coherence. Topic quality is usually evaluated by the highest probability words, which are presented in Figure 2.

[Figure 2: chart titled "Top Topics". x-axis: expected topic proportions. Topics and their highest probability words, in the order shown in the figure:]
Topic 14: intern, state, law, nation, right, territori, govern, peac, conflict, polit
Topic 3: nation, intern, develop, nuclear, unit, countri, world, peac, will, cooper
Topic 11: countri, state, govern, peopl, nation, america, intern, unit, american, peac
Topic 4: nation, develop, peac, unit, intern, countri, world, will, econom, new
Topic 16: unit, nation, peac, must, will, oper, countri, state, intern, world
Topic 10: intern, peac, secur, arab, state, palestinian, nation, unit, israel, resolut
Topic 5: peopl, countri, africa, south, state, intern, unit, peac, nation, independ
Topic 6: african, countri, peac, africa, nation, govern, unit, intern, secur, republ
Topic 15: nation, countri, world, will, organ, intern, state, unit, develop, problem
Topic 2: develop, countri, econom, world, intern, trade, economi, nation, must, need
Topic 7: develop, nation, countri, unit, global, will, chang, climat, sustain, intern
Topic 12: nation, unit, will, develop, intern, govern, peac, general, new, assembl
Topic 9: world, peopl, will, nation, can, one, war, year, unit, today
Topic 1: nation, unit, secur, intern, organ, state, council, will, general, peac
Topic 13: nation, unit, must, secur, human, intern, will, terror, right, global
Topic 8: intern, countri, develop, peac, will, communiti, econom, peopl, effort, nation
Fig. 2. Topic quality. 20 highest probability words for the 16-topic model.
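The selection rule described above (regress exclusivity on semantic coherence across candidate models and keep the model with the largest positive residual) can be sketched as follows. This is a minimal illustration, assuming the per-model average coherence and exclusivity scores have already been computed; the function name and the toy numbers in the usage note are hypothetical and are not the paper's values.

```python
import numpy as np


def select_by_residual(n_topics, coherence, exclusivity):
    """Return the candidate topic count lying furthest above the regression line.

    n_topics:    candidate numbers of topics, one per fitted model.
    coherence:   mean semantic coherence of each model (closer to 0 is better).
    exclusivity: mean exclusivity of each model.
    """
    x = np.asarray(coherence, dtype=float)
    y = np.asarray(exclusivity, dtype=float)
    # Ordinary least squares line: exclusivity ~ slope * coherence + intercept.
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (slope * x + intercept)
    best = int(np.argmax(residuals))  # largest positive residual
    return n_topics[best], residuals
```

For example, `select_by_residual([3, 4, 5, 6], [-10, -20, -30, -40], [8.0, 8.5, 9.6, 9.5])` picks the 5-topic candidate, because that model offers more exclusivity than the fitted line predicts at its level of coherence.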
B. Topics in the UN General Debate
Figure 2 provides a list of the main topics (and the highest probability words associated with these topics) that emerge from the STM of UN General Debate statements. In addition to the highest probability words, we use several other measures of key words (not presented here) to interpret the dimensions. These include the FREX metric (which combines exclusivity and word frequency), the lift (which gives weight to words that appear less frequently in other topics), and the score (which divides the log frequency of the word in the topic by the log frequency of the word in other topics). We provide a brief description of each of the 16 topics here.
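As a rough illustration of two of these key-word measures, the sketch below computes a FREX-style score and a lift score from a topic-word probability matrix. It is a simplified version under stated assumptions: FREX is taken as a weighted harmonic mean of within-topic rank scores for frequency and exclusivity, and lift as the topic-word probability divided by the word's overall corpus share. The function names, the weight `w`, and the inputs are hypothetical, and the exact formulas in the software used for the paper may differ in detail.

```python
import numpy as np


def _rank_score(row):
    """Map each value to rank / length, so the largest value scores 1.0."""
    ranks = row.argsort().argsort() + 1
    return ranks / row.size


def frex(beta, w=0.5):
    """FREX-style score for a (K, V) topic-word probability matrix."""
    beta = np.asarray(beta, dtype=float)
    # Exclusivity: share of a word's total probability held by each topic.
    excl = beta / beta.sum(axis=0, keepdims=True)
    freq_rank = np.apply_along_axis(_rank_score, 1, beta)
    excl_rank = np.apply_along_axis(_rank_score, 1, excl)
    # Weighted harmonic mean of the two rank scores.
    return 1.0 / (w / excl_rank + (1.0 - w) / freq_rank)


def lift(beta, word_counts):
    """Topic-word probability divided by the word's overall corpus share."""
    beta = np.asarray(beta, dtype=float)
    counts = np.asarray(word_counts, dtype=float)
    return beta / (counts / counts.sum())
```

Sorting each row of either output in descending order gives an alternative ranking of words per topic, which can be read alongside the highest probability words when labelling topics.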
Topic 1 - Security and cooperation in Europe. The first topic is related to issues of security and cooperation, with a focus on Central and Eastern Europe.
Topic 2 - Economic development and the global system. This topic is related to economic development, particularly around the global economic system. The focus on ‘trade’, ‘growth’, ‘econom-’, ‘product’, ‘financ-’, etc. suggests that Topic 2 represents a more traditional view of international development in that the emphasis is specifically on economic processes and relations.
Topic 3 - Nuclear disarmament. This topic picks up the issue of nuclear weapons, which has been a major issue in the UN since its founding.
Topic 4 - Post-conflict development. This topic relates to post-conflict development. The countries that feature in the key words (e.g. Rwanda, Liberia, Bosnia) have experienced devastating civil wars, and the emphasis on words such as ‘develop’, ‘peace’, ‘hope’, and ‘democrac-’ suggests that this topic relates to how these countries recover and move forward.
Topic 5 - African independence / decolonisation. This topic picks up the issue of African decolonisation and independence. It includes the issue of apartheid in South Africa, as well as racism and imperialism more broadly.
Topic 6 - Africa. While the previous topic focused explicitly on issues of African independence and decolonisation, this topic more generally picks up issues linked to Africa, including peace, governance, security, and development.
Topic 7 - Sustainable development. This topic centres on sustainable development, picking up various issues linked to development and climate change. In contrast to Topic 2, this topic includes some of the newer issues that have emerged in the international development agenda, such as sustainability, gender, education, work and the MDGs.
Topic 8 - Functional topic. This topic appears to be comprised of functional or process-oriented words, e.g. ‘problem’, ‘solution’, ‘effort’, ‘general’, etc.
Topic 9 - War. This topic directly relates to issues of war. The key words appear to be linked to discussions around ongoing wars.
Topic 10 - Conflict in the Middle East. This topic clearly picks up issues related to the Middle East, particularly peace and conflict in the region.
Topic 11 - Latin America. This is another topic with a regional focus, picking up on issues related to Latin America.
Topic 12 - Commonwealth. This is another of the less obvious topics to emerge from the STM in that the key words cover a wide range of issues. However, the places listed (e.g. Australia, Sri Lanka, Papua New Guinea) suggest the topic is related to the Commonwealth (or former British colonies).
Topic 13 - International security. This topic broadly captures international security issues (e.g. terrorism, conflict, peace) and in particular the international response to security threats, such as the deployment of peacekeepers.
Topic 14 - International law. This topic picks up issues related to international law, particularly connected to territorial disputes.
Topic 15 - Decolonisation. This topic relates more broadly to decolonisation. As well as specific mention of decolonisation, the key words include a range of issues and places linked to the decolonisation process.
Topic 16 - Cold War. This is another of the less tightly defined topics. The topic appears to pick up issues that are broadly related to the Cold War. There is specific mention of the Soviet Union and detente, as well as issues such as nuclear weapons and the Helsinki Accords.

Based on these topics, we examine Topic 2 and Topic 7 as the principal “international development” topics. While a number of other topics (for example post-conflict development, Africa, and Latin America) are related to development issues, Topic 2 and Topic 7 most directly capture aspects of international development. We consider these two topics more closely by contrasting the main words linked to these two topics. In Figure 3, the word clouds show the 50 words most likely to be mentioned in relation to each of the topics.

[Figure 3: word clouds for Topic 2 and Topic 7.]
Fig. 3. Topic content. 50 highest probability words for the 2nd and 7th topics.

The word clouds provide further support for Topic 2 representing a more traditional view of international development focusing on economic processes. In addition to a strong emphasis on ‘econom-’, other key words, such as ‘trade’, ‘debt’, ‘market’, ‘growth’, ‘industri-’, ‘financi-’, ‘technolog-’, ‘product’, and ‘agricultur-’, demonstrate the narrower economic focus on international development captured by Topic 2. In contrast, Topic 7 provides a much broader focus on development, with key words including ‘climat-’, ‘sustain’, ‘environ-’, ‘educ-’, ‘health’, ‘women’, ‘work’, ‘mdgs’, ‘peac-’, ‘govern-’, and ‘right’. Therefore, Topic 7 captures many of the issues that feature in the recent Sustainable Development Goals (SDGs) agenda [10].

[Figure 4: word contrast plot, with words positioned between Topic 7 and Topic 2.]
Fig. 4. Comparing Topics 2 and 7 quality. 50 highest probability words contrasted between Topics 2 and 7.

Figure 4 calculates the difference in probability of a word for the two topics, normalized by the maximum difference in probability of any word between the two topics. The figure demonstrates that while there is a much higher probability of words such as ‘econom-’, ‘trade’, and even ‘develop-’ being used to discuss Topic 2, words such as ‘climat-’, ‘govern-’, ‘sustain’, ‘goal’, and ‘support’ are more likely to be used in association with Topic 7. This provides further support for Topic 2 representing a more economistic view of international development, while Topic 7 relates to a broader sustainable development agenda.

We also assess the relationship between topics in the STM framework, which allows correlations between topics to be examined. This is shown in the network of topics in Figure 5. The figure shows that Topic 2 and Topic 7 are closely related, which we would expect as they both deal with international development (and share key words on development, such as ‘develop-’, ‘povert-’, etc.). It is also worth noting that while Topic 2 is more closely correlated with the Latin America topic (Topic 11), Topic 7 is more directly correlated with the Africa topic (Topic 6).

II. EXPLAINING THE RHETORIC
We next look at the relationship between topic proportions and structural factors. The data for these structural covariates are taken from the World Bank’s World Development Indicators (WDI) unless otherwise stated. Confidence intervals produced by the method of composition in STM allow us to pick up statistical uncertainty in the linear regression model.

[Figure 5: network graph linking Topics 1 to 16.]
Fig. 5. Network of topics. Correlation of topics.

Figure 6 demonstrates the effect of wealth (GDP per capita) on the extent to which states discuss the two international development topics in their GD statements. The figure shows that the relationship between wealth and the topic proportions linked to international development differs across Topic 2 and Topic 7. Discussion of Topic 2 (economic development) remains far more constant across different levels of wealth than Topic 7. The poorest states tend to discuss both topics more than other developing nations. However, this effect is larger for Topic 7. There is a decline in the proportion of both topics as countries become wealthier until around $30,000, when there is an increase in discussion of Topic 7. There is a further pronounced increase in the extent to which countries discuss Topic 7 at around $60,000 per capita. However, there is a decline in expected topic proportions for both Topic 2 and Topic 7 for the very wealthiest countries.

[Figure 6: line plot. x-axis: effect of wealth; y-axis: expected topic proportion; series: Topic 2, Topic 7.]
Fig. 6. Effect of wealth. Main effect and 95% confidence interval.

Figure 7 shows the expected topic proportions for Topic 2 and Topic 7 associated with different population sizes. The figure shows a slight surge in the discussion of both development topics for countries with the very smallest populations. This reflects the significant amount of discussion of development issues, particularly sustainable development (Topic 7), by the small island developing states (SIDS). The discussion of Topic 2 remains relatively constant across different population sizes, with a slight increase in the expected topic proportion for the countries with the very largest populations. However, with Topic 7 there is an increase in expected topic proportion until countries have a population of around 300 million, after which there is a decline in discussion of Topic 7. For countries with populations larger than 500 million there is no effect of population on discussion of Topic 7. It is only with the very largest populations that we see a positive effect on discussion of Topic 7.

[Figure 7: line plot. x-axis: effect of population size; y-axis: expected topic proportion; series: Topic 2, Topic 7.]
Fig. 7. Effect of population. Main effect and 95% confidence interval.

We would also expect the extent to which states discuss international development in their GD statements to be affected by the amount of aid or official development assistance (ODA) they receive. Figure 8 plots the expected topic proportion according to the amount of ODA countries receive. Broadly speaking, the discussion of development topics remains largely constant across different levels of ODA received. There is, however, a slight increase in the expected topic proportions of Topic 7 according to the amount of ODA received. It is also worth noting the spikes in discussion of Topic 2 and Topic 7 for countries that receive negative levels of ODA. These are countries that are effectively repaying more in loans to lenders than they are receiving in ODA. These countries appear to raise development issues far more in their GD statements, which is perhaps not altogether surprising.

[Figure 8: line plot. x-axis: effect of ODA; y-axis: expected topic proportion; series: Topic 2, Topic 7.]
Fig. 8. Effect of ODA. Main effect and 95% confidence interval.

[Figure 9: line plot. x-axis: effect of democracy (Polity score from −10 to 10); y-axis: expected topic proportion; series: Topic 2, Topic 7.]
Fig. 9. Effect of democracy. Main effect and 95% confidence interval.
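The confidence intervals reported in this section rely on the method of composition, which propagates uncertainty in the estimated topic proportions into the regression coefficients. The following is a minimal sketch of that idea, assuming posterior draws of one topic's per-document proportion are already available; the function name, argument names, and the plain linear specification are hypothetical simplifications, not the exact routine used for the figures.

```python
import numpy as np


def composition_effects(theta_draws, X, n_coef_draws=50, seed=0):
    """Pool regression-coefficient draws across draws of topic proportions.

    theta_draws: (S, D) array, S simulated draws of a topic's proportion
                 in each of D documents.
    X:           (D, P) design matrix, including an intercept column.
    Returns the pooled mean and a 95% interval for each coefficient.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n_docs, n_params = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    pooled = []
    for theta in np.asarray(theta_draws, dtype=float):
        # Step 1: ordinary least squares on this draw of the proportions.
        beta_hat = xtx_inv @ X.T @ theta
        resid = theta - X @ beta_hat
        sigma2 = resid @ resid / (n_docs - n_params)
        # Step 2: sample coefficients from their approximate sampling distribution.
        pooled.append(
            rng.multivariate_normal(beta_hat, sigma2 * xtx_inv, size=n_coef_draws)
        )
    pooled = np.vstack(pooled)
    lower, upper = np.percentile(pooled, [2.5, 97.5], axis=0)
    return pooled.mean(axis=0), lower, upper
```

Because the coefficient draws are pooled over many draws of the topic proportions, the resulting interval reflects both regression uncertainty and uncertainty about how much of each speech belongs to the topic.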
We also consider the effects of democracy on the expected topic proportions of both development topics using the Polity IV measure of democracy [11]. Figure 9 shows the extent to which states discuss the international development topics according to their level of democracy. Discussion of Topic 2 is fairly constant across different levels of democracy (although there are some slight fluctuations). However, the extent to which states discuss Topic 7 (sustainable development) varies considerably across different levels of democracy. Somewhat surprisingly, the most autocratic states tend to discuss Topic 7 more than the slightly less autocratic states. This may be because highly autocratic governments choose to discuss development and environmental issues to avoid a focus on democracy and human rights. There is then an increase in the expected topic proportion for Topic 7 as levels of democracy increase, reaching a peak at around 5 on the Polity scale; after this there is a gradual decline in discussion of Topic 7. This would suggest that democratizing or semi-democratic countries (which are more likely to be developing countries with democratic institutions) discuss sustainable development more than established democracies (which are more likely to be developed countries).

We also plot the results of the analysis as the difference in topic proportions for two different values of the effect of conflict. Our measure of whether a country is experiencing a civil conflict comes from the UCDP/PRIO Armed Conflict Dataset [12]. Point estimates and 95% confidence intervals are plotted in Figure 10. The figure shows that conflict affects only Topic 7 and not Topic 2. Countries experiencing conflict are less likely to discuss Topic 7 (sustainable development) than countries not experiencing conflict. The most likely explanation is that these countries are more likely to devote a greater proportion of their annual statements to discussing issues around conflict and security than development.
The fact that there is no effect of conflict on Topic 2 is interesting in this regard.

[Figure 10: point estimates for Topic 2 and Topic 7. x-axis: effect of conflict, from −0.03 to 0.00.]
Fig. 10. Effect of conflict. Point estimates and 95% confidence intervals.

Finally, we consider regional effects in Figure 11. We use the World Bank’s classifications of regions: Latin America and the Caribbean (LCN), South Asia (SAS), Sub-Saharan Africa (SSA; labelled SSF in Figure 11), Europe and Central Asia (ECS), Middle East and North Africa (MEA), East Asia and the Pacific (EAS), and North America (NAC). The figure shows that states in South Asia, and Latin America and the Caribbean are likely to discuss Topic 2 the most. States in South Asia and East Asia and the Pacific discuss Topic 7 the most. The figure shows that countries in North America are likely to speak about Topic 7 least.

[Figure 11: point estimates for Topic 2 and Topic 7 in each region (SAS, SSF, ECS, MEA, LCN, EAS, NAC). x-axis: effect of region, from −0.15 to 0.15.]
Fig. 11. Regional effects. Point estimates and 95% confidence intervals.

The analysis of discussion of international development in annual UN General Debate statements therefore uncovers two principal development topics: economic development and sustainable development. We find that discussion of Topic 2 is not significantly affected by country-specific factors, such as wealth, population, democracy, levels of ODA, and conflict (although there are regional effects). However, we find that the extent to which countries discuss sustainable development (Topic 7) in their annual GD statements varies considerably according to these different structural factors. The results suggest that, broadly speaking, we do not observe linear trends in the relationship between these country-specific factors and discussion of Topic 7. Instead, we find that there are significant fluctuations in the relationship between factors such as wealth, democracy, etc., and the extent to which these states discuss sustainable development in their GD statements. These relationships require further analysis and exploration.

III. CONCLUSION
Despite decisions taken in international organisations having a huge impact on development initiatives and outcomes, we know relatively little about the agenda-setting process around the global governance of development. Using a novel approach that applies NLP methods to a new dataset of speeches in the UN General Debate, this paper has uncovered the main development topics discussed by governments in the UN, and the structural factors that influence the degree to which governments discuss international development. In doing so, the paper has shed some light on state preferences regarding the international development agenda in the UN. The paper more broadly demonstrates how text analytic approaches can help us to better understand different aspects of global governance.

REFERENCES
[1] D. Hudson and N. Dasandi, “The global governance of development: development financing, good governance and the domestication of poverty,” Handbook of the International Political Economy of Governance. Cheltenham: Edward Elgar, pp. 238–258, 2014.
[2] A. Saith, “From universal values to millennium development goals: Lost in translation,” Development and Change, vol. 37, no. 6, pp. 1167–1199, 2006.
[3] C. Smith, Politics and Process at the United Nations: The Global Dance. Boulder, CO: Lynne Rienner, 2006.
[4] A. Baturo, N. Dasandi, and S. J. Mikhaylov, “Understanding state preferences with text as data: Introducing the UN General Debate corpus,” Research & Politics, vol. 4, no. 2, p. 2053168017712821, 2017.
[5] M. E. Roberts, B. M. Stewart, D. Tingley, E. M. Airoldi et al., “The structural topic model and applied social science,” in Advances in Neural Information Processing Systems Workshop on Topic Models: Computation, Application, and Evaluation, 2013.
[6] D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum, “Optimizing semantic coherence in topic models,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2011, pp. 262–272.
[7] D. Newman, J. H. Lau, K. Grieser, and T. Baldwin, “Automatic evaluation of topic coherence,” in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2010, pp. 100–108.
[8] J. Bischof and E. M. Airoldi, “Summarizing topical content with word frequency and exclusivity,” in Proceedings of the 29th International Conference on Machine Learning (ICML-12), 2012, pp. 201–208.
[9] M. Roberts, B. Stewart, and D. Tingley, “stm: R package for structural topic models 2014,” R package version 0.6, vol. 21, 2016.
[10] J. Waage, C. Yap, S. Bell, C. Levy, G. Mace, T. Pegram, E. Unterhalter, N. Dasandi, D. Hudson, R. Kock et al., “Governing the UN sustainable development goals: interactions, infrastructures, and institutions,” The Lancet Global Health, vol. 3, no. 5, pp. e251–e252, 2015.
[11] M. Marshall and K. Jaggers, “Political regime characteristics and transitions, 1800-2003,” Polity IV Project, 2003.
[12] N. P. Gleditsch, P. Wallensteen, M. Eriksson, M. Sollenberg, and H. Strand, “Armed conflict 1946-2001: A new dataset,”