Global health science leverages established collaboration network to fight COVID-19
Stefano Bianchini, Moritz Müller, Pierre Pelletier, Kevin Wirtz
BETA, Université de Strasbourg, France
{s.bianchini, mueller, p.pelletier, kevin.wirtz}@unistra.fr
February 2, 2021
Abstract
How has the science system reacted to the early stages of the COVID-19 pandemic? Here we compare the (growing) international network for coronavirus research with the broader international health science network. Our findings show that, before the outbreak, coronavirus research constituted a relatively small and rather peculiar niche within the global health sciences. As a response to the pandemic, the international network for coronavirus research expanded rapidly along the hierarchical structure laid out by the global health science network. Thus, in the face of the crisis, the global health science system proved to be structurally stable yet versatile in research. The observed versatility supports optimistic views on the role of science in meeting future challenges. However, the stability of the global core-periphery structure may be worrying, because it reduces learning opportunities and social capital of scientifically peripheral countries, not only during this pandemic but also in its “normal” mode of operation.
Keywords
COVID-19 | Scientific Networks | International Collaboration | Health Sciences
Introduction
International scientific collaboration has been on the rise since the early 1980s [1]. The phenomenon is one aspect of globalization in science. International collaboration is observed in particular among productive researchers from top-tier universities located in advanced national scientific systems [2, 3]. The gain is (more) excellent research [1, 3]. The tendency of ‘excellence-attracting-excellence’, however, entails the risk of increasing stratification not only within but also between national science systems [2, 4]. In order to catch up scientifically, or at least not to fall behind, being well connected to global knowledge flows has become a science policy imperative in most countries.

The paper at hand treats the outbreak of the novel coronavirus SARS-CoV-2 in January 2020 as an exogenous shock to the international health science system. Our main interest is in the structural effects of the shock on the international health science network. Recent empirical studies have shown that the scientific contribution to coronavirus-related research from individual countries has been very uneven, often framing it as a scientific race [5, 6]. Fry et al. [7] investigate the international coronavirus collaboration network and find that it has become more ‘elitist’ with the pandemic.

Our empirical analysis adds the insight that the contribution of countries to coronavirus research is closely related to their contribution in the broader domain of health sciences, and that the structure of the international coronavirus research network rapidly converged to the structure of the global, international health science network. Before we discuss the implications of this finding, let us first turn to the empirical analysis.
Data
We proxy scientific activity in the health sciences through peer-reviewed articles in journals indexed by MEDLINE. The restriction to MEDLINE-indexed journals ensures that papers in the sample fall into our scope of biomedical research and are of (minimum) scientific quality. Coronavirus-related papers are identified through a text search query suggested by PubMed Central Europe on the papers’ title, abstract, and MeSH terms.

The analysis is based on the papers’ submission dates to stay close to the actual research activity. Our working sample includes papers submitted in the pre-COVID-19 period (Jan.–Dec. 2019), as well as in the early phase of COVID-19 (Jan.–Apr. 2020). In detail, we downloaded all papers appearing in MEDLINE journals from the PubMed database as of December 2020. Due to the time lag from submission to acceptance, the number of paper submissions in our sample starts to drop in May 2020, a data artifact that may bias statistics. Therefore, we end the analysis period in April 2020.

Our final working sample consists of 837,427 papers. Distinguishing coronavirus-related research from non-coronavirus-related research, and the pre-COVID-19 period (Jan.–Dec. 2019) from the COVID-19 period (Jan.–Apr. 2020), yields four categories: 614,141 non-coronavirus, pre-COVID-19 papers; 571 coronavirus, pre-COVID-19 papers; 210,171 non-coronavirus, COVID-19 papers; and 12,544 coronavirus, COVID-19 papers.

Fig 1. Countries take on the same role in coronavirus research as in the global health sciences. (A) Coronavirus and non-coronavirus papers by month. (B) Top 10 countries in coronavirus-related research during COVID-19. (C) Correlation of country rankings by coronavirus and non-coronavirus research by month. (D) Surface plot of a local regression of (log of) joint coronavirus papers during COVID-19 on (log of) joint coronavirus and non-coronavirus papers pre-COVID-19. (E) Country centrality based on s-core decomposition of the coronavirus and non-coronavirus network by month.

Results
We first count papers per month (Fig 1A). In the pre-COVID-19 period, coronavirus research output is relatively stable, at roughly 50 papers per month. Starting with the January 2020 outbreak, coronavirus research grows exponentially, up to 8,159 submissions in April 2020. Other research output is stable throughout, at about 51,000 papers per month, and even increases slightly with the pandemic. Apparently, many (male) researchers took advantage of the lockdown period to finish research that had already piled up before COVID-19 [8]. Potentially negative effects of the pandemic, due to frictions in the research machinery or crowding out of non-coronavirus research, are not yet visible in this early period.

All data and scripts are available on GitHub: https://github.com/P-Pelletier/Global-health-sciences-response-to-COVID-19

National scientific production
Next, consider the contribution of individual countries to coronavirus research. We employ a full-count assignment scheme, i.e., each paper with at least one affiliation in a given country counts fully (one) for that country. The distribution is highly skewed: the 10 most prolific countries generate 70 percent of coronavirus research during COVID-19, ranging from the US with 2,686 papers to Spain with 381 papers (Fig 1B). All these countries are big players in the health sciences, but not all bring a strong track record in coronavirus research.

So, how important was coronavirus-specific research capacity compared to general health science capacity in the early response? Simple linear regressions of coronavirus papers in Jan.–Apr. 2020 on pre-pandemic coronavirus and other paper counts provide some indication (Table 1). All variables in the regression are transformed into logs and normalized to zero mean and unit variance to facilitate direct interpretation of coefficient estimates. We find that pre-pandemic coronavirus research is highly (and significantly) correlated with coronavirus research in January 2020, while other research is not. However, this pattern reverses within the next three months, when coronavirus research takes off. Note also that variations in the outcome variable are increasingly well explained, with R² rising from 0.7 in January to 0.9 in April 2020, mostly due to prior non-coronavirus research.

Table 1. Coronavirus papers (2020) on prior papers (2019).

                Jan. ‘20   Feb. ‘20   Mar. ‘20   Apr. ‘20   Total
Coronav. ‘19    0.859      0.727      0.431      0.262      0.231
                (0.131)    (0.077)    (0.044)    (0.036)    (0.031)
Others ‘19      0.008      0.198      0.564      0.732      0.765
                (0.067)    (0.055)    (0.044)    (0.039)    (0.035)
R²

Notes: 200 observations. All variables in logs, with zero mean and unit variance. Standard errors in parentheses.
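The regression setup behind Table 1 can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, the counts are synthetic, and the use of log(1 + x) to accommodate countries with zero papers is an assumption, since the text does not say how zeros are handled. Because every variable is standardized, the intercept is zero and the coefficients are directly comparable across regressors.

```python
import numpy as np

def standardize_log(counts):
    """Return log(1 + counts), rescaled to zero mean and unit variance.

    Using log1p to cope with zero counts is an assumption of this sketch.
    """
    x = np.log1p(np.asarray(counts, dtype=float))
    return (x - x.mean()) / x.std()

def standardized_ols(y_counts, *x_counts):
    """OLS of a standardized log outcome on standardized log regressors.

    Returns (coefficients, r_squared). No intercept is fitted because
    every variable has zero mean after standardization.
    """
    y = standardize_log(y_counts)
    X = np.column_stack([standardize_log(x) for x in x_counts])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    r_squared = 1.0 - (residuals @ residuals) / (y @ y)
    return beta, r_squared

# Synthetic, made-up counts for 200 hypothetical countries (not the paper's data).
rng = np.random.default_rng(0)
others_2019 = rng.lognormal(mean=5.0, sigma=2.0, size=200).round()
corona_2019 = rng.poisson(0.001 * others_2019)
corona_2020 = rng.poisson(0.02 * others_2019 + 0.5 * corona_2019)

beta, r2 = standardized_ols(corona_2020, corona_2019, others_2019)
print("beta (corona '19, others '19):", beta.round(3), "R^2:", round(r2, 3))
```

Running the same function once per month of 2020 would give one column of the table per month; the standard errors are omitted here to keep the sketch short.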
By the same token, countries rapidly take very similar positions in rankings on coronavirus papers as they do in rankings on other health papers (see Fig 1C). We calculate the rank correlation coefficient τ_X of [10]. It is similar to Kendall’s τ, but treats ties the same as dominance relationships (i.e., entering 1 and not 0 in the dominance matrix). This is favorable when rankings contain many ties, as is the case for pre-COVID-19 coronavirus research, but does not really change the results. The 90 percent confidence interval around τ_X has been obtained through a traditional jackknife, or leave-one-out, approach (see e.g. [11]). Until the outbreak in January 2020 (vertical dashed line in Fig 1C), rank correlations are rather low at around 0.2. After the outbreak, the (monthly) scientific output of countries in coronavirus research aligns with their non-coronavirus research output, reaching a (high) correlation of 0.8 in April 2020.

We summarize the first part of the analysis. Before the pandemic, leading countries in the health sciences did not necessarily lead coronavirus research. Within a few months after the COVID-19 shock, leading countries in the health sciences also led coronavirus research. The second part of the analysis establishes the same dynamic for the international collaboration networks.

International scientific collaboration
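Before turning to the networks, the rank correlation τ_X used for Fig 1C can be illustrated with a short sketch. It follows the score-matrix definition of [10], in which a tie enters as 1 rather than 0. The function names are hypothetical, and the normal-approximation interval (z = 1.645 for roughly 90 percent) is an assumption of this sketch; the text only states that a leave-one-out jackknife was used.

```python
import numpy as np

def score_matrix(values):
    """Score matrix of the ranking induced by `values` (higher = better).

    Entry (i, j) is 1 if i is ranked ahead of or tied with j,
    -1 if i is ranked behind j, and 0 on the diagonal.
    """
    v = np.asarray(values, dtype=float)
    a = np.where(v[:, None] >= v[None, :], 1.0, -1.0)
    np.fill_diagonal(a, 0.0)
    return a

def tau_x(x, y):
    """Rank correlation tau_x between the rankings induced by x and y."""
    n = len(x)
    return float((score_matrix(x) * score_matrix(y)).sum() / (n * (n - 1)))

def jackknife_interval(x, y, z=1.645):
    """Leave-one-out jackknife interval around tau_x.

    Assumption: a normal approximation on the jackknife pseudo-values.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    full = tau_x(x, y)
    leave_one_out = np.array(
        [tau_x(np.delete(x, i), np.delete(y, i)) for i in range(n)]
    )
    pseudo = n * full - (n - 1) * leave_one_out
    center = pseudo.mean()
    std_err = pseudo.std(ddof=1) / np.sqrt(n)
    return center - z * std_err, center + z * std_err

# Made-up paper counts for five hypothetical countries in one month.
corona = [0, 0, 0, 5, 2]           # many ties, as in pre-COVID-19 coronavirus research
others = [10, 200, 50, 4000, 900]
print(tau_x(corona, others), jackknife_interval(corona, others))
```

With this definition, two rankings that are fully tied in the same way still correlate perfectly, whereas tie-corrected variants of Kendall's τ are undefined in that case.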
We construct international scientific collaboration networks based on co-authorship of papers in our sample. A node corresponds to a country. Edge weights correspond to the number of joint papers (full-counting scheme).

First, we consider link formation. Are prior collaborations on coronavirus research (i.e., same topic), or any prior ties (i.e., different topics), replicated for coronavirus research during the pandemic? Fig 1D provides some indication. The surface plot is obtained from a local regression with least-squares cross-validated bandwidths for the local constant estimator. It shows the expected (log of) joint coronavirus papers during COVID-19, conditional on (log of) joint coronavirus and non-coronavirus papers pre-COVID-19. Looking at Fig 1D, we first note that most country pairs (dots) had no pre-COVID-19 joint coronavirus research. Their number of joint coronavirus papers during COVID-19 increases with the number of other joint papers before the pandemic. As we move away from zero prior coronavirus papers, the (expected) number of joint coronavirus papers during COVID-19 (the surface in Fig 1D) also increases. Yet, it is evident that bi-national collaboration on coronavirus-related research after the shock largely reflects bi-national collaboration on non-coronavirus research before the shock.

Consequently, countries’ network centrality in the coronavirus research network aligns with their centrality in the overall health science network. We capture network centrality through (normalized) s-core decomposition [9]. The s-core ranges from 0 for isolates in the network to 1 for (highest) core members. Fig 1E provides the monthly s-cores on the non-coronavirus network (left panel), the coronavirus network (middle panel), and the difference of s-cores in the coronavirus and non-coronavirus networks (right panel). The figure shows the 60 most central countries in the non-coronavirus network and applies that same ordering across all three panels.
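The network construction and the s-core measure can be sketched together. The code below is one reading of the s-core procedure in [9]: the n-th core is obtained by iteratively removing all nodes whose strength (sum of edge weights within the remaining subgraph) does not exceed the threshold of the previous core. The function names, the toy papers, and the normalization (core index divided by the highest index) are assumptions of this sketch, not taken from the authors' scripts.

```python
from collections import defaultdict
from itertools import combinations

def build_network(papers):
    """Build a country-level co-authorship network with full counting.

    `papers` is an iterable of collections of country codes, one per paper.
    Each paper adds 1 to the weight of every pair of distinct countries on it.
    """
    countries = set()
    weights = defaultdict(dict)
    for paper in papers:
        members = sorted(set(paper))
        countries.update(members)
        for a, b in combinations(members, 2):
            weights[a][b] = weights[a].get(b, 0) + 1
            weights[b][a] = weights[b].get(a, 0) + 1
    return countries, weights

def s_core_levels(countries, weights):
    """Return each country's core level, normalized to [0, 1]."""
    strength = {c: sum(weights.get(c, {}).values()) for c in countries}
    level = {c: 0 for c in countries}  # isolates stay at level 0
    alive = {c for c in countries if strength[c] > 0}
    n = 0
    while alive:
        n += 1
        threshold = min(strength[c] for c in alive)
        for c in alive:
            level[c] = n  # every surviving node belongs to the n-th core
        # Peel: repeatedly drop nodes whose remaining strength is <= threshold.
        queue = [c for c in alive if strength[c] <= threshold]
        while queue:
            c = queue.pop()
            if c not in alive:
                continue
            alive.remove(c)
            for neighbor, w in weights[c].items():
                if neighbor in alive:
                    strength[neighbor] -= w
                    if strength[neighbor] <= threshold:
                        queue.append(neighbor)
    top = max(level.values(), default=0)
    return {c: (level[c] / top if top else 0.0) for c in countries}

# Toy example with made-up papers (country codes only).
papers = [("USA", "CHN")] * 5 + [("USA", "GBR"), ("CHN", "GBR"),
                                 ("SAU", "EGY"), ("FRA",)]
countries, weights = build_network(papers)
core = s_core_levels(countries, weights)
print(sorted(core.items(), key=lambda item: -item[1]))
```

In this toy network the strongly tied pair USA and CHN forms the highest core, GBR sits one level below, the weakly tied pair SAU and EGY is peripheral, and FRA, which has no international co-authorship, is an isolate with value 0.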
The remaining countries are highly peripheral in the considered networks. The left panel shows that the global network hierarchy is very stable. The core is formed by (mostly large) countries of the global north, China being the exception. Centrality in the coronavirus network is more dynamic (middle panel). Pre-COVID-19, most countries are not involved in coronavirus-related collaborations and are, hence, in the extreme periphery (white). The core of the coronavirus network includes only a few countries leading other health sciences. Saudi Arabia stands out, as it is part of the core in the coronavirus network but peripheral in the overall health science network. (Variations in core membership over time may be explained by lower research activity overall, which leads to more erratic signals.) After the shock, the structure of the coronavirus network shifts rapidly towards the hierarchy in health science at large. This is easily seen in the right panel, which shows the difference between the s-core centrality in the coronavirus and non-coronavirus networks. Prior to the shock, s-core differences range from -1 in dark blue (for countries at the extreme periphery in the coronavirus network and in the core in the other network), over 0 in white (same s-core in both networks), up to 1 in red (for countries in the coronavirus network core and peripheral in the non-coronavirus network). After the shock, the global core rapidly takes its role in coronavirus-related research, and so does the global periphery (all countries appear in light colors with an s-core difference of around zero in April 2020).

Fig 2 pictures the networks in the form of adjacency matrices.
To facilitate a comparison across networks, the adjacency matrices of the pre-pandemic coronavirus network (A), the pre-pandemic non-coronavirus network (C), and the pandemic coronavirus network (D) are all ordered by eigenvector centrality in (C).

The pre-pandemic coronavirus network (A) is relatively sparse and best described by a block model (B): a regional Middle East community and a community of mostly developed countries, connected through the USA, China, and Saudi Arabia. The block model is obtained by minimizing the absolute difference of the number of papers in logs over blocks. This finding is robust to alternative algorithms. Essentially the same (community)

[Figure 2, panels A to E. Legend: joint papers (log scale). Panel E vertical axis: correlation coefficient.]

Fig 2. Coronavirus research network converges to global health science network. (A) to (D) are (accumulated) adjacency matrices of (A) and (B) the coronavirus, pre-COVID-19 network, (C) the non-coronavirus, pre-COVID-19 network, and (D) the coronavirus, COVID-19 network. In (A), (C), and (D), countries are ordered by eigenvector centrality in the non-coronavirus, pre-COVID-19 network (C). In (B), countries are ordered through generalized block modeling. (E) Correlation coefficient of the (monthly) coronavirus and non-coronavirus networks. All correlations obtained are highly significant in the QAP test (p < …

Conclusion
More elitist science in the COVID-19 era? Novel ways of organizing science? Not really. The structural shift towards a highly hierarchical system in coronavirus-related research mostly picks up established structures of the broader, globalized health sciences. Thus, we conjecture that coronavirus research during the pandemic will further sustain, not break with, long-term trends in international collaborations. The pandemic feeds into an ongoing global stratification in science that systematically reduces learning opportunities and social capital of scientifically peripheral countries. Policy should therefore aim at a more inclusive science landscape, not only during crises but even more in its ‘normal mode of operation’.
Acknowledgments
The research leading to the results of this paper has received financial support from the CNRS through the MITI interdisciplinary program Enjeux scientifiques et sociaux de l’intelligence artificielle-AAP 2020 [reference: ARISE].

References
1. J Adams, The fourth age of research. Nature, 557-560 (2013).
2. BF Jones, S Wuchty, B Uzzi, Multi-university research teams: Shifting impact, geography, and stratification in science. Science, 1259-1262 (2008).
3. RK Pan, K Kaski, S Fortunato, World citation and collaboration networks: uncovering the role of geography in science. Sci Rep, 902 (2012). https://doi.org/10.1038/srep00902
4. E Horlings, P Van den Besselaar, Convergence in science growth and structure of worldwide scientific output, 1993-2008. Working Paper. The Hague: Rathenau Instituut (2013). Available: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6064471 Accessed 19.01.2021.
5. S Aviv-Reuven, A Rosenfeld, Publication patterns’ changes due to the Covid-19 pandemic: A longitudinal and short-term scientometric analysis. arXiv, 2010.02594 (2020).
6. P Radanliev, D De Roure, R Walton, M Van Kleek, O Santos, F Mantilla Montalvo, M La Treall, What country, university or research institute, performed the best on COVID-19? Bibliometric analysis of scientific literature. arXiv, 2005.10082 (2020).
7. CV Fry, X Cai, Y Zhang, CS Wagner, Consolidation in a crisis: Patterns of international collaboration in early COVID-19 research. PLoS ONE, e0236307 (2020).
8. ML Bell, KC Fong, Gender differences in first and corresponding authorship in public health research submissions during the Covid-19 pandemic. AJPH, 159-163 (2021). https://doi.org/10.2105/AJPH.2020.305975
9. M Eidsaa, E Almaas, s-core network decomposition: A generalization of k-core analysis to weighted networks. Phys. Rev. E, 062819 (2013).
10. EJ Emond, DW Mason, A new rank correlation coefficient with application to the consensus ranking problem. J. Multi-Crit. Decis. Anal., 17-28 (2002).
11. H Abdi, LJ Williams, “Jackknife” in Encyclopedia of Research Design, NJ Salkind, Ed. (SAGE, 2010), pp. 655-660.
12. A Lancichinetti, F Radicchi, JJ Ramasco, S Fortunato, Finding statistically significant communities in networks. PLoS ONE 6(4), e18961 (2011). doi:10.1371/journal.pone.0018961
13. VD Blondel, J-L Guillaume, R Lambiotte, E Lefebvre, Fast unfolding of communities in large networks. J. Stat. Mech., P10008 (2008).
14. MD König, CJ Tessone, Y Zenou, Nestedness in networks: A theoretical model and some applications. Theor. Econ., 695-752 (2014).
15. D Krackhardt, QAP partialling as a test of spuriousness. Social Networks, 171-186 (1987).